ElasticSearch ODM (Object Document Mapper) for Python - pip install esengine

Overview

esengine - The Elasticsearch Object Document Mapper

esengine is an ODM (Object Document Mapper): it maps Python classes into Elasticsearch index/doc_type pairs and object instances into Elasticsearch documents.



Modeling

Out of the box, ESengine takes care only of modeling and CRUD operations, including:

  • Index, DocType and Mapping specification
  • Fields and their type coercion
  • Basic CRUD operations (Create, Read, Update, Delete)

Communication

ESengine does not communicate directly with Elasticsearch; it only creates the basic structure. To communicate, it relies on an ES client that provides the transport methods (index, delete, update, etc.).

ES client

ESengine does not enforce the use of the official Elasticsearch client, but you are encouraged to use it because it is well maintained and supports bulk operations. You are free to use another client or create your own (useful for tests).

Querying the data

ESengine does not enforce or encourage a DSL for queries; out of the box you write the Elasticsearch payload as a raw Python dictionary. However, ESengine ships a utils.payload helper module to help you build payloads in a less verbose, more Pythonic way.

Why not elasticsearch_dsl?

ElasticSearch DSL is an excellent tool, a very nice effort by the maintainers of the official ES library, and it is handy in most cases. But because it is built on top of operator overloading, it can sometimes lead to confusing query building; sometimes it is better to write raw queries or use a simpler payload builder, keeping more control and visibility over what is being generated.

ElasticSearch_DSL, as a high-level abstraction, promotes "think only of Python objects, don't worry about Elastic queries", while ESengine promotes "know the Elastic queries well and then write them as Python objects".

ElasticSearch_DSL is more powerful and more complete, tied more tightly to the ES specifications, while ESengine is simpler and lightweight, shipping only the basics.

Project Stage

It is a beta release, working in production but still missing a lot of features. You can help by using, testing, discussing or coding!

Getting started

Installation

ESengine needs a client to communicate with ES; you can use one of the following:

  • ElasticSearch-py (official)
  • Py-Elasticsearch (unofficial)
  • Create your own implementing the same API protocol
  • Use the MockES provided as a py.test fixture (only for tests)

Because of bulk operations you are recommended to use elasticsearch-py (the official ES Python library), so the installation depends on the version of Elasticsearch you are using.

In short

Install the client and then install ESEngine:

  • for 2.0 + use "elasticsearch>=2.0.0,<3.0.0"
  • for 1.0 + use "elasticsearch>=1.0.0,<2.0.0"
  • under 1.0 use "elasticsearch<1.0.0"

For the latest use:

$ pip install elasticsearch
$ pip install esengine

Or install them together

Elasticsearch 2.x

pip install esengine[es2]

Elasticsearch 1.x

pip install esengine[es1]

Elasticsearch 0.90.x

pip install esengine[es0]

The above command will install esengine and the elasticsearch library specific to your ES version.

Usage

# importing

from elasticsearch import Elasticsearch
from esengine import Document, StringField

# Defining a document
class Person(Document):
    # define _meta attributes
    _doctype = "person"  # optional, it can be set after using "having" method
    _index = "universe"  # optional, it can be set after using "having" method
    _es = ElasticSearch(host='host', port=port)  # optional, it can be explicit passed to methods
    
    # define fields
    name = StringField()

# Initializing mappings and settings
Person.init()

If you do not specify an "id" field, ESEngine will automatically add "id" as StringField. It is recommended that when specifying you use StringField for ids.

TIP: import base module

A good practice is to import the base module; the same example would look like this:

import esengine as ee

class Person(ee.Document):
    name = ee.StringField()

Fields

Base Fields

name = StringField()
age = IntegerField()
weight = FloatField()
factor = LongField()
active = BooleanField()
birthday = DateField()
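
To see the base fields in context, here is a small illustrative document (the class name and values are made up for this example, they are not part of the library):

from datetime import datetime

import esengine as ee

class Rider(ee.Document):
    _index = "universe"
    _doctype = "rider"

    name = ee.StringField()
    age = ee.IntegerField()
    weight = ee.FloatField()
    factor = ee.LongField()
    active = ee.BooleanField()
    birthday = ee.DateField()

rider = Rider(
    name="Eddy Merckx",
    age=70,
    weight=74.0,
    factor=1,
    active=False,
    birthday=datetime(1945, 6, 17),
)
rider.save()  # pass es=es_client_instance here if _es is not set on the model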

Special Fields

GeoPointField

A field to hold a GeoPoint, with modes dict|array|string and their mappings

class Obj(Document):
    location = GeoPointField(mode='dict')  # default
    # An object representation with lat and lon explicitly named

Obj.init() # important to put the proper mapping for geo location

obj = Obj()

obj.location = {"lat": 40.722, "lon": -73.989}}

class Obj(Document):
    location = GeoPointField(mode='string')
    # A string representation, with "lat,lon"

obj.location = "40.715, -74.011"

class Obj(Document):
    location = GeoPointField(mode='array')
    # An array representation with [lon,lat].

obj.location = [-73.983, 40.719]

ObjectField

A field to hold single-level nested objects, either schema-less or with properties validation.

# accepts only dictionaries having strict "street" and "number" keys
address = ObjectField(properties={"street": "string", "number": "integer"})

# Accepts any Python dictionary
extravalues = ObjectField() 
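
Assigning values to these fields, assuming they are declared on a document class called Obj (the data below is illustrative):

obj = Obj()

# must match the declared properties
obj.address = {"street": "Rua Vergueiro", "number": 3185}

# free-form dictionary, no property validation
obj.extravalues = {"any": "thing", "works": True}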

ArrayField

A Field to hold arrays (python lists)

Any base field accepts the multi parameter

colors = StringField(multi=True)   # accepts ["blue", "green", "yellow", ....]

But sometimes (especially for nested objects) it is better to be explicit, and it also generates a better mapping

# accepts an array of strings ["blue", "green", "yellow", ....]
colors = ArrayField(StringField()) 

It is available for any other field

locations = ArrayField(GeoPointField())
numbers = ArrayField(IntegerField())
fractions = ArrayField(FloatField())
addresses = ArrayField(ObjectField(properties={"street": "string", "number": "integer"}))
list_of_lists_of_strings = ArrayField(ArrayField(StringField()))
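
Array fields take plain Python lists; a short sketch assuming the fields above are declared on a document class called Obj:

obj = Obj()
obj.colors = ["blue", "green", "yellow"]
obj.numbers = [1, 2, 3]
obj.fractions = [0.5, 0.25]
obj.addresses = [
    {"street": "Rua Vergueiro", "number": 3185},
    {"street": "Av. Paulista", "number": 1000},
]
obj.list_of_lists_of_strings = [["blue", "green"], ["yellow"]]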

Indexing

person = Person(id=1234, name="Gonzo")
person.save()  # or pass .save(es=es_client_instance) if not specified in model 

Getting by id

Person.get(id=1234)

Filtering by ids

ids = [1234, 5678, 9101]
power_trio = Person.filter(ids=ids)

Filtering by fields

Person.filter(name="Gonzo")

Searching

ESengine does not try to create an abstraction for query building; by default it only implements the search transport, receiving a raw ES query in the form of a Python dictionary.

query = {
    "query": {
        "filtered": {
            "query": {
                "match_all": {}
            },
            "filter": {
                "ids": {
                    "values": [1, 2]
                }
            }
        }
    }
}
Person.search(query, size=10)

Getting all documents (match_all)

Person.all()

# with more arguments

Person.all(size=20)
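
The result is a ResultSet (covered in "Updating a Resultset" below), which is an iterator, so you can loop over the hits directly:

for person in Person.all():
    print(person.name)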

Counting

Person.count(name='Gonzo')

Updating

A single document

A single document can be updated simply using the .save() method

person = Person.get(id=1234)
person.name = "Another Name"
person.save()

Updating a Resultset

The Document methods .get, .filter and .search will return an instance of the ResultSet object. This object is an iterator containing the hits reached by the filtering or search process, and it exposes some CRUD methods (update, delete and reload) to deal with its results.

people = Person.filter(field='value')
people.update(another_field='another_value')

When updating documents you sometimes need the changes made in the ES index to be reflected in the objects of the ResultSet iterator; use the .reload method to perform that action.

Using the reload method

people = Person.filter(field='value')
print(people)
... <Resultset: [{'field': 'value', 'another_field': None}, 
                 {'field': 'value', 'another_field': None}]>

# Updating another field on both instances
people.update(another_field='another_value')
print(people)
... <Resultset: [{'field': 'value', 'another_field': None}, {'field': 'value', 'another_field': None}]>

# Note that the values were changed in the ES index, but the current ResultSet
# is not updated by default; you have to trigger a reload
people.reload()

print(people)
... <Resultset: [{'field': 'value', 'another_field': 'another_value'},
                 {'field': 'value', 'another_field': 'another_value'}]>

Deleting documents

A ResultSet

people = Person.all()
people.delete()

A single document

Person.get(id=123).delete()

Bulk operations

ESEngine takes advantage of the elasticsearch-py helpers for bulk actions; the ResultSet object uses the bulk method to update and delete documents.

But you can also use it explicitly through the Document's update_all, save_all and delete_all methods.

Let's create a bunch of document instances

top_5_racing_bikers = []

for name in ['Eddy Merckx', 
             'Bernard Hinault', 
             'Jacques Anquetil', 
             'Sean Kelly', 
             'Lance Armstrong']:
    top_5_racing_bikers.append(Person(name=name))

Save them all

Person.save_all(top_5_racing_bikers)

Using the create shortcut

The above could be achieved using the create shortcut

A single document

Person.create(name='Eddy Merckx', active=False)

Create will return the instance of the indexed Document

All using list comprehension
top_5_racing_bikers = [
    Person.create(name=name, active=False)
    for name in ['Eddy Merckx', 
                 'Bernard Hinault', 
                 'Jacques Anquetil', 
                 'Sean Kelly', 
                 'Lance Armstrong']
]

NOTE: the .create method will automatically save the document to the index, and it will not raise an error if a document with the same id (when specified) already exists; it will update it, acting as an upsert.
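
For example, calling create twice with the same explicit id does not raise; the second call simply updates the existing document (the id below is illustrative):

Person.create(id=1234, name="Eddy Merckx")

# no error is raised: the document with id 1234 is updated (upsert behavior)
Person.create(id=1234, name="Eddy Merckx", active=True)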

Updating all

Turning the field active to True for all documents

Person.update_all(top_5_racing_bikers, active=True)

Deleting all

Person.delete_all(top_5_racing_bikers)

Chunk size

chunk_size is the number of docs in one chunk sent to ES (default: 500); you can change it using the meta argument.

Person.update_all(
    top_5_racing_bikers, # the documents
    active=True,  # values to be changed
    meta={'chunk_size': 200}  # meta data passed to **bulk** operation    
)

Utilities

Mapping and Mapping migrations

ESEngine does not save mappings automatically, but it offers a utility to generate and save mappings on demand. You can create a cron job to refresh mappings once a day, or run it every time your model changes.

Using the document

class Person(Document):
    # define _meta attributes
    _doctype = "person"  # optional, it can be set later using the "having" method
    _index = "universe"  # optional, it can be set later using the "having" method
    _es = Elasticsearch(host='localhost', port=9200)  # optional, it can be passed explicitly to methods
    
    # define fields
    name = StringField()
    
You can use the init() class method to initialize/update mappings, settings and analyzers:

Person.init()  # if _es is not defined in the model, pass es=es_client here

Include the above as the last line of your model files, or in cron jobs or migration scripts.
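
A migration script can be as small as calling init() on every document class; a sketch (module and class names are hypothetical):

# migrate_mappings.py - run on deploy or from a cron job
from models import Person, Obj  # hypothetical module holding your documents

for doc_class in (Person, Obj):
    doc_class.init()  # pass es=es_client here if _es is not set on the model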

Dynamic meta attributes

In an ESEngine Document, every attribute starting with _ is a meta attribute. Sometimes you can't hardcode them in your models and want them to be dynamic; you can achieve this by subclassing your base document, but sometimes you really need to change them at runtime.

Sometimes it is useful for sharding.

from elasticsearch import Elasticsearch

from models import Person

BrazilianUsers = Person.having(index='another_index', doctype='brazilian_people', es=Elasticsearch(host='brazil_datacenter'))
AmericanUsers = Person.having(index='another_index', doctype='american_people', es=Elasticsearch(host='us_datacenter'))

brazilian_users = BrazilianUsers.filter(active=True)
american_users = AmericanUsers.search(query=query)

Validators

Field Validator

To validate each field separately you can set a list of validators. Each validator is a callable receiving field_name and value as arguments and should return None for the field to be valid; if it raises or returns anything, the data is invalidated.

from esengine.exceptions import ValidationError

def category_validator(field_name, value):
    # check if value is in valid categories
    if value not in ["primary", "secondary", ...]:
        raise ValidationError("Invalid category!!!")
    
class Obj(Document):
    category = StringField(validators=[category_validator])

obj = Obj()
obj.category = "another"
obj.save()
Traceback: ValidationError(....)

Document Validator

To validate the whole document you can set a list of validators. Each validator is a callable receiving the document instance and should return None for the document to be valid; if it raises or returns anything, the data is invalidated.

from esengine.exceptions import ValidationError

def if_city_state_is_required(obj):
    if obj.city and not obj.state:
        raise ValidationError("If city is defined you should define state")
        
class Obj(Document):
    _validators = [if_city_state_is_required]
    
    city = StringField()
    state = StringField()

obj = Obj()
obj.city = "Sao Paulo"
obj.save()
Traceback: ValidationError(....)

Refreshing

Sometimes you need to force an indices/shards refresh for testing; you can use:

# Will refresh all indices
Document.refresh()

Payload builder

Sometimes queries turn into complex and verbose data structures; to help you (use with moderation), you can use the Payload utils to build queries.

Example using a raw query:
query = {
    "query": {
        "filtered": {
            "query": {
                "match_all": {}
            },
            "filter": {
                "ids": {
                    "values": [1, 2]
                }
            }
        }
    }
}

Person.search(query=query, size=10)

The same example using the payload utils:

from esengine import Payload, Query, Filter
payload = Payload(query=Query.filtered(query=Query.match_all(), filter=Filter.ids([1, 2])))
Person.search(payload, size=10)

The Payload utils expose Payload, Query, Filter, Aggregate and Suggesters.

You can also set the model on payload initialization to create a more complete payload definition:

from esengine import Payload, Query, Filter
payload = Payload(
    model=Person,
    query=Query.filtered(query=Query.match_all(), filter=Filter.ids([1, 2])),
    sort={"name": {"order": "desc"}},
    size=10
)
payload.search()

More examples

You can use Payload, Query or Filter directly in search

from esengine import Payload, Query, Filter

Person.search(Payload(query=Query.match_all()))

Person.search(Query.bool(must=[Query.match("name", "Gonzo")]))

Person.search(Query.match_all())

Person.search(Filter.ids([1, 2, 3]))

Chaining

The Payload object is chainable, so you can do:

payload = Payload(query=query).size(10).sort("field", order="desc")
Document.search(payload) 
# or the equivalent
payload.search(Document)

Pagination

You can paginate a payload. Let's say you have indexed 500 documents under the 'test' category and now you need to retrieve 50 per page.

Results will be included in pagination.items.

from esengine import Payload, Filter
from models import Doc

payload = Payload(Doc, filter=Filter.term('category', 'test'))

# Total documents
payload.count()
500

# Paginate it
current_page = 1  # you have to increase it on each pagination
pagination = payload.paginate(page=current_page, per_page=50)

pagination.total
500

pagination.pages
10

pagination.has_prev
False

pagination.has_next
True

pagination.next_num
2

len(pagination.items)
50

for item in pagination.items:
    ...  # do something with item

# Turn the page

current_page += 1
pagination = payload.paginate(page=current_page, per_page=50)
pagination.page
2
pagination.has_prev
True

# Another option to move pages

pagination  = pagination.next_page()
pagination.page
3

pagination = pagination.prev_page()
pagination.page
2

# Turn the page in place

pagination.backward()
pagination.page
1

pagination.forward()
pagination.page
2

Create a paginator in a Jinja template

So you want to create buttons for pagination in your Jinja template:

{% macro render_pagination(pagination, endpoint) %}
  <div class=pagination>
  {%- for page in pagination.iter_pages() %}
    {% if page %}
      {% if page != pagination.page %}
        <a href="{{ url_for(endpoint, page=page) }}">{{ page }}</a>
      {% else %}
        <strong>{{ page }}</strong>
      {% endif %}
    {% else %}
      <span class=ellipsis>…</span>
    {% endif %}
  {%- endfor %}
  </div>
{% endmacro %}
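
The macro only needs a pagination object and an endpoint name, so any web framework can feed it; a hypothetical Flask view (Flask is an assumption here, not something ESengine requires) could look like this:

from flask import Flask, render_template, request

from esengine import Filter, Payload
from models import Doc  # hypothetical module, as in the pagination example above

app = Flask(__name__)

@app.route("/docs")
def list_docs():
    page = request.args.get("page", 1, type=int)
    payload = Payload(Doc, filter=Filter.term("category", "test"))
    pagination = payload.paginate(page=page, per_page=50)
    # docs.html renders the macro above: render_pagination(pagination, 'list_docs')
    return render_template("docs.html", pagination=pagination, endpoint="list_docs")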

Contribute

ESEngine is open source! Join us! Small Acts Manifesto

MADE WITH #LOVE AND #PYTHON (which is the same) AT CathoLabs

Comments
  • 'unicode' problem with Python3

    I was trying to use esengine with Python3 for a project, but it raises the following exception during import:

    from esengine import StringField
    
    in /esengine/bases/field.py
    7 class BaseField(object):
    8       _type = unicode
    9       _default = None
    10     _default_mapping = {'type': 'string'}
    NameError: name 'unicode' is not defined
    

    I am using python 3.5 and esengine 0.0.18

    bug enhancement 
    opened by tiagorm 3
  • How do I make use of esengine + application factory + sqlalchemy?

    How do I make use of esengine and have it play nice with in an application factory and sqlalchemy ORM?

    I know I can make a Flask plugin/extension for this, but how should I be mapping my sqlalchemy models over to esengine? I know the fields I want to index; it is just a matter of how you guys would recommend doing this. elasticsearch_dsl has a similar ODM approach.

    Do you guys have documentation or a simple example ? Thanks!

    opened by rlam3 3
  • ArrayField of DateFields

    Hi. There is a bug when using an ArrayField of DateFields. If the field is:

    dates = ArrayField(DateField())

    saving the value is ok. But getting the document raises an exception because the ArrayField class doesn't use the from_dict function from DateField. Instead, it uses from_dict from the BaseField class. For other types this is ok, but for DateField an exception is raised because it tries to parse the value this way: datetime("2010-10-10") [line 84 of bases/field.py]

    bug 
    opened by andryw 2
  • Esengine needs to provide a way to manage derived fields created in queries.

    When we add a field to a document inside a query, we need a way to use this field in our document model. Any ideas @rochacbruno, @andryw?

    • Use an annotation to emulate derived script_fields as properties.
    • Create a ScriptField that is a field derived from a script_field query
    • Create a ViewDoc that represents a view of the doctype on the index

    Questions:

    • How to manage null values generated by a query that selects only a few document fields?
    enhancement 
    opened by ederfmartins 1
  • deepcopy at Resultset results performance problem

    We put ESEngine in production. If a document has a data structure with many values (lists, dicts), deepcopy can be slow. Is the deepcopy here really necessary?

    opened by andryw 1
  • Save defaults and empty fields?

    Currently Document will save every field even if it is empty; it needs to be configurable

    adding:

    • Document._write_empty_fields
      defaults to False because it saves disk space and memory
    
    class Doc(Document):
        _write_empty_fields = True
    
    • BaseField._default
      default data to be saved when field is None
      defaults to None = empty
    city = StringField(default="São Paulo")
    

    also there will be _write_empty for fields

    class Doc(Document):
        _write_empty_fields = False 
        name = StringField(write_empty=True)
    

    Despite the Doc saying it will not save empty fields, for the case of name it will save even if the value is None

    opened by rochacbruno 1
  • A lot of changes

    • Document can specify a default connection using _es
    • Document search and filtering methods return ResultSet managed iterator/generator which exposes CRUD methods
    • Using generators for bulk actions
    • Implemented new methods for CRUD (delete, update, delete, create)
    • Added utils module
    • Tests now uses py.test fixtures for mocking
    • Readme created
    opened by rochacbruno 1
  • Missing Field: Output error could specify all fields and type missing

    Currently, the missing-field error output only prints one variable at a time, and it doesn't specify the type of the variable. If we knew which variables are missing and what the correct types are, we could fix all the inconsistencies at once.

    enhancement 
    opened by dmoliveira 1
  • Getting more done in GitHub with ZenHub

    Hola! @emanuelvianna has created a ZenHub account for the catholabs organization. ZenHub is the only project management tool integrated natively in GitHub – created specifically for fast-moving, software-driven teams.


    How do I use ZenHub?

    To get set up with ZenHub, all you have to do is download the browser extension and log in with your GitHub account. Once you do, you’ll get access to ZenHub’s complete feature-set immediately.

    What can ZenHub do?

    ZenHub adds a series of enhancements directly inside the GitHub UI:

    • Real-time, customizable task boards for GitHub issues;
    • Multi-Repository burndown charts, estimates, and velocity tracking based on GitHub Milestones;
    • Personal to-do lists and task prioritization;
    • Time-saving shortcuts – like a quick repo switcher, a “Move issue” button, and much more.

    Add ZenHub to GitHub

    Still curious? See more ZenHub features or read user reviews. This issue was written by your friendly ZenHub bot, posted by request from @emanuelvianna.

    ZenHub Board

    opened by emanuelvianna 0
  • DateField does not support `multi` and `date_format` at the same time

    When DateField overrides to_dict() from BaseField it doesn't handle the multi case.

    How to repeat:

    >>> from esengine import Document, DateField
    >>> from elasticsearch import Elasticsearch
    >>> from datetime import datetime
    >>> class A(Document):
    ...     _index = 'A'
    ...     _doctype = 'A'
    ...     _es = Elasticsearch()
    ...     a = DateField(date_format='%Y-%m-%d', multi=True) 
    >>> a = A(a=[datetime.now()])
    >>> a.save()
    Traceback (most recent call last):
      File "t.py", line 12, in <module>
        a.save()
      File "/usr/local/lib/python2.7/dist-packages/esengine/document.py", line 91, in save
        doc = self.to_dict()
      File "/usr/local/lib/python2.7/dist-packages/esengine/bases/document.py", line 79, in to_dict
        for field_name, field_instance in iteritems(fields)
      File "/usr/local/lib/python2.7/dist-packages/esengine/bases/document.py", line 79, in <dictcomp>
        for field_name, field_instance in iteritems(fields)
      File "/usr/local/lib/python2.7/dist-packages/esengine/fields.py", line 257, in to_dict
        return value.strftime(self._date_format)
    AttributeError: 'list' object has no attribute 'strftime'
    
    opened by emanuelvianna 0
  • ResultSet & Payload return only values.

    
    results = Doc.all()
    
    # this only gets specified fields from resultset (query already done)
    result.get_values("id")
    [123, 456, 789]
    
    # This also sets "_source.fields" property of query to save memory returning only desired fields
    payload = Payload(Doc, query=Query.match_all())
    payload.get_values("id")
    [123, 456, 789]
    
    opened by rochacbruno 0
  • Abandoned? What to do?

    Hi, it looks like there has been no activity since Jun 13, 2019.

    Is it possible for you to add me to the contributors so I can take care of the package along with people's contributions?

    P.S. I managed to support elasticsearch 7 in my fork

    Thank you

    opened by 0mars 0
  • Bug in bulk operations

    The bulk helper from the python elasticsearch library receives "raise_on_error" and "chunk_size" parameters. The bulk operations will be executed for "chunk_size" documents at a time (default 500). If raise_on_error is True, an exception is raised and the documents not processed at this point will not be processed. This is problematic in the delete_all operation of ESEngine.

    If I send 501 documents to be deleted, where the first 500 documents aren't indexed but the 501st is, an exception will be raised on the first chunk and the 501st will not be deleted.

    I know that it's possible to pass kwargs on esengine. Perhaps this behavior could be more explicit. What do you think?

    bug 
    opened by andryw 0
  • Port esengine to use it with solr indices

    IMO elasticsearch and solr are very similar databases. I think that adding support for Solr to esengine would be a good move, maybe at least as a plug-in or connector.

    @rochacbruno, @andryw, what do you think about it? What would be the best option?

    enhancement 
    opened by ederfmartins 0
Releases (0.0.7)
  • 0.0.6(Jan 3, 2016)

    • Fields and Document validators
    • GeoPointField has 3 different improved mappings
    • Mappings are generated by Document.put_mapping()
    • Document.search accepts raw query, Query or Payload
    Source code(tar.gz)
    Source code(zip)
  • 0.0.5(Jan 1, 2016)

  • 0.0.3(Dec 14, 2015)

    Available on pip https://pypi.python.org/pypi/esengine

    • Reimplemented get method to return a single document
    • Filter and Search return a ResultSet instance
    • New CRUD methods created
    • More tests
    • README documentation
    • Travis, landscape.io, coveralls added
    • Bulk operations improved
    Source code(tar.gz)
    Source code(zip)
  • 0.0.1(Dec 3, 2015)
