ES full-text index

2022-07-06 14:37:00 【Xiaobai aYuan】

Elasticsearch is a search server built on Lucene. It provides a distributed, multitenant-capable full-text search engine with a RESTful web interface. Elasticsearch is written in Java and was originally released as open source under the Apache License; it is a popular enterprise search engine. It is designed for cloud environments and near-real-time search, and aims to be stable, reliable, fast, and easy to install and use.

Official clients are available for Java, .NET (C#), PHP, Python, Apache Groovy, Ruby, and many other languages. According to the DB-Engines ranking, Elasticsearch is the most popular enterprise search engine.

The most common open-source search engine today is Elasticsearch, which is built on Lucene. It is relatively heavyweight, but it performs well in distributed environments and with large volumes of data.

What ES full-text indexing is used for:

1. As a database (instead of MySQL);
2. Retrieval services over big data, synonym handling, relevance ranking, complex data analysis, and near-real-time processing of massive data;
3. Analysis of records and logs;
4. Full-text search;
5. A supplement to a relational database, or a replacement for a traditional database;
6. On-site search and vertical search.

1. Docker

Application container engine

"Docker is an open-source application container engine that lets developers package an application and its dependencies into a portable image and then publish it to any machine running a popular Linux or Windows operating system; it can also provide virtualization. Containers are fully sandboxed and have no interfaces to one another."

2. Install Docker

a. Update the package sources: apt-get update
b. Install Docker: apt-get install docker.io

c. Check Docker: docker ps  # verifies that the installation succeeded

# Depending on your setup, prefix the commands with sudo if you get a permission error

3. Install ES with Docker

a. Pull the ES image

docker pull bitnami/elasticsearch

b. Create the ES container

docker run -d -p 9200:9200 -p 9300:9300 --name elasticsearch bitnami/elasticsearch

# Depending on your setup, prefix the commands with sudo if you get a permission error

c. Verify the installation in a browser: open http://<server-ip>:9200 (the port published above). A JSON response containing the cluster name and version means ES is running.
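
The same check can be scripted instead of done in a browser. The sketch below is only an illustration: the function names are hypothetical, the default URL assumes ES is published on the local machine without authentication or TLS, and the summary only reads the `cluster_name` and `version.number` fields of the root response.

```python
import json
from urllib.request import urlopen


def summarize_root_response(info: dict) -> str:
    """Turn the JSON returned by GET / into a one-line summary."""
    version = info.get("version", {}).get("number", "unknown")
    cluster = info.get("cluster_name", "unknown")
    return f"cluster={cluster} version={version}"


def check_es(base_url: str = "http://127.0.0.1:9200", timeout: float = 5.0) -> str:
    """Fetch the root endpoint and summarize it (raises if ES is unreachable).

    base_url is an assumption; point it at the host that runs the container.
    """
    with urlopen(base_url, timeout=timeout) as resp:
        return summarize_root_response(json.load(resp))
```

Calling `print(check_es())` once the container is up prints the cluster name and version; a connection error usually means the container is still starting or the port is not published.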

4. Use the ES index in a Flask project

Create a folder for the ES index code

In es.py:

Instantiate the es object, wrap the connected Elasticsearch client in a helper class, and add methods to fetch a document by id and to insert documents

Connect to the ES container:

from elasticsearch import Elasticsearch

from celery_task import celery_app


# Establish the connection (replace the address with your own ES host)
es = Elasticsearch("http://101.42.224.35:9200/")


class ES(object):
    """
    Thin wrapper around the shared Elasticsearch client, bound to one index
    """

    def __init__(self, index_name: str):
        self.es = es
        self.index_name = index_name

    def get_doc(self, uid):
        return self.es.get(index=self.index_name, id=uid)

    def insert_one(self, doc: dict):
        self.es.index(index=self.index_name, body=doc)

    def insert_array(self, docs: list):
        for doc in docs:
            self.es.index(index=self.index_name, body=doc)

    def search(self, query, count: int = 30, fields=None):
        fields = fields if fields else ["title", 'pub_date']
        dsl = {
            "query": {
                "multi_match": {
                    "query": query,
                    "fields": fields
                },
                # 'wildcard': {
                #     'content': {
                #         'value': '*' + query + '*'
                #     }
                # }
            },
            "highlight": {
                "fields": {
                    "title": {}
                }
            }
        }
        match_data = self.es.search(index=self.index_name, body=dsl, size=count)
        return match_data

    def _search(self, query: str, count: int = 20, fields=None):  # count: maximum number of hits to return
        results = []

        match_data = self.search(query, count, fields)
        for hit in match_data['hits']['hits']:
            results.append(hit['_source'])
        return results

    def create_index(self):
        # Rely on truthiness: newer clients return a response object rather than the literal True
        if self.es.indices.exists(index=self.index_name):
            self.es.indices.delete(index=self.index_name)
        self.es.indices.create(index=self.index_name, ignore=400)

    def delete_index(self):
        try:
            self.es.indices.delete(index=self.index_name)
        except Exception:
            # Nothing to delete, e.g. the index does not exist
            pass
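
The `search` method above asks ES for highlights on `title`, but `_search` returns only `_source`, so the highlighted fragments are dropped. A minimal sketch of how they could be kept is shown below; the function name and the `<field>_highlight` key are hypothetical, and it assumes the usual response shape in which each hit may carry a `highlight` mapping from field name to a list of fragments.

```python
def hits_with_highlight(response, field: str = "title") -> list:
    """Return each hit's _source plus the highlight fragments for `field`, if any."""
    results = []
    for hit in response["hits"]["hits"]:
        doc = dict(hit.get("_source", {}))  # copy so the response is not mutated
        fragments = hit.get("highlight", {}).get(field)
        if fragments:
            # Fragments contain the matched terms wrapped in highlight tags
            doc[field + "_highlight"] = fragments
        results.append(doc)
    return results
```

It would be called on the raw result, for example `hits_with_highlight(ES("course").search("flask"))`, instead of going through `_search`.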

one_scripts.py

Import the database data into ES

# Check the host and port, whether the database is local, the name of the database to import, and its character encoding

import pymysql
import traceback
from elasticsearch import Elasticsearch


def get_db_data():
    # Open the database connection (host / user name / password / database name)
    db = pymysql.connect(host="127.0.0.1", user="username", password="password",
                         database="database_name", charset='utf8')
    # Create a cursor object with cursor()
    cursor = db.cursor()
    sql = "SELECT * FROM course"
    # Run the SQL query with execute()
    cursor.execute(sql)
    # Fetch all records as a sequence of row tuples
    results = cursor.fetchall()
    # Close the database connection
    db.close()
    return results


def insert_data_to_es():
    es = Elasticsearch("http://101.42.224.35:9200/")
    # Clear old data; ignore_unavailable avoids an error when the index does not exist yet
    es.indices.delete(index='course', ignore_unavailable=True)
    try:
        # enumerate() supplies the running document id, starting at 0
        for i, row in enumerate(get_db_data()):
            print(row)
            print(row[1], row[2])
            es.index(index='course', body={
                'id': i,
                'table_name': 'table_name',
                'pid': row[4],
                'title': row[5],
                'desc': str(row[6]),
            })
    except Exception:
        error = traceback.format_exc()
        print("Error: unable to fetch data", error)


if __name__ == "__main__":
    insert_data_to_es()
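
The import loop sends one request per row. For larger tables, the client's bulk helper can send the rows in batches. The following is a sketch rather than the author's method: `rows_to_actions` and `bulk_import` are hypothetical names, and the column positions (`row[4]`, `row[5]`, `row[6]`) are simply copied from the loop in the script.

```python
def rows_to_actions(rows, index_name: str = "course"):
    """Yield one bulk action per database row, using the same columns as the script."""
    for i, row in enumerate(rows):
        yield {
            "_index": index_name,
            "_source": {
                "id": i,
                "table_name": "table_name",
                "pid": row[4],
                "title": row[5],
                "desc": str(row[6]),
            },
        }


def bulk_import(es, rows, index_name: str = "course"):
    """Index all rows in batched requests; returns (success_count, errors)."""
    # Imported lazily so rows_to_actions can be used without the client installed
    from elasticsearch import helpers

    return helpers.bulk(es, rows_to_actions(rows, index_name))
```

In the script this would replace the `for` loop, for example `bulk_import(es, get_db_data())`.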

The index needs to be defined in the model class and configured to support Chinese:

__searchable__ = ['title']  # fields to search
__analyzer__ = ChineseAnalyzer()  # support Chinese indexing
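
The attributes above are set on the model class. On the Elasticsearch side, the analyzer for a text field is normally chosen in the index mapping when the index is created. The sketch below is an assumption and not part of the original project: `build_index_body` is a hypothetical helper, it uses the built-in `cjk` analyzer as a default, and it assumes the typeless mapping format of ES 7 and later. A plugin analyzer such as `ik_max_word` could be passed instead if the IK plugin is installed.

```python
def build_index_body(text_fields, analyzer: str = "cjk") -> dict:
    """Build a create-index body that analyzes the given text fields with `analyzer`."""
    properties = {
        name: {"type": "text", "analyzer": analyzer}
        for name in text_fields
    }
    return {"mappings": {"properties": properties}}
```

With the client style used in es.py, the body could be passed when creating the index, for example `es.indices.create(index="course", body=build_index_body(["title", "desc"]))`.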

Use ES in the view to reimplement the endpoint and return the data

The method used on the Python side is the wrapper's _search(), which calls es.search():

es = ES(index_name='tag')  # ES index names must be lowercase
result = es._search(q, fields=['title', 'desc'])

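import traceback

# Assumption: Resource, reqparse and marshal come from Flask-RESTful
from flask_restful import Resource, marshal, reqparse

from common.es.es import ES


class GetTag(Resource):
    def get(self):
        """
        Get the query from the front end
        and run a full-text search with ES
        """
        parser = reqparse.RequestParser()
        parser.add_argument('q')
        args = parser.parse_args()
        q = args.get('q')
        try:
            es = ES(index_name='tag')  # ES index names must be lowercase
            result = es._search(q, fields=['title', 'desc'])
            # tag_fields: the marshal field definitions declared elsewhere in the project
            return marshal(result, tag_fields)
        except Exception:
            error = traceback.format_exc()
            print('ES search failed:', error)
            return {'message': error}, 500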
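
In the view, `q` is `None` when the request has no `q` parameter, and an empty or missing query string may be rejected by ES. One way to guard against that is to clean the value before searching. The helper below is a hypothetical sketch; the 100-character limit is an arbitrary assumption.

```python
def normalize_query(raw, max_length: int = 100):
    """Return a cleaned search string, or None when there is nothing to search for."""
    if raw is None:
        return None
    # Collapse runs of whitespace and strip both ends
    cleaned = " ".join(str(raw).split())
    if not cleaned:
        return None
    return cleaned[:max_length]
```

The view could then call `q = normalize_query(args.get('q'))` and return an empty list straight away when the result is `None`, without contacting ES.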

With the configuration and the rewritten endpoint above, full-text search with the ES index is in place.

Original site

Copyright notice
This article was written by [Xiaobai aYuan]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/187/202207060918473486.html