Elastic Stack - Elasticsearch - Official documentation - Filter search results
2022-07-02 02:32:00 【Whose blog is this?】
No program in the world is perfect, but that does not discourage us, because writing programs is a process of constantly pursuing perfection.
- Hou's workshop
References
- Filter search results (official documentation)
- For deeper study, see the Basics and Advanced chapters
- For all column contents, see the Column preview
Filter search results
- There are two ways to filter search results:
  - Use a Boolean query with a filter clause. Search requests apply Boolean filters to both search hits and aggregations.
  - Use the search API's post_filter parameter. Search requests apply post filters only to search hits, not to aggregations. You can use a post filter to calculate aggregations over a broader result set and then narrow the hits further.
- You can also rescore hits after the post filter to improve relevance and reorder results.
Post filter
- When you use the post_filter parameter to filter search results, the search hits are filtered after the aggregations are calculated. A post filter has no effect on the aggregation results.
- For example, suppose you are selling shirts that have the following properties:
PUT /shirts
{
  "mappings": {
    "properties": {
      "brand": { "type": "keyword" },
      "color": { "type": "keyword" },
      "model": { "type": "keyword" }
    }
  }
}

PUT /shirts/_doc/1?refresh
{
  "brand": "gucci",
  "color": "red",
  "model": "slim"
}
- Suppose a user has specified two filters: color:red and brand:gucci. You only want to show them red shirts made by Gucci in the search results. Normally you would do this with a bool query:
GET /shirts/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "color": "red" }},
        { "term": { "brand": "gucci" }}
      ]
    }
  }
}
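As a minimal sketch of how such a request body could be assembled from the filters a user has selected, the Python snippet below builds the same bool query shown above. The helper name `build_filter_query` and the `selected` dictionary are hypothetical and only serve the illustration; sending the body to Elasticsearch is left out.

```python
import json


def build_filter_query(selected):
    """Build a bool query with one term filter per selected field.

    `selected` maps a keyword field name to the exact value chosen by the user.
    This helper is hypothetical and not part of any Elasticsearch client.
    """
    clauses = [{"term": {field: value}} for field, value in selected.items()]
    return {"query": {"bool": {"filter": clauses}}}


# The two filters from the example: color:red and brand:gucci.
body = build_filter_query({"color": "red", "brand": "gucci"})
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and used as the body of GET /shirts/_search.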
- However, you would also like to use faceted navigation to display a list of other options that the user can click on. Perhaps you have a model field that would let the user limit their search results to red Gucci t-shirts or dress-shirts.
- This can be done with a terms aggregation:
GET /shirts/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "color": "red" }},
        { "term": { "brand": "gucci" }}
      ]
    }
  },
  "aggs": {
    "models": {
      "terms": { "field": "model" }
    }
  }
}
- But perhaps you would also like to tell the user how many Gucci shirts are available in other colors. If you just add a terms aggregation on the color field, you will only get back the color red, because your query returns only red shirts by Gucci.
- Instead, you want to include shirts of all colors during aggregation and then apply the color filter only to the search results. This is the purpose of post_filter:
GET /shirts/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": { "brand": "gucci" }
      }
    }
  },
  "aggs": {
    "colors": {
      "terms": { "field": "color" }
    },
    "color_red": {
      "filter": {
        "term": { "color": "red" }
      },
      "aggs": {
        "models": {
          "terms": { "field": "model" }
        }
      }
    }
  },
  "post_filter": {
    "term": { "color": "red" }
  }
}
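In the request above, the main query finds all Gucci shirts regardless of color, the colors aggregation counts shirts per color, the color_red aggregation restricts the models sub-aggregation to red Gucci shirts, and post_filter removes every color other than red from the hits. The order of these steps can be illustrated with a small in-memory simulation in Python. This is only a sketch of the semantics, not how Elasticsearch executes the request; the sample documents and the `search` function are made up for the illustration.

```python
from collections import Counter

# Hypothetical in-memory stand-in for the shirts index.
SHIRTS = [
    {"brand": "gucci", "color": "red", "model": "slim"},
    {"brand": "gucci", "color": "red", "model": "dress-shirt"},
    {"brand": "gucci", "color": "blue", "model": "slim"},
    {"brand": "prada", "color": "red", "model": "t-shirt"},
]


def search(docs, brand, color):
    # Main query: every shirt of the brand, regardless of color.
    matched = [d for d in docs if d["brand"] == brand]
    # "colors" aggregation: computed on the full query result.
    colors = Counter(d["color"] for d in matched)
    # Filter aggregation with a "models" sub-aggregation for the chosen color.
    models = Counter(d["model"] for d in matched if d["color"] == color)
    # post_filter: narrows the hits only, after the aggregations are computed.
    hits = [d for d in matched if d["color"] == color]
    return {"hits": hits, "colors": dict(colors), "models": dict(models)}


result = search(SHIRTS, brand="gucci", color="red")
print(result["colors"])  # counts for every Gucci color, not just red
print(result["hits"])    # only red Gucci shirts
```

Because the color filter is applied after the aggregations, the color counts still cover blue Gucci shirts even though the hits contain only red ones.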
Rescore filtered search results
- Rescoring can help improve precision by reordering only the top documents (for example, 100 - 500) returned by the query and post_filter phases, using a secondary (usually more costly) algorithm, instead of applying the costly algorithm to all documents in the index.
- A rescore request is executed on each shard before the shard returns its results to be sorted by the node handling the overall search request.
- Currently the rescore API has only one implementation: the query rescorer, which uses a query to adjust the scoring. In the future, alternative rescorers may be made available, for example, a pair-wise rescorer.
Note: An error will be thrown if an explicit sort (other than _score in descending order) is provided together with a rescore query.
Note: When exposing pagination to your users, you should not change window_size as you step through each page (by passing different from values), since that can alter the top hits and cause results to shift confusingly as the user steps through the pages.
Query rescorer
- The query rescorer executes a second query only on the Top-K results returned by the query and post_filter phases. The number of documents examined on each shard is controlled by the window_size parameter, which defaults to 10.
- By default, the scores from the original query and the rescore query are combined linearly to produce the final _score for each document. The relative importance of the original query and the rescore query is controlled with query_weight and rescore_query_weight respectively. Both default to 1.
- For example:
POST /_search
{
  "query" : {
    "match" : {
      "message" : {
        "operator" : "or",
        "query" : "the quick brown"
      }
    }
  },
  "rescore" : {
    "window_size" : 50,
    "query" : {
      "rescore_query" : {
        "match_phrase" : {
          "message" : {
            "query" : "the quick brown",
            "slop" : 2
          }
        }
      },
      "query_weight" : 0.7,
      "rescore_query_weight" : 1.2
    }
  }
}
- The way the scores are combined is controlled with score_mode:
| score_mode | Description |
|---|---|
| total | Add the original score and the rescore query score. This is the default. |
| multiply | Multiply the original score by the rescore query score. Useful for function query rescores. |
| avg | Average the original score and the rescore query score. |
| max | Take the maximum of the original score and the rescore query score. |
| min | Take the minimum of the original score and the rescore query score. |
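The table and the two weights can be read together as a small formula. The Python sketch below models that combination under two assumptions that the text above does not spell out: each score is multiplied by its weight before the mode is applied, and a document that does not match the rescore query keeps its weighted original score. The function name `combine_scores` is hypothetical.

```python
def combine_scores(original, rescore, score_mode="total",
                   query_weight=1.0, rescore_query_weight=1.0):
    """Combine an original score with a rescore query score.

    Assumption: weights are applied first, then the score_mode.
    Pass rescore=None for a document that the rescore query did not match.
    """
    primary = original * query_weight
    if rescore is None:
        # Assumption: non-matching documents keep the weighted original score.
        return primary
    secondary = rescore * rescore_query_weight
    combiners = {
        "total": lambda a, b: a + b,
        "multiply": lambda a, b: a * b,
        "avg": lambda a, b: (a + b) / 2,
        "max": max,
        "min": min,
    }
    if score_mode not in combiners:
        raise ValueError(f"unknown score_mode: {score_mode}")
    return combiners[score_mode](primary, secondary)


# Weights from the example request: query_weight 0.7, rescore_query_weight 1.2.
print(combine_scores(2.0, 1.0, "total", query_weight=0.7, rescore_query_weight=1.2))
```

With the default weights of 1, total reduces to a plain sum of the two scores, which matches the linear combination described above.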
Multiple rescores
- It is also possible to execute multiple rescores in sequence:
POST /_search
{
  "query" : {
    "match" : {
      "message" : {
        "operator" : "or",
        "query" : "the quick brown"
      }
    }
  },
  "rescore" : [ {
    "window_size" : 100,
    "query" : {
      "rescore_query" : {
        "match_phrase" : {
          "message" : {
            "query" : "the quick brown",
            "slop" : 2
          }
        }
      },
      "query_weight" : 0.7,
      "rescore_query_weight" : 1.2
    }
  }, {
    "window_size" : 10,
    "query" : {
      "score_mode": "multiply",
      "rescore_query" : {
        "function_score" : {
          "script_score": {
            "script": {
              "source": "Math.log10(doc.count.value + 2)"
            }
          }
        }
      }
    }
  } ]
}
- The first rescore gets the results of the query, the second gets the results of the first, and so on. The second rescore "sees" the sorting done by the first rescore, so you can use a large window on the first rescore to pull documents into a smaller window for the second rescore.
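The chaining of windows can be sketched with a simplified, single-shard model in Python. This is not Elasticsearch's implementation: it assumes that each stage rescores only the top window_size documents of the current ranking, re-sorts that window, and leaves everything below the window untouched. The function `apply_rescorers`, the sample scores, and the two scoring functions are all hypothetical.

```python
def apply_rescorers(ranked, rescorers):
    """Apply rescorers in sequence to a ranked list.

    ranked: list of (doc_id, score) pairs sorted by score, highest first.
    rescorers: list of (window_size, rescore_fn) pairs, where
               rescore_fn(doc_id, score) returns the new score.
    """
    current = list(ranked)
    for window_size, rescore_fn in rescorers:
        # Rescore only the top window of the current ranking.
        window = [(doc_id, rescore_fn(doc_id, score))
                  for doc_id, score in current[:window_size]]
        # Re-sort the window; documents below it keep their order and score.
        window.sort(key=lambda pair: pair[1], reverse=True)
        current = window + current[window_size:]
    return current


# Hypothetical first-pass ranking and per-document adjustments.
ranked = [("a", 3.0), ("b", 2.0), ("c", 1.0), ("d", 0.5)]
phrase_bonus = {"c": 5.0}   # stands in for a phrase-match rescore (added)
popularity = {"a": 3.0}     # stands in for a function score (multiplied)

final = apply_rescorers(ranked, [
    (3, lambda doc_id, score: score + phrase_bonus.get(doc_id, 0.0)),
    (2, lambda doc_id, score: score * popularity.get(doc_id, 1.0)),
])
print(final)
```

In this run, document c starts outside the second stage's window of 2, but the larger first window lifts it into the top two, so the second stage gets to rescore it.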