Stack - ES - Official documentation - Filter search results
2022-07-02 02:32:00 【Whose blog is this?】
There is no perfect program in the world, but we are not discouraged, because writing programs is a process of constantly pursuing perfection.
- Hou's workshop
References
- Filter search results
- For in-depth study, see the Basics and Advanced chapters
- For all column contents, see the Column preview
Filter search results
- There are two ways to filter search results:
- Use a Boolean query with a filter clause. Search requests apply Boolean filters to both search hits and aggregations.
- Use the search API's post_filter parameter. Search requests apply post filters only to search hits, not to aggregations. You can use a post filter to calculate aggregations based on a broader result set, and then narrow the results further.
- You can also rescore hits after the post filter to improve relevance and reorder results.
Post filter
- When you use the post_filter parameter to filter search results, the search hits are filtered after the aggregations are calculated. A post filter has no effect on the aggregation results.
- For example, you are selling shirts that have the following properties:
PUT /shirts
{
"mappings": {
"properties": {
"brand": { "type": "keyword"},
"color": { "type": "keyword"},
"model": { "type": "keyword"}
}
}
}
PUT /shirts/_doc/1?refresh
{
"brand": "gucci",
"color": "red",
"model": "slim"
}
- Suppose a user has specified two filters:
- color:red and brand:gucci. You only want to show them red shirts made by Gucci in the search results. Normally you would do this with a bool query:
GET /shirts/_search
{
"query": {
"bool": {
"filter": [
{ "term": { "color": "red" }},
{ "term": { "brand": "gucci" }}
]
}
}
}
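In an application, the filters a user picks usually arrive as field/value pairs. The following is a minimal sketch, assuming a hypothetical helper named build_filter_query (not part of any Elasticsearch client), of how the request body above could be assembled from such selections:

```python
def build_filter_query(selected):
    """Build a bool/filter search body from {field: value} selections.

    Hypothetical helper for illustration; it only constructs a dict that
    mirrors the request body shown above.
    """
    return {
        "query": {
            "bool": {
                # One term filter per user-selected field.
                "filter": [
                    {"term": {field: value}}
                    for field, value in selected.items()
                ]
            }
        }
    }


body = build_filter_query({"color": "red", "brand": "gucci"})
```

The resulting dict could then be sent as the JSON body of GET /shirts/_search with whichever HTTP or Elasticsearch client you already use.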
- However, you would also like to use faceted navigation to display a list of other options that the user could click on. Perhaps you have a model field that would allow the user to limit their search results to red Gucci t-shirts or dress-shirts.
- This can be done with a terms aggregation:
GET /shirts/_search
{
"query": {
"bool": {
"filter": [
{ "term": { "color": "red" }},
{ "term": { "brand": "gucci" }}
]
}
},
"aggs": {
"models": {
"terms": { "field": "model" }
}
}
}
- But perhaps you would also like to tell the user how many Gucci shirts are available in other colors. If you just add a terms aggregation on the color field, you will only get back the color red, because your query returns only red shirts by Gucci.
- Instead, you want to include shirts of all colors during aggregation, and then apply the color filter only to the search results. This is the purpose of post_filter:
GET /shirts/_search
{
"query": {
"bool": {
"filter": {
"term": { "brand": "gucci" }
}
}
},
"aggs": {
"colors": {
"terms": { "field": "color" }
},
"color_red": {
"filter": {
"term": { "color": "red" }
},
"aggs": {
"models": {
"terms": { "field": "model" }
}
}
}
},
"post_filter": {
"term": { "color": "red" }
}
}
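To make the order of operations concrete, here is a small in-memory sketch in plain Python. It is not how Elasticsearch is implemented; the sample documents (other than the first) and the search function are assumptions made up for illustration. It mimics the request above: the main query selects Gucci shirts, the aggregations are computed on that set, and the post filter narrows only the hits.

```python
from collections import Counter

# Hypothetical sample data; only the first document appears in the text above.
shirts = [
    {"brand": "gucci", "color": "red", "model": "slim"},
    {"brand": "gucci", "color": "red", "model": "regular"},
    {"brand": "gucci", "color": "blue", "model": "slim"},
    {"brand": "prada", "color": "red", "model": "slim"},
]


def search(docs, query, post_filter):
    # 1. The main query decides which documents are considered at all.
    matched = [d for d in docs if query(d)]
    # 2. Aggregations run on everything the main query matched.
    colors = Counter(d["color"] for d in matched)
    # The "color_red" filter aggregation with its "models" sub-aggregation.
    red_models = Counter(d["model"] for d in matched if d["color"] == "red")
    # 3. The post filter is applied last and affects only the hits.
    hits = [d for d in matched if post_filter(d)]
    return hits, colors, red_models


hits, colors, red_models = search(
    shirts,
    query=lambda d: d["brand"] == "gucci",
    post_filter=lambda d: d["color"] == "red",
)
```

In this sketch the hits contain only red Gucci shirts, while the colors count still includes blue, which is what the faceted navigation needs.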
Rescore filtered search results
- Rescoring can help improve precision by reordering only the top documents (for example, 100 - 500) returned by the query and post_filter phases, using a secondary (usually more costly) algorithm, instead of applying the costly algorithm to all documents in the index.
- A rescore request is executed on each shard before it returns its results to be sorted by the node handling the overall search request.
- Currently the rescore API has only one implementation: the query rescorer, which uses a query to adjust the scoring. In the future, alternative rescorers may be made available, for example, a pair-wise rescorer.
Note: An error will be thrown if an explicit sort (other than _score in descending order) is provided together with a rescore query.
Note: When exposing pagination to your users, you should not change window_size as you step through each page (by passing different from values), because that can alter the top hits and cause results to shift confusingly as the user steps through the pages.
Query rescorer
- The query rescorer executes a second query only on the Top-K results returned by the query and post_filter phases. The number of documents examined on each shard can be controlled by the window_size parameter, which defaults to 10.
- By default, the scores from the original query and the rescore query are combined linearly to produce the final _score for each document. The relative importance of the original query and of the rescore query can be controlled with query_weight and rescore_query_weight respectively. Both default to 1.
- For example:
POST /_search
{
"query" : {
"match" : {
"message" : {
"operator" : "or",
"query" : "the quick brown"
}
}
},
"rescore" : {
"window_size" : 50,
"query" : {
"rescore_query" : {
"match_phrase" : {
"message" : {
"query" : "the quick brown",
"slop" : 2
}
}
},
"query_weight" : 0.7,
"rescore_query_weight" : 1.2
}
}
}
- The way the scores are combined can be controlled with score_mode:
score_mode | Description |
---|---|
total | Add the original score and the rescore query score. The default. |
multiply | Multiply the original score by the rescore query score. Useful for function query rescores. |
avg | Average the original score and the rescore query score. |
max | Take the maximum of the original score and the rescore query score. |
min | Take the minimum of the original score and the rescore query score. |
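The table can be read as a small function of two numbers. The sketch below is an illustration in plain Python, not Elasticsearch code; the function name combine_scores is hypothetical, and it assumes that query_weight and rescore_query_weight are applied to each score before the mode combines them.

```python
def combine_scores(original, rescore, score_mode="total",
                   query_weight=1.0, rescore_query_weight=1.0):
    """Combine a first-pass score with a rescore query score.

    Assumption for this sketch: each score is multiplied by its weight
    first, then the weighted scores are combined according to score_mode.
    """
    a = original * query_weight
    b = rescore * rescore_query_weight
    if score_mode == "total":
        return a + b
    if score_mode == "multiply":
        return a * b
    if score_mode == "avg":
        return (a + b) / 2
    if score_mode == "max":
        return max(a, b)
    if score_mode == "min":
        return min(a, b)
    raise ValueError(f"unknown score_mode: {score_mode}")


# With the weights from the example request (0.7 and 1.2) and the default mode:
weighted_total = combine_scores(2.0, 1.0, query_weight=0.7,
                                rescore_query_weight=1.2)
```

With the default weights of 1, total reduces to a plain sum of the two scores, which matches the linear combination described earlier.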
Multiple rescores
- It is also possible to execute multiple rescores in sequence:
POST /_search
{
"query" : {
"match" : {
"message" : {
"operator" : "or",
"query" : "the quick brown"
}
}
},
"rescore" : [ {
"window_size" : 100,
"query" : {
"rescore_query" : {
"match_phrase" : {
"message" : {
"query" : "the quick brown",
"slop" : 2
}
}
},
"query_weight" : 0.7,
"rescore_query_weight" : 1.2
}
}, {
"window_size" : 10,
"query" : {
"score_mode": "multiply",
"rescore_query" : {
"function_score" : {
"script_score": {
"script": {
"source": "Math.log10(doc.count.value + 2)"
}
}
}
}
}
} ]
}
- The first rescore gets the results of the query, the second one gets the results of the first, and so on. The second rescore will "see" the sorting done by the first rescore, so it is possible to use a large window on the first rescore to pull documents into a smaller window for the second rescore.
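The effect of chaining windows can be sketched with a toy ranked list. This is a simplified model under stated assumptions, not Elasticsearch's implementation: the rescore function, the document ids, and all scores are invented for illustration, and hits outside the window are simply left in their previous order.

```python
def rescore(hits, window_size, secondary_score, combine):
    """Rescore the top window_size hits and re-sort only that window.

    hits is a list of (doc_id, score) pairs sorted by score, descending.
    Simplification: hits beyond the window are appended unchanged.
    """
    window, rest = hits[:window_size], hits[window_size:]
    rescored = [(doc, combine(score, secondary_score(doc)))
                for doc, score in window]
    rescored.sort(key=lambda hit: hit[1], reverse=True)
    return rescored + rest


# First-pass results of the main query (hypothetical scores).
first_pass = [("a", 5.0), ("b", 4.0), ("c", 3.0), ("d", 2.0), ("e", 1.0)]

# First rescore: large window, scores added (like score_mode "total").
phrase_scores = {"c": 3.0, "d": 0.5}
after_first = rescore(first_pass, 4,
                      lambda doc: phrase_scores.get(doc, 0.0),
                      lambda s, r: s + r)

# Second rescore: small window, scores multiplied (like "multiply").
boosts = {"a": 2.0, "c": 1.0}
after_second = rescore(after_first, 2,
                       lambda doc: boosts.get(doc, 1.0),
                       lambda s, r: s * r)

first_order = [doc for doc, _ in after_first]
final_order = [doc for doc, _ in after_second]
```

Document c starts in third place, outside a window of 2, but the first rescore lifts it to the top, so the second rescore gets to consider it.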