Elastic Stack - Elasticsearch - Official documentation - Filter search results
2022-07-02 02:32:00 【Whose blog is this?】
There is no perfect program in the world, but we are not discouraged, because writing programs is a process of constantly pursuing perfection.
- Hou's workshop
References
- Filter search results (official Elasticsearch documentation)
- For in-depth study, see the basic and advanced chapters
- For all column contents, see the column preview
Filter search results
- There are two ways to filter search results:
- Use a boolean query with a filter clause. Search requests apply boolean filters to both search hits and aggregations.
- Use the search API's post_filter parameter. Search requests apply post filters only to search hits, not to aggregations. You can use a post filter to calculate aggregations on a broader result set and then narrow the results further.
- You can also rescore hits after the post filter to improve relevance and reorder results.
Post filter
- When you use the post_filter parameter to filter search results, the search hits are filtered after the aggregations are calculated. A post filter has no effect on the aggregation results.
- For example, suppose you are selling shirts that have the following properties:
PUT /shirts
{
  "mappings": {
    "properties": {
      "brand": { "type": "keyword" },
      "color": { "type": "keyword" },
      "model": { "type": "keyword" }
    }
  }
}

PUT /shirts/_doc/1?refresh
{
  "brand": "gucci",
  "color": "red",
  "model": "slim"
}
- Suppose a user has specified two filters: color:red and brand:gucci. You only want to show them red shirts made by Gucci in the search results. Normally you would do this with a bool query:
GET /shirts/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "color": "red" } },
        { "term": { "brand": "gucci" } }
      ]
    }
  }
}
- However, you would also like to use faceted navigation to display a list of other options that the user could click on. Perhaps you have a model field that would allow the user to limit the search results to red Gucci t-shirts or dress-shirts.
- This can be done with a terms aggregation:
GET /shirts/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "color": "red" } },
        { "term": { "brand": "gucci" } }
      ]
    }
  },
  "aggs": {
    "models": {
      "terms": { "field": "model" }
    }
  }
}
- But perhaps you would also like to tell the user how many Gucci shirts are available in other colors. If you just add a terms aggregation on the color field, you will only get back the color red, because your query returns only red shirts by Gucci.
- Instead, you want to include shirts of all colors during aggregation, and then apply the color filter only to the search results. This is the purpose of post_filter:
GET /shirts/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": { "brand": "gucci" }
      }
    }
  },
  "aggs": {
    "colors": {
      "terms": { "field": "color" }
    },
    "color_red": {
      "filter": {
        "term": { "color": "red" }
      },
      "aggs": {
        "models": {
          "terms": { "field": "model" }
        }
      }
    }
  },
  "post_filter": {
    "term": { "color": "red" }
  }
}
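The order of operations in the request above can be mimicked with a small in-memory sketch. This is only an illustration, not how Elasticsearch is implemented: the SHIRTS list, the search function, and its lambda arguments are hypothetical stand-ins for the index, the search API, and the term filters. It shows that the main query (brand) feeds the aggregations, the color_red filter aggregation restricts only its own sub-aggregation, and the post filter narrows only the hits.

```python
from collections import Counter

# Hypothetical in-memory stand-in for the shirts index (sample data is made up).
SHIRTS = [
    {"brand": "gucci", "color": "red", "model": "slim"},
    {"brand": "gucci", "color": "red", "model": "dress"},
    {"brand": "gucci", "color": "blue", "model": "slim"},
    {"brand": "prada", "color": "red", "model": "slim"},
]


def search(docs, query, post_filter):
    """Mimic the order of operations: query -> aggregations -> post_filter."""
    matched = [d for d in docs if query(d)]
    aggs = {
        # Sees every document matched by the main query, whatever its color.
        "colors": Counter(d["color"] for d in matched),
        # Filter aggregation: the models sub-aggregation only sees red shirts.
        "color_red": Counter(d["model"] for d in matched if d["color"] == "red"),
    }
    # The post filter is applied to the hits only, after the aggregations.
    hits = [d for d in matched if post_filter(d)]
    return hits, aggs


hits, aggs = search(
    SHIRTS,
    query=lambda d: d["brand"] == "gucci",
    post_filter=lambda d: d["color"] == "red",
)
print(len(hits), dict(aggs["colors"]), dict(aggs["color_red"]))
```

With this sample data the colors aggregation still counts the blue Gucci shirt, while the hits contain only red Gucci shirts.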
Rescore filtered search results
- Rescoring can help improve precision by reordering only the top (for example, 100 - 500) documents returned by the query and post_filter phases, using a secondary (usually more costly) algorithm, instead of applying the costly algorithm to all documents in the index.
- A rescore request is executed on each shard before the shard returns its results to be sorted by the node handling the overall search request.
- Currently the rescore API has only one implementation: the query rescorer, which uses a query to adjust the scoring. In the future, alternative rescorers may be made available, for example, a pair-wise rescorer.
Note: An error is thrown if an explicit sort (other than _score in descending order) is provided together with a rescore query.
Note: When exposing pagination to your users, you should not change window_size as you step through each page (by passing different from values), because that can alter the top hits and cause results to shift confusingly as the user steps through pages.
Query rescorer
- The query rescorer executes a second query only on the top-K results returned by the query and post_filter phases. The number of documents examined on each shard can be controlled by the window_size parameter, which defaults to 10.
- By default, the scores from the original query and the rescore query are combined linearly to produce the final _score for each document. The relative importance of the original query and of the rescore query can be controlled with query_weight and rescore_query_weight respectively. Both default to 1.
- For example:
POST /_search
{
  "query": {
    "match": {
      "message": {
        "operator": "or",
        "query": "the quick brown"
      }
    }
  },
  "rescore": {
    "window_size": 50,
    "query": {
      "rescore_query": {
        "match_phrase": {
          "message": {
            "query": "the quick brown",
            "slop": 2
          }
        }
      },
      "query_weight": 0.7,
      "rescore_query_weight": 1.2
    }
  }
}
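The effect of window_size and the two weights can be sketched on a single shard's ranked hits. This is a simplified illustration under stated assumptions: rescore_window and its arguments are hypothetical names, the scores are made up, and the sketch returns only the rescored window without modeling what happens to hits beyond it.

```python
def rescore_window(hits, rescore_scores, window_size=10,
                   query_weight=1.0, rescore_query_weight=1.0):
    """Rescore the top window_size hits with a linear combination.

    hits: list of (doc_id, original_score), sorted by score descending.
    rescore_scores: doc_id -> score of the rescore query; a missing key
    means the rescore query did not match that document.
    """
    rescored = []
    for doc_id, score in hits[:window_size]:
        new_score = query_weight * score
        if doc_id in rescore_scores:
            new_score += rescore_query_weight * rescore_scores[doc_id]
        rescored.append((doc_id, new_score))
    # Reorder the window by the combined score.
    rescored.sort(key=lambda hit: hit[1], reverse=True)
    return rescored


# Made-up first-pass scores and phrase-match scores.
first_pass = [("a", 3.0), ("b", 2.5), ("c", 2.0), ("d", 1.0)]
phrase_scores = {"b": 0.5, "c": 2.0}

result = rescore_window(first_pass, phrase_scores, window_size=3,
                        query_weight=0.7, rescore_query_weight=1.2)
print([doc_id for doc_id, _ in result])
```

Document "d" lies outside the window of 3, so the rescore query never examines it, while "c" moves to the top because the phrase match adds 1.2 * 2.0 to its weighted original score.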
- The way the scores are combined can be controlled with score_mode:
score_mode | Description |
---|---|
total | Add the original score and the rescore query score. The default. |
multiply | Multiply the original score by the rescore query score. Useful for function query rescores. |
avg | Average the original score and the rescore query score. |
max | Take the maximum of the original score and the rescore query score. |
min | Take the minimum of the original score and the rescore query score. |
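The table can be read as a small lookup of combining functions. The sketch below is illustrative only: SCORE_MODES and combine are hypothetical names, and it assumes that query_weight and rescore_query_weight are applied to each score before the mode combines them.

```python
# One combining function per score_mode value from the table above.
SCORE_MODES = {
    "total": lambda original, rescore: original + rescore,
    "multiply": lambda original, rescore: original * rescore,
    "avg": lambda original, rescore: (original + rescore) / 2,
    "max": max,
    "min": min,
}


def combine(original, rescore, score_mode="total",
            query_weight=1.0, rescore_query_weight=1.0):
    """Combine two scores for a document matched by the rescore query."""
    # Assumption: each score is weighted first, then the mode is applied.
    return SCORE_MODES[score_mode](query_weight * original,
                                   rescore_query_weight * rescore)


for mode in SCORE_MODES:
    print(mode, combine(2.0, 3.0, score_mode=mode))
```

With an original score of 2.0 and a rescore query score of 3.0, the modes differ only in the final arithmetic step, which makes it easy to compare them on real score pairs before choosing one.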
Multiple rescores
- It is also possible to execute multiple rescores in sequence:
POST /_search
{
  "query": {
    "match": {
      "message": {
        "operator": "or",
        "query": "the quick brown"
      }
    }
  },
  "rescore": [
    {
      "window_size": 100,
      "query": {
        "rescore_query": {
          "match_phrase": {
            "message": {
              "query": "the quick brown",
              "slop": 2
            }
          }
        },
        "query_weight": 0.7,
        "rescore_query_weight": 1.2
      }
    },
    {
      "window_size": 10,
      "query": {
        "score_mode": "multiply",
        "rescore_query": {
          "function_score": {
            "script_score": {
              "script": {
                "source": "Math.log10(doc.count.value + 2)"
              }
            }
          }
        }
      }
    }
  ]
}
- The first rescore gets the results of the query, the second one gets the results of the first, and so on. The second rescore "sees" the ordering produced by the first rescore, so you can use a large window on the first rescore to pull documents into a smaller window for the second rescore.
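The chaining described above can be sketched as two passes over the same ranked list. Everything here is hypothetical illustration: apply_rescorer, the scores, and the count values are made up, and hits beyond a window are simply appended unchanged, which is a simplification rather than a description of Elasticsearch internals.

```python
import math


def apply_rescorer(hits, window_size, score_fn):
    """Rescore the top window_size hits with score_fn and reorder them.

    Simplification: hits beyond the window are appended as they are.
    """
    window = [(doc_id, score_fn(doc_id, score))
              for doc_id, score in hits[:window_size]]
    window.sort(key=lambda hit: hit[1], reverse=True)
    return window + hits[window_size:]


# Made-up data: first-pass scores, phrase-match scores, and a count field.
first_pass = [("a", 3.0), ("b", 2.0), ("c", 1.0)]
phrase_scores = {"b": 1.0, "c": 2.0}
counts = {"a": 8, "b": 998, "c": 98}

# First rescore: large window, weighted sum (0.7 * original + 1.2 * phrase).
after_first = apply_rescorer(
    first_pass, 100,
    lambda doc_id, score: 0.7 * score + 1.2 * phrase_scores.get(doc_id, 0.0),
)

# Second rescore: smaller window taken from the first rescore's ordering,
# multiplying by log10(count + 2) as in the script_score above.
after_second = apply_rescorer(
    after_first, 2,
    lambda doc_id, score: score * math.log10(counts[doc_id] + 2),
)

print([doc_id for doc_id, _ in after_first])
print([doc_id for doc_id, _ in after_second])
```

Document "a" ranks first in the original query but falls outside the second window, because the second rescore selects its top 2 from the ordering produced by the first rescore.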