Query term weighting: computing search term weights
2022-07-02 02:15:00 【AI Zeng Xiaojian】
Query term weighting computes the importance of each term after a query has been segmented into words. A commonly used measure is tf*idf (within a query, a term's tf is 1 in most cases): the more often a term occurs, the less information it carries, and the less often it occurs, the more information it carries. However, a term's importance is not strictly monotonic in its number of occurrences, and idf takes no account of context. For example, "windows" is quite important in "windows application software" but relatively unimportant in "windows xp system iphone xs guide photos". As a basic resource, term weights play an important role in text relevance, term dropping, and other tasks. The methods for optimizing them fall into three categories:
1) Based on corpus statistics
2) Based on click logs
3) Based on supervised learning
This article first introduces several computation methods based on corpus statistics.
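As a baseline for the corpus-statistics methods, the idf weighting mentioned above can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names `idf_weights` and `weight_query`, the smoothed idf formula, and the normalization of weights to sum to 1 are all assumptions made for the example.

```python
import math
from collections import Counter


def idf_weights(queries):
    """Compute a smoothed idf for every term in a corpus of segmented queries.

    `queries` is a list of token lists. The smoothing
    log((n + 1) / (df + 1)) + 1 is an assumption chosen so that
    every term gets a positive weight.
    """
    n = len(queries)
    # Document frequency: number of queries that contain the term at least once.
    df = Counter(t for q in queries for t in set(q))
    return {t: math.log((n + 1) / (df_t + 1)) + 1 for t, df_t in df.items()}


def weight_query(query, idf):
    """Return tf*idf weights for one segmented query, normalized to sum to 1.

    Terms missing from `idf` get weight 0 (an assumption for this sketch).
    """
    tf = Counter(query)
    raw = {t: tf[t] * idf.get(t, 0.0) for t in tf}
    total = sum(raw.values())
    return {t: (w / total if total else 0.0) for t, w in raw.items()}


if __name__ == "__main__":
    corpus = [["windows", "xp"], ["windows", "software"], ["iphone", "xs"]]
    idf = idf_weights(corpus)
    print(weight_query(["windows", "xp"], idf))
```

Because the weight of "windows" depends only on how often it appears in the corpus, it receives the same idf in every query, which is exactly the context-blindness described above.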
I. imp (short for importance)
One drawback of idf is that it relies solely on comparing word frequencies. imp starts instead from the share of importance each term holds within a query, and optimizes the static term weights by iterative calculation.
The calculation uses the following quantities: BT is the imp value of a term, and its initial value can be set to 1; Tmp_i is the share of importance held by the i-th term of a query; N is the number of queries that contain the i-th term.
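The update formula itself is not given in the text, so the sketch below is only one plausible reading of the description: in each round, a term's share within a query is its current imp divided by the sum of imp values in that query (Tmp_i), and its new imp (BT) is the average of those shares over the N queries containing it. The function name `compute_imp`, the fixed iteration count, and this exact update rule are assumptions.

```python
from collections import defaultdict


def compute_imp(queries, iterations=10, init=1.0):
    """Iteratively estimate a static importance (imp) value per term.

    `queries` is a list of token lists. The update rule is an assumed
    reading of the description, not a confirmed formula:
      share of term i in query q = imp[i] / sum of imp over q's terms
      new imp[i]                 = mean of that share over the N queries
                                   containing term i
    """
    imp = {t: init for q in queries for t in q}
    for _ in range(iterations):
        share_sum = defaultdict(float)  # sum of Tmp_i over queries containing the term
        count = defaultdict(int)        # N: number of queries containing the term
        for q in queries:
            total = sum(imp[t] for t in q)
            if total == 0:
                continue
            for t in set(q):
                share_sum[t] += imp[t] / total
                count[t] += 1
        # Terms that were never updated keep their previous value.
        imp = {t: (share_sum[t] / count[t] if count[t] else imp[t]) for t in imp}
    return imp


if __name__ == "__main__":
    corpus = [["windows"], ["windows", "xp"], ["xp", "wallpaper"]]
    print(compute_imp(corpus, iterations=1))
```

Under this reading, a term that often appears alone or next to weak terms keeps a high share, while a term that always appears next to stronger terms drifts downward. The result is still a single static value per term.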
II. DIMP (Dynamic imp)
A shortcoming shared by idf and imp is that both assign weights statically. DIMP assigns each term a weight dynamically, according to the context of the query. Its main assumption is that the term weights in any query can be determined from the term weights of its related queries. The calculation can be divided into two parts:
1) Top-down query tree construction
The construction method depends on the actual scenario; the one given here is for search. As shown in the figure, the given query is the root node. The queries related to it are fetched first and become the second-layer nodes. Starting from the second layer, the sub-queries of each related query are enumerated and become the third-layer nodes. The last layer consists of the term nodes produced by word segmentation. The nodes of the query tree are therefore text strings of different granularity, and the edges are the relations between those strings. In the bid keyword (auction word) recommendation task, user queries are relatively short keywords, so the corresponding query tree can be built from the co-purchase relations between bid keywords.
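The top-down construction described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `related` mapping (query to related queries or sub-queries) stands in for whatever retrieval or co-purchase source is actually used, `Node` and `build_query_tree` are hypothetical names, whitespace splitting stands in for a real word segmenter, and attaching terms directly when a node has no related queries is a choice made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    text: str
    kind: str  # "query" or "term"
    children: list = field(default_factory=list)


def build_query_tree(query, related, tokenize=str.split):
    """Build a query tree: root -> related queries -> sub-queries -> terms.

    `related` is a hypothetical dict mapping a query string to a list of
    related (or sub-) query strings. When a node has no entry in `related`,
    its segmented terms are attached directly (an assumption of this sketch).
    """

    def term_nodes(text):
        return [Node(t, "term") for t in tokenize(text)]

    root = Node(query, "query")
    second_layer = related.get(query, [])
    if not second_layer:
        root.children = term_nodes(query)
        return root

    for rel in second_layer:
        rel_node = Node(rel, "query")
        sub_queries = related.get(rel, [])
        if sub_queries:
            # Third layer: sub-queries, each ending in its own term leaves.
            for sub in sub_queries:
                rel_node.children.append(Node(sub, "query", term_nodes(sub)))
        else:
            rel_node.children = term_nodes(rel)
        root.children.append(rel_node)
    return root


if __name__ == "__main__":
    related = {
        "windows xp system": ["windows xp", "xp system"],
        "windows xp": ["windows"],
    }
    tree = build_query_tree("windows xp system", related)
    print([child.text for child in tree.children])
```

The tree has a fixed depth, so a related-query source that contains cycles cannot cause unbounded recursion.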