Query word weight, search word weight calculation
2022-07-02 02:15:00 【AI Zeng Xiaojian】
Query term weighting computes the importance of each term after a query has been segmented. A commonly used measure is tf*idf. Within a query, a term's tf is almost always 1, so the weight is driven by idf: the more often a term appears in the corpus, the less information it carries, and the less often it appears, the more information it carries. However, a term's importance is not strictly monotonic in its number of occurrences, and idf takes no account of context. For example, "windows" is quite important in "windows application software" but relatively unimportant in "windows xp system iphone xs import photos". As a basic resource, term weighting plays an important role in tasks such as text relevance and term dropping. Methods for improving it fall into three main categories:
1) Based on corpus statistics
2) Based on click logs
3) Based on supervised learning
This article first introduces some calculation methods based on corpus statistics.
I. imp (short for importance)
One drawback of idf is that it relies solely on word frequency. imp instead starts from the share of importance that each term holds within the queries containing it, and refines the static term weights through iterative calculation. The calculation process is as follows:

Here BT is the imp value of a term, whose initial value can be set to 1; Tmp_i is the share of importance held by the i-th term within a query; and N is the number of queries that contain the i-th term.
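The original formula is not preserved here, so the following is only a minimal sketch of one iteration scheme that is consistent with the definitions above. It assumes that a term's share within a query is its current imp value divided by the sum of the imp values of all terms in that query, and that the new imp value is the average of those shares over the N queries containing the term. The function name `compute_imp`, the `iterations` parameter, and this exact update rule are illustrative assumptions, not the author's stated method.

```python
from collections import defaultdict


def compute_imp(queries, iterations=10):
    """Iteratively estimate a static imp weight for every term.

    queries: list of segmented queries, each a list of term strings.
    Assumed update rule (not taken from the source):
      share(term, query) = imp[term] / sum of imp over the query's terms
      imp[term]          = mean of share over the queries containing term
    """
    # Deduplicate terms inside each query and drop empty queries.
    cleaned = [list(dict.fromkeys(q)) for q in queries if q]

    # Every term starts with an imp value of 1, as the text suggests.
    imp = {term: 1.0 for q in cleaned for term in q}

    for _ in range(iterations):
        share_sum = defaultdict(float)  # sum of shares per term
        query_count = defaultdict(int)  # N: queries containing the term
        for q in cleaned:
            total = sum(imp[term] for term in q)
            for term in q:
                share_sum[term] += imp[term] / total
                query_count[term] += 1
        imp = {term: share_sum[term] / query_count[term] for term in imp}
    return imp


if __name__ == "__main__":
    sample = [["windows", "software"], ["windows", "xp"], ["software"]]
    for term, weight in compute_imp(sample, iterations=5).items():
        print(term, round(weight, 4))
```

Under this assumed rule, a term that often shares a query with heavier terms loses weight over the iterations, while a term that frequently stands alone keeps a high share.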
II. DIMP (Dynamic imp)
A common disadvantage of idf and imp is that both assign static weights. DIMP weights each term dynamically according to the context of the query. Its main assumption is that the term weights in any query can be determined from the term weights of its related queries. The calculation process can be divided into two parts:
1) Top-down query tree construction
Different construction methods are adopted depending on the actual scenario; here is one approach for search. Given a query as the root node, first retrieve its related queries as the second-layer nodes. Then, for each second-layer node, enumerate the child queries of that related query as the third-layer nodes. The last layer consists of the term nodes obtained after word segmentation. The nodes of the query tree are therefore text strings of different granularity, and the edges are the correlations between those text strings. In the bid-word recommendation task, user queries are relatively short keywords, so the corresponding query tree can be built from the co-purchase relationships between bid words.
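The layered construction described above can be sketched as follows. This is an illustration under stated assumptions: the `related` dictionary stands in for whatever related-query lookup a real system provides, whitespace splitting stands in for a real word segmenter, and `Node` and `build_query_tree` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    text: str  # the text string held by this node
    kind: str  # "query" or "term"
    children: list = field(default_factory=list)


def build_query_tree(query, related, segment=str.split, max_depth=2, _path=()):
    """Build a top-down query tree.

    related:   dict mapping a query to its related or child queries
               (a hypothetical stand-in for a real lookup service).
    segment:   word segmenter; whitespace splitting is only a placeholder.
    max_depth: number of query layers below the root (2 gives three
               query layers, followed by the term layer).
    """
    node = Node(query, "query")
    path = _path + (query,)
    if len(_path) < max_depth:
        for candidate in related.get(query, []):
            if candidate not in path:  # avoid cycles along this branch
                node.children.append(
                    build_query_tree(candidate, related, segment, max_depth, path)
                )
    if not node.children:
        # The bottom of a branch is segmented into term leaves.
        node.children = [Node(term, "term") for term in segment(query)]
    return node


def show(node, indent=0):
    """Print the tree with one node per line."""
    print(" " * indent + f"[{node.kind}] {node.text}")
    for child in node.children:
        show(child, indent + 2)


if __name__ == "__main__":
    related = {
        "windows software": ["windows office", "pc tools"],
        "windows office": ["office 2019"],
    }
    show(build_query_tree("windows software", related))
```

In the bid-word setting, the same function could be fed a `related` mapping derived from co-purchase relationships instead of search-related queries.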