Voice assistant -- QU -- query error correction and rewriting
2022-06-12 07:32:00 【Turned_MZ】
In this chapter, let's take a look at query error correction and rewriting in the voice assistant's QU layer.
Why error correction is needed:
Because the vast majority of voice-assistant queries come from spoken conversations, the output of the ASR module can be wrong due to extra audio being captured, audio being missed, or words being misrecognized, so the query fed into the NLU layer is wrong. A wrong query directly affects downstream recognition, causes the final execution result to be wrong, and hurts the user experience. For example:
- (wrong) "打开和平静音" (turn on "Peace Mute") -> (correct) "打开和平精英" (open Peace Elite, a game title).
- (wrong) "第一个7点的闹钟" (the first 7 o'clock alarm) -> (correct) "定一个7点的闹钟" (set an alarm for 7 o'clock).
- (wrong) "定一个7点的孬种" (set a 7 o'clock "coward") -> (correct) "定一个7点的闹钟" (闹钟 "alarm clock" misheard as its near-homophone 孬种).
Moreover, ASR applies its own language model to smooth the query, so what reaches NLU is already a smoothed result. Doing correction on the NLU side is harder because the user's original audio is unavailable. At the same time, given online real-time and stability requirements, this error correction must be high-performance.
Solution:
The query error-correction solution mainly consists of several steps: confusion mining, data cleaning, error detection, candidate recall, candidate ranking, and post-processing. Depending on quality and real-time requirements, the work can also be split between an online side and an offline side.
Each module is briefly described below.
1. Confusion mining and data cleaning
The purpose of this step is mainly to build confusion-word and confusion-phrase databases, providing database support for the subsequent error-detection and candidate-recall stages.
To mine potentially confusable words and phrases, new-word discovery can be used: mine strings with high internal cohesion and rich neighboring characters (new-word discovery methods are not covered here). Beyond that, confusable words and phrases can also be mined using near-homophone similarity, semantic similarity, and pinyin-string mining. The discovered new words plus the mined confusable words, after being cleaned in various ways, form a confusion dictionary, a pinyin trie, and so on, which serve as the basic databases (a minimal sketch follows).
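As a concrete reference, here is a minimal Python sketch of those two basic databases: a confusion dictionary and a pinyin trie. It assumes the pypinyin package for grapheme-to-pinyin conversion; all entries are toy examples, not data from the original system.

```python
# A minimal sketch of the basic databases: a confusion dictionary and a
# pinyin trie. Assumes the `pypinyin` package; all entries are toy examples.
from collections import defaultdict
from pypinyin import lazy_pinyin

# Confusion dictionary: wrong phrase -> candidate correct phrases.
confusion_dict = {
    "和平静音": ["和平精英"],  # "Peace Mute" -> "Peace Elite" (game title)
    "孬种": ["闹钟"],          # near-homophone of "alarm clock"
}

class PinyinTrie:
    """Trie keyed on pinyin syllables, mapping pronunciations to words."""
    def __init__(self):
        self.children = defaultdict(PinyinTrie)
        self.words = []  # words whose pinyin ends exactly at this node

    def insert(self, word):
        node = self
        for syllable in lazy_pinyin(word):
            node = node.children[syllable]
        node.words.append(word)

    def lookup(self, syllables):
        node = self
        for syllable in syllables:
            if syllable not in node.children:
                return []
            node = node.children[syllable]
        return node.words

trie = PinyinTrie()
for w in ["闹钟", "孬种", "和平精英"]:
    trie.insert(w)

# Both words share the toneless pinyin "nao zhong".
print(trie.lookup(["nao", "zhong"]))  # ['闹钟', '孬种']
```

At recall time, such a trie lets the system map the pinyin of a suspicious span back to every word that shares (or nearly shares) its pronunciation.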
2. Error detection
This step performs error detection on the input query: deciding whether an error is likely present and, if so, where. Common methods include:
- Confusion-dictionary matching: after segmenting the query, check whether any token matches an entry in the confusion dictionary.
- An ngram language model: compute the co-occurrence probability of each word with its left and right neighbors in the sentence. If that probability is very low, the word is very likely wrong. This assumes that in the full-sentence data, correct co-occurrences far outnumber wrong ones; otherwise the error cannot be detected (see the sketch after this list).
- Pinyin CBOW: introduce pinyin vectors, feed the pinyin vectors of the target word's context into a CBOW model to predict the target word's pinyin vector, then compare the prediction with the word's actual pinyin vector; low similarity suggests an error.
- BERT as a language model: directly predict the probability of each character in the sentence. The effect of this method is hard to control, and BERT is slow, so it is rarely used in online real-time correction; it can, however, be used for offline mining.
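Below is a minimal sketch of the ngram detection idea using smoothed character bigrams. The corpus, smoothing constant, and threshold are toy values chosen for illustration, not from the original article.

```python
# A minimal sketch of ngram-based error detection: flag characters whose
# co-occurrence probabilities with BOTH neighbors are low. The corpus,
# smoothing constant, and threshold are toy values for illustration.
from collections import Counter

corpus = ["定一个七点的闹钟", "打开和平精英", "定一个八点的闹钟"]

unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    unigrams.update(sent)
    bigrams.update(zip(sent, sent[1:]))

def bigram_prob(a, b, alpha=0.01):
    """Add-alpha smoothed P(b | a)."""
    return (bigrams[(a, b)] + alpha) / (unigrams[a] + alpha * len(unigrams))

def detect(query, threshold=0.1):
    """Return positions whose left AND right co-occurrence probs are low."""
    suspects = []
    for i, ch in enumerate(query):
        left = bigram_prob(query[i - 1], ch) if i > 0 else 1.0
        right = bigram_prob(ch, query[i + 1]) if i < len(query) - 1 else 1.0
        if left < threshold and right < threshold:
            suspects.append(i)
    return suspects

print(detect("定一个七点的孬钟"))  # [6] -> the character "孬" is suspicious
```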
3. Candidate recall
This step mainly finds the correct word for each detected wrong word. For example, in "和平静音" (Peace Mute), the token "静音" (mute) should be "精英" (elite). Various methods are used to find the correct words corresponding to a wrong word, together with their probabilities, and the results are then ranked jointly. Available features and options include:
- Confusion dictionary: look the detected wrong word up in the confusion dictionary to find the correct word.
- Language model: use a bidirectional 2-gram to predict high-probability candidate words for the erroneous position.
- Pinyin CBOW: use the context pinyin vectors with CBOW to predict the pinyin vector at the erroneous position, and from it the corresponding candidate words.
- DTW near-homophone similarity: before introducing DTW, it is worth knowing the open-source library dimsim, which measures the similarity of two Chinese pronunciations, mainly by taking inner products of initial (shengmu) and final (yunmu) vectors. It requires, however, that the two strings being compared have the same length, whereas in practice ASR may pick up extra or fewer syllables, so we need to compare the pinyin similarity of strings of different lengths. This is where DTW, i.e. Dynamic Time Warping, comes in. Its core idea is dynamic programming: a point at one time step of one sequence may align with multiple consecutive points of the other. It is not explained further here; a quick Google search will turn up plenty of material. Matching by near-homophone similarity lets us find more similar words in the confusion dictionary (see the sketch after this list).
- BERT language model: directly use BERT to predict the correct word for the erroneous position.
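Here is a minimal DTW sketch over pinyin syllable sequences, so that strings of different lengths (ASR insertions or deletions) can still be compared. The per-syllable cost below is plain edit distance between syllable strings; it stands in for a proper initial/final phonetic distance such as the one dimsim provides.

```python
# A minimal DTW sketch over pinyin syllable sequences. The per-syllable
# cost is plain edit distance -- a stand-in for a real initial/final
# (shengmu/yunmu) phonetic distance such as dimsim's.
from pypinyin import lazy_pinyin

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two pinyin syllables."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # delete from a
                                     dp[j - 1] + 1,      # insert into a
                                     prev + (ca != cb))  # substitute
    return dp[-1]

def dtw_pinyin(w1: str, w2: str) -> float:
    """DTW alignment cost between the pinyin of two Chinese strings."""
    s, t = lazy_pinyin(w1), lazy_pinyin(w2)
    INF = float("inf")
    dp = [[INF] * (len(t) + 1) for _ in range(len(s) + 1)]
    dp[0][0] = 0.0
    for i in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            cost = edit_distance(s[i - 1], t[j - 1])
            dp[i][j] = cost + min(dp[i - 1][j],      # extra syllable in w1
                                  dp[i][j - 1],      # extra syllable in w2
                                  dp[i - 1][j - 1])  # syllables aligned
    return dp[len(s)][len(t)]

print(dtw_pinyin("孬种", "闹钟"))        # 0.0: identical toneless pinyin
print(dtw_pinyin("和平静音", "和平精英"))  # small cost: near-homophones
```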
4. Candidate ranking
This step takes the potentially correct words found above and ranks them with all features combined. Available features include: edit distance, near-homophone similarity, language-model probability, black/white lists, and the PPL score.
A word on the PPL score: perplexity measures how well a probability distribution or model predicts a sample. Put simply, it measures how much a sentence reads like natural language; the lower the score, the better (see the sketch below).
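For reference, here is a minimal sketch of sentence perplexity under a bigram language model; `bigram_prob` is a placeholder for whatever smoothed language model the system actually uses.

```python
# A minimal sketch of sentence perplexity (PPL):
#   PPL = exp(-(1/N) * sum_i log P(w_i | w_{i-1}))
# Lower PPL means the sentence looks more like natural language.
# `bigram_prob` stands in for any smoothed (non-zero) language model.
import math

def perplexity(query, bigram_prob):
    log_prob = 0.0
    for i in range(1, len(query)):
        log_prob += math.log(bigram_prob(query[i - 1], query[i]))
    n = max(len(query) - 1, 1)
    return math.exp(-log_prob / n)

# Among rewrite candidates, prefer the one with the lowest PPL, e.g.:
# best = min(candidates, key=lambda q: perplexity(q, bigram_prob))
```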

Take the candidate words recalled in step 3, compute each candidate's score on each feature dimension from step 4, and then use gradient boosted decision trees (GBDT) for the final ranking. Of course, you can also use XGBoost, which tends to work better; a later article will compare GBDT and XGBoost (a minimal ranking sketch follows).
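As a minimal illustration of the ranking step, the sketch below trains scikit-learn's GBDT on made-up feature rows (edit distance, phonetic similarity, LM log-probability, PPL, whitelist flag); a real system would train on labeled correction data.

```python
# A minimal sketch of candidate ranking with a GBDT. Each candidate is a
# feature vector: [edit_distance, phonetic_similarity, lm_log_prob, ppl,
# whitelist_flag]. All numbers below are made up for illustration.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

X_train = np.array([
    [1, 0.95,  -3.2,  40.0, 1],  # a good correction
    [3, 0.40,  -9.8, 310.0, 0],  # a bad correction
    [1, 0.90,  -4.1,  55.0, 0],  # a good correction
    [4, 0.20, -12.5, 500.0, 0],  # a bad correction
])
y_train = np.array([1, 0, 1, 0])  # 1 = correct rewrite, 0 = wrong rewrite

ranker = GradientBoostingClassifier(n_estimators=50, max_depth=3)
ranker.fit(X_train, y_train)

# At serving time, score every recalled candidate and keep the best one.
candidates = np.array([[1, 0.92, -3.5,  45.0, 1],
                       [2, 0.60, -7.0, 120.0, 0]])
scores = ranker.predict_proba(candidates)[:, 1]
print(int(scores.argmax()))  # index of the top-ranked candidate
```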
Extended ideas:
Several of the modules above mention BERT. In fact, BERT can be used directly to build an end-to-end translation model: treat error correction as a translation task, where the input is the wrong query and the output is the corrected query. However, the effect of this approach is hard to control, and the risk of doing it online in real time is high: once this step makes a wrong correction, the query becomes even harder to understand, and the error cascades like a butterfly effect into downstream tasks. Of course, if the application scenario resembles a search engine, where a candidate result is offered for the user to choose rather than executed directly, this is also a good choice (a hedged sketch follows).
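As a sketch of this end-to-end framing, a correction model can be run like any seq2seq model via Hugging Face transformers. The checkpoint name below is hypothetical: it stands for a model you would fine-tune yourself on (wrong query -> corrected query) pairs.

```python
# A sketch of the end-to-end "translation" framing with Hugging Face
# transformers. The checkpoint name is HYPOTHETICAL: it stands for a
# seq2seq model fine-tuned on (wrong query -> corrected query) pairs.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL = "your-org/query-correction-seq2seq"  # hypothetical checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL)

def correct(query: str) -> str:
    inputs = tokenizer(query, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=32, num_beams=4)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Safer as a suggestion ("did you mean ...?") than as a silent rewrite,
# since an uncontrolled correction cascades errors downstream.
print(correct("定一个七点的孬种"))
```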