当前位置:网站首页>Generalized semantic recognition based on semantic similarity

Generalized semantic recognition based on semantic similarity

2022-06-12 07:33:00 Turned_ MZ

         In the previous chapter , Here we are , For a vertical region BOT The identification of , There will be classification models 、 Intention slot model to identify its corresponding semantics , But this is generally aimed at those that are already mature ( That is, those who have accumulated certain data ) Scene can be done , For the following three scenarios , This approach is not applicable :

  1. Scene cold start , A new scene , There is no script corresponding to the scene on the line , Generally, we will use the template matching method for the cold start problem , But its generalization ability is limited , How to improve its generalization ability ?
  2. Vertical domain BOT Unrecognized script , That is, the previous chapter 《 Potential skills and uncalled calls 》 The uncalled script mentioned in , In this part of the script, we will add it to the existing scene classification 、 Intention slot model , But there is a period for the algorithm to iterate and then go online , Besides algorithm training , It has to go through strict tests , How can we do more timely identification ?
  3. For some festivals or activities , Some skills will be configured for the operation of the student union as festival eggs , But the scripts they usually configure are quite rigid , such as :“ Set off a fireworks ”, How can I recognize “ Have a fireworks ”,“ Set off some fireworks ” This kind of script ?

Answer the above question , We build a generalized semantic recognition system based on semantic similarity . Here's the picture :

 

        This system is divided into offline part and online part , The offline part collects scripts of operation configuration 、 Scenario standard script of product definition 、BOT Medium TOP Script together , As a standard script library , Stored as query-intent Data on , These scripts are also stored in ES In the database , Use these data to train BERT, Make the same intent Of query The vector is closer , The training method is not expanded here , You can use comparative learning or the twin tower model to train . After the training, we can get the pre trained BERT, To get the corresponding query Semantic vector of .

        The online part is a real-time recognition algorithm , For the user query, First go through the defined BOT Semantic recognition , For unrecognized query, after ES A preliminary search of the database , For the retrieved query, Use pre trained BERT Get the semantic vector , At the same time, the user query We also get the semantic vector , The vectors obtained by both sides are matched by similarity , The final results are sorted based on the score threshold .

        

原网站

版权声明
本文为[Turned_ MZ]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/03/202203010556460349.html