Voice assistant - Multi-turn dialogue (process implementation)

2022-06-12 07:32:00 【Turned_ MZ】

In this chapter, we look at the main process of multi-turn dialogue in a voice assistant, focusing on how semantic inheritance and ellipsis completion are implemented in the open domain. Ellipsis completion covers the case where the current query has no obvious intent on its own, but an intent can be recovered once the preceding context is taken into account. Semantic inheritance covers the case where the current turn does have an intent but is missing some slots, and the missing slot information can be obtained from the preceding context. The overall process is shown in the flow chart below:

Some modules are explained below ：

1、 User intent recognition

This module identifies the user's intent and is part of the normal (single-turn) pipeline. When the query has an intent, it is a candidate for semantic inheritance; when it has no intent, it is a candidate for ellipsis completion. The final decision is made by the modules that follow.

2、 Context association recognition

Purpose: context association recognition has two main goals:

To identify whether the current query is related to the preceding context, and if so, which earlier turn it is related to

U1: Set an alarm for 8 o'clock
U2: What's the weather like today? # not related to the preceding context
U3: Turn off the alarm I just set # not related to U2, related to U1

To identify the type of association

U1: What's the weather like in Shenzhen today?
U2: And tomorrow? # ellipsis completion
U3: Buy a train ticket to go there # semantic inheritance (when a demonstrative pronoun is present, this can be handled as anaphora resolution)

Implementation:

This module can be implemented with rules or with a model. On the model side, a relevance model such as the classic DSSM can be used. On the rule side, you can maintain a blacklist and whitelist of intent pairs that are allowed (or not allowed) to be associated across turns. Besides semantic relevance, the association decision also needs a time window: once a preceding turn falls outside the window, it is no longer considered valid context.
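The rule-based variant described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Turn` structure, the intent names, the `RELATED_INTENTS` whitelist and the 120-second window are all hypothetical and do not come from the original system. The history is assumed to be in chronological order.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Turn:
    intent: Optional[str]  # None when no intent was recognized
    timestamp: float       # seconds since the session started


# Hypothetical whitelist of (previous intent, current intent) pairs
# that are allowed to be associated across turns.
RELATED_INTENTS: Set[Tuple[str, str]] = {
    ("set_alarm", "close_alarm"),
    ("query_weather", "buy_train_ticket"),
}
TIME_WINDOW_SECONDS = 120.0  # assumed value, tune per product


def find_related_turn(history: List[Turn], current: Turn) -> Optional[int]:
    """Return the index of the most recent related turn, or None."""
    for index in range(len(history) - 1, -1, -1):
        previous = history[index]
        if current.timestamp - previous.timestamp > TIME_WINDOW_SECONDS:
            break  # this turn and all older ones are outside the window
        if previous.intent is None:
            continue
        if current.intent is None:
            # No intent in the current turn: treat the most recent turn
            # that has an intent as the ellipsis-completion candidate.
            return index
        if (previous.intent, current.intent) in RELATED_INTENTS:
            return index
    return None


u1 = Turn("set_alarm", 0.0)
u2 = Turn("query_weather", 10.0)
u3 = Turn("close_alarm", 20.0)
print(find_related_turn([u1], u2))      # U2 is unrelated to U1
print(find_related_turn([u1, u2], u3))  # U3 skips U2 and links to U1
```

A model-based version would replace the whitelist lookup with a relevance score between the current query and each earlier turn, while keeping the same time-window cut-off.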

3、 Slot alignment

Slot alignment refers to operations on slots, such as adding, replacing and deleting. Each operation can be represented as a triple (slot A, alignment operation O, slot B), meaning that the information in slot A is used to perform operation O on slot B. For example:

Add (slot inheritance): when a slot is empty, fill it in

U: Set an alarm for 8 o'clock tomorrow
U: Turn off the alarm # the time slot is empty, so the time slot from the previous turn is used

Replace: the current turn carries slot information that must replace the slot information from the preceding context

U: What's the weather like today?
U: What about tomorrow? # triggers ellipsis completion: the current turn carries time information, so the intent from the preceding turn is reused and its time slot is replaced with the current time

U: Where is Window of the World?
U: Navigate there. # triggers anaphora resolution: the current turn does have a destination slot, namely "there", which is replaced with the location slot from the preceding turn

Delete: remove slot information

U: Buy a train ticket to Beijing for 8 o'clock tomorrow
U: Forget it, I don't want to go to Beijing # the user deletes the destination slot
# Because the user deleted the destination slot, the slot is now missing. This triggers a slot inquiry and the dialogue enters closed-domain multi-turn mode.
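The three operations above can be sketched with the (slot A, operation O, slot B) triple as follows. This is a minimal illustration, not the original implementation: the slot names, the operation strings and the choice of which turn plays `source` or `target` are assumptions made for the example.

```python
from typing import Dict, Optional, Tuple

Slots = Dict[str, str]
# (slot A in the source frame, operation O, slot B in the target frame)
Alignment = Tuple[Optional[str], str, str]


def apply_alignment(target: Slots, source: Slots, triple: Alignment) -> Slots:
    """Use source[A] to perform operation O on target[B]; returns a new dict."""
    slot_a, operation, slot_b = triple
    result = dict(target)
    if operation == "add":
        # Slot inheritance: only fill the target slot when it is empty.
        if slot_b not in result and slot_a in source:
            result[slot_b] = source[slot_a]
    elif operation == "replace":
        if slot_a in source:
            result[slot_b] = source[slot_a]
    elif operation == "delete":
        result.pop(slot_b, None)
    else:
        raise ValueError(f"unknown alignment operation: {operation}")
    return result


# Add: "Turn off the alarm" inherits the time from the previous turn.
added = apply_alignment({}, {"time": "tomorrow 8:00"}, ("time", "add", "time"))

# Replace: "What about tomorrow?" overrides the inherited time slot.
replaced = apply_alignment({"time": "today"}, {"time": "tomorrow"},
                           ("time", "replace", "time"))

# Replace (anaphora): "there" is resolved with the previous location slot.
resolved = apply_alignment({"destination": "there"},
                           {"location": "Window of the World"},
                           ("location", "replace", "destination"))

# Delete: "I don't want to go to Beijing" removes the destination.
deleted = apply_alignment({"time": "tomorrow 8:00", "destination": "Beijing"},
                          {}, (None, "delete", "destination"))
print(added, replaced, resolved, deleted)
```

Returning a new dictionary rather than mutating `target` keeps each turn's original slots available, which is convenient if a later check rejects the alignment.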

Slot alignment must satisfy the following two conditions:

Consistent attributes: the two slots must have the same attribute, for example a location attribute or a time attribute. These can be subdivided further into destination, place of departure, start time, end time, and so on. Slots with inconsistent attributes cannot replace each other; for example, a person's name cannot be filled into a location slot.
Slot blacklist and whitelist: sometimes two slots have consistent attributes, but because of the nature of the intent or product requirements they should not be aligned. Such cases can be restricted with a blacklist and whitelist.
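The two conditions can be sketched as a single check that runs before any alignment is applied. The slot names, the attribute table and the blocked pair below are hypothetical examples, not taken from the original system.

```python
from typing import Dict, Set, Tuple

# Hypothetical mapping from fine-grained slot names to coarse attributes.
SLOT_ATTRIBUTE: Dict[str, str] = {
    "poi": "location",
    "destination": "location",
    "departure": "location",
    "alarm_time": "time",
    "departure_time": "time",
    "person_name": "person",
}

# Hypothetical blacklist: attributes match, but the product does not want
# an alarm time to be carried over as a ticket departure time.
BLOCKED_PAIRS: Set[Tuple[str, str]] = {("alarm_time", "departure_time")}


def can_align(source_slot: str, target_slot: str) -> bool:
    """Check attribute consistency first, then the blacklist."""
    source_attr = SLOT_ATTRIBUTE.get(source_slot)
    target_attr = SLOT_ATTRIBUTE.get(target_slot)
    if source_attr is None or source_attr != target_attr:
        return False
    return (source_slot, target_slot) not in BLOCKED_PAIRS


print(can_align("poi", "destination"))              # location -> location
print(can_align("person_name", "destination"))      # person -> location
print(can_align("alarm_time", "departure_time"))    # blocked by the list
```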

4、 Knowledge verification

Knowledge verification checks the validity of the slots about to be filled and judges whether the slot modification is reasonable. This can be done with a knowledge graph, verifying against the attributes of the entities in the slots. For example:

Rationality check: judge whether the slot is reasonable after filling

U: What's the weather like today? # suppose today is August 1
U: National Day # National Day is too far from August 1, so no weather information is available. Multi-turn inheritance should not be performed here; the assistant should instead answer with an encyclopedia entry for National Day or a calendar query.

Authenticity check: judge whether the information to be filled into the slot actually exists

U: I want to hear 《Blue and White Porcelain》
U: Du Fu's # Du Fu is a person's name, but he has never sung 《Blue and White Porcelain》, so the name should not be filled in

When knowledge verification finds an unreasonable slot, the knowledge conflict has to be handled, for example by an intent jump (switching to another intent), a slot reset (clearing or resetting the slot), or a multi-turn inquiry (adding a round of questioning to obtain a reasonable slot value).
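The two checks and the conflict handling can be sketched as below. Everything here is illustrative: the 15-day forecast horizon, the tiny in-memory `SONG_SINGERS` table standing in for a knowledge graph, and the action names are assumptions, not part of the original system.

```python
from datetime import date
from typing import Dict, Set

FORECAST_HORIZON_DAYS = 15  # assumed limit of the weather data source

# Toy stand-in for a knowledge graph: song title -> singers.
SONG_SINGERS: Dict[str, Set[str]] = {
    "Blue and White Porcelain": {"Jay Chou"},
}


def is_reasonable_weather_date(today: date, target: date) -> bool:
    """Rationality check: the target date must be inside the forecast range."""
    return 0 <= (target - today).days <= FORECAST_HORIZON_DAYS


def singer_matches_song(song: str, singer: str) -> bool:
    """Authenticity check: the singer must actually have sung the song."""
    return singer in SONG_SINGERS.get(song, set())


def resolve_weather_follow_up(today: date, target: date) -> str:
    """Inherit the weather intent, or jump to another intent on conflict."""
    if is_reasonable_weather_date(today, target):
        return "inherit_weather_intent"
    return "intent_jump"  # e.g. encyclopedia or calendar query


today = date(2022, 8, 1)
print(resolve_weather_follow_up(today, date(2022, 8, 2)))
print(resolve_weather_follow_up(today, date(2022, 10, 1)))
print(singer_matches_song("Blue and White Porcelain", "Du Fu"))
```

In a real system the failed authenticity check would feed the same conflict handler, which could choose a slot reset or a follow-up question instead of an intent jump.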

Some thoughts:

The scheme above modularizes the multi-turn process. The advantage is that both the process and its results are controllable, and individual modules can be upgraded independently, for example by replacing a module with a model to improve recall. Tightening the multi-turn strategy for particular scenarios is also easy. This process is enough to cover most multi-turn requirements.

Although the scheme above is practical, it is not very cool. An end-to-end model that achieves multi-turn understanding directly through query rewriting, or one that performs context association recognition and slot alignment jointly, would look more advanced.

The query rewriting approach is actually fairly simple: it can be implemented with an encoder-decoder model, and some models built for translation tasks can be reused for it.

Another approach is the one in the paper 《Incomplete Utterance Rewriting as Semantic Segmentation》. Its main idea is to borrow semantic segmentation from image processing and apply it to text, locating the associated slots and the replaceable spans in the context.