Voice assistant - overall architecture and design

2022-06-12 07:31:00 【Turned_ MZ】

In this chapter, let's take a look at the overall architecture and design of a voice assistant.

In general, a reasonably complete voice assistant can be divided into two parts: the central control and the BOTs. Each BOT is essentially a service that can run independently; it contains its own central control and is, internally, a small self-contained system. The central control exists to handle processing that is common to all BOTs, and to provide functions such as BOT dispatch and ranking, as shown in the figure below:

The blue part of the figure represents the individual BOTs. The internal design of a BOT differs by system type, and there are three typical kinds: chitchat BOTs, task BOTs, and question-answering BOTs. The internal design of each BOT is covered in more detail in later chapters; here we focus on the design of the central control system. The rest of the figure comprises QU, the operation intervention layer, dialogue management, and the post-processing strategy layer. The functions of each module are briefly described below:
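Before the per-module descriptions, the order in which a query passes through these modules can be sketched as a chain of stages. This is a minimal illustration, not the author's implementation: the stage functions, the `ctx` dictionary keys, and the BOT naming scheme are all assumptions made for the example.

```python
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]
Stage = Callable[[Context], Context]


def qu(ctx: Context) -> Context:
    # Toy semantic understanding: a single keyword decides the intent.
    ctx["intent"] = "alarm" if "alarm" in ctx["query"].lower() else "chitchat"
    return ctx


def operation_intervention(ctx: Context) -> Context:
    # No intervention rule matches in this toy example.
    ctx.setdefault("intervened", False)
    return ctx


def dialogue_management(ctx: Context) -> Context:
    # Dispatch: pick the BOT that matches the intent (hypothetical naming).
    ctx["bot"] = f"{ctx['intent']}_bot"
    return ctx


def vertical_bot(ctx: Context) -> Context:
    # Stand-in for the chosen BOT producing a skill result.
    ctx["result"] = f"{ctx['bot']} handled: {ctx['query']}"
    return ctx


def post_processing(ctx: Context) -> Context:
    # Package the result for the client; recommendations left empty here.
    ctx["response"] = {"text": ctx["result"], "recommendations": []}
    return ctx


PIPELINE: List[Stage] = [
    qu,
    operation_intervention,
    dialogue_management,
    vertical_bot,
    post_processing,
]


def run(query: str) -> Context:
    ctx: Context = {"query": query}
    for stage in PIPELINE:
        ctx = stage(ctx)
    return ctx
```

Calling `run("Set an alarm for 8")` walks the query through all five stages and leaves the selected BOT and the packaged response in the returned context.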

I. QU:

QU performs basic semantic understanding: it applies general-purpose analysis to the input query, including entity recognition, text classification, semantic role labeling, semantic retrieval, and text rewriting. The central control and the BOTs then act on the results. For example, semantic role labeling, text classification, semantic retrieval, and entity recognition can feed BOT dispatch and ranking in dialogue management, while text rewriting corrects errors in the query to improve downstream recognition.
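As a rough illustration of what a QU result might carry, the sketch below bundles a rewritten query, an intent label, and extracted entities into one structure. The rewrite table, keyword lists, and field names are hypothetical, and real systems would use trained models rather than keyword and regex matching.

```python
import re
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical lookup tables, for illustration only.
REWRITES = {"alram": "alarm"}
INTENT_KEYWORDS = {
    "alarm": ["alarm", "wake me"],
    "music": ["play", "song"],
}


@dataclass
class QUResult:
    original_query: str
    rewritten_query: str
    intent: str
    entities: Dict[str, str] = field(default_factory=dict)


def understand(query: str) -> QUResult:
    # Text rewriting: word-level correction from the lookup table.
    words = [REWRITES.get(w, w) for w in query.lower().split()]
    rewritten = " ".join(words)

    # Text classification: first intent with a matching keyword wins.
    intent = "unknown"
    for name, keywords in INTENT_KEYWORDS.items():
        if any(k in rewritten for k in keywords):
            intent = name
            break

    # Entity recognition: a clock time such as "8 am".
    entities: Dict[str, str] = {}
    match = re.search(r"\b(\d{1,2})\s*(am|pm)\b", rewritten)
    if match:
        entities["time"] = f"{match.group(1)} {match.group(2)}"

    return QUResult(query, rewritten, intent, entities)
```

Downstream modules read only the `QUResult`, so the rewritten query and the entities are computed once and shared by the central control and the BOTs.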

II. Operation intervention layer:

As the name suggests, the operation intervention layer is mainly used for operational intervention. It applies in two cases:

1. For some utterances, we do not want to use their literal semantics but to assign special semantics to achieve a specific effect. For example, “Who is the most beautiful person in the world” is a question, so it would normally return the corresponding question-answering result; but operations staff may instead want it to open the front camera as a small Easter egg. In that case the query needs to be intervened on here.

2. For some utterances, we find that the result is not what was expected. In that case we can intervene on the result here, or rewrite the query, to correct the outcome.

Of course, beyond serving operations, this layer can do much more. Some online issues can be handled quickly through intervention here to avoid affecting more users, for example by intervening in BOT dispatch or in the returned result.
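The kinds of intervention described above (forcing an action, rewriting the query, forcing BOT dispatch) can be sketched as a lookup from normalized queries to override rules. This is only an illustrative sketch: the rule table, the field names, and the action and BOT identifiers are assumptions, and the second rule is invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Intervention:
    forced_action: Optional[str] = None    # override the returned result
    rewritten_query: Optional[str] = None  # replace the query text
    forced_bot: Optional[str] = None       # override BOT dispatch


def normalize(query: str) -> str:
    # Lowercase, trim surrounding punctuation, collapse whitespace.
    return " ".join(query.lower().strip(" ?!.").split())


# Hypothetical rules maintained by operations staff.
RULES: Dict[str, Intervention] = {
    "who is the most beautiful person in the world": Intervention(
        forced_action="open_front_camera"
    ),
    "play jay chou": Intervention(
        rewritten_query="play Jay Chou's songs", forced_bot="music_bot"
    ),
}


def intervene(query: str) -> Optional[Intervention]:
    # Returns None when no rule applies, so the normal flow continues.
    return RULES.get(normalize(query))
```

Because the rules live in a table rather than in code, an online issue could be patched by adding an entry, without redeploying any BOT.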

III. DM layer:

DM is dialogue management. Here it covers two main functions: BOT dispatch and ranking, and multi-turn dialogue. It uses the semantic understanding result of the current query, the historical context, and the environment context to decide the Action (the action to execute) and the next state, as shown in the figure below:

The central control's dispatch and ranking, and multi-turn dialogue, are explained in detail in later chapters.
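Pending those chapters, one simple way to picture "current understanding plus history decides the action and the next state" is a ranking step that boosts the BOT already active in the conversation. The scoring scheme, the `context_bonus` value, and all names below are assumptions for illustration; the sketch also assumes at least one candidate BOT is supplied.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class DialogueState:
    active_bot: Optional[str] = None
    history: List[str] = field(default_factory=list)


def rank_bots(
    intent_scores: Dict[str, float],
    state: DialogueState,
    context_bonus: float = 0.3,
) -> List[Tuple[str, float]]:
    # Boost the BOT from the previous turn so follow-ups stay with it.
    scored = []
    for bot, score in intent_scores.items():
        if bot == state.active_bot:
            score += context_bonus
        scored.append((bot, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def step(
    query: str, intent_scores: Dict[str, float], state: DialogueState
) -> Tuple[str, DialogueState]:
    # Action: the top-ranked BOT. Next state: updated history and active BOT.
    chosen = rank_bots(intent_scores, state)[0][0]
    next_state = DialogueState(active_bot=chosen, history=state.history + [query])
    return chosen, next_state
```

With this scheme, a terse follow-up such as "8 o'clock" after "set an alarm" can stay with the alarm BOT even when its standalone score is lower than another BOT's.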

IV. Vertical BOT layer:

The vertical BOT layer contains the independent BOTs, each handling content in its own domain. For example, the chitchat BOT mainly handles small talk with users; the question-answering BOT mainly handles knowledge-retrieval dialogue, helping users look up knowledge; the alarm clock BOT mainly handles alarm-related semantics; the music BOT mainly handles music-related semantics; and so on. The detailed internal design of the different BOT types is discussed later and is not expanded on here.
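The idea that every BOT is an independent unit behind a common entry point can be sketched with a shared interface and a registry. The class names, the `handle` method, and the fallback to the chitchat BOT are assumptions for this example, not details given in the article.

```python
from abc import ABC, abstractmethod
from typing import Dict


class Bot(ABC):
    name: str

    @abstractmethod
    def handle(self, query: str) -> str:
        """Process a query in this BOT's domain and return a reply."""


class ChitchatBot(Bot):
    name = "chitchat"

    def handle(self, query: str) -> str:
        return "Nice to chat with you!"


class AlarmBot(Bot):
    name = "alarm"

    def handle(self, query: str) -> str:
        return f"Alarm request received: {query}"


# Registry the central control could use to reach each BOT by name.
REGISTRY: Dict[str, Bot] = {bot.name: bot for bot in (ChitchatBot(), AlarmBot())}


def dispatch(bot_name: str, query: str) -> str:
    # Fall back to chitchat when the requested BOT is not registered.
    bot = REGISTRY.get(bot_name, REGISTRY["chitchat"])
    return bot.handle(query)
```

Adding a music or question-answering BOT then means adding one class and one registry entry, leaving the central control unchanged.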

V. Post-processing strategy layer:

The post-processing strategy layer includes client interaction, recommendation services, dialogue strategies, and so on.

Client interaction takes the final skill result and packages it into a structure that the client can execute.
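What such a client-executable structure looks like is not specified here; the following sketch assumes a JSON payload with a spoken reply, an optional directive, and a list of recommendations. All field names and the `set_alarm` action are hypothetical.

```python
import json
from typing import Any, Dict, List, Optional


def build_client_payload(
    bot: str,
    speech: str,
    action: Optional[str] = None,
    params: Optional[Dict[str, Any]] = None,
    recommendations: Optional[List[str]] = None,
) -> str:
    # The directive is present only when the client must execute something.
    directive = {"action": action, "params": params or {}} if action else None
    payload = {
        "bot": bot,
        "speech": speech,
        "directive": directive,
        "recommendations": recommendations or [],
    }
    return json.dumps(payload, ensure_ascii=False)
```

For instance, `build_client_payload("alarm", "Alarm set for 8 a.m.", action="set_alarm", params={"time": "08:00"})` yields a JSON string whose `directive` tells the client which action to run.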

The recommendation service makes recommendations for the user's current utterance based on the knowledge graph, user profile, product strategy, and so on. For example, for the user query “Who is Jay Chou”, it can make related recommendations such as “Jay Chou's itinerary”, “Who is Jay Chou's wife”, and “Play Jay Chou's songs”. It can also make personalized recommendations based on the user's usage habits and the current location and time; for example, if it is currently 10 p.m., it can recommend “Set an alarm for 8 a.m. tomorrow”. The recommendation service is a topic of its own and is covered separately in later chapters.
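The two kinds of recommendation mentioned above, entity-related and time-based, can be combined in a few lines. This is a toy sketch: the `RELATED` table stands in for a knowledge graph, and the "10 p.m. or later" rule and the `limit` default are assumptions chosen to match the example in the text.

```python
from datetime import datetime
from typing import Dict, List

# Hypothetical stand-in for knowledge-graph lookups.
RELATED: Dict[str, List[str]] = {
    "Jay Chou": [
        "Jay Chou's itinerary",
        "Who is Jay Chou's wife",
        "Play Jay Chou's songs",
    ],
}


def recommend(entities: List[str], now: datetime, limit: int = 4) -> List[str]:
    suggestions: List[str] = []

    # Entity-based: suggestions related to entities found in the query.
    for entity in entities:
        suggestions.extend(RELATED.get(entity, []))

    # Time-based: late in the evening, suggest setting a morning alarm.
    if now.hour >= 22:
        suggestions.append("Set an alarm for 8 a.m. tomorrow")

    return suggestions[:limit]
```

Passing the clock in as an argument, rather than reading it inside the function, keeps the time-based rule easy to test.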