Voice assistant -- Architecture and design of the instruction-type assistant
2022-06-12 07:33:00 【Turned_ MZ】
This chapter looks at the overall architecture and design of the instruction-type (task-oriented) voice assistant.
I. Application scenarios of the instruction-type assistant
Here, an assistant whose role is to help the user perform specific operations, such as setting an alarm by voice, playing music by voice, or voice navigation, is called an instruction-type voice assistant. The user issues instructions by voice and the assistant carries them out, which frees the user's hands and gives a shortcut to the operation, as in the following examples:

II. Architecture and design of the instruction-type assistant
As mentioned in the previous chapter, "The overall architecture and design of voice assistant", each vertical domain in the instruction-type assistant exists in the form of a BOT. A BOT is a small, self-contained system that has every module it needs. The following figure shows the overall architecture and design of a BOT:

As the figure above shows, a BOT contains various workers, each responsible for a different task. Workers do not communicate with each other directly; they pass data through the DataBus. Workers are therefore completely independent of one another: as long as a worker satisfies the interface defined by the DataBus, it can exchange data. This design lets each worker be iterated on independently and developed as a separate module.
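The worker and DataBus arrangement described above can be sketched roughly as follows. This is only a minimal illustration, not the author's implementation: the `DataBus` methods (`put`, `get`), the `Worker` protocol, the two toy workers, and `run_bot` are all hypothetical names chosen for the example.

```python
from typing import Any, Dict, List, Protocol


class DataBus:
    """Shared key-value store; the only channel workers use to exchange data."""

    def __init__(self, query: str) -> None:
        self._data: Dict[str, Any] = {"query": query}

    def put(self, key: str, value: Any) -> None:
        self._data[key] = value

    def get(self, key: str, default: Any = None) -> Any:
        return self._data.get(key, default)


class Worker(Protocol):
    """Hypothetical worker interface: read from and write to the bus only."""

    def run(self, bus: DataBus) -> None: ...


class NormalizeWorker:
    """Toy worker: writes a trimmed, lower-cased copy of the query."""

    def run(self, bus: DataBus) -> None:
        bus.put("normalized_query", bus.get("query", "").strip().lower())


class TokenizeWorker:
    """Toy worker: depends on a bus key, not on NormalizeWorker itself."""

    def run(self, bus: DataBus) -> None:
        bus.put("tokens", bus.get("normalized_query", "").split())


def run_bot(query: str, workers: List[Worker]) -> DataBus:
    """Run the workers in order; each one sees only the shared bus."""
    bus = DataBus(query)
    for worker in workers:
        worker.run(bus)
    return bus
```

Because `TokenizeWorker` only knows the key `normalized_query`, either worker could be replaced without touching the other, which is the independence property the design aims for.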
Let's look at the role of each worker in turn:
1. Basic semantic understanding
Basic semantic understanding here means basic processing of the input query, such as NER, syntactic parsing, word segmentation, scene classification, and error correction. This step is usually handled by common modules in the external DM, and the results are passed into the BOT.
2. history_session_worker
This worker mainly handles historical information, including fetching the global dialog history and the history related to this BOT. In general, open multi-turn dialog needs the global dialog history, while closed multi-turn dialog needs only the history related to this BOT.
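The choice between global and BOT-specific history could look something like the sketch below. The turn format (dicts with `bot` and `query` keys) and the `mode` values `"open"` and `"closed"` are assumptions made for illustration only.

```python
from typing import Dict, List

Turn = Dict[str, str]  # hypothetical turn record: {"bot": ..., "query": ...}


def select_history(history: List[Turn], bot: str, mode: str) -> List[Turn]:
    """Return the history a BOT should see for the given multi-turn mode."""
    if mode == "open":
        # Open multi-turn: use the global dialog history.
        return list(history)
    if mode == "closed":
        # Closed multi-turn: keep only the turns handled by this BOT.
        return [turn for turn in history if turn["bot"] == bot]
    raise ValueError(f"unknown multi-turn mode: {mode}")
```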
3. qu_worker (inside the BOT)
This worker is different from the qu module in the external DM. The external qu module handles basic semantic understanding, and what it identifies is shared with every active BOT. The qu_worker inside the BOT is dedicated to scenario-specific qu information (if any is needed). For example, a query rewrite that applies only to the alarm clock scenario takes effect here.
4. scene_identify_worker
This worker is generally a multi-class or binary classifier. As mentioned before, every BOT needs the ability to decide whether an incoming query belongs to it, so the main job of this worker is to reject queries that do not belong to the scenario. The scene classification model in the DM is tuned for recall, so queries whose wording is only vaguely related to a BOT may be recalled by mistake. The classifier in this worker therefore needs to be precise, which improves overall precision. Besides this function, it can also be implemented as a multi-class classifier that provides information to the workers below.
5. intent_entity_worker
This worker mainly recognizes the intent and extracts the slots. Its internal implementation depends on the type of BOT. The common options are: templates, a model (an intent-slot model), templates plus a model, and semantic role labeling.
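As an illustration of the template option only, intent recognition and slot extraction can be approximated with regular expressions. The templates, intent names, and the `time` slot below are hypothetical examples for an alarm clock BOT, not taken from the original system.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical templates: (intent name, pattern with named groups as slots).
TEMPLATES = [
    ("set_alarm", re.compile(r"^set an alarm (?:for|at) (?P<time>.+)$")),
    ("cancel_alarm", re.compile(r"^cancel (?:the|my) (?P<time>.+) alarm$")),
]


def match_intent(query: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (intent, slots) for the first matching template, else None."""
    text = query.strip().lower()
    for intent, pattern in TEMPLATES:
        match = pattern.match(text)
        if match:
            return intent, match.groupdict()
    return None
```

A query that matches no template returns `None`, which is the point where a combined "templates plus model" setup would fall back to the model.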
6. context_association_worker
This worker implements multi-turn dialog. It generally covers: closed multi-turn dialog, that is, asking follow-up questions to fill slots; open multi-turn dialog, that is, semantic inheritance, ellipsis completion, and anaphora resolution; and scripted multi-turn dialog.
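The closed multi-turn case (asking for missing slots) can be sketched as below. The required-slot table, the prompt texts, and the function names are assumptions for illustration; a real BOT would likely drive this from configuration.

```python
from typing import Dict, Optional

# Hypothetical configuration: slots each intent needs before it can execute.
REQUIRED_SLOTS = {"set_alarm": ["time"], "navigate": ["destination"]}
PROMPTS = {
    "time": "What time should I set the alarm for?",
    "destination": "Where would you like to go?",
}


def merge_turn(previous: Dict[str, str], current: Dict[str, str]) -> Dict[str, str]:
    """Carry slots over from earlier turns; non-empty new values win."""
    merged = dict(previous)
    merged.update({name: value for name, value in current.items() if value})
    return merged


def next_slot_question(intent: str, slots: Dict[str, str]) -> Optional[str]:
    """Return the follow-up question for the first missing slot, if any."""
    for name in REQUIRED_SLOTS.get(intent, []):
        if not slots.get(name):
            return PROMPTS[name]
    return None
```

For example, after "Set an alarm" the function returns the time prompt; once the user answers "7 am" and the slots are merged, it returns `None` and the intent is complete.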
7. NLU_postprocess_worker
This worker post-processes the semantic content. Everything up to and including this worker belongs to what we call the semantic layer: the content identified here is a purely semantic result, that is, the semantics of the query itself. This distinguishes it from the skill layer below. Depending on product requirements, some semantics must be executed by a particular skill. Take the query "Search for facial cleanser": its semantics are "search for an item", and the shopping skill should be executed only if the item is a commodity. If this query were assigned shopping semantics, the same as "Buy facial cleanser on Taobao", the BOT could be falsely recalled.
Therefore queries are divided according to their semantics. Each type of semantics can be iterated on separately and handled in its own way. In the example above, if the query is "Search for xxx", the knowledge graph is consulted to decide whether "xxx" is a commodity: if it is, the query goes to shopping; otherwise it goes to the encyclopedia. If the query is "Buy xxx on Taobao", it goes to shopping no matter what "xxx" is.
This worker generally implements semantic-level post-processing such as entity disambiguation and entity verification.
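The "Search for xxx" versus "Buy xxx on Taobao" rule can be written out as a small sketch. The dictionary `ENTITY_TYPES` is only a stand-in for a knowledge graph lookup, and the semantic labels and return values are hypothetical.

```python
from typing import Dict

# Stand-in for a knowledge graph lookup (hypothetical data).
ENTITY_TYPES = {"facial cleanser": "commodity", "eiffel tower": "landmark"}


def resolve_target(semantic: str, slots: Dict[str, str]) -> str:
    """Decide where a query should go based on its semantics and entity type."""
    if semantic == "buy_on_platform":
        # "Buy xxx on Taobao": shopping, whatever xxx is.
        return "shopping"
    if semantic == "search_item":
        # "Search for xxx": shopping only if xxx is verified as a commodity.
        entity_type = ENTITY_TYPES.get(slots.get("item", "").lower())
        return "shopping" if entity_type == "commodity" else "encyclopedia"
    return "unknown"
```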
8. skill_worker
This is the first worker in the skill layer. It maps semantic results to skills, and usually maintains a semantics-to-skill mapping table for that purpose.
9. skill_rank_worker
This worker ranks the different skills inside a BOT. For example, in the map BOT, the query "Where is xxx" maps to two skills: (1) "search the map for xxx" and (2) "show an encyclopedia card introducing xxx". The two skills are ranked according to the type of "xxx": if "xxx" is a scenic spot, skill 2 is chosen; if "xxx" is an ordinary place, skill 1 is chosen.
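The mapping table of skill_worker and the ranking rule of skill_rank_worker can be sketched together using the map BOT example. The table contents, skill names, and entity type labels are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical semantics-to-skill mapping table (skill_worker).
SEMANTIC_TO_SKILLS: Dict[str, List[str]] = {
    "ask_location": ["map_search", "encyclopedia_card"],
}

# Hypothetical ranking rule: preferred skill per entity type (skill_rank_worker).
PREFERRED_SKILL = {
    "scenic_spot": "encyclopedia_card",
    "common_place": "map_search",
}


def map_skills(semantic: str) -> List[str]:
    """Look up the candidate skills for a semantic result."""
    return list(SEMANTIC_TO_SKILLS.get(semantic, []))


def rank_skills(skills: List[str], entity_type: Optional[str]) -> List[str]:
    """Move the preferred skill for this entity type to the front."""
    preferred = PREFERRED_SKILL.get(entity_type)
    # sorted() is stable, so non-preferred skills keep their original order.
    return sorted(skills, key=lambda skill: skill != preferred)
```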
10. skill_postprocess_worker
This worker post-processes the skill. For example, for the query "What's the weather like today?", once the semantics have been identified and mapped to the "check the weather" skill, this worker can call the weather server, fetch the weather information, and deliver it to the client in the form of a card.
11. nlg_worker
This step generates the NLG reply wording. In skill-type BOTs the NLG is generally configured from templates: different combinations of slot, intent, and context information produce different reply wording. In a chit-chat BOT or a question-answering BOT, this worker can instead use a generative model or a retrieval model to produce the reply.
Follow-up recommendations can generally be produced here as well. For example, for the query "What's the weather like today?", the recommendations might be "What about tomorrow?" and "What should I wear today?".
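Template-based reply generation with follow-up recommendations might look like the sketch below. The template keys, the `state` labels, and the slot names (`city`, `date`, `condition`) are assumptions for illustration.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical templates keyed by (intent, context state).
NLG_TEMPLATES: Dict[Tuple[str, str], str] = {
    ("check_weather", "has_result"): "The weather in {city} {date} is {condition}.",
    ("check_weather", "no_result"): "Sorry, I could not find the weather for {city}.",
}

# Hypothetical follow-up recommendations per intent.
FOLLOW_UPS: Dict[str, List[str]] = {
    "check_weather": ["What about tomorrow?", "What should I wear today?"],
}


def generate_reply(intent: str, state: str, slots: Dict[str, str]) -> Dict[str, Any]:
    """Fill the template chosen by intent and state; attach recommendations."""
    template = NLG_TEMPLATES[(intent, state)]
    return {
        "text": template.format(**slots),
        "suggestions": FOLLOW_UPS.get(intent, []),
    }
```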
12. execution_worker
This worker handles the final packaging of the result: it encapsulates the result as structured data and returns it to the external DM.
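One possible shape for that structured result is shown below. The article does not specify the actual format, so the `BotResult` fields and the choice of JSON are purely illustrative assumptions.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class BotResult:
    """Hypothetical structured result returned by a BOT to the DM."""

    bot: str
    intent: str
    slots: Dict[str, str] = field(default_factory=dict)
    skill: str = ""
    reply: str = ""
    suggestions: List[str] = field(default_factory=list)


def package_result(result: BotResult) -> str:
    """Serialize the result as JSON; ensure_ascii=False keeps non-ASCII text readable."""
    return json.dumps(asdict(result), ensure_ascii=False, sort_keys=True)
```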
III. Closing remarks
With the architecture above, the processing chain inside a BOT is broken down into fine-grained steps. The workers are independent of one another and can be developed and upgraded separately by different owners. BOTs are likewise independent of one another and can be upgraded separately. An error in a single worker or BOT does not necessarily affect the overall result (note the wording "not necessarily": the robustness of the system improves to some extent, but the final result may not be optimal).
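The "not necessarily" point can be illustrated with a small sketch in which a failing worker is recorded and skipped while the rest of the chain continues. The bus is simplified to a plain dict, and the worker functions and the error-handling policy are assumptions, not the author's design.

```python
from typing import Any, Callable, Dict, List, Tuple

Bus = Dict[str, Any]
Worker = Callable[[Bus], None]


def run_workers(bus: Bus, workers: List[Tuple[str, Worker]]) -> List[str]:
    """Run workers in order; record failures instead of aborting the chain."""
    failed: List[str] = []
    for name, worker in workers:
        try:
            worker(bus)
        except Exception:
            # The chain continues, but later results may be less accurate.
            failed.append(name)
    return failed


def rewrite_query(bus: Bus) -> None:
    """Toy qu_worker that always fails, to simulate an outage."""
    raise RuntimeError("rewrite model unavailable")


def recognize_intent(bus: Bus) -> None:
    """Toy intent worker that works on the raw query."""
    bus["intent"] = "set_alarm" if "alarm" in bus["query"] else "unknown"
```

Here the intent is still recognized from the raw query even though the rewrite step failed, so the system stays available, though without the rewrite the result may be worse than it could have been.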
Because the workers are independent and communicate only through the data_bus, in theory they could be designed to communicate over RPC or HTTP, which would make truly independent worker upgrades possible. This would, however, add network overhead and could degrade system performance.