Deep learning - BRNN and DRNN
2022-06-30 07:44:00 【Hair will grow again without it】
Bidirectional Recurrent Neural Networks (Bidirectional RNN)
A bidirectional RNN lets the model use not only information from earlier in the sequence at a given time step, but also information from the future.
Why do we need a BRNN?
There is a problem with this network: to judge whether the third word Teddy (marked 1 in the figure) is part of a person's name, it is not enough to look only at the first part of the sentence. To decide whether ŷ<3> (marked 2 in the figure) is 0 or 1, you need more than the first three words, because the first three words alone cannot tell whether the sentence is about a Teddy bear or the former US president Teddy Roosevelt. So this is a limitation of a unidirectional, forward-only RNN. This remains true no matter whether the units (marked 3 in the figure) are standard RNN blocks, GRU units, or LSTM units, as long as they are forward-only.
How can a BRNN solve this problem?
Suppose the input has only four elements, x<1> to x<4>. The network has forward recurrent units a→<1>, a→<2>, a→<3>, and a→<4>; the right arrow marks the forward direction. Each of these four units receives the current input x<t>, and together they produce the predictions ŷ<1>, ŷ<2>, ŷ<3>, and ŷ<4>.

There is also a←<1>, where the left arrow marks a backward connection, and likewise a←<2>, a←<3>, and a←<4>.

Given the input sequence x<1> to x<4>, the network first computes the forward activations: a→<1>, then a→<2>, then a→<3> and a→<4>. The backward sequence starts from a←<4> and proceeds in reverse to compute a←<3>. Note that these are all activation values computed during forward propagation, not backpropagation: part of the forward propagation runs left to right, and part runs right to left. After computing a←<3>, you can use it to compute a←<2>, and then a←<1>. Once all these activations are computed, the predictions can be made.
For example, to make a prediction, the network computes

ŷ<t> = g(Wy[a→<t>, a←<t>] + by)

So to make the prediction at time step 3, information from x<1> flows through the forward units, from a→<1> to a→<2> to a→<3>, and into ŷ<3>; thus the information from x<1>, x<2>, and x<3> is all taken into account. Meanwhile, information from x<4> flows in the opposite direction, from a←<4> to a←<3>, and into ŷ<3>. In this way the prediction at time 3 draws not only on past and present information, but also on information from the future.
This is the bidirectional recurrent neural network. Its basic units need not be standard RNN units; they can also be GRU or LSTM units. In fact, for a great many natural language processing problems, a bidirectional RNN with LSTM units is the most commonly used model. So if you have an NLP problem where the full text of each sentence is available, a bidirectional RNN with LSTM units, running both a forward and a backward pass, is a good first choice.
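The two-pass computation described above can be sketched in NumPy. This is a minimal illustration only, assuming tanh hidden units and a sigmoid output for g; all parameter names (Wax_f, Waa_f, and so on) are illustrative, not from the original course notes.

```python
import numpy as np

def brnn_forward(xs, Wax_f, Waa_f, ba_f, Wax_b, Waa_b, ba_b, Wy, by):
    """Minimal bidirectional-RNN forward pass over a list of input vectors xs."""
    T = len(xs)
    n_a = ba_f.shape[0]
    a_fwd, a_bwd = [None] * T, [None] * T

    # Forward pass: compute a-><1> ... a-><T>, left to right.
    a_prev = np.zeros(n_a)
    for t in range(T):
        a_prev = np.tanh(Wax_f @ xs[t] + Waa_f @ a_prev + ba_f)
        a_fwd[t] = a_prev

    # Backward pass: compute a<-<T> ... a<-<1>, right to left.
    a_next = np.zeros(n_a)
    for t in reversed(range(T)):
        a_next = np.tanh(Wax_b @ xs[t] + Waa_b @ a_next + ba_b)
        a_bwd[t] = a_next

    # Each prediction combines BOTH directions:
    # y<t> = sigmoid(Wy [a-><t>, a<-<t>] + by)
    ys = [1.0 / (1.0 + np.exp(-(Wy @ np.concatenate([a_fwd[t], a_bwd[t]]) + by)))
          for t in range(T)]
    return ys
```

Note that because the backward pass needs the whole sequence before any prediction can be made, this model suits complete sentences, not streaming input.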
Deep Recurrent Neural Networks (Deep RNNs)
We use a[l]<t> to denote the activation of layer l at time step t; for example, a[1] refers to the first layer. So a[1]<1> is the activation of the first layer at the first time step, a[1]<2> is the activation of the first layer at the second time step, and then come a[1]<3> and a[1]<4>. We then stack further layers on top, giving a network with three hidden layers.
Let's look at how the value a[2]<3> is computed.
The activation a[2]<3> has two inputs: one coming from below (a[1]<3>, the layer beneath it) and one coming from the left (a[2]<2>, the previous time step in the same layer):

a[2]<3> = g(Wa[2][a[2]<2>, a[1]<3>] + ba[2])

This is how the activation value is computed. The parameters Wa[2] and ba[2] are shared across every time step of this layer; correspondingly, the first layer has its own parameters Wa[1] and ba[1].
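The equation above can be sketched as a single function. This is a minimal illustration assuming g = tanh; the function and argument names are hypothetical, chosen only to mirror the notation.

```python
import numpy as np

def deep_rnn_cell(a_prev_t, a_below, Wa, ba):
    """One cell of layer l in a deep RNN.

    a_prev_t : a[l]<t-1>, the same layer at the previous time step
    a_below  : a[l-1]<t>, the layer below at the current time step
    Computes a[l]<t> = g(Wa[a[l]<t-1>, a[l-1]<t>] + ba), here with g = tanh.
    """
    stacked = np.concatenate([a_prev_t, a_below])  # [a[l]<t-1>, a[l-1]<t>]
    return np.tanh(Wa @ stacked + ba)
```

Because Wa and ba do not depend on t, calling this cell at every time step of layer l reuses the same parameter matrices, which is exactly the weight sharing described above.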