Deep Learning - BRNN and DRNN
2022-06-30 07:44:00 【Hair will grow again without it】
Bidirectional Recurrent Neural Networks (Bidirectional RNN)
A bidirectional RNN allows the model, at any point in the sequence, to use not only information from earlier in the sequence but also information from later in it.
Why do we need a BRNN?
A forward-only network has a problem. To judge whether the third word, Teddy (labeled 1 in the figure), is part of a person's name, it is not enough to look only at the first part of the sentence. To decide whether ŷ<3> (labeled 2 in the figure) is 0 or 1, you need more than the first three words, because those three words alone cannot tell you whether the sentence is about a Teddy bear or about the former US president Teddy Roosevelt. This limitation of a unidirectional, forward-only RNN holds no matter whether the units (labeled 3 in the figure) are standard RNN blocks, GRU units, or LSTM units, as long as they only run forward.
How does a BRNN solve this problem?
Suppose the input has only four elements, x<1> through x<4>. The network then has four forward recurrent units, a→<1>, a→<2>, a→<3>, and a→<4>; the right arrow marks them as forward units. Each of the four units receives the current input x<t> and contributes to the predictions ŷ<1>, ŷ<2>, ŷ<3>, and ŷ<4>.
In addition there are backward units a←<1>, a←<2>, a←<3>, and a←<4>, where the left arrow marks a backward connection.
Given an input sequence x<1> to x<4>, the network first computes the forward activation a→<1>, then a→<2>, then a→<3> and a→<4>. The backward sequence starts from a←<4> and proceeds in reverse: next it computes a←<3>. These are all activation values being computed, so this is still forward propagation rather than backpropagation; it is just that part of the forward propagation runs from left to right and part runs from right to left. After a←<3> has been computed, those activations are used to compute a←<2> and then a←<1>. Once all the activation values are available, the predictions can be computed.
For example, to make a prediction the network computes ŷ<t> = g(W_y [a→<t>, a←<t>] + b_y). To produce the prediction at time step 3, information from x<1> flows through the forward a→<1> to the forward a→<2> and on to the forward a→<3> before reaching ŷ<3>, so information from x<1>, x<2>, and x<3> is all taken into account. At the same time, information from x<4> flows in the opposite direction through the backward a←<4> to the backward a←<3> and then into ŷ<3>. The prediction at time 3 therefore uses past information, the current input, and future information.
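As a concrete illustration of the computation above, here is a minimal NumPy sketch (not from the original notes; the parameter names and dimensions are illustrative assumptions). It computes the forward activations left to right, the backward activations right to left, and then combines the two at every time step with ŷ<t> = g(W_y [a→<t>, a←<t>] + b_y):

```python
import numpy as np

def brnn_forward(x_seq, params):
    """Minimal sketch of a bidirectional RNN forward pass.

    x_seq  : list of input vectors x<1>..x<T>, each of shape (n_x,)
    params : dict of weights; the key names are illustrative, not a fixed API
    """
    T = len(x_seq)
    n_a = params["ba_f"].shape[0]

    # Forward recurrence: a-><1> ... a-><T>, computed left to right.
    a_fwd, prev = [], np.zeros(n_a)
    for t in range(T):
        prev = np.tanh(params["Waa_f"] @ prev + params["Wax_f"] @ x_seq[t] + params["ba_f"])
        a_fwd.append(prev)

    # Backward recurrence: a<-<T> ... a<-<1>, computed right to left.
    a_bwd, nxt = [None] * T, np.zeros(n_a)
    for t in reversed(range(T)):
        nxt = np.tanh(params["Waa_b"] @ nxt + params["Wax_b"] @ x_seq[t] + params["ba_b"])
        a_bwd[t] = nxt

    # Prediction at each step combines both directions:
    # y_hat<t> = g(W_y [a-><t>, a<-<t>] + b_y), with a sigmoid g for a binary tag.
    y_hat = []
    for t in range(T):
        z = params["Wy"] @ np.concatenate([a_fwd[t], a_bwd[t]]) + params["by"]
        y_hat.append(1.0 / (1.0 + np.exp(-z)))
    return y_hat

# Toy usage with random weights: n_x = 5 input features, n_a = 4 hidden units.
rng = np.random.default_rng(0)
n_x, n_a, n_y = 5, 4, 1
params = {
    "Waa_f": rng.normal(size=(n_a, n_a)), "Wax_f": rng.normal(size=(n_a, n_x)), "ba_f": np.zeros(n_a),
    "Waa_b": rng.normal(size=(n_a, n_a)), "Wax_b": rng.normal(size=(n_a, n_x)), "ba_b": np.zeros(n_a),
    "Wy": rng.normal(size=(n_y, 2 * n_a)), "by": np.zeros(n_y),
}
x_seq = [rng.normal(size=n_x) for _ in range(4)]          # x<1> .. x<4>
print([y.item() for y in brnn_forward(x_seq, params)])    # one prediction per word
```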
This is a bidirectional recurrent neural network, and the basic units need not be standard RNN units; they can also be GRU units or LSTM units. In fact, for a great many natural language processing problems, a bidirectional RNN with LSTM units is the most commonly used model. So for an NLP problem where complete sentences are available and you want to label them, a bidirectional RNN with LSTM units, running both a forward and a backward pass, is a good first choice.
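In a framework such as PyTorch, the bidirectional LSTM recommended above is typically obtained by passing bidirectional=True to nn.LSTM. The sketch below is an illustration rather than code from the notes; the vocabulary size, layer widths, and class name are placeholder assumptions. It tags every word of a sentence, so each time step sees the concatenated forward and backward states:

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Illustrative bidirectional-LSTM sequence tagger (e.g. for name detection)."""
    def __init__(self, vocab_size=10000, emb_dim=128, hidden_dim=64, num_tags=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # bidirectional=True runs a forward and a backward LSTM over the sequence.
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
        # Each time step exposes [a-><t>, a<-<t>], i.e. 2 * hidden_dim features.
        self.out = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, token_ids):                  # token_ids: (batch, seq_len)
        h, _ = self.lstm(self.embed(token_ids))    # h: (batch, seq_len, 2 * hidden_dim)
        return self.out(h)                         # per-token tag scores

model = BiLSTMTagger()
tokens = torch.randint(0, 10000, (1, 4))           # a toy 4-word sentence
print(model(tokens).shape)                         # torch.Size([1, 4, 2])
```

Note that, because the backward pass needs the whole sentence, this setup assumes the full input sequence is available before any prediction is made.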
Deep Recurrent Neural Networks (Deep RNNs)
We use a[1] to denote the first layer, and more generally a[l]<t> for the activation of layer l at time step t; this notation lets us refer to any unit in the network. For example, a[1]<1> is the activation of the first layer at the first time step, a[1]<2> is the activation of the first layer at the second time step, followed by a[1]<3> and a[1]<4>. Stacking such layers on top of one another gives a new network with three hidden layers.
Consider how the value a[2]<3> is computed.
The activation a[2]<3> has two inputs: one coming from below (a[1]<3>) and one coming from the left (a[2]<2>), so a[2]<3> = g(W_a[2] [a[2]<2>, a[1]<3>] + b_a[2]). The parameters W_a[2] and b_a[2] are shared across all the computations in this layer, and correspondingly the first layer has its own parameters W_a[1] and b_a[1].
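Stacking this computation over several layers can be sketched in NumPy as follows (again an illustration rather than code from the notes; the per-layer parameter names Wa and ba are assumptions). Each layer l combines its own activation from the previous time step with the activation coming up from layer l-1 at the current time step, exactly as in the formula for a[2]<3> above:

```python
import numpy as np

def deep_rnn_forward(x_seq, layer_params):
    """Minimal sketch of a deep (stacked) RNN forward pass.

    x_seq        : list of T input vectors (the "layer 0" activations)
    layer_params : one dict per hidden layer with weight Wa, applied to the
                   concatenation [a[l]<t-1>, a[l-1]<t>], and bias ba
    """
    T = len(x_seq)
    below = x_seq                                   # activations feeding the first layer
    for p in layer_params:                          # hidden layers l = 1, 2, 3, ...
        n_a = p["ba"].shape[0]
        prev = np.zeros(n_a)                        # a[l]<0>
        layer_out = []
        for t in range(T):
            # a[l]<t> = g(Wa[l] [a[l]<t-1>, a[l-1]<t>] + ba[l]), with g = tanh
            prev = np.tanh(p["Wa"] @ np.concatenate([prev, below[t]]) + p["ba"])
            layer_out.append(prev)
        below = layer_out                           # this layer feeds the one above it
    return below                                    # activations of the top hidden layer

# Toy usage: 3 hidden layers, 4 time steps, n_x = 5 inputs, n_a = 4 units per layer.
rng = np.random.default_rng(0)
n_x, n_a, T = 5, 4, 4
layer_params = [
    {"Wa": rng.normal(size=(n_a, n_a + n_x)), "ba": np.zeros(n_a)},   # layer 1
    {"Wa": rng.normal(size=(n_a, n_a + n_a)), "ba": np.zeros(n_a)},   # layer 2
    {"Wa": rng.normal(size=(n_a, n_a + n_a)), "ba": np.zeros(n_a)},   # layer 3
]
x_seq = [rng.normal(size=n_x) for _ in range(T)]
top = deep_rnn_forward(x_seq, layer_params)
print(top[2])   # a[3]<3>: top-layer activation at time step 3
```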