RNN recurrent neural network
2022-07-02 05:24:00 【TT ya】
I'm a beginner, and I write these posts as study notes on what I've learned. I hope they also help others who are just starting out, and I'd welcome corrections from more experienced readers ~ If anything here infringes on someone's rights, let me know and I will delete it.
One、Why RNNs emerged
The CNN we are familiar with produces an output that depends only on the current input and ignores inputs at other times (that is, the inputs are processed one after another, each on its own).
However, for time-dependent, sequential data (where earlier and later inputs are related), such as predicting a piece of a document from the text before and after it, a CNN does not work very well.
Human cognition builds on past experience and memory. From that observation, and to make up for the CNN shortcoming described above, came a design that considers not only the current input but can also remember what the network has seen before: the recurrent neural network, or RNN.
Two、RNN principles
1、RNN model structure and forward propagation

An RNN consists of an input layer, a hidden layer and an output layer.
Here x, s and o are vectors: the values of the input layer, the hidden layer and the output layer respectively.
U is the weight matrix from the input layer to the hidden layer, V is the weight matrix from the hidden layer to the output layer, and W is the weight matrix applied to the hidden layer's previous value when it is fed back in as part of the current step's input.
The formulas are as follows:

$$o_t = g(V s_t)$$

$$s_t = f(U x_t + W s_{t-1})$$

Here f and g are activation functions: f can be tanh, relu, sigmoid or a similar function, and g is usually softmax.
U, V and W are the same at every time step (back propagation updates them afterwards; the point here is only that the quantities that vary from step to step are x_t, s_{t-1} and s_t). The term W s_{t-1} is what brings in the influence of the previous moment's value (the so-called memory of the past).
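The forward pass described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the choice f = tanh, the toy weight values, the layer sizes and the helper names (`matvec`, `softmax`, `rnn_forward`) are all assumptions made for the example.

```python
import math

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [e / total for e in exps]

def rnn_forward(xs, U, W, V, s0):
    """Apply s_t = tanh(U x_t + W s_{t-1}) and o_t = softmax(V s_t) over a sequence.

    The same U, W and V are reused at every step; only x_t and the
    hidden state change. f = tanh is an assumed choice for this sketch.
    """
    s_prev = s0
    states, outputs = [], []
    for x in xs:
        pre = [a + b for a, b in zip(matvec(U, x), matvec(W, s_prev))]
        s = [math.tanh(p) for p in pre]
        o = softmax(matvec(V, s))
        states.append(s)
        outputs.append(o)
        s_prev = s  # this is the "memory" carried to the next step
    return states, outputs

# Toy sizes (assumed): 2 inputs, 3 hidden units, 2 outputs.
U = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.2]]
W = [[0.0, 0.1, -0.1], [0.2, 0.0, 0.3], [-0.3, 0.1, 0.0]]
V = [[0.5, -0.4, 0.1], [-0.2, 0.3, 0.6]]
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

states, outputs = rnn_forward(xs, U, W, V, s0=[0.0, 0.0, 0.0])

# The same input processed with no history gives a different hidden state.
alone, _ = rnn_forward([xs[2]], U, W, V, s0=[0.0, 0.0, 0.0])
```

Comparing `states[2]` with `alone[0]` shows the effect of W s_{t-1}: the third input is identical in both runs, but the hidden state differs because the first run carries information from the two earlier inputs.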
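Concretely, unrolling the network in time gives the figure below: the same cell, with the same U, V and W, is repeated at every time step, and each hidden state s_{t-1} feeds into the next one, s_t.
[Figure: the RNN unrolled over time steps]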

2、Back propagation
The output o_t at each time step produces an error value E_t.
The loss function can be either the cross-entropy loss or the squared-error loss.
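As a small worked example of the two loss choices, the sketch below computes E_t for a single time step. The output vector, the one-hot target and the 1/2 factor in the squared error are assumptions chosen for illustration.

```python
import math

def cross_entropy(o, y):
    """Cross-entropy -sum_i y_i * log(o_i) for predicted probabilities o and target y."""
    return -sum(yi * math.log(oi) for oi, yi in zip(o, y) if yi > 0)

def squared_error(o, y):
    """Squared error 0.5 * sum_i (o_i - y_i)^2 (the 0.5 factor is a common convention)."""
    return 0.5 * sum((oi - yi) ** 2 for oi, yi in zip(o, y))

o_t = [0.7, 0.2, 0.1]   # assumed softmax output at time t
y_t = [1.0, 0.0, 0.0]   # assumed one-hot target at time t

E_ce = cross_entropy(o_t, y_t)   # equals -log(0.7)
E_se = squared_error(o_t, y_t)   # equals 0.5 * (0.09 + 0.04 + 0.01)
```

Cross entropy is the usual partner of a softmax output, since it only looks at the probability assigned to the correct class; squared error compares every output component with its target.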
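First, let's look at the formulas.
Total error:

$$E = \sum_{t} E_t$$

Parameter gradients:

$$\nabla U = \frac{\partial E}{\partial U} = \sum_{t} \frac{\partial E_t}{\partial U}$$

$$\nabla V = \frac{\partial E}{\partial V} = \sum_{t} \frac{\partial E_t}{\partial V}$$

$$\nabla W = \frac{\partial E}{\partial W} = \sum_{t} \frac{\partial E_t}{\partial W}$$

What these formulas say: the gradient of each parameter is the sum, over all time steps, of the partial derivatives of the error at each step (this holds for U, V and W alike).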
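We'll take W as the example.
First, expand the derivative with the chain rule:

$$\frac{\partial E_t}{\partial W} = \frac{\partial E_t}{\partial o_t}\,\frac{\partial o_t}{\partial s_t}\,\frac{\partial s_t}{\partial W}$$

Then substitute the forward formula $s_t = f(U x_t + W s_{t-1})$. Since s_t depends, directly or indirectly, on the s of every earlier time step, we get the following formula, where $\frac{\partial s_t}{\partial s_k} = \prod_{j=k+1}^{t} \frac{\partial s_j}{\partial s_{j-1}}$ and $\frac{\partial s_k}{\partial W}$ is the direct derivative with $s_{k-1}$ held fixed:

$$\frac{\partial E_t}{\partial W} = \sum_{k=1}^{t} \frac{\partial E_t}{\partial o_t}\,\frac{\partial o_t}{\partial s_t}\,\frac{\partial s_t}{\partial s_k}\,\frac{\partial s_k}{\partial W}$$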

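The formulas for V and U are as follows. V only affects the output at the current step, while U, like W, acts through every earlier hidden state:

$$\frac{\partial E_t}{\partial V} = \frac{\partial E_t}{\partial o_t}\,\frac{\partial o_t}{\partial V}$$

$$\frac{\partial E_t}{\partial U} = \sum_{k=1}^{t} \frac{\partial E_t}{\partial o_t}\,\frac{\partial o_t}{\partial s_t}\,\frac{\partial s_t}{\partial s_k}\,\frac{\partial s_k}{\partial U}$$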
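To make the gradient for W concrete, here is a sketch for a scalar RNN, checked against a finite-difference estimate. Everything specific is an assumption made to keep the example small: scalar weights u, w, v, f = tanh, an identity output o_t = v * s_t instead of softmax, squared-error loss, and the toy data. For each step t it sums, over all earlier steps k, the error signal at t carried back through the chain of ds_j/ds_{j-1} factors.

```python
import math

def forward(xs, u, w, v, s0=0.0):
    """Scalar RNN: s_t = tanh(u*x_t + w*s_{t-1}), o_t = v*s_t. states[0] is s_0."""
    states = [s0]
    for x in xs:
        states.append(math.tanh(u * x + w * states[-1]))
    outputs = [v * s for s in states[1:]]
    return states, outputs

def total_error(xs, ys, u, w, v):
    """E = sum_t E_t with E_t = 0.5 * (o_t - y_t)^2."""
    _, outputs = forward(xs, u, w, v)
    return sum(0.5 * (o - y) ** 2 for o, y in zip(outputs, ys))

def grad_w_bptt(xs, ys, u, w, v):
    """dE/dw as a sum over t, and for each t a sum over k = t, t-1, ..., 1."""
    states, outputs = forward(xs, u, w, v)
    grad = 0.0
    for t in range(1, len(xs) + 1):
        # dE_t/do_t * do_t/ds_t
        dE_ds_t = (outputs[t - 1] - ys[t - 1]) * v
        ds_t_ds_k = 1.0  # ds_t/ds_k, starting at k = t
        for k in range(t, 0, -1):
            # direct derivative of s_k w.r.t. w, holding s_{k-1} fixed
            direct = (1.0 - states[k] ** 2) * states[k - 1]
            grad += dE_ds_t * ds_t_ds_k * direct
            # extend the chain by one step: multiply by ds_k/ds_{k-1}
            ds_t_ds_k *= (1.0 - states[k] ** 2) * w
    return grad

# Toy data and weights (assumed values).
xs = [0.5, -1.0, 0.8, 0.3]
ys = [0.2, -0.1, 0.4, 0.0]
u, w, v = 0.7, 0.5, 1.2

analytic = grad_w_bptt(xs, ys, u, w, v)
eps = 1e-6
numeric = (total_error(xs, ys, u, w + eps, v)
           - total_error(xs, ys, u, w - eps, v)) / (2 * eps)
```

If the derivation is implemented correctly, `analytic` and `numeric` agree to within the accuracy of the finite-difference estimate. The inner loop also makes the cost visible: step t contributes t terms, each with a longer product of per-step factors.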


Three、Applications and shortcomings of RNN
1、Application fields of RNN
Natural language processing (NLP): mainly video processing, text generation, language models, image processing
Machine translation, text similarity calculation, image caption generation
Speech recognition
Recommendation
2、Shortcomings
RNNs are prone to the problem of gradient disappearance (vanishing gradients) or gradient explosion.
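Reason: back propagation through time multiplies the gradient by one factor ∂s_j/∂s_{j-1} for every time step it travels back. Over a long dependency many such factors are multiplied together. If they are mostly larger than 1 in magnitude, the product grows exponentially and the gradient explodes; if they are mostly smaller than 1, the product shrinks toward zero and the gradient disappears, so the memory of distant inputs contributes almost nothing.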
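A tiny numerical sketch of this effect, using a scalar RNN s_t = tanh(u*x_t + w*s_{t-1}). The setup is deliberately contrived and entirely assumed: zero inputs and a zero initial state keep every s_t at 0, so the tanh derivative is 1 and each per-step factor is exactly w. That isolates how the recurrent weight alone drives the product over 50 steps.

```python
import math

def jacobian_product(w, xs, u=1.0, s0=0.0):
    """Return |ds_T/ds_0|, the product of per-step factors (1 - s_t^2) * w."""
    s, prod = s0, 1.0
    for x in xs:
        s = math.tanh(u * x + w * s)
        prod *= (1.0 - s * s) * w
    return abs(prod)

xs = [0.0] * 50  # zero inputs: s_t stays 0, so every factor equals w

small = jacobian_product(0.5, xs)  # factors below 1: product is 0.5**50
large = jacobian_product(1.5, xs)  # factors above 1: product is 1.5**50
```

With w = 0.5 the product is below 1e-15 after 50 steps, and with w = 1.5 it is above 1e8. With nonzero inputs tanh saturates and its derivative drops below 1, which pushes real networks further toward the vanishing case.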
Criticism and corrections are welcome in the comment area, thank you ~