RNN recurrent neural network
2022-07-02 05:24:00 【TT ya】
I'm a beginner writing this up as study notes to record what I've learned, in the hope that it helps others who are just starting out. Corrections from more experienced readers are welcome. If anything here infringes on your rights, tell me and I will delete it.
I. Why RNNs emerged
In the networks we are already familiar with, such as CNNs, each output depends only on the current input and ignores the inputs at other time steps (that is, the inputs are processed one at a time, independently of one another).
However, for time-dependent, sequential data (where earlier and later inputs are related), such as predicting the next part of a document from what came before, a CNN does not perform well.
Human cognition builds on past experience and memory. Following that idea, and to make up for this weakness of CNNs, the recurrent neural network (RNN) was designed: it considers not only the current input but also remembers what the network has seen earlier.
II. How an RNN works
1. RNN model structure and forward propagation

An RNN consists of an input layer, a hidden layer, and an output layer.
Here x, s, and o are vectors: the values of the input layer, the hidden layer, and the output layer, respectively.
U is the weight matrix from the input layer to the hidden layer, V is the weight matrix from the hidden layer to the output layer, and W is the weight matrix applied to the previous hidden-layer value when it is fed back in as input at the current step.
The formulas are as follows:
$$O_t = g(V S_t)$$
$$S_t = f(U X_t + W S_{t-1})$$

Here f and g are activation functions: f can be tanh, ReLU, sigmoid, or a similar function, while g is usually softmax.
U, V, and W are the same at every time step (they change only when backpropagation updates them; the point here is that the quantities that vary from step to step are the following three): X_t, S_{t-1}, and S_t. The term W S_{t-1} is what brings in the influence of the previous step's value (the so-called memory of the past).
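Unrolled over time, the network is a chain of identical cells: at each step t the cell takes X_t and the previous hidden state S_{t-1}, produces S_t and O_t, and passes S_t on to step t+1, using the same U, V, and W throughout.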
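The forward pass described in this section can be sketched in a few lines of NumPy. This is only a minimal illustration, not a reference implementation: the function names `softmax` and `rnn_forward`, the layer sizes, the zero initial state, and the choice of tanh for f are all assumptions made for the example.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def rnn_forward(xs, U, W, V, s0=None):
    """Run a simple RNN over a sequence.

    xs: array of shape (T, input_dim), one input vector X_t per row.
    Returns hidden states S (T, hidden_dim) and outputs O (T, output_dim).
    """
    hidden_dim = U.shape[0]
    # Assumption: the initial hidden state is all zeros unless one is given.
    s_prev = np.zeros(hidden_dim) if s0 is None else s0
    states, outputs = [], []
    for x_t in xs:
        s_t = np.tanh(U @ x_t + W @ s_prev)   # S_t = f(U X_t + W S_{t-1})
        o_t = softmax(V @ s_t)                # O_t = g(V S_t)
        states.append(s_t)
        outputs.append(o_t)
        s_prev = s_t                          # carry the memory to the next step
    return np.array(states), np.array(outputs)

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
input_dim, hidden_dim, output_dim, T = 3, 4, 2, 5
U = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
W = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
V = rng.normal(scale=0.5, size=(output_dim, hidden_dim))
xs = rng.normal(size=(T, input_dim))

S, O = rnn_forward(xs, U, W, V)
print(S.shape, O.shape)
```

Note that the same U, W, and V are reused inside the loop; only x_t, s_prev, and s_t change from step to step.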

2. Backpropagation
The output O_t at each time step produces an error value E_t.
The loss function can be either the cross-entropy loss or the squared-error loss.
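As a rough illustration of the two loss choices, the sketch below computes E_t for a single time step. The helper names `cross_entropy` and `squared_error`, the factor of 1/2 in the squared error, and the example vectors are assumptions made for this example.

```python
import numpy as np

def cross_entropy(o_t, y_t):
    """Cross-entropy between a predicted distribution o_t and a one-hot target y_t."""
    eps = 1e-12  # small constant to avoid log(0)
    return -np.sum(y_t * np.log(o_t + eps))

def squared_error(o_t, y_t):
    """Half the squared Euclidean distance between prediction and target."""
    return 0.5 * np.sum((o_t - y_t) ** 2)

o_t = np.array([0.7, 0.2, 0.1])   # hypothetical softmax output at one time step
y_t = np.array([1.0, 0.0, 0.0])   # hypothetical one-hot target

print(cross_entropy(o_t, y_t))
print(squared_error(o_t, y_t))
```

Cross-entropy is the usual partner of a softmax output layer; squared error is more common when the outputs are real-valued.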
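First, the formulas.
Total error:
$$E = \sum_{t} E_t$$
Parameter gradients:
$$\frac{\partial E}{\partial U} = \sum_{t} \frac{\partial E_t}{\partial U}$$
$$\frac{\partial E}{\partial V} = \sum_{t} \frac{\partial E_t}{\partial V}$$
$$\frac{\partial E}{\partial W} = \sum_{t} \frac{\partial E_t}{\partial W}$$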
In other words, the gradient for each parameter is the sum, over all time steps, of the partial derivatives of that step's error (this holds for U, V, and W alike).
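Take W as an example.
First, expand with the chain rule:
$$\frac{\partial E_t}{\partial W} = \frac{\partial E_t}{\partial O_t} \frac{\partial O_t}{\partial S_t} \frac{\partial S_t}{\partial W}$$
Then substitute the forward-propagation formula
$$S_t = f(U X_t + W S_{t-1})$$
S_t turns out to depend, directly or indirectly, on the s of every earlier time step, so we get the following formula (here \partial S_k / \partial W is the direct derivative at step k, with S_{k-1} held fixed):
$$\frac{\partial E_t}{\partial W} = \sum_{k=1}^{t} \frac{\partial E_t}{\partial O_t} \frac{\partial O_t}{\partial S_t} \frac{\partial S_t}{\partial S_k} \frac{\partial S_k}{\partial W}$$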
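The formulas for V and U are as follows. V enters only through the output layer, so no sum over earlier steps is needed; U, like W, affects every earlier hidden state:
$$\frac{\partial E_t}{\partial V} = \frac{\partial E_t}{\partial O_t} \frac{\partial O_t}{\partial V}$$
$$\frac{\partial E_t}{\partial U} = \sum_{k=1}^{t} \frac{\partial E_t}{\partial O_t} \frac{\partial O_t}{\partial S_t} \frac{\partial S_t}{\partial S_k} \frac{\partial S_k}{\partial U}$$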
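The gradient computation in this section (backpropagation through time) can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: f is taken to be tanh, g softmax, E_t cross-entropy with integer class labels, and the initial hidden state zero. The names `forward`, `total_loss`, and `bptt` are hypothetical. The sketch walks backward through time so that each step accumulates the contributions from all later errors, then checks one entry of the W gradient against a finite-difference estimate.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def forward(xs, U, W, V):
    """Return hidden states (with the zero initial state prepended) and outputs."""
    T, hidden_dim = len(xs), U.shape[0]
    S = np.zeros((T + 1, hidden_dim))      # S[0] is the initial state
    O = np.zeros((T, V.shape[0]))
    for t in range(T):
        S[t + 1] = np.tanh(U @ xs[t] + W @ S[t])
        O[t] = softmax(V @ S[t + 1])
    return S, O

def total_loss(xs, ys, U, W, V):
    """E = sum_t E_t with cross-entropy E_t; ys holds integer class labels."""
    _, O = forward(xs, U, W, V)
    return -sum(np.log(O[t, ys[t]]) for t in range(len(xs)))

def bptt(xs, ys, U, W, V):
    """Gradients of the total loss with respect to U, W, and V."""
    S, O = forward(xs, U, W, V)
    dU, dW, dV = np.zeros_like(U), np.zeros_like(W), np.zeros_like(V)
    ds_next = np.zeros(U.shape[0])         # gradient arriving from step t+1
    for t in reversed(range(len(xs))):
        dz = O[t].copy()
        dz[ys[t]] -= 1.0                   # gradient at the softmax input (softmax + cross-entropy)
        dV += np.outer(dz, S[t + 1])       # V only sees the current hidden state
        ds = V.T @ dz + ds_next            # direct path plus path through later states
        da = ds * (1.0 - S[t + 1] ** 2)    # back through tanh
        dW += np.outer(da, S[t])           # direct derivative at this step
        dU += np.outer(da, xs[t])
        ds_next = W.T @ da                 # pass the gradient on to step t-1
    return dU, dW, dV

# Hypothetical sizes and random data, for illustration only.
rng = np.random.default_rng(0)
U = rng.normal(scale=0.5, size=(4, 3))
W = rng.normal(scale=0.5, size=(4, 4))
V = rng.normal(scale=0.5, size=(2, 4))
xs = rng.normal(size=(6, 3))
ys = rng.integers(0, 2, size=6)

dU, dW, dV = bptt(xs, ys, U, W, V)

# Finite-difference check on one entry of W.
eps = 1e-5
W_plus, W_minus = W.copy(), W.copy()
W_plus[1, 2] += eps
W_minus[1, 2] -= eps
numeric = (total_loss(xs, ys, U, W_plus, V) - total_loss(xs, ys, U, W_minus, V)) / (2 * eps)
print(abs(numeric - dW[1, 2]))
```

The printed difference should be very small if the analytic gradient matches the numerical one, which is a common sanity check when writing backpropagation by hand.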


III. Applications and shortcomings of RNNs
1. Application areas
Natural language processing (NLP) and other sequence data: text generation, language modeling, video processing, image processing
Machine translation, text similarity calculation, image caption generation
Speech recognition
Recommendation systems
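2. Shortcomings
RNNs are prone to vanishing or exploding gradients.
Reason: the gradient that flows back across many time steps contains a product of factors \partial S_j / \partial S_{j-1}, one per step. Over a long time span, if these factors are consistently smaller than 1 the product shrinks toward zero, so the contribution of distant steps (the long-term memory) fades and the gradient vanishes; if they are consistently larger than 1 the product grows without bound and the gradient explodes.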
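The effect of multiplying many per-step factors can be seen in a small numerical experiment. The sketch below makes simplifying assumptions that are not in the original: the activation is linear, so each per-step factor is just W, and W is a scaled orthogonal matrix, so the size of the product is exactly the scale raised to the number of steps. The function name `jacobian_product_norm` and the scale values are hypothetical. With tanh the derivative is at most 1, so shrinkage would be at least as strong.

```python
import numpy as np

def jacobian_product_norm(W, steps):
    """Spectral norm of the product of `steps` copies of the per-step factor.

    Assumes a linear activation, so the factor dS_t/dS_{t-1} is just W.
    """
    J = np.eye(W.shape[0])
    for _ in range(steps):
        J = W @ J
    return np.linalg.norm(J, 2)

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))   # random orthogonal matrix

for gain in (0.5, 1.0, 1.5):
    W = gain * Q                               # every singular value of W equals gain
    print(gain, [jacobian_product_norm(W, k) for k in (1, 10, 50)])
```

With a gain below 1 the norm collapses toward zero as the number of steps grows, with a gain above 1 it grows rapidly, and only a gain of exactly 1 keeps it stable.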
Criticism and corrections are welcome in the comments. Thank you!