How to calculate the number of parameters in the LSTM layer
2022-06-09 03:28:00 【deephub】
Long short-term memory networks (usually called "LSTMs") are a special kind of RNN. A well-designed LSTM can learn long-term dependencies. As the name suggests, it can capture both long-term and short-term dependencies.

Every LSTM layer has four gates:
- Forget gate
- Input gate
- New cell state gate
- Output gate
Let's count the parameters of a single LSTM unit.
Every gate applies the same kind of linear operation to its inputs, so we only need to count the parameters of one gate and multiply by 4. Take the forget gate as an example:

h(t-1): hidden state from the previous time step
x(t): n-dimensional input vector
b: bias term
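For reference, the forget gate in the standard LSTM formulation is usually written as below. The symbols f_t (the gate's output) and \sigma (the logistic sigmoid) are not named in the text above and are introduced here as assumptions.

```latex
f_t = \sigma\left( W_f \cdot [h_{t-1}, x_t] + b_f \right)
```

The other three gates have the same form with their own weights and biases, which is why counting one gate and multiplying by 4 is enough.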
Since h(t-1) and x(t) are already known, W_f and b_f are the unknowns that the LSTM has to learn. W_f is applied to the concatenation [h(t-1), x(t)], so for a single unit:
W_f: num_units + input_dim weights, one per entry of concat[h(t-1), x(t)]
b_f: 1
This gives the parameter count for a single unit:
num_param = no_of_gate * (num_units + input_dim + 1)
An LSTM layer has four gates, so the final equation is:
num_param = 4 * (num_units + input_dim + 1)
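As a quick check, the per-unit formula can be written as a small Python function. This is only an illustrative sketch: the function name `params_per_unit` and its argument names are made up for this example and do not come from any library.

```python
def params_per_unit(num_units: int, input_dim: int, num_gates: int = 4) -> int:
    """Parameters owned by ONE LSTM unit (hypothetical helper, for illustration).

    Each gate needs one weight per entry of concat[h(t-1), x(t)],
    i.e. num_units + input_dim weights, plus a single bias.
    """
    weights_per_gate = num_units + input_dim
    bias_per_gate = 1
    return num_gates * (weights_per_gate + bias_per_gate)


# One unit of a 200-unit layer that reads 4096-dimensional inputs
print(params_per_unit(num_units=200, input_dim=4096))  # 17188
```

Note that num_units still appears in the count for a single unit, because every unit sees the full previous hidden state h(t-1).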
In practice we don't work with a single LSTM unit. A layer contains num_units of them, each with its own set of weights, so the count for the whole layer is:
num_params = 4 * [(num_units + input_dim + 1) * num_units]
num_units = number of hidden units = size of h(t-1) = output_dim
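The layer-level formula can be cross-checked by allocating the weights of one gate explicitly and counting the entries. The sketch below uses plain Python lists and small, arbitrary sizes. The helper names (`lstm_layer_params`, `build_gate`, `count_entries`) are hypothetical and exist only for this illustration.

```python
def lstm_layer_params(num_units: int, input_dim: int, num_gates: int = 4) -> int:
    """Layer-level count: num_gates * (num_units + input_dim + 1) * num_units."""
    return num_gates * (num_units + input_dim + 1) * num_units


def build_gate(num_units: int, input_dim: int):
    """Allocate zero-filled weights for one gate.

    W has one row per unit, and each row has one weight per entry of
    concat[h(t-1), x(t)]. b has one bias per unit.
    """
    W = [[0.0] * (num_units + input_dim) for _ in range(num_units)]
    b = [0.0] * num_units
    return W, b


def count_entries(W, b) -> int:
    """Count every scalar stored in W and b."""
    return sum(len(row) for row in W) + len(b)


# Small, arbitrary sizes chosen only for the check
num_units, input_dim = 3, 5
W_f, b_f = build_gate(num_units, input_dim)
per_gate = count_entries(W_f, b_f)   # 3 * 8 + 3 = 27
print(4 * per_gate)                  # 108
print(lstm_layer_params(num_units, input_dim))  # 108
```

Both counts agree because the formula is just the size of one gate's weight matrix and bias vector, multiplied by the four gates.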
Now let's check the parameter count of a real LSTM layer:
# One LSTM layer: 200 units over sequences of 16 steps with 4096 features per step
from keras.models import Sequential
from keras.layers import LSTM
model = Sequential()
# input_shape = (timesteps, input_dim)
model.add(LSTM(200, input_shape=(16, 4096)))
model.summary()
Keras reports:
Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_2 (LSTM)                (None, 200)               3437600
=================================================================
Total params: 3,437,600
Trainable params: 3,437,600
Non-trainable params: 0
_________________________________________________________________
Let's use the above formula to calculate manually :
num_params = 4 * [(num_units + input_dim + 1) * num_units]
num_params = 4*[(200+4096+1) * 200]
num_params = 3437600
The result matches the Keras summary.
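The same total can also be split by weight tensor. The sketch below assumes the usual Keras LSTM layout, in which the four gates are packed side by side into a `kernel` of shape (input_dim, 4 * units), a `recurrent_kernel` of shape (units, 4 * units), and a `bias` of shape (4 * units,). The function name `keras_style_breakdown` is hypothetical.

```python
def keras_style_breakdown(num_units: int, input_dim: int) -> dict:
    """Split the LSTM parameter count by weight tensor (assumed Keras layout)."""
    return {
        "kernel": input_dim * 4 * num_units,            # weights applied to x(t)
        "recurrent_kernel": num_units * 4 * num_units,  # weights applied to h(t-1)
        "bias": 4 * num_units,                          # one bias per gate per unit
    }


parts = keras_style_breakdown(num_units=200, input_dim=4096)
print(parts)
# {'kernel': 3276800, 'recurrent_kernel': 160000, 'bias': 800}
print(sum(parts.values()))  # 3437600
```

The sequence length (16 in the example model) does not appear anywhere in the count, because the same weights are reused at every time step.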
https://avoid.overfit.cn/post/ed5f0d482d5e486387f2708b7d0d58d8
Author: Maheshmj