Neural Network Parameter Initialization
2022-07-27 05:13:00 【Mr_health】
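1. Parameter initialization for BN layers
The parameters of a BN layer that need initializing are mainly the scale (the weight attribute in PyTorch) and the bias. The bias is usually initialized to 0; the scale has two common choices:
- Initialize to 1
- Initialize to 0: this makes the block start out as if the branch ending in that BN layer were absent. It is generally used in residual structures such as a resblock, and it is only viable because the shortcut still passes the input through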
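To see why a zero scale only works alongside a shortcut, here is a minimal pure-Python sketch (no PyTorch involved). The helper names batch_norm and residual_block, and the toy branch that simply doubles its input, are illustrative assumptions rather than anything from a library.

```python
import math

def batch_norm(xs, gamma, beta=0.0, eps=1e-5):
    """Normalize one channel over a batch, then apply scale (gamma) and bias (beta)."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in xs]

def residual_block(xs, gamma):
    """Toy block: out = relu(shortcut + BN(branch(x))). The branch is a stand-in (2 * x)."""
    branch = batch_norm([2.0 * x for x in xs], gamma)
    return [max(0.0, x + b) for x, b in zip(xs, branch)]

# Non-negative inputs, as if they came out of a previous ReLU.
batch = [0.5, 1.0, 3.0, 4.5]

out_zero = residual_block(batch, gamma=0.0)  # scale initialized to 0
out_one = residual_block(batch, gamma=1.0)   # scale initialized to 1

print(out_zero)
print(out_one)
```

With gamma set to 0 the branch contributes exactly zero, so the block returns its input unchanged. With gamma set to 1 the output already differs from the input at initialization. Without the shortcut term, a zero scale would instead wipe out the signal entirely.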
# Zero-initialize the last BN in each residual branch,
# so that the residual branch starts with zeros, and each residual block behaves like an identity.
# This improves the model by 0.2~0.3% according to https://arxiv.org/abs/1706.02677
if zero_init_residual:
    for m in self.modules():
        if isinstance(m, Bottleneck):
            nn.init.constant_(m.bn3.weight, 0)
        elif isinstance(m, BasicBlock):
            nn.init.constant_(m.bn2.weight, 0)

2. Parameter initialization for FC layers
- weight: initialized from a normal distribution with mean 0; the variance can be tuned
- bias: 0.01 or 0, usually 0
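One detail worth keeping straight when tuning the spread: nn.init.normal_(tensor, mean, std) takes a standard deviation, so a value of 0.01 corresponds to a variance of 1e-4. The standard-library sketch below uses plain Python sampling as a stand-in for filling a tensor; the sample size and the seed are arbitrary assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

std = 0.01  # the kind of value passed as the third argument of nn.init.normal_
weights = [random.gauss(0.0, std) for _ in range(10_000)]

sample_std = statistics.pstdev(weights)
sample_var = statistics.pvariance(weights)

print(f"sample std      = {sample_std:.5f}")
print(f"sample variance = {sample_var:.7f}")
```

The sample standard deviation comes out close to 0.01 and the sample variance close to 0.0001, which is the square of the standard deviation.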
if isinstance(m, nn.Linear):
    nn.init.normal_(m.weight, 0, 0.01)  # mean 0, std 0.01 (the third argument is the standard deviation, not the variance)
    nn.init.zeros_(m.bias)

3. Parameter initialization for conv layers
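- bias: 0.01 or 0, usually 0
- weight:
(1) Xavier initialization: see "Xavier parameter initialization" and "Deep learning parameter initialization (1): Xavier initialization, with code"
(2) He initialization: see "Deep learning parameter initialization (2): Kaiming initialization, with code"
With He initialization, mode can be set to fan_in or fan_out. As one blogger summarizes:
I looked at the mainstream models, such as mobilenet_v2 and resnet, and they all use mode='fan_out'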
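To make the fan_in / fan_out choice concrete, the following standard-library sketch computes both fans for a conv weight of shape (out_channels, in_channels, kH, kW) and the standard deviations that He and Xavier normal initialization would use. The function names and the example layer shape are assumptions made for illustration; the formulas are the usual ones, std = gain / sqrt(fan) for He and std = gain * sqrt(2 / (fan_in + fan_out)) for Xavier.

```python
import math

def conv_fans(out_channels, in_channels, kernel_h, kernel_w):
    """Return (fan_in, fan_out) for a conv weight of shape (out, in, kH, kW)."""
    receptive_field = kernel_h * kernel_w
    return in_channels * receptive_field, out_channels * receptive_field

def he_normal_std(fan, gain=math.sqrt(2.0)):
    """He (Kaiming) normal: std = gain / sqrt(fan). The default gain sqrt(2) assumes ReLU."""
    return gain / math.sqrt(fan)

def xavier_normal_std(fan_in, fan_out, gain=1.0):
    """Xavier (Glorot) normal: std = gain * sqrt(2 / (fan_in + fan_out))."""
    return gain * math.sqrt(2.0 / (fan_in + fan_out))

# Assumed example layer: 64 filters of size 7x7 over 3 input channels.
fan_in, fan_out = conv_fans(64, 3, 7, 7)

print("fan_in :", fan_in, " He std:", he_normal_std(fan_in))
print("fan_out:", fan_out, " He std:", he_normal_std(fan_out))
print("Xavier std:", xavier_normal_std(fan_in, fan_out))
```

For this layer fan_out is much larger than fan_in, so mode='fan_out' gives a smaller initial standard deviation. Roughly speaking, fan_in keeps the variance of activations stable in the forward pass, while fan_out does the same for gradients in the backward pass.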