Neural Network Parameter Initialization
2022-07-27 05:13:00 【Mr_health】
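1. Parameter initialization for BN layers
The parameters of a BN layer that need initializing are mainly scale (called weight in PyTorch) and bias. bias is generally initialized to 0; scale can be initialized in two ways:
- Initialize to 1
- Initialize to 0: the residual branch then outputs zero at the start of training, as if the branch did not exist. This is generally used in residual structures such as a resblock, and it only works because the shortcut still passes the input through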
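As a rough illustration of why a zero scale makes a residual block start out as an identity mapping, here is a minimal pure-Python sketch. It is a toy and not the author's code: `batch_norm`, `residual_block`, and `branch_weight` are hypothetical names, a single multiplication stands in for the convolutions, and the ReLU that normally follows the addition is left out.

```python
import math


def batch_norm(values, scale, bias, eps=1e-5):
    """Normalize a list of floats to zero mean and unit variance, then apply scale and bias."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [scale * (v - mean) / math.sqrt(var + eps) + bias for v in values]


def residual_block(x, branch_weight, bn_scale, bn_bias=0.0):
    """Toy residual block: out = x + BN(branch_weight * x)."""
    branch = [branch_weight * v for v in x]          # stand-in for the conv layers
    branch = batch_norm(branch, bn_scale, bn_bias)   # last BN of the residual branch
    return [xi + bi for xi, bi in zip(x, branch)]    # shortcut addition


x = [0.5, -1.0, 2.0, 3.5]
out_zero = residual_block(x, branch_weight=0.7, bn_scale=0.0)  # scale initialized to 0
out_one = residual_block(x, branch_weight=0.7, bn_scale=1.0)   # scale initialized to 1

print(out_zero == x)  # the block passes its input through unchanged
print(out_one == x)   # the branch already contributes to the output
```

With the scale at 0 the branch contributes nothing, so only the shortcut carries the signal. Without a shortcut, the same initialization would zero out the layer's output entirely.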
# Zero-initialize the last BN in each residual branch,
# so that the residual branch starts with zeros, and each residual block behaves like an identity.
# This improves the model by 0.2~0.3% according to https://arxiv.org/abs/1706.02677
if zero_init_residual:
    for m in self.modules():
        if isinstance(m, Bottleneck):
            nn.init.constant_(m.bn3.weight, 0)
        elif isinstance(m, BasicBlock):
            nn.init.constant_(m.bn2.weight, 0)
2. Parameter initialization for FC layers
- weight: initialized from a normal distribution with mean 0 and an adjustable standard deviation
- bias: 0.01 or 0, generally 0
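To make the scale of this initialization concrete, the following pure-Python sketch draws weights from a zero-mean normal distribution and zeroes the bias. It only mimics the idea with the standard library; `init_linear`, the layer sizes, and the seed are assumptions chosen for illustration. Note that 0.01 is used as the standard deviation here, so the variance is 0.0001.

```python
import random
import statistics


def init_linear(in_features, out_features, std=0.01, seed=0):
    """Return (weight, bias) for a toy fully connected layer.

    weight: out_features x in_features values drawn from N(0, std^2)
    bias:   out_features zeros
    """
    rng = random.Random(seed)
    weight = [[rng.gauss(0.0, std) for _ in range(in_features)]
              for _ in range(out_features)]
    bias = [0.0] * out_features
    return weight, bias


weight, bias = init_linear(in_features=256, out_features=64)

# Check the empirical statistics of the sampled weights.
flat = [w for row in weight for w in row]
sample_mean = sum(flat) / len(flat)
sample_std = statistics.pstdev(flat)

print(f"mean={sample_mean:.6f} std={sample_std:.6f}")
```

The printed mean should sit close to 0 and the standard deviation close to 0.01, which is a quick sanity check worth running on a real layer as well.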
if isinstance(m, nn.Linear):
    nn.init.normal_(m.weight, 0, 0.01)  # mean 0, std 0.01 (the third argument is the standard deviation, not the variance)
    nn.init.zeros_(m.bias)
3. Parameter initialization for conv layers
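- bias: 0.01 or 0, generally 0
- weight:
(1) Xavier initialization: see the posts "Xavier parameter initialization" and "Deep learning parameter initialization (1): Xavier initialization, with code"
(2) He initialization: see the post "Deep learning parameter initialization (2): Kaiming initialization, with code"
In He initialization, mode can be set to fan_in or fan_out. One blogger summarizes:
Looking at the mainstream models such as mobilenet_v2 and resnet, they all use mode='fan_out'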
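The difference between the two modes comes down to which fan is used to scale the weights. For a conv weight of shape (out_channels, in_channels, kH, kW), fan_in counts the inputs feeding one output unit and fan_out counts the outputs fed by one input unit. The sketch below computes the resulting standard deviations using the standard formulas, He: gain / sqrt(fan) with gain sqrt(2) for ReLU, and Xavier: gain * sqrt(2 / (fan_in + fan_out)). The helper names `conv_fans`, `he_std`, and `xavier_std` and the layer shape are assumptions made for this example.

```python
import math


def conv_fans(out_channels, in_channels, kernel_h, kernel_w):
    """Return (fan_in, fan_out) for a conv weight of shape (out, in, kH, kW)."""
    receptive_field = kernel_h * kernel_w
    fan_in = in_channels * receptive_field
    fan_out = out_channels * receptive_field
    return fan_in, fan_out


def he_std(fan, gain=math.sqrt(2.0)):
    """Standard deviation of He (Kaiming) normal init; the default gain is the ReLU gain."""
    return gain / math.sqrt(fan)


def xavier_std(fan_in, fan_out, gain=1.0):
    """Standard deviation of Xavier (Glorot) normal init."""
    return gain * math.sqrt(2.0 / (fan_in + fan_out))


# Example layer: 64 input channels, 128 output channels, 3x3 kernel.
fan_in, fan_out = conv_fans(out_channels=128, in_channels=64, kernel_h=3, kernel_w=3)

std_fan_in = he_std(fan_in)     # scales by the number of incoming connections
std_fan_out = he_std(fan_out)   # scales by the number of outgoing connections
std_xavier = xavier_std(fan_in, fan_out)

print(fan_in, fan_out)
print(std_fan_in, std_fan_out, std_xavier)
```

Scaling by fan_in keeps the variance of activations roughly stable in the forward pass, while scaling by fan_out does the same for gradients in the backward pass. The two only differ when a layer changes the channel count, as in the example above.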