Neural network parameter initialization
2022-07-27 05:13:00 【Mr_health】
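1. Initializing BN layer parameters
The parameters of a BN layer that need initializing are scale and bias. bias is usually initialized to 0; scale can be initialized in one of two ways:
- Initialize to 1
- Initialize to 0: the network then starts out as if the BN layer were absent, because the residual branch outputs zeros. This is typically used in residual structures such as a resblock, and it is only viable because the shortcut still carries the input through.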
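To make the second option concrete, here is a toy sketch in plain Python, under simplifying assumptions: each sample is a single scalar, the residual branch is just a multiplication by a hypothetical `branch_weight`, and `batch_norm` / `residual_block` are illustrative helpers rather than any library's API. It only shows that with scale 0 (and bias 0) the block returns its input unchanged, while scale 1 does not.

```python
import math

def batch_norm(values, gamma, beta, eps=1e-5):
    """Normalize a batch of scalars, then apply the learnable scale (gamma) and bias (beta)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]

def residual_block(xs, branch_weight, gamma, beta=0.0):
    """Toy residual block on scalars: relu(x + BN(branch_weight * x))."""
    branch = batch_norm([branch_weight * x for x in xs], gamma, beta)
    return [max(0.0, x + b) for x, b in zip(xs, branch)]

# Inputs are positive, as they would be after a preceding ReLU.
xs = [0.5, 1.0, 2.0, 3.5]

# scale = 0: the branch contributes nothing, so the block acts as an identity.
out_zero = residual_block(xs, branch_weight=-1.7, gamma=0.0)

# scale = 1: the normalized branch output is added to the shortcut.
out_one = residual_block(xs, branch_weight=-1.7, gamma=1.0)

print(out_zero == xs)
print(out_one == xs)
```

Without the shortcut term `x`, a scale of 0 would make the block output all zeros regardless of input, which is why this initialization is reserved for residual structures.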
# Fragment of a model's __init__; assumes `import torch.nn as nn` and the Bottleneck / BasicBlock classes.
# Zero-initialize the last BN in each residual branch,
# so that the residual branch starts with zeros, and each residual block behaves like an identity.
# This improves the model by 0.2~0.3% according to https://arxiv.org/abs/1706.02677
if zero_init_residual:
    for m in self.modules():
        if isinstance(m, Bottleneck):
            nn.init.constant_(m.bn3.weight, 0)
        elif isinstance(m, BasicBlock):
            nn.init.constant_(m.bn2.weight, 0)

2. Initializing FC layer parameters
- weight: initialized from a normal distribution with mean 0; the variance is adjustable
- bias: 0.01 or 0, usually 0
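One detail worth keeping in mind when tuning the spread: PyTorch's `nn.init.normal_(tensor, mean, std)` takes the standard deviation, not the variance, as its third argument. The sketch below uses only the Python standard library (not PyTorch) to draw weights with a standard deviation of 0.01 and check what variance that corresponds to; the sample size and seed are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # arbitrary seed, only for reproducibility

std = 0.01  # the value one would pass as the third argument of nn.init.normal_
weights = [random.gauss(0.0, std) for _ in range(100_000)]

empirical_std = statistics.pstdev(weights)
empirical_var = statistics.pvariance(weights)

# The variance is std squared, so it is far smaller than 0.01.
print(f"std ~ {empirical_std:.5f}, variance ~ {empirical_var:.7f}")
```

A standard deviation of 0.01 therefore means a variance of about 0.0001; to get a variance of 0.01 the standard deviation would have to be 0.1.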
if isinstance(m, nn.Linear):
    nn.init.normal_(m.weight, 0, 0.01)  # mean 0, std 0.01 (the third argument is the standard deviation, not the variance)
    nn.init.zeros_(m.bias)

3. Initializing conv layer parameters
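- bias: 0.01 or 0, usually 0
- weight:
(1) Xavier initialization: see "Xavier parameter initialization" and "Deep learning parameter initialization (1): Xavier initialization, with code"
(2) He initialization: see "Deep learning parameter initialization (2): Kaiming initialization, with code"
With He initialization, mode can be set to either fan_in or fan_out. As one blogger summarized:
Looking at the mainstream models such as mobilenet_v2 and resnet, they all use mode='fan_out'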
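To show what the mode choice changes numerically, the sketch below computes the standard deviations that the normal-distribution variants of these schemes would use for a conv weight of shape (out_channels, in_channels, kH, kW). It is plain Python, not PyTorch; the helper names are hypothetical, the layer sizes are made up, groups are ignored, and the formulas follow the documented behavior of `nn.init.kaiming_normal_` (std = gain / sqrt(fan), with gain = sqrt(2) for ReLU) and `nn.init.xavier_normal_` (std = gain * sqrt(2 / (fan_in + fan_out))).

```python
import math

def conv_fans(in_channels, out_channels, kernel_size):
    """Return (fan_in, fan_out) for a conv weight, ignoring groups."""
    kh, kw = kernel_size
    receptive_field = kh * kw
    return in_channels * receptive_field, out_channels * receptive_field

def kaiming_normal_std(fan, gain=math.sqrt(2.0)):
    """Std for He (Kaiming) normal init; the default gain is the ReLU gain."""
    return gain / math.sqrt(fan)

def xavier_normal_std(fan_in, fan_out, gain=1.0):
    """Std for Xavier (Glorot) normal init."""
    return gain * math.sqrt(2.0 / (fan_in + fan_out))

# Hypothetical layer: 64 input channels, 128 output channels, 3x3 kernel.
fan_in, fan_out = conv_fans(64, 128, (3, 3))

std_fan_in = kaiming_normal_std(fan_in)    # mode='fan_in'
std_fan_out = kaiming_normal_std(fan_out)  # mode='fan_out'
std_xavier = xavier_normal_std(fan_in, fan_out)

print(fan_in, fan_out)
print(std_fan_in, std_fan_out, std_xavier)
```

Per the PyTorch documentation, fan_in preserves the magnitude of the variance of the weights in the forward pass and fan_out preserves it in the backward pass. When a layer widens the channel count, as in this example, fan_out is larger and so yields the smaller standard deviation.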