Neural network parameter initialization
2022-07-27 07:07:00 【Mr_ health】
1. Parameter initialization of BN layers
A BN layer has two parameters to initialize: scale and bias. The bias is usually initialized to 0; the scale can be initialized in two ways:
- Initialize to 1.
- Initialize to 0: the BN layer then outputs zero at the start of training, as if its branch did not exist. This is generally used in structures with residual connections (resblock); because the shortcut still passes the input through, this initialization is feasible.
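The effect of the zero scale can be checked on a toy block. The sketch below is only an illustration: TinyResidualBlock is a hypothetical class, not part of any library, and it omits the activation that real residual blocks usually apply after the addition. In PyTorch the BN scale is the layer's weight attribute, and nn.BatchNorm2d already defaults to scale 1 and bias 0.

```python
import torch
import torch.nn as nn


class TinyResidualBlock(nn.Module):
    """Hypothetical minimal residual block: x + BN(conv(x)), no final activation."""

    def __init__(self, channels: int, zero_init: bool) -> None:
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        # BN "scale" is bn.weight (gamma); BN "bias" is bn.bias (beta).
        nn.init.constant_(self.bn.weight, 0.0 if zero_init else 1.0)
        nn.init.zeros_(self.bn.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Shortcut (x) plus residual branch (conv followed by BN).
        return x + self.bn(self.conv(x))


torch.manual_seed(0)
x = torch.randn(4, 8, 16, 16)  # arbitrary input shape for illustration

zero_block = TinyResidualBlock(8, zero_init=True)
one_block = TinyResidualBlock(8, zero_init=False)

# With scale 0 and bias 0 the residual branch outputs zeros,
# so the block returns its input unchanged.
identity_at_start = torch.allclose(zero_block(x), x)
# With scale 1 the residual branch contributes from the first step.
changed_by_one_init = not torch.allclose(one_block(x), x)

print(identity_at_start, changed_by_one_init)
```

The block only starts out as an identity; once training updates the scale, the residual branch contributes as usual.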
# Zero-initialize the last BN in each residual branch,
# so that the residual branch starts with zeros, and each residual block behaves like an identity.
# This improves the model by 0.2~0.3% according to https://arxiv.org/abs/1706.02677
if zero_init_residual:
    for m in self.modules():
        if isinstance(m, Bottleneck):
            nn.init.constant_(m.bn3.weight, 0)
        elif isinstance(m, BasicBlock):
            nn.init.constant_(m.bn2.weight, 0)
2. Parameter initialization of FC layers
- weight: initialized from a normal distribution with mean 0; the standard deviation can be adjusted
- bias: 0.01 or 0; usually 0
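One detail worth checking: the third positional argument of nn.init.normal_ is the standard deviation, not the variance. The following is a small sanity check under assumed settings; the layer size of 1024 is arbitrary and chosen only so that the empirical statistics are stable.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
fc = nn.Linear(1024, 1024)  # layer size is arbitrary, for illustration only

# normal_(tensor, mean, std): the last argument is the standard deviation.
nn.init.normal_(fc.weight, mean=0.0, std=0.01)
nn.init.zeros_(fc.bias)

weight_std = fc.weight.std().item()
weight_var = fc.weight.var().item()

# The empirical std should be close to 0.01 and the variance close to 0.01 ** 2.
print(f"std={weight_std:.4f}, var={weight_var:.6f}")
```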
if isinstance(m, nn.Linear):
    nn.init.normal_(m.weight, 0, 0.01)  # mean 0, standard deviation 0.01
    nn.init.zeros_(m.bias)
3. Parameter initialization of conv layers
- bias: 0.01 or 0; usually 0
- weight:
(1) Xavier initialization: see "Xavier Parameter Initialization" and "Deep Learning Parameter Initialization (1): Xavier Initialization, with Code"
(2) He initialization: see "Deep Learning Parameter Initialization (2): Kaiming Initialization, with Code"
With He initialization you can choose the mode to be fan_in or fan_out. Some bloggers have concluded:
- If the weights are determined implicitly through a linear layer (convolution or fully connected), set mode='fan_in';
- If the weights are created explicitly as a random matrix, set mode='fan_out'.
I looked at mainstream models such as mobilenet_v2 and resnet; they all use mode='fan_out'.
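To make the difference between the two modes concrete, the sketch below compares the weight standard deviation that nn.init.kaiming_normal_ produces for each mode, then applies the rules from all three sections to a toy model. The layer sizes, the init_weights helper and the toy model are assumptions made for illustration; they are not taken from mobilenet_v2 or resnet. For a conv weight of shape (out_channels, in_channels, kH, kW), fan_in is in_channels * kH * kW and fan_out is out_channels * kH * kW, and with nonlinearity='relu' the standard deviation is sqrt(2 / fan).

```python
import math

import torch
import torch.nn as nn

torch.manual_seed(0)
conv = nn.Conv2d(in_channels=64, out_channels=256, kernel_size=3)

fan_in = 64 * 3 * 3    # in_channels * kernel area
fan_out = 256 * 3 * 3  # out_channels * kernel area

nn.init.kaiming_normal_(conv.weight, mode="fan_in", nonlinearity="relu")
std_fan_in = conv.weight.std().item()

nn.init.kaiming_normal_(conv.weight, mode="fan_out", nonlinearity="relu")
std_fan_out = conv.weight.std().item()

expected_fan_in = math.sqrt(2.0 / fan_in)
expected_fan_out = math.sqrt(2.0 / fan_out)
print(f"fan_in:  measured {std_fan_in:.4f}, expected {expected_fan_in:.4f}")
print(f"fan_out: measured {std_fan_out:.4f}, expected {expected_fan_out:.4f}")


def init_weights(m: nn.Module) -> None:
    """Hypothetical helper combining the per-layer rules described above."""
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, mode="fan_out", nonlinearity="relu")
        if m.bias is not None:
            nn.init.zeros_(m.bias)
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.ones_(m.weight)   # scale = 1
        nn.init.zeros_(m.bias)    # bias = 0
    elif isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, 0.0, 0.01)  # mean 0, std 0.01
        if m.bias is not None:
            nn.init.zeros_(m.bias)


# Toy model used only to exercise the helper.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
)
model.apply(init_weights)  # apply() visits every submodule recursively
out = model(torch.randn(2, 3, 32, 32))
print(out.shape)
```

According to the PyTorch documentation, fan_in preserves the magnitude of the variance in the forward pass, while fan_out preserves it in the backward pass. When a layer has more output channels than input channels, as in this example, fan_out yields the smaller standard deviation.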