Deep Learning Theory - Initialization, Parameter Adjustment
2022-08-04 06:18:00 【Learning adventure】
Initialization
Training a deep learning model essentially means repeatedly updating the parameters w, so every parameter needs an initial value to start from.
Why initialization?
A neural network optimizes a very complex nonlinear model, and there is generally no guarantee of reaching a global optimum, so initialization plays a very important role.
□ The selection of the initial point can sometimes determine whether the algorithm converges;
□ When it converges, the initial point can determine how fast the learning converges and whether it converges to a point with high or low cost;
□ An initialization that is too large can lead to exploding gradients, and one that is too small can lead to vanishing gradients.
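The effect of the weight scale on gradients can be sketched in a few lines. This is a minimal illustration under simplifying assumptions, not a real training setup: the network is a stack of purely linear layers (no activation functions), the weights are Gaussian, and `backprop_gradient_norm`, `depth`, and `width` are names made up for this example.

```python
import math
import random


def backprop_gradient_norm(weight_std, depth=20, width=32, seed=0):
    """Push a gradient backward through `depth` random linear layers.

    For a linear layer y = W x, the gradient w.r.t. x is W^T times the
    gradient w.r.t. y. Activation functions are ignored (an assumption).
    """
    rng = random.Random(seed)
    grad = [1.0] * width  # gradient arriving at the last layer
    for _ in range(depth):
        weights = [[rng.gauss(0.0, weight_std) for _ in range(width)]
                   for _ in range(width)]
        # grad_in[j] = sum_i W[i][j] * grad_out[i]
        grad = [sum(weights[i][j] * grad[i] for i in range(width))
                for j in range(width)]
    return math.sqrt(sum(g * g for g in grad))


if __name__ == "__main__":
    for std in (1.0, 1.0 / math.sqrt(32), 0.01):
        print(f"std={std:.4f}  gradient norm={backprop_gradient_norm(std):.3e}")
```

With a standard deviation of 1.0 the gradient norm grows by many orders of magnitude, with 0.01 it shrinks toward zero, and with a scale near 1/sqrt(width) it stays in a moderate range.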
What is a good initialization?
A good initialization should meet the following two conditions:
□ The activation value of each layer of neurons will not be saturated;
□ The activation value of each layer should not be 0.
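The two conditions can be checked numerically for a single layer. The sketch below assumes a tanh layer with standard normal inputs and Gaussian weights; the 0.99 saturation threshold and the helper name `tanh_layer_stats` are arbitrary choices for illustration.

```python
import math
import random


def tanh_layer_stats(weight_std, width=64, seed=0):
    """Return (fraction of saturated units, mean |activation|) for one tanh layer."""
    rng = random.Random(seed)
    inputs = [rng.gauss(0.0, 1.0) for _ in range(width)]
    activations = []
    for _ in range(width):
        # Pre-activation of one unit with freshly sampled Gaussian weights.
        z = sum(rng.gauss(0.0, weight_std) * x for x in inputs)
        activations.append(math.tanh(z))
    saturated = sum(abs(a) > 0.99 for a in activations) / width
    mean_abs = sum(abs(a) for a in activations) / width
    return saturated, mean_abs


if __name__ == "__main__":
    for std in (5.0, 1.0 / math.sqrt(64), 0.001):
        saturated, mean_abs = tanh_layer_stats(std)
        print(f"std={std:.4f}  saturated={saturated:.2f}  mean|a|={mean_abs:.4f}")
```

Large weights push most units into saturation, tiny weights leave every activation close to 0, and a scale near 1/sqrt(width) avoids both.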
All-zero initialization: every parameter is initialized to 0.
Disadvantages: neurons in the same layer learn the same features, and the symmetry between different neurons cannot be broken.
If the weights of the neurons are initialized to 0, all neurons in a layer produce the same output, and every node in the hidden layers has the value zero. A neural network generally has a symmetrical structure, so in the first round of error backpropagation the updates to these parameters are also identical. In later updates, identical parameters cannot learn to extract different useful features, so deep learning models do not initialize all parameters to 0.
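The symmetry problem can be reproduced with a toy network. The following is a minimal sketch under assumptions that are not from the original post: one tanh hidden layer, a linear output, squared-error loss, a single training sample, and plain gradient descent. The function name `train_tiny_net` and all constants are hypothetical.

```python
import math
import random


def train_tiny_net(init, steps=50, lr=0.1, hidden=3, seed=0):
    """Train a 2-input tanh network on one sample; return the hidden-layer weights."""
    rng = random.Random(seed)

    def initial_weight():
        return 0.0 if init == "zeros" else rng.uniform(-0.5, 0.5)

    w1 = [[initial_weight() for _ in range(2)] for _ in range(hidden)]
    w2 = [initial_weight() for _ in range(hidden)]
    b2 = 0.0
    x, target = [1.0, -2.0], 1.0

    for _ in range(steps):
        # Forward pass.
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
        y = sum(w * hk for w, hk in zip(w2, h)) + b2
        # Backward pass for L = 0.5 * (y - target)^2.
        delta = y - target
        grad_w2 = [delta * hk for hk in h]
        grad_w1 = [[delta * w2[k] * (1.0 - h[k] ** 2) * xi for xi in x]
                   for k in range(hidden)]
        # Gradient descent update.
        w2 = [w - lr * g for w, g in zip(w2, grad_w2)]
        w1 = [[w - lr * g for w, g in zip(row, grow)]
              for row, grow in zip(w1, grad_w1)]
        b2 -= lr * delta
    return w1


if __name__ == "__main__":
    print("zeros :", train_tiny_net("zeros"))
    print("random:", train_tiny_net("random"))
```

With all-zero initialization the hidden activations are 0, so every weight gradient is 0 and the hidden units stay identical; only the output bias changes. With random initialization the hidden units start different and remain different.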
Parameter adjustment
Batch size: choose a power of 2 that matches the memory available on the computer.
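One way to apply this rule is to pick the largest power of 2 that fits in a memory budget. The sketch below assumes that a rough per-sample memory cost is already known, which in practice has to be measured; `largest_power_of_two_batch` and both of its parameters are hypothetical names.

```python
def largest_power_of_two_batch(memory_budget_bytes, bytes_per_sample):
    """Return the largest power-of-2 batch size that fits in the memory budget."""
    max_samples = memory_budget_bytes // bytes_per_sample
    if max_samples < 1:
        raise ValueError("not even one sample fits in the memory budget")
    # bit_length() - 1 is the exponent of the highest power of 2 <= max_samples.
    return 1 << (max_samples.bit_length() - 1)


if __name__ == "__main__":
    budget = 8 * 1024 ** 3       # assumed 8 GiB available for activations
    per_sample = 50 * 1024 ** 2  # assumed 50 MiB per sample
    print(largest_power_of_two_batch(budget, per_sample))
```

The result is only an upper bound on what fits; the batch size that trains best may be smaller.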

Hyperparameter tuning methods
Trial and error, grid search, random search, Bayesian optimization, Gaussian processes
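Of the methods listed, random search is the simplest to sketch. In the example below, `toy_validation_loss` is a made-up stand-in for "train a model and return its validation loss", and the search ranges for the learning rate and batch size are assumptions chosen only for illustration.

```python
import math
import random


def toy_validation_loss(learning_rate, batch_size):
    """Hypothetical objective; a real one would train and evaluate a model."""
    return ((math.log10(learning_rate) + 2.0) ** 2
            + 0.01 * abs(math.log2(batch_size) - 6))


def random_search(num_trials=50, seed=0):
    """Sample hyperparameters at random and keep the best trial."""
    rng = random.Random(seed)
    trials = []
    for _ in range(num_trials):
        learning_rate = 10 ** rng.uniform(-5, 0)  # log-uniform sample
        batch_size = 2 ** rng.randint(3, 9)       # powers of 2 from 8 to 512
        loss = toy_validation_loss(learning_rate, batch_size)
        trials.append((loss, learning_rate, batch_size))
    return min(trials), trials


if __name__ == "__main__":
    (loss, learning_rate, batch_size), _ = random_search()
    print(f"best loss={loss:.4f}  lr={learning_rate:.5f}  batch_size={batch_size}")
```

The learning rate is sampled on a logarithmic scale because its useful values usually span several orders of magnitude.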