Deep learning report (3)
2022-07-27 01:06:00 【Curve overtaker】
Contents
Chapter 6. Initialization
Chapter 7. Hyperparameter tuning
Chapter 6. Initialization
I. Why initialize?
- The choice of initial point can sometimes decide whether the algorithm converges at all.
- When it does converge, the initial point determines how fast learning converges and whether it settles at a high-cost or a low-cost point.
- Initial values that are too large lead to exploding gradients; initial values that are too small lead to vanishing gradients.
II. What makes a good initialization?
- The activations of each layer should not saturate.
- The activations of each layer should not be 0.
III. Common initialization methods
1. All-zero initialization
- All-zero initialization: every parameter starts at 0.
- Drawback: neurons in the same layer learn the same features, so the symmetry between neurons is never broken. If all weights are initialized to 0, every neuron in a layer produces the same output, and with an activation that maps 0 to 0 (such as tanh or ReLU) every node in the intermediate layers is also zero. Because a typical neural network has a symmetric structure, the first backward pass gives these neurons identical updates, and every later update does the same. Neurons that all learn the same parameters cannot extract useful features, so deep learning models do not initialize all parameters to 0.
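The symmetry argument above can be checked on a toy network. What follows is a minimal sketch rather than code from the report: the function name `train_zero_init`, the XOR data, the sigmoid hidden layer and the learning rate are all assumptions chosen for illustration. With sigmoid units the hidden activations start at 0.5 instead of 0, but the conclusion is unchanged: every hidden neuron receives exactly the same updates.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_zero_init(xs, ys, hidden=3, steps=50, lr=0.1):
    """Train a one-hidden-layer net (sigmoid hidden units, linear output)
    by per-sample gradient descent, starting from all-zero parameters."""
    n_in = len(xs[0])
    w1 = [[0.0] * n_in for _ in range(hidden)]  # hidden-layer weights
    b1 = [0.0] * hidden
    w2 = [0.0] * hidden                         # output weights
    b2 = 0.0
    for _ in range(steps):
        for x, y in zip(xs, ys):
            # Forward pass.
            h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                 for row, b in zip(w1, b1)]
            out = sum(w * hj for w, hj in zip(w2, h)) + b2
            err = out - y  # gradient of 0.5 * (out - y)^2 w.r.t. out
            # Backward pass: gradients are taken before any weight changes.
            grad_h = [err * w for w in w2]
            for j in range(hidden):
                dz = grad_h[j] * h[j] * (1.0 - h[j])
                for i in range(n_in):
                    w1[j][i] -= lr * dz * x[i]
                b1[j] -= lr * dz
                w2[j] -= lr * err * h[j]
            b2 -= lr * err
    return w1, w2


xs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
ys = [0.0, 1.0, 1.0, 0.0]
w1, w2 = train_zero_init(xs, ys)
print("hidden weight rows identical:", all(row == w1[0] for row in w1))
print("output weights identical:   ", len(set(w2)) == 1)
```

Both checks print True, because every hidden unit starts from the same values and goes through the same arithmetic at each step. The three hidden units therefore behave like a single unit, however long training runs.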
2. Random initialization
- Random initialization: the parameters are initialized to small random numbers, typically sampled from a Gaussian distribution with mean μ and standard deviation σ, so that the full parameter vector is a draw from a multidimensional Gaussian distribution.
- Drawback: a poorly chosen distribution makes the network hard to optimize. If the initial values are too small, the gradients in backpropagation are small; in deep networks this causes vanishing gradients and slows the convergence of the parameters. If the initial values are too large, neurons saturate easily.
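Both failure modes show up in a plain forward pass. The sketch below is illustrative only and assumes a tanh network of width 64 and depth 10; the helper name `forward_stats`, the 0.99 saturation threshold and the two σ values are hypothetical choices, not settings from the report.

```python
import math
import random


def forward_stats(sigma, width=64, depth=10, seed=0):
    """Push one random input through `depth` tanh layers with N(0, sigma^2) weights.

    Returns the mean absolute activation and the fraction of saturated
    units (|activation| > 0.99) in the last layer.
    """
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(width)]
    for _ in range(depth):
        # Each output unit gets its own freshly sampled weight row.
        a = [math.tanh(sum(rng.gauss(0.0, sigma) * ai for ai in a))
             for _ in range(width)]
    mean_abs = sum(abs(v) for v in a) / width
    saturated = sum(abs(v) > 0.99 for v in a) / width
    return mean_abs, saturated


for sigma in (0.01, 1.0):
    mean_abs, saturated = forward_stats(sigma)
    print(f"sigma={sigma}: mean |a| = {mean_abs:.2e}, saturated = {saturated:.0%}")
```

With the small σ the signal shrinks at every layer, and the gradients that flow back through it shrink as well. With the large σ most units sit in the flat tails of tanh, where the local derivative is close to zero.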
3. Xavier initialization
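The report gives no formula under this heading. For reference, Xavier (Glorot) initialization is usually stated as drawing weights with variance 2 / (fan_in + fan_out); in its uniform variant the weights come from the interval ±sqrt(6 / (fan_in + fan_out)). A minimal sketch of that variant follows. The helper names `xavier_uniform` and `variance` and the layer sizes are assumptions made for illustration.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng):
    """Return a fan_out x fan_in weight matrix drawn from U(-limit, limit)."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)]
            for _ in range(fan_out)]


def variance(matrix):
    """Empirical variance over all entries of a nested-list matrix."""
    values = [w for row in matrix for w in row]
    mean = sum(values) / len(values)
    return sum((w - mean) ** 2 for w in values) / len(values)


rng = random.Random(0)
weights = xavier_uniform(fan_in=64, fan_out=32, rng=rng)
target = 2.0 / (64 + 32)
print(f"empirical variance {variance(weights):.5f}, target {target:.5f}")
```

The uniform range is chosen so that its variance, limit² / 3, equals the 2 / (fan_in + fan_out) target, which scales the weights to the size of the layers they connect.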

4. He initialization
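This heading also has no body in the report. He initialization is commonly defined as drawing weights from a zero-mean Gaussian with variance 2 / fan_in, a choice aimed at ReLU layers. The sketch below compares it with a fixed small standard deviation on a ReLU network; the width, depth and helper names (`he_std`, `relu_mean_square`) are hypothetical.

```python
import math
import random


def he_std(fan_in):
    """Standard deviation used by He initialization: sqrt(2 / fan_in)."""
    return math.sqrt(2.0 / fan_in)


def relu_mean_square(std_for, width=64, depth=10, seed=0):
    """Mean squared activation after `depth` ReLU layers.

    `std_for(fan_in)` returns the standard deviation of the zero-mean
    Gaussian from which each layer's weights are drawn.
    """
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(width)]
    std = std_for(width)
    for _ in range(depth):
        a = [max(0.0, sum(rng.gauss(0.0, std) * ai for ai in a))
             for _ in range(width)]
    return sum(v * v for v in a) / width


print("He init:       ", relu_mean_square(he_std))
print("fixed std 0.01:", relu_mean_square(lambda fan_in: 0.01))
```

ReLU zeroes roughly half of the pre-activations, and the factor 2 in the variance compensates for that. The mean squared activation then stays at about the same order of magnitude from layer to layer, while with the fixed small standard deviation it collapses towards zero.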

Chapter 7. Hyperparameter tuning
1. Trial and error: for example, after designing an experiment, a student follows every step of the learning process (from data collection to feature-map visualization), then iterates on the hyperparameters one at a time until time runs out.
2. Grid search: when there are three or fewer hyperparameters, grid search is the common choice. For each hyperparameter, the user picks a small finite set of values to explore. The Cartesian product of these sets yields the candidate hyperparameter combinations, and grid search trains a model with each combination. The combination with the smallest error on the validation set is chosen as the best.
3. Random search: it differs from grid search only in the first step of the search loop, where it picks points at random from the configuration space.
4. Bayesian optimization:
- Build the model
- Select the hyperparameters
- Train and evaluate
- Refine the model, then return to step 2
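The three automated strategies above can be compared on one toy problem. This is a sketch under stated assumptions, not the report's method. `validation_error` is a hypothetical stand-in for "train a model and measure its validation error", and the hyperparameters `log_lr` and `hidden` are invented for the example. For Bayesian optimization the report lists only the four steps, so the Gaussian-process surrogate with an RBF kernel and the lower-confidence-bound selection rule are swapped-in choices.

```python
import itertools
import math
import random


def validation_error(log_lr, hidden):
    """Hypothetical stand-in for 'train a model, return its validation error'."""
    return (log_lr + 2.0) ** 2 + ((hidden - 32) / 32.0) ** 2


def grid_search(grid):
    """Evaluate every combination in the Cartesian product of the value sets."""
    points = itertools.product(grid["log_lr"], grid["hidden"])
    best = min(points, key=lambda p: validation_error(*p))
    return best, validation_error(*best)


def random_search(budget, rng):
    """Evaluate `budget` points drawn at random from the configuration space."""
    points = [(rng.uniform(-4.0, 0.0), rng.randint(8, 128)) for _ in range(budget)]
    best = min(points, key=lambda p: validation_error(*p))
    return best, validation_error(*best)


def kernel(p, q, length=0.5):
    """RBF kernel on (log_lr, hidden / 32) so both axes have similar scale."""
    d2 = (p[0] - q[0]) ** 2 + ((p[1] - q[1]) / 32.0) ** 2
    return math.exp(-0.5 * d2 / length ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def posterior(xs, zs, p, jitter=1e-4):
    """Mean and standard deviation of a zero-mean GP surrogate at point p."""
    gram = [[kernel(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(xs)]
            for i, a in enumerate(xs)]
    k_star = [kernel(a, p) for a in xs]
    mean = sum(k * w for k, w in zip(k_star, solve(gram, zs)))
    var = 1.0 - sum(k * v for k, v in zip(k_star, solve(gram, k_star)))
    return mean, math.sqrt(max(var, 0.0))


def bayesian_search(candidates, n_iter, kappa=2.0):
    """Surrogate-guided search; needs n_iter <= len(candidates) - 2."""
    xs = [candidates[0], candidates[-1]]  # two initial evaluations
    ys = [validation_error(*x) for x in xs]
    for _ in range(n_iter):
        # Step 1: build the surrogate on standardized errors.
        mu = sum(ys) / len(ys)
        sd = math.sqrt(sum((y - mu) ** 2 for y in ys) / len(ys)) or 1.0
        zs = [(y - mu) / sd for y in ys]

        # Step 2: select the untried point with the lowest confidence bound.
        def lower_bound(p):
            mean, std = posterior(xs, zs, p)
            return mean - kappa * std

        x_next = min((c for c in candidates if c not in xs), key=lower_bound)
        # Step 3: train and evaluate. Step 4: extend the data, then loop.
        xs.append(x_next)
        ys.append(validation_error(*x_next))
    best = min(range(len(ys)), key=ys.__getitem__)
    return xs[best], ys[best]


grid = {"log_lr": [-4.0, -3.0, -2.0, -1.0, 0.0], "hidden": [16, 32, 64, 128]}
candidates = list(itertools.product([-4.0 + 0.5 * i for i in range(9)],
                                    [16, 32, 48, 64, 96, 128]))
print("grid:    ", grid_search(grid))
print("random:  ", random_search(budget=20, rng=random.Random(0)))
print("bayesian:", bayesian_search(candidates, n_iter=10))
```

Grid search costs one training run per grid point, so the 5 x 4 grid here needs 20 runs. Random search spends the same kind of budget without being tied to the grid values. The Bayesian loop uses the errors it has already observed to decide which point to try next.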