[Go through 8] Fully Connected Neural Network Video Notes
2022-08-05 05:25:00 【Mosu playing computer】
Go through 8: the videos are done
It took two days
Fifth video: the teacher first reviewed the earlier concepts. On cross entropy versus relative entropy: the former is simpler because it has no denominator, and with one-hot labels it finally simplifies to -log of the probability predicted for the true class.
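To make the one-hot simplification concrete, here is a minimal sketch in plain Python. The label vector and the predicted probabilities are made-up numbers, and the function names are only illustrative.

```python
import math

def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * log(q_i); terms with p_i == 0 contribute nothing."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def relative_entropy(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i); note the ratio inside the log."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

one_hot = [0.0, 1.0, 0.0]      # true class is index 1 (made-up example)
predicted = [0.2, 0.7, 0.1]    # hypothetical softmax output

ce = cross_entropy(one_hot, predicted)
kl = relative_entropy(one_hot, predicted)
neg_log = -math.log(predicted[1])   # -log of the probability of the true class
```

With a one-hot label the entropy of the label is zero, so cross entropy and relative entropy give the same number, and both equal `neg_log`.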
(2022-06-28 08:38:38: 1/38 of what was left is done, keep going)
Vanishing gradient (the backward pass multiplies by 0)
Exploding gradient (the step is so big that the parameters fly off)
Clipping: bound the step size
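Clipping can be sketched as rescaling the gradient whenever its norm exceeds a threshold. This is one common variant (clipping by norm); the notes do not say which variant the video used, and `max_norm` is a hypothetical parameter name.

```python
import math

def clip_by_norm(grad, max_norm):
    """Rescale grad so its Euclidean norm is at most max_norm; direction is kept."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]

exploding = [300.0, -400.0]                     # norm 500, far too large a step
clipped = clip_by_norm(exploding, max_norm=5.0)
small = clip_by_norm([0.3, -0.4], max_norm=5.0)  # already small, left unchanged
```

The direction of the update is preserved; only its length is bounded.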
This activation function was widely used in the previous era, before these two problems had been exposed. Now it is useful in the output layer, when the result needs to lie between 0 and 1. It is not used in hidden layers.
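Assuming the activation discussed here is the sigmoid (the notes do not name it, but an output between 0 and 1 suggests it), a small sketch shows why it hurts in hidden layers: its local gradient is at most 0.25 and almost zero once the input saturates.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of the sigmoid: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

peak = sigmoid_grad(0.0)        # the largest value the derivative can take
saturated = sigmoid_grad(10.0)  # deep in the flat region
# Product of the activation derivatives alone across 10 layers, in the best case
# (weights are ignored here, so this is only part of the full backward pass).
ten_layers_best_case = peak ** 10
```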
Problems with gradient descent
Setting the momentum coefficient to 1 is like having no friction: it never stops (v = v).
On flat stretches it sprints ahead.
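The friction picture can be sketched with a momentum update of the form v = mu * v - lr * g. The symbol `mu`, the learning rate, and the toy objective f(x) = x^2 / 2 are assumptions for illustration, not taken from the video.

```python
def momentum_descent(grad_fn, x0, lr, mu, steps):
    """Gradient descent with momentum; mu plays the role of (1 - friction)."""
    x, v = x0, 0.0
    history = []
    for _ in range(steps):
        v = mu * v - lr * grad_fn(x)  # old velocity decays by mu, gradient pushes
        x = x + v
        history.append(x)
    return history

grad = lambda x: x  # gradient of the toy objective f(x) = x^2 / 2

with_friction = momentum_descent(grad, x0=1.0, lr=0.1, mu=0.9, steps=300)
no_friction = momentum_descent(grad, x0=1.0, lr=0.1, mu=1.0, steps=300)
```

With `mu=0.9` the iterate settles at the minimum, while with `mu=1.0` it keeps oscillating around it without ever stopping.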
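In the oscillating direction r is large, so the step size is small.
But r keeps growing as it accumulates; in the end the step size is tiny and it can't move.
(This amounts to blindly accumulating every previous record.)
This version fixes that: ρ controls how many previous training records are blended in.
r = 0.999 r + 0.001 (g*g): every round an old record's weight shrinks by a factor of ρ, so after enough rounds it is very small. (Careful: with ρ = 0.999 the weight after 100 rounds is still 0.999^100 ≈ 0.90. The window is roughly 1/(1-ρ) rounds, so a window of 100 rounds corresponds to ρ = 0.99.)
This guarantees that only the recent window of training experience is kept and r does not grow without bound.
If you want to keep more, set ρ a bit larger, but not to 1 (that would keep everything).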
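The two accumulators can be compared side by side. The sketch below assumes the usual names for them (AdaGrad for the plain sum, RMSProp for the decayed version), which the notes do not state, and feeds both a constant gradient of 1.

```python
import math

def adagrad_step_sizes(grads, lr=0.1, eps=1e-8):
    """r accumulates every squared gradient, so the step size only shrinks."""
    r, sizes = 0.0, []
    for g in grads:
        r = r + g * g
        sizes.append(lr / (math.sqrt(r) + eps))
    return sizes

def rmsprop_step_sizes(grads, lr=0.1, rho=0.999, eps=1e-8):
    """r is a decayed average of squared gradients, so it stays bounded."""
    r, sizes = 0.0, []
    for g in grads:
        r = rho * r + (1.0 - rho) * g * g
        sizes.append(lr / (math.sqrt(r) + eps))
    return sizes

constant_grads = [1.0] * 10000
ada = adagrad_step_sizes(constant_grads)
rms = rmsprop_step_sizes(constant_grads)
```

After 10000 rounds the first step size has shrunk to about lr / 100, while the second has settled near lr.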
Momentum: opposing gradients cancel out, consistent ones build up
Adaptive: different step sizes in different directions
Adam: combines the two
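A minimal one-dimensional sketch of how Adam combines the two ideas: `m` is the momentum-style running mean of the gradient and `v` is the running mean of its square. The hyperparameter values are the commonly quoted defaults except for `lr`, and the toy objective (x - 3)^2 is made up.

```python
import math

def adam_minimize(grad_fn, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g        # momentum part
        v = beta2 * v + (1 - beta2) * g * g    # adaptive part
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        x = x - lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Gradient of the toy objective f(x) = (x - 3)^2, whose minimum is at x = 3.
x_final = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
```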
(2022-06-28 09:14:35: 2/38 watched)
You can first use Adam to quickly find a roughly good solution, then switch to SGD with momentum and refine it slowly.
Some people also do it the other way round: SGD with momentum first, then Adam.
Parameter initialization
Workable
Basically normal
Mostly concentrated at 0
Much more even
If weight initialization is ignored and every neuron starts with the same parameters, the layer is equivalent to a single neuron.
If the initialization method and the activation function are badly matched, the result is either an uneven distribution or a network that simply performs poorly.
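The "equivalent to a single neuron" point can be checked with a tiny layer. The tanh activation, the input values, and the 1/sqrt(n) scale of the random initialization are assumptions chosen for illustration.

```python
import math
import random

def layer_forward(weights, x):
    """One fully connected layer with tanh; weights holds one row per neuron."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

x = [0.5, -1.2, 0.3]

# Every neuron starts with the same parameters.
same_init = [[0.1, 0.1, 0.1] for _ in range(4)]

# Random start, scaled by 1/sqrt(number of inputs) (an assumed, Xavier-like scale).
rng = random.Random(0)
random_init = [[rng.gauss(0.0, 1.0 / math.sqrt(len(x))) for _ in x] for _ in range(4)]

same_out = layer_forward(same_init, x)
random_out = layer_forward(random_init, x)
```

All four neurons in `same_out` produce the same value, so they would also receive the same gradient and stay identical, while the neurons in `random_out` differ from the start.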
Batch normalization
Set the weight initialization issue aside for now.
Think backwards from the goal: I go straight for y.
What you want is just a y with zero mean and unit variance, right?
So I compute the mean, subtract it, rescale (normalize), and treat the result as y.
The idea would be to put it after the activation function, but in practice placing it between the FC layer and the activation function works better.
It pulls points that would have landed in the middle of nowhere (values that keep shrinking, or regions with no gradient) back to a good spot.
x1…xm are the original y values.
y1…ym are the ones highlighted in yellow above.
If it stopped here, it would be plain normalization.
An improvement is added: shift and scale.
Let the neural network decide the mean and variance by itself (those two parameters are also learned).
The forward pass flows more easily, and the backward pass still gets gradients.
Smooth information flow => good training.
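The forward pass described above can be sketched for a single feature over one mini-batch. The names `gamma` and `beta` for the learned scale and shift, and the small `eps` added for numerical safety, are the usual conventions rather than something the notes state.

```python
import math

def batch_norm(xs, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a batch to zero mean and unit variance, then scale and shift."""
    m = len(xs)
    mean = sum(xs) / m
    var = sum((x - mean) ** 2 for x in xs) / m
    x_hat = [(x - mean) / math.sqrt(var + eps) for x in xs]
    return [gamma * xh + beta for xh in x_hat]

batch = [2.0, 4.0, 6.0, 8.0]                      # x1..xm, made-up values
normalized = batch_norm(batch)                    # plain normalization
shifted = batch_norm(batch, gamma=2.0, beta=1.0)  # network-chosen scale and shift
```

With `gamma=1` and `beta=0` this is plain normalization; other values let the network move the mean to `beta` and the standard deviation to roughly `gamma`.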
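(2022-06-28 09:59:20: 3/38 watched)
Overfitting and underfitting
Overfitting: the network just memorizes the data (designs often lean toward this side)
Underfitting: weak learning capacity, it cannot learn the task (usually fixable)
L: loss, E: error
Training set: optimization
Validation set, test set: generalization (compute accuracy)
(2022-06-28 10:13:25: 4/38 watched)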
Dealing with overfitting
Add more training data: costly
Adjust the size: 9 layers down to 8, 500 neurons down to 300
Force the neural network not to rely on a few large weight parameters; it should take the whole picture into account and keep the weights more dispersed.
This makes the decision boundary simpler and smoother.
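One common way to push weights toward being dispersed is an L2 penalty on the weights. Whether the video used exactly this form is an assumption; the input, the two weight vectors, and `lam` below are made up to show the effect.

```python
def score(w, x):
    """Linear score w . x."""
    return sum(wi * xi for wi, xi in zip(w, x))

def l2_penalty(w, lam=1.0):
    """lam * sum of squared weights; large individual weights are punished most."""
    return lam * sum(wi * wi for wi in w)

x = [1.0, 1.0, 1.0, 1.0]
w_concentrated = [1.0, 0.0, 0.0, 0.0]    # relies on a single input
w_dispersed = [0.25, 0.25, 0.25, 0.25]   # spreads the same score over all inputs

same_score = score(w_concentrated, x) == score(w_dispersed, x)
penalty_concentrated = l2_penalty(w_concentrated)
penalty_dispersed = l2_penalty(w_dispersed)
```

Both weight vectors give the same score on this input, but the penalty is four times smaller for the dispersed one, so the regularized loss prefers it.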
Dropout (random deactivation)
On explanation 2
It feels a bit like the movie Everything Everywhere All at Once.
Say you are in the final boss fight, drawing power from the other universes and going all out. Things may be going well in one universe, and then the boss suddenly comes over and kills it (dropout). So for the final boss fight, the other universes all have to work hard and get stronger; you can't depend on just one.
Why does everyone have to work hard (evenly) instead of raising one big carry (all in one)? Because you don't know which one will be dropped out. And if they all slack off (each stores little information), that's even worse: you definitely can't beat the boss.
Explanation 3
It is equivalent to taking a vote between the network marked with x, B, and A.
A single network may be very strong and right most of the time, but when it gets something wrong, you're finished. That's when you need the three cobblers (as in the proverb: three cobblers together outdo one Zhuge Liang).
At usage time
At test time all neurons are switched on; nothing is randomly deactivated.
Multiply by p once more at the end. Otherwise training always sees an expectation of E/2 (with p = 1/2) while testing sees E, which is off by a factor of two.
Alternatively, divide by p directly during training so that the expected value stays the same.
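The second option (dividing by p during training) can be sketched as follows. Here p is taken to be the probability of keeping a neuron, which matches the E/2 example above, and `keep_prob` is a hypothetical parameter name.

```python
import random

def dropout_train(activations, keep_prob, rng):
    """Zero each activation with probability 1 - keep_prob, scale survivors by 1/keep_prob."""
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]

def dropout_test(activations):
    """At test time every neuron stays on and no rescaling is needed."""
    return list(activations)

rng = random.Random(0)
acts = [1.0] * 100000   # made-up activations, all equal so the mean is easy to read
keep_prob = 0.5

train_mean = sum(dropout_train(acts, keep_prob, rng)) / len(acts)
test_mean = sum(dropout_test(acts)) / len(acts)
```

Because the surviving activations are scaled up during training, the average seen in training matches the average at test time, so the test-time code needs no extra factor.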
(2022-06-28 10:38:59: 5/38 watched)
Parameters
Parameters: learned by the neural network itself
Hyperparameters: set by me
Neat: the learning rate is compared to the length of a stick. If it's too long, it gets stuck across the top of the valley.
In general it can't reach the bottom of the valley. The strategies are in the upper right corner.
Divide by e^t: the learning rate keeps decaying.
Or train for a while until it gets stuck, then drop to the next level of learning rate and keep tuning, in a loop.
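Both strategies can be sketched as small functions. The decay constant `k`, the shrink `factor`, the `patience` threshold, and the loss values are all made-up illustration values, not settings from the video.

```python
import math

def exponential_decay(lr0, k, t):
    """Learning rate divided by e^(k * t): it keeps shrinking as t grows."""
    return lr0 * math.exp(-k * t)

def reduce_on_plateau(losses, lr0, factor=0.1, patience=2):
    """Drop the learning rate one level whenever the loss has been stuck for `patience` rounds."""
    lr, best, stuck = lr0, float("inf"), 0
    lrs = []
    for loss in losses:
        if loss < best:
            best, stuck = loss, 0
        else:
            stuck += 1
            if stuck >= patience:
                lr, stuck = lr * factor, 0
        lrs.append(lr)
    return lrs

exp_schedule = [exponential_decay(0.1, 0.05, t) for t in range(5)]
plateau_schedule = reduce_on_plateau([1.0, 0.8, 0.8, 0.8, 0.5, 0.5, 0.5], lr0=0.1)
```

In the plateau example the learning rate drops from 0.1 to about 0.01 after the loss stalls at 0.8, and to about 0.001 after it stalls again at 0.5.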
(Slipping off for a break)
The former only compares three learning rates, the latter nine, so use the latter.
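The three-versus-nine comparison presumably refers to grid search versus random search over two hyperparameters; the notes do not name the methods, so that reading is an assumption. The value ranges and the second hyperparameter below are made up.

```python
import random

def grid_search_points(lr_values, other_values):
    """Every combination of the listed values: 3 x 3 gives 9 trials."""
    return [(lr, other) for lr in lr_values for other in other_values]

def random_search_points(n, rng):
    """n trials, each with its own learning rate sampled log-uniformly in [1e-4, 1e-1]."""
    return [(10 ** rng.uniform(-4, -1), rng.uniform(0.0, 1.0)) for _ in range(n)]

grid = grid_search_points([1e-3, 1e-2, 1e-1], [0.1, 0.5, 0.9])
rand = random_search_points(9, random.Random(0))

distinct_grid_lrs = len({lr for lr, _ in grid})
distinct_random_lrs = len({lr for lr, _ in rand})
```

Both approaches spend nine trials, but the grid only ever tries three distinct learning rates while the random sample tries nine.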
(2022-06-28 12:14:43: 6/38 watched)