[Go through 7] Notes from the first section of the fully connected neural network video
2022-08-05 05:25:00 【Mosu playing computer】
Today is dp
June 25, 2022 Check out 7
Fully Connected Neural Network
A fully connected network cascades multiple linear classifiers
Each row of the weight matrix is a template

In a linear classifier, the number of rows of W is fixed: it equals the number of categories, so each category gets exactly one template. The horse template therefore ends up as a horse with two heads, one at each end (obviously wrong).
In a fully connected network, only the number of rows of W2 is fixed. W1 can have as many rows as you like: create 100 rows, that is 100 templates, and say 10 of them are used to learn horses. Each of those can learn a single-headed horse, and the activation function then selects among them. That works.
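The shapes described above can be sketched in NumPy. This is only an illustration under assumptions: the input size of 3072, the 10 output classes, the random weights, and the choice of ReLU as the activation are not stated in the notes.

```python
import numpy as np

rng = np.random.default_rng(0)

num_features = 3072   # assumed: a flattened 32x32x3 image
num_templates = 100   # rows of W1: free to choose
num_classes = 10      # rows of W2: fixed by the number of categories (assumed 10)

W1 = rng.normal(scale=0.01, size=(num_templates, num_features))
b1 = np.zeros(num_templates)
W2 = rng.normal(scale=0.01, size=(num_classes, num_templates))
b2 = np.zeros(num_classes)

def forward(x):
    # Each row of W1 is one template; ReLU (an assumed activation)
    # keeps only the templates that responded positively.
    hidden = np.maximum(0.0, W1 @ x + b1)
    # W2 combines the template responses into one score per class.
    return W2 @ hidden + b2

x = rng.normal(size=num_features)
scores = forward(x)
```

Changing `num_templates` changes only the hidden layer; the output still has one score per class.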
Name

Generally speaking, the former
Activation function

sigmoid outputs values between 0 and 1
tanh outputs values between -1 and 1, and it is symmetric about 0
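A quick numerical check of those two ranges, as a minimal sketch. The helper name `sigmoid` and the sample points are chosen here for illustration.

```python
import numpy as np

def sigmoid(z):
    # Logistic function: squashes any real number into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-5.0, 5.0, 11)
s = sigmoid(z)   # every value lies strictly between 0 and 1
t = np.tanh(z)   # every value lies strictly between -1 and 1

# tanh is an odd function, so tanh(-z) == -tanh(z).
is_symmetric = np.allclose(np.tanh(-z), -np.tanh(z))
```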
softmax
We can pick the output class simply by taking the largest of the final scores, but if we also need to know the predicted probability, we need softmax
Raise e to the power of each score, then normalize so the results sum to 1
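The two steps (exponentiate, then normalize) can be sketched as follows. Subtracting the maximum score first is an addition not mentioned in the notes: it is a common numerical-stability trick and does not change the result. The example scores are made up.

```python
import numpy as np

def softmax(scores):
    # Shifting by the max avoids overflow in exp and leaves the output unchanged.
    shifted = scores - np.max(scores)
    exps = np.exp(shifted)
    return exps / np.sum(exps)

scores = np.array([2.0, 1.0, 0.1])   # hypothetical final-layer scores
probs = softmax(scores)
predicted = int(np.argmax(probs))    # same class as argmax of the raw scores
```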
Cross-entropy loss
Cross-entropy loss is used to measure how much two distributions differ. It cannot be called a distance: a distance is symmetric (A to B is the same as B to A), but cross-entropy is not necessarily symmetric.
Here p is the ground truth. When it is one-hot, the information it carries has no uncertainty at all, so its entropy H(p) is 0. Cross-entropy is then 0 + relative entropy, that is H(p, q) = H(p) + KL(p || q). After simplification, it is the -log of the predicted probability of the correct class.
But when the ground truth p is not one-hot encoded, H(p) is no longer 0, and the relative entropy (KL divergence) has to be used directly.
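The relationship between entropy, cross-entropy and relative entropy can be checked numerically. This is a minimal sketch: the function names and the example distributions are assumptions made for illustration.

```python
import numpy as np

def entropy(p):
    nz = p > 0                      # treat 0 * log(0) as 0
    return -np.sum(p[nz] * np.log(p[nz]))

def cross_entropy(p, q):
    nz = p > 0
    return -np.sum(p[nz] * np.log(q[nz]))

def kl_divergence(p, q):
    nz = p > 0
    return np.sum(p[nz] * np.log(p[nz] / q[nz]))

q = np.array([0.2, 0.3, 0.5])         # hypothetical predicted distribution
p_onehot = np.array([0.0, 0.0, 1.0])  # one-hot ground truth
p_soft = np.array([0.1, 0.2, 0.7])    # ground truth that is not one-hot

# One-hot case: H(p) = 0, so cross-entropy reduces to -log q[correct].
loss_onehot = cross_entropy(p_onehot, q)

# General case: H(p, q) = H(p) + KL(p || q), and H(p) is no longer 0.
loss_soft = cross_entropy(p_soft, q)
```

Swapping the arguments of `kl_divergence` gives a different value, which is why it is not a distance.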
The teacher mentioned here that during training the loss may barely decrease while the accuracy still increases. Take the example in the lower right corner of the figure above (assuming the third column is the correct class):
0.35 0.33 0.32 (not classified correctly)
0.333 0.332 0.334 (classified correctly)
For the correct class, -log 0.32 and -log 0.334 are not much different, but its probability has become the largest: a small improvement (0.334 > 0.333) is enough for it to stand out.
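The two rows of the example can be evaluated directly. This sketch assumes, as the notes do, that the third column is the correct class; the variable names are made up.

```python
import numpy as np

correct = 2  # index of the third column

before = np.array([0.35, 0.33, 0.32])
after = np.array([0.333, 0.332, 0.334])

# Cross-entropy loss with a one-hot target: -log of the correct-class probability.
loss_before = -np.log(before[correct])
loss_after = -np.log(after[correct])

# Accuracy depends only on which column is largest.
hit_before = int(np.argmax(before)) == correct
hit_after = int(np.argmax(after)) == correct
```

The loss changes by only a few hundredths, while the prediction flips from wrong to right.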
Computational graph
The forward pass computes values, the backward pass computes gradients; by the chain rule, the local gradients along a path are multiplied together
Each node of the computational graph stores its forward-propagation value and a local Jacobian matrix, which are used for forward and back-propagation respectively
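A tiny graph for f = (x + y) * z illustrates the idea. This is a sketch with scalar gates, so the local Jacobian is just a number; the class names and input values are chosen here for illustration.

```python
class Add:
    def forward(self, a, b):
        return a + b

    def backward(self, upstream):
        # Local gradient of a + b is 1 for both inputs.
        return upstream * 1.0, upstream * 1.0

class Multiply:
    def forward(self, a, b):
        # Cache the forward values; they are the local gradients later.
        self.a, self.b = a, b
        return a * b

    def backward(self, upstream):
        # d(a*b)/da = b and d(a*b)/db = a, each times the upstream gradient.
        return upstream * self.b, upstream * self.a

add, mul = Add(), Multiply()
x, y, z = -2.0, 5.0, -4.0

# Forward pass: values flow from inputs to output.
q = add.forward(x, y)
f = mul.forward(q, z)

# Backward pass: gradients flow from output to inputs.
dq, dz = mul.backward(1.0)
dx, dy = add.backward(dq)
```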
Granularity

A series of small gates can be connected together to form a single function gate such as sigmoid. Such a gate has coarse granularity, but it needs few calculation steps and runs fast.
In Caffe, someone wrote these functions by hand as single gates, so it is fast; TensorFlow used small gates, so back-propagation was slow (later improved)
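The trade-off can be seen by computing the sigmoid gradient both ways. This is an illustrative sketch, not how Caffe or TensorFlow actually implement it; the function names are made up.

```python
import math

def sigmoid_fine_grained(x):
    # Forward through four small gates: negate, exp, add 1, reciprocal.
    a = -x
    b = math.exp(a)
    c = 1.0 + b
    out = 1.0 / c
    # Backward: multiply the local gradient of each small gate in turn.
    dc = -1.0 / (c * c)   # reciprocal gate
    db = dc * 1.0         # add-constant gate
    da = db * b           # exp gate
    dx = da * -1.0        # negate gate
    return out, dx

def sigmoid_coarse(x):
    # One large gate: the gradient has the closed form s * (1 - s).
    out = 1.0 / (1.0 + math.exp(-x))
    return out, out * (1.0 - out)
```

Both give the same value and gradient; the coarse version just takes fewer steps.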
Common gate units

The max gate outputs the larger input, and the gradient is passed back to whichever input that was.
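The routing behaviour of the max gate can be sketched as below. The function names are hypothetical, and sending the gradient to the first input on a tie is an assumption.

```python
def max_forward(a, b):
    return max(a, b)

def max_backward(a, b, upstream):
    # The whole upstream gradient goes to the larger input; the other gets 0.
    # On a tie it goes to a (an arbitrary choice made for this sketch).
    if a >= b:
        return upstream, 0.0
    return 0.0, upstream

grad_a, grad_b = max_backward(2.0, -1.0, 3.0)
```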
Today I watched the video of the third section, but unfortunately the notes I had read in the pdf were squeezed out. I didn't keep them, otherwise I could compare and add to them.
Send
I didn't sleep well this morning, got up with a golden shovel, then ate, rushed into the study room in the rain, was happy here, did whatever I wanted, and left at night for an hour of fast study. Hurry up.