Overfitting and regularization
2022-07-05 05:33:00 【Li Junfeng】
Overfitting
Overfitting is a problem we often run into when training neural networks. Simply put, the model's capacity to learn is so strong that it records every detail of the training set. When it then meets the test set, that is, data it has never seen before, it makes obvious mistakes.
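The gap between training and test performance can be seen in a minimal sketch. Everything here is an assumption made for illustration: the toy data (`y = 2x` with alternating +1/-1 noise), the `memorizer` model that simply looks up the nearest training point, and the least-squares line it is compared with. None of it comes from the original post.

```python
# Toy data (assumed for illustration): y = 2x plus alternating +1/-1 "noise".
train_x = [float(i) for i in range(10)]
train_y = [2 * x + (1 if i % 2 == 0 else -1) for i, x in enumerate(train_x)]

# Held-out points lie between the training points and carry no noise.
test_x = [i + 0.5 for i in range(9)]
test_y = [2 * x for x in test_x]


def memorizer(x):
    """Return the label of the nearest training point (pure memorization)."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    return a, mean_y - a * mean_x


def mse(model, xs, ys):
    """Mean squared error of a model (a callable) on a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


a, b = fit_line(train_x, train_y)


def line(x):
    """The simple model: two parameters only."""
    return a * x + b


memo_train, memo_test = mse(memorizer, train_x, train_y), mse(memorizer, test_x, test_y)
line_train, line_test = mse(line, train_x, train_y), mse(line, test_x, test_y)

print("memorizer: train", memo_train, "test", memo_test)
print("line:      train", line_train, "test", line_test)
```

The memorizing model reaches zero training error because it stores the noise as well, yet it does worse on the held-out points than the two-parameter line, which is the pattern described above.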
Causes
The most fundamental cause is too many parameters (the model is too complex).
Other causes are:
- The test set and the training set follow different distributions
- The training set is too small
Solutions
Several countermeasures follow from the causes above:
- Reduce model complexity; regularization is the usual way to do this.
- Augment the training set
Norm (Minkowski distance)
Definition
A norm is a function that assigns a length or size to every vector in a vector space.
The zero vector has length 0.
$$\lVert x \rVert_p = \left(\sum_{i=1}^n \lvert x_i\rvert^p\right)^{\frac{1}{p}}$$
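As a quick worked example (the vector $x = (3, -4)$ is chosen purely for illustration and is not from the original post), the formula gives:

```latex
\begin{aligned}
\lVert x \rVert_1 &= \lvert 3\rvert + \lvert -4\rvert = 7 \\
\lVert x \rVert_2 &= \left(3^2 + 4^2\right)^{\frac{1}{2}} = 5 \\
\lVert x \rVert_3 &= \left(3^3 + 4^3\right)^{\frac{1}{3}} = 91^{\frac{1}{3}} \approx 4.498
\end{aligned}
```

For this vector the value falls as $p$ grows.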
Properties of norms
- Non-negativity: $\lVert x\rVert \ge 0$
- Homogeneity: $\lVert cx\rVert = \lvert c\rvert \lVert x\rVert$
- Triangle inequality: $\lVert x + y\rVert \le \lVert x\rVert + \lVert y\rVert$
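The three properties can be spot-checked numerically. The sketch below is only a sanity check on a few sample values, not a proof; the helper `p_norm` and the vectors `x`, `y` and scalar `c` are assumptions introduced for illustration.

```python
import math


def p_norm(x, p):
    """L_p norm of a sequence of numbers, for p >= 1."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


x = [3.0, -4.0]
y = [1.0, 2.0]
c = -2.5

results = {}
for p in (1, 2, 3):
    # Non-negativity: the norm is never below zero.
    nonneg = p_norm(x, p) >= 0
    # Homogeneity: scaling the vector by c scales the norm by |c|.
    scaled = [c * v for v in x]
    homogeneous = math.isclose(p_norm(scaled, p), abs(c) * p_norm(x, p))
    # Triangle inequality, with a tiny tolerance for floating-point rounding.
    x_plus_y = [a + b for a, b in zip(x, y)]
    triangle = p_norm(x_plus_y, p) <= p_norm(x, p) + p_norm(y, p) + 1e-12
    results[p] = (nonneg, homogeneous, triangle)
    print(p, results[p])
```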
Common norms
- $L_0$ norm: the number of non-zero elements (not a true norm, since it violates homogeneity)
- $L_1$ norm: the sum of the absolute values of the elements
- $L_2$ norm: the Euclidean distance
- $L_\infty$ norm: the largest absolute value among the elements
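The four cases in the list translate directly into code. This is a minimal sketch in plain Python; the function names and the sample vector are assumptions chosen for illustration.

```python
def l0_norm(x):
    """Number of non-zero elements."""
    return sum(1 for v in x if v != 0)


def l1_norm(x):
    """Sum of absolute values."""
    return sum(abs(v) for v in x)


def l2_norm(x):
    """Euclidean length."""
    return sum(v * v for v in x) ** 0.5


def linf_norm(x):
    """Largest absolute value among the elements."""
    return max(abs(v) for v in x)


x = [3, -4, 0]
print(l0_norm(x), l1_norm(x), l2_norm(x), linf_norm(x))  # 2 7 5.0 4
```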
Regularization
Regularization adds a norm of the parameters to the objective function as a penalty term. The larger a parameter is, the larger the norm, and therefore the larger the penalty. Under this pressure, many parameters shrink.
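The shrinking effect can be sketched with a tiny linear model trained by gradient descent. Several things here are assumptions, not details from the original post: the penalty is the squared $L_2$ norm multiplied by a strength `lam` (a common choice), the loss is mean squared error, and the data set, learning rate and step count are made up for illustration.

```python
def train(X, y, lam, lr=0.05, steps=2000):
    """Minimise mean((w.x - y)^2) + lam * sum(w_j^2) by gradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        # Prediction error on every sample.
        residuals = [
            sum(wj * xj for wj, xj in zip(w, row)) - target
            for row, target in zip(X, y)
        ]
        # Gradient of the data term plus gradient of the penalty (2 * lam * w_j).
        grad = [
            2.0 / n * sum(r * row[j] for r, row in zip(residuals, X)) + 2.0 * lam * w[j]
            for j in range(d)
        ]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def l2_norm(w):
    return sum(v * v for v in w) ** 0.5


# Noise-free toy data generated from y = 3*x1 + 0.5*x2.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [3.0, 0.5, 3.5, 6.5]

w_plain = train(X, y, lam=0.0)  # no penalty
w_reg = train(X, y, lam=1.0)    # with penalty

print("no penalty:  ", w_plain, "norm", l2_norm(w_plain))
print("with penalty:", w_reg, "norm", l2_norm(w_reg))
```

Without the penalty the weights recover the generating values; with it, the weight vector settles at a smaller norm, trading some fit on the training data for smaller parameters.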
The smaller a parameter is, the smaller the role it plays in the network and the less it affects the final result. The model therefore becomes simpler and generalizes better.
Regularization also embodies the idea of survival of the fittest: although many parameters are of some use to the model, in the end only the important ones (large values, large effect on the result) are preserved, while most parameters (small values, little effect on the result) are eliminated.
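The post does not say which norm produces this elimination. Parameters becoming exactly zero is usually associated with the $L_1$ penalty, so the sketch below assumes that case and shows soft-thresholding, the proximal step for an $L_1$ penalty. The function name, the weights and the threshold `t` are hypothetical values chosen for illustration.

```python
def soft_threshold(w, t):
    """Shrink w towards zero by t; values within [-t, t] become exactly zero."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


weights = [2.0, -0.05, 0.3, -1.5, 0.01]
t = 0.1  # threshold; it grows with the penalty strength

pruned = [soft_threshold(w, t) for w in weights]
kept = [i for i, w in enumerate(pruned) if w != 0.0]

print(pruned)
print("indices of surviving weights:", kept)
```

The large weights survive, slightly shrunk, while the two small ones are set to zero, which matches the "only important parameters are preserved" picture.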