Sparse knowledge points
2022-06-10 08:50:00 【Itchy heart】
Sparsity (sparse)
Definition: Sparse means that, among the parameters of a model, only a few elements are non-zero, or only a few are much larger than zero.
WHY: Why should we build sparsity into a model?
Example: A top student preparing for the postgraduate entrance exam has a vocabulary of 10,000 words, but the words actually used in the exam are only a small part of that 10,000-word store.
Example:
Test number: 123.456
The first set of bases:
[100, 10, 1] $\Rightarrow$ $123.456 \approx 100 \times 1 + 10 \times 2 + 1 \times 3$ (error = 0.456)
The second set of bases:
[100, 50, 10, 1, 0.5, 0.1, 0.03, 0.01, 0.001]
$123.456 = 100 \times 1 + 50 \times 0 + 10 \times 2 + 1 \times 3 + 0.5 \times 0 + 0.1 \times 4 + 0.03 \times 0 + 0.01 \times 5 + 0.001 \times 6$ (error = 0)
Here the sparse features (kept in reserve, just in case) are the three bases 50, 0.5, and 0.03, whose coefficients are zero.
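The two sets of bases above can be checked with a few lines of Python. This is only a minimal sketch: the function name `reconstruct` and the variable names are illustrative, and `Decimal` is used solely to avoid floating-point rounding. The bases and coefficients are the ones given in the example.

```python
from decimal import Decimal


def reconstruct(bases, coeffs):
    """Return the weighted sum of the bases, using exact decimal arithmetic."""
    return sum((Decimal(b) * c for b, c in zip(bases, coeffs)), Decimal(0))


target = Decimal("123.456")

# First set: three bases, every coefficient is non-zero.
small_bases = ["100", "10", "1"]
small_coeffs = [1, 2, 3]

# Second set: nine bases, coefficients copied from the example above.
large_bases = ["100", "50", "10", "1", "0.5", "0.1", "0.03", "0.01", "0.001"]
large_coeffs = [1, 0, 2, 3, 0, 4, 0, 5, 6]

small_error = target - reconstruct(small_bases, small_coeffs)
large_error = target - reconstruct(large_bases, large_coeffs)

# Bases whose coefficient is zero: present in the dictionary, unused here.
unused = [b for b, c in zip(large_bases, large_coeffs) if c == 0]

print("error with 3 bases:", small_error)
print("error with 9 bases:", large_error)
print("unused bases:", unused)
```

The larger set reconstructs the number exactly, yet three of its nine coefficients stay at zero, which is the sense of "sparse" used here.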
Compared with PCA (Principal Component Analysis)
PCA (a complete set of basis vectors: a complete dictionary)
The original data is restored from the basis vectors in this complete dictionary.
Sparse Representation (an over-complete set of basis vectors: an over-complete dictionary; the dictionary is large, in contrast to the sparse coefficients.)
The number of basis vectors is much larger than the dimension of the input vector.
How to ensure sparsity?
Machine learning model $\Rightarrow$ optimize the parameters on the training set (for example, reduce the loss) $\Rightarrow$ add a regularization term to the loss that penalizes the parameter values and pushes them toward 0
Common choices:
Loss = Training Loss + $\lambda ||W||_0$ ($L_0$ norm)
Loss = Training Loss + $\lambda ||W||_1$ ($L_1$ norm)
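As a rough illustration of the two penalties, the sketch below evaluates both regularized losses for a dense and a sparse weight vector. The weights, the training loss of 1.0, and $\lambda = 0.5$ are made-up values chosen for the example, and the helper names are hypothetical.

```python
def l0_norm(weights):
    """L0 'norm': the number of non-zero entries."""
    return sum(1 for w in weights if w != 0)


def l1_norm(weights):
    """L1 norm: the sum of absolute values."""
    return sum(abs(w) for w in weights)


def regularized_loss(training_loss, weights, lam, norm):
    """Loss = Training Loss + lam * norm(W)."""
    return training_loss + lam * norm(weights)


# Illustrative values only (not from the original post).
dense_w = [0.5, -0.5, 0.5, -0.5]   # every weight is active
sparse_w = [1.0, 0.0, 0.0, 0.0]    # a single active weight
training_loss = 1.0                # assumed equal for both, to isolate the penalty
lam = 0.5

dense_l0 = regularized_loss(training_loss, dense_w, lam, l0_norm)
sparse_l0 = regularized_loss(training_loss, sparse_w, lam, l0_norm)
dense_l1 = regularized_loss(training_loss, dense_w, lam, l1_norm)
sparse_l1 = regularized_loss(training_loss, sparse_w, lam, l1_norm)

print("L0-regularized:", dense_l0, sparse_l0)  # 3.0 1.5
print("L1-regularized:", dense_l1, sparse_l1)  # 2.0 1.5
```

Under both penalties the sparse vector gets the lower total loss, so an optimizer that minimizes this loss is nudged toward solutions with fewer active weights.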
Sparse Coding (sparse coding loss)
Loss = $\sum_{j=1}^m ||x^{(j)}-\sum_{i=1}^k a_i^{(j)}\phi_i||^2 + \lambda\sum_{i=1}^k||a_i||_1$
Here, $\sum_{j=1}^m ||x^{(j)}-\sum_{i=1}^k a_i^{(j)}\phi_i||^2$ is the reconstruction error, where $\sum_{i=1}^k a_i^{(j)}\phi_i$ is the reconstruction of sample $x^{(j)}$ from the basis vectors $\phi_i$, and $\lambda\sum_{i=1}^k||a_i||_1$ is the sparsity penalty ($L_1$ norm).
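A small sketch of this loss in plain Python follows. The two-dimensional sample, the three atoms $\phi_i$ (over-complete, since 3 > 2), and $\lambda = 0.1$ are assumptions for illustration. The penalty is computed as the sum of absolute values of all coefficients $a_i^{(j)}$, which is one reading of $\sum_{i=1}^k||a_i||_1$.

```python
def sparse_coding_loss(samples, codes, atoms, lam):
    """Reconstruction error plus lam times the L1 penalty on the codes.

    samples: list of m input vectors x^(j)
    codes:   list of m coefficient lists a^(j), each of length k
    atoms:   list of k basis vectors phi_i
    """
    reconstruction_error = 0.0
    sparsity_penalty = 0.0
    for x, a in zip(samples, codes):
        # x_hat = sum_i a_i^(j) * phi_i, computed per dimension d
        x_hat = [
            sum(a_i * phi[d] for a_i, phi in zip(a, atoms))
            for d in range(len(x))
        ]
        reconstruction_error += sum((x_d - h_d) ** 2 for x_d, h_d in zip(x, x_hat))
        sparsity_penalty += sum(abs(a_i) for a_i in a)
    return reconstruction_error + lam * sparsity_penalty


# Hypothetical over-complete dictionary: 3 atoms for 2-dimensional inputs.
atoms = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
samples = [[1.0, 1.0]]

# Two codes that both reconstruct the sample exactly.
dense_codes = [[1.0, 1.0, 0.0]]    # uses two atoms
sparse_codes = [[0.0, 0.0, 1.0]]   # uses one atom

dense_loss = sparse_coding_loss(samples, dense_codes, atoms, lam=0.1)
sparse_loss = sparse_coding_loss(samples, sparse_codes, atoms, lam=0.1)
print(dense_loss, sparse_loss)
```

Both codes have zero reconstruction error, so the $L_1$ term alone decides between them and the code that uses a single atom has the lower loss.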
In the era of convolutional networks as well, an $L_1$ norm is added to the convolution layers to ensure their sparsity.
The depth and width of the model are increased to ensure a larger over-complete dictionary.
Is mindless sparsity good or bad?
An over-complete dictionary $\Rightarrow$ requires a lot of high-quality data.
Too many inactive parameters $\Rightarrow$ the training process is very long.
The $L_1$ norm makes the loss non-differentiable at some points $\Rightarrow$ at zero the derivative is not unique, so the model is hard to converge.
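The convergence problem at zero can be seen in a toy one-parameter example. The sketch below runs plain gradient descent on $|w|$ and on $w^2$ from the same starting point. The starting point, the learning rate, and the convention of returning 0 as the derivative of $|w|$ at $w = 0$ are assumptions made for the demonstration.

```python
def sign(w):
    """Derivative of |w| away from zero; at w == 0 any value in [-1, 1]
    is a valid subgradient, and 0 is returned here by convention."""
    return float((w > 0) - (w < 0))


def descend(w, grad, lr, steps):
    """Plain gradient descent on one parameter; returns the visited values."""
    path = [w]
    for _ in range(steps):
        w = w - lr * grad(w)
        path.append(w)
    return path


# Same start and learning rate for both penalties (illustrative values).
l1_path = descend(0.05, sign, lr=0.1, steps=4)               # penalty |w|
l2_path = descend(0.05, lambda w: 2.0 * w, lr=0.1, steps=4)  # penalty w**2

# With |w| the step size never shrinks, so w jumps back and forth across zero.
print([round(w, 4) for w in l1_path])
# With w**2 the gradient shrinks together with w, so w decays smoothly.
print([round(w, 4) for w in l2_path])
```

With a fixed step size the $L_1$ iterate keeps overshooting zero, while the $L_2$ iterate shrinks by a constant factor at every step.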
All in all, large-scale deep learning models usually tend to use the $L_2$ norm to prevent overfitting.