PyTorch: weight decay and dropout
2022-07-05 11:34:00 【我渊啊我渊啊】
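Two common ways to reduce overfitting:
1. Weight decay
Common forms: L1 and L2 regularization (the term "weight decay" usually refers to the L2 penalty).
L2 regularization:
When a neural network is trained until the loss converges, many different combinations of w and b fit the training data equally well. If w is large, noise in the input is amplified and the predictions become less accurate, so solutions with small w are preferred. Regularization adds a penalty term to the model's loss function so that the learned parameter values stay small.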
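As a rough illustration of the penalty term described above, the sketch below compares two ways of applying an L2 penalty in PyTorch: adding (lambda / 2) * ||w||^2 to the loss by hand, and passing `weight_decay` to `torch.optim.SGD`. The toy parameter `w`, the input `x`, and the values of `lr` and `wd` are made up for this example and do not come from the original post.

```python
import torch

# Hypothetical toy parameter and input, for illustration only.
w = torch.tensor([1.0, -2.0, 3.0], requires_grad=True)
x = torch.tensor([0.5, 0.5, 0.5])
lr, wd = 0.1, 0.01  # learning rate and weight-decay coefficient (lambda)

# Option 1: add the L2 penalty (wd / 2) * ||w||^2 to the loss by hand,
# then take one plain gradient-descent step.
data_loss = (w * x).sum() ** 2
penalized_loss = data_loss + (wd / 2) * (w ** 2).sum()
penalized_loss.backward()
with torch.no_grad():
    w_manual = w - lr * w.grad

# Option 2: keep the loss unpenalized and let SGD add wd * w to the gradient.
w2 = torch.tensor([1.0, -2.0, 3.0], requires_grad=True)
trainer = torch.optim.SGD([w2], lr=lr, weight_decay=wd)
loss2 = (w2 * x).sum() ** 2
loss2.backward()
trainer.step()

# For plain SGD the two updates should coincide.
same_update = torch.allclose(w_manual, w2.detach())
print(same_update)
```

In practice the second form is usually more convenient, since the penalty does not have to be written into the loss. Biases are often excluded from the decay by placing them in a separate parameter group.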
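2. Dropout (typically applied to the outputs of fully connected hidden layers)

Dropout leaves the expected value of its input unchanged and is applied only during training:
with probability p, h_i is set to zero;
with probability 1-p, h_i is divided by 1-p, which scales it up.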
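The two cases above can be sketched from scratch as follows. The function name `dropout_layer` and the test vector `h` are illustrative choices, not part of the original post. With a large input of ones, the mean of the output should stay close to 1, which is what "the expected value is unchanged" means.

```python
import torch

def dropout_layer(h, p):
    """Zero each element with probability p; divide survivors by (1 - p)."""
    assert 0 <= p <= 1
    if p == 1:
        return torch.zeros_like(h)  # everything is dropped
    if p == 0:
        return h                    # nothing is dropped
    # mask is 1 with probability 1 - p and 0 with probability p
    mask = (torch.rand(h.shape) > p).float()
    return mask * h / (1.0 - p)

torch.manual_seed(0)
h = torch.ones(100_000)             # hypothetical activations, all equal to 1
out = dropout_layer(h, 0.2)
mean_out = out.mean().item()        # should be close to 1.0
print(mean_out)
```

Each surviving element becomes 1 / (1 - 0.2) = 1.25 and about 20% of the elements become 0, so the average remains near the original value.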

import torch
from torch import nn
from d2l import torch as d2l

# Dropout probabilities for the two hidden layers
dropout1, dropout2 = 0.2, 0.2

net = nn.Sequential(nn.Flatten(),
                    nn.Linear(784, 256),
                    nn.ReLU(),
                    # Add a dropout layer after the first fully connected layer
                    nn.Dropout(dropout1),
                    nn.Linear(256, 256),
                    nn.ReLU(),
                    # Add a dropout layer after the second fully connected layer
                    nn.Dropout(dropout2),
                    nn.Linear(256, 10))

def init_weights(m):
    # Initialize the weights of every linear layer from N(0, 0.01^2)
    if isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, std=0.01)

net.apply(init_weights)

num_epochs, lr, batch_size = 10, 0.5, 256
loss = nn.CrossEntropyLoss(reduction='none')
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
trainer = torch.optim.SGD(net.parameters(), lr=lr)
# Train and evaluate with the d2l helper (assumes a d2l version that provides train_ch3)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, trainer)
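Since dropout is only active during training, it may help to see how `nn.Dropout` behaves in the two module modes. The following is a small sketch on a made-up input `x`, separate from the network above: in training mode elements are zeroed or scaled, while in evaluation mode the layer passes its input through unchanged.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Dropout(p=0.2)
x = torch.ones(8)        # hypothetical input, for illustration only

layer.train()            # training mode: drop with probability p, scale survivors by 1 / (1 - p)
train_out = layer(x)

layer.eval()             # evaluation mode: dropout acts as the identity
eval_out = layer(x)

print(train_out)
print(eval_out)
```

The same switch applies to a whole model: calling `net.eval()` before evaluation and `net.train()` before training toggles every dropout layer inside it.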