pytorch - Weight decay and dropout
2022-07-05 11:34:00 【我渊啊我渊啊】
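Two common ways to reduce overfitting:
1. Weight decay
Common regularizers: L1 and L2. Weight decay corresponds to the L2 penalty.
L2 regularization:
When a neural network is trained until the loss converges, many different combinations of w and b can fit the training data. If w is too large, noise in the input is amplified and the predictions become inaccurate, so the magnitude of w should be kept small. Regularization adds a penalty term to the model's loss function so that the learned parameter values stay relatively small.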
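
To make the penalty concrete, here is the usual form of the L2-regularized objective and the gradient step it leads to. The symbols are notation assumed for this sketch, not taken from the text above: \ell is the unregularized loss, \lambda the penalty strength, and \eta the learning rate.

```latex
\min_{w,b}\; \ell(w,b) + \frac{\lambda}{2}\lVert w\rVert^2
\qquad\Longrightarrow\qquad
w_{t+1} = (1-\eta\lambda)\,w_t - \eta\,\frac{\partial \ell(w_t,b_t)}{\partial w_t}
```

Each step first shrinks w by the factor (1 - \eta\lambda), assuming \eta\lambda < 1, and then applies the ordinary gradient update. That shrinking is where the name "weight decay" comes from.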
2. Dropout (typically applied to the outputs of fully connected hidden layers)

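Dropout leaves the expected value of its input unchanged and is active only during training; at inference time the layer passes its input through as-is.
With probability p, h_i is set to zero.
With probability 1-p, h_i is divided by 1-p, which scales it up.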
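
The two rules above can be sketched in plain Python, without PyTorch, so that the arithmetic stays visible. The function name `dropout` and its `training` flag are hypothetical names chosen for this illustration; this is not how `nn.Dropout` is implemented internally, only a small model of the same behavior.

```python
import random


def dropout(h, p, training=True):
    """Inverted dropout on a list of activations h with drop probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if not training or p == 0.0:
        return list(h)              # inference: pass the input through unchanged
    if p == 1.0:
        return [0.0] * len(h)       # everything is dropped
    scale = 1.0 / (1.0 - p)         # survivors are divided by 1 - p
    return [0.0 if random.random() < p else x * scale for x in h]


random.seed(0)                      # fixed seed so the run is repeatable
h = [1.0] * 100_000                 # constant input, so the input mean is 1.0
out = dropout(h, p=0.2)

mean_out = sum(out) / len(out)                  # stays close to the input mean
dropped_fraction = out.count(0.0) / len(out)    # close to p
print(mean_out, dropped_fraction)
```

Because the surviving activations are scaled by 1/(1-p), the average of the output stays close to the average of the input, which is the "expected value is unchanged" property. With `training=False` the function returns its input untouched.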

import torch
from torch import nn
from d2l import torch as d2l

# Dropout probability for each of the two hidden layers
dropout1, dropout2 = 0.2, 0.2

net = nn.Sequential(nn.Flatten(),
                    nn.Linear(784, 256),
                    nn.ReLU(),
                    # Add a dropout layer after the first fully connected layer
                    nn.Dropout(dropout1),
                    nn.Linear(256, 256),
                    nn.ReLU(),
                    # Add a dropout layer after the second fully connected layer
                    nn.Dropout(dropout2),
                    nn.Linear(256, 10))

def init_weights(m):
    # Draw the weights of every linear layer from a normal distribution with std 0.01
    if type(m) == nn.Linear:
        nn.init.normal_(m.weight, std=0.01)

net.apply(init_weights)

num_epochs, lr, batch_size = 10, 0.5, 256
loss = nn.CrossEntropyLoss(reduction='none')
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
trainer = torch.optim.SGD(net.parameters(), lr=lr)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, trainer)
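
The listing above regularizes the network with dropout only. As a minimal sketch of the weight decay described in section 1, the `weight_decay` argument of `torch.optim.SGD` can be set per parameter group. The tiny `nn.Linear(4, 1)` model, the random data, and the values `wd = 0.1` and `lr = 0.5` are assumptions made for this illustration; they are not from the original post.

```python
import torch
from torch import nn

torch.manual_seed(0)

net = nn.Linear(4, 1)           # hypothetical toy model
wd, lr = 0.1, 0.5               # hypothetical penalty strength and learning rate

# Decay the weights only; the bias gets no penalty
trainer = torch.optim.SGD([
    {"params": [net.weight], "weight_decay": wd},
    {"params": [net.bias]},
], lr=lr)

X, y = torch.randn(8, 4), torch.randn(8, 1)   # random toy data
w_before = net.weight.detach().clone()
b_before = net.bias.detach().clone()

l = nn.MSELoss()(net(X), y)
trainer.zero_grad()
l.backward()
w_grad = net.weight.grad.detach().clone()
b_grad = net.bias.grad.detach().clone()
trainer.step()

# Plain SGD with weight_decay adds wd * w to the gradient before the step
w_expected = w_before - lr * (w_grad + wd * w_before)
b_expected = b_before - lr * b_grad
print(torch.allclose(net.weight.detach(), w_expected, atol=1e-6))
```

Passing the penalty through the optimizer keeps the loss function itself unchanged, so the reported training loss stays comparable with and without weight decay.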