PyTorch: weight decay and dropout
2022-07-05 11:34:00 【我渊啊我渊啊】
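Two common ways to address overfitting:
1. Weight decay
Common approaches: L1 and L2 regularization
L2 regularization:
When a neural network is trained until the loss converges, many different combinations of w and b can satisfy the condition. If w is too large, noise at the input layer is amplified and the output becomes inaccurate, so the values of w should be kept as small as possible. Regularization adds a penalty term to the model's loss function so that the learned parameter values stay relatively small.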
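To make the penalty term concrete, here is a minimal pure-Python sketch of one SGD step with an L2 penalty. The helper name `sgd_step_l2` and the parameters `lr` and `lam` are hypothetical names chosen for illustration, and the penalty is assumed to take the form (lam / 2) * sum(w ** 2). In PyTorch, the `weight_decay` argument of `torch.optim.SGD` adds a term of the same kind to each gradient.

```python
# Hypothetical helper: one SGD step on a list of weights with an L2 penalty.
# `lr` is the learning rate and `lam` the penalty strength (illustrative names).
def sgd_step_l2(weights, grads, lr, lam):
    # Penalized loss: loss + (lam / 2) * sum(w ** 2),
    # so the penalty contributes lam * w to the gradient of each weight.
    return [w - lr * (g + lam * w) for w, g in zip(weights, grads)]


weights = [2.0, -4.0]
zero_grads = [0.0, 0.0]  # pretend the data loss is flat at this point
decayed = sgd_step_l2(weights, zero_grads, lr=0.5, lam=0.1)
print(decayed)  # every weight moves toward zero by the factor (1 - lr * lam)
```

Even with a zero data gradient, each step multiplies every weight by (1 - lr * lam), which is where the name "weight decay" comes from.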
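2. Dropout (typically applied to the outputs of fully connected hidden layers)
Dropout does not change the expected value of its input, and it is active only during training.
With probability p, h_i is set to zero.
With probability 1-p, h_i is divided by 1-p to scale it up.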
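The two rules above can be sketched in a few lines of pure Python. The function below is a hypothetical illustration, not the `nn.Dropout` implementation; the `rng` parameter and the fixed seed are assumptions added only to make the run reproducible. Averaging the output over many elements gives a rough check that the expected value is preserved.

```python
import random


def dropout(h, p, rng=random):
    """Apply training-time dropout to a list of activations (sketch)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    # Zero each element with probability p, otherwise divide it by 1 - p.
    return [0.0 if rng.random() < p else x / (1.0 - p) for x in h]


rng = random.Random(0)  # fixed seed, only for reproducibility
h = [1.0] * 100_000
out = dropout(h, p=0.2, rng=rng)
mean = sum(out) / len(out)
print(mean)  # stays close to the input mean of 1.0
```

At evaluation time the function would simply not be called, which matches the statement that dropout is used only during training.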
import torch
from torch import nn
from d2l import torch as d2l

# Dropout probabilities for the two hidden layers
dropout1, dropout2 = 0.2, 0.2

net = nn.Sequential(nn.Flatten(),
                    nn.Linear(784, 256),
                    nn.ReLU(),
                    # Add a dropout layer after the first fully connected layer
                    nn.Dropout(dropout1),
                    nn.Linear(256, 256),
                    nn.ReLU(),
                    # Add a dropout layer after the second fully connected layer
                    nn.Dropout(dropout2),
                    nn.Linear(256, 10))


def init_weights(m):
    # Initialize the weights of every linear layer from N(0, 0.01^2)
    if isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, std=0.01)


net.apply(init_weights)

num_epochs, lr, batch_size = 10, 0.5, 256
loss = nn.CrossEntropyLoss(reduction='none')
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
trainer = torch.optim.SGD(net.parameters(), lr=lr)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, trainer)