PyTorch: weight decay and dropout
2022-07-05 11:34:00 【我渊啊我渊啊】
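Two common ways to reduce overfitting:
1. Weight decay
Common approaches: L1 and L2 regularization
L2 regularization:
When a neural network is trained until the loss converges, many different combinations of w and b can fit the training data equally well. If w is too large, noise in the input is amplified and the predictions become less accurate, so the magnitude of w should be kept small. Regularization does this by adding a penalty term to the loss function, which pushes the learned parameters toward smaller values. For L2 regularization the penalty is proportional to the squared norm of w.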
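As a minimal sketch of the penalty idea above, the snippet below compares one SGD step with an explicit L2 penalty (lambd / 2 * ||w||^2 added to the loss) against one step using the `weight_decay` argument of `torch.optim.SGD`. The toy loss, the tensor `x`, and the values of `lambd` and `lr` are assumptions chosen only for illustration; they are not from the original post.

```python
import torch

torch.manual_seed(0)
lambd, lr = 0.1, 0.5  # hypothetical penalty strength and learning rate

w = torch.randn(3, requires_grad=True)
x = torch.randn(3)


def data_loss(weights):
    # Toy loss that depends on the weights (illustrative only)
    return (weights * x).sum() ** 2


# (a) Explicit L2 penalty added to the loss, followed by a manual SGD step
loss_a = data_loss(w) + lambd / 2 * (w ** 2).sum()
(grad_a,) = torch.autograd.grad(loss_a, w)
w_explicit = (w - lr * grad_a).detach()

# (b) Same step, but the penalty is handled by the optimizer's weight_decay
w_b = w.detach().clone().requires_grad_(True)
opt = torch.optim.SGD([w_b], lr=lr, weight_decay=lambd)
opt.zero_grad()
data_loss(w_b).backward()
opt.step()
w_decay = w_b.detach()

# Both routes should give the same updated weights (up to float rounding)
print(torch.allclose(w_explicit, w_decay, atol=1e-6))
```

For plain SGD the two routes coincide because the optimizer adds `weight_decay * w` to the gradient before the update; this equivalence should not be assumed for adaptive optimizers.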
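2. Dropout (usually applied to the outputs of fully connected hidden layers)
Dropout does not change the expected value of its input, and it is only active during training.
With probability p, h_i is set to zero.
With probability 1-p, h_i is divided by 1-p to rescale it.
The expectation is therefore preserved: E[h_i'] = p * 0 + (1-p) * h_i / (1-p) = h_i.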
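The rule above can be sketched from scratch as follows. The function name `dropout_layer` and the test tensor are assumptions made for this illustration; in practice `nn.Dropout` does the same job, and the last lines check that it becomes an identity mapping in evaluation mode.

```python
import torch


def dropout_layer(h, p):
    """Zero each element with probability p, scale the rest by 1 / (1 - p)."""
    assert 0 <= p <= 1
    if p == 1:
        return torch.zeros_like(h)
    if p == 0:
        return h
    # mask is 1 where the element is kept (probability 1 - p), 0 otherwise
    mask = (torch.rand(h.shape) > p).float()
    return mask * h / (1.0 - p)


torch.manual_seed(0)
h = torch.ones(100_000)
out = dropout_layer(h, 0.2)

# Surviving elements are scaled to 1 / 0.8, and the mean stays close to 1
print(out.unique(), out.mean())

# The built-in layer is an identity mapping once the module is in eval mode
layer = torch.nn.Dropout(0.2)
layer.eval()
eval_out = layer(h)
```

Calling `net.eval()` on a whole model switches every `nn.Dropout` inside it to this pass-through behavior, and `net.train()` switches it back.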
import torch
from torch import nn
from d2l import torch as d2l

# Dropout probabilities for the two hidden layers
dropout1, dropout2 = 0.2, 0.2

net = nn.Sequential(nn.Flatten(),
                    nn.Linear(784, 256),
                    nn.ReLU(),
                    # Add a dropout layer after the first fully connected layer
                    nn.Dropout(dropout1),
                    nn.Linear(256, 256),
                    nn.ReLU(),
                    # Add a dropout layer after the second fully connected layer
                    nn.Dropout(dropout2),
                    nn.Linear(256, 10))


def init_weights(m):
    # Initialize the weights of every linear layer from N(0, 0.01^2)
    if isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, std=0.01)


net.apply(init_weights)

num_epochs, lr, batch_size = 10, 0.5, 256
loss = nn.CrossEntropyLoss(reduction='none')
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
trainer = torch.optim.SGD(net.parameters(), lr=lr)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, trainer)