Pytorch weight decay and dropout
2022-07-05 11:42:00 【My abyss, my abyss】
Two common methods for reducing overfitting:
1. Weight decay
Common approaches: L1 and L2 regularization
L2 regularization:
When a neural network is trained until the loss converges, many different values of w and b can fit the data equally well. If w is too large, noise in the input is amplified and the output becomes less reliable, so we want to keep the values of w small. Regularization does this by adding a penalty term to the model's loss function, which pushes the learned parameters toward smaller values.
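The effect of the L2 penalty can be seen in isolation with a small sketch. It assumes plain SGD, where the `weight_decay` argument of `torch.optim.SGD` adds `weight_decay * w` to each parameter's gradient (equivalent to a penalty of `(weight_decay / 2) * ||w||^2`). The toy parameter `w` and the values of `lr` and `wd` are illustrative and not from the original post. The data loss is deliberately zero so that only the penalty moves the weights.

```python
import torch

lr, wd = 0.1, 0.5  # illustrative values, not taken from the post

# A toy parameter vector standing in for a layer's weights
w = torch.ones(3, requires_grad=True)
optimizer = torch.optim.SGD([w], lr=lr, weight_decay=wd)

# Zero data loss: its gradient with respect to w is 0 everywhere
data_loss = (w * 0.0).sum()
optimizer.zero_grad()
data_loss.backward()
optimizer.step()

# With a zero data gradient the update is w <- (1 - lr * wd) * w
expected = torch.full((3,), 1 - lr * wd)
print(w.detach(), expected)
```

Each step shrinks the weights by a constant factor in addition to the usual gradient step, which is where the name "weight decay" comes from.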
2. Dropout (usually applied to the outputs of fully connected layers)
Dropout does not change the expected value of its input, and it is used only during model training.
With probability p, h_i is set to zero.
With probability 1-p, h_i is divided by 1-p to rescale it.
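The two cases above can be written as a small function. This is a minimal from-scratch sketch, not the implementation behind `nn.Dropout`; the name `dropout_layer` and the example tensor `X` are chosen here for illustration.

```python
import torch

def dropout_layer(X, p):
    """Zero each element with probability p; divide survivors by 1 - p."""
    assert 0 <= p <= 1
    if p == 1:
        # Everything is dropped
        return torch.zeros_like(X)
    if p == 0:
        # Nothing is dropped
        return X
    # torch.rand samples uniformly from [0, 1), so an element is kept
    # with probability 1 - p
    mask = (torch.rand(X.shape) > p).float()
    return mask * X / (1.0 - p)

X = torch.arange(1, 9, dtype=torch.float32).reshape(2, 4)
Y = dropout_layer(X, 0.5)
print(Y)  # each entry is either 0 or twice the corresponding entry of X
```

Because a surviving element is scaled by 1/(1-p), the expected value of each output equals the input: p * 0 + (1-p) * h_i / (1-p) = h_i.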
import torch
from torch import nn
from d2l import torch as d2l
dropout1, dropout2 = 0.2, 0.2
net = nn.Sequential(nn.Flatten(),
                    nn.Linear(784, 256),
                    nn.ReLU(),
                    # Add a dropout layer after the first fully connected layer
                    nn.Dropout(dropout1),
                    nn.Linear(256, 256),
                    nn.ReLU(),
                    # Add a dropout layer after the second fully connected layer
                    nn.Dropout(dropout2),
                    nn.Linear(256, 10))

def init_weights(m):
    # Initialize the weights of every linear layer from N(0, 0.01^2)
    if type(m) == nn.Linear:
        nn.init.normal_(m.weight, std=0.01)

net.apply(init_weights)
num_epochs, lr, batch_size = 10, 0.5, 256
loss = nn.CrossEntropyLoss(reduction='none')
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
trainer = torch.optim.SGD(net.parameters(), lr=lr)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, trainer)
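Since dropout should be active only during training, a custom training or evaluation loop has to switch the module mode explicitly. The sketch below shows how `nn.Dropout` behaves in each mode; the layer `layer`, the input `x`, and the rate 0.2 are illustrative.

```python
import torch
from torch import nn

layer = nn.Dropout(0.2)
x = torch.ones(4, 5)

# Training mode: entries are zeroed with probability 0.2,
# survivors are scaled by 1 / (1 - 0.2) = 1.25
layer.train()
y_train = layer(x)

# Evaluation mode: dropout is disabled and the input passes through unchanged
layer.eval()
y_eval = layer(x)

print(y_train)
print(y_eval)
```

For a whole model, the same switch is made with `net.train()` before training and `net.eval()` before computing test accuracy.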