PyTorch weight decay and dropout
2022-07-05 11:42:00 【My abyss, my abyss】
There are two common ways to reduce overfitting:
1. Weight decay
Common methods: L1 and L2 regularization
L2 regularization:
When a neural network is trained until the loss converges, many different combinations of w and b can fit the training data. If w is too large, noise in the input is amplified and the output becomes inaccurate, so we want to keep the values of w small. Regularization makes the learned model parameters smaller by adding a penalty term to the model's loss function.
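The L2 penalty described above can be sketched in two ways: by adding the penalty term to the loss by hand, or by passing `weight_decay` to `torch.optim.SGD`. The helper names `l2_penalty` and `one_sgd_step`, the toy data, and the value of `lambd` are illustrative assumptions and not part of the original post.

```python
import torch
from torch import nn

def l2_penalty(w):
    """Half the squared L2 norm of w; its gradient with respect to w is w."""
    return w.pow(2).sum() / 2

def one_sgd_step(lambd, use_builtin, lr=0.1):
    """Take one SGD step on a toy linear model and return the updated weights."""
    torch.manual_seed(0)  # same initial weights and data for both variants
    model = nn.Linear(5, 1)
    X, y = torch.randn(8, 5), torch.randn(8, 1)
    loss = nn.functional.mse_loss(model(X), y)
    if use_builtin:
        # Let the optimizer apply the decay, to the weights only (not the bias)
        params = [{"params": [model.weight], "weight_decay": lambd},
                  {"params": [model.bias]}]
    else:
        # Add the penalty term to the loss explicitly
        loss = loss + lambd * l2_penalty(model.weight)
        params = model.parameters()
    trainer = torch.optim.SGD(params, lr=lr)
    trainer.zero_grad()
    loss.backward()
    trainer.step()
    return model.weight.detach().clone()

w_manual = one_sgd_step(lambd=0.1, use_builtin=False)
w_builtin = one_sgd_step(lambd=0.1, use_builtin=True)
print(torch.allclose(w_manual, w_builtin, atol=1e-6))
```

For plain SGD without momentum, the two variants should agree up to floating-point error, because the optimizer adds `weight_decay * w` to the gradient, which is the gradient of the explicit penalty.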
2. Dropout (typically applied to the outputs of fully connected layers)
Dropout does not change the expected value of its input, and it is only applied during model training.
With probability p, h_i is set to zero.
With probability 1-p, h_i is divided by 1-p to rescale it.
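The rule above can be written out by hand as a minimal sketch. The function name `dropout_layer` and the test input are assumptions made for illustration; in practice `nn.Dropout` does this for you.

```python
import torch

def dropout_layer(X, p):
    """Zero each element of X with probability p and divide the rest by 1 - p."""
    assert 0 <= p <= 1
    if p == 1:
        return torch.zeros_like(X)  # everything is dropped
    if p == 0:
        return X                    # nothing is dropped
    # torch.rand is uniform on [0, 1), so each element is kept with probability 1 - p
    mask = (torch.rand(X.shape) > p).float()
    return mask * X / (1.0 - p)

torch.manual_seed(0)
X = torch.ones(10000)
out = dropout_layer(X, 0.5)
print(out.unique())  # the surviving values are rescaled, the rest are zero
print(out.mean())    # stays close to the input mean of 1
```

Dividing the kept elements by 1-p is what keeps the expected value unchanged: E[h_i'] = p * 0 + (1-p) * h_i / (1-p) = h_i.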
import torch
from torch import nn
from d2l import torch as d2l

# Dropout probabilities for the two hidden layers
dropout1, dropout2 = 0.2, 0.2

net = nn.Sequential(nn.Flatten(),
                    nn.Linear(784, 256),
                    nn.ReLU(),
                    # Add a dropout layer after the first fully connected layer
                    nn.Dropout(dropout1),
                    nn.Linear(256, 256),
                    nn.ReLU(),
                    # Add a dropout layer after the second fully connected layer
                    nn.Dropout(dropout2),
                    nn.Linear(256, 10))

def init_weights(m):
    # Initialize the weights of every linear layer from N(0, 0.01^2)
    if isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, std=0.01)

net.apply(init_weights)

num_epochs, lr, batch_size = 10, 0.5, 256
# reduction='none' returns per-example losses
loss = nn.CrossEntropyLoss(reduction='none')
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
trainer = torch.optim.SGD(net.parameters(), lr=lr)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, trainer)
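Because dropout is only active during training, the module has to be switched to evaluation mode before it is used for prediction. The following is a small sketch of that behavior using a standalone `nn.Dropout` layer; the variable names and the input tensor are illustrative assumptions.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Dropout(p=0.2)
x = torch.ones(1000)

layer.train()          # training mode: elements are dropped and the rest rescaled
train_out = layer(x)

layer.eval()           # evaluation mode: dropout acts as the identity
eval_out = layer(x)

print((train_out == 0).float().mean())  # fraction of dropped elements, roughly p
print(torch.equal(eval_out, x))
```

The same `train()` and `eval()` calls on a whole network, such as an `nn.Sequential` model, switch every dropout layer inside it at once.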