PyTorch weight decay and dropout
2022-07-05 11:42:00 【My abyss, my abyss】
There are two common ways to reduce overfitting:
1. Weight decay
Common methods: L1 and L2 regularization
L2 regularization:
When a neural network is trained until the loss converges, many different values of w and b can fit the training data. If w is too large, noise in the input is amplified and the output becomes less accurate, so we want to keep the values of w small. Regularization does this by adding a penalty term to the model's loss function, which pushes the learned parameters toward smaller values.
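The effect of the penalty term can be sketched in plain Python. This is a minimal illustration, not the post's own code: it assumes the penalty is (lambd / 2) * sum(w_i^2), and the names `l2_penalty`, `sgd_step`, and `lambd` are hypothetical. With that penalty, the gradient gains an extra term lambd * w, so each update shrinks every weight by a factor of (1 - lr * lambd) before the usual gradient step is applied.

```python
def l2_penalty(w):
    """Return 0.5 * (sum of squared weights), the assumed L2 penalty."""
    return 0.5 * sum(wi ** 2 for wi in w)


def sgd_step(w, grad, lr, lambd):
    """One SGD step on (data loss + lambd * l2_penalty(w)).

    grad is the gradient of the data loss only; the penalty
    contributes lambd * w_i to the gradient of each weight.
    """
    return [wi - lr * (gi + lambd * wi) for wi, gi in zip(w, grad)]


w = [3.0, -2.0]
zero_grad = [0.0, 0.0]  # pretend the data loss is flat, to isolate the decay
w_decayed = sgd_step(w, zero_grad, lr=0.1, lambd=0.5)
print(l2_penalty(w))  # penalty before the step
print(w_decayed)      # each weight moved toward zero by a factor of 1 - 0.1 * 0.5
```

In PyTorch the same idea is available through the `weight_decay` argument of optimizers such as `torch.optim.SGD`, which adds `weight_decay * param` to each parameter's gradient.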
2. Dropout (usually applied to the outputs of fully connected layers)
Dropout does not change the expected value of its input, and it is used only during model training.
With probability p, h_i is set to zero.
With probability 1-p, h_i is divided by 1-p, which scales it up.
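The two cases above can be sketched from scratch in plain Python. This is an illustrative sketch under stated assumptions, not a library implementation: the function name `dropout`, the `training` flag, and the `rng` parameter are hypothetical. Because a kept value is divided by 1-p, the expectation is p * 0 + (1-p) * h_i / (1-p) = h_i, so the mean of the output should stay close to the mean of the input.

```python
import random


def dropout(h, p, training=True, rng=random):
    """Inverted dropout on a list of activations h with drop probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    # At inference time (or with p == 0) the input passes through unchanged.
    if not training or p == 0.0:
        return list(h)
    # With p == 1 every element is dropped.
    if p == 1.0:
        return [0.0] * len(h)
    keep_scale = 1.0 / (1.0 - p)
    # Drop each element with probability p, otherwise rescale it.
    return [0.0 if rng.random() < p else hi * keep_scale for hi in h]


rng = random.Random(0)  # fixed seed so the run is repeatable
h = [1.0] * 100_000
out = dropout(h, p=0.2, rng=rng)
mean_out = sum(out) / len(out)
print(mean_out)  # stays close to the input mean of 1.0
```

Calling `dropout(h, p=0.2, training=False)` returns the input unchanged, which mirrors the rule that dropout is applied only during training.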
# Concise implementation: an MLP with a dropout layer after each hidden layer
import torch
from torch import nn
from d2l import torch as d2l

# Dropout probabilities for the two hidden layers
dropout1, dropout2 = 0.2, 0.2

net = nn.Sequential(nn.Flatten(),
                    nn.Linear(784, 256),
                    nn.ReLU(),
                    # Add a dropout layer after the first fully connected layer
                    nn.Dropout(dropout1),
                    nn.Linear(256, 256),
                    nn.ReLU(),
                    # Add a dropout layer after the second fully connected layer
                    nn.Dropout(dropout2),
                    nn.Linear(256, 10))

def init_weights(m):
    # Initialize the weights of every linear layer from a normal distribution with std 0.01
    if isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, std=0.01)

net.apply(init_weights)

num_epochs, lr, batch_size = 10, 0.5, 256
# reduction='none' returns one loss value per example instead of the batch mean
loss = nn.CrossEntropyLoss(reduction='none')
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
trainer = torch.optim.SGD(net.parameters(), lr=lr)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, trainer)
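Since dropout should be active only during training, it can help to check how `nn.Dropout` behaves in each mode. The following is a small sketch that assumes PyTorch is installed; the variable names `layer`, `train_out`, and `eval_out` are only illustrative.

```python
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed so the run is repeatable
layer = nn.Dropout(p=0.5)
x = torch.ones(8)

layer.train()          # training mode: dropout is active
train_out = layer(x)   # each entry is either 0 or 1 / (1 - 0.5) = 2

layer.eval()           # evaluation mode: dropout is disabled
eval_out = layer(x)    # the input passes through unchanged

print(train_out)
print(eval_out)
```

For a full model such as `net`, calling `net.train()` before training and `net.eval()` before evaluation switches every dropout layer in the same way.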