[Linear Neural Network] softmax regression
2022-07-31 04:29:00 【PBemmm】
One-hot encoding
Generally used for classification problems, where the categories are discrete
It is very simple: n categories are represented by a vector of n entries, in which only the entry for the true category takes the value 1 and all the others are 0
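As a small illustration, one-hot vectors can be built with torch.nn.functional.one_hot. The labels and the class count below are made-up values for the example, not taken from the post.

```python
import torch
import torch.nn.functional as F

# Hypothetical labels for three samples drawn from 3 classes.
labels = torch.tensor([0, 2, 1])

# Each label becomes a length-3 vector with a single 1 at the label's index.
encoded = F.one_hot(labels, num_classes=3)
print(encoded)
```

Each row sums to 1, so a one-hot vector can also be read as a probability distribution that puts all its mass on the true class.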
Cross entropy
Cross-entropy measures the difference between the true probability distribution and the predicted probability distribution, and this difference is used as the loss
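For reference, the usual form of this loss for a one-hot label vector y and predicted probabilities ŷ can be written as below. The symbols q (number of classes) and c (index of the true class) are introduced here for illustration only.

```latex
l(\mathbf{y}, \hat{\mathbf{y}}) = -\sum_{j=1}^{q} y_j \log \hat{y}_j = -\log \hat{y}_c
```

Because y is one-hot, every term except the one for the true class c vanishes, so the loss depends only on the probability predicted for the correct class.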
Loss function
L2 Loss
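In the plot, the green curve is the likelihood function and the yellow curve is the gradient
When the prediction is far from the true value (far from the origin in the plot), the gradient is large, so the parameter update may be large as well. This motivates L1 Loss
Absolute value loss (L1 Loss)
Huber's Robust Loss
A loss function that combines L1 Loss and L2 Loss: it behaves like L2 Loss when the error is small and like L1 Loss when the error is large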
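The three losses above can be sketched in plain Python for a single prediction. This is a minimal sketch under common conventions: the factor 0.5 in the L2 loss and the threshold delta=1.0 in the Huber loss are assumptions, since the post does not state them.

```python
def l2_loss(y, y_hat):
    """Squared loss: 0.5 * (y - y_hat)^2."""
    return 0.5 * (y - y_hat) ** 2


def l1_loss(y, y_hat):
    """Absolute value loss: |y - y_hat|."""
    return abs(y - y_hat)


def huber_loss(y, y_hat, delta=1.0):
    """Quadratic within delta of the target, linear beyond it."""
    err = abs(y - y_hat)
    if err <= delta:
        return 0.5 * err ** 2
    return delta * (err - 0.5 * delta)


# Compare the three losses for a small and a large error.
for err in (0.5, 3.0):
    print(err, l2_loss(err, 0.0), l1_loss(err, 0.0), huber_loss(err, 0.0))
```

For the small error the Huber value matches the L2 value, while for the large error it grows linearly like L1, which keeps the gradient bounded far from the target.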
Softmax implementation from scratch
Read data
import torch
from IPython import display
from d2l import torch as d2l

batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
The dataset is Fashion-MNIST: 10 classes of images, with 6000 training images and 1000 test images per class, so the training set has 60000 images and the test set has 10000. The batch size is 256
num_inputs = 784
num_outputs = 10

W = torch.normal(0, 0.01, size=(num_inputs, num_outputs), requires_grad=True)
b = torch.zeros(num_outputs, requires_grad=True)
Each image is 28*28 pixels, flattened into a 784-dimensional vector, and the output layer has 10 units, one per class
Here the weight W is a (784, 10) matrix and the input x is a vector of length 784. For the outputs O (O1 to O10), each Oj corresponds to
Oj = <W[:, j], x> + bj, so the number of columns of W is num_outputs, which is the number of outputs (classes) in O
Likewise, b has one entry for each output in O
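The shapes described above can be checked with a quick sketch. The batch of 4 random inputs and the chosen output index j are hypothetical values for illustration.

```python
import torch

torch.manual_seed(0)
num_inputs, num_outputs = 784, 10
W = torch.normal(0, 0.01, size=(num_inputs, num_outputs))
b = torch.zeros(num_outputs)

# A hypothetical batch of 4 flattened images.
X = torch.rand(4, num_inputs)

# All outputs at once: (4, 784) @ (784, 10) + (10,) -> (4, 10).
O = torch.matmul(X, W) + b

# Output j computed on its own from column j of W and entry j of b.
j = 3
O_j = torch.matmul(X, W[:, j]) + b[j]

print(O.shape)
print(torch.allclose(O[:, j], O_j, atol=1e-5))
```

The bias b of shape (10,) is broadcast across the batch dimension, so every sample gets the same per-class offset.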
softmax
As the name suggests, softmax is the soft counterpart of hardmax, and hardmax simply takes the maximum value of a sequence. In classification, the labels are one-hot encoded, and softmax introduces confidence: it exponentiates the raw outputs with base e and normalizes them so that they can be read as probabilities. We only care whether the predicted value and confidence of the correct class are large enough, and do not care about the values of the incorrect classes. In this way the model learns to separate the true class from the other classes.
def softmax(X):
    X_exp = torch.exp(X)
    partition = X_exp.sum(1, keepdim=True)
    return X_exp / partition  # The broadcast mechanism is applied here
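One caveat with computing softmax this way is that torch.exp can overflow for large inputs. A common remedy, shown here as a sketch rather than as part of the original post, is to subtract each row's maximum before exponentiating. The names softmax_naive and softmax_stable and the test inputs are made up for this example.

```python
import torch


def softmax_naive(X):
    X_exp = torch.exp(X)
    return X_exp / X_exp.sum(1, keepdim=True)


def softmax_stable(X):
    # Subtracting each row's maximum leaves the result unchanged
    # mathematically, but keeps every exponent at or below zero.
    shifted = X - X.max(dim=1, keepdim=True).values
    X_exp = torch.exp(shifted)
    return X_exp / X_exp.sum(1, keepdim=True)


small = torch.tensor([[1.0, 2.0, 3.0]])
large = torch.tensor([[100.0, 0.0]])  # exp(100) overflows in float32

print(softmax_naive(small), softmax_stable(small))  # both agree
print(softmax_naive(large))   # contains nan because of inf / inf
print(softmax_stable(large))  # still a valid probability distribution
```

For inputs of moderate size the two versions give the same probabilities, so the shift only matters when the raw outputs become large.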
Model
def net(X):
    return softmax(torch.matmul(X.reshape((-1, W.shape[0])), W) + b)
Cross entropy
y = torch.tensor([0, 2])
y_hat = torch.tensor([[0.1, 0.3, 0.6], [0.3, 0.2, 0.5]])
y_hat[[0, 1], y]
y records the index of the true class of each sample. y_hat[[0, 1], y] picks, for each row, the predicted probability of its true class: 0.1 from the first row [0.1, 0.3, 0.6] (index 0) and 0.5 from the second row [0.3, 0.2, 0.5] (index 2)
def cross_entropy(y_hat, y):
    return -torch.log(y_hat[range(len(y_hat)), y])

cross_entropy(y_hat, y)
Concise implementation of softmax
Import
import torch
from torch import nn
from d2l import torch as d2l

batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
Initialize model parameters
net = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))

def init_weights(m):
    if type(m) == nn.Linear:
        nn.init.normal_(m.weight, std=0.01)

net.apply(init_weights);
nn.Flatten() reshapes each input image into a vector before the linear layer, so a batch of shape (batch, 1, 28, 28) becomes (batch, 784)
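The effect of nn.Flatten() can be seen on a stand-in tensor. The zero-filled batch of 2 images below is only a placeholder for real Fashion-MNIST data.

```python
import torch
from torch import nn

# A stand-in batch shaped like Fashion-MNIST images: (batch, channel, height, width).
X = torch.zeros(2, 1, 28, 28)

# By default nn.Flatten keeps dimension 0 (the batch) and merges the rest.
flat = nn.Flatten()(X)
print(flat.shape)
```

This plays the same role as the X.reshape((-1, W.shape[0])) call in the from-scratch version of the model.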
Loss function
loss = nn.CrossEntropyLoss(reduction='none')
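Note that nn.CrossEntropyLoss expects raw, unnormalized outputs (logits) and applies the softmax internally, which is why the concise model has no explicit softmax layer. With reduction='none' it returns one loss value per sample instead of an average. The sketch below checks this against a manual computation; the logits and labels are made-up values.

```python
import torch
from torch import nn

# Made-up logits for 2 samples and 3 classes, plus their true class indices.
logits = torch.tensor([[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]])
y = torch.tensor([0, 2])

loss = nn.CrossEntropyLoss(reduction='none')
per_sample = loss(logits, y)  # one loss value per sample, shape (2,)

# The same quantity by hand: softmax, then -log of the true-class probability.
probs = torch.softmax(logits, dim=1)
manual = -torch.log(probs[range(len(probs)), y])

print(per_sample)
print(manual)
```

Since the losses are returned per sample, they have to be reduced (for example with .mean()) before calling backward().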
Optimization algorithm
trainer = torch.optim.SGD(net.parameters(), lr=0.1)
Training
num_epochs = 10
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, trainer)
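To show roughly what a training helper of this kind does, here is a minimal sketch of an equivalent loop in plain PyTorch. It is not d2l's actual implementation, and it uses a small synthetic dataset (256 samples, 20 features, 3 classes) instead of Fashion-MNIST so that it runs without downloads. The helper mean_loss_and_accuracy and all data shapes are assumptions made for the example.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data (an assumption for this sketch): the label is the
# index of the largest of the first three features, so it is learnable.
features = torch.randn(256, 20)
labels = features[:, :3].argmax(dim=1)
train_iter = DataLoader(TensorDataset(features, labels), batch_size=32, shuffle=True)

net = nn.Sequential(nn.Flatten(), nn.Linear(20, 3))
loss = nn.CrossEntropyLoss(reduction='none')
trainer = torch.optim.SGD(net.parameters(), lr=0.1)


def mean_loss_and_accuracy():
    """Evaluate the current model on the whole synthetic dataset."""
    with torch.no_grad():
        out = net(features)
        acc = (out.argmax(dim=1) == labels).float().mean().item()
        return loss(out, labels).mean().item(), acc


initial_loss, initial_acc = mean_loss_and_accuracy()

num_epochs = 10
for epoch in range(num_epochs):
    for X, y in train_iter:
        l = loss(net(X), y)   # per-sample losses, shape (batch,)
        trainer.zero_grad()   # clear gradients from the previous step
        l.mean().backward()   # reduce to a scalar before backpropagation
        trainer.step()        # SGD update of weight and bias

final_loss, final_acc = mean_loss_and_accuracy()
print(f'loss {initial_loss:.3f} -> {final_loss:.3f}, '
      f'accuracy {initial_acc:.3f} -> {final_acc:.3f}')
```

The loop repeats the same four steps for every mini-batch: forward pass, loss, backward pass, parameter update.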