1. Usage of common loss functions
Preface
This post covers the common ways of defining a loss function, including the usage of multi-class cross entropy, mean squared error, and binary cross entropy. A loss function serves to: 1. measure the difference between the model's output and the label; 2. evaluate the degree of inconsistency between the model's predictions and the ground truth; 3. act as the objective function optimized in a neural network: the smaller the loss, the closer the predictions are to the true values, and the more robust the model.
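As a minimal sketch of point 3 (the numbers below are invented purely for illustration), the prediction closer to the target produces the smaller loss:

```python
import torch

target = torch.tensor([1.0, 0.0, 0.0])
close = torch.tensor([0.9, 0.1, 0.0])  # a prediction near the target
far = torch.tensor([0.2, 0.5, 0.3])    # a prediction far from the target

mse = torch.nn.MSELoss()
print(mse(close, target).item())  # ~0.0067, small loss
print(mse(far, target).item())    # ~0.3267, larger loss
```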
1. What are L1 loss (MAE), L2 loss (MSE), smooth L1 loss, and the cross-entropy loss function?
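In brief: L1 loss (mean absolute error, MAE) averages the absolute differences between predictions and targets; L2 loss (mean squared error, MSE) averages the squared differences, penalizing large errors more heavily; smooth L1 loss behaves like MSE for small errors and like MAE for large ones, making it less sensitive to outliers; and cross-entropy loss measures the gap between a predicted probability distribution and the true class. A small sketch (the tensors are made up for illustration) checking each torch.nn loss against its textbook formula:

```python
import torch

pred = torch.tensor([0.5, 1.8, 3.1])
target = torch.tensor([1.0, 2.0, 2.0])
diff = pred - target

# L1 / MAE: mean of |pred - target|
print(torch.nn.L1Loss()(pred, target), diff.abs().mean())
# L2 / MSE: mean of (pred - target)^2
print(torch.nn.MSELoss()(pred, target), diff.pow(2).mean())
# Smooth L1 (beta = 1): 0.5*d^2 if |d| < 1, else |d| - 0.5, averaged
d = diff.abs()
manual = torch.where(d < 1, 0.5 * d ** 2, d - 0.5).mean()
print(torch.nn.SmoothL1Loss()(pred, target), manual)

# Cross entropy: -log p(true class), with p from softmax over the logits
logits = torch.tensor([[2.0, 0.5, 0.1]])
label = torch.tensor([0])
p = torch.softmax(logits, dim=1)
print(torch.nn.CrossEntropyLoss()(logits, label), -p[0, label].log())
```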


2. Usage steps
2.1 Choosing a loss function

The code is as follows (example):

```python
# Define the loss function; its gradient drives the parameter updates
loss_fn = torch.nn.CrossEntropyLoss()    # multi-class cross entropy; no activation function needed on the output
# loss_fn = torch.nn.MSELoss()           # mean squared error
# loss_fn = torch.nn.BCELoss()           # binary cross entropy
# loss_fn = torch.nn.BCEWithLogitsLoss() # binary cross entropy that applies sigmoid internally
```

2.2 Code walkthrough

The following example trains a small fully connected network on MNIST and shows where each of these loss functions fits into a training loop.
The code is as follows (example):

```python
import os
import torch
from torchvision import datasets, transforms
from torch.utils.data import DataLoader  # wraps a dataset and yields iterable mini-batches
import matplotlib.pyplot as plt


class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # nn.Sequential chains the linear layer, batch norm, and activation into one module.
        # nn.Linear(in_features, out_features, bias=True) applies a linear transformation:
        # 784 is the size of each input sample (the 28*28 = w*h pixels of one image),
        # 256 is the size of each output sample.
        # BatchNorm1d(256) normalizes activations toward a standard distribution so that
        # inputs to the nonlinearity fall in its sensitive region, avoiding vanishing gradients.
        # nn.ReLU() is the ReLU activation function.
        self.fc1 = torch.nn.Sequential(
            torch.nn.Linear(784, 256),
            torch.nn.BatchNorm1d(256),
            torch.nn.ReLU())
        self.fc2 = torch.nn.Sequential(
            torch.nn.Linear(256, 128),
            torch.nn.BatchNorm1d(128),
            torch.nn.ReLU())
        self.fc3 = torch.nn.Linear(128, 10)

    def forward(self, x):  # forward implements the forward-propagation pass
        # The input arrives as (N, C, H, W) = (batchsize, channels, height, width).
        # x.size(0) == batchsize, the number of rows after flattening;
        # reshape(x, [x.size(0), -1]) straightens (C, H, W) into one dimension of length
        # C*H*W so the tensor can be connected to the fully connected layers.
        # -1 tells reshape to infer the number of columns from the tensor size and batchsize.
        x = torch.reshape(x, [x.size(0), -1])  # flatten to 2D, (N, 784); reshape acts like view here
        y = self.fc1(x)       # (N, 256)
        y = self.fc2(y)       # (N, 128); each Linear computes y = x @ W.T + bias
        self.y = self.fc3(y)  # (N, 10) raw logits, kept for loss functions that expect logits
        y = torch.softmax(self.y, 1)
        return y


if __name__ == '__main__':
    save_params = r"./save_params/parmas.pth"  # where the parameters are saved
    save_net = r"./save_params/net.pth"        # where the whole network is saved
    # transforms.Compose applies every transform in the list in order:
    # transforms.ToTensor() rescales grayscale values from 0-255 to 0-1;
    # transforms.Normalize maps 0-1 to (-1, 1) via (image - mean) / std.
    transf = transforms.Compose([transforms.ToTensor(),
                                 transforms.Normalize(mean=[0.5, ], std=[0.5, ])])
    train_data = datasets.MNIST("./data", train=True, transform=transf, download=True)   # training set
    test_data = datasets.MNIST("./data", train=False, transform=transf, download=False)  # test set
    # 100 images per batch; shuffle=True reorders the samples so each epoch sees different
    # batches. DataLoader can also use multiple worker processes to speed up batch
    # preparation and gives the network a convenient, extensible input structure.
    train_loader = DataLoader(train_data, 100, True)
    test_loader = DataLoader(test_data, 100, True)
    print(train_data.data.shape)
    print(train_data.targets.shape)
    print(test_data.data.shape)
    print(test_data.targets.shape)
    print(test_data.classes)

    if torch.cuda.is_available():
        device = torch.device("cuda")
    else:
        device = torch.device("cpu")
    net = Net().to(device)  # copy the network's tensors onto the specified device (CUDA if available)
    if os.path.exists(save_params):
        net.load_state_dict(torch.load(save_params))  # load the parameters only
        print("Parameters loaded successfully")
    else:
        print("No params!")
    # net = torch.load(save_net).to(device)  # alternatively, load parameters and network together

    # Define the loss function; its gradient drives the updates:
    # loss_fn = torch.nn.CrossEntropyLoss()  # multi-class cross entropy
    # loss_fn = torch.nn.MSELoss()           # mean squared error
    # loss_fn = torch.nn.BCELoss()           # binary cross entropy
    loss_fn = torch.nn.BCEWithLogitsLoss()   # binary cross entropy; applies sigmoid to its input internally
    # torch.optim.Adam takes an iterable of parameters to optimize (or dicts defining
    # parameter groups) and a learning rate / step-size factor.
    # optim = torch.optim.SGD(net.parameters(), lr=1e-3)  # more stable
    optim = torch.optim.Adam(net.parameters(), lr=1e-3)  # create the optimizer

    # lists for optional real-time loss plotting during training
    a = []
    b = []
    # plt.ion()
    net.train()
    for epoch in range(1):
        for i, (x, y) in enumerate(train_loader):
            x = x.to(device)
            y = y.to(device)
            # Build one-hot targets. Use the fixed class count (10) rather than max(y) + 1,
            # which breaks whenever a batch happens not to contain the largest class.
            y_ = torch.zeros(len(y), 10).to(device)
            y_[torch.arange(len(y)), y] = 1
            out = net(x)  # forward pass (softmax probabilities)
            # loss = loss_fn(out, y)   # for CrossEntropyLoss: pass logits and class indices
            loss = loss_fn(net.y, y_)  # BCEWithLogitsLoss expects raw logits, so use net.y
            optim.zero_grad()  # clear the previous gradients
            loss.backward()    # compute the current gradients
            optim.step()       # take one step along the current gradients
            # a.append(i)
            # b.append(loss.item())
            # plt.clf()
            # plt.plot(a, b)
            # plt.pause(0.1)
            if i % 50 == 0:
                print("loss", loss.item())
    plt.ioff()
    plt.show()

    # test
    eval_loss = 0
    eval_acc = 0
    net.eval()
    for i, (x, y) in enumerate(test_loader):
        x = x.to(device)  # send x to the same device as the network
        y = y.to(device)
        y_ = torch.zeros(len(y), 10).to(device)
        y_[torch.arange(len(y)), y] = 1
        out = net(x)
        # loss = loss_fn(out, y)   # for CrossEntropyLoss
        loss = loss_fn(net.y, y_)  # as in training, BCEWithLogitsLoss needs the logits, not the softmax output
        eval_loss += loss.item() * y.size(0)
        eval_acc += (y == torch.argmax(out, 1)).cpu().sum().item()
    avg_loss = eval_loss / len(test_data)
    avg_acc = eval_acc / len(test_data)
    print(avg_loss)
    print(avg_acc)

    if not os.path.exists("./save_params"):
        os.mkdir("./save_params")
    torch.save(net.state_dict(), "./save_params/parmas.pth")  # save the parameters only
    torch.save(net, "./save_params/net.pth")                  # save the whole network
```
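One side note on the example: the one-hot targets are built by hand with torch.zeros and index assignment. torch.nn.functional.one_hot produces the same tensor in one call; a small sketch (with an invented batch of labels) showing the equivalence:

```python
import torch
import torch.nn.functional as F

y = torch.tensor([3, 0, 7])  # example class indices

# manual construction, as in the training loop above
y_ = torch.zeros(len(y), 10)
y_[torch.arange(len(y)), y] = 1

# built-in equivalent
print(torch.equal(y_, F.one_hot(y, num_classes=10).float()))  # True
```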
Summary

Tips: here is a recap of the article:

```python
loss_fn = torch.nn.CrossEntropyLoss()    # multi-class cross entropy; the output needs no activation function
loss_fn = torch.nn.MSELoss()             # mean squared error; the output needs an activation function
loss_fn = torch.nn.BCELoss()             # binary cross entropy; the output needs an activation function
loss_fn = torch.nn.BCEWithLogitsLoss()   # binary cross entropy; applies sigmoid to the input internally
```
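To make the activation-function distinction concrete, here is a short sketch (independent of the example above, with random tensors) verifying that CrossEntropyLoss already contains a log-softmax and that BCEWithLogitsLoss is BCELoss with a built-in sigmoid:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)          # raw network outputs: 4 samples, 10 classes
labels = torch.randint(0, 10, (4,))  # integer class indices
onehot = F.one_hot(labels, 10).float()

# CrossEntropyLoss == NLLLoss applied to log-softmax, so no extra activation is needed
ce = torch.nn.CrossEntropyLoss()(logits, labels)
nll = torch.nn.NLLLoss()(F.log_softmax(logits, dim=1), labels)
print(torch.allclose(ce, nll))  # True

# BCEWithLogitsLoss == BCELoss applied after a sigmoid
bce_logits = torch.nn.BCEWithLogitsLoss()(logits, onehot)
bce = torch.nn.BCELoss()(torch.sigmoid(logits), onehot)
print(torch.allclose(bce, bce_logits))  # True
```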