PyTorch - Optimizing Model Parameters
2022-07-28 15:47:00 【SpikeKing】
Reference: OPTIMIZING MODEL PARAMETERS (official PyTorch tutorial)
Backpropagation computes the gradients, which are then used to update the parameters.
Optimizers: SGD -> Adam
dataset -> dataloader: train_dataloader is used for training and test_dataloader for testing; both are iterables.
The model inherits from the Module class. __init__(self) defines the layers, Flatten() flattens the input, and Sequential() is an ordered container.
The forward() function performs the forward computation. The logits it returns have one entry per output class, 10 classes here.
Hyperparameters are not updated by the optimizer, but they affect how well the model performs.
Loss function (objective function): CrossEntropyLoss for classification, MSELoss for regression.
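As a rough illustration of the two loss functions named above, the sketch below evaluates each on hand-written tensors. The numbers are made up for the example and do not come from the original post.

```python
import torch
from torch import nn

# Classification: CrossEntropyLoss takes raw logits and integer class labels.
logits = torch.tensor([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])  # 2 samples, 3 classes
labels = torch.tensor([0, 2])
ce_loss = nn.CrossEntropyLoss()(logits, labels)

# Regression: MSELoss takes predictions and targets of the same shape.
preds = torch.tensor([1.0, 2.0, 3.0])
targets = torch.tensor([1.0, 2.0, 5.0])
mse_loss = nn.MSELoss()(preds, targets)  # mean of squared errors: (0 + 0 + 4) / 3

print(ce_loss.item(), mse_loss.item())
```

Note that CrossEntropyLoss applies log-softmax internally, so the model should output logits rather than probabilities.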
Optimizer: SGD updates the parameters. torch.optim.SGD() is constructed with the model parameters returned by model.parameters().
In each optimization step, call optimizer.zero_grad() to reset the gradients, loss.backward() to compute the gradients, and optimizer.step() to update all parameters:
optimizer.zero_grad()
loss.backward()
optimizer.step()
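The three calls can be tried on a tiny model to see what one step does. This is a minimal sketch: the linear model and the random batch are stand-ins chosen for the example, not the model used in the post. With plain SGD (no momentum or weight decay), the new weight should equal the old weight minus lr times the gradient.

```python
import torch
from torch import nn

torch.manual_seed(0)
lr = 0.1
model = nn.Linear(4, 2)                       # hypothetical tiny model
optimizer = torch.optim.SGD(model.parameters(), lr=lr)
loss_fn = nn.CrossEntropyLoss()

X = torch.randn(8, 4)                         # random batch standing in for real data
y = torch.randint(0, 2, (8,))

weight_before = model.weight.detach().clone()

loss = loss_fn(model(X), y)
optimizer.zero_grad()   # clear gradients left over from a previous step
loss.backward()         # compute d(loss)/d(parameter) for every parameter
optimizer.step()        # apply the SGD update to every parameter

# The gradient stays in .grad after step(), so the update rule can be checked.
expected = weight_before - lr * model.weight.grad
print(torch.allclose(model.weight.detach(), expected))
```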
Use torch.no_grad() for inference, and count the correct predictions to compute the accuracy.
Dataset part: write a custom dataset class, a transform function, and a collate function; dataset -> dataloader.
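A custom dataset class, a transform, and a collate function might fit together as sketched below. SquaresDataset, scale, and collate are hypothetical names invented for this example; only Dataset, DataLoader, and the collate_fn argument are PyTorch APIs.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class SquaresDataset(Dataset):
    """Hypothetical toy dataset: sample i is the pair (i, i * i)."""

    def __init__(self, n, transform=None):
        self.n = n
        self.transform = transform

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        x = torch.tensor([float(idx)])
        y = torch.tensor([float(idx * idx)])
        if self.transform is not None:
            x = self.transform(x)   # the transform is applied per sample
        return x, y


def scale(x):
    # Example transform: rescale the input feature.
    return x / 10.0


def collate(batch):
    # Example collate function: stack a list of (x, y) pairs into two tensors.
    xs, ys = zip(*batch)
    return torch.stack(xs), torch.stack(ys)


dataset = SquaresDataset(10, transform=scale)
loader = DataLoader(dataset, batch_size=4, collate_fn=collate)
batch_shapes = [(xb.shape, yb.shape) for xb, yb in loader]
print(batch_shapes)
```

With 10 samples and batch_size=4 the loader yields three batches, the last one holding the remaining 2 samples.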
The model can be replaced with a timm model; call requires_grad_(False) to freeze parameters.
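Freezing can be sketched without downloading a timm model. In PyTorch the method is named requires_grad_ (with a trailing underscore). The small backbone below is only a stand-in for a pretrained network and is an assumption of this example.

```python
import torch
from torch import nn

# Stand-in for a pretrained backbone (assumption: any nn.Module behaves the same way).
backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
head = nn.Linear(16, 10)

backbone.requires_grad_(False)          # freeze every parameter of the backbone
model = nn.Sequential(backbone, head)

# Only the head's parameters remain trainable.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)

# Pass only the trainable parameters to the optimizer.
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3
)
```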
Beyond simple classification or regression tasks there is seq2seq: training feeds in the ground-truth values (teacher forcing), while testing is autoregressive.
An AI task = dataset + model + training. At prediction time there is no need to optimize parameters.
PyTorch's torch.autograd provides automatic differentiation.
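Automatic differentiation can be seen on a single scalar. The function below is an arbitrary choice for illustration.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = x ** 2 + 2 * x      # y = x^2 + 2x
y.backward()            # autograd computes dy/dx and stores it in x.grad

print(x.grad)           # dy/dx = 2x + 2, which is 8 at x = 3
```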
If zero_grad() is not called, gradients accumulate across steps, so the learning rate needs to be reduced. Real projects require custom datasets.
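The effect of skipping zero_grad() can be checked on one scalar parameter: each backward() call adds to .grad instead of overwriting it. The toy function is an assumption made for the example.

```python
import torch

w = torch.tensor(1.0, requires_grad=True)

(2 * w).backward()
first = w.grad.clone()      # gradient after one backward pass

(2 * w).backward()          # no zeroing in between: the new gradient is added
second = w.grad.clone()

w.grad.zero_()              # what optimizer.zero_grad() does for this parameter
print(first.item(), second.item(), w.grad.item())
```

Because the accumulated gradient grows with every un-zeroed pass, each optimizer.step() moves the parameters further, which is why a smaller learning rate would be needed.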
Sequence modeling requires a padding operation, and the padded positions hold invalid values. Other prerequisites: Python classes and class inheritance, learning rate + optimizer.
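Padding variable-length sequences can be sketched with pad_sequence. The token values and the choice of 0 as the padding value are assumptions for this example; the mask marks which positions hold real data so that padded positions can be ignored later.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Three sequences of different lengths (made-up token ids).
seqs = [torch.tensor([1, 2, 3]), torch.tensor([4, 5]), torch.tensor([6])]
lengths = torch.tensor([len(s) for s in seqs])

# Pad to the longest sequence; result has shape (batch, max_len).
padded = pad_sequence(seqs, batch_first=True, padding_value=0)

# True where a position holds real data, False where it is padding.
mask = torch.arange(padded.size(1))[None, :] < lengths[:, None]

print(padded)
print(mask)
```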
Embedding class: floating-point weights. Conceptually the input is a one-hot vector (passed in practice as an integer index), and the weight matrix holds the embedding vectors.
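The one-hot view of an embedding can be checked directly: looking up an index gives the same result as multiplying a one-hot vector by the weight matrix. The sizes below are arbitrary choices for the example.

```python
import torch
from torch import nn
import torch.nn.functional as F

torch.manual_seed(0)
emb = nn.Embedding(num_embeddings=5, embedding_dim=3)   # weight shape: (5, 3)

idx = torch.tensor([1, 4])            # nn.Embedding takes integer indices
looked_up = emb(idx)                  # selects rows 1 and 4 of emb.weight

one_hot = F.one_hot(idx, num_classes=5).float()
via_matmul = one_hot @ emb.weight     # the same rows, via a matrix product

print(torch.allclose(looked_up, via_matmul))
```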
Source code:
- Use a timm model in place of the base model
- Convert the grayscale image to a color image
import timm
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda

training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor()
)

test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor()
)

train_dataloader = DataLoader(training_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)

class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        # self.linear_relu_stack = nn.Sequential(
        #     nn.Linear(28*28, 512),
        #     nn.ReLU(),
        #     nn.Linear(512, 512),
        #     nn.ReLU(),
        #     nn.Linear(512, 10),
        # )
        self.mobilenetv3 = timm.create_model('mobilenetv3_large_100', num_classes=10, pretrained=True)

    def forward(self, x):
        x = torch.cat([x, x, x], dim=1)  # Convert gray image to color image
        logits = self.mobilenetv3(x)
        return logits

model = NeuralNetwork()
learning_rate = 1e-3
batch_size = 64
epochs = 5
# Initialize the loss function
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

def train_loop(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    # Training mode: BatchNorm layers in the timm model use batch statistics
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        pred = model(X)
        loss = loss_fn(pred, y)
        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")


def test_loop(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    test_loss, correct = 0, 0
    # Evaluation mode: BatchNorm layers use their running statistics
    model.eval()
    with torch.no_grad():
        for X, y in dataloader:
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
epochs = 10
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train_loop(train_dataloader, model, loss_fn, optimizer)
    test_loop(test_dataloader, model, loss_fn)
print("Done!")