Dive into Deep Learning: Implementing Linear Regression from Scratch
2022-06-12 08:13:00 【Orange acridine 21】
We will implement the whole method from scratch, including the data pipeline, the model, the loss function, and the mini-batch stochastic gradient descent optimizer.
import random

import numpy as np
import torch
from IPython import display
from matplotlib import pyplot as plt
# the matplotlib package is used for plotting, with figures displayed inline
1、Generate the dataset
"""
An artificial data set is constructed according to the linear model with noise
"""
num_inputs=2 # Enter the number ( Characteristic number ) by 2
num_examples=1000 # The number of samples in the training data set is 1000
true_w=[2,-3.4] # The true weight of the linear regression model
true_b=4.2 # True deviation of linear regression model
features=torch.from_numpy(np.random.normal(0,1,(num_examples,num_inputs)))
# The mean for 0, The variance of 1 The random number , Yes num_examples Samples , The number of columns is num_inputs
labels=true_w[0]*features[:,0]+true_w[1]*features[:,1]+true_b
# lables Is equal to w For each column multiplied by features Each column of is then added , Finally, add the deviation true_b;
labels+=torch.from_numpy(np.random.normal(0,0.01,size=labels.size()))
# Added a noise , The mean for 0, The variance of 0.01, Shape and lables It's the same length
"""
features Every ⼀⾏ yes ⼀ individual ⻓ Degree is 2 Vector ,⽽ labels Every ⼀⾏ yes ⼀ individual ⻓ Degree is 1 Vector ( mark
The amount )
"""
print(features[0],labels[0])
# Finally, output the column vector ( Features and dimensions ) 
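In formula form (restating the code above, not new information), the generated dataset follows the linear model
$$\mathbf{y} = \mathbf{X}\mathbf{w} + b + \boldsymbol{\epsilon}, \qquad \mathbf{w} = [2, -3.4]^\top,\ b = 4.2,\ \epsilon_i \sim \mathcal{N}(0,\, 0.01^2),$$
where X is the 1000×2 feature matrix and each noise term is drawn independently.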
By plotting a scatter diagram of the second feature features[:, 1] against the labels, we can observe the linear relationship between the two:
def use_svg_display():
    # display plots in vector (svg) format
    display.set_matplotlib_formats('svg')

def set_figsize(figsize=(3.5, 2.5)):
    use_svg_display()
    # set the figure size
    plt.rcParams['figure.figsize'] = figsize

set_figsize()
plt.scatter(features[:, 1].numpy(), labels.numpy(), 1)
plt.show()
You can also save the plotting functions above in the d21zh_pytorch package, as follows:
Step 1: create a new d21zh_pytorch package;
Step 2: create a new methods.py file inside the package;
Step 3: define the use_svg_display and set_figsize plotting functions in the methods.py file of the d21zh_pytorch package.
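As a minimal sketch of what that file could contain (an assumption, since the original screenshot is not available; for `from d21zh_pytorch import *` to work, the package's __init__.py would also need a line such as `from .methods import *`):
# d21zh_pytorch/methods.py -- sketch of the saved plotting helpers
from IPython import display
from matplotlib import pyplot as plt

def use_svg_display():
    # display plots in vector (svg) format
    display.set_matplotlib_formats('svg')

def set_figsize(figsize=(3.5, 2.5)):
    use_svg_display()
    # set the figure size
    plt.rcParams['figure.figsize'] = figsize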
Then you can call the saved functions directly to display the scatter plot:
"""
If the above two functions have been added to d21zh_pytorch, they can be called as follows
"""
import sys
sys.path.append("..")
from d21zh_pytorch import *

set_figsize()
plt.scatter(features[:, 1].numpy(), labels.numpy(), 1)
plt.show()
2、Reading data
When training the model, we need to traverse the dataset and repeatedly read small batches of data samples. Here we define a function that returns the features and labels of batch_size (batch size) random samples each time it is called.
# Define a data_iter function that takes the batch size, the feature matrix and
# the label vector as input, and generates mini-batches of size batch_size
def data_iter(batch_size, features, labels):
    num_examples = len(features)
    indices = list(range(num_examples))
    random.shuffle(indices)  # the samples are read in random order
    for i in range(0, num_examples, batch_size):
        # start from 0, stop at num_examples, stepping by batch_size each time
        j = torch.LongTensor(indices[i:min(i + batch_size, num_examples)])
        # the last batch may contain fewer than batch_size samples, hence the min
        yield features.index_select(0, j), labels.index_select(0, j)
# Read and print the first mini-batch of data samples
batch_size = 10
for X, y in data_iter(batch_size, features, labels):
    print(X, '\n', y)  # the generated X is a 10x2 tensor and y is a vector of length 10
    # adding '\n' prints y on a new line
    break

3、Initialize model parameters
"""
We initialize the weights as normal random numbers with mean 0 and standard
deviation 0.01, and initialize the bias to 0
"""
w = torch.tensor(np.random.normal(0, 0.01, (num_inputs, 1)), dtype=torch.double)
b = torch.zeros(1, dtype=torch.double)
# During training we need the gradients of these parameters in order to update
# their values, so we set requires_grad=True
w.requires_grad_(requires_grad=True)
b.requires_grad_(requires_grad=True)
Note: w and b are created with dtype=torch.double rather than float32. Since features was built with torch.from_numpy it is float64, so float32 parameters would raise a dtype-mismatch error.
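If you would rather keep everything in float32 (an alternative sketch, not what this article's code does), cast the generated data once and create float32 parameters; the dtype error then disappears:
# sketch: cast the data to float32 so that float32 parameters match it
features = features.to(torch.float32)
labels = labels.to(torch.float32)
w = torch.tensor(np.random.normal(0, 0.01, (num_inputs, 1)), dtype=torch.float32, requires_grad=True)
b = torch.zeros(1, dtype=torch.float32, requires_grad=True)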
4、Define the model
# The following implements the vector expression of linear regression; we use
# the mm function for matrix multiplication
def linreg(X, w, b):  # the linear regression model
    # return the prediction: X times w (a matrix-vector product) plus the bias b
    return torch.mm(X, w) + b
"""
Describe the square loss to define the loss function of linear regression , In reality , We need to put the real value y Become the predicted value y_hat The shape of the . The results returned by the following functions will also be the same as y_hat In the same shape
"""
def squared_loss(y_hat,y):
#squared_loss Mean square loss
return (y_hat - y.view(y_hat.size()))**2/2# Subtract by element , Square by element , Finally divide by 2
# Pay attention to this ⾥ The return is a vector , in addition , pytorch⾥ Of MSELoss Not divided by 26、 Define optimization algorithms
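As a quick sanity check (a sketch with made-up example values), multiplying our per-element loss by 2 and averaging matches pytorch's built-in MSELoss:
import torch.nn as nn

y_hat = torch.tensor([[2.33], [1.07]])  # hypothetical predictions
y = torch.tensor([2.5, 0.98])           # hypothetical true values
print(2 * squared_loss(y_hat, y).mean())          # our squared loss, rescaled and averaged
print(nn.MSELoss()(y_hat, y.view(y_hat.size())))  # the built-in MSELoss gives the same number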
6、Define the optimization algorithm
# The following sgd function implements the mini-batch stochastic gradient descent
# algorithm of the previous section. It optimizes the loss function by iterating
# over the model parameters. The gradient computed by autograd here is the sum of
# the gradients over a batch of samples; we divide it by the batch size to get the average.
def sgd(params, lr, batch_size):  # all the parameters params (w and b), the learning rate, the batch size
    # mini-batch stochastic gradient descent
    for param in params:  # for each parameter param (w or b) in params
        param.data -= lr * param.grad / batch_size  # the gradient is stored in .grad
        # note that we modify param through param.data here
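A modern equivalent (a sketch, not part of the original code): current pytorch style avoids touching .data and wraps the update in torch.no_grad() instead:
def sgd_no_grad(params, lr, batch_size):  # hypothetical variant of the sgd above
    with torch.no_grad():
        for param in params:
            param -= lr * param.grad / batch_size  # in-place update, excluded from autograd tracking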
7、Train the model
lr = 0.03        # the learning rate is set to 0.03
num_epochs = 3   # the number of epochs is 3: scan the whole dataset three times
net = linreg          # the linear regression model
loss = squared_loss   # the squared loss

# Training is implemented as two nested for-loops
for epoch in range(num_epochs):  # training takes num_epochs epochs in total
    # in each epoch, every sample in the training set is used once
    # (assuming the number of samples is divisible by the batch size);
    # X and y are the features and labels of a mini-batch
    for X, y in data_iter(batch_size, features, labels):
        l = loss(net(X, w, b), y).sum()
        # l is the loss on the mini-batch X and y
        l.backward()  # compute the gradient of the mini-batch loss with respect to the model parameters
        sgd([w, b], lr, batch_size)  # update the model parameters with mini-batch stochastic gradient descent
        # don't forget to reset the gradients
        w.grad.data.zero_()
        b.grad.data.zero_()
    train_l = loss(net(features, w, b), labels)
    print('epoch %d, loss %f' % (epoch + 1, train_l.mean().item()))

This prints one loss line per epoch, from epoch 1 to epoch 3.
# Compare the true parameters with the parameters learned by training to evaluate how successful training was
print(true_w, '\n', w)
print(true_b, '\n', b)
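As a small follow-up (a sketch, not in the original), the estimation error can be printed directly:
print('error in estimating w:', torch.tensor(true_w, dtype=torch.double) - w.detach().view(-1))
print('error in estimating b:', true_b - b.detach().item())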