Dive into Deep Learning: Implementing Linear Regression from Scratch
2022-06-12 08:13:00 【Orange acridine 21】
We will implement the whole method from scratch, including the data pipeline, the model, the loss function, and the mini-batch stochastic gradient descent optimizer.
import random

import numpy as np
import torch
from IPython import display
from matplotlib import pyplot as plt
# the matplotlib package is used for plotting, with figures displayed inline
1、Generate the dataset
"""
An artificial data set is constructed according to the linear model with noise
"""
num_inputs=2 # Enter the number ( Characteristic number ) by 2
num_examples=1000 # The number of samples in the training data set is 1000
true_w=[2,-3.4] # The true weight of the linear regression model
true_b=4.2 # True deviation of linear regression model
features=torch.from_numpy(np.random.normal(0,1,(num_examples,num_inputs)))
# The mean for 0, The variance of 1 The random number , Yes num_examples Samples , The number of columns is num_inputs
labels=true_w[0]*features[:,0]+true_w[1]*features[:,1]+true_b
# lables Is equal to w For each column multiplied by features Each column of is then added , Finally, add the deviation true_b;
labels+=torch.from_numpy(np.random.normal(0,0.01,size=labels.size()))
# Added a noise , The mean for 0, The variance of 0.01, Shape and lables It's the same length
"""
features Every ⼀⾏ yes ⼀ individual ⻓ Degree is 2 Vector ,⽽ labels Every ⼀⾏ yes ⼀ individual ⻓ Degree is 1 Vector ( mark
The amount )
"""
print(features[0],labels[0])
# Finally, output the column vector ( Features and dimensions ) 
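In formula form (restating the code above, not new information), the generated dataset follows the linear model
$$\mathbf{y} = \mathbf{X}\mathbf{w} + b + \boldsymbol{\epsilon}, \qquad \mathbf{w} = [2, -3.4]^\top,\ b = 4.2,\ \epsilon_i \sim \mathcal{N}(0,\, 0.01^2),$$
where X is the 1000×2 feature matrix and each noise term is drawn independently.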
By plotting a scatter diagram of the second feature features[:, 1] against the labels, we can observe the linear relationship between the two:
def use_svg_display():
    # display plots in vector (svg) format
    display.set_matplotlib_formats('svg')

def set_figsize(figsize=(3.5, 2.5)):
    use_svg_display()
    # set the figure size
    plt.rcParams['figure.figsize'] = figsize

set_figsize()
plt.scatter(features[:, 1].numpy(), labels.numpy(), 1)
plt.show()
You can also save the plotting functions above in the d21zh_pytorch package, as follows:
Step 1: create a new d21zh_pytorch package;
Step 2: create a new methods.py file inside the package;
Step 3: define the use_svg_display and set_figsize plotting functions in the methods.py file of the d21zh_pytorch package.
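As a minimal sketch of what that file could contain (an assumption, since the original screenshot is not available; for `from d21zh_pytorch import *` to work, the package's __init__.py would also need a line such as `from .methods import *`):
# d21zh_pytorch/methods.py -- sketch of the saved plotting helpers
from IPython import display
from matplotlib import pyplot as plt

def use_svg_display():
    # display plots in vector (svg) format
    display.set_matplotlib_formats('svg')

def set_figsize(figsize=(3.5, 2.5)):
    use_svg_display()
    # set the figure size
    plt.rcParams['figure.figsize'] = figsize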
Then you can call the saved functions directly to display the scatter plot:
"""
If the above two functions have been added to d21zh_pytorch, they can be called as follows
"""
import sys
sys.path.append("..")
from d21zh_pytorch import *

set_figsize()
plt.scatter(features[:, 1].numpy(), labels.numpy(), 1)
plt.show()
2、Reading data
When training the model, we need to traverse the dataset and repeatedly read small batches of data samples. Here we define a function that returns the features and labels of batch_size (batch size) random samples each time it is called.
# Define a data_iter function that takes the batch size, the feature matrix and
# the label vector as input, and generates mini-batches of size batch_size
def data_iter(batch_size, features, labels):
    num_examples = len(features)
    indices = list(range(num_examples))
    random.shuffle(indices)  # the samples are read in random order
    for i in range(0, num_examples, batch_size):
        # start from 0, stop at num_examples, stepping by batch_size each time
        j = torch.LongTensor(indices[i:min(i + batch_size, num_examples)])
        # the last batch may contain fewer than batch_size samples, hence the min
        yield features.index_select(0, j), labels.index_select(0, j)
# Read and print the first mini-batch of data samples
batch_size = 10
for X, y in data_iter(batch_size, features, labels):
    print(X, '\n', y)  # the generated X is a 10x2 tensor and y is a vector of length 10
    # adding '\n' prints y on a new line
    break

3、Initialize model parameters
"""
We initialize the weights as normal random numbers with mean 0 and standard
deviation 0.01, and initialize the bias to 0
"""
w = torch.tensor(np.random.normal(0, 0.01, (num_inputs, 1)), dtype=torch.double)
b = torch.zeros(1, dtype=torch.double)
# During training we need the gradients of these parameters in order to update
# their values, so we set requires_grad=True
w.requires_grad_(requires_grad=True)
b.requires_grad_(requires_grad=True)
Note: w and b are created with dtype=torch.double rather than float32. Since features was built with torch.from_numpy it is float64, so float32 parameters would raise a dtype-mismatch error.
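If you would rather keep everything in float32 (an alternative sketch, not what this article's code does), cast the generated data once and create float32 parameters; the dtype error then disappears:
# sketch: cast the data to float32 so that float32 parameters match it
features = features.to(torch.float32)
labels = labels.to(torch.float32)
w = torch.tensor(np.random.normal(0, 0.01, (num_inputs, 1)), dtype=torch.float32, requires_grad=True)
b = torch.zeros(1, dtype=torch.float32, requires_grad=True)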
4、Define the model
# The following implements the vector expression of linear regression; we use
# the mm function for matrix multiplication
def linreg(X, w, b):  # the linear regression model
    # return the prediction: X times w (a matrix-vector product) plus the bias b
    return torch.mm(X, w) + b
"""
Describe the square loss to define the loss function of linear regression , In reality , We need to put the real value y Become the predicted value y_hat The shape of the . The results returned by the following functions will also be the same as y_hat In the same shape
"""
def squared_loss(y_hat,y):
#squared_loss Mean square loss
return (y_hat - y.view(y_hat.size()))**2/2# Subtract by element , Square by element , Finally divide by 2
# Pay attention to this ⾥ The return is a vector , in addition , pytorch⾥ Of MSELoss Not divided by 26、 Define optimization algorithms
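As a quick sanity check (a sketch with made-up example values), multiplying our per-element loss by 2 and averaging matches pytorch's built-in MSELoss:
import torch.nn as nn

y_hat = torch.tensor([[2.33], [1.07]])  # hypothetical predictions
y = torch.tensor([2.5, 0.98])           # hypothetical true values
print(2 * squared_loss(y_hat, y).mean())          # our squared loss, rescaled and averaged
print(nn.MSELoss()(y_hat, y.view(y_hat.size())))  # the built-in MSELoss gives the same number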
6、Define the optimization algorithm
# The following sgd function implements the mini-batch stochastic gradient descent
# algorithm of the previous section. It optimizes the loss function by iterating
# over the model parameters. The gradient computed by autograd here is the sum of
# the gradients over a batch of samples; we divide it by the batch size to get the average.
def sgd(params, lr, batch_size):  # all the parameters params (w and b), the learning rate, the batch size
    # mini-batch stochastic gradient descent
    for param in params:  # for each parameter param (w or b) in params
        param.data -= lr * param.grad / batch_size  # the gradient is stored in .grad
        # note that we modify param through param.data here
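A modern equivalent (a sketch, not part of the original code): current pytorch style avoids touching .data and wraps the update in torch.no_grad() instead:
def sgd_no_grad(params, lr, batch_size):  # hypothetical variant of the sgd above
    with torch.no_grad():
        for param in params:
            param -= lr * param.grad / batch_size  # in-place update, excluded from autograd tracking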
7、Train the model
lr = 0.03        # the learning rate is set to 0.03
num_epochs = 3   # the number of epochs is 3: scan the whole dataset three times
net = linreg          # the linear regression model
loss = squared_loss   # the squared loss

# Training is implemented as two nested for-loops
for epoch in range(num_epochs):  # training takes num_epochs epochs in total
    # in each epoch, every sample in the training set is used once
    # (assuming the number of samples is divisible by the batch size);
    # X and y are the features and labels of a mini-batch
    for X, y in data_iter(batch_size, features, labels):
        l = loss(net(X, w, b), y).sum()
        # l is the loss on the mini-batch X and y
        l.backward()  # compute the gradient of the mini-batch loss with respect to the model parameters
        sgd([w, b], lr, batch_size)  # update the model parameters with mini-batch stochastic gradient descent
        # don't forget to reset the gradients
        w.grad.data.zero_()
        b.grad.data.zero_()
    train_l = loss(net(features, w, b), labels)
    print('epoch %d, loss %f' % (epoch + 1, train_l.mean().item()))

This prints one loss line per epoch, from epoch 1 to epoch 3.
# Compare the true parameters with the parameters learned by training to evaluate how successful training was
print(true_w, '\n', w)
print(true_b, '\n', b)
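As a small follow-up (a sketch, not in the original), the estimation error can be printed directly:
print('error in estimating w:', torch.tensor(true_w, dtype=torch.double) - w.detach().view(-1))
print('error in estimating b:', true_b - b.detach().item())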