NLP Introduction + Practice: Chapter 5: Using the PyTorch API to Implement Linear Regression
2022-07-29 07:35:00 【ZNineSun】
Previous: 《NLP Introduction + Practice: Chapter 4: Implementing Linear Regression Manually with PyTorch》
Code links for this chapter:
- https://gitee.com/ninesuntec/nlp-entry-practice/blob/master/code/5. Use pytorch Medium API Linear regression .py
- https://gitee.com/ninesuntec/nlp-entry-practice/blob/master/code/5. Linearly regressive API Realization .py
- https://gitee.com/ninesuntec/nlp-entry-practice/blob/master/code/5. Linearly regressive API Realization _gpu.py
1. Commonly used PyTorch APIs for building a model
In the previous chapter we implemented back-propagation and parameter updates ourselves with the basic methods of torch. PyTorch also provides preset objects that are more flexible and simpler to use, letting us build the model, define the loss and optimize the loss.
Next, let's take a look at the commonly used APIs.
1.1 nn.Module
nn.Module is a class provided by torch.nn. It is the base class that PyTorch gives us for defining custom networks; it contains many useful methods, so building a network by inheriting from it is very simple.
When we define a custom network, two points need special attention:
- 1. __init__ must call the super method to inherit the attributes and methods of the parent class.
- 2. The forward method must be implemented; it defines the forward computation of our network.
Taking the earlier y = wx + b model as an example:
from torch import nn
import torch


class Lr(nn.Module):
    def __init__(self):
        super(Lr, self).__init__()  # call the parent class __init__
        self.linear = nn.Linear(1, 1)  # first argument: number of input features, second argument: number of output features

    def forward(self, x):
        out = self.linear(x)
        return out
Note:
- 1. nn.Linear is a linear model predefined by torch, also called a fully connected layer. Its arguments are the number of input features and the number of output features (in_features, out_features); the batch_size dimension is not counted.
- 2. nn.Module defines the __call__ method, whose implementation calls forward. So an instance of Lr can be called directly with the input, which actually invokes the forward method with that input.
# instantiate the model
model = Lr()
# pass in data and compute the prediction
x = torch.rand([500, 1])  # a 2-D tensor with 500 rows and 1 column
predict = model(x)
Extension: the model above has only a single linear layer. If we want to add another layer, we can write it as follows:
class Lr(nn.Module):
    def __init__(self):
        super(Lr, self).__init__()  # call the parent class __init__
        # nn.Linear(number of input features, number of output features)
        self.linear = nn.Linear(1, 1)
        self.fc1 = nn.Linear(1, 1)

    def forward(self, x):
        out = self.linear(x)
        out = self.fc1(out)
        return out
The code above means that the input passes through two linear layers. If we also want to apply an activation function after the second layer, say ReLU, we can write:
class Lr(nn.Module):
    def __init__(self):
        super(Lr, self).__init__()  # call the parent class __init__
        # nn.Linear(number of input features, number of output features)
        self.linear = nn.Linear(1, 1)
        self.fc1 = nn.Linear(1, 1)
        self.relu = nn.ReLU()  # instantiate the activation module once here

    def forward(self, x):
        out = self.linear(x)
        out = self.fc1(out)
        out = self.relu(out)  # apply ReLU; note that nn.ReLU(out) would only construct a module, not apply it
        return out
1.2 Optimizer class
An optimizer can be understood as a method that torch encapsulates for us to update the parameters, for example the common stochastic gradient descent (SGD).
Optimizer classes are provided by torch.optim, for example:
- 1. torch.optim.SGD(parameters, learning rate)
- 2. torch.optim.Adam(parameters, learning rate)
Note:
- 1. The parameters can be obtained with model.parameters(), which returns all parameters in the model whose requires_grad=True.
- 2. Steps for using an optimizer:
  - 1. Instantiate it.
  - 2. Set the gradients of all parameters to 0.
  - 3. Back-propagate to compute the gradients.
  - 4. Update the parameter values.
An example is as follows:
optimizer = optim.SGD(model.parameters(), lr=1e-3)
optimizer.zero_grad()  # set the gradients to 0
loss.backward()        # compute the gradients
optimizer.step()       # update the parameter values
1.3 Loss function
Our running example is a regression problem; torch also provides many predefined loss functions, for example:
- 1. Mean squared error: nn.MSELoss(), commonly used for regression problems.
- 2. Cross entropy loss: nn.CrossEntropyLoss(), commonly used for classification problems (a small classification sketch follows the usage code below).
Usage:
from torch import optim

model = Lr()  # 1. instantiate the model
# define the data
x = torch.rand([500, 1])  # a 2-D tensor with 500 rows and 1 column
y = 3 * x + 0.8
criterion = nn.MSELoss()  # 2. instantiate the loss function
optimizer = optim.SGD(model.parameters(), lr=1e-3)  # 3. instantiate the optimizer
for i in range(100):
    y_predict = model(x)  # 4. forward pass
    loss = criterion(y_predict, y)  # 5. pass the prediction and the target to the loss function to get the loss
    optimizer.zero_grad()  # 6. set the parameter gradients of the current iteration to 0
    loss.backward()  # 7. compute the gradients
    optimizer.step()  # 8. update the parameter values
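The snippet above uses MSELoss for regression. For a classification problem the training loop looks the same and only the loss changes; below is a minimal sketch with made-up shapes (8 samples, 10 features, 3 classes), not code from the original chapter:

import torch
from torch import nn, optim

# hypothetical toy data: 8 samples with 10 features each, integer labels in {0, 1, 2}
inputs = torch.rand(8, 10)
targets = torch.randint(0, 3, (8,))

classifier = nn.Linear(10, 3)      # outputs raw scores (logits) for 3 classes
criterion = nn.CrossEntropyLoss()  # expects logits and integer class labels
optimizer = optim.SGD(classifier.parameters(), lr=1e-2)

logits = classifier(inputs)
loss = criterion(logits, targets)  # softmax is applied internally, no need to add it
optimizer.zero_grad()
loss.backward()
optimizer.step()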
2. Using the PyTorch API to implement linear regression
import torch
import torch.nn as nn
from torch.optim import SGD
from matplotlib import pyplot as plt

# 1. define the data
x = torch.rand([500, 1], dtype=torch.float32)  # a 2-D tensor with 500 rows and 1 column
y = 3 * x + 0.8


# 2. define the model
class Lr(nn.Module):
    def __init__(self):
        super(Lr, self).__init__()  # call the parent class __init__
        # nn.Linear(number of input features, number of output features)
        self.linear = nn.Linear(1, 1)
        # self.fc1 = nn.Linear(1, 1)

    def forward(self, x):
        out = self.linear(x)
        # out = self.fc1(out)
        # out = self.relu(out)
        return out


# 3. instantiate the model, loss and optimizer
model = Lr()
loss_fn = nn.MSELoss()
optimizer = SGD(model.parameters(), 0.001)
# 4. train the model
for i in range(30000):
    y_predict = model(x)  # get the prediction
    loss = loss_fn(y_predict, y)  # compute the loss
    optimizer.zero_grad()  # set the parameter gradients to 0
    loss.backward()  # back-propagate to compute the gradients
    optimizer.step()  # update the parameters
    print("Loss: {}".format(loss.data))
# 5. evaluate the model
model.eval()  # set the model to evaluation (prediction) mode
predict = model(x)
predict = predict.data.numpy()
plt.scatter(x.data.numpy(), y.data.numpy(), c='b')
plt.plot(x.data.numpy(), predict, c='r')
plt.show()

Note:
- model.eval() sets the model to evaluation (prediction) mode.
- model.train(mode=True) sets the model to training mode.
For the current linear regression model there is no difference between the two.
But in other models the behaviour during training and prediction differs, so we need to tell the program whether we are training or predicting, for example when the model contains Dropout or BatchNorm layers; see the small illustration below.
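A minimal sketch (using a made-up two-layer network with Dropout, not the chapter's model) of why the mode matters:

import torch
from torch import nn

net = nn.Sequential(nn.Linear(1, 1), nn.Dropout(p=0.5))
x = torch.ones(4, 1)

net.train()    # training mode: Dropout randomly zeroes half of the activations
print(net(x))  # result changes from run to run

net.eval()     # evaluation mode: Dropout is disabled
print(net(x))  # deterministic result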
2.1 Running the code on the GPU
When the model is large or there are many parameters, the GPU is often used to speed up training. In that case our code needs a few small adjustments:
- 1. Check whether a GPU is available with torch.cuda.is_available()
if torch.cuda.is_available():
    device = torch.device("cuda:0")  # CUDA device object; if there are several GPUs, take the first one
    y = torch.ones_like(t19, device=device)  # create a CUDA tensor directly (t19 is a tensor from the earlier chapters)
    x = t19.to(device)  # or move an existing tensor such as t19 to the CUDA device
    z = x + y
    print(z.to("cpu", torch.double))  # .to can also change the dtype at the same time
else:
    print("Your device does not support GPU computation")
- 2. Convert both the model parameters and the input data to the CUDA type:
model.to(device)  # moving a module also moves its parameters
x_true = x_true.to(device)  # tensors are not moved in place, so the result must be reassigned
- 3. Results computed on the GPU are also CUDA tensors, so they need to be converted back to numpy arrays or CPU tensors:
predict = predict.cpu().detach().numpy()
detach() behaves much like .data: both return a tensor that shares the same storage as the original (neither is a deep copy). The difference is that a tensor produced by detach() is still tracked by autograd, so an illegal in-place modification is detected, while .data bypasses these safety checks.
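A quick sanity-check sketch of that point (not part of the original chapter code):

import torch

x = torch.rand(3, requires_grad=True)
y = x.exp()

d = y.detach()                       # new tensor without gradient tracking
print(d.data_ptr() == y.data_ptr())  # True: detach() shares storage, it is not a deep copy

d.zero_()                            # this also zeroes y in place
# y.sum().backward()                 # would now raise an error: autograd notices the in-place change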
The modified code is as follows :
import torch
import torch.nn as nn
from torch.optim import SGD
from matplotlib import pyplot as plt

# 1. define the data
x = torch.rand([500, 1], dtype=torch.float32)  # a 2-D tensor with 500 rows and 1 column
y = 3 * x + 0.8


# 2. define the model
class Lr(nn.Module):
    def __init__(self):
        super(Lr, self).__init__()  # call the parent class __init__
        # nn.Linear(number of input features, number of output features)
        self.linear = nn.Linear(1, 1)
        # self.fc1 = nn.Linear(1, 1)

    def forward(self, x):
        out = self.linear(x)
        # out = self.fc1(out)
        # out = self.relu(out)
        return out


# 3. instantiate the model, loss and optimizer, and move everything to the device
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
x, y = x.to(device), y.to(device)
model = Lr().to(device)
loss_fn = nn.MSELoss()
optimizer = SGD(model.parameters(), 0.001)
# 4. train the model
for i in range(30000):
    y_predict = model(x)  # get the prediction
    loss = loss_fn(y_predict, y)  # compute the loss
    optimizer.zero_grad()  # set the parameter gradients to 0
    loss.backward()  # back-propagate to compute the gradients
    optimizer.step()  # update the parameters
    print("Loss: {}".format(loss.data))
# 5. evaluate the model
model.eval()  # set the model to evaluation (prediction) mode
predict = model(x)
predict = predict.cpu().detach().numpy()
plt.scatter(x.cpu().detach().numpy(), y.cpu().detach().numpy(), c='b')
plt.plot(x.cpu().detach().numpy(), predict, c='r')
plt.show()
Next: 《NLP Introduction + Practice: Chapter 6: Introduction to Common Optimizer Algorithms》