PyTorch neural network
2022-06-26 08:54:00 【Thick Cub with thorns】
PyTorch deep learning
RNN: recurrent neural networks in PyTorch
RNN
Each step of a recurrent network builds on the output (hidden state) contributed by the previous step,
so it can accept inputs with a time-series structure over a wider range.
LSTM RNN
LSTM stands for long short-term memory.
An ordinary RNN gradually ignores information from the earliest time steps: during back-propagation, the gradient contributed by early inputs keeps shrinking,
which causes the vanishing gradient problem (also called gradient dispersion).
The gradient can also grow without bound after repeated updates, which is called gradient explosion.
Therefore an ordinary RNN cannot remember the key points of a long sequence.
An LSTM adds input, output, and forget gates (controllers)
that decide, according to the importance of the information, what enters, leaves, or is dropped from the recurrent cell.
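Before the full programs below, here is a quick sketch of how nn.LSTM consumes a (batch, time_step, input_size) tensor and what it returns; the dimensions are made up only for this demonstration:
import torch
from torch import nn

lstm = nn.LSTM(input_size=28, hidden_size=64, num_layers=1, batch_first=True)
x = torch.randn(3, 10, 28)            # (batch, time_step, input_size)
r_out, (h_n, c_n) = lstm(x, None)     # None -> zero initial hidden and cell state
print(r_out.shape)                    # torch.Size([3, 10, 64]), the output at every time step
print(h_n.shape, c_n.shape)           # torch.Size([1, 3, 64]) each, the final hidden / cell state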
PyTorch implementation
Classification problem
import torch
from torch import nn
import torchvision.datasets as dsets
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
# torch.manual_seed(1) # reproducible
# Hyper Parameters
EPOCH = 1 # train the training data n times, to save time, we just train 1 epoch
BATCH_SIZE = 64
TIME_STEP = 28 # rnn time step / image height
INPUT_SIZE = 28 # rnn input size / image width
LR = 0.01 # learning rate
DOWNLOAD_MNIST = True # set to True if haven't download the data
# Mnist digital dataset
train_data = dsets.MNIST(
    root='./mnist/',
    train=True,                          # this is training data
    transform=transforms.ToTensor(),     # Converts a PIL.Image or numpy.ndarray to
                                         # torch.FloatTensor of shape (C x H x W) and normalizes to the range [0.0, 1.0]
    download=DOWNLOAD_MNIST,             # download it if you don't have it
)
# plot one example
print(train_data.train_data.size()) # (60000, 28, 28)
print(train_data.train_labels.size()) # (60000)
plt.imshow(train_data.train_data[0].numpy(), cmap='gray')
plt.title('%i' % train_data.train_labels[0])
plt.show()
# Data Loader for easy mini-batch return in training
train_loader = torch.utils.data.DataLoader(dataset=train_data, batch_size=BATCH_SIZE, shuffle=True)
# convert test data into Variable, pick 2000 samples to speed up testing
test_data = dsets.MNIST(root='./mnist/', train=False, transform=transforms.ToTensor())
test_x = test_data.test_data.type(torch.FloatTensor)[:2000]/255. # shape (2000, 28, 28) value in range(0,1)
test_y = test_data.test_labels.numpy()[:2000] # convert to numpy array
class RNN(nn.Module):
    def __init__(self):
        super(RNN, self).__init__()
        self.rnn = nn.LSTM(        # if use nn.RNN(), it hardly learns
            input_size=INPUT_SIZE,
            hidden_size=64,        # rnn hidden unit
            num_layers=1,          # number of rnn layers
            batch_first=True,      # input & output will have batch size as the 1st dimension, e.g. (batch, time_step, input_size)
        )
        self.out = nn.Linear(64, 10)

    def forward(self, x):
        # x shape (batch, time_step, input_size)
        # r_out shape (batch, time_step, output_size)
        # h_n shape (n_layers, batch, hidden_size)
        # h_c shape (n_layers, batch, hidden_size)
        r_out, (h_n, h_c) = self.rnn(x, None)   # None represents zero initial hidden state
        # choose r_out at the last time step
        out = self.out(r_out[:, -1, :])
        return out
rnn = RNN()
print(rnn)
out
RNN(
(rnn): LSTM(28, 64, batch_first=True)
(out): Linear(in_features=64, out_features=10, bias=True)
)
Optimization and training
optimizer = torch.optim.Adam(rnn.parameters(), lr=LR) # optimize all rnn parameters
loss_func = nn.CrossEntropyLoss() # the target label is not one-hotted
# training and testing
for epoch in range(EPOCH):
    for step, (b_x, b_y) in enumerate(train_loader):   # gives batch data
        b_x = b_x.view(-1, 28, 28)       # reshape x to (batch, time_step, input_size)
        output = rnn(b_x)                # rnn output
        loss = loss_func(output, b_y)    # cross entropy loss
        optimizer.zero_grad()            # clear gradients for this training step
        loss.backward()                  # backpropagation, compute gradients
        optimizer.step()                 # apply gradients
        if step % 50 == 0:
            test_output = rnn(test_x)    # (samples, time_step, input_size)
            pred_y = torch.max(test_output, 1)[1].data.numpy()
            accuracy = float((pred_y == test_y).astype(int).sum()) / float(test_y.size)
            print('Epoch: ', epoch, '| train loss: %.4f' % loss.data.numpy(), '| test accuracy: %.2f' % accuracy)
# print 10 predictions from test data
test_output = rnn(test_x[:10].view(-1, 28, 28))
pred_y = torch.max(test_output, 1)[1].data.numpy()
print(pred_y, 'prediction number')
print(test_y[:10], 'real number')
out
Epoch: 0 | train loss: 2.2896 | test accuracy: 0.12
Epoch: 0 | train loss: 0.8098 | test accuracy: 0.60
Epoch: 0 | train loss: 0.6983 | test accuracy: 0.73
Epoch: 0 | train loss: 0.5486 | test accuracy: 0.81
Epoch: 0 | train loss: 0.7209 | test accuracy: 0.85
Epoch: 0 | train loss: 0.2399 | test accuracy: 0.87
Epoch: 0 | train loss: 0.4179 | test accuracy: 0.90
Epoch: 0 | train loss: 0.5278 | test accuracy: 0.88
Epoch: 0 | train loss: 0.3201 | test accuracy: 0.90
Epoch: 0 | train loss: 0.1950 | test accuracy: 0.92
Epoch: 0 | train loss: 0.2301 | test accuracy: 0.92
Epoch: 0 | train loss: 0.1683 | test accuracy: 0.94
Epoch: 0 | train loss: 0.1188 | test accuracy: 0.93
Epoch: 0 | train loss: 0.0566 | test accuracy: 0.95
Epoch: 0 | train loss: 0.0941 | test accuracy: 0.94
Epoch: 0 | train loss: 0.3501 | test accuracy: 0.95
Epoch: 0 | train loss: 0.0342 | test accuracy: 0.93
Epoch: 0 | train loss: 0.0753 | test accuracy: 0.96
Epoch: 0 | train loss: 0.1507 | test accuracy: 0.96
[7 2 1 0 4 1 4 9 6 9] prediction number
[7 2 1 0 4 1 4 9 5 9] real number
Regression problem
import torch
from torch import nn
import numpy as np
import matplotlib.pyplot as plt
# torch.manual_seed(1) # reproducible
# Hyper Parameters
TIME_STEP = 10 # rnn time step
INPUT_SIZE = 1 # rnn input size
LR = 0.02 # learning rate
# show data
steps = np.linspace(0, np.pi * 2, 100, dtype=np.float32) # float32 for converting torch FloatTensor
x_np = np.sin(steps)
y_np = np.cos(steps)
plt.plot(steps, y_np, 'r-', label='target (cos)')
plt.plot(steps, x_np, 'b-', label='input (sin)')
plt.legend(loc='best')
plt.show()
class RNN(nn.Module):
    def __init__(self):
        super(RNN, self).__init__()
        self.rnn = nn.RNN(
            input_size=INPUT_SIZE,
            hidden_size=32,     # rnn hidden unit
            num_layers=1,       # number of rnn layers
            batch_first=True,   # input & output will have batch size as the 1st dimension, e.g. (batch, time_step, input_size)
        )
        self.out = nn.Linear(32, 1)

    def forward(self, x, h_state):
        # x (batch, time_step, input_size)
        # h_state (n_layers, batch, hidden_size)
        # r_out (batch, time_step, hidden_size)
        r_out, h_state = self.rnn(x, h_state)
        outs = []                                 # save all predictions
        for time_step in range(r_out.size(1)):    # calculate output for each time step
            outs.append(self.out(r_out[:, time_step, :]))
        return torch.stack(outs, dim=1), h_state

        # instead, for simplicity, you can replace the code above with the following
        # r_out = r_out.view(-1, 32)
        # outs = self.out(r_out)
        # outs = outs.view(-1, TIME_STEP, 1)
        # return outs, h_state

        # or even simpler, since nn.Linear can accept inputs of any dimension
        # and returns outputs with the same dimensions except for the last one
        # outs = self.out(r_out)
        # return outs
rnn = RNN()
print(rnn)
optimizer = torch.optim.Adam(rnn.parameters(), lr=LR) # optimize all rnn parameters
loss_func = nn.MSELoss()
h_state = None # for initial hidden state
plt.figure(1, figsize=(12, 5))
plt.ion() # continuously plot
for step in range(100):
    start, end = step * np.pi, (step + 1) * np.pi   # time range
    # use sin to predict cos
    steps = np.linspace(start, end, TIME_STEP, dtype=np.float32,
                        endpoint=False)             # float32 for converting to torch FloatTensor
    x_np = np.sin(steps)
    y_np = np.cos(steps)
    x = torch.from_numpy(x_np[np.newaxis, :, np.newaxis])   # shape (batch, time_step, input_size)
    y = torch.from_numpy(y_np[np.newaxis, :, np.newaxis])

    prediction, h_state = rnn(x, h_state)   # rnn output
    # !! next step is important !!
    h_state = h_state.data                  # repack the hidden state, break the connection from the last iteration
    loss = loss_func(prediction, y)         # calculate loss
    optimizer.zero_grad()                   # clear gradients for this training step
    loss.backward()                         # backpropagation, compute gradients
    optimizer.step()                        # apply gradients

    # plotting
    plt.plot(steps, y_np.flatten(), 'r-')
    plt.plot(steps, prediction.data.numpy().flatten(), 'b-')
    plt.draw()
    plt.pause(0.05)
plt.ioff()
plt.show()
out
RNN(
(rnn): RNN(1, 32, batch_first=True)
(out): Linear(in_features=32, out_features=1, bias=True)
)
AutoEncoder (autoencoder)
The original data is first compressed (encoded) and then decompressed (decoded) to produce the output.
The reconstruction error between output and input is then minimized by back-propagation.
It is a form of unsupervised learning and is more expressive than PCA for dimensionality reduction.
After training, the encoder alone compresses the data into a code that captures its essential features.
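As a minimal sketch of this idea (the layer sizes, the 28*28 input, and the dummy batch are assumptions chosen only for illustration), an autoencoder can be written as two small Sequential stacks trained to reproduce their own input:
import torch
from torch import nn

class AutoEncoder(nn.Module):
    def __init__(self):
        super(AutoEncoder, self).__init__()
        self.encoder = nn.Sequential(    # compress 28*28 pixels down to a 3-dim code
            nn.Linear(28 * 28, 128),
            nn.Tanh(),
            nn.Linear(128, 3),
        )
        self.decoder = nn.Sequential(    # decompress the code back to 28*28 pixels
            nn.Linear(3, 128),
            nn.Tanh(),
            nn.Linear(128, 28 * 28),
            nn.Sigmoid(),                # outputs in (0, 1), matching normalized pixel values
        )

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return encoded, decoded

autoencoder = AutoEncoder()
optimizer = torch.optim.Adam(autoencoder.parameters(), lr=0.005)
loss_func = nn.MSELoss()

x = torch.rand(64, 28 * 28)              # a dummy batch; in practice this would be flattened images
encoded, decoded = autoencoder(x)
loss = loss_func(decoded, x)             # reconstruction error: output vs. original input
optimizer.zero_grad()
loss.backward()
optimizer.step()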
Reinforcement learning
- Deep Q Network (DQN)
- GAN: a generator turns meaningless random numbers into data, and the two networks improve each other (see the sketch after this list)
- the generator produces the data, the discriminator judges whether it looks real
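A minimal sketch of that generator/discriminator loop follows; the noise and data sizes, the layer widths, the stand-in "real" data, and the learning rates are all assumptions made up for illustration:
import torch
from torch import nn

NOISE_DIM, DATA_DIM = 5, 15                      # assumed sizes for this toy example

G = nn.Sequential(                               # generator: random noise -> fake "data"
    nn.Linear(NOISE_DIM, 128),
    nn.ReLU(),
    nn.Linear(128, DATA_DIM),
)
D = nn.Sequential(                               # discriminator: data -> probability it is real
    nn.Linear(DATA_DIM, 128),
    nn.ReLU(),
    nn.Linear(128, 1),
    nn.Sigmoid(),
)
opt_G = torch.optim.Adam(G.parameters(), lr=0.0001)
opt_D = torch.optim.Adam(D.parameters(), lr=0.0001)

for step in range(1000):
    real = torch.randn(64, DATA_DIM) * 0.1 + 1.  # stand-in for a batch of real samples

    # 1) update the discriminator: tell real from fake
    fake = G(torch.randn(64, NOISE_DIM)).detach()        # detach so this step does not train G
    d_loss = -torch.mean(torch.log(D(real)) + torch.log(1. - D(fake)))
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()

    # 2) update the generator: try to fool the discriminator
    fake = G(torch.randn(64, NOISE_DIM))
    g_loss = torch.mean(torch.log(1. - D(fake)))         # smaller when D believes the fakes
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()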
Torch is dynamic
PyTorch builds its computation graph dynamically as the code runs, and training can be moved to a GPU for acceleration.
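A minimal sketch of moving a model and its data onto the GPU when one is available (the model and tensor here are placeholders for illustration):
import torch
from torch import nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

net = nn.Linear(28 * 28, 10).to(device)    # move the model parameters to the GPU
x = torch.randn(64, 28 * 28).to(device)    # move the input batch to the same device
out = net(x)                               # the computation now runs on the GPU
print(out.device)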
Mitigating overfitting (over-fitting)
Add a Dropout layer:
net_dropped = torch.nn.Sequential(
    torch.nn.Linear(1, N_HIDDEN),
    torch.nn.Dropout(0.5),   # randomly drop half of the activations to mitigate overfitting
    torch.nn.ReLU(),
)
Batch Normalization
Activation functions are insensitive to inputs with very large values (they saturate),
and this happens not only at the input layer but also in the hidden layers.
A batch-normalization layer is therefore inserted between each fully connected layer and its activation function, so that the data passed on stays in a sensitive range.
It consists of a normalization step followed by a de-normalization (learned scale and shift) step.
class Net(nn.Module):
    def __init__(self, batch_normalization=False):
        super(Net, self).__init__()
        self.do_bn = batch_normalization
        self.fcs = []
        self.bns = []
        self.bn_input = nn.BatchNorm1d(1, momentum=0.5)   # BN for the input data

        for i in range(N_HIDDEN):
            input_size = 1 if i == 0 else 10
            fc = nn.Linear(input_size, 10)
            setattr(self, 'fc%i' % i, fc)          # important: register the layer on the Module
            self._set_init(fc)                     # parameter initialization
            self.fcs.append(fc)
            if self.do_bn:
                bn = nn.BatchNorm1d(10, momentum=0.5)
                setattr(self, 'bn%i' % i, bn)      # important: register the layer on the Module
                self.bns.append(bn)

        self.predict = nn.Linear(10, 1)            # output layer
        self._set_init(self.predict)               # parameter initialization
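To show where these layers are used, here is a sketch of a matching forward pass; it assumes torch.tanh as the activation (the original snippet does not show which activation is used) and illustrates the pattern rather than reproducing the original code:
    def forward(self, x):
        if self.do_bn:
            x = self.bn_input(x)        # normalize the raw input first
        for i in range(N_HIDDEN):
            x = self.fcs[i](x)          # fully connected layer
            if self.do_bn:
                x = self.bns[i](x)      # batch normalization before the activation
            x = torch.tanh(x)           # activation function
        return self.predict(x)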