PyTorch (II): activation functions, loss functions, and their gradients

2022-07-01 04:46:00 【CyrusMay】

1. Activation function
2. Loss function
- 2.1 MSE
- 2.2 CrossEntropy
3. Derivation and back propagation
- 3.1 Derivation
- 3.2 Back propagation

1. Activation function

1.1 Sigmoid / Logistic

$\sigma(x)=\frac{1}{1+e^{-x}}\\\sigma'(x)=\sigma(x)(1-\sigma(x))$

import matplotlib.pyplot as plt
import torch
x = torch.linspace(-10,10,1000)  # 1000 evenly spaced points in [-10, 10]
y = torch.sigmoid(x)             # elementwise sigmoid
plt.plot(x,y)
plt.show()
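
The derivative identity for the sigmoid, sigmoid(x)(1 - sigmoid(x)), can be checked against autograd. The following is a minimal sketch rather than part of the original listing; the variable names and the use of float64 are illustrative choices.

```python
import torch

# Evaluation points; requires_grad=True lets autograd track them.
x = torch.linspace(-10, 10, 1000, dtype=torch.float64, requires_grad=True)
s = torch.sigmoid(x)

# The sum is a scalar, so its gradient w.r.t. x is the elementwise derivative.
(autograd_grad,) = torch.autograd.grad(s.sum(), x)

# Closed form: sigmoid(x) * (1 - sigmoid(x))
analytic_grad = (s * (1 - s)).detach()

sigmoid_grad_matches = torch.allclose(autograd_grad, analytic_grad)
print(sigmoid_grad_matches)
```

The derivative peaks at 0.25 for x = 0 and approaches 0 for large |x|, which is why the sigmoid saturates at both ends of the plot.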

1.2 Tanh

$\tanh(x)=\frac{e^x-e^{-x}}{e^x+e^{-x}}\\\frac{\partial \tanh(x)}{\partial x}=1-\tanh^2(x)$

import matplotlib.pyplot as plt
import torch
x = torch.linspace(-10,10,1000)  # 1000 evenly spaced points in [-10, 10]
y = torch.tanh(x)                # elementwise tanh
plt.plot(x,y)
plt.show()
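
The derivative 1 - tanh^2(x) can also be verified numerically without autograd. This sketch uses a central difference; the step size h and the sample points are arbitrary assumptions made for illustration.

```python
import torch

x = torch.linspace(-3, 3, 13, dtype=torch.float64)
h = 1e-5  # step size for the central difference (arbitrary small value)

# Central difference approximation of d tanh(x) / dx
numeric_grad = (torch.tanh(x + h) - torch.tanh(x - h)) / (2 * h)

# Closed form: 1 - tanh^2(x)
analytic_grad = 1 - torch.tanh(x) ** 2

tanh_grad_matches = torch.allclose(numeric_grad, analytic_grad, atol=1e-8)
print(tanh_grad_matches)
```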

1.3 ReLU

$f(x)=\max(0,x)$

import matplotlib.pyplot as plt
import torch
import torch.nn.functional as F
x = torch.linspace(-10,10,1000)  # 1000 evenly spaced points in [-10, 10]
y = F.relu(x)                    # elementwise max(0, x)
plt.plot(x,y)
plt.show()
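
Since ReLU is max(0, x), its gradient is 0 for negative inputs and 1 for positive inputs. A small sketch with hand-picked sample values (chosen here to avoid x = 0, where the derivative is not defined) shows what autograd returns:

```python
import torch

# Two negative and two positive inputs; values are illustrative.
x = torch.tensor([-2.0, -0.5, 0.5, 2.0], requires_grad=True)
y = torch.relu(x)

# Backpropagate through the sum to get the elementwise derivative in x.grad.
y.sum().backward()
relu_grad = x.grad
print(relu_grad)  # tensor([0., 0., 1., 1.])
```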

1.4 Softmax

$p_i=\frac{e^{a_i}}{\sum_{k=1}^N{e^{a_k}}}\\ \frac{\partial p_i}{\partial a_j}=\left\{ \begin{array}{lc} p_i(1-p_j) & i=j \\ -p_ip_j&i\neq j\\ \end{array} \right.$

import torch
import torch.nn.functional as F
logits = torch.rand(10)
prob = F.softmax(logits,dim=0)  # normalize over the only dimension; entries sum to 1
print(prob)

tensor([0.1024, 0.0617, 0.1133, 0.1544, 0.1184, 0.0735, 0.0590, 0.1036, 0.0861,
        0.1275])
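
The two cases of the softmax derivative can be written as one matrix, diag(p) - p p^T. The sketch below compares that closed form with the Jacobian computed by `torch.autograd.functional.jacobian`; the helper name `softmax` and the fixed seed are assumptions added for illustration.

```python
import torch
from torch.autograd.functional import jacobian

torch.manual_seed(0)
logits = torch.rand(10, dtype=torch.float64)

def softmax(a):
    # Softmax over the single dimension of a 1-D tensor.
    return torch.softmax(a, dim=0)

p = softmax(logits)

# autograd_jac[i, j] = d p_i / d a_j
autograd_jac = jacobian(softmax, logits)

# Closed form: p_i (1 - p_i) on the diagonal, -p_i p_j elsewhere.
analytic_jac = torch.diag(p) - torch.outer(p, p)

softmax_jac_matches = torch.allclose(autograd_jac, analytic_jac)
print(softmax_jac_matches)
```

Each row of this Jacobian sums to zero, reflecting that the probabilities always sum to 1.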

2. Loss function

2.1 MSE

import torch
import torch.nn.functional as F
x = torch.rand(100,64)   # 100 samples, 64 features
w = torch.rand(64,1)     # weights
y = torch.rand(100,1)    # targets
mse = F.mse_loss(x@w,y)  # arguments: prediction, then target
print(mse)

tensor(238.5115)
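
With its default `reduction="mean"`, `F.mse_loss` averages the squared differences over all elements. A short sketch, using the same shapes as above and a fixed seed added for reproducibility, compares it with the formula written out by hand:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.rand(100, 64)
w = torch.rand(64, 1)
y = torch.rand(100, 1)

pred = x @ w
builtin_mse = F.mse_loss(pred, y)      # default reduction="mean"
manual_mse = ((pred - y) ** 2).mean()  # mean of squared errors

mse_matches = torch.allclose(builtin_mse, manual_mse)
print(mse_matches)
```

The loss is large here only because the random weights are untrained: each prediction sums 64 products of values in [0, 1), while the targets lie in [0, 1).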

2.2 CrossEntropy

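import torch
import torch.nn.functional as F
x = torch.rand(100,64)
w = torch.rand(64,10)          # 10 output classes
y = torch.randint(0,10,[100])  # class labels 0..9; the upper bound is exclusive
entropy = F.cross_entropy(x@w,y)  # expects raw logits, not probabilities
print(entropy)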

tensor(3.6413)
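
`F.cross_entropy` takes raw logits and applies log-softmax internally, so it is equivalent to `F.log_softmax` followed by `F.nll_loss`. The sketch below checks this and also writes the loss out by hand; the seed and the variable names are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.rand(100, 64)
w = torch.rand(64, 10)
y = torch.randint(0, 10, [100])

logits = x @ w
builtin_ce = F.cross_entropy(logits, y)

# Same loss in two steps: log-softmax, then negative log-likelihood.
log_prob = F.log_softmax(logits, dim=1)
two_step_ce = F.nll_loss(log_prob, y)

# By hand: average the negative log-probability of the true class.
by_hand_ce = -log_prob[torch.arange(100), y].mean()

ce_matches = torch.allclose(builtin_ce, two_step_ce) and torch.allclose(builtin_ce, by_hand_ce)
print(ce_matches)
```

Applying softmax to the logits before calling `F.cross_entropy` would therefore normalize twice and give a different loss.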

3. Derivation and back propagation

3.1 Derivation

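Tensor.requires_grad_(): marks a tensor in place so that autograd records the operations applied to it.
torch.autograd.grad(): computes the gradients of an output with respect to the given inputs and returns them as a tuple.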

import torch
import torch.nn.functional as F
x = torch.rand(100,64)
w = torch.rand(64,1)
y = torch.rand(100,1)
w.requires_grad_()  # track operations on w
mse = F.mse_loss(x@w,y)
grads = torch.autograd.grad(mse,[w])  # tuple with one gradient per input
print(grads[0].shape)

torch.Size([64, 1])
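
For this linear model the gradient of the mean squared error has a closed form, (2/N) x^T (xw - y), where N is the number of target elements. As a hedged check, the sketch below compares it with the value returned by `torch.autograd.grad`; float64 and the fixed seed are choices made for this example only.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.rand(100, 64, dtype=torch.float64)
w = torch.rand(64, 1, dtype=torch.float64, requires_grad=True)
y = torch.rand(100, 1, dtype=torch.float64)

mse = F.mse_loss(x @ w, y)
(autograd_grad,) = torch.autograd.grad(mse, [w])

# Closed form: (2 / N) * x^T (x w - y), with N = number of target elements.
residual = x @ w.detach() - y
analytic_grad = (2.0 / y.numel()) * (x.T @ residual)

mse_grad_matches = torch.allclose(autograd_grad, analytic_grad)
print(mse_grad_matches)
```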

3.2 Back propagation

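Tensor.backward(): computes the gradients of a scalar tensor with respect to the leaf tensors that require gradients and accumulates them into each leaf's .grad attribute.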

import torch
import torch.nn.functional as F
x = torch.rand(100,64)
w = torch.rand(64,10)
w.requires_grad_()
y = torch.randint(0,10,[100])  # class labels 0..9; the upper bound is exclusive
entropy = F.cross_entropy(x@w,y)
entropy.backward()   # stores d(entropy)/dw in w.grad
print(w.grad.shape)  # torch.Size([64, 10])
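
The two approaches differ in where the gradient ends up: `torch.autograd.grad` returns it, while `backward()` writes it into `w.grad` and adds to whatever is already there. The following sketch illustrates both behaviours under an arbitrary fixed seed; the flag names are hypothetical and exist only for this example.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.rand(100, 64)
w = torch.rand(64, 10, requires_grad=True)
y = torch.randint(0, 10, [100])

# torch.autograd.grad returns the gradient and leaves w.grad untouched.
loss = F.cross_entropy(x @ w, y)
(returned_grad,) = torch.autograd.grad(loss, [w])
grad_untouched = w.grad is None

# backward() stores the gradient in w.grad. A fresh forward pass is needed
# because the graph of the first loss was freed after the call above.
loss = F.cross_entropy(x @ w, y)
loss.backward()
same_grad = torch.allclose(w.grad, returned_grad)

# A second backward() adds to w.grad instead of overwriting it.
F.cross_entropy(x @ w, y).backward()
accumulated = torch.allclose(w.grad, 2 * returned_grad)

print(grad_untouched, same_grad, accumulated)
```

Because of this accumulation, training loops usually reset gradients between steps, for example with `w.grad.zero_()` or an optimizer's `zero_grad()`.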