当前位置:网站首页>Entropy and full connection layer
Entropy and full connection layer
2022-07-03 23:50:00 【onlywishes】
entropy

Entropy is the expected value of information , It is a measure of the certainty of random variables . The greater the entropy , The more uncertain the value of the variable , Otherwise, the more certain it is .

import torch
a = torch.full([4],1/4.)
print(-(a*torch.log2(a)).sum()) #a,b,c Suppose four options choose the right probability
#tensor(2.) # The more obvious entropy is, the less unexpected it is , The greater the entropy, the greater the uncertainty
b = torch.tensor([0.1,0.1,0.1,0.7])
print(-(b*torch.log2(b)).sum())
#tensor(1.3568)
c = torch.tensor([0.001,0.001,0.001,0.999])
print(-(c*torch.log2(c)).sum())
#tensor(0.0313)Cross entropy

The relative entropy is also known as KL The divergence ,KL distance , It's a measure of the distance between two random distributions . Write it down as DKL(p||q)DKL(p||q). It measures when the true distribution is p when , Hypothetical distribution q Invalidity of
For two distributions p( Real sample distribution ),q( Model to be estimated ) All are (0-1) Yes

Yes H(p,q)= H(p) + Dkl(p|q ) , When p When known , You can put H(p) As a constant , At this time, cross entropy and KL Distance is equivalent in behavior , All reflect the distribution p,q The degree of similarity . Minimizing cross entropy is equal to minimizing KL distance . They will all be in p=q Get the minimum H(p)(p=q when KL A distance of 0)

Fully connected layer
nn.Linear(input,output,bias=TRUE)
It contains w The transpose , and b( Personal understanding )
Used to set up the full connection layer in the network , It should be noted that the input and output of the full connection layer are two-dimensional tensors
The general shape is [batch_size, size], Unlike the convolution layer, the input and output are required to be four-dimensional tensors
input: Enter the size of the two-dimensional tensor
output: Output the size of the two-dimensional tensor
If not used nn.Linear, Create two dimensions tensor The input and output of is the opposite , namely 【output,input】 To transpose
import torch
import torch.nn as nn
x = torch.randn(1,784)
print(x.shape) #torch.Size([1, 784])
layer1 = nn.Linear(784,200) # The first is in, The second is out
layer2 = nn.Linear(200,200)
layer3 = nn.Linear(200,10)
x = layer1(x)
print(x.shape) #torch.Size([1, 200]) , Dimension reduction
x = layer2(x)
print(x.shape) #torch.Size([1, 200]) , feature extraction
x = layer3(x)
print(x.shape) #torch.Size([1, 10]) , Reduced to 10nn.Relu vs F.relu

nn.Relu Class methods are used ,F.relu The function method is used , Pay attention to the case of letters
relu Medium inplace The default is FALSE
inplace = False when , The value of the input object will not be modified , Instead, it returns a newly created object
inplace = True when , Will modify the value of the input object , So the printed object has the same storage address , Save space and time for repeated application and memory release , Just pass the original address , More efficient
import torch
import torch.nn as nn
from torch.nn import functional as F
x = torch.randn(1,10)
x = F.relu(x,inplace=True)
print(x) #tensor([[0.0000, 0.0000, 0.0000, 0.0000, 0.9487, 1.0720, 0.0000, 0.0000, 0.3956, 0.0000]])
layer = nn.ReLU()
x = layer(x)
print(x) #tensor([[0.0000, 0.0000, 0.0000, 0.0000, 0.9487, 1.0720, 0.0000, 0.0000, 0.3956, 0.0000]])
About activation functions relu
sigmoid and tanh yes “ Saturation activation function ”, and ReLU And its variants are “ Unsaturated activation function ”. Use “ Unsaturated activation function ” There are two advantages :
1.“ Unsaturated activation function ” Can solve the so-called “ The gradient disappears ” problem .
2. It can speed up convergence .
leaky relu
ReLU Is to set all negative values to zero ,Leaky ReLU Is to give all negative values a non-zero slope , Default 0.01
Usage and relu identical , Angle can be set

SELU function
It's a composite function

Softplus
take relu function x Axis approach 0 Partial smoothing

Use GPU Speed up
device = torch.device('cuda:0') # Use equipment , You can choose to move what you need to calculate to the equipment you need .
# Will be accelerated to GPU
o = nn.CrossEntropyLoss().to(device) # Use .to() Method specification GPU Speed up , Will return inference, His type depends on the original type
t = nn.CrossEntropyLoss().cuda() # Use .cuda() Method
Calculation accuracy
logits = torch.rand(4,10)
pred = F.softmax(logits,dim=1)
pred_label = pred.argmax(dim=1)
print(pred_label) #tensor([5, 5, 2, 6])
p = logits.argmax(dim=1)
print(p) #tensor([5, 5, 2, 6])
label = torch.tensor([3,9,2,0])
correct = torch.eq(pred_label,label) #
print(correct) #tensor([False, False, True, False])
i = correct.sum().float().item()/4 # Calculation accuracy ,item Get the elements inside
print(i) #0.25边栏推荐
- Ramble 72 of redis source code
- C # basic knowledge (1)
- 2020.2.14
- Introducing Software Testing
- Actual combat | use composite material 3 in application
- [MySQL] sql99 syntax to realize multi table query
- D30:color tunnels (color tunnels, translation)
- Ningde times and BYD have refuted rumors one after another. Why does someone always want to harm domestic brands?
- It is forbidden to splice SQL in code
- Zipper table in data warehouse (compressed storage)
猜你喜欢

X Opencv feature point detection and matching

A treasure open source software, cross platform terminal artifact tabby

Recursive least square adjustment

Private project practice sharing populate joint query in mongoose makes the template unable to render - solve the error message: syntaxerror: unexpected token r in JSON at

JDBC Technology
![[2021]NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections](/img/c6/3dc7d01600f6713afdbb4cf3df5238.jpg)
[2021]NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections

Interesting 10 CMD commands

How to understand the gain bandwidth product operational amplifier gain

Unsafe and CAS principle

Common mode interference of EMC
随机推荐
Minimum commission for stock account opening. Stock account opening is free. Is online account opening safe
D27:mode of sequence (maximum, translation)
Gossip about redis source code 77
2022 a special equipment related management (elevator) examination questions and a special equipment related management (elevator) examination contents
A treasure open source software, cross platform terminal artifact tabby
[Happy Valentine's day] "I still like you very much, like sin ² a+cos ² A consistent "(white code in the attached table)
Kubedl hostnetwork: accelerating the efficiency of distributed training communication
X Opencv feature point detection and matching
D26: the nearest number (translation + solution)
Learning methods of zynq
2022 free examination questions for hoisting machinery command and hoisting machinery command theory examination
How can I get the Commission discount of stock trading account opening? Is it safe to open an account online
Comment obtenir une commission préférentielle pour l'ouverture d'un compte en bourse? Est - ce que l'ouverture d'un compte en ligne est sécurisée?
How to make recv have a little temper?
Interpretation of corolla sub low configuration, three cylinder power configuration, CVT fuel saving and smooth, safety configuration is in place
Current detection circuit - including op amp current scheme
Ningde times and BYD have refuted rumors one after another. Why does someone always want to harm domestic brands?
Ningde times and BYD have refuted rumors one after another. Why does someone always want to harm domestic brands?
Gossip about redis source code 76
Private project practice sharing populate joint query in mongoose makes the template unable to render - solve the error message: syntaxerror: unexpected token r in JSON at