当前位置:网站首页>Entropy and full connection layer
Entropy and full connection layer
2022-07-03 23:50:00 【onlywishes】
entropy

Entropy is the expected value of information , It is a measure of the certainty of random variables . The greater the entropy , The more uncertain the value of the variable , Otherwise, the more certain it is .

import torch
a = torch.full([4],1/4.)
print(-(a*torch.log2(a)).sum()) #a,b,c Suppose four options choose the right probability
#tensor(2.) # The more obvious entropy is, the less unexpected it is , The greater the entropy, the greater the uncertainty
b = torch.tensor([0.1,0.1,0.1,0.7])
print(-(b*torch.log2(b)).sum())
#tensor(1.3568)
c = torch.tensor([0.001,0.001,0.001,0.999])
print(-(c*torch.log2(c)).sum())
#tensor(0.0313)Cross entropy

The relative entropy is also known as KL The divergence ,KL distance , It's a measure of the distance between two random distributions . Write it down as DKL(p||q)DKL(p||q). It measures when the true distribution is p when , Hypothetical distribution q Invalidity of
For two distributions p( Real sample distribution ),q( Model to be estimated ) All are (0-1) Yes

Yes H(p,q)= H(p) + Dkl(p|q ) , When p When known , You can put H(p) As a constant , At this time, cross entropy and KL Distance is equivalent in behavior , All reflect the distribution p,q The degree of similarity . Minimizing cross entropy is equal to minimizing KL distance . They will all be in p=q Get the minimum H(p)(p=q when KL A distance of 0)

Fully connected layer
nn.Linear(input,output,bias=TRUE)
It contains w The transpose , and b( Personal understanding )
Used to set up the full connection layer in the network , It should be noted that the input and output of the full connection layer are two-dimensional tensors
The general shape is [batch_size, size], Unlike the convolution layer, the input and output are required to be four-dimensional tensors
input: Enter the size of the two-dimensional tensor
output: Output the size of the two-dimensional tensor
If not used nn.Linear, Create two dimensions tensor The input and output of is the opposite , namely 【output,input】 To transpose
import torch
import torch.nn as nn
x = torch.randn(1,784)
print(x.shape) #torch.Size([1, 784])
layer1 = nn.Linear(784,200) # The first is in, The second is out
layer2 = nn.Linear(200,200)
layer3 = nn.Linear(200,10)
x = layer1(x)
print(x.shape) #torch.Size([1, 200]) , Dimension reduction
x = layer2(x)
print(x.shape) #torch.Size([1, 200]) , feature extraction
x = layer3(x)
print(x.shape) #torch.Size([1, 10]) , Reduced to 10nn.Relu vs F.relu

nn.Relu Class methods are used ,F.relu The function method is used , Pay attention to the case of letters
relu Medium inplace The default is FALSE
inplace = False when , The value of the input object will not be modified , Instead, it returns a newly created object
inplace = True when , Will modify the value of the input object , So the printed object has the same storage address , Save space and time for repeated application and memory release , Just pass the original address , More efficient
import torch
import torch.nn as nn
from torch.nn import functional as F
x = torch.randn(1,10)
x = F.relu(x,inplace=True)
print(x) #tensor([[0.0000, 0.0000, 0.0000, 0.0000, 0.9487, 1.0720, 0.0000, 0.0000, 0.3956, 0.0000]])
layer = nn.ReLU()
x = layer(x)
print(x) #tensor([[0.0000, 0.0000, 0.0000, 0.0000, 0.9487, 1.0720, 0.0000, 0.0000, 0.3956, 0.0000]])
About activation functions relu
sigmoid and tanh yes “ Saturation activation function ”, and ReLU And its variants are “ Unsaturated activation function ”. Use “ Unsaturated activation function ” There are two advantages :
1.“ Unsaturated activation function ” Can solve the so-called “ The gradient disappears ” problem .
2. It can speed up convergence .
leaky relu
ReLU Is to set all negative values to zero ,Leaky ReLU Is to give all negative values a non-zero slope , Default 0.01
Usage and relu identical , Angle can be set

SELU function
It's a composite function

Softplus
take relu function x Axis approach 0 Partial smoothing

Use GPU Speed up
device = torch.device('cuda:0') # Use equipment , You can choose to move what you need to calculate to the equipment you need .
# Will be accelerated to GPU
o = nn.CrossEntropyLoss().to(device) # Use .to() Method specification GPU Speed up , Will return inference, His type depends on the original type
t = nn.CrossEntropyLoss().cuda() # Use .cuda() Method
Calculation accuracy
logits = torch.rand(4,10)
pred = F.softmax(logits,dim=1)
pred_label = pred.argmax(dim=1)
print(pred_label) #tensor([5, 5, 2, 6])
p = logits.argmax(dim=1)
print(p) #tensor([5, 5, 2, 6])
label = torch.tensor([3,9,2,0])
correct = torch.eq(pred_label,label) #
print(correct) #tensor([False, False, True, False])
i = correct.sum().float().item()/4 # Calculation accuracy ,item Get the elements inside
print(i) #0.25边栏推荐
- Investment demand and income forecast report of China's building ceramics industry, 2022-2028
- Version rollback revert don't reset better reset must be forced
- What is the difference between NFT, SFT and dnft? How to build NFT platform applications?
- [BSP video tutorial] stm32h7 video tutorial phase 5: MDK topic, system introduction to MDK debugging, AC5, AC6 compilers, RTE development environment and the role of various configuration items (2022-
- Idea a method for starting multiple instances of a service
- 股票开户最低佣金炒股开户免费,网上开户安全吗
- [MySQL] sql99 syntax to realize multi table query
- ADB related commands
- Gorilla/mux framework (RK boot): add tracing Middleware
- Double efficiency. Six easy-to-use pychar plug-ins are recommended
猜你喜欢

Actual combat | use composite material 3 in application

How to make recv have a little temper?

Ningde times and BYD have refuted rumors one after another. Why does someone always want to harm domestic brands?

Interpretation of corolla sub low configuration, three cylinder power configuration, CVT fuel saving and smooth, safety configuration is in place

Briefly understand the operation mode of developing NFT platform
![[network security] what is emergency response? What indicators should you pay attention to in emergency response?](/img/ff/c733ffbb922760910ab09af3ae2886.jpg)
[network security] what is emergency response? What indicators should you pay attention to in emergency response?
![[source code] VB6 chat robot](/img/89/46b67f627c8257eaddc70a247c9ba5.jpg)
[source code] VB6 chat robot

Design of logic level conversion in high speed circuit

X Opencv feature point detection and matching

Unity shader visualizer shader graph
随机推荐
P1629 postman delivering letter
Selenium check box
炒股开户佣金优惠怎么才能获得,网上开户安全吗
Private project practice sharing populate joint query in mongoose makes the template unable to render - solve the error message: syntaxerror: unexpected token r in JSON at
Amway by head has this project management tool to improve productivity in a straight line
Is user authentication really simple
"Learning notes" recursive & recursive
How to prevent malicious crawling of information by one-to-one live broadcast source server
D28:maximum sum (maximum sum, translation)
A treasure open source software, cross platform terminal artifact tabby
2022 chemical automation control instrument examination content and chemical automation control instrument simulation examination
Selenium library 4.5.0 keyword explanation (II)
股票开户佣金最低的券商有哪些大家推荐一下,手机上开户安全吗
What is the Valentine's Day gift given by the operator to the product?
Fashion cloud interview questions series - JS high-frequency handwritten code questions
D27:mode of sequence (maximum, translation)
C # basic knowledge (1)
No qualifying bean of type ‘com. netflix. discovery. AbstractDiscoveryClientOptionalArgs<?>‘ available
After the Lunar New Year and a half
2020.2.14