Two forms of softmax cross entropy + numpy implementation
2022-07-29 04:14:00 【ytusdc】
Cross entropy measures the difference between two probability distributions.
Cross entropy definition:

$$L = -\sum_i y_i \log(p_i)$$

Here y is the true value, given as a one-hot label vector; p_i is the value computed by softmax, and the entries of the vector p sum to 1; i is the index of the output node.
Cross entropy for multi-class classification
For multi-class classification the last layer is a softmax, so the output probabilities p sum to 1. The corresponding label y is a one-hot vector: only the entry at the index of the correct label is 1, and all others are 0. As a result, the loss reduces to the negative natural logarithm of the output that corresponds to the correct label.
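To make the last point concrete, here is a minimal sketch showing that the full sum over classes and the single term for the correct class give the same value. The probabilities and the label are made-up illustration values, and `cross_entropy` is a hypothetical helper name, not a function from the original post.

```python
import numpy as np

def cross_entropy(p, y):
    """Cross entropy between predicted probabilities p and a one-hot label y."""
    return -np.sum(y * np.log(p))

# Hypothetical softmax output (sums to 1) and one-hot label (class 1 is correct)
p = np.array([0.1, 0.7, 0.2])
y = np.array([0.0, 1.0, 0.0])

# Full sum over all classes
loss = cross_entropy(p, y)

# Only the term for the correct class survives, because every other y_i is 0
loss_direct = -np.log(p[np.argmax(y)])

print(loss, loss_direct)
```

Both numbers are the same, which is why implementations often index the correct class directly instead of multiplying by the full one-hot vector.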
Cross entropy for binary classification
Now consider sigmoid as the last layer. Each sigmoid neuron produces a binary (two-outcome) output. When sigmoid is the output layer, the outputs of that layer cannot be treated as one distribution, because they do not sum to 1. Instead, each neuron in the last layer is treated as its own distribution, and the corresponding target follows a two-outcome (Bernoulli) distribution (the target value represents the probability of that class).
In a binary classification problem, y is the true label and can only take a value in {0, 1}. The model only has to distinguish two cases, and the predicted probabilities for the two classes are p and 1 - p. In this case the loss for sample i is

$$L_i = -\left[ y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right]$$

- y_i: the label of sample i, 1 for the positive class and 0 for the negative class
- p_i: the predicted probability that sample i belongs to the positive class
For the whole model, the loss is the average of the per-sample losses over all N samples, $L = \frac{1}{N} \sum_i L_i$.
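The binary case can be sketched in numpy as follows. This is an illustrative version, not code from the original post: the name `binary_cross_entropy`, the clipping constant `eps`, and the sample values are all assumptions. Clipping is one common way to keep `np.log` away from 0.

```python
import numpy as np

def binary_cross_entropy(p, y, eps=1e-7):
    """Average binary cross entropy over all samples.

    p: predicted probabilities of the positive class, shape (N,)
    y: true labels in {0, 1}, shape (N,)
    eps: assumed clipping constant that keeps log() away from 0
    """
    p = np.clip(p, eps, 1.0 - eps)
    per_sample = -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    return np.mean(per_sample)

# Hypothetical labels and sigmoid outputs for four samples
y = np.array([1.0, 0.0, 1.0, 0.0])
p = np.array([0.9, 0.1, 0.8, 0.3])

bce = binary_cross_entropy(p, y)
print(bce)
```

For a positive sample only the `log(p)` term contributes, and for a negative sample only the `log(1 - p)` term does, which matches the two-case reading of the formula above.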
Python implementation:
Mean squared error in numpy:

import numpy as np

def MSE(y, t):
    # t: training (supervised) data, i.e. the true values
    # y: predicted data
    # Note: this returns half the sum of squared errors, not a mean over elements
    return 0.5 * np.sum((y - t) ** 2)
softmax + cross entropy implementation
Softmax maps the output of the fully connected layer to a probability distribution. The training goal is that, for a sample belonging to class k, the probability that softmax assigns to class k is as high as possible.
# softmax function defined with numpy
def softmax(x):
    exps = np.exp(x)
    return exps / np.sum(exps)

Note that numpy floating-point types have a limited range. The exponential function can easily exceed this upper limit; when that happens, np.exp overflows to inf and the division inf / inf returns nan.
To make softmax numerically more stable and avoid nan outputs, a simple method is to shift the input vector: multiply the numerator and the denominator by a constant C (for example a number smaller than 1, which makes the exponentials smaller). This is the same as adding log C to every input. In theory any value can be chosen, but the usual choice is to set log C to the negative of the maximum input value, so that the largest exponent is 0 and overflow cannot occur.
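The reason this trick leaves the result unchanged can be written out as a short derivation. The notation (x_i for the inputs, C for the constant) follows the paragraph above; nothing else is assumed.

```latex
\mathrm{softmax}(x)_i
  = \frac{e^{x_i}}{\sum_j e^{x_j}}
  = \frac{C\, e^{x_i}}{C \sum_j e^{x_j}}
  = \frac{e^{x_i + \log C}}{\sum_j e^{x_j + \log C}},
\qquad
\log C = -\max_k x_k
\;\Rightarrow\;
x_i + \log C \le 0
\;\Rightarrow\;
0 < e^{x_i + \log C} \le 1 .
```

With this choice every exponential lies in (0, 1], and the denominator is at least 1 because the largest input contributes e^0 = 1, so neither overflow nor division by zero can occur.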
The improved softmax function can be written in Python like this:
def softmax(x):
    exps = np.exp(x - np.max(x))
    return exps / np.sum(exps)

Cross entropy code:
def sigmoid(x):
    # The exponent must be negated: sigmoid(x) = 1 / (1 + e^(-x))
    return 1 / (1 + np.exp(-x))

def softmax(x):
    # Subtract the maximum before exponentiating to avoid overflow
    exps = np.exp(x - np.max(x))
    return exps / np.sum(exps)

def cross_entropy_error(p, y):
    delta = 1e-7  # adding a tiny value prevents negative infinity from np.log(0)
    p = softmax(p)  # softmax turns the raw scores into a probability distribution, so sum(p) = 1
    return -np.sum(y * np.log(p + delta))
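The functions above work on a single 1-D score vector. As a possible extension, the sketch below handles a batch of samples by applying softmax along the last axis and averaging the per-sample losses. The names `softmax_batch` and `cross_entropy_batch` and the sample values are illustrative assumptions, not part of the original code.

```python
import numpy as np

def softmax_batch(logits):
    """Row-wise softmax for an array of shape (N, num_classes)."""
    # Subtract each row's maximum so the largest exponent in every row is 0
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

def cross_entropy_batch(logits, y, delta=1e-7):
    """Mean softmax cross entropy over a batch; y holds one-hot rows."""
    p = softmax_batch(logits)
    per_sample = -np.sum(y * np.log(p + delta), axis=-1)
    return np.mean(per_sample)

# Hypothetical raw scores; the second row would overflow a naive np.exp
logits = np.array([[2.0, 1.0, 0.1],
                   [1000.0, 1001.0, 1002.0]])
y = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])

loss = cross_entropy_batch(logits, y)
print(loss)
```

Using `keepdims=True` keeps the reduced axis so that the subtraction and the division broadcast row by row; without it, a 2-D input would be normalized against the wrong axis or fail to broadcast.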