Two forms of softmax cross entropy + numpy implementation
2022-07-29 04:14:00 【ytusdc】
Cross entropy measures the difference between two probability distributions.
Cross entropy definition:
$$
L = -\sum_i y_i \log p_i
$$
Here y is the true value, given as a one-hot label vector, and p_i is the value computed by softmax; the elements of p sum to 1. The index i denotes the output node (class).
Cross entropy of multi classification problems
In multi-class classification, the last layer paired with cross entropy is softmax, so the output probabilities p sum to 1. The corresponding label y is a one-hot vector: only the entry at the index of the correct class is 1, and all others are 0. As a result, the loss reduces to the negative natural logarithm of the output for the correct class.
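As a small illustration of this point, the sketch below computes the loss for a three-class example in two ways. The arrays `p` and `y` are made-up values chosen for the example, not taken from the original post.

```python
import numpy as np

# Hypothetical softmax output for 3 classes (sums to 1) and a one-hot label.
p = np.array([0.1, 0.7, 0.2])
y = np.array([0, 1, 0])

# Full definition: sum over all classes.
loss_full = -np.sum(y * np.log(p))

# Shortcut: only the entry of the correct class contributes.
loss_short = -np.log(p[np.argmax(y)])
```

Both expressions give the same value, because every term with y_i = 0 vanishes from the sum.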
Cross entropy of binary classification problem
Now consider sigmoid as the last layer. Sigmoid gives each neuron a binary output, so the outputs of the last layer cannot be treated as a single distribution, because they do not sum to 1. Instead, each neuron in the last layer should be treated as its own distribution, and the corresponding target follows a two-point (Bernoulli) distribution, where the target value is the probability of that class.
In a binary classification problem, y is the true label and can only take values in {0, 1}. The model only has to predict two cases, with predicted probabilities p and 1 - p for the two classes. The loss for sample i is then
$$
L_i = -\left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right]
$$
- y_i: the label of sample i, 1 for the positive class and 0 for the negative class
- p_i: the probability that sample i is predicted as the positive class
For the whole model, the loss is the average of the per-sample losses over all N samples:
$$
L = \frac{1}{N} \sum_i L_i
$$
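The binary case can be sketched in NumPy as follows. The function name `binary_cross_entropy`, the clipping constant `eps`, and the sample arrays are assumptions made for this example; the clipping is added here only to keep `np.log` away from 0.

```python
import numpy as np

def binary_cross_entropy(p, y, eps=1e-7):
    # p: predicted probabilities of the positive class, shape (N,)
    # y: true labels in {0, 1}, shape (N,)
    # Clip so that log(0) can never occur (eps is an assumed small constant).
    p = np.clip(p, eps, 1 - eps)
    # Average the per-sample losses over all N samples.
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Made-up example data: four samples.
y = np.array([1, 0, 1, 0])
p = np.array([0.9, 0.1, 0.8, 0.3])
loss = binary_cross_entropy(p, y)
```

For a positive sample only the log p term is active, and for a negative sample only the log(1 - p) term, which mirrors the one-hot shortcut in the multi-class case.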
Python code implementation:
Mean squared error in NumPy:

import numpy as np

def MSE(y, t):
    # y: predicted values
    # t: supervised data (ground-truth labels)
    # Returns half the sum of squared errors (not divided by the number of elements).
    return 0.5 * np.sum((y - t) ** 2)
Softmax + cross entropy code implementation
Softmax maps the output of the fully connected layer to a probability distribution. The goal of training is that, for a sample belonging to class k, the probability that softmax assigns to class k is as high as possible.
# Define the softmax function with NumPy
def softmax(x):
    exps = np.exp(x)
    return exps / np.sum(exps)
Note that floating-point types in NumPy have a limited numerical range. The exponential function can easily exceed the upper limit; when that happens, np.exp overflows to inf and the division inf / inf returns nan.
To make softmax numerically stable and avoid nan output, a simple approach is to shift the input vector: multiply the numerator and denominator by a constant C (for example, a number less than 1, which makes the exponentials smaller). This is the same as adding log C to every input. In theory any value can be chosen, but the usual choice is log C = -max(x), that is, subtracting the maximum input value, so that every exponent is at most 0 and np.exp cannot overflow.
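Written out, the shift by the constant C looks like this. This is a short derivation added for illustration; the symbol x_j for the j-th input is notation assumed here.

```latex
\mathrm{softmax}(x)_i
  = \frac{e^{x_i}}{\sum_j e^{x_j}}
  = \frac{C\, e^{x_i}}{C \sum_j e^{x_j}}
  = \frac{e^{x_i + \log C}}{\sum_j e^{x_j + \log C}},
\qquad \log C = -\max_j x_j
```

With this choice every exponent is at most 0, so each exponential lies in (0, 1], and the largest one equals exactly 1, so the denominator is at least 1.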
In Python, the improved softmax function can be written like this:
def softmax(x):
    exps = np.exp(x - np.max(x))
    return exps / np.sum(exps)
Cross entropy code:
def sigmoid(x):
    # Logistic function: 1 / (1 + e^(-x))
    return 1 / (1 + np.exp(-x))

def softmax(x):
    # Subtract the maximum for numerical stability
    exps = np.exp(x - np.max(x))
    return exps / np.sum(exps)

def cross_entropy_error(p, y):
    delta = 1e-7  # a tiny value that prevents np.log(0), i.e. negative infinity
    p = softmax(p)  # convert the raw scores to a probability distribution, sum(p) = 1
    return -np.sum(y * np.log(p + delta))
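The function above handles a single sample. One possible extension to a mini-batch is sketched below; the names `softmax_batch` and `cross_entropy_batch`, the (N, K) array layout, and the example data are assumptions for this illustration rather than part of the original code.

```python
import numpy as np

def softmax_batch(x):
    # Row-wise stable softmax for raw scores of shape (N, K).
    exps = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exps / np.sum(exps, axis=1, keepdims=True)

def cross_entropy_batch(scores, y, delta=1e-7):
    # scores: raw scores, shape (N, K); y: one-hot labels, shape (N, K).
    p = softmax_batch(scores)
    # Sum over classes, then average over the N samples.
    return -np.sum(y * np.log(p + delta)) / scores.shape[0]

# Made-up data; the second row would overflow a softmax without the max shift.
scores = np.array([[2.0, 1.0, 0.1],
                   [1000.0, 1001.0, 1002.0]])
y = np.array([[1, 0, 0],
              [0, 0, 1]])
loss = cross_entropy_batch(scores, y)
```

Using `keepdims=True` keeps the row maxima and row sums in shape (N, 1), so they broadcast against the (N, K) scores and each sample is normalized independently.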