Two forms of softmax cross entropy + numpy implementation
2022-07-29 04:14:00 【ytusdc】
Cross entropy measures the difference between two probability distributions.
Cross entropy definition:

$$L = -\sum_i y_i \log p_i$$

where y is the true value, given as a one-hot label vector; p_i is the value computed by softmax, and the elements of p sum to 1; i indexes the output nodes (classes).
Cross entropy for multi-class classification
The last layer paired with multi-class cross entropy is softmax, whose output probabilities p sum to 1. The corresponding label y is a one-hot vector: only the entry at the index of the correct label is 1, and all others are 0. As a result, the loss only evaluates the natural logarithm of the output that corresponds to the correct label.
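The reduction to a single logarithm can be checked numerically. The following is a minimal sketch, assuming made-up values for `p` and `y` and a helper named `cross_entropy` that is not part of the original post:

```python
import numpy as np

def cross_entropy(p, y):
    """Cross entropy -sum_i y_i * log(p_i) for a single sample."""
    return -np.sum(y * np.log(p))

# Hypothetical softmax output over 3 classes and a one-hot label for class 1.
p = np.array([0.1, 0.7, 0.2])
y = np.array([0.0, 1.0, 0.0])

full_sum = cross_entropy(p, y)             # sum over every class
only_correct = -np.log(p[np.argmax(y)])    # log of the correct class only

print(full_sum, only_correct)
```

Both printed values agree, because every term with y_i = 0 drops out of the sum.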
Cross entropy for binary classification
Now consider sigmoid as the last layer. Sigmoid produces a binary output for each neuron. When sigmoid is the output layer, the outputs of the last layer cannot be treated as a single distribution, because they do not sum to 1. Instead, each neuron in the last layer should be treated as its own distribution, with the corresponding target following a two-outcome (Bernoulli) distribution (the target value is the probability of that class).
In a binary classification problem, y is the true label and can only take values in {0, 1}. The model only has to predict two cases, and the predicted probabilities of the two classes are p and 1 - p. In this case the loss for sample i is

$$L_i = -\left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right]$$

- y_i: the label of sample i, 1 for the positive class and 0 for the negative class
- p_i: the predicted probability that sample i belongs to the positive class
For the whole model, the loss function is the average of the per-sample losses over all N sample points, L = (1/N) Σ_i L_i.
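The binary case can be sketched in numpy as follows. This is an illustrative version rather than the author's code: the function name `binary_cross_entropy`, the `eps` clipping that guards against log(0), and the sample values are all assumptions.

```python
import numpy as np

def binary_cross_entropy(p, y, eps=1e-7):
    """Average binary cross entropy over all samples.

    p: predicted probabilities of the positive class, shape (N,)
    y: true labels in {0, 1}, shape (N,)
    eps: assumed safeguard that keeps p away from exactly 0 and 1
    """
    p = np.clip(p, eps, 1.0 - eps)
    per_sample = -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    return np.mean(per_sample)

# Hypothetical labels and sigmoid outputs for three samples.
y = np.array([1.0, 0.0, 1.0])
p = np.array([0.9, 0.2, 0.6])

loss = binary_cross_entropy(p, y)
print(loss)
```

For a positive sample only the log(p) term contributes, and for a negative sample only the log(1 - p) term does, so a confident correct prediction yields a small loss.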
Python code implementation:
Mean squared error in numpy:

import numpy as np

def MSE(y, t):
    # y: predicted values
    # t: training targets (supervised data, i.e. the true values)
    # Note: this returns half the sum of squared errors; it does not
    # divide by the number of elements.
    return 0.5 * np.sum((y - t) ** 2)
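The function above returns half the sum of squared errors, which differs from a true mean by a constant factor. A small sketch of the difference, with hypothetical names `half_sse` and `mse` and made-up data:

```python
import numpy as np

def half_sse(y, t):
    """Half the sum of squared errors, as in the function above."""
    return 0.5 * np.sum((y - t) ** 2)

def mse(y, t):
    """Mean of the squared errors over all elements."""
    return np.mean((y - t) ** 2)

# Hypothetical prediction and one-hot target.
y = np.array([0.1, 0.7, 0.2])
t = np.array([0.0, 1.0, 0.0])

print(half_sse(y, t))  # 0.5 * (0.01 + 0.09 + 0.04)
print(mse(y, t))       # (0.01 + 0.09 + 0.04) / 3
```

Both forms have the same minimizer; only the scale of the loss and of its gradient changes.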
Softmax + cross entropy code implementation
Softmax maps the output of the fully connected layer to a probability distribution. The goal of training is that, for a sample belonging to class k, the probability that softmax assigns to class k is as high as possible.
# softmax function defined with numpy
def softmax(x):
    exps = np.exp(x)
    return exps / np.sum(exps)
Note that floating-point types in numpy have a limited numerical range. The exponential function exceeds this limit easily: np.exp then overflows to inf, and the division inf / inf returns nan.
To make the softmax function more numerically stable and avoid nan outputs, a simple method is to shift the input vector: multiply the numerator and denominator by a constant C (for example a number less than 1, which makes the exponentials smaller), which is equivalent to adding log C to every input. In theory any value of log C works, but it is usually set to the negative of the maximum input value, -max(x), so that the largest exponent is 0 and overflow cannot occur.
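Why multiplying by C leaves the result unchanged can be written out in one line. The derivation below is a sketch using x for the input vector and the common choice log C = -max_j x_j:

```latex
\mathrm{softmax}(x)_i
  = \frac{e^{x_i}}{\sum_j e^{x_j}}
  = \frac{C\, e^{x_i}}{C \sum_j e^{x_j}}
  = \frac{e^{x_i + \log C}}{\sum_j e^{x_j + \log C}},
\qquad \log C = -\max_j x_j
```

With this choice every exponent is at most 0, so each exponential lies in (0, 1], and the denominator is at least 1 because the largest input contributes e^0 = 1.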
In Python, the improved softmax function can be written like this:
def softmax(x):
    # Subtracting max(x) leaves the result unchanged but keeps every exponent <= 0
    exps = np.exp(x - np.max(x))
    return exps / np.sum(exps)
Cross entropy code:
def sigmoid(x):
    # Logistic function 1 / (1 + e^(-x)); the exponent must be negated
    return 1 / (1 + np.exp(-x))

def softmax(x):
    exps = np.exp(x - np.max(x))
    return exps / np.sum(exps)

def cross_entropy_error(p, y):
    # p: raw scores from the last layer (softmax is applied below)
    # y: one-hot label vector
    delta = 1e-7  # a tiny value keeps np.log(0) from producing negative infinity
    p = softmax(p)  # convert to a probability distribution, so sum(p) = 1
    return -np.sum(y * np.log(p + delta))
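The effect of the max-subtraction trick can be seen on large inputs. The sketch below redefines the functions so it runs on its own; the names `naive_softmax` and `stable_softmax` and the input values are illustrative assumptions:

```python
import numpy as np

def naive_softmax(x):
    exps = np.exp(x)
    return exps / np.sum(exps)

def stable_softmax(x):
    # Shift by the maximum so that the largest exponent is 0.
    exps = np.exp(x - np.max(x))
    return exps / np.sum(exps)

def cross_entropy_error(scores, y, delta=1e-7):
    # scores are raw last-layer outputs; softmax is applied here.
    p = stable_softmax(scores)
    return -np.sum(y * np.log(p + delta))

# Hypothetical large scores and a one-hot label for the last class.
scores = np.array([1000.0, 1001.0, 1002.0])
y = np.array([0.0, 0.0, 1.0])

# Silence the overflow warnings that the naive version triggers.
with np.errstate(over="ignore", invalid="ignore"):
    naive = naive_softmax(scores)   # exp overflows to inf, inf / inf is nan

stable = stable_softmax(scores)
loss = cross_entropy_error(scores, y)

print(naive)   # every entry is nan
print(stable)  # a valid distribution that sums to 1
print(loss)
```

The stable version gives the same distribution as it would for the scores [0, 1, 2], since only the differences between scores matter.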