Why can cross entropy loss be used to characterize loss
2022-07-27 06:57:00 【Mr_ health】
This post sketches why cross entropy loss can be used as a training cost. It only outlines the main ideas and does not explain them in detail.
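1. Information content -> information entropy. An event with probability p carries information content $-\log p$. Information entropy is the expected information content over a distribution: $H(p) = -\sum_x p(x) \log p(x)$.
Only one distribution is involved here.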
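As a minimal sketch of step 1, the two quantities can be computed directly. The function names `self_information` and `entropy` and the coin distributions are illustrative choices, not part of the original post; natural logarithms are assumed, so values are in nats.

```python
import math

def self_information(p):
    """Information content -log(p) of an event with probability p, in nats."""
    return -math.log(p)

def entropy(dist):
    """Expected information content: H(p) = -sum_x p(x) * log p(x)."""
    # Terms with p(x) = 0 contribute nothing, so they are skipped.
    return sum(p * self_information(p) for p in dist if p > 0)

# Hypothetical example distributions.
fair_coin = [0.5, 0.5]
biased_coin = [0.9, 0.1]

print(entropy(fair_coin))    # equals log(2), about 0.693
print(entropy(biased_coin))  # smaller, because the outcome is more predictable
```

A rare event carries more information than a common one, and entropy averages that over the whole distribution.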
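2. Next, introduce the KL divergence. Two distributions are now involved: p denotes the true distribution and q denotes the predicted distribution.
$D_{KL}(p \| q) = \sum_x p(x) \log \frac{p(x)}{q(x)}$
The KL divergence measures how far q is from p. It is non-negative and equals zero only when the two distributions coincide, but it is not symmetric, so it is not a distance in the strict sense.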
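The properties of the KL divergence in step 2 can be checked numerically. This is a small sketch under the assumption that both distributions are discrete lists over the same outcomes and that q(x) > 0 wherever p(x) > 0; the function name `kl_divergence` and the sample values are illustrative.

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) = sum_x p(x) * log(p(x) / q(x)), in nats.

    Assumes q(x) > 0 wherever p(x) > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical example distributions over three outcomes.
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

print(kl_divergence(p, p))  # zero: a distribution has no divergence from itself
print(kl_divergence(p, q))  # positive
print(kl_divergence(q, p))  # also positive, but a different value (asymmetry)
```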
3. In machine learning and deep learning, we want the distribution learned by the model, Pmodel, to be as close as possible to the real distribution of the data, Preal.
Preal is unknown, so the distribution of the training data, Ptraining, stands in for it. In terms of the KL divergence, it is then enough to minimize the KL divergence between Ptraining and Pmodel.
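Next, rewrite the KL divergence as follows, where p stands for Ptraining and q stands for the distribution learned by the model, Pmodel.
$D_{KL}(p \| q) = \sum_x p(x) \log \frac{p(x)}{q(x)} = \sum_x p(x) \log p(x) - \sum_x p(x) \log q(x)$
The first term is the negative entropy of p, and the second term is the cross entropy of p and q.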
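In machine (deep) learning, the distribution of the training data p(x) is fixed, so the entropy term $-\sum_x p(x) \log p(x)$ is a constant that does not depend on the model. Minimizing the KL divergence is therefore equivalent to minimizing the cross entropy $H(p, q) = -\sum_x p(x) \log q(x)$: the two objectives differ only by that constant and have the same minimizer.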
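The argument above can be illustrated with a short numerical sketch. The training distribution and the three candidate model distributions below are made-up values, and the helper names are illustrative. For every candidate, the cross entropy minus the KL divergence gives the same constant, the entropy of the training distribution, so both objectives rank the candidates identically.

```python
import math

def entropy(p):
    """H(p) = -sum_x p(x) * log p(x)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """H(p, q) = -sum_x p(x) * log q(x)."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """D_KL(p || q) = sum_x p(x) * log(p(x) / q(x))."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical fixed training distribution and candidate model distributions.
p_training = [0.7, 0.2, 0.1]
candidates = {
    "a": [0.5, 0.3, 0.2],
    "b": [0.6, 0.3, 0.1],
    "c": [0.7, 0.2, 0.1],
}

for name, q in candidates.items():
    ce = cross_entropy(p_training, q)
    kl = kl_divergence(p_training, q)
    # ce - kl is the same for every candidate: it is the entropy of p_training.
    print(name, round(ce, 4), round(kl, 4), round(ce - kl, 4))

best_by_ce = min(candidates, key=lambda n: cross_entropy(p_training, candidates[n]))
best_by_kl = min(candidates, key=lambda n: kl_divergence(p_training, candidates[n]))
print(best_by_ce, best_by_kl)  # the same candidate wins under both criteria
```

When the target p is a one-hot label, its entropy is zero and the cross entropy reduces to the negative log of the probability the model assigns to the correct class, which is the form usually seen in classification losses.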