Entropy, Information Entropy, and Cross-Entropy
2022-07-06 23:28:00 【TranSad】
In information theory, entropy expresses the degree of disorder and uncertainty in information: the greater the entropy, the more uncertain the information.
The formula for entropy is as follows (note: throughout this article, log is base 2 by default):

H(X) = -∑_x p(x) * log(p(x))
Taking this formula apart is actually very simple: a minus sign, a p(x), and a log(p(x)). The probability of an event lies between 0 and 1, and feeding such a value into the log function always yields a result less than or equal to 0, so the minus sign in front gives us the familiar non-negative entropy.
When a probability tends to 0 or 1 (that is, certainty is very strong), the term p(x) * log(p(x)) tends to 0 and entropy is small; when the probabilities tend toward 1/2 (that is, uncertainty is very strong), those terms do not tend to 0 and entropy is large.
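To make this concrete, here is a minimal Python check (my own illustration, not from the original article) of the behavior just described: the entropy of a two-outcome distribution is close to 0 when p is near 0 or 1, and largest at p = 0.5.

```python
# A quick numeric check, using only the standard library.
import math

def binary_entropy(p: float) -> float:
    """Entropy (in bits) of a two-outcome distribution (p, 1 - p)."""
    if p in (0.0, 1.0):  # by convention, 0 * log(0) counts as 0
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.001, 0.1, 0.5, 0.9, 0.999):
    print(f"p = {p:<5}  H = {binary_entropy(p):.4f} bits")
# p = 0.001  H = 0.0114 bits  <- near-certain, entropy close to 0
# p = 0.5    H = 1.0000 bits  <- maximal uncertainty
```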
For example, suppose there are 4 balls (the three cases below are checked in code after the list).
1. Suppose all are black balls (minimum uncertainty). The entropy is: -1 * log(1) = 0
2. Suppose 2 black balls and 2 white balls (some uncertainty). The entropy is: -0.5 * log(0.5) - 0.5 * log(0.5) = 1
3. Suppose 1 black ball, 1 white ball, 1 yellow ball, and 1 blue ball (high uncertainty). The entropy is: -0.25 * log(0.25) - 0.25 * log(0.25) - 0.25 * log(0.25) - 0.25 * log(0.25) = 2
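The three ball examples can be reproduced with a short script (a sketch of my own, using log base 2 as stated above, so the results are in bits):

```python
# Entropy of the three ball distributions above.
import math

def entropy(probs):
    """Shannon entropy H = -sum(p * log2(p)); terms with p = 0 or 1 contribute 0."""
    return -sum(p * math.log2(p) for p in probs if 0 < p < 1)

print(entropy([1.0]))                     # all black           -> 0
print(entropy([0.5, 0.5]))                # 2 black, 2 white    -> 1.0
print(entropy([0.25, 0.25, 0.25, 0.25])) # 4 different colors  -> 2.0
```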
Evaluating classification results
Since entropy measures the uncertainty of information, we can use it to evaluate the results of a classification task. For example, suppose we have four pictures, two of cats (Cat 1, Cat 2) and two of dogs (Dog 1, Dog 2), and we feed them into a classifier. This is clearly a binary classification problem. Suppose two models produce the following groupings:
Model 1: (Cat 1, Cat 2) / (Dog 1, Dog 2).
Model 2: (Cat 1, Dog 2) / (Cat 2, Dog 1).
Clearly, plugging Model 1's groupings into the entropy formula gives an entropy of 0, that is, zero uncertainty: every picture is grouped correctly. For Model 2's groupings, the entropy comes out large, showing that the model classifies poorly.
Of course, the above is just a simple example of using entropy to evaluate a classification task, and it only applies to unsupervised settings.
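As a sketch of what this entropy-based evaluation might look like in code (the helper names and the size-weighted averaging scheme are my own choices, not from the original article):

```python
# Entropy-based evaluation of a grouping: 0 bits means every group is pure.
import math
from collections import Counter

def cluster_entropy(labels):
    """Entropy (bits) of the label distribution inside one group."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c < total)

def mean_cluster_entropy(groups):
    """Size-weighted average entropy over all groups."""
    n = sum(len(g) for g in groups)
    return sum(len(g) / n * cluster_entropy(g) for g in groups)

model_1 = [["cat", "cat"], ["dog", "dog"]]  # (Cat 1, Cat 2) / (Dog 1, Dog 2)
model_2 = [["cat", "dog"], ["cat", "dog"]]  # (Cat 1, Dog 2) / (Cat 2, Dog 1)
print(mean_cluster_entropy(model_1))  # 0.0 -> perfectly separated
print(mean_cluster_entropy(model_2))  # 1.0 -> maximally mixed
```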
Cross entropy
More often, our classification tasks are labeled. For such supervised learning we evaluate with cross-entropy. The idea behind cross-entropy differs from the above: it starts from each individual sample and computes the distance between that sample's predicted value and its expected value (this distance is what we call the cross-entropy). The formula is:

H(p, q) = -∑_x p(x) * log(q(x))

where p is the expected output (a probability distribution), q is the actual output, and H(p, q) is the cross-entropy.
Take the cat-and-dog task: encode a cat as (1, 0) and a dog as (0, 1). If the model classifies a cat as a dog, then p = (1, 0), q = (0, 1), and H(p, q) = -(1 * log(0) + 0 * log(1)) = ∞. Getting infinity here is not surprising: the prediction is the exact opposite of the target, so the computed "distance" is as large as possible. More often the actual output is something like q = (0.1, 0.9), and the cross-entropy is then a large but finite value.
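A minimal sketch of this calculation (my own illustration; the eps clipping is an addition to keep log(0) finite, much as real frameworks do in practice):

```python
# Cross-entropy for the cat/dog example; a cat is encoded as (1, 0).
import math

def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum(p_i * log2(q_i)); q is clipped away from 0."""
    return -sum(pi * math.log2(max(qi, eps)) for pi, qi in zip(p, q))

cat = (1, 0)  # expected output for a cat picture
print(cross_entropy(cat, (0.0, 1.0)))  # ~39.9: "infinite" up to clipping
print(cross_entropy(cat, (0.1, 0.9)))  # ~3.32: large but finite
print(cross_entropy(cat, (0.9, 0.1)))  # ~0.15: a good prediction
```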
Using information entropy to solve the weighing problem
Information entropy is genuinely useful. For example, consider this classic puzzle: there are n balls, exactly one of which differs in weight from the rest (it is heavier). Using a balance scale, what is the minimum number of weighings needed to find that ball?
Without having studied information theory, the first method that comes to mind is bisection: divide the balls into two halves, keep the heavier half, bisect again, and so on. But by computing information entropy, we can bypass the actual weighing procedure and obtain the final answer directly from a "God's-eye view". That is the beauty of applying information entropy.
How do we solve it? There are n balls, and each is equally likely to be the heavier one, with probability 1/n, so the total amount of information is:

H(x) = n * (-1/n) * log(1/n) = log(n)
Here we describe the result in terms of the "amount of information", another quantity in information theory closely related to entropy; as the formula shows, it is computed in essentially the same way as entropy. (The more direct formula for information content is I = log2(1/p).)
Each weighing has three possible outcomes: the left side is heavier, the right side is heavier, or the two sides balance. So the amount of information one weighing can eliminate is:

H(y) = 3 * (-1/3) * log(1/3) = log(3)
Therefore, the minimum number of weighings required is H(x)/H(y) = log(n)/log(3), rounded up to a whole number.
The above is just a very simple case. Sometimes we do not know whether the odd ball is heavier or lighter; our overall uncertainty then increases, that is, the total amount of information H(x) changes. The reasoning: finding the odd ball requires log(n) of information, and determining whether it is heavy or light requires a further log(2), so the total is H(x) = log(2) + log(n) = log(2n).
In this case, the minimum number of weighings required is H(x)/H(y) = log(2n)/log(3).
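Both bounds can be computed in a few lines (a sketch of my own; the integer loop is just an exact way of rounding log(outcomes)/log(3) up to a whole number of weighings):

```python
# Lower bound on balance weighings needed to find the odd ball among n.
def min_weighings(n: int, heavier_known: bool = True) -> int:
    # Total information to acquire: log(n) if we know the odd ball is heavier,
    # log(2n) if it could be heavier or lighter; each weighing yields log(3).
    outcomes = n if heavier_known else 2 * n
    k = 0
    while 3 ** k < outcomes:  # smallest k with 3^k >= outcomes
        k += 1
    return k

print(min_weighings(9))                        # log(9)/log(3)  -> 2
print(min_weighings(12, heavier_known=False))  # log(24)/log(3) -> 3
```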
This article has gone over the concepts and uses of entropy, information entropy, and cross-entropy, and finally extended them briefly to solving the balance-weighing problem with information theory. I originally wanted to cover the Monty Hall problem too (it can also be viewed through the lens of information entropy), but that felt like straying off topic... so that's it.