Entropy, Information Entropy, and Cross Entropy
2022-07-06 23:28:00 【TranSad】
In information theory, we often use entropy to express the disorder and uncertainty of information. The greater the entropy, the more uncertain the information is.
The formula for entropy is:

H(X) = -∑ p(x) × log(p(x))

(Note: log is taken base 2 by default.)
Taking this formula apart is simple: a minus sign, a p(x), and a log(p(x)). The probability of an event lies between 0 and 1, and feeding such a value into the log function always yields a result that is at most 0 (the log curve is negative on (0, 1)), so a minus sign is added outside to make the entropy the familiar non-negative quantity.
When the probability tends to 0 or 1 (i.e., certainty is high), p(x) or log(p(x)) tends to 0 and the entropy is small; when the probability tends to 1/2 (i.e., uncertainty is high), neither p(x) nor log(p(x)) tends to 0 and the entropy is large.
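To make this concrete, here is a minimal Python sketch (the helper name `binary_entropy` is my own) that evaluates the entropy of a two-outcome distribution at a few probabilities, showing that it peaks at p = 1/2 and falls toward 0 at the extremes:

```python
import math

def binary_entropy(p: float) -> float:
    """Entropy in bits of a two-outcome distribution (p, 1 - p)."""
    if p == 0.0 or p == 1.0:
        return 0.0  # 0 * log(0) is taken as 0 by convention
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.01, 0.25, 0.5, 0.75, 0.99):
    print(f"p = {p:<4}: H = {binary_entropy(p):.4f} bits")
# H peaks at 1 bit when p = 0.5 and approaches 0 near p = 0 or p = 1
```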
For instance, suppose there are 4 balls (a quick sketch verifying these numbers follows the list).
1. Suppose all four are black balls (minimum uncertainty). The entropy is: -1 × log(1) = 0
2. Suppose 2 black balls and 2 white balls (some uncertainty). The entropy is: -0.5 × log(0.5) - 0.5 × log(0.5) = 1
3. Suppose 1 black ball, 1 white ball, 1 yellow ball, and 1 blue ball (maximum uncertainty). The entropy is: -0.25 × log(0.25) × 4 = 2
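These three scenarios can be checked with a few lines of Python; this is a direct transcription of the entropy formula above, using log base 2 as the note assumes (the `entropy` helper is my own):

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p)), skipping zero-probability terms."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

print(entropy([1.0]))                     # 4 black balls      -> 0.0
print(entropy([0.5, 0.5]))                # 2 black + 2 white  -> 1.0
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 4 different colors -> 2.0
```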
Evaluating classification results
Since entropy measures the uncertainty of information, we can use it to evaluate the results of a classification task. For example, suppose we have four pictures: two cat pictures (Cat 1, Cat 2) and two dog pictures (Dog 1, Dog 2). We feed them into a classifier; this is obviously a binary classification problem. Suppose two models produce the following groupings:
Model 1: (Cat 1, Cat 2) / (Dog 1, Dog 2)
Model 2: (Cat 1, Dog 2) / (Cat 2, Dog 1)
Clearly, if we plug Model 1's groupings into the entropy formula, we get an entropy of 0, i.e., zero uncertainty: each group contains a single class, so every picture is classified correctly. For Model 2's groupings, the entropy is large, indicating that the model classifies poorly.
Of course, this is just a simple example of using entropy to evaluate classification; it only applies to unsupervised tasks.
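As a sketch of this idea (the list-of-groups representation and the helper name are my own assumptions), we can compute the entropy of the true-label distribution inside each predicted group; pure groups score 0 bits and fully mixed groups score 1 bit:

```python
import math
from collections import Counter

def group_entropy(labels):
    """Entropy in bits of the true-label distribution within one predicted group."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total) for c in Counter(labels).values())

# Model 1's groups are pure; Model 2's groups mix the two classes
model_1 = [["cat", "cat"], ["dog", "dog"]]
model_2 = [["cat", "dog"], ["cat", "dog"]]

for name, groups in (("Model 1", model_1), ("Model 2", model_2)):
    avg = sum(group_entropy(g) for g in groups) / len(groups)
    print(f"{name}: average group entropy = {avg} bits")  # Model 1 -> 0.0, Model 2 -> 1.0
```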
Cross entropy
More often, our classification task is labeled; for such supervised learning we use cross entropy to evaluate the results. The idea behind cross entropy differs from the above: it works sample by sample, computing the distance between each sample's predicted output and its expected output (this distance is what we call the cross entropy). The formula is:

H(p, q) = -∑ p(x) × log(q(x))

where p is the expected output (a probability distribution), q is the actual output, and H(p, q) is the cross entropy.
Take the cat-vs-dog task: suppose we encode cat as (1, 0) and dog as (0, 1). If the model classifies a cat as a dog, then p = (1, 0), q = (0, 1), and H(p, q) = -(1 × log(0) + 0 × log(1)) = ∞. Getting infinity here is not surprising: the prediction is exactly opposite to the label, so the computed "distance" is as large as possible. More commonly, the model outputs something like q = (0.1, 0.9), in which case the cross entropy is finite but still large.
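Here is a small sketch of this computation (the helper name and the `eps` clipping are my own; numerical libraries clip similarly so that log(0) never occurs, turning the mathematical ∞ into a large finite number):

```python
import math

def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum(p_i * log2(q_i)); q is clipped below by eps to avoid log(0)."""
    return sum(-pi * math.log2(max(qi, eps)) for pi, qi in zip(p, q))

cat = (1, 0)  # expected (one-hot) output for a cat picture
print(cross_entropy(cat, (0, 1)))      # exactly opposite:  ~39.86 (infinite without clipping)
print(cross_entropy(cat, (0.1, 0.9)))  # confidently wrong: ~3.32
print(cross_entropy(cat, (0.9, 0.1)))  # mostly right:      ~0.15
```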
Solving the weighing-count problem with information entropy
Information entropy is very useful. For example, there is a classic puzzle: given n balls, exactly one of which weighs differently from the rest (it is heavier), at least how many weighings on a balance scale are needed to find that ball?
Without information theory, the first method that comes to mind is something like binary search: "split the balls into two halves, keep the heavier half, split again..." and so on. But by computing information entropy, we can set aside the concrete weighing procedure and obtain the final answer directly from a "God's-eye view"; this is the magic of applying information entropy.
How do we solve it? There are n balls, each of which is the heavier one with probability 1/n, so the total amount of information is:
H(x) = n × (-(1/n) × log(1/n)) = log(n)
Here we describe the result in terms of the "amount of information", another quantity in information theory closely tied to entropy; looking at the formula, the way it is computed is easy to understand and clearly related to the entropy formula. (The more direct formula for the information content of a single event is I = log2(1/p).)
Each weighing has three possible outcomes: the left side is heavier, the right side is heavier, or both sides weigh the same. So the amount of information a single weighing can eliminate is:
H(y) = 3 × (-(1/3) × log(1/3)) = log(3)
Therefore the minimum number of weighings required is H(x)/H(y) = log(n)/log(3), rounded up, since we cannot weigh a fractional number of times.
The above is just a very simple example. Sometimes we do not know whether the odd ball is heavier or lighter; our overall uncertainty then increases, i.e., the total amount of information H(x) changes. The idea behind the addition: finding the odd ball among n requires log(n) of information, and additionally judging whether it is heavy or light requires log(2) of information, so the total is H(x) = log(2) + log(n) = log(2n).
In this case the minimum number of weighings required is H(x)/H(y) = log(2n)/log(3), again rounded up.
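Both counts can be evaluated directly; this sketch simply applies the formulas above with a ceiling, since weighings come in whole steps (the function name is my own):

```python
import math

def min_weighings(n: int, direction_unknown: bool = False) -> int:
    """Evaluate log(outcomes)/log(3), rounded up to a whole number of weighings."""
    outcomes = 2 * n if direction_unknown else n  # unknown heavy/light doubles the cases
    return math.ceil(math.log(outcomes) / math.log(3))

print(min_weighings(8))                          # heavier ball among 8   -> 2
print(min_weighings(12, direction_unknown=True)) # classic 12-ball puzzle -> 3
```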
This article has walked through the concepts and uses of entropy, information entropy, and cross entropy, and finished with a simple extension: using information content from information theory to solve the balance-weighing puzzle. I originally wanted to write about the Monty Hall problem as well (it can also be viewed through the lens of information entropy), but that felt like drifting off topic... so that's it.