Sigmoid vs. Softmax: how should each one be used for classification?
2020-11-08 16:17:00 【Spiritual】
When designing a model for a classification task (for example, classifying diseases on a chest X-ray or recognizing handwritten digits), sometimes several answers can be correct at once (such as pneumonia and abscess together), and sometimes only one answer is allowed (such as the digit "8"). This article discusses how to apply the Sigmoid function or the Softmax function to a classifier's raw output values.
There are many kinds of classification algorithms, but this article is limited to neural network classifiers. Classification problems can be solved by different kinds of neural networks, such as feedforward neural networks and convolutional neural networks.
Applying the Sigmoid function or the Softmax function
The final result of a neural network classifier is a vector of "raw output values", such as [-0.5, 1.2, -0.1, 2.4]. These four outputs correspond to pneumonia, cardiomegaly, tumor, and abscess on a chest X-ray. But what do these raw output values mean? They are easier to interpret once converted to probabilities. Compared with a seemingly arbitrary "2.4", the statement "the probability of abscess is 91%" is much easier for a patient to understand. Both the Sigmoid function and the Softmax function map a classifier's raw output values to probabilities.
[Figure: the raw outputs of a feedforward neural network (blue) mapped to probabilities (red) by the Sigmoid function]
[Figure: the same raw outputs mapped to probabilities by the Softmax function]
As the figures show, the Sigmoid function and the Softmax function give different results. The reason is that the Sigmoid function processes each raw output value separately, so its results are independent of one another and the probabilities do not necessarily sum to 1; in the figure, 0.37 + 0.77 + 0.48 + 0.91 = 2.53. In contrast, the outputs of the Softmax function are related to one another and the probabilities always sum to 1; in the figure, 0.04 + 0.21 + 0.05 + 0.70 = 1.00. Therefore, with the Softmax function, increasing the probability of one class necessarily decreases the probability of the other classes.
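The two mappings can be reproduced in a few lines. The following is a minimal Python sketch using only the standard library; the function and variable names (`sigmoid`, `softmax`, `raw`) are chosen here for illustration and do not come from the original article.

```python
import math

def sigmoid(z):
    # Sigmoid of a single raw output value: e^z / (1 + e^z).
    return math.exp(z) / (1.0 + math.exp(z))

def softmax(zs):
    # Softmax over the whole vector: every e^z is divided by one shared sum.
    exps = [math.exp(z) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

# Raw outputs from the article: pneumonia, cardiomegaly, tumor, abscess.
raw = [-0.5, 1.2, -0.1, 2.4]

sigmoid_probs = [sigmoid(z) for z in raw]
softmax_probs = softmax(raw)

print("sigmoid:", [round(p, 4) for p in sigmoid_probs],
      "sum =", round(sum(sigmoid_probs), 4))
print("softmax:", [round(p, 4) for p in softmax_probs],
      "sum =", round(sum(softmax_probs), 4))
```

The Sigmoid probabilities add up to roughly 2.5, while the Softmax probabilities add up to 1. Small differences in the last digit compared with the figure come from rounding.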
Applications of the Sigmoid function: chest X-rays and hospital admission
Chest X-rays: A single chest X-ray can show several diseases at once, so a chest X-ray classifier also needs to be able to report several findings at once. The figure in the original article shows a chest X-ray with both pneumonia and abscess; the label column on its right contains two "1"s.
Hospital admission: The goal is to use a patient's health record to estimate the likelihood of a future admission. The classification problem can therefore be framed as: classify the patient's existing health record according to which diagnosed diseases (if any) may lead to a future admission. Several diseases may lead to admission, so there may be more than one answer.
Diagram: The two feedforward neural networks in the original figure correspond to the two problems above. In the final step, the Sigmoid function processes the raw output values and produces the corresponding probabilities, allowing several possibilities to coexist, because a chest X-ray may show several abnormalities and an admission may have more than one cause.
Applications of the Softmax function: handwritten digits and Iris
Handwritten digits: When recognizing handwritten digits (MNIST dataset: https://en.wikipedia.org/wiki/MNIST_database), the classifier should use the Softmax function to decide which digit an image shows. After all, an 8 is just an 8; it cannot also be a 7.
Iris: The Iris dataset was introduced in 1936 (https://en.wikipedia.org/wiki/Iris_flower_data_set). It contains 150 samples divided into 3 classes (Iris setosa, Iris versicolor, and Iris virginica), with 50 samples per class. Each sample has 4 attributes: sepal length, sepal width, petal length, and petal width.
[Table: 9 examples taken from the Iris dataset]
The dataset contains no images, but here is a photo of Iris versicolor for you to enjoy: https://en.wikipedia.org/wiki/Iris_flower_data_set#/media/File:Iris_versicolor_3.jpg
A neural network classifier for the Iris dataset should apply the Softmax function to the raw output values, because an iris flower belongs to exactly one species; assigning it to several species makes no sense.
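As an illustration of the single-answer case, here is a small Python sketch that turns three raw output values into one predicted species. The raw values are hypothetical (they are not produced by a trained model), and the helper name `predict_species` is made up for this example.

```python
import math

SPECIES = ["Iris setosa", "Iris versicolor", "Iris virginica"]

def softmax(zs):
    # Divide each e^z by the sum over all classes so the results sum to 1.
    exps = [math.exp(z) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def predict_species(raw_outputs):
    # Exactly one species is returned: the one with the highest probability.
    probs = softmax(raw_outputs)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return SPECIES[best], probs[best]

# Hypothetical raw outputs for one flower (assumption, for illustration only).
species, prob = predict_species([0.3, 2.1, 1.0])
print(species, round(prob, 3))
```

Because the three probabilities share one denominator, picking the largest one always yields a single, unambiguous class.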
About "e"
To understand the Sigmoid and Softmax functions, we first need to introduce "e". For this article, it is enough to know that e is a mathematical constant approximately equal to 2.71828. Some other facts about e:
• The decimal expansion of e goes on forever, and its digits appear completely random, similar to pi.
• e is often used in the study of compound interest, gambling, and certain probability distributions.
• Here is one formula for e:
$$e = \sum_{n=0}^{\infty} \frac{1}{n!}$$
But there is more than one formula for e, and there are many ways to calculate it. For example: https://www.intmath.com/exponential-logarithmic-functions/calculating-e.php
• In 2004, Google's IPO filing aimed to raise $2,718,281,828, that is, "e billion dollars".
• Wikipedia lists the history of the known decimal digits of e (https://en.wikipedia.org/wiki/E_%28mathematical_constant%29#Bernoulli_trials), starting with 1 digit in 1690 and running up to 116,000 digits in 1978.
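To make "many ways to calculate it" concrete, the following Python sketch approximates e in two well-known ways: the factorial series and the compound-interest limit (1 + 1/n)^n. The function names `e_series` and `e_limit` are invented for this example.

```python
import math

def e_series(terms):
    # Partial sum of 1/0! + 1/1! + 1/2! + ... using the given number of terms.
    total = 0.0
    factorial = 1
    for n in range(terms):
        if n > 0:
            factorial *= n
        total += 1.0 / factorial
    return total

def e_limit(n):
    # Compound-interest form: (1 + 1/n)^n approaches e as n grows.
    return (1.0 + 1.0 / n) ** n

print("series, 20 terms:", e_series(20))
print("limit, n = 10**6:", e_limit(10**6))
print("math.e          :", math.e)
```

The series converges very quickly, while the limit form needs a large n to get even a few correct digits.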
The Sigmoid function and the Softmax function
Sigmoid = multi-label classification = more than one correct answer = non-exclusive outputs (for example, chest X-rays, hospital admission)
• When building a classifier for a problem that has more than one correct answer, apply the Sigmoid function to each raw output value separately.
• The Sigmoid function is shown below (note the e):
$$\sigma(z_j) = \frac{e^{z_j}}{1 + e^{z_j}}$$
In this formula, σ denotes the Sigmoid function, and σ(z_j) means applying the Sigmoid function to the number z_j. Here z_j is a single raw output value, such as -0.5, and j indexes the output currently being processed. With four raw output values, j = 1, 2, 3, or 4. In the earlier example the raw output values are [-0.5, 1.2, -0.1, 2.4], so z_1 = -0.5, z_2 = 1.2, z_3 = -0.1, and z_4 = 2.4. Therefore,
$$\sigma(z_1) = \sigma(-0.5) = \frac{e^{-0.5}}{1 + e^{-0.5}} \approx 0.377$$
(shown as 0.37 in the figure). z_2, z_3, and z_4 are computed in the same way.
Because the Sigmoid function is applied to each raw output value separately, the possible outcomes include: every class has a low probability (e.g. "this chest X-ray shows nothing abnormal"), one class has a high probability while the others are low (e.g. "the chest X-ray shows only pneumonia"), or several or all classes have a high probability (e.g. "the chest X-ray shows pneumonia and abscess").
[Figure: the Sigmoid function curve]
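The three outcomes described above can be sketched in Python as follows. The 0.5 decision threshold, the helper name `predict_labels`, and the three hypothetical raw output vectors are assumptions made for this illustration; the article itself does not specify a threshold.

```python
import math

LABELS = ["pneumonia", "cardiomegaly", "tumor", "abscess"]

def sigmoid(z):
    # Equivalent to e^z / (1 + e^z), written in the form 1 / (1 + e^-z).
    return 1.0 / (1.0 + math.exp(-z))

def predict_labels(raw_outputs, threshold=0.5):
    # Each label is decided on its own, so zero, one, or many can be returned.
    return [label for label, z in zip(LABELS, raw_outputs)
            if sigmoid(z) >= threshold]

# Hypothetical raw outputs for three different X-rays (assumed values).
nothing_wrong = predict_labels([-3.0, -2.5, -4.0, -3.2])
only_pneumonia = predict_labels([2.0, -2.5, -3.0, -1.8])
two_findings = predict_labels([2.2, -1.9, -2.7, 1.6])

print(nothing_wrong)
print(only_pneumonia)
print(two_findings)

# The article's own example vector, under the same assumed threshold.
print(predict_labels([-0.5, 1.2, -0.1, 2.4]))
```

In practice the threshold is a design choice and may differ per label; 0.5 is used here only because it corresponds to a raw output of 0.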
Softmax = multi-class classification = only one correct answer = mutually exclusive outputs (for example, handwritten digits, Iris)
• When building a classifier for a problem with only one correct answer, apply the Softmax function to the raw output values.
• The denominator of the Softmax function combines all of the raw output values, which means the probabilities it produces are related to one another.
• The Softmax function is expressed as follows, where K is the number of raw output values:
$$\mathrm{softmax}(z_j) = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}$$
Apart from the denominator, which adds up e^{z_k} over all of the raw output values, the Softmax function is not very different from the Sigmoid function. In other words, when computing Softmax for a single raw output value (for example z_1), you cannot use z_1 alone; z_1, z_2, z_3, and z_4 must all appear in the denominator, as shown below:
$$\mathrm{softmax}(z_1) = \frac{e^{z_1}}{e^{z_1} + e^{z_2} + e^{z_3} + e^{z_4}} = \frac{e^{-0.5}}{e^{-0.5} + e^{1.2} + e^{-0.1} + e^{2.4}} \approx 0.04$$
The advantage of the Softmax function is that the output probabilities always sum to 1:
$$\mathrm{softmax}(z_1) + \mathrm{softmax}(z_2) + \mathrm{softmax}(z_3) + \mathrm{softmax}(z_4) = 0.04 + 0.21 + 0.05 + 0.70 = 1.00$$
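The shared denominator can be made explicit in code. The sketch below implements the formula literally (`softmax_naive`) and also shows a variant that subtracts the largest raw value before exponentiating (`softmax_stable`). The subtraction is a common implementation trick that is not part of the original article: it leaves the result unchanged mathematically but avoids overflow for large inputs. Both function names are made up for this example.

```python
import math

def softmax_naive(zs):
    # Literal form of the formula: e^{z_j} divided by the sum of all e^{z_k}.
    exps = [math.exp(z) for z in zs]
    denominator = sum(exps)
    return [e / denominator for e in exps]

def softmax_stable(zs):
    # Same result, but shifting by max(zs) keeps every exponent <= 0.
    shift = max(zs)
    exps = [math.exp(z - shift) for z in zs]
    denominator = sum(exps)
    return [e / denominator for e in exps]

raw = [-0.5, 1.2, -0.1, 2.4]
naive = softmax_naive(raw)
stable = softmax_stable(raw)

print("naive :", [round(p, 4) for p in naive], "sum =", round(sum(naive), 4))
print("stable:", [round(p, 4) for p in stable], "sum =", round(sum(stable), 4))

# Large raw values would overflow math.exp in the naive version.
large = softmax_stable([1000.0, 1001.0, 1002.0])
print("large :", [round(p, 4) for p in large])
```

For the small values used in this article both versions agree; the shifted version only matters when raw outputs become very large.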
When recognizing handwritten digits with the Softmax function applied to the raw output values, increasing the probability that an example is classified as "8" necessarily decreases the probability that it is classified as any other digit (0, 1, 2, 3, 4, 5, 6, 7, and/or 9).
Summary:
• If the model's output classes are not mutually exclusive and several of them can be selected at once, use the Sigmoid function on the network's raw output values.
• If the model's output classes are mutually exclusive and only one of them can be selected, use the Softmax function on the network's raw output values.
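This coupling between classes is easy to check numerically. In the Python sketch below, the ten raw output values are hypothetical (one per digit 0 to 9, not taken from a real MNIST model); raising only the raw value for "8" lowers the probability of every other digit.

```python
import math

def softmax(zs):
    # Every probability shares the same denominator, so they are coupled.
    exps = [math.exp(z) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw outputs for the digits 0..9 (assumed values).
raw = [0.1, -1.0, 0.4, 0.9, -0.3, 0.2, 0.5, 0.0, 1.5, 0.7]
before = softmax(raw)

# Raise only the raw output for the digit 8.
boosted = list(raw)
boosted[8] += 1.0
after = softmax(boosted)

for digit in range(10):
    print(digit, round(before[digit], 4), "->", round(after[digit], 4))
```

With the Sigmoid function, the same change would affect only the probability for "8", because each output is processed on its own.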
Copyright notice
This article was written by [Spiritual]. Please include a link to the original when reposting. Thank you.