Sigmoid vs. Softmax: how should each be used for classification?
2020-11-08 16:17:00 【Spiritual】
When designing a model for a classification task (such as identifying diseases in a chest X-ray or recognizing handwritten digits), sometimes several answers must be selected at once (for example, both pneumonia and abscess), and sometimes only one answer is allowed (for example, the digit "8"). This article discusses how to apply the sigmoid function or the softmax function to the raw output values of a classifier.
There are many kinds of classification algorithms, but this article is limited to neural network classifiers. Classification problems can be solved with different kinds of neural networks, such as feedforward neural networks and convolutional neural networks. Before the sigmoid or softmax function is applied, the final result of a neural network classifier is a vector of "raw output values", such as [-0.5, 1.2, -0.1, 2.4]. In the chest X-ray example, these four outputs correspond to pneumonia, cardiomegaly (an enlarged heart), tumor, and abscess. But what do these raw output values mean? They are easier to interpret once they are converted to probabilities: compared with a seemingly arbitrary "2.4", a statement such as "the probability of diabetes is 91%" is much easier for a patient to understand. Both the sigmoid function and the softmax function map the raw output values of a classifier to probabilities. The following figure shows the raw outputs of a feedforward neural network (blue) being mapped to probabilities (red) by the sigmoid function:
The same process is then repeated with the softmax function:
As the figures show, the sigmoid function and the softmax function give different results. The reason is that the sigmoid function processes each raw output value separately, so its results are independent of one another and the probabilities do not necessarily sum to 1; in the figure, 0.37 + 0.77 + 0.48 + 0.91 = 2.53. In contrast, the outputs of the softmax function are tied to one another and the probabilities always sum to 1; in the figure, 0.04 + 0.21 + 0.05 + 0.70 = 1.00. With softmax, therefore, increasing the probability of one class necessarily decreases the probability of the other classes.
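The comparison above can be reproduced with a few lines of plain Python. This is only a minimal sketch: the function names `sigmoid` and `softmax` are chosen here for illustration, and the input is the example vector [-0.5, 1.2, -0.1, 2.4] from the text.

```python
import math

def sigmoid(z):
    # Squash a single raw output value into (0, 1), independently of the others.
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    # Normalize all raw output values together so the results sum to 1.
    exps = [math.exp(z) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

raw_outputs = [-0.5, 1.2, -0.1, 2.4]  # example vector from the article

sigmoid_probs = [sigmoid(z) for z in raw_outputs]
softmax_probs = softmax(raw_outputs)

print("sigmoid:", [round(p, 3) for p in sigmoid_probs], "sum =", round(sum(sigmoid_probs), 3))
print("softmax:", [round(p, 3) for p in softmax_probs], "sum =", round(sum(softmax_probs), 3))
```

The printed values may differ from the figures in the last digit, depending on how the figure values were rounded. The point is the contrast: the sigmoid outputs sum to well over 1, while the softmax outputs sum to 1.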
Applying the sigmoid function: chest X-rays and hospital admission as examples
Chest X-rays: A single chest X-ray can show several diseases at once, so a chest X-ray classifier also needs to be able to report several findings at once. Below is a chest X-ray showing pneumonia and abscess; the label column on the right contains two "1"s:
Hospital admission: The goal is to determine, from a patient's health record, how likely the patient is to be admitted to hospital in the future. The classification problem can therefore be framed as follows: classify the patient's existing health record according to the diagnoses (if any) that may lead to a future admission. Several diseases may lead to admission, so there may be more than one answer.
Diagrams: The following two feedforward neural networks correspond to the two problems above. In the final step, the sigmoid function converts the raw output values into probabilities and allows several possibilities to coexist, because a chest X-ray may show several abnormalities and an admission may have more than one cause.
Applying the softmax function: handwritten digits and Iris as examples
Handwritten digits: When classifying handwritten digits (the MNIST dataset: https://en.wikipedia.org/wiki/MNIST_database), the classifier should use the softmax function to indicate which digit an image shows. After all, a digit 8 is just a digit 8; it cannot be a digit 7 at the same time.
Iris: The Iris dataset was introduced in 1936 (https://en.wikipedia.org/wiki/Iris_flower_data_set). It contains 150 samples divided into 3 classes (Iris setosa, Iris versicolor, and Iris virginica), with 50 samples per class. Each sample has 4 attributes: sepal length, sepal width, petal length, and petal width. The following 9 examples are taken from the Iris dataset:
The dataset contains no images, but here is a picture of Iris versicolor (https://en.wikipedia.org/wiki/Iris_flower_data_set#/media/File:Iris_versicolor_3.jpg) for you to enjoy:
A neural network classifier for the Iris dataset should apply the softmax function to its raw output values, because an iris flower can belong to only one species; assigning it to several species at once makes no sense.
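As a rough illustration of the single-answer case, the sketch below turns three raw output values into one predicted species. The raw values are hypothetical (they do not come from a trained network), and `predict_species` is a name invented for this example.

```python
import math

SPECIES = ["setosa", "versicolor", "virginica"]

def softmax(zs):
    # Subtracting the maximum does not change the result but avoids overflow.
    shift = max(zs)
    exps = [math.exp(z - shift) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def predict_species(raw_outputs):
    # Exactly one species is returned: the one with the highest probability.
    probs = softmax(raw_outputs)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return SPECIES[best], probs

# Hypothetical raw output values for one flower (assumption, not real model output).
hypothetical_outputs = [0.3, 2.1, -1.0]
species, probs = predict_species(hypothetical_outputs)
print(species, [round(p, 3) for p in probs])
```

Because the three probabilities sum to 1, picking the largest one always yields a single, unambiguous species.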
About "e"
To understand the sigmoid and softmax functions, we first need to introduce "e". For this article, it is enough to know that e is a mathematical constant approximately equal to 2.71828. Some other facts about e:
• The decimal expansion of e goes on forever, and its digits appear completely random, much like pi.
• e often appears in the study of compound interest, gambling, and certain probability distributions.
• Here is one formula for e:
However, there is more than one formula for e, and there are many ways to calculate it. For example: https://www.intmath.com/exponential-logarithmic-functions/calculating-e.php
• In 2004, Google's IPO filing aimed to raise 2,718,281,828 dollars, that is, "e billion dollars".
• Wikipedia records how many decimal digits of e were known over time (https://en.wikipedia.org/wiki/E_%28mathematical_constant%29#Bernoulli_trials), starting with one digit in 1690 and reaching 116,000 digits in 1978:
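To make "many ways to calculate it" concrete, here is a small sketch of two standard approximations of e: the series 1/0! + 1/1! + 1/2! + ... and the compound-interest limit (1 + 1/n)^n. These are well-known formulas, not necessarily the one pictured in the original article, and the function names are chosen for this example.

```python
import math

def e_from_series(terms):
    # Sum 1/0! + 1/1! + ... + 1/(terms-1)!; converges very quickly.
    total = 0.0
    factorial = 1.0
    for n in range(terms):
        if n > 0:
            factorial *= n
        total += 1.0 / factorial
    return total

def e_from_limit(n):
    # Compound-interest form (1 + 1/n)^n; converges slowly as n grows.
    return (1.0 + 1.0 / n) ** n

series_estimate = e_from_series(15)
limit_estimate = e_from_limit(1_000_000)

print(series_estimate, limit_estimate, math.e)
```

The series reaches double precision within a couple of dozen terms, whereas the limit form is still off in the sixth decimal place at n = 1,000,000.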
The sigmoid and softmax functions
Sigmoid = multi-label classification = more than one correct answer = non-exclusive outputs (for example, chest X-rays and hospital admission)
• When building a classifier for a problem that has more than one correct answer, apply the sigmoid function to each raw output value separately.
• The sigmoid function is shown below (note the e):
$$\sigma(z_j) = \frac{e^{z_j}}{1 + e^{z_j}}$$
In this formula, σ denotes the sigmoid function, and σ(z_j) means that the sigmoid function is applied to the number z_j. "z_j" denotes a single raw output value, such as -0.5, and j indicates which output value is currently being processed. With four raw output values, j = 1, 2, 3, or 4. In the earlier example the raw output values are [-0.5, 1.2, -0.1, 2.4], so z_1 = -0.5, z_2 = 1.2, z_3 = -0.1, and z_4 = 2.4. Therefore,
$$\sigma(z_1) = \sigma(-0.5) = \frac{e^{-0.5}}{1 + e^{-0.5}} \approx 0.378$$
The calculations for z_2, z_3, and z_4 follow the same pattern. Because the sigmoid function is applied to each raw output value separately, the possible outcomes include: all classes have very low probabilities (for example, "this chest X-ray shows no abnormality"); one class has a high probability while the others are very low (for example, "this chest X-ray shows only pneumonia"); or several or all classes have high probabilities (for example, "this chest X-ray shows pneumonia and abscess"). The following figure shows the sigmoid function curve:
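The three scenarios described above can be sketched in code by applying the sigmoid to each output and keeping every label whose probability clears a cutoff. The 0.5 threshold, the helper name `predict_labels`, and the second and third input vectors are assumptions made for illustration; the label order follows the chest X-ray example.

```python
import math

# Label order follows the chest X-ray example in the article.
LABELS = ["pneumonia", "cardiomegaly", "tumor", "abscess"]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_labels(raw_outputs, threshold=0.5):
    # Each label is decided on its own; zero, one, or many labels may pass.
    # The 0.5 threshold is an assumption, not something the article specifies.
    probs = [sigmoid(z) for z in raw_outputs]
    return [label for label, p in zip(LABELS, probs) if p >= threshold]

several = predict_labels([-0.5, 1.2, -0.1, 2.4])     # article's example vector
nothing = predict_labels([-3.0, -2.5, -4.0, -3.2])   # hypothetical: all low
only_one = predict_labels([2.0, -2.5, -4.0, -1.5])   # hypothetical: one high

print(several, nothing, only_one)
```

Since no label competes with any other, the returned list can be empty, contain a single finding, or contain several findings at once.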
Softmax = multi-class classification = only one correct answer = mutually exclusive outputs (for example, handwritten digits and Iris)
• When building a classifier for a problem with only one correct answer, apply the softmax function to the raw output values.
• The denominator of the softmax function combines all of the raw output values, which means that the probabilities produced by softmax are related to one another.
• The softmax function is written as follows, where K is the number of raw output values:
$$\mathrm{softmax}(z_j) = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}$$
Apart from the denominator, which combines all of the raw output values by summing e raised to each of them, the softmax function is not very different from the sigmoid function. In other words, when computing softmax for a single raw output value (for example z_1), it is not enough to consider z_1 alone; z_1, z_2, z_3, and z_4 all enter the denominator, as shown below:
$$\mathrm{softmax}(z_1) = \frac{e^{z_1}}{e^{z_1} + e^{z_2} + e^{z_3} + e^{z_4}} = \frac{e^{-0.5}}{e^{-0.5} + e^{1.2} + e^{-0.1} + e^{2.4}} \approx 0.04$$
The advantage of the softmax function is that all of the output probabilities sum to 1:
$$0.04 + 0.21 + 0.05 + 0.70 = 1.00$$
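The sum-to-1 property, and the way the classes compete, can be checked numerically. The sketch below uses a common variant that subtracts the largest raw value before exponentiating; this is an implementation choice made here for numerical safety, not something the article prescribes, and it gives the same result as the plain formula. The `boosted` and `large` inputs are hypothetical.

```python
import math

def softmax(zs):
    # Subtracting max(zs) leaves the ratios unchanged but prevents overflow
    # in math.exp for very large raw output values.
    shift = max(zs)
    exps = [math.exp(z - shift) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([-0.5, 1.2, -0.1, 2.4])     # article's example vector
boosted = softmax([-0.5, 1.2, -0.1, 4.0])   # hypothetical: last value raised
large = softmax([1000.0, 1001.0, 1002.0])   # would overflow without the shift

print(round(sum(probs), 6), round(sum(boosted), 6), round(sum(large), 6))
print([round(p, 3) for p in probs])
print([round(p, 3) for p in boosted])
```

Raising only the last raw value increases its probability and lowers the other three, while the total stays at 1.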
When classifying handwritten digits, the raw output values are processed with the softmax function. Increasing the probability that an example is classified as "8" necessarily decreases the probability that it is classified as one of the other digits (0, 1, 2, 3, 4, 5, 6, 7, and/or 9).
Other examples of sigmoid and softmax
Summary:
• If the model's output classes are not mutually exclusive and several classes can be selected at the same time, use the sigmoid function on the network's raw output values.
• If the model's output classes are mutually exclusive and only one class can be selected, use the softmax function on the network's raw output values.
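The rule of thumb in the summary can be written as a single helper. This is a minimal sketch; the function `to_probabilities` and its `mutually_exclusive` flag are hypothetical names introduced for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    shift = max(zs)  # shift for numerical stability; result is unchanged
    exps = [math.exp(z - shift) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def to_probabilities(raw_outputs, mutually_exclusive):
    # Mutually exclusive classes (one answer): softmax over all outputs.
    # Non-exclusive classes (several answers): sigmoid on each output.
    if mutually_exclusive:
        return softmax(raw_outputs)
    return [sigmoid(z) for z in raw_outputs]

raw = [-0.5, 1.2, -0.1, 2.4]
multi_label = to_probabilities(raw, mutually_exclusive=False)  # e.g. chest X-ray
multi_class = to_probabilities(raw, mutually_exclusive=True)   # e.g. digits, Iris
print([round(p, 3) for p in multi_label])
print([round(p, 3) for p in multi_class])
```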
Copyright notice
This article was written by [Spiritual]. Please include a link to the original when reposting. Thank you.