A Summary of the Types and Uses of Activation Functions in Deep Learning

2022-08-03 13:45:00 vcsir

Introduction

An activation function is defined as follows: it takes the weighted sum of a neuron's inputs, adds a bias to it, and decides whether the neuron should be activated. Its purpose is to introduce nonlinearity into the output of a neuron. A neural network without activation functions is essentially just a linear regression model, because it is the activation functions that apply nonlinear computations to the network's input, enabling it to learn and perform more complex tasks. Understanding the derivatives and applications of activation functions, and the advantages and disadvantages of each one, is therefore vital for choosing an activation function that gives a specific neural network model the nonlinearity and accuracy it needs.
We know that the neurons in a neural network work according to their weights, biases, and activation functions. The weights and biases of the neurons are adjusted according to the error in the network's output. This process is called backpropagation. Activation functions make backpropagation possible because their gradients are supplied along with the error to update the weights and biases.

Why do we need it?

Nonlinear activation functions: without an activation function, a neural network is just a linear regression model. The activation function transforms the input in a nonlinear way, allowing the network to learn and carry out more complex tasks.
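
The claim above can be checked numerically. The following is a minimal sketch, not taken from the original article: the layer sizes, the random weights, and the names `W1`, `b1`, `W2`, `b2` are all assumptions chosen for illustration. It shows that two stacked layers with no activation function compute exactly the same thing as one linear layer.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Two "layers" with no activation function (sizes are arbitrary assumptions)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
x = rng.normal(size=3)

two_layer = W2 @ (W1 @ x + b1) + b2

# The same mapping written as a single linear layer
W, b = W2 @ W1, W2 @ b1 + b2
one_layer = W @ x + b

print(np.allclose(two_layer, one_layer))  # True: the stack collapses to one layer

# With a nonlinearity (tanh here) between the layers, the collapse no longer holds
with_activation = W2 @ np.tanh(W1 @ x + b1) + b2
print(np.allclose(with_activation, one_layer))
```

However many purely linear layers are stacked, the result can always be rewritten as a single `W @ x + b`, which is why a nonlinearity between layers is needed.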

Types of activation functions

1) Linear activation function

• Equation: the linear function is y = ax, which closely resembles the equation of a straight line.

• Range: -inf to +inf

• Application: the linear activation function is used in only one place, the output layer.

• Problem: if we differentiate a linear function, the result no longer depends on the input "x". The derivative is a constant, so the function introduces no nonlinearity and adds no new behavior to our algorithm.
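
As a small illustration of the problem described in the last bullet, the sketch below evaluates y = ax and its derivative at a few inputs. The slope `a = 2.0` and the helper name `linear` are assumptions made for this example only.

```python
import numpy as np

def linear(x, a=2.0):
    """Linear activation y = a * x and its derivative dy/dx = a."""
    y = a * x
    dy = np.full_like(x, a)  # the slope is the same for every input
    return y, dy

x = np.array([-3.0, 0.0, 5.0])
y, dy = linear(x)
print(y)   # outputs scale with x
print(dy)  # the derivative is 2 everywhere, independent of x
```

Because the gradient never changes with the input, backpropagation gets the same signal for every sample.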

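2) Sigmoid activation function:

• This is a function whose graph is shaped like an "S".

• Formula: A = 1/(1 + e^(-x))

• Nonlinear. For X values between -2 and 2, the Y values are very steep, which means that small changes in x lead to large changes in Y.

• Range: 0 to 1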
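
To make the steep region concrete, here is a short sketch that evaluates the formula at a few points. The sample points are arbitrary choices for illustration.

```python
import numpy as np

def sigmoid(x):
    # A = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
y = sigmoid(x)
for xi, yi in zip(x, y):
    print(f"x = {xi:6.1f}  sigmoid(x) = {yi:.4f}")
```

Most of the change in the output happens between x = -2 and x = 2, where it rises from about 0.12 to about 0.88. Outside that interval the curve flattens toward 0 and 1.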

3) Tanh activation function:

The tanh function, also known as the hyperbolic tangent function, is an activation function that almost always works better than sigmoid. It is simply a shifted and rescaled sigmoid function. The two are related and can be derived from each other.

• Equation: f(x) = tanh(x) = 2/(1 + e^(-2x)) - 1, or equivalently tanh(x) = 2 * sigmoid(2x) - 1
• Range: -1 to +1

• Usage: typically used in the hidden layers of a neural network. Because its values range from -1 to 1, the mean of a hidden layer's outputs is 0 or very close to it. This helps center the data by bringing the mean close to 0, which makes learning easier for the next layer.
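
The relationship between tanh and sigmoid, and the zero-centering property, can both be checked numerically. This is a minimal sketch; the symmetric input grid is an assumption chosen so that the means are easy to interpret.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-5, 5, 101)  # inputs symmetric around 0

# tanh(x) = 2 * sigmoid(2x) - 1
tanh_from_sigmoid = 2.0 * sigmoid(2.0 * x) - 1.0
print(np.allclose(tanh_from_sigmoid, np.tanh(x)))  # True

# On symmetric inputs, tanh outputs average to about 0, sigmoid outputs to about 0.5
tanh_mean = np.tanh(x).mean()
sigmoid_mean = sigmoid(x).mean()
print(f"mean of tanh: {tanh_mean:.3f}, mean of sigmoid: {sigmoid_mean:.3f}")
```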

4) ReLU activation function

This is the most commonly used activation function, applied mainly in the hidden layers of neural networks.

• Formula: A(x) = max(0, x). If x is positive, it returns x; otherwise, it returns 0.

• Range: [0, inf)

• Nonlinear in nature, which means errors can be backpropagated easily and multiple layers of neurons can be activated by the ReLU function.

• Application: because it involves simpler mathematical operations, ReLU is cheaper to compute than tanh and sigmoid. Only a few neurons are active at any one time, which makes the network sparse and efficient to compute.

Simply put, networks using ReLU learn much faster than those using sigmoid or tanh.
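
A minimal sketch of ReLU and its gradient follows. The helper names `relu` and `relu_grad`, the choice of gradient 0 at x = 0, and the standard-normal pre-activations used to illustrate sparsity are all assumptions for this example, not details from the original article.

```python
import numpy as np

def relu(x):
    # A(x) = max(0, x), applied element-wise
    return np.maximum(0.0, x)

def relu_grad(x):
    # Derivative is 1 for x > 0 and 0 for x < 0.
    # The value at exactly x = 0 is a convention; 0 is assumed here.
    return (x > 0).astype(x.dtype)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))       # negative inputs become 0, positive inputs pass through
print(relu_grad(x))  # gradient is 0 or 1, so no exponentials are needed

# Sparsity: with zero-mean random pre-activations, roughly half the outputs are 0
rng = np.random.default_rng(0)
pre_activations = rng.normal(size=10_000)
sparsity = np.mean(relu(pre_activations) == 0.0)
print(f"fraction of zero outputs: {sparsity:.2f}")
```

Both the function and its gradient need only a comparison, which is where the lower computational cost relative to sigmoid and tanh comes from.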

5) Softmax activation function

The softmax function is a generalization of the sigmoid function to multiple classes, and it comes in handy for classification problems.

• Nonlinear in nature

• Usage: typically used for multi-class classification. The softmax function exponentiates each output and divides it by the sum of the exponentiated outputs, squeezing the output for each class into the range 0 to 1 so that the outputs sum to 1.

• Output: the softmax function works best in the output layer of a classifier, where we try to obtain the probability of each class for an input.
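
The bullets above can be sketched in a few lines of NumPy. The example logits are made up for illustration, and subtracting the maximum before exponentiating is a common numerical-stability trick added here; it is not something the original article describes.

```python
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    # Subtracting the max does not change the result but avoids overflow in exp
    shifted = z - np.max(z, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

logits = np.array([2.0, 1.0, 0.1])  # hypothetical raw scores for three classes
probs = softmax(logits)
print(probs, probs.sum())           # each value lies in (0, 1); together they sum to 1
print(np.argmax(probs))             # index of the most probable class

# With two classes and logits [z, 0], softmax reduces to sigmoid(z)
z = 1.5
two_class = softmax([z, 0.0])
print(np.isclose(two_class[0], 1.0 / (1.0 + np.exp(-z))))  # True
```

The last check shows the sense in which softmax is related to sigmoid: sigmoid is the two-class special case.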

Choosing the right activation function

If you are not sure which activation function to use, just choose ReLU. It is a general-purpose activation function and is used in most cases today. If the output layer is for binary classification or detection, the sigmoid function is a natural choice.

Python implementation

import numpy as np
import matplotlib.pyplot as plt

# Sigmoid function and its derivative
def sigmoid(x):
    s = 1 / (1 + np.exp(-x))  # sigmoid value
    ds = s * (1 - s)          # derivative of sigmoid
    return s, ds


if __name__ == '__main__':
    x = np.arange(-5, 5, 0.01)
    s, ds = sigmoid(x)  # compute once and reuse for both curves
    fig, ax = plt.subplots(figsize=(9, 5))
    # Put the y-axis at x = 0 and hide the top and right borders
    ax.spines['left'].set_position('center')
    ax.spines['right'].set_color('none')
    ax.spines['top'].set_color('none')
    ax.xaxis.set_ticks_position('bottom')
    ax.yaxis.set_ticks_position('left')
    ax.plot(x, s, color='#ff0000', linewidth=3, label='sigmoid')
    ax.plot(x, ds, color='#0000ff', linewidth=3, label='derivative')
    ax.legend(loc="center right", frameon=False)
    plt.show()  # block until the plot window is closed

Conclusion

In this article, I have sorted out and summarized the different types of activation functions in deep learning for your reference and study. If anything is incorrect, please point it out. Thank you!

Original site

Copyright notice
This article was written by [vcsir]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/215/202208031314578950.html