Activation function
2022-07-28 07:20:00 【Sauerkraut】
Why does tanh converge faster than sigmoid?
1. Sigmoid:
Sigmoid is a commonly used nonlinear activation function. Its mathematical form is as follows:
$$\mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}}$$
It "squashes" any continuous real-valued input into the range between 0 and 1.
In particular, a very large negative input gives an output close to 0, and a very large positive input gives an output close to 1.
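The squashing behaviour can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `sigmoid` and the sample inputs are chosen here for illustration and are not from the original post.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid, split by sign so math.exp never overflows."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0, so z is in (0, 1)
    return z / (1.0 + z)

# Large negative inputs land near 0, large positive inputs near 1.
for x in (-20.0, -2.0, 0.0, 2.0, 20.0):
    print(f"sigmoid({x:+.0f}) = {sigmoid(x):.6f}")
```

The output never actually reaches 0 or 1 mathematically; it only gets arbitrarily close.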
The sigmoid function used to be very popular, but fewer and fewer people use it now, mainly because of the following shortcomings:
(1) Sigmoids saturate and kill gradients. This is the vanishing gradient problem that is often mentioned. Sigmoid has a serious flaw: when the input is very large or very small (saturation), the gradient at these neurons is close to 0. If the initial weights are too large, most neurons may start out saturated and kill the gradient, which makes the network hard to train.
(2) Sigmoid outputs are not zero-mean. This is undesirable because neurons in later layers then receive the non-zero-mean output of the previous layer as their input.
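To make the saturation effect concrete, the sketch below evaluates the sigmoid gradient at a few inputs. It relies on the standard identity sigmoid'(x) = sigmoid(x)(1 - sigmoid(x)); the function names are illustrative choices, not part of the original post.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid, split by sign so math.exp never overflows."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid: s * (1 - s) with s = sigmoid(x)."""
    s = sigmoid(x)
    return s * (1.0 - s)

# The gradient shrinks rapidly as |x| grows (saturation).
for x in (0.0, 2.0, 5.0, 10.0):
    print(f"x = {x:5.1f}   sigmoid'(x) = {sigmoid_grad(x):.2e}")
```

The gradient is 0.25 at x = 0 and already below 1e-4 at x = 10, so a saturated neuron passes almost no gradient back.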
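A quick way to see the non-zero-mean issue is to feed a symmetric, zero-mean set of inputs through sigmoid and through tanh and compare the average outputs. The input grid below is an arbitrary choice for illustration.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid, split by sign so math.exp never overflows."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

# Inputs symmetric around 0, so their own mean is 0.
inputs = [i / 10.0 for i in range(-50, 51)]

sig_mean = sum(sigmoid(x) for x in inputs) / len(inputs)
tanh_mean = sum(math.tanh(x) for x in inputs) / len(inputs)

print(f"mean sigmoid output: {sig_mean:.6f}")
print(f"mean tanh output:    {tanh_mean:.6f}")
```

Because sigmoid(x) + sigmoid(-x) = 1, the sigmoid outputs average 0.5 and are all positive, while the tanh outputs average 0.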
2. tanh:
tanh is a rescaled variant of sigmoid:
$$\tanh(x) = 2\,\mathrm{sigmoid}(2x) - 1$$
Unlike sigmoid, tanh is zero-mean. In practice, therefore, tanh works better than sigmoid.
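The relationship between the two functions is the standard identity tanh(x) = 2·sigmoid(2x) − 1. The sketch below checks it numerically against `math.tanh`; the helper `tanh_from_sigmoid` is a hypothetical name used only for this illustration.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid, split by sign so math.exp never overflows."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def tanh_from_sigmoid(x: float) -> float:
    """tanh expressed as a shifted and rescaled sigmoid."""
    return 2.0 * sigmoid(2.0 * x) - 1.0

# Compare with the library tanh at a few sample points.
for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
    print(f"x = {x:+.1f}   math.tanh = {math.tanh(x):+.6f}   "
          f"2*sigmoid(2x)-1 = {tanh_from_sigmoid(x):+.6f}")
```

The rescaling stretches the output range from (0, 1) to (-1, 1), which is what centres the output on zero.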
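The corresponding derivatives are:
$$\tanh'(x) = 1 - \tanh^2(x)$$
$$\mathrm{sigmoid}'(x) = \mathrm{sigmoid}(x)\,\bigl(1 - \mathrm{sigmoid}(x)\bigr)$$
The range of tanh'(x) is (0, 1].
The range of sigmoid'(x) is (0, 1/4]. Both maxima are attained at x = 0.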
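The two derivative ranges can be checked numerically. This sketch assumes the standard closed forms tanh'(x) = 1 − tanh²(x) and sigmoid'(x) = sigmoid(x)(1 − sigmoid(x)) and scans an arbitrary grid on [-10, 10]; the function names and the grid are illustrative.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid, split by sign so math.exp never overflows."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid: s * (1 - s)."""
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_grad(x: float) -> float:
    """Derivative of tanh: 1 - tanh(x)^2."""
    t = math.tanh(x)
    return 1.0 - t * t

# Grid over [-10, 10] in steps of 0.01; it includes x = 0 exactly.
xs = [i / 100.0 for i in range(-1000, 1001)]
max_tanh_grad = max(tanh_grad(x) for x in xs)
max_sigmoid_grad = max(sigmoid_grad(x) for x in xs)

print(f"max tanh'(x)    on grid: {max_tanh_grad}")
print(f"max sigmoid'(x) on grid: {max_sigmoid_grad}")
```

Note that tanh also saturates: for large |x| its derivative goes to 0 as well, so the advantage applies to the peak value near x = 0, not to every input.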
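In summary, because its derivative can be as large as 1 while the derivative of sigmoid never exceeds 1/4, tanh(x) suffers less from the vanishing gradient problem than sigmoid, so it should converge faster.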
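One rough way to see why the derivative bound matters: during backpropagation, each layer multiplies the gradient by an activation derivative. The sketch below looks only at the best case, where every layer contributes its maximum derivative, and ignores the weights entirely. That simplification, and the helper name `max_gradient_factor`, are assumptions made for illustration.

```python
def max_gradient_factor(max_derivative: float, depth: int) -> float:
    """Upper bound on the product of activation derivatives over `depth` layers."""
    return max_derivative ** depth

# Best-case activation-derivative product for sigmoid (0.25) and tanh (1.0).
for depth in (1, 5, 10, 20):
    sig = max_gradient_factor(0.25, depth)
    tanh = max_gradient_factor(1.0, depth)
    print(f"depth {depth:2d}:  sigmoid <= {sig:.2e}   tanh <= {tanh:.2e}")
```

Even in the best case, ten sigmoid layers scale the gradient by at most about 9.5e-7, whereas the tanh bound stays at 1.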
————————————————
Copyright notice: This is an original article by CSDN blogger 「Peanut_ Fan 」, released under the CC 4.0 BY-SA license. When reprinting, please include the original source link and this notice.
Original link: https://blog.csdn.net/u013841196/article/details/80473654