Activation function - relu vs sigmoid
2022-07-02 20:22:00 【Zi Yan Ruoshui】
When data flows through a sigmoid, changes are significantly attenuated. Suppose a weight w in an early layer makes a large change; after passing through the sigmoid, it becomes only a small change in the output. This change is attenuated again each time it is transmitted back through another layer, until it almost vanishes. At that point you find that the gradients of the front layers are obviously smaller than those of the later layers.
If you use gradient descent, the later parameters therefore update much faster than the earlier ones and converge sooner. The result is that training of the later parameters is almost complete while the earlier parameters are still close to their random initial values, i.e. still badly trained.
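A minimal numpy sketch (mine, not from the original post) of why this happens: the derivative of sigmoid, σ'(x) = σ(x)(1 − σ(x)), never exceeds 0.25, so by the chain rule a gradient that passes backward through n sigmoid layers picks up at most a factor of 0.25 per layer:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # peaks at 0.25 when x = 0

# Best case for a 10-layer sigmoid network: every activation sits at the
# point of maximum slope, yet the backpropagated factor still collapses.
grad = 1.0
for _ in range(10):
    grad *= sigmoid_grad(0.0)
print(grad)  # 0.25 ** 10 ≈ 9.5e-07
```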

Therefore the ML community searched for alternatives to sigmoid as the activation function, such as relu.

The relu function has a constant gradient of 1 on the part greater than 0, and a derivative of 0 on the part less than 0. So once a neuron's activation value enters the negative half-axis, its gradient becomes 0; in other words, that neuron receives no training. Only when the activation value enters the positive half-axis is there a gradient, and the neuron then receives one (reinforcing) training update.
In this, relu behaves very much like the activation of biological neurons.
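A short numpy sketch of relu and its piecewise gradient (my illustration, not code from the post):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # 1 on the positive half-axis, 0 on the negative half-axis
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.5, 2.0])
print(relu(x))       # [0.  0.  0.5 2. ]
print(relu_grad(x))  # [0. 0. 1. 1.] -- no attenuation where the neuron is active
```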

To sum up relu's characteristics as an activation function:
1) fast to compute;
2) it mimics the activation behaviour of the biological nervous system;
3) a series of relu units with different biases can be superposed to approximate sigmoid (see the sketch after this list);
4) it alleviates the vanishing-gradient problem, since the gradient on the positive half-axis is exactly 1.
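To make point 3 concrete, here is a small numpy sketch (my construction, using a least-squares fit as a stand-in for hand-picked weights): a weighted sum of relu units whose biases are spread across the input range gives a piecewise-linear curve that closely tracks sigmoid.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

biases = np.linspace(-6.0, 6.0, 25)   # where each relu unit "turns on"
x = np.linspace(-8.0, 8.0, 1000)

# One column per relu unit; each contributes extra slope past its bias.
features = np.stack([relu(x - b) for b in biases], axis=1)

# Fit the combination weights so the superposition matches sigmoid.
weights, *_ = np.linalg.lstsq(features, sigmoid(x), rcond=None)
approx = features @ weights

print(np.max(np.abs(approx - sigmoid(x))))  # small residual error
```

The fit itself is only illustrative; the point is that shifted relus span piecewise-linear functions, which can approximate a smooth curve like sigmoid arbitrarily well as the bias points get denser.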