PaddlePaddle 20: Implementation and use of ExponentialMovingAverage (EMA) (supports static graphs and dynamic graphs)
2022-06-27 02:16:00 【Ten thousand miles' journey to】
An exponential moving average (ExponentialMovingAverage, EMA) is a weighted moving average whose weights decrease exponentially: at every update, the weights saved from the previous step are attenuated by the factor decay. The update rule is param'_n = decay * param'_(n-1) + (1 - decay) * param_n, where param'_n is the weight saved by the EMA at step n and param_n is the weight computed normally by the training algorithm at step n.
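The update rule above can be sketched in a few lines of plain Python. This is only an illustration, not Paddle's implementation: the helper name ema_update and the use of a dict of floats in place of real model tensors are assumptions made for the example.

```python
def ema_update(shadow, params, decay=0.999):
    """Return the EMA weights after one update step.

    shadow: EMA weights saved at the previous step (name -> value)
    params: weights produced by the normal training step (name -> value)
    Both dicts are hypothetical stand-ins for real model parameters.
    """
    return {
        name: decay * shadow[name] + (1.0 - decay) * value
        for name, value in params.items()
    }


# Start from an initial weight and feed two training-step weights.
shadow = {"w": 1.0}
for step_weight in [2.0, 3.0]:
    shadow = ema_update(shadow, {"w": step_weight}, decay=0.9)
```

With decay close to 1 the saved weights move slowly and follow the long-run trend of the training weights; with a smaller decay they follow the most recent weights more closely.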
EMA is an iterative operation: at step n, the parameters from the k-th update have been attenuated by a factor of decay^(n-k). In essence, EMA is a learning-rate decay strategy; see the derivation in this blog post (https://www.cnblogs.com/sddai/p/14646581.html). In addition, the author also implemented a PyTorch version of EMA.
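To make the decay^(n-k) attenuation concrete, the sketch below compares the iterative update with its unrolled form, in which the k-th weight contributes with coefficient (1 - decay) * decay^(n-k) and the initial value with decay^n. The function names and the scalar "weights" are assumptions for illustration only.

```python
def ema_iterative(values, decay, init):
    """Apply the EMA update once per value, starting from init."""
    shadow = init
    for value in values:
        shadow = decay * shadow + (1.0 - decay) * value
    return shadow


def ema_unrolled(values, decay, init):
    """Same result written as an explicit weighted sum over all steps."""
    n = len(values)
    total = decay ** n * init  # the initial value is attenuated n times
    for k, value in enumerate(values, start=1):
        # the k-th value is attenuated (n - k) times
        total += (1.0 - decay) * decay ** (n - k) * value
    return total


values = [0.5, 1.5, 2.0, 4.0]
iterative = ema_iterative(values, decay=0.9, init=1.0)
unrolled = ema_unrolled(values, decay=0.9, init=1.0)
```

The two results agree up to floating-point rounding, which shows that older updates fade geometrically while recent ones dominate.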
In paddle2, EMA comes in a static graph version and a dynamic graph version. The static graph version is officially implemented in paddle2.0 and is briefly introduced below. The dynamic graph version has to be implemented by hand, which is done in the second section of this article. If EMA is enabled at the very beginning of training, ema_model can perform very poorly on the test set, because the model has received few updates at that stage and most of its weights are still close to their initial values. It is therefore recommended to enable EMA only after training has run for a while.
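One way to follow the advice about delaying EMA is to copy the raw weights during a warm-up period and start averaging only afterwards. The sketch below is a hypothetical illustration in plain Python: the class name WarmupEMA, the start_step parameter, and the dict-of-floats weights are all assumptions, and this is neither Paddle's API nor the dynamic graph implementation from the second section.

```python
class WarmupEMA:
    """EMA that only starts averaging after start_step updates (hypothetical)."""

    def __init__(self, decay=0.999, start_step=1000):
        self.decay = decay
        self.start_step = start_step
        self.step = 0
        self.shadow = {}

    def update(self, params):
        """Call once after each optimizer step with the current weights."""
        self.step += 1
        if self.step <= self.start_step or not self.shadow:
            # Warm-up: just track the raw weights so that poorly trained
            # early weights are not mixed into the average.
            self.shadow = dict(params)
            return
        for name, value in params.items():
            self.shadow[name] = (
                self.decay * self.shadow[name] + (1.0 - self.decay) * value
            )


ema = WarmupEMA(decay=0.5, start_step=2)
ema.update({"w": 1.0})  # warm-up, shadow copies the weight
ema.update({"w": 2.0})  # warm-up, shadow copies the weight
ema.update({"w": 4.0})  # averaging starts: 0.5 * 2.0 + 0.5 * 4.0
```

A suitable start_step depends on the model and schedule; the value here is only a placeholder.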
1、 Static graph EMA
Official documentation: https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/static/ExponentialMovingAverage_cn.html#exponentialmovingaverage
Import (can only be used with static graphs)
from paddle.static import ExponentialMovingAverage
Initialization








