PaddlePaddle 20: Implementing and using the exponential moving average (ExponentialMovingAverage, EMA), with support for static and dynamic graphs
2022-06-27 02:16:00 【Ten thousand miles' journey to】
The exponential moving average (ExponentialMovingAverage, EMA) is an exponentially decaying weighted moving average: at each update, the previously stored weights are attenuated by the factor decay. It is computed as param_n' = param_(n-1)' * decay + (1 - decay) * param_n, where param_n' is the weight stored by EMA at step n and param_n is the weight computed normally by the training algorithm at step n.
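As a minimal sketch of the update rule above, the following plain-Python snippet applies it to a single scalar weight. The function name ema_update, the sample values, and the choice to seed the average with the first value are assumptions made for illustration; they are not part of Paddle's API.

```python
def ema_update(ema_value, new_value, decay=0.999):
    """One EMA step: keep `decay` of the old average, add (1 - decay) of the new value."""
    return ema_value * decay + (1.0 - decay) * new_value


weights = [1.0, 2.0, 3.0, 4.0]  # made-up values of one weight over successive steps
ema = weights[0]                # assumption: seed the average with the first value
history = [ema]
for w in weights[1:]:
    ema = ema_update(ema, w, decay=0.5)
    history.append(ema)

print(history)  # the EMA lags behind the raw weights
```

A small decay such as 0.5 is used here only to make the lag easy to see; in training, decay is usually much closer to 1.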
EMA is an iterative operation: at step n, the parameters from the k-th update have been attenuated by a factor of decay^(n-k). In essence, EMA acts as a learning-rate decay strategy; see the derivation in this blog post (https://www.cnblogs.com/sddai/p/14646581.html), whose author also implemented a PyTorch version of EMA.
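The attenuation can be checked numerically. The sketch below unrolls the recursion and compares it with the iterative result, assuming the EMA starts from zero; the sample values are made up. Under that assumption the value from step k contributes with weight (1 - decay) * decay^(n-k).

```python
decay = 0.9
values = [0.3, -1.2, 2.5, 0.7, 1.1]  # made-up per-step parameter values, steps 1..n

# Iterative form, starting from an EMA of zero (an assumption of this sketch).
ema_iterative = 0.0
for v in values:
    ema_iterative = ema_iterative * decay + (1.0 - decay) * v

# Unrolled form: the value from step k carries weight (1 - decay) * decay**(n - k).
n = len(values)
step_weights = [(1.0 - decay) * decay ** (n - k) for k in range(1, n + 1)]
ema_unrolled = sum(w * v for w, v in zip(step_weights, values))

print(ema_iterative, ema_unrolled)
```

The two results agree up to floating-point rounding, and the most recent step carries the largest weight.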
In paddle2, EMA comes in a static-graph version and a dynamic-graph version. The static-graph version is officially implemented in paddle2.0 and is briefly introduced below. The dynamic-graph version has to be implemented by yourself, which is done in the second section of this article. If EMA is enabled at the very start of training, ema_model can perform very poorly on the test set, because the model has been updated only a few times at that stage and most of the averaged weights are still close to their initial values. It is recommended to enable EMA only after the model has been trained for some time.
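One way to follow this recommendation is to delay the averaging until a chosen step. The sketch below is a toy, framework-free illustration over a dict of floats; the class name DelayedEMA and its start_step parameter are hypothetical and are not Paddle API.

```python
class DelayedEMA:
    """Toy EMA over a dict of floats that only starts averaging at `start_step`."""

    def __init__(self, decay=0.999, start_step=0):
        self.decay = decay
        self.start_step = start_step  # hypothetical knob: first step that counts
        self.step = 0
        self.shadow = None            # EMA copy of the parameters, created lazily

    def update(self, params):
        self.step += 1
        if self.step < self.start_step:
            return  # too early: weights are still close to initialization
        if self.shadow is None:
            self.shadow = dict(params)  # seed the EMA with already-trained weights
            return
        for name, value in params.items():
            self.shadow[name] = (
                self.shadow[name] * self.decay + (1.0 - self.decay) * value
            )


ema = DelayedEMA(decay=0.5, start_step=3)
for w in [0.0, 1.0, 2.0, 4.0]:  # stand-in for a weight that changes every step
    ema.update({"w": w})
print(ema.shadow)
```

Seeding the shadow copy with the current weights at start_step keeps the early, barely trained values out of the average entirely.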
1. Static-graph EMA
Official documentation: https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/static/ExponentialMovingAverage_cn.html#exponentialmovingaverage
Import (static graphs only):
from paddle.static import ExponentialMovingAverage
Initialization: