Be careful with your dictionaries and boilerplate code
2022-07-30 21:33:00 【InfoQ】
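The dictionary keys do not match their values

# Buggy: the 1.5x and 0.5x entries are swapped.
data = {
    "image1.5": image_0_5,
    "image1.0": image_1_0,
    "image0.5": image_1_5,
}

# Fixed: each key names the scale of the image it holds.
data = {
    "image1.5": image_1_5,
    "image1.0": image_1_0,
    "image0.5": image_0_5,
}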
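
One way to keep hand-typed keys and values from drifting apart is to derive each key from the same number that sizes the image, and to validate the result once. The following is only a sketch under stated assumptions: `build_pyramid`, `check_pyramid`, and `SCALES` are hypothetical names that do not appear in the original, and a plain list of zeros stands in for an image, with its length playing the role of `shape[-1]`.

```python
SCALES = (1.5, 1.0, 0.5)


def build_pyramid(base_width, scales=SCALES):
    """Build {"image<scale>": row}; the key comes from the scale that sized the row."""
    # Assumption: a list of zeros stands in for an image tensor.
    return {f"image{s}": [0] * int(base_width * s) for s in scales}


def check_pyramid(data, scales=SCALES):
    """Raise ValueError unless widths strictly decrease from the largest scale down."""
    widths = [len(data[f"image{s}"]) for s in sorted(scales, reverse=True)]
    if any(a <= b for a, b in zip(widths, widths[1:])):
        raise ValueError(f"keys and image sizes disagree: widths={widths}")


data = build_pyramid(base_width=8)
check_pyramid(data)  # consistent dictionary: no exception

# Reproduce the swap from the buggy dictionary; check_pyramid(swapped) raises.
swapped = dict(data)
swapped["image1.5"], swapped["image0.5"] = swapped["image0.5"], swapped["image1.5"]
```

Because the key and the size come from one loop variable, there is no second place where a typo can creep in; the check is a safety net for dictionaries that are still written by hand.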
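
When the entries are combined symmetrically, as in `a+b+c`, swapped keys can go unnoticed. An explicit check with `assert` (or `if` plus `raise`) makes the mismatch fail loudly:

assert data['image1.5'].shape[-1] > data['image1.0'].shape[-1] > data['image0.5'].shape[-1]

Omitting boilerplate code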
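
# Plain training step.
loss = loss_fn(model(X), Y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# The same step with automatic mixed precision (AMP).
from torch.cuda.amp import GradScaler, autocast

scaler = GradScaler(enabled=args.use_fp16)

optimizer.zero_grad()
with autocast(enabled=args.use_fp16):
    loss = loss_fn(model(X), Y)
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()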
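
The effect of a dropped gradient reset in loops like the ones above can be reproduced without PyTorch. The sketch below is a toy: `ToyParam` and `grads_seen` are hypothetical names, not PyTorch API, and the only behavior they imitate is that `backward()` adds to the stored gradient instead of overwriting it.

```python
class ToyParam:
    """Hypothetical scalar parameter whose `grad` accumulates across backward calls."""

    def __init__(self, value: float) -> None:
        self.value = value
        self.grad = 0.0

    def backward(self, x: float, y: float) -> None:
        # d/dw (w*x - y)^2 = 2 * x * (w*x - y), added to whatever is already stored.
        self.grad += 2.0 * x * (self.value * x - y)

    def zero_grad(self) -> None:
        self.grad = 0.0


def grads_seen(reset: bool, steps: int = 3):
    """Return the gradient an optimizer would see at each step (weight held fixed)."""
    w = ToyParam(1.0)
    seen = []
    for _ in range(steps):
        if reset:
            w.zero_grad()
        w.backward(x=1.0, y=0.0)
        seen.append(w.grad)
    return seen


with_reset = grads_seen(reset=True)      # [2.0, 2.0, 2.0]
without_reset = grads_seen(reset=False)  # [2.0, 4.0, 6.0]
```

Nothing crashes in the second case; the step size simply grows with every iteration, which is why this kind of omission tends to show up only as unstable training.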
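
The line that is easy to drop in this rewrite is `optimizer.zero_grad()`: without it, every `scaler.scale(loss).backward()` adds to the gradients left over from earlier iterations.

Custom snippets

Code modularization
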
import torch


def clip_grad(params, mode, clip_cfg: dict):
    """Clip gradients in place, either by total norm or by value."""
    if mode == "norm":
        if "max_norm" not in clip_cfg:
            raise ValueError("`clip_cfg` must contain `max_norm`.")
        torch.nn.utils.clip_grad_norm_(
            params, max_norm=clip_cfg.get("max_norm"), norm_type=clip_cfg.get("norm_type", 2.0)
        )
    elif mode == "value":
        if "clip_value" not in clip_cfg:
            raise ValueError("`clip_cfg` must contain `clip_value`.")
        torch.nn.utils.clip_grad_value_(params, clip_value=clip_cfg.get("clip_value"))
    else:
        raise NotImplementedError(f"Unsupported clip mode: {mode}")
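
The two modes handled by `clip_grad` behave differently, which a framework-free sketch can make concrete. The helpers `clip_by_value` and `clip_by_norm` below are hypothetical illustrations working on plain lists of floats; they mirror the idea of the two PyTorch utilities rather than their implementation, and the small `eps` is an assumption added to avoid division by zero.

```python
def clip_by_value(grads, clip_value):
    """Clamp every gradient component into [-clip_value, clip_value]."""
    return [max(-clip_value, min(clip_value, g)) for g in grads]


def clip_by_norm(grads, max_norm, norm_type=2.0, eps=1e-6):
    """Rescale all components by one factor so the total norm is at most max_norm."""
    total_norm = sum(abs(g) ** norm_type for g in grads) ** (1.0 / norm_type)
    scale = min(1.0, max_norm / (total_norm + eps))
    return [g * scale for g in grads]


grads = [3.0, -4.0]
by_value = clip_by_value(grads, 1.0)  # each component clamped on its own
by_norm = clip_by_norm(grads, 1.0)    # direction kept, length shrunk to about 1
```

Clipping by value can change the direction of the gradient vector, while clipping by norm only shortens it; that difference is the usual reason for exposing the mode as a configuration option.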

from functools import partial
from itertools import chain

from torch.cuda.amp import GradScaler, autocast


class Scaler:
    def __init__(
        self, optimizer, use_fp16=False, *, set_to_none=False, clip_grad=False, clip_mode=None, clip_cfg=None
    ) -> None:
        self.optimizer = optimizer
        self.set_to_none = set_to_none
        self.autocast = autocast(enabled=use_fp16)
        self.scaler = GradScaler(enabled=use_fp16)
        if clip_grad:
            # `ops` is the project module that holds `clip_grad`.
            self.grad_clip_ops = partial(ops.clip_grad, mode=clip_mode, clip_cfg=clip_cfg)
        else:
            self.grad_clip_ops = None

    def calculate_grad(self, loss):
        # Backward on the scaled loss; gradients accumulate across calls.
        self.scaler.scale(loss).backward()

    def update_grad(self):
        if self.grad_clip_ops is not None:
            # Unscale once per optimizer step, right before clipping. `unscale_` raises a
            # RuntimeError if it is called twice before `update()`, so it must not run on
            # every accumulation step.
            self.scaler.unscale_(self.optimizer)
            self.grad_clip_ops(chain(*[group["params"] for group in self.optimizer.param_groups]))
        self.scaler.step(self.optimizer)
        self.scaler.update()
        self.optimizer.zero_grad(set_to_none=self.set_to_none)

    def state_dict(self):
        r"""
        Returns the state of the scaler as a :class:`dict`. It contains five entries:

        * ``"scale"`` - a Python float containing the current scale
        * ``"growth_factor"`` - a Python float containing the current growth factor
        * ``"backoff_factor"`` - a Python float containing the current backoff factor
        * ``"growth_interval"`` - a Python int containing the current growth interval
        * ``"_growth_tracker"`` - a Python int containing the number of recent consecutive unskipped steps.

        If this instance is not enabled, returns an empty dict.

        .. note::
            If you wish to checkpoint the scaler's state after a particular iteration, :meth:`state_dict`
            should be called after :meth:`update`.
        """
        return self.scaler.state_dict()

    def load_state_dict(self, state_dict):
        r"""
        Loads the scaler state. If this instance is disabled, :meth:`load_state_dict` is a no-op.

        Args:
            state_dict(dict): scaler state. Should be an object returned from a call to :meth:`state_dict`.
        """
        self.scaler.load_state_dict(state_dict)

scaler = pipeline.Scaler(
    optimizer=optimizer,
    use_fp16=cfg.train.use_amp,
    set_to_none=cfg.train.optimizer.set_to_none,
    clip_grad=cfg.train.grad_clip.enable,
    clip_mode=cfg.train.grad_clip.mode,
    clip_cfg=cfg.train.grad_clip.cfg,
)

with torch.cuda.amp.autocast(enabled=cfg.train.use_amp):
    probs, loss, loss_str = model(
        data=batch_data, iter_percentage=counter.curr_iter / counter.num_total_iters
    )
    loss = loss / cfg.train.grad_acc_step
scaler.calculate_grad(loss=loss)
if counter.every_n_iters(cfg.train.grad_acc_step):  # Accumulates scaled gradients.
    scaler.update_grad()
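
The `counter` object used above is not shown in the source. A minimal stand-in, assuming `curr_iter` is a zero-based iteration index and that the last iteration should always trigger an update so no accumulated gradients are left unused, might look like this; `IterCounter`, `step`, and `update_points` are hypothetical names.

```python
class IterCounter:
    """Hypothetical stand-in for the `counter` object; not the author's code."""

    def __init__(self, num_total_iters: int) -> None:
        self.num_total_iters = num_total_iters
        self.curr_iter = 0  # zero-based index of the current iteration

    def step(self) -> None:
        self.curr_iter += 1

    def every_n_iters(self, n: int) -> bool:
        # True on every n-th iteration (counted from 1) and on the final iteration.
        done = self.curr_iter + 1
        return done % n == 0 or done == self.num_total_iters


def update_points(num_total_iters: int, n: int):
    """Return the iteration indices at which an optimizer update would run."""
    counter = IterCounter(num_total_iters)
    points = []
    for _ in range(num_total_iters):
        if counter.every_n_iters(n):
            points.append(counter.curr_iter)
        counter.step()
    return points
```

With 7 iterations and an accumulation step of 3, `update_points(7, 3)` gives `[2, 5, 6]`: two full accumulation windows plus a final flush of the leftover iteration.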