当前位置:网站首页>Gradient accumulation in pytorch [during the experiment, due to the limitation of GPU video memory, the batch\u size can no longer be increased. To solve this problem, the gradient accumulation method
Gradient accumulation in pytorch [during the experiment, due to the limitation of GPU video memory, the batch\u size can no longer be increased. To solve this problem, the gradient accumulation method
2022-06-12 23:00:00 【u013250861】
During the experiment , because GPU Memory limit , encounter batch_size A situation that cannot be increased . To solve the problem , Using the gradient accumulation method .
The method without gradient accumulation is as follows :
for i,(images,target) in enumerate(train_loader):
# 1. input output
images = images.cuda(non_blocking=True)
target = torch.from_numpy(np.array(target)).float().cuda(non_blocking=True)
outputs = model(images)
loss = criterion(outputs,target)
# 2. backward
optimizer.zero_grad() # reset gradient
loss.backward()
optimizer.step()
Use gradient accumulation :
for i,(images,target) in enumerate(train_loader):
# 1. input output
images = images.cuda(non_blocking=True)
target = torch.from_numpy(np.array(target)).float().cuda(non_blocking=True)
outputs = model(images)
loss = criterion(outputs,target)
# 2.1 loss regularization
loss = loss/accumulation_steps
# 2.2 back propagation
loss.backward()
# 3. update parameters of net
if((i+1)%accumulation_steps)==0:
# optimizer the net
optimizer.step() # update parameters of net
optimizer.zero_grad() # reset gradient
Original batch size by 32, Use gradient accumulation , Set up accumulation_steps=4, At this point, just put batch_size Set to 8, Can achieve the previous effect .
Reference material :
Pytorch Gradient accumulation in
边栏推荐
- Embedded pipeline out of the box
- 【LeetCode】53. Maximum subarray and
- 【LeetCode】102. 二叉树的层序遍历
- Ten key defensive points in the detailed attack and defense drill
- Colab教程(超级详细版)及Colab Pro/Colab Pro+使用评测
- C语言:如何给全局变量起一个别名?
- MOOG servo valve d634-341c/r40ko2m0nss2
- iShot
- ASP. Net core Middleware
- 【Web技术】1348- 聊聊水印实现的几种方式
猜你喜欢

Colab tutorial (super detailed version) and colab pro/colab pro+ usage evaluation

Alcohol detector based on 51 single chip microcomputer

JVM foundation - > talk about class loader two parent delegation model

JVM Basics - > how to troubleshoot JVM problems in your project

be careful! Your Navicat may have been poisoned

管线中的坐标变换

Anti aliasing / anti aliasing Technology
![[Part 7] source code analysis and application details of cyclicbarrier [key]](/img/bc/8ba2b86e599539a29683a63d02f0f7.jpg)
[Part 7] source code analysis and application details of cyclicbarrier [key]
C语言:如何给全局变量起一个别名?

The carrying capacity of L2 level ADAS increased by more than 60% year-on-year in January, and domestic suppliers "emerged"
随机推荐
【LeetCode】103. Zigzag sequence traversal of binary tree
Generate the chrysanthemum code of the applet (generate the chrysanthemum code, change the middle logo, change the image size, and add text)
Research Report on market supply and demand and strategy of China's digital camera lens industry
Go time format assignment
【建议收藏】通俗易懂图解网络知识-第一篇
[web technology] 1348- talk about several ways to implement watermarking
VIM use the lower right 4 keys
JVM foundation > CMS garbage collector
【Web技术】1348- 聊聊水印实现的几种方式
Research Report on water sports shoes industry - market status analysis and development prospect forecast
Go时间格式化 赋值
The fate of Internet people is that it is difficult to live to 30?
Research Report on market supply and demand and strategy of tizanidine industry in China
Photoshop:ps how to enlarge a picture without blurring
China Aquatic Fitness equipment market trend report, technical innovation and market forecast
同花顺股票账户开户安全吗
在同花顺开户安全么 ,证券开户怎么开户流程
Shardingsphere-proxy-5.0.0 deployment table implementation (I)
Flutter库推荐Sizer 可帮助您轻松创建响应式 UI
JVM Basics - > What are the JVM parameters?