
Gradient accumulation in PyTorch [during experiments, due to the limitation of GPU memory, the batch_size can no longer be increased; gradient accumulation solves this problem]

2022-06-12 23:00:00 u013250861

During experiments, the GPU memory limit can prevent the batch_size from being increased any further. To work around this, the gradient accumulation method is used.

The training loop without gradient accumulation is as follows:

for i, (images, target) in enumerate(train_loader):
    # 1. input output
    images = images.cuda(non_blocking=True)
    target = torch.from_numpy(np.array(target)).float().cuda(non_blocking=True)
    outputs = model(images)
    loss = criterion(outputs, target)

    # 2. backward
    optimizer.zero_grad()   # reset gradient
    loss.backward()
    optimizer.step()

With gradient accumulation, the loop becomes:

for i, (images, target) in enumerate(train_loader):
    # 1. input output
    images = images.cuda(non_blocking=True)
    target = torch.from_numpy(np.array(target)).float().cuda(non_blocking=True)
    outputs = model(images)
    loss = criterion(outputs, target)

    # 2.1 normalize the loss so the accumulated gradient averages over the steps
    loss = loss / accumulation_steps
    # 2.2 back propagation (gradients accumulate in .grad across iterations)
    loss.backward()

    # 3. update parameters of net every accumulation_steps iterations
    if (i + 1) % accumulation_steps == 0:
        optimizer.step()        # update parameters of net
        optimizer.zero_grad()   # reset gradient
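
One detail the snippet above leaves open: if the number of batches in train_loader is not evenly divisible by accumulation_steps, the gradients from the last few batches are accumulated but never applied. A common adjustment (an assumption here, not part of the original code) is to also step on the final iteration:

    if ((i + 1) % accumulation_steps == 0) or (i + 1 == len(train_loader)):
        optimizer.step()        # update parameters of net
        optimizer.zero_grad()   # reset gradient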

If the original batch_size was 32, then with gradient accumulation and accumulation_steps=4, setting batch_size to 8 achieves the same effective batch size as before.
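
To illustrate this equivalence, here is a minimal, self-contained sketch (not from the original post) that checks, for a toy linear model with MSE loss, that accumulating gradients over 4 micro-batches of size 8 reproduces the gradient of a single batch of size 32 (MSELoss uses mean reduction, so dividing the loss by accumulation_steps restores the full-batch average):

import torch

torch.manual_seed(0)
model = torch.nn.Linear(10, 1)
criterion = torch.nn.MSELoss()
x = torch.randn(32, 10)
y = torch.randn(32, 1)

# (a) one full batch of 32
model.zero_grad()
criterion(model(x), y).backward()
full_grad = model.weight.grad.clone()

# (b) 4 accumulated micro-batches of 8, loss divided by accumulation_steps
accumulation_steps = 4
model.zero_grad()
for xb, yb in zip(x.chunk(accumulation_steps), y.chunk(accumulation_steps)):
    (criterion(model(xb), yb) / accumulation_steps).backward()

print(torch.allclose(full_grad, model.weight.grad, atol=1e-6))  # expected: True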




Reference material:
Gradient accumulation in PyTorch

Copyright notice: this article was written by [u013250861]; when reprinting, please include the original link: https://yzsam.com/2022/163/202206122247122769.html