
Introductory notes on PyTorch deep learning (11): neural network pooling layers

2022-06-30 07:35:00 【Snow fish】

These are my course notes (course link).
The notes are also posted on my personal website; feel free to take a look.

1. A brief introduction to MaxPool2d

This section covers the pooling layer. As before, we learn from the official PyTorch documentation.
Under "Pooling layers" in torch.nn, the most commonly used one is nn.MaxPool2d. Its main parameters are as follows:

kernel_size: the size of the pooling window. It can be an int or a tuple.
dilation: the spacing between the elements of the window. In the figure, blue is the input and grey is the window; the window is 3*3 with a gap of 1 between its elements. This usually does not need to be set.
ceil_mode: when ceil_mode is True, the output shape is computed in ceil mode instead of floor mode. A brief explanation of the two modes:

Floor mode rounds a value down (2.31 becomes 2), while ceil mode rounds it up (2.31 becomes 3). For max pooling this means that in ceil mode a window that only partially overlaps the input still produces an output value, whereas in floor mode that window is discarded.
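To make the rounding concrete, here is a minimal sketch of the output-size rule from the MaxPool2d documentation for one spatial dimension. The helper name pooled_size is hypothetical (it is not part of PyTorch), and the sketch assumes integer arguments.

```python
def pooled_size(n, kernel_size, stride=None, padding=0, dilation=1, ceil_mode=False):
    """Hypothetical helper: output length of max pooling along one dimension."""
    if stride is None:
        stride = kernel_size  # MaxPool2d uses kernel_size as the default stride
    # Room left for sliding once the (dilated) window is placed at the start
    span = n + 2 * padding - dilation * (kernel_size - 1) - 1
    if ceil_mode:
        out = -(-span // stride) + 1  # integer ceiling division
        # The last window must still start inside the input or the left padding
        if (out - 1) * stride >= n + padding:
            out -= 1
    else:
        out = span // stride + 1  # integer floor division
    return out


print(pooled_size(5, 3, ceil_mode=False))  # 1
print(pooled_size(5, 3, ceil_mode=True))   # 2
```

For a length of 5 with a window of 3, floor mode gives 1 output position and ceil mode gives 2, because the partial window covering the last two elements is kept.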
Max pooling takes the largest value inside each window. For example:
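The window-by-window selection can be sketched in plain Python, without PyTorch, to make the logic explicit. The function max_pool2d_naive is a hypothetical illustration, not a library call; it assumes no padding, no dilation, and a stride equal to the kernel size (the MaxPool2d default).

```python
def max_pool2d_naive(grid, kernel_size, ceil_mode=False):
    """Hypothetical helper: max pooling over a list of lists, stride == kernel_size."""
    rows, cols = len(grid), len(grid[0])
    # Ceil mode keeps any window that starts inside the input;
    # floor mode keeps only windows that fit entirely.
    row_stop = rows if ceil_mode else rows - kernel_size + 1
    col_stop = cols if ceil_mode else cols - kernel_size + 1
    pooled = []
    for r0 in range(0, row_stop, kernel_size):
        pooled_row = []
        for c0 in range(0, col_stop, kernel_size):
            # Slicing clips windows that run past the border
            window = [value
                      for row in grid[r0:r0 + kernel_size]
                      for value in row[c0:c0 + kernel_size]]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


grid = [[1, 2, 0, 3, 1],
        [0, 1, 2, 3, 1],
        [1, 2, 1, 0, 0],
        [5, 2, 3, 1, 1],
        [2, 1, 0, 1, 1]]

print(max_pool2d_naive(grid, 3, ceil_mode=True))   # [[2, 3], [5, 1]]
print(max_pool2d_naive(grid, 3, ceil_mode=False))  # [[2]]
```

In ceil mode the three partial windows at the right and bottom edges each contribute a value; in floor mode only the single complete 3*3 window in the top-left corner is used.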

2. Code demonstration

import torch
from torch import nn
from torch.nn import MaxPool2d

input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]], dtype=torch.float32)
# MaxPool2d expects (N, C, H, W): batch size, channels, height, width
input = torch.reshape(input, (-1, 1, 5, 5))

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # stride defaults to kernel_size; ceil_mode=True keeps partial windows at the border
        self.maxpool1 = MaxPool2d(kernel_size=3, ceil_mode=True)

    def forward(self, input):
        output = self.maxpool1(input)
        return output

net1 = Net()
output = net1(input)
print(output)

Output:
tensor([[[[2., 3.],
          [5., 1.]]]])
This is consistent with the hand calculation.
The purpose of max pooling is to reduce the amount of data while preserving its most important features.
Here is another example, this time on images from the CIFAR-10 test set:

import torch
import torchvision
from torch import nn
from torch.nn import MaxPool2d
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

dataset = torchvision.datasets.CIFAR10("./dataset", train=False, download=True,
                                       transform=torchvision.transforms.ToTensor())
dataloader = DataLoader(dataset, batch_size=64)

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.maxpool1 = MaxPool2d(kernel_size=3, ceil_mode=True)

    def forward(self, input):
        output = self.maxpool1(input)
        return output

net1 = Net()

writer = SummaryWriter("logs")
step = 0

for data in dataloader:
    imgs, targets = data
    writer.add_images("input", imgs, step)
    output = net1(imgs)
    writer.add_images("output", output, step)
    step = step + 1
writer.close()

View the results in TensorBoard (for example, run tensorboard --logdir=logs and open the address it prints).

The direct effect of max pooling is to reduce the number of pixels in each image, so the output looks coarser and blurrier; only the most prominent features of the original image are retained.
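The size of this reduction can be checked without downloading the dataset. The sketch below assumes a random tensor as a stand-in for one CIFAR-10 batch (64 RGB images of 32*32 pixels) and applies the same pooling layer as above.

```python
import torch
from torch import nn

# Same layer as in the example: 3*3 window, stride 3 by default, ceil mode on
pool = nn.MaxPool2d(kernel_size=3, ceil_mode=True)

# Random stand-in for one CIFAR-10 batch (an assumption, not real image data)
batch = torch.rand(64, 3, 32, 32)
pooled = pool(batch)

print(tuple(batch.shape), "->", tuple(pooled.shape))
# Ratio of input values to output values
print(batch.numel() / pooled.numel())
```

Each 32*32 image becomes 11*11, so every channel keeps 121 of its 1024 values, roughly one in eight. The batch and channel dimensions are left unchanged.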