PyTorch Transposed Convolution
2022-06-24 16:06:00 【Full stack programmer webmaster】
Hello everyone, it's me again, your friend Quan Jun.
0. Environment setup
The environment is a free Kaggle Notebook.
The tutorial follows Mr. Li Mu's Hands-on Deep Learning website and video explanations.
Tip: when you are unsure what a function does, press Shift+Tab to view its documentation.
1. Transposed convolution
Convolution does not increase the height and width of the input; usually it keeps them the same or halves them. Transposed convolution can be used to increase the input's height and width.
Ignoring channels, suppose the stride is 1 and the padding is 0. The input tensor has shape $n_h \times n_w$ and the convolution kernel has shape $k_h \times k_w$. This produces $n_h n_w$ intermediate results, each of which is a $(n_h + k_h - 1) \times (n_w + k_w - 1)$ tensor initialized to zero. To compute an intermediate tensor, an element of the input tensor is multiplied by the kernel, and the resulting $k_h \times k_w$ tensor replaces a part of the intermediate tensor; the position of the replaced part corresponds to the position of that element in the input tensor. Finally, all intermediate results are summed to obtain the output.
The accumulation rule for the intermediate tensor is: $Y[i:i+h,\; j:j+w] \mathrel{+}= X[i,j] \cdot K$
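For example (a worked instance of the rule above, using the same $2 \times 2$ tensors that appear in the code later in this post), take

$$X = \begin{bmatrix} 0 & 1 \\ 2 & 3 \end{bmatrix}, \qquad K = \begin{bmatrix} 0 & 1 \\ 2 & 3 \end{bmatrix}.$$

Each of the four input elements contributes one shifted, scaled copy of $K$ inside a $3 \times 3$ zero tensor, and the four copies are summed:

$$Y = \begin{bmatrix} 0&0&0\\0&0&0\\0&0&0 \end{bmatrix} + \begin{bmatrix} 0&0&1\\0&2&3\\0&0&0 \end{bmatrix} + \begin{bmatrix} 0&0&0\\0&2&0\\4&6&0 \end{bmatrix} + \begin{bmatrix} 0&0&0\\0&0&3\\0&6&9 \end{bmatrix} = \begin{bmatrix} 0&0&1\\0&4&6\\4&12&9 \end{bmatrix}$$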
1.1 Why is it called "transposed"?
For a convolution $Y = X \star W$ (where $\star$ denotes the convolution operation):
- We can construct a matrix $V$ from $W$ so that the convolution is equivalent to the matrix multiplication $Y' = VX'$,
- where $Y'$ and $X'$ are the vectorized versions of $Y$ and $X$.
Transposed convolution is then equivalent to $Y' = V^{T}X'$. If a convolution maps an input of shape $(h, w)$ to $(h', w')$:
- then a transposed convolution with the same hyperparameters maps $(h', w')$ back to $(h, w)$.
2. Implementing transposed convolution
2.1 Basic implementation
!pip install -U d2l
import torch
from torch import nn
from d2l import torch as d2l

def trans_conv(X, K):
    h, w = K.shape
    Y = torch.zeros((X.shape[0] + h - 1, X.shape[1] + w - 1))
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            Y[i: i + h, j: j + w] += X[i, j] * K
    return Y

X = torch.tensor([[0.0, 1.0],
                  [2.0, 3.0]])
K = torch.tensor([[0.0, 1.0],
                  [2.0, 3.0]])
trans_conv(X, K)

2.2 Using the PyTorch API
X, K = X.reshape(1, 1, 2, 2), K.reshape(1, 1, 2, 2)
# The first two arguments are the number of input and output channels
tconv = nn.ConvTranspose2d(1, 1, kernel_size=2, bias=False)
tconv.weight.data = K
tconv(X)

2.3 Padding, stride, and multiple channels

Unlike regular convolution, in transposed convolution the padding is applied to the output (regular convolution applies padding to the input). For example, when padding of 1 is specified on both sides of the height and width, the first and last rows and columns are removed from the output of the transposed convolution.
tconv = nn.ConvTranspose2d(1, 1, kernel_size=2, padding=1, bias=False)
tconv.weight.data = K
tconv(X)

In transposed convolution, the stride is applied to the intermediate results (the output), not to the input.
tconv = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2, bias=False)
tconv.weight.data = K
tconv(X)

If an input X is first passed through a convolution and then through a transposed convolution with the same hyperparameters, the output shape matches the original:
X = torch.rand(size=(1, 10, 16, 16))
conv = nn.Conv2d(10, 20, kernel_size=5, padding=2, stride=3)
tconv = nn.ConvTranspose2d(20, 10, kernel_size=5, padding=2, stride=3)
tconv(conv(X)).shape == X.shape

2.4 Connection with matrix multiplication
X = torch.arange(9.0).reshape(3, 3)
K = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0]])
Y = d2l.corr2d(X, K)
Y

Rewrite the convolution kernel $K$ as a sparse $4 \times 9$ weight matrix $W$ containing many zeros:
def kernel2matrix(K):
k, W = torch.zeros(5), torch.zeros((4, 9))
k[:2], k[3:5] = K[0, :], K[1, :]
W[0, :5], W[1, 1:6], W[2, 3:8], W[3, 4:] = k, k, k, k
return W
W = kernel2matrix(K)
W

Y == torch.matmul(W, X.reshape(-1)).reshape(2, 2)

Z = trans_conv(Y, K)
Z == torch.matmul(W.T, Y.reshape(-1)).reshape(3, 3)

3. More on transposed convolution
Transposed convolution is still a kind of convolution:
- It rearranges the input and the kernel.
- Regular convolution is generally used for downsampling (reducing the height and width), while transposed convolution is often used for upsampling (increasing the output height and width).
- If a convolution maps an input of shape $(h, w)$ to $(h', w')$, a transposed convolution with the same hyperparameters maps $(h', w')$ back to $(h, w)$.
Note: downsampling obtains feature maps from the input image; upsampling obtains the prediction map from the feature maps.
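As a quick illustration of upsampling (a sketch with hypothetical sizes; the layer weights here are random and untrained), a stride-2 transposed convolution doubles the height and width of a feature map:

```python
import torch
from torch import nn

# A hypothetical 16-channel 8x8 feature map
feat = torch.rand(1, 16, 8, 8)
# kernel_size=2, stride=2 doubles the height and width (k = 2p + s with p = 0)
up = nn.ConvTranspose2d(16, 16, kernel_size=2, stride=2)
print(up(feat).shape)  # torch.Size([1, 16, 16, 16])
```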
3.1 Rearranging the input and the kernel

When the padding is 0 and the stride is 1:
- Pad the input with $k - 1$ zeros on each side ($k$ is the kernel window size).
- Flip the kernel vertically and horizontally.
- Then perform a regular convolution (padding 0, stride 1).

$(p, s) = (0, 1)$

When the padding is $p$ and the stride is 1:
- Pad the input with $k - p - 1$ zeros on each side.
- Flip the kernel vertically and horizontally.
- Then perform a regular convolution (padding 0, stride 1).

$(p, s) = (1, 1)$

When the padding is $p$ and the stride is $s$:
- Insert $s - 1$ rows and columns of zeros between the rows and columns of the input.
- Pad the input with $k - p - 1$ zeros on each side.
- Flip the kernel vertically and horizontally.
- Then perform a regular convolution (padding 0, stride 1).

$(p, s) = (0, 2)$
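The recipe above can be sketched as follows (a hypothetical helper, assuming a square kernel and the same padding and stride in both dimensions); its result matches nn.ConvTranspose2d:

```python
import torch
import torch.nn.functional as F

def trans_conv_rearranged(X, K, p=0, s=1):
    """Transposed convolution via the rearrangement recipe (hypothetical helper)."""
    n_h, n_w = X.shape
    k = K.shape[0]  # assumes a square kernel
    # 1) insert s - 1 zero rows/columns between the input's rows and columns
    Z = torch.zeros((s * (n_h - 1) + 1, s * (n_w - 1) + 1))
    Z[::s, ::s] = X
    # 2) flip the kernel vertically and horizontally
    K_flip = torch.flip(K, dims=[0, 1])
    # 3) regular convolution with stride 1 and k - p - 1 padding on each side;
    #    F.conv2d computes cross-correlation, which on the flipped kernel
    #    is true convolution
    return F.conv2d(Z.reshape(1, 1, *Z.shape),
                    K_flip.reshape(1, 1, k, k),
                    padding=k - p - 1).squeeze()

X = torch.tensor([[0.0, 1.0], [2.0, 3.0]])
K = torch.tensor([[0.0, 1.0], [2.0, 3.0]])
manual = trans_conv_rearranged(X, K, p=0, s=2)

tconv = torch.nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2, bias=False)
tconv.weight.data = K.reshape(1, 1, 2, 2)
builtin = tconv(X.reshape(1, 1, 2, 2)).detach().squeeze()
print(torch.allclose(manual, builtin))  # True
```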
3.2 Shape conversions

Let the input height (or width) be $n$, the kernel size $k$, the padding $p$, and the stride $s$.
- Transposed convolution: $n' = sn + k - 2p - s$
- Convolution: $n' = \lfloor (n - k + 2p + s)/s \rfloor \;\Rightarrow\; n \ge sn' + k - 2p - s$
To scale the height and width by exactly a factor of $s$, choose $k = 2p + s$.
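These formulas can be checked numerically (a sketch; only the shapes matter here, so the weights are left untrained, and the sizes are arbitrary):

```python
import torch
from torch import nn

n, k, p, s = 16, 5, 2, 3
X = torch.rand(1, 1, n, n)
conv = nn.Conv2d(1, 1, kernel_size=k, padding=p, stride=s)
tconv = nn.ConvTranspose2d(1, 1, kernel_size=k, padding=p, stride=s)

n_conv = conv(X).shape[-1]          # floor((n - k + 2p + s) / s)
n_back = tconv(conv(X)).shape[-1]   # s * n_conv + k - 2p - s
print(n, n_conv, n_back)  # 16 6 16
```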
3.3 Transposed convolution and deconvolution

In mathematics, deconvolution refers to the inverse operation of convolution:
- If $Y = \mathrm{conv}(X, K)$, then $X = \mathrm{deconv}(Y, K)$.
Deconvolution in this strict sense is rarely used in deep learning:
- A "deconvolutional" neural network usually means a network that uses transposed convolutions.
Publisher: Full Stack Programmer, please credit the source when reprinting: https://javaforall.cn/151945.html Original link: https://javaforall.cn