
Conv1D, Conv2D and Conv3D in TensorFlow 2, and the corresponding MaxPooling, explained in detail

2022-06-13 01:40:00 【Under the starry sky 0516】

This post explains the Conv1D, Conv2D and Conv3D layers of TensorFlow 2 in detail, for general reference.

Convolution operation

As the name suggests, convolution in machine learning involves a "rolling" (sliding) action:
$\int x(a)w(t-a)da$
Here x is usually called the input data and w the convolution kernel. The formula shows that w is convolved over x: the kernel is multiplied with x in reversed order, as shown in the figure:
Note the red marks in the corner of the convolution kernel in the figure; they correspond to the operation in the convolution formula above.
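The discrete counterpart of the integral above can be sketched in plain Python. The helper name `convolve_full` and the sample values are illustrative assumptions, not part of TensorFlow; the point is only to show where the "reversed order" comes from.

```python
def convolve_full(x, w):
    """Discrete form of the integral: y[t] = sum over a of x[a] * w[t - a]."""
    n = len(x) + len(w) - 1
    y = [0.0] * n
    for t in range(n):
        for a in range(len(x)):
            k = t - a  # kernel index falls as a rises, so w is traversed in reverse
            if 0 <= k < len(w):
                y[t] += x[a] * w[k]
    return y


x = [1.0, 2.0, 3.0]   # input data
w = [1.0, 0.0, -1.0]  # convolution kernel (hypothetical values)
result = convolve_full(x, w)
print(result)
```

Note that the TensorFlow convolution layers slide the kernel without reversing it (strictly speaking a cross-correlation). Since the kernel weights are learned, this makes no practical difference.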

The main parameters are:

filters: the number of convolution kernels, i.e. the dimensionality of the output space (the number of output channels);
kernel_size: the size of the convolution kernel. In the Keras layers used below this is an integer or a tuple of integers, such as 2 or (2, 2). The lower-level tf.nn.conv2d instead takes the kernel itself as a Tensor of shape [filter_height, filter_width, in_channels, out_channels], that is [kernel height, kernel width, number of image channels, number of kernels], with the same type as input. Note that the third dimension, in_channels, equals the fourth dimension of input;
strides: the step size of the sliding window in each dimension of the image. In the Keras layers this is an integer or a tuple of integers (default 1). In tf.nn.conv2d it is a one-dimensional vector of length 4 whose first and fourth entries are 1 and whose second and third entries are the steps along the image height and width;
padding: the padding mode, a string that must be either "SAME" or "VALID" (written "same" and "valid" in the Keras layers below); it determines which convolution mode is used;
activation: the activation function; ReLU is the usual choice, and the Keras layers apply none unless one is given.

The size of the output data after convolution is:
$N=(W-F+2P)/S+1 \quad (\text{padding}=\text{"SAME"})\\ N=(W-F)/S+1 \quad (\text{padding}=\text{"VALID"})$
The parameters have the following meanings:

N: size of the output data, i.e. the output size after one convolution (a fractional result is rounded down)
W: size of the input image, i.e. the size of input
F: size of the convolution kernel, i.e. the value of kernel_size
P: number of padding pixels on each side. In SAME mode it is typically 1 (for a 3×3 kernel with stride 1); more generally TensorFlow pads just enough for the output size to be ⌈W/S⌉. In VALID mode it is 0. If this value is not given, P=0
S: stride, i.e. the value of strides; the default is 1

Applying the formula to a 3×3 input and a 2×2 kernel with stride 1 and no padding gives N = 2, i.e. a 2×2 output. With SAME padding the result is 3×3; with VALID padding the output is 2×2.
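The arithmetic of the formula above can be checked with a few lines of Python. Both function names are hypothetical helpers for illustration. The second one encodes TensorFlow's rule that SAME padding yields ⌈W/S⌉ outputs, rather than taking P as an input.

```python
import math


def conv_output_size(w, f, p=0, s=1):
    """N = (W - F + 2P) / S + 1, rounded down."""
    return (w - f + 2 * p) // s + 1


def same_output_size(w, s=1):
    """With SAME padding, TensorFlow pads so that N = ceil(W / S)."""
    return math.ceil(w / s)


valid_size = conv_output_size(3, 2)        # 3-wide input, 2-wide kernel, no padding
padded_size = conv_output_size(5, 3, p=1)  # 3-wide kernel with P=1 keeps the size
same_size = same_output_size(3)
print(valid_size, padded_size, same_size)
```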
For example:

Conv1D

import tensorflow as tf
# Random 3×1 input: shape is (batch=1, steps=3, channels=1)
inputs = tf.random.normal([1, 3, 1])
# One convolution kernel of size 2 (2×1); strides defaults to 1 and padding to "valid" (no padding)
conv = tf.keras.layers.Conv1D(1, 2)(inputs)
print(conv)

The input size is 3×1, so by the formula above the output size is 2×1.

import tensorflow as tf
# Random 3×1 input: shape is (batch=1, steps=3, channels=1)
inputs = tf.random.normal([1, 3, 1])
# One convolution kernel of size 2 (2×1); strides defaults to 1, padding is set to valid mode explicitly
conv = tf.keras.layers.Conv1D(1, 2, padding="valid")(inputs)
print(conv)

The input size is 3×1, so by the formula above the output size is 2×1; padding uses valid mode.

import tensorflow as tf
# Random 3×1 input: shape is (batch=1, steps=3, channels=1)
inputs = tf.random.normal([1, 3, 1])
# One convolution kernel of size 2 (2×1); strides defaults to 1, padding uses same mode
conv = tf.keras.layers.Conv1D(1, 2, padding="same")(inputs)
print(conv)

The input size is 3×1; with padding in same mode the output keeps the input size, 3×1.
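The Conv1D output sizes above can be reproduced without TensorFlow. The sketch below is a simplified stand-in, assuming a single channel, a single kernel and no bias. The function `conv1d` and the kernel values are hypothetical. The SAME branch follows TensorFlow's padding rule, where an odd amount of padding puts the extra zero on the right.

```python
import math


def conv1d(x, kernel, stride=1, padding="valid"):
    """Slide `kernel` over the list `x` (no kernel flip, no bias)."""
    f = len(kernel)
    if padding == "same":
        out_len = math.ceil(len(x) / stride)
        total = max((out_len - 1) * stride + f - len(x), 0)
        left = total // 2  # the extra zero, if any, goes on the right
        x = [0.0] * left + list(x) + [0.0] * (total - left)
    out_len = (len(x) - f) // stride + 1
    return [
        sum(x[i * stride + j] * kernel[j] for j in range(f))
        for i in range(out_len)
    ]


x = [1.0, 2.0, 3.0]   # the 3×1 input
kernel = [1.0, 1.0]   # one kernel of size 2
valid_out = conv1d(x, kernel)                  # 2 values
same_out = conv1d(x, kernel, padding="same")   # 3 values
print(valid_out, same_out)
```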

Conv2D

import tensorflow as tf
# Random 3×3 input: shape is (batch=1, height=3, width=3, channels=1)
inputs = tf.random.normal([1, 3, 3, 1])
# One 2×2 convolution kernel; strides defaults to 1 and padding to "valid" (no padding)
conv = tf.keras.layers.Conv2D(1, 2)(inputs)
print(conv)

The input size is 3×3, so by the formula above the output size is 2×2. The other modes follow from the formula in the same way.
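To see where the 2×2 output comes from, here is a minimal pure-Python sketch of a VALID 2D convolution on a 3×3 input. It assumes one channel, one kernel and no bias. The name `conv2d_valid` and the sample values are illustrative only.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide a 2D kernel over a 2D image without padding (no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    return [
        [
            sum(
                image[r * stride + i][c * stride + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]
feature_map = conv2d_valid(image, kernel)
print(feature_map)  # 2 rows by 2 columns
```

Each output value is the sum of one 2×2 window weighted by the kernel, and a 3×3 input has exactly 2×2 such windows at stride 1.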

Conv3D

import tensorflow as tf
# Random 3×3×3 input: shape is (batch=1, depth=3, height=3, width=3, channels=1)
inputs = tf.random.normal([1, 3, 3, 3, 1])
# One 2×2×2 convolution kernel; strides defaults to 1 and padding to "valid" (no padding)
conv = tf.keras.layers.Conv3D(1, 2)(inputs)
print(conv)

The input size is 3×3×3, so by the formula above the output size is 2×2×2. The other modes follow from the formula in the same way.

Pooling operation

Pooling downsamples the feature maps, which reduces computation and helps prevent overfitting. There are two main pooling methods: average pooling and max pooling.

Average pooling: take the average of the values in the pooling window and use it as the value for that window;
Max pooling: take the maximum of the values in the pooling window and use it as the value for that window.
Schematic diagram of the pooling operation:

Important parameters:

pool_size: the size of the pooling window; the default is [2, 2]
strides: as in convolution, the sliding step of the window in each dimension; by default it equals pool_size, i.e. [2, 2]
padding: as in convolution, either "SAME" or "VALID". The layer returns a Tensor whose shape is still of the form [batch, height, width, channels].
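The two pooling methods and the parameters above can be sketched in pure Python for a single-channel 2D input with VALID padding. The function `pool2d` and its `mode` argument are hypothetical, introduced only for this illustration.

```python
def pool2d(x, pool_size=(2, 2), strides=None, mode="max"):
    """Max or average pooling over a 2D list, without padding."""
    ph, pw = pool_size
    sh, sw = strides or pool_size  # when strides is not given, use pool_size
    out_h = (len(x) - ph) // sh + 1
    out_w = (len(x[0]) - pw) // sw + 1
    combine = max if mode == "max" else (lambda v: sum(v) / len(v))
    return [
        [
            combine([x[r * sh + i][c * sw + j]
                     for i in range(ph) for j in range(pw)])
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


x = [[1, 3, 2, 4],
     [5, 7, 6, 8],
     [9, 2, 1, 0],
     [3, 4, 5, 6]]
max_pooled = pool2d(x)              # 4×4 input, 2×2 window, stride 2
avg_pooled = pool2d(x, mode="avg")
print(max_pooled, avg_pooled)
```

With the default window and stride, a 4×4 input is reduced to 2×2: each output value summarizes one non-overlapping 2×2 window.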

The image size after pooling is given by:
$\frac{W-P_s}{S}+1$
where: