1D, 2D, and 3D convolution operations in PyTorch

2022-07-31 05:33:00 【Cheng-O】

The convolution operation uses a sliding window to compute cross-correlations over the data in order to extract features.

1D Convolution

One-dimensional convolution is used to process sequence data. Each sequence element is usually encoded (embedded) before input, so the input sequence has the shape [batch_size, seq_len, embedding_size]. Here embedding_size plays the same role as the number of channels. Therefore, before the convolution, permute(0, 2, 1) is generally applied to convert the input to [batch_size, embedding_size, seq_len], so that embedding_size sits in the middle dimension and serves as the number of input channels of the one-dimensional convolution.

For example:

self.conv1 = nn.Conv1d(in_channels=n_feature, out_channels=n_feature, kernel_size=1, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')
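The shape handling described above can be sketched as follows. This is a minimal illustration, not the author's model: the sizes (a batch of 4, sequence length 10, embedding size 32, 64 output channels) and the kernel_size=3, padding=1 settings are assumptions chosen only to make the shapes easy to follow.

```python
import torch
import torch.nn as nn

# Assumed sizes, for illustration only
batch_size, seq_len, embedding_size = 4, 10, 32

# Encoded sequence as it usually arrives: [batch_size, seq_len, embedding_size]
x = torch.randn(batch_size, seq_len, embedding_size)

# Move embedding_size to the middle so it acts as the channel dimension:
# [batch_size, embedding_size, seq_len]
x = x.permute(0, 2, 1)

# in_channels must match embedding_size; out_channels=64 is an arbitrary choice
conv1 = nn.Conv1d(in_channels=embedding_size, out_channels=64, kernel_size=3, padding=1)
y = conv1(x)

print(y.shape)  # torch.Size([4, 64, 10])
```

With kernel_size=3 and padding=1 the sequence length is preserved; only the channel dimension changes from 32 to 64.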

2D Convolution

Two-dimensional convolution was the earliest convolution operation proposed for processing high-dimensional data. Its input has the shape [batch_size, channel_num, H, W].

PS: In Python, images are usually read with cv2, which returns them in [H, W, C] format. torchvision.transforms.ToTensor() converts an image read by cv2 to the [C, H, W] format used in PyTorch (for uint8 input it also scales the values to the range [0, 1]).

self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, padding=2, bias=False, dilation=1)  # the first two arguments are the numbers of input and output channels
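A rough sketch of the [H, W, C] to [C, H, W] conversion followed by a 2D convolution. To keep it self-contained, a random uint8 NumPy array stands in for an image read by cv2, and the layout change and scaling that ToTensor() performs are written out with plain torch calls. The image size (32x48), the channel counts (3 in, 8 out) and padding=1 are assumptions for illustration.

```python
import numpy as np
import torch
import torch.nn as nn

# Stand-in for an image read by cv2: uint8 array in [H, W, C] layout
# (assumed size 32x48 with 3 channels)
img = np.random.randint(0, 256, size=(32, 48, 3), dtype=np.uint8)

# The same layout change and scaling that ToTensor() applies to a uint8 array:
# [H, W, C] -> [C, H, W], values scaled to [0, 1]
t = torch.from_numpy(img).permute(2, 0, 1).float() / 255.0

# Add the batch dimension: [1, C, H, W]
x = t.unsqueeze(0)

# 3 input channels, 8 output channels (assumed); padding=1 keeps H and W
# unchanged for a 3x3 kernel with stride 1
conv2 = nn.Conv2d(3, 8, kernel_size=3, stride=1, padding=1, bias=False)
y = conv2(x)

print(x.shape)  # torch.Size([1, 3, 32, 48])
print(y.shape)  # torch.Size([1, 8, 32, 48])
```

Note that the batch dimension has to be added explicitly, since a single converted image only has the [C, H, W] part of the expected input shape.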

PS: Atrous convolution

One of the parameters of the convolution operation is dilation. If it is set to 1, the convolution is an ordinary one; if it is greater than 1, it is an atrous (dilated, or "hole") convolution. Atrous convolution obtains a larger receptive field with a smaller convolution kernel. Take a 3*3 kernel with dilation=2 as an example. An ordinary convolution takes a 3*3 sub-region of the feature map and cross-correlates it with the kernel. With atrous convolution, the 3*3 kernel is effectively expanded to 5*5 by inserting zeros between its elements, so each cross-correlation covers a 5*5 region. A dilation of 2 means that adjacent kernel elements are spaced 2 positions apart, that is, one zero is inserted between them.
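The 3*3 to 5*5 equivalence can be checked numerically. The sketch below compares a 3*3 kernel applied with dilation=2 against the same weights placed into a 5*5 kernel with zeros in between, applied as an ordinary convolution. The 7x7 single-channel input and the random weights are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

x = torch.randn(1, 1, 7, 7)  # [batch_size, channel_num, H, W], assumed size
w = torch.randn(1, 1, 3, 3)  # one 3x3 kernel

# Atrous convolution: 3x3 kernel with dilation=2
y_dilated = F.conv2d(x, w, dilation=2)

# Equivalent ordinary convolution: the same weights spread over a 5x5 kernel,
# with a zero inserted between adjacent elements
w_expanded = torch.zeros(1, 1, 5, 5)
w_expanded[:, :, ::2, ::2] = w
y_expanded = F.conv2d(x, w_expanded)

# Effective kernel size: dilation * (kernel_size - 1) + 1 = 2 * (3 - 1) + 1 = 5
same = torch.allclose(y_dilated, y_expanded, atol=1e-5)

print(y_dilated.shape)  # torch.Size([1, 1, 3, 3])
print(same)
```

Both versions cover a 5*5 region per output element, but the dilated one keeps only 9 weights.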

3D Convolution

3D convolution is used to extract features from video data. The input has the shape [batch_size, channel_num, t_len, H, W].

self.conv3d = nn.Conv3d(in_channels=in_channels, out_channels=output_channels, kernel_size=kernel_shape, stride=stride, padding=0, bias=self._use_bias)
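A small sketch of the shapes involved in a 3D convolution over a video clip. The clip size (2 clips, 3 channels, 8 frames of 32x32), the 16 output channels and the (3, 3, 3) kernel are assumptions for illustration; the original leaves in_channels, output_channels, kernel_shape and stride unspecified.

```python
import torch
import torch.nn as nn

# Assumed clip: [batch_size, channel_num, t_len, H, W]
x = torch.randn(2, 3, 8, 32, 32)

# The kernel slides over time as well as height and width
conv3d = nn.Conv3d(in_channels=3, out_channels=16, kernel_size=(3, 3, 3),
                   stride=1, padding=0, bias=True)
y = conv3d(x)

# With padding=0 and a 3x3x3 kernel, t_len, H and W each shrink by 2
print(y.shape)  # torch.Size([2, 16, 6, 30, 30])
```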

Summary: each step up in the dimensionality of the convolution adds one more dimension to the input, and in every case the first two dimensions of the input are batch_size and channel_num.

Original site

Copyright notice
This article was written by [Cheng-O]. Please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/212/202207310507389172.html
