1D, 2D, 3D convolution operations in pytorch
2022-07-31 05:33:00 【Cheng-O】
The convolution operation uses a sliding window to perform cross-correlation on the data in order to extract features.
1D Convolution
One-dimensional convolution is used to process sequence data. Each sequence element is usually encoded before input, so the input sequence has the format [batch_size, seq_len, embedding_size]. Here embedding_size plays the same role as the number of channels. Because nn.Conv1d expects [batch_size, channels, seq_len], permute(0, 2, 1) is generally applied before processing to convert the input to [batch_size, embedding_size, seq_len], so that embedding_size serves as the number of input channels of the one-dimensional convolution.
e.g.:
self.conv1 = nn.Conv1d(in_channels=n_feature, out_channels=n_feature, kernel_size=1, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')
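The permute step described above can be sketched as follows. This is a minimal illustration, not the author's model: the sizes (4, 10, 16), the 32 output channels, and the kernel_size=3 / padding=1 choice are all assumptions picked only to make the shapes easy to follow.

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumed, not from the article).
batch_size, seq_len, embedding_size = 4, 10, 16

# Encoded sequence in the usual layout: [batch_size, seq_len, embedding_size].
x = torch.randn(batch_size, seq_len, embedding_size)

# nn.Conv1d expects [batch_size, channels, length],
# so embedding_size is moved to dimension 1.
x = x.permute(0, 2, 1)

# in_channels must match embedding_size; padding=1 keeps seq_len unchanged
# for kernel_size=3 and stride=1.
conv1 = nn.Conv1d(in_channels=embedding_size, out_channels=32,
                  kernel_size=3, padding=1)
y = conv1(x)

print(x.shape)  # torch.Size([4, 16, 10])
print(y.shape)  # torch.Size([4, 32, 10])
```

If later layers expect [batch_size, seq_len, features] again, the same permute(0, 2, 1) call can be applied to the output to swap the dimensions back.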
2D Convolution
Two-dimensional convolution was the earliest convolution operation proposed for processing high-dimensional data and is typically applied to images. Its input has the format [batch_size, channel_num, H, W].
PS: In Python, cv2 is commonly used to read images, and the array it returns has the format [H, W, C]. Use torchvision.transforms.ToTensor() to convert an image read by cv2 to the [C, H, W] format used in PyTorch.
self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, padding=2, bias=False, dilation=1)  # The first two arguments are the numbers of input and output channels
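The layout conversion and the 2D convolution can be sketched together. To stay self-contained, this example does not call cv2 or torchvision; a random uint8 tensor stands in for a cv2 image, and the permute-and-scale step imitates what ToTensor() does for such input. The image size, channel counts, and padding=1 are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Stand-in for an image read by cv2: uint8 values, layout [H, W, C] (assumed size).
img_hwc = torch.randint(0, 256, (32, 48, 3), dtype=torch.uint8)

# [H, W, C] -> [C, H, W], then scale to the range [0, 1].
img_chw = img_hwc.permute(2, 0, 1).float() / 255.0

# Add the batch dimension: [batch_size, channel_num, H, W].
batch = img_chw.unsqueeze(0)

# kernel_size=3 with padding=1 and stride=1 keeps H and W unchanged.
conv2 = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3,
                  stride=1, padding=1, bias=False)
out = conv2(batch)

print(batch.shape)  # torch.Size([1, 3, 32, 48])
print(out.shape)    # torch.Size([1, 8, 32, 48])
```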
PS: Atrous (dilated) convolution
One of the convolution parameters is dilation. If it is set to 1, the layer is an ordinary convolution; if it is set to a value greater than 1, it is an atrous ("hole") convolution, which obtains a larger receptive field from a small convolution kernel. Take a 3*3 kernel with dilation=2 as an example. An ordinary convolution cross-correlates the kernel with a 3*3 sub-region of the feature map. With atrous convolution, the 3*3 kernel is expanded to 5*5 by inserting zeros between its elements, so each cross-correlation covers a 5*5 region. A dilation of 2 means that adjacent kernel elements are 2 positions apart, i.e. one zero sits between them.
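The 3*3-to-5*5 equivalence can be checked numerically. The sketch below assumes a single-channel 8*8 input and random weights; it compares a 3*3 convolution with dilation=2 against an ordinary convolution whose 5*5 kernel holds the same weights with zeros in between.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

x = torch.randn(1, 1, 8, 8)   # [batch_size, channel_num, H, W] (assumed size)
w3 = torch.randn(1, 1, 3, 3)  # 3*3 kernel

# Atrous convolution: 3*3 kernel, elements 2 positions apart.
y_dilated = F.conv2d(x, w3, dilation=2)

# Equivalent dense kernel: place the 3*3 weights on every second
# position of a 5*5 kernel, leaving zeros in between.
w5 = torch.zeros(1, 1, 5, 5)
w5[:, :, ::2, ::2] = w3
y_dense = F.conv2d(x, w5)

print(y_dilated.shape)  # torch.Size([1, 1, 4, 4])
print(torch.allclose(y_dilated, y_dense, atol=1e-5))
```

In general the effective kernel size is dilation * (kernel_size - 1) + 1, which gives 5 for kernel_size=3 and dilation=2, while the number of learned weights stays at 3*3.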
3D Convolution
3D convolution is used to extract features from video data. The input format is [batch_size, channel_num, t_len, H, W], where t_len is the number of frames.
self.conv3d = nn.Conv3d(in_channels=in_channels, out_channels=output_channels, kernel_size=kernel_shape, stride=stride, padding=0, bias=self._use_bias)
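A small shape check for the 3D case, as a sketch under assumed sizes: a batch of 2 clips, 3 channels, 8 frames of 32*32 pixels, and a cubic kernel of size 3. With padding=0 as in the line above, every convolved dimension shrinks by kernel_size - 1.

```python
import torch
import torch.nn as nn

# [batch_size, channel_num, t_len, H, W] with illustrative (assumed) sizes.
video = torch.randn(2, 3, 8, 32, 32)

# kernel_size=3 is applied along time, height and width alike.
conv3d = nn.Conv3d(in_channels=3, out_channels=16, kernel_size=3,
                   stride=1, padding=0, bias=True)
features = conv3d(video)

# With padding=0 and stride=1, t_len, H and W each shrink by 2.
print(features.shape)  # torch.Size([2, 16, 6, 30, 30])
```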
Summary: Each step up in convolution dimensionality adds one more dimension to the input, and in every case the first two dimensions of the input are batch_size and channel_num.