Hands-on Deep Learning_NiN
2022-08-04 21:08:00 【CV Small Rookie】

LeNet, AlexNet and VGG all share a common design pattern: a series of convolutional and pooling layers extracts spatial features, and fully connected layers then process the resulting representation. AlexNet and VGG improve on LeNet mainly by widening and deepening these two modules.
However, if fully connected layers are used, the spatial structure of the representation may be discarded entirely. Network in Network (NiN) offers a very simple solution: apply a multilayer perceptron over the channels at each pixel position. In practice this means adding two 1 x 1 convolutional layers, because, as mentioned earlier, a 1 x 1 convolution is equivalent to an MLP whose parameters are shared across all pixel positions.
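The equivalence between a 1 x 1 convolution and a per-pixel MLP layer can be checked numerically. The following is a minimal sketch, not part of the original post: the tensor sizes and variable names are illustrative assumptions. It copies the weights of a 1 x 1 `nn.Conv2d` into an `nn.Linear` and applies the linear layer to the channel vector at every pixel.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
batch, c_in, c_out, h, w = 2, 3, 4, 5, 5
x = torch.randn(batch, c_in, h, w)

conv = nn.Conv2d(c_in, c_out, kernel_size=1)
linear = nn.Linear(c_in, c_out)

# Copy the 1 x 1 kernel of shape (c_out, c_in, 1, 1) into the linear
# layer, whose weight has shape (c_out, c_in).
with torch.no_grad():
    linear.weight.copy_(conv.weight.reshape(c_out, c_in))
    linear.bias.copy_(conv.bias)

y_conv = conv(x)
# Move channels last so the linear layer acts on each pixel's channel
# vector, then move them back to (batch, channels, height, width).
y_linear = linear(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)

same = torch.allclose(y_conv, y_linear, atol=1e-5)
print("1 x 1 conv matches per-pixel linear layer:", same)
```

Because the same weight matrix is applied at every pixel, the layer mixes channels without mixing neighbouring positions, which is why the spatial layout survives.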

As the diagram shows, a NiN network is built from NiN blocks. Each nin_block consists of one ordinary convolutional layer followed by two 1 x 1 convolutional layers:

At the output, NiN drops the fully connected layers altogether. The last NiN block has as many output channels as there are classes, a global average pooling layer reduces the height and width of each feature map to 1, and Flatten turns the result into the final output.
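A minimal sketch of this output head, assuming a made-up feature map with 10 channels and a 5 x 5 spatial size: `nn.AdaptiveAvgPool2d((1, 1))` followed by `nn.Flatten()` gives the same result as taking the mean over the height and width dimensions by hand.

```python
import torch
from torch import nn

# Hypothetical feature map: batch of 2, 10 channels (one per class), 5 x 5.
x = torch.arange(2 * 10 * 5 * 5, dtype=torch.float32).reshape(2, 10, 5, 5)

# Global average pooling collapses each channel to a single value,
# and Flatten removes the two trailing dimensions of size 1.
head = nn.Sequential(nn.AdaptiveAvgPool2d((1, 1)), nn.Flatten())
out = head(x)

# The same computation written as an explicit mean over height and width.
manual = x.mean(dim=(2, 3))

print(out.shape)
print(torch.allclose(out, manual))
```

The head itself has no learnable parameters, so each class score is simply the average activation of the corresponding channel.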
from torch import nn


def nin_block(in_channels, out_channels, kernel_size, strides, padding):
    # One ordinary convolution followed by two 1 x 1 convolutions,
    # each followed by a ReLU.
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size, strides, padding),
        nn.ReLU(),
        nn.Conv2d(out_channels, out_channels, kernel_size=1), nn.ReLU(),
        nn.Conv2d(out_channels, out_channels, kernel_size=1), nn.ReLU())


class NiN(nn.Module):
    def __init__(self):
        super(NiN, self).__init__()
        self.model = nn.Sequential(
            nin_block(1, 96, kernel_size=11, strides=4, padding=0),
            nn.MaxPool2d(3, stride=2),
            nin_block(96, 256, kernel_size=5, strides=1, padding=2),
            nn.MaxPool2d(3, stride=2),
            nin_block(256, 384, kernel_size=3, strides=1, padding=1),
            nn.MaxPool2d(3, stride=2),
            nn.Dropout(0.5),
            # The number of label classes is 10
            nin_block(384, 10, kernel_size=3, strides=1, padding=1),
            nn.AdaptiveAvgPool2d((1, 1)),
            # Convert the 4D output to a 2D output with shape (batch size, 10)
            nn.Flatten())

    def forward(self, x):
        x = self.model(x)
        return x

The output shape of each layer:
Sequential output shape: torch.Size([1, 96, 54, 54])
MaxPool2d output shape: torch.Size([1, 96, 26, 26])
Sequential output shape: torch.Size([1, 256, 26, 26])
MaxPool2d output shape: torch.Size([1, 256, 12, 12])
Sequential output shape: torch.Size([1, 384, 12, 12])
MaxPool2d output shape: torch.Size([1, 384, 5, 5])
Dropout output shape: torch.Size([1, 384, 5, 5])
Sequential output shape: torch.Size([1, 10, 5, 5])
AdaptiveAvgPool2d output shape: torch.Size([1, 10, 1, 1])
Flatten output shape: torch.Size([1, 10])
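The spatial sizes listed above can be checked by hand with the usual output-size formula for convolution and pooling layers, floor((n + 2p - k) / s) + 1. The sketch below is plain Python and makes one assumption the post does not state: a 224 x 224 input, which is consistent with the 54 x 54 output of the first block. The helper name `conv_out` and the layer labels are illustrative.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output height/width of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


size = 224  # assumed input height/width; not stated in the text above
sizes = []

# (name, kernel, stride, padding) for each layer that can change the
# spatial size. The 1 x 1 convolutions inside a block leave it unchanged.
layers = [
    ("nin_block 1", 11, 4, 0),
    ("max pool 1", 3, 2, 0),
    ("nin_block 2", 5, 1, 2),
    ("max pool 2", 3, 2, 0),
    ("nin_block 3", 3, 1, 1),
    ("max pool 3", 3, 2, 0),
    ("nin_block 4", 3, 1, 1),
]

for name, k, s, p in layers:
    size = conv_out(size, k, s, p)
    sizes.append(size)
    print(f"{name}: {size} x {size}")
```

The final 5 x 5 map is then reduced to 1 x 1 by the global average pooling layer, independently of this formula.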








