[Deep Learning] PyTorch: From Introduction to Project Practice (11): Convolution Layers
2022-06-10 04:36:00 【JOJO's data analysis Adventure】
[Deep Learning]: "PyTorch: From Introduction to Project Practice" (11): Convolution Layers
- This article is part of the [Deep Learning]: "PyTorch: From Introduction to Project Practice" column, which records my notes on implementing deep learning with PyTorch. I try to update weekly; subscriptions are welcome!
- Personal homepage: JoJo's Data Analysis Adventure
- About me: I am a senior majoring in statistics, recommended for postgraduate study at a top-3 statistics program
- If this article helps you, please follow, like, bookmark, and subscribe to the column
- References: this column uses Mu Li's "Dive into Deep Learning" as its main study material; these are my study notes. My ability is limited, so corrections are welcome. Mu Li has also uploaded teaching videos and materials that you can study.
- Video: Dive into Deep Learning
- Textbook: Dive into Deep Learning

Convolutional Neural Networks (CNN): Implementing the Convolution Layer
We have already covered basic neural networks and some techniques for dealing with overfitting and underfitting. Now we formally begin studying convolutional neural networks. CNNs are a powerful class of neural networks designed for processing image data, and models built on convolutional architectures have dominated computer vision: today nearly all academic competitions and commercial applications involving image recognition, object detection, or semantic segmentation are based on them. One major challenge in computer vision is that the input can be very large. For example, a $64 \times 64$ image with 3 channels already has $64 \times 64 \times 3 = 12288$ features. To handle larger images we need convolution, which is the core operation of a convolutional neural network.
1. Introduction
Let's start with an example. Suppose we are given the following picture:
To let the computer figure out what is in this picture, the first thing we can do is detect the vertical edges in the image. For instance, the railing in this picture corresponds to vertical lines, and the silhouettes of the pedestrians are also, to some extent, vertical; these lines are what a vertical edge detector outputs. Likewise, we may want to detect horizontal edges, such as the clearly horizontal rails. So how do we detect these edges in an image?
We can construct a $3 \times 3$ matrix, called a filter or kernel. Let's walk through an example from Andrew Ng's course to see why this performs edge detection.

The figure above shows a simple $6 \times 6$ image whose left half has pixel value 10 and whose right half has pixel value 0. Viewed as a picture, the left part looks bright (10 is a brighter pixel value) and the right part looks dark; here 0 is drawn as gray, although it could also be drawn as black. There is a prominent vertical edge in the middle of the image, where it transitions from bright to dark. When we convolve this image with a $3 \times 3$ filter that has bright (positive) values in its left column, zeros in the middle, and dark (negative) values on the right, we obtain the matrix on the right, which lights up exactly at that edge. There are many other edge detection methods, which we will introduce later alongside specific computer vision tasks. Next, let's look at how convolution is computed.
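To make the figure concrete, here is a small reconstruction of the example (the pixel values and the [1, 0, -1] filter are the standard ones from Andrew Ng's course; the loop is the naive cross-correlation that we implement properly in section 3 below):

```python
import torch

# A 6x6 image: bright (10) on the left half, dark (0) on the right half
image = torch.tensor([[10.0, 10.0, 10.0, 0.0, 0.0, 0.0]] * 6)

# A classic 3x3 vertical-edge filter: positive column, zeros, negative column
kernel = torch.tensor([[1.0, 0.0, -1.0],
                       [1.0, 0.0, -1.0],
                       [1.0, 0.0, -1.0]])

# Naive cross-correlation: slide the filter and take inner products
h, w = kernel.shape
out = torch.zeros((image.shape[0] - h + 1, image.shape[1] - w + 1))
for i in range(out.shape[0]):
    for j in range(out.shape[1]):
        out[i, j] = (image[i:i+h, j:j+w] * kernel).sum()

print(out)  # every row is [0., 30., 30., 0.]
```

The large values (30) in the middle columns mark exactly where the bright-to-dark transition sits.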
2. The Convolution Operation
Let's look at how convolution is computed. Given an input matrix and a kernel, we start at the top-left corner of the input, take the inner product of that window with the kernel, then slide the window and compute the next inner product. The results form our output. The concrete calculation is:
$$0\times0+1\times1+3\times2+4\times3 = 19$$
$$1\times0+2\times1+4\times2+5\times3 = 25$$
$$3\times0+4\times1+6\times2+7\times3 = 37$$
$$4\times0+5\times1+7\times2+8\times3 = 43$$

As we can see, after the convolution the output is smaller than the raw input. Suppose the input matrix is $n \times n$ and the kernel is $f \times f$ (kernels are usually square); then the output is $(n-f+1) \times (n-f+1)$.
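As a quick sanity check of the output-size formula (a trivial helper of my own, not from the text):

```python
def conv_output_size(n: int, f: int) -> int:
    """Output side length of a valid convolution: n - f + 1."""
    return n - f + 1

print(conv_output_size(6, 3))  # 4: a 6x6 input with a 3x3 kernel gives a 4x4 output
print(conv_output_size(3, 2))  # 2: matches the 3x3-input, 2x2-kernel example above
```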
3. Code Implementation
3.1 Implementing Convolution from Scratch
""" Import related libraries """
import torch
from torch import nn
def corr2d(X,K):
""" Define convolution operations """
h, w = K.shape# Nuclear shape, Usually here h and w They are equal.
Y = torch.zeros((X.shape[0]-h+1, X.shape[1]-w+1))# Initialization output result
for i in range(Y.shape[0]):# The second part of the output matrix i That's ok
for j in range(Y.shape[1]):# The second part of the output matrix j Column
Y[i,j] = (X[i:i+h,j:j+w] * K).sum()# Calculate the corresponding inner product
return Y
Let's test whether it is correct:
X = torch.Tensor([[0.0,1.0,2.0],[3.0,4.0,5.0],[6.0,7.0,8.0]])
K = torch.Tensor([[0.0,1.0],[2.0,3.0]])
corr2d(X,K)
As you can see, this matches the result we computed by hand earlier.
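As an extra cross-check (my own addition), PyTorch's built-in F.conv2d, which also computes cross-correlation rather than a flipped mathematical convolution, gives the same numbers on this example:

```python
import torch
import torch.nn.functional as F

X = torch.tensor([[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]])
K = torch.tensor([[0.0, 1.0], [2.0, 3.0]])

# F.conv2d expects 4D tensors: (batch, channels, height, width)
out = F.conv2d(X.reshape(1, 1, 3, 3), K.reshape(1, 1, 2, 2))
print(out.reshape(2, 2))  # tensor([[19., 25.], [37., 43.]])
```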
3.2 Building a Convolution Layer
Earlier we saw how to construct linear layers, activation functions, dropout, and the like. Similarly, we define the convolution layer as a class:
class Conv2D(nn.Module):
    """A two-dimensional convolution layer"""
    def __init__(self, kernel_size):  # the kernel size is a hyperparameter
        super().__init__()
        self.weight = nn.Parameter(torch.rand(kernel_size))  # learnable weights
        self.bias = nn.Parameter(torch.zeros(1))  # learnable bias
    def forward(self, x):
        return corr2d(x, self.weight) + self.bias  # compute the convolution result
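A quick usage sketch (my own, with a hypothetical (2, 2) kernel size), showing that the layer produces the expected output shape; the weights here are random, so the output values themselves are not meaningful:

```python
import torch
from torch import nn

def corr2d(X, K):
    """2D cross-correlation, as defined in section 3.1"""
    h, w = K.shape
    Y = torch.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i+h, j:j+w] * K).sum()
    return Y

class Conv2D(nn.Module):
    def __init__(self, kernel_size):
        super().__init__()
        self.weight = nn.Parameter(torch.rand(kernel_size))
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        return corr2d(x, self.weight) + self.bias

conv = Conv2D(kernel_size=(2, 2))
X = torch.ones(6, 6)
Y = conv(X)
print(Y.shape)  # torch.Size([5, 5]), i.e. (6 - 2 + 1) x (6 - 2 + 1)
```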
3.3 Detecting Edges in an Image
First we define an X, where 1 stands for white pixels and 0 for black:
X = torch.ones(6, 6)
X[:, 2:4] = 0.0
X
Now let's perform edge detection. Here we construct a $1 \times 2$ kernel:
K = torch.tensor([[1.0,-1.0]])
K
Y = corr2d(X,K)
Y
Here, 1 indicates a vertical edge going from white to black, and -1 a vertical edge going from black to white.
The kernel defined above can only detect vertical edges. Now suppose we transpose X and want to detect the (now horizontal) edges:
Y = corr2d(X.t(),K)
Y
As you can see, this kernel cannot detect horizontal edges; we need to transpose the kernel as well:
Y = corr2d(X.t(),K.t())
Y
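A compact end-to-end check of this section (my own sketch, using PyTorch's built-in F.conv2d so the snippet is self-contained):

```python
import torch
import torch.nn.functional as F

X = torch.ones(6, 6)
X[:, 2:4] = 0.0
K = torch.tensor([[1.0, -1.0]])

def cc(x, k):
    """Cross-correlate a 2D image with a 2D kernel via F.conv2d."""
    return F.conv2d(x.contiguous()[None, None], k.contiguous()[None, None])[0, 0]

# Along the rows of X.t() every 1x2 window is constant, so the output is all zeros
print(cc(X.t(), K).abs().sum())  # tensor(0.)
# Transposing the kernel as well recovers the edges: a row of 1s and a row of -1s
print(cc(X.t(), K.t()))
```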
3.4 Learning a Convolution Kernel
So far we have defined the convolution kernel by hand. But for more complicated tasks it is hard to specify a kernel directly, so let's see whether we can learn the kernel from the input and output matrices. Here the loss function is the squared error between Y and the convolution output.
# Construct a 2D convolution layer with 1 output channel and a kernel of shape (1, 2)
conv2d = nn.Conv2d(1, 1, kernel_size=(1, 2), bias=False)

# This layer uses a four-dimensional input/output format
# (batch size, channels, height, width); here batch size and channel count are both 1.
# Note that Y was last overwritten with the transposed result above,
# so we first recompute the 6x5 vertical-edge target.
Y = corr2d(X, K)
X = X.reshape((1, 1, 6, 6))
Y = Y.reshape((1, 1, 6, 5))
lr = 3e-2  # learning rate

for i in range(10):
    Y_hat = conv2d(X)  # compute the convolution
    l = (Y_hat - Y) ** 2  # squared-error loss
    conv2d.zero_grad()
    l.sum().backward()  # backpropagate to compute the gradient
    # update the kernel by gradient descent
    conv2d.weight.data[:] -= lr * conv2d.weight.grad
    if (i + 1) % 2 == 0:
        print(f'epoch {i+1}, loss {l.sum():.3f}')
As you can see, after 10 iterations the loss is already low. Let's look at the learned kernel parameters:
conv2d.weight
The result is very close to the kernel with entries 1 and -1 that we defined earlier.
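To double-check this claim end to end, here is a self-contained rerun of the training loop (my own condensed sketch: the target Y is rebuilt with F.conv2d, and a seed is fixed for reproducibility):

```python
import torch
from torch import nn
import torch.nn.functional as F

torch.manual_seed(0)  # fixed seed so the sketch is reproducible

# Rebuild the data from section 3.3: 6x6 input, 6x5 vertical-edge target
img = torch.ones(6, 6)
img[:, 2:4] = 0.0
K = torch.tensor([[1.0, -1.0]])
X = img.reshape(1, 1, 6, 6)
Y = F.conv2d(X, K.reshape(1, 1, 1, 2))  # target, shape (1, 1, 6, 5)

conv2d = nn.Conv2d(1, 1, kernel_size=(1, 2), bias=False)
lr = 3e-2
for i in range(10):
    l = ((conv2d(X) - Y) ** 2).sum()  # squared-error loss
    conv2d.zero_grad()
    l.backward()
    with torch.no_grad():  # plain gradient-descent step on the kernel
        conv2d.weight -= lr * conv2d.weight.grad

w = conv2d.weight.detach().reshape(2)
print(w)  # close to tensor([ 1., -1.])
```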
This chapter introduced how to compute a convolution and how to define a convolution layer. Later chapters will cover padding, multi-channel convolutions, pooling layers, and more.
That's all for this chapter. If it helped you, please like, bookmark, comment, and follow!