Understanding weight sharing in convolutional neural networks
2022-07-26 15:58:00 【Hua Weiyun】
First, let's look at the principle of weight sharing in a single-layer network.
Put simply, weight sharing means sharing the values of a filter: the same convolution kernel weights are reused at every position of the input.
Convolutional neural networks are built on two core ideas:
1. Local connectivity (Local Connectivity)
2. Parameter sharing of convolution kernels (Parameter Sharing)
A key function of both is to reduce the number of parameters, making computation simple and efficient, so the network can operate on very large data sets.
Let's use the most intuitive diagrams to clarify the role of each.
The correct way to use a CNN is shown below.
It can be summed up as: a single 3×3 convolution kernel is scanned across the image to extract features. 3×3 and 5×5 are the most commonly used kernel sizes; if the number of output channels is 32 (32 and 64 are commonly used channel counts), then the total number of parameters is 3×3×32 = 288.
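As a quick sanity check, here is a minimal PyTorch sketch (assuming a single-channel input, which the figure implies) confirming the parameter count of a shared 3×3 kernel with 32 output channels:

```python
import torch.nn as nn

# one 3x3 kernel shared across the whole image, 32 output channels,
# single input channel, no bias: 3*3*32 = 288 weights
conv = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, bias=False)
print(sum(p.numel() for p in conv.parameters()))  # 288
```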
- Without parameter sharing
If the operation in the figure above is implemented without parameter sharing, the convolution kernel structure becomes the one shown in the figure below.
It is not hard to see that the number of parameters in the convolution kernel now matches the size of the image pixel matrix: every output position has its own weight.
For example: Inception V3 takes 192×192 input images. **If the first layer's 3×3×32 convolution kernels give up parameter sharing, the parameter count becomes 192×192×32, about 1.2 million parameters, or 4,096 times the original 288.**
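Following the article's way of counting (one weight per output pixel per channel), a short sketch of the comparison:

```python
# parameters with sharing: one 3x3 kernel per output channel
shared = 3 * 3 * 32            # 288

# without sharing, counting one weight per output pixel and
# channel: the kernel grows to the size of the image
unshared = 192 * 192 * 32      # 1,179,648 -> about 1.2 million

print(unshared / shared)       # 4096.0
```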
- Without local connectivity
If local connectivity is not used, we arrive at a fully connected network (fully connected): every input element is connected to every neuron in the hidden layer. The network structure is shown below.
The parameter count then becomes (number of input pixels) × (number of hidden nodes). Because the pixel matrix is very large, a correspondingly large number of hidden-layer nodes is usually chosen, so a single hidden layer typically has more than 10 million parameters, which makes the network very difficult to train.
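For illustration, here is a hedged sketch of that count; the hidden width H = 1000 is an assumption, since the article only says that many hidden nodes would be chosen:

```python
import torch.nn as nn

H = 1000  # assumed hidden width, for illustration only
fc = nn.Linear(192 * 192, H)  # one fully connected hidden layer
n_params = sum(p.numel() for p in fc.parameters())
print(n_params)  # 36,865,000 weights + biases -> well over 10 million
```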
Below is PyTorch code demonstrating weight sharing across multiple layers of a network.
```python
import torch
import torch.nn as nn
import random
import matplotlib.pyplot as plt


# draw the loss curve
def plot_curve(data):
    fig = plt.figure()
    plt.plot(range(len(data)), data, color='blue')
    plt.legend(['value'], loc='upper right')
    plt.xlabel('step')
    plt.ylabel('value')
    plt.show()


class DynamicNet(nn.Module):
    def __init__(self, D_in, H, D_out):
        super(DynamicNet, self).__init__()
        self.input_linear = nn.Linear(D_in, H)
        self.middle_linear = nn.Linear(H, H)
        self.output_linear = nn.Linear(H, D_out)

    def forward(self, x):
        h_relu = self.input_linear(x).clamp(min=0)
        # reuse the same middle_linear module 0-3 times:
        # every pass through the loop shares one weight matrix
        for _ in range(random.randint(0, 3)):
            h_relu = self.middle_linear(h_relu).clamp(min=0)
        y_pred = self.output_linear(h_relu)
        return y_pred


# N is the batch size; D_in is the input dimension
# H is the hidden layer dimension; D_out is the output dimension
N, D_in, H, D_out = 64, 1000, 100, 10

# simulated training data
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

model = DynamicNet(D_in, H, D_out)
criterion = nn.MSELoss(reduction='sum')
# training this odd model with vanilla stochastic gradient
# descent is difficult, so we use momentum
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)

loss_list = []
for t in range(500):
    # forward pass
    y_pred = model(x)
    # compute the loss
    loss = criterion(y_pred, y)
    loss_list.append(loss.item())
    # zero the gradients, backpropagate, update the weights
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

plot_curve(loss_list)
```
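Because `middle_linear` is a single module reused inside the loop, every "middle" layer in a given forward pass shares the same weight matrix, and its gradient accumulates a contribution from each reuse. A quick check (a sketch; the expected count follows from the dimensions above):

```python
# the total parameter count is independent of how many times
# middle_linear is applied in forward():
# (1000*100 + 100) + (100*100 + 100) + (100*10 + 10) = 111210
print(sum(p.numel() for p in model.parameters()))  # 111210
```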