Convolutional neural network
2022-07-28 06:14:00 【Jiyu Wangchuan】
1. Review of fully connected networks
A fully connected neural network connects every neuron to every neuron in the adjacent layers before and after it; the input is a feature vector and the output is the prediction.
$\text{Number of parameters} = \sum_{\text{layers}} (\text{preceding layer} \times \text{following layer} + \text{following layer})$, i.e. weights plus biases, summed over the layers.
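As a quick sanity check of this formula, the sketch below (a hypothetical 784 → 128 → 10 fully connected network, e.g. a flattened 28×28 grayscale image as input; the sizes are illustrative only) compares a hand count with Keras' own parameter count:

import tensorflow as tf

# Hypothetical fully connected network: 784 -> 128 -> 10
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Hand count: (784*128 + 128) + (128*10 + 10) = 100480 + 1290 = 101770
print(model.count_params())  # 101770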
In practice, image resolution is much higher than this, and most images are in color, as shown in the figure below. Although fully connected networks are generally considered strong classifiers, they have far too many parameters to optimize, which easily causes the model to overfit.
To avoid overfitting caused by the excessive number of parameters, the raw image is usually not fed into the network directly. Instead, features are first extracted from the image, and the extracted features are then fed into the fully connected network.
2. Convolutional neural networks
2.1 Convolution calculation process
Convolution:
- Convolution can be regarded as an effective way to extract image features.
- A square convolution kernel is usually used. The kernel slides over the input feature map with a specified stride, traversing every pixel of the input feature map. At each step, the kernel overlaps a region of the input feature map; the overlapping elements are multiplied element-wise and summed, and the bias term is added, giving one pixel of the output feature map.
As shown in the figure below, a 3×3×1 kernel is convolved over a 5×5×1 single-channel image to obtain the corresponding result.
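To make the multiply-and-add concrete, here is a minimal NumPy sketch (kernel and input values chosen arbitrarily for illustration) of sliding a 3×3 kernel over a 5×5 single-channel input with stride 1; the output is 3×3:

import numpy as np

x = np.arange(25, dtype=np.float32).reshape(5, 5)  # 5x5 single-channel input (arbitrary values)
k = np.ones((3, 3), dtype=np.float32)              # 3x3 convolution kernel (arbitrary values)
b = 1.0                                            # bias term

out = np.zeros((3, 3), dtype=np.float32)           # output side length: 5 - 3 + 1 = 3
for i in range(3):
    for j in range(3):
        # multiply the overlapping region element-wise, sum, then add the bias
        out[i, j] = np.sum(x[i:i + 3, j:j + 3] * k) + b

print(out.shape)  # (3, 3)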
For color (multi-channel) images, the number of channels of the convolution kernel matches that of the input feature map; the kernel and the input are multiplied at corresponding positions and summed over all channels. As shown in the figure below, a three-channel kernel is convolved over a three-channel color feature map.
Using several convolution kernels in the same layer extracts several kinds of features from the input; the number of kernels determines the number of channels (depth) of the output feature map, as the quick check below illustrates.
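A shape check (an arbitrary 32×32 RGB input and 16 filters, chosen only for illustration) confirms that the number of kernels sets the number of output channels:

import tensorflow as tf

x = tf.random.normal([1, 32, 32, 3])                      # a batch of one 32x32 RGB image
conv = tf.keras.layers.Conv2D(filters=16, kernel_size=3)  # each kernel is 3x3x3
print(conv(x).shape)                                      # (1, 30, 30, 16): 16 output channels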
2.2 Receptive field
Receptive field: the size of the region on the original image that each pixel of an output feature map in a convolutional neural network maps back to.
The figure below is a schematic diagram of the receptive field.
When convolution kernels of different sizes are used, the main difference is the size of the receptive field. For this reason, several layers of small kernels are often used in place of one large kernel, keeping the receptive field the same while reducing the number of parameters and the amount of computation.
For example, it is very common to replace one layer of 5×5 kernels with two layers of 3×3 kernels, as shown in the figure below.
Here is a detailed derivation. Let the width and height of the input feature map both be $x$ and the stride of the convolution be 1. Clearly, two $3 \times 3$ kernels have $9 + 9 = 18$ parameters, fewer than the $25$ parameters of one $5 \times 5$ kernel.
In terms of computation, for the $5 \times 5$ kernel the output feature map has $(x - 5 + 1)^2$ pixels, and each pixel requires $5 \times 5 = 25$ multiply-add operations, so the total amount of computation is $25(x - 5 + 1)^2 = 25x^2 - 200x + 400$.
For the two $3 \times 3$ kernels, the output feature map of the first kernel has $(x - 3 + 1)^2$ pixels, each requiring $3 \times 3 = 9$ multiply-adds, and the output feature map of the second kernel has $(x - 3 + 1 - 3 + 1)^2$ pixels, each also requiring $9$ multiply-adds, so the total amount of computation is $9(x - 3 + 1)^2 + 9(x - 3 + 1 - 3 + 1)^2 = 18x^2 - 108x + 180$. Comparing the two expressions, once $x > 10$ the two $3 \times 3$ kernels also require less computation than the single $5 \times 5$ kernel.
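The shape side of this argument can be checked directly. The sketch below (an assumed 32×32 single-channel input, one output channel, no bias, chosen to match the per-kernel counts above) shows that two stacked 3×3 valid convolutions and one 5×5 valid convolution give output feature maps of the same size, with 18 versus 25 parameters:

import tensorflow as tf

x = tf.random.normal([1, 32, 32, 1])

one_5x5 = tf.keras.Sequential([tf.keras.layers.Conv2D(1, 5, use_bias=False)])
two_3x3 = tf.keras.Sequential([tf.keras.layers.Conv2D(1, 3, use_bias=False),
                               tf.keras.layers.Conv2D(1, 3, use_bias=False)])

print(one_5x5(x).shape, one_5x5.count_params())  # (1, 28, 28, 1) 25
print(two_3x3(x).shape, two_3x3.count_params())  # (1, 28, 28, 1) 18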
2.3 Output feature size calculation
Output feature map size: once the convolution calculation process is understood, the size of the output feature map can be computed.
As shown in the figure below, convolving a 5×5 image with a 3×3 kernel (stride 1, no padding) gives a 3×3 output feature map.
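A general formula (a standard result, stated here for completeness): with input side length $n$, kernel side length $k$, padding $p$ on each side, and stride $s$, the output side length is
$\text{output size} = \left\lfloor \dfrac{n + 2p - k}{s} \right\rfloor + 1$
For the example above, $(5 + 0 - 3)/1 + 1 = 3$.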
2.4 Zero padding
Zero padding (padding): to keep the output feature map the same size as the input image, the input is often padded with zeros around its border. As shown below, after padding the 5×5 input image with zeros, the output feature map is 5×5.
In the TensorFlow framework, the parameter padding = 'SAME' or padding = 'VALID' indicates whether zero padding is used. Its effect on the output feature size is as follows:
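(The original table is not reproduced here; the following summary states the usual TensorFlow conventions.)
padding = 'SAME'  (pad with zeros):  output size = ⌈ input size / stride ⌉
padding = 'VALID' (no padding):      output size = ⌈ (input size − kernel size + 1) / stride ⌉
For the 5×5 input with a 3×3 kernel and stride 1, 'SAME' gives ⌈5/1⌉ = 5 and 'VALID' gives ⌈(5 − 3 + 1)/1⌉ = 3, matching the figures above.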
Describing convolution in TensorFlow:
tf.keras.layers.Conv2D(
    filters=number of kernels,
    kernel_size=kernel size,        # an integer for a square kernel, or (kernel height h, kernel width w)
    strides=sliding stride,         # an integer when the vertical and horizontal strides are equal, or (vertical stride h, horizontal stride w); default 1
    padding="same" or "valid",      # "same" = zero padding, "valid" = no padding (default)
    activation="relu" or "sigmoid" or "tanh" or "softmax" etc.,  # omit here if a BN layer follows
    input_shape=(height, width, channels)  # dimensions of the input feature map; may be omitted
)
model = tf.keras.models.Sequential([
    Conv2D(6, 5, padding='valid', activation='sigmoid'),
    MaxPool2D(2, 2),
    Conv2D(6, (5, 5), padding='valid', activation='sigmoid'),
    MaxPool2D(2, (2, 2)),
    Conv2D(filters=6, kernel_size=(5, 5), padding='valid', activation='sigmoid'),
    MaxPool2D(pool_size=(2, 2), strides=2),
    Flatten(),
    Dense(10, activation='softmax')
])
2.5 Batch normalization
Standardization: transform the data so that it follows a distribution with mean 0 and standard deviation 1.
Batch normalization (Batch Normalization): standardization performed on a small batch (batch) of data.
Batch normalization pulls the inputs of each layer of the network back toward a standard normal distribution with mean 0 and variance 1; its purpose is to alleviate the vanishing gradient problem.
Another important step of the BN operation is scaling and shifting. It is worth noting that the scale factor γ and the shift factor β are both trainable parameters (written out below).
The BN layer is placed after the convolutional layer and before the activation layer.
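Written out, the standard batch normalization equations for a mini-batch of $m$ values $x_1, \dots, x_m$ (computed per channel, with $\varepsilon$ a small constant for numerical stability) are
$\mu_B = \dfrac{1}{m}\sum_{i=1}^{m} x_i, \qquad \sigma_B^2 = \dfrac{1}{m}\sum_{i=1}^{m} (x_i - \mu_B)^2, \qquad \hat{x}_i = \dfrac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \varepsilon}}, \qquad y_i = \gamma \hat{x}_i + \beta$
where $\gamma$ and $\beta$ are the trainable scale and shift parameters mentioned above.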
Describing batch normalization in TensorFlow:
tf.keras.layers.BatchNormalization()
model = tf.keras.models.Sequential([
    Conv2D(filters=6, kernel_size=(5, 5), padding='same'),  # convolutional layer
    BatchNormalization(),  # BN layer
    Activation('relu'),  # activation layer
    MaxPool2D(pool_size=(2, 2), strides=2, padding='same'),  # pooling layer
    Dropout(0.2),  # dropout layer
])
2.6 Pooling
Pooling is used to reduce the amount of feature data.
Max pooling extracts image texture, while average pooling preserves background features.
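A tiny numeric sketch (an arbitrary 4×4 input with a 2×2 window and stride 2, values chosen only for illustration) makes the difference concrete: max pooling keeps the strongest response in each window, average pooling keeps the window mean:

import tensorflow as tf

x = tf.constant([[ 1.,  2.,  3.,  4.],
                 [ 5.,  6.,  7.,  8.],
                 [ 9., 10., 11., 12.],
                 [13., 14., 15., 16.]])
x = tf.reshape(x, [1, 4, 4, 1])  # NHWC layout expected by the pooling layers

print(tf.keras.layers.MaxPool2D(2, 2)(x)[0, :, :, 0])         # [[ 6.  8.] [14. 16.]]
print(tf.keras.layers.AveragePooling2D(2, 2)(x)[0, :, :, 0])  # [[ 3.5  5.5] [11.5 13.5]]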
Describing pooling in TensorFlow:
tf.keras.layers.MaxPool2D(
    pool_size=pooling window size,  # an integer for a square window, or (window height h, window width w)
    strides=pooling stride,         # an integer, or (vertical stride h, horizontal stride w); defaults to pool_size
    padding='valid' or 'same'       # "same" = zero padding, "valid" = no padding (default)
)
tf.keras.layers.AveragePooling2D(
    pool_size=pooling window size,  # an integer for a square window, or (window height h, window width w)
    strides=pooling stride,         # an integer, or (vertical stride h, horizontal stride w); defaults to pool_size
    padding='valid' or 'same'       # "same" = zero padding, "valid" = no padding (default)
)
model = tf.keras.models.Sequential([
    Conv2D(filters=6, kernel_size=(5, 5), padding='same'),  # convolutional layer
    BatchNormalization(),  # BN layer
    Activation('relu'),  # activation layer
    MaxPool2D(pool_size=(2, 2), strides=2, padding='same'),  # pooling layer
    Dropout(0.2),  # dropout layer
])
2.7 Dropout
Dropout: during training, a fraction of the neurons are temporarily dropped from the network with a certain probability. When the trained network is used for inference, the dropped neurons are restored (see the sketch after the example below).
tf.keras.layers.Dropout(dropout rate)
model = tf.keras.models.Sequential([
    Conv2D(filters=6, kernel_size=(5, 5), padding='same'),  # convolutional layer
    BatchNormalization(),  # BN layer
    Activation('relu'),  # activation layer
    MaxPool2D(pool_size=(2, 2), strides=2, padding='same'),  # pooling layer
    Dropout(0.2),  # dropout layer
])
2.8 Main modules of a convolutional neural network
Convolutional neural network: extract features with convolution kernels, then feed the extracted features into a fully connected network.
Main modules of a convolutional neural network, in the order used throughout this article and in the Baseline model below: Convolution → Batch Normalization → Activation → Pooling → Dropout, followed by Flatten and fully connected (Dense) layers.
3. Convolutional neural network construction example
import tensorflow as tf
import os
import numpy as np
from matplotlib import pyplot as plt
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D, Dropout, Flatten, Dense
from tensorflow.keras import Model
np.set_printoptions(threshold=np.inf)
cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
class Baseline(Model):
    def __init__(self):
        super(Baseline, self).__init__()
        self.c1 = Conv2D(filters=6, kernel_size=(5, 5), padding='same')  # convolutional layer
        self.b1 = BatchNormalization()  # BN layer
        self.a1 = Activation('relu')  # activation layer
        self.p1 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')  # pooling layer
        self.d1 = Dropout(0.2)  # dropout layer
        self.flatten = Flatten()
        self.f1 = Dense(128, activation='relu')
        self.d2 = Dropout(0.2)
        self.f2 = Dense(10, activation='softmax')

    def call(self, x):
        x = self.c1(x)
        x = self.b1(x)
        x = self.a1(x)
        x = self.p1(x)
        x = self.d1(x)
        x = self.flatten(x)
        x = self.f1(x)
        x = self.d2(x)
        y = self.f2(x)
        return y

model = Baseline()
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

checkpoint_save_path = "./checkpoint/Baseline.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)

history = model.fit(x_train, y_train, batch_size=32, epochs=5, validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()

file = open('./weights.txt', 'w')
for v in model.trainable_variables:
    file.write(str(v.name) + '\n')
    file.write(str(v.shape) + '\n')
    file.write(str(v.numpy()) + '\n')
file.close()
acc = history.history['sparse_categorical_accuracy']
val_acc = history.history['val_sparse_categorical_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()
plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()
The output of a run is as follows:
50000/50000 [==============================] - 7s 142us/sample - loss: 1.1004 - sparse_categorical_accuracy: 0.6083 - val_loss: 1.1223 - val_sparse_categorical_accuracy: 0.6027
Model: "baseline"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) multiple 456
_________________________________________________________________
batch_normalization (BatchNo multiple 24
_________________________________________________________________
activation (Activation) multiple 0
_________________________________________________________________
max_pooling2d (MaxPooling2D) multiple 0
_________________________________________________________________
dropout (Dropout) multiple 0
_________________________________________________________________
flatten (Flatten) multiple 0
_________________________________________________________________
dense (Dense) multiple 196736
_________________________________________________________________
dropout_1 (Dropout) multiple 0
_________________________________________________________________
dense_1 (Dense) multiple 1290
=================================================================
Total params: 198,506
Trainable params: 198,494
Non-trainable params: 12
