
Implementing MNIST handwritten digit recognition

2022-06-25 00:23:00 【Know the cold and the warm*】

Contents

Preface
1. Code implementation
2. Things to note
Summary

Preface

This post implements handwritten digit recognition on the MNIST dataset.

1. Code implementation

import tensorflow as tf
from tensorflow.keras.datasets import mnist

import matplotlib.pyplot as plt

from tensorflow.keras import models
from tensorflow.keras import layers

(train_images,train_labels), (test_images, test_labels) = mnist.load_data()
# train_images.shape: (60000, 28, 28), i.e. 60,000 images, each 28*28 pixels.

# Build the neural network
network = models.Sequential()
network.add(layers.Dense(512, activation='relu', input_shape=(28*28,)))
# The output layer has one unit per class; this is a 10-class problem.
network.add(layers.Dense(10, activation='softmax'))

# compile: set the loss function, the optimizer, and the metrics to monitor during training and testing.
# metrics: a list of metrics. For classification we usually use metrics=['accuracy']. For regression, the mean squared error loss is 'mse'.
# Use 'categorical_crossentropy' as the loss for multi-class classification and 'binary_crossentropy' for binary classification.
network.compile(optimizer='rmsprop',
               loss='categorical_crossentropy',
               metrics=['accuracy'])

# Data preprocessing: reshape the images to the shape the network expects and scale pixel values to [0, 1]
train_images = train_images.reshape((60000, 28*28))
train_images = train_images.astype('float32')/255

test_images = test_images.reshape((10000, 28*28))
test_images = test_images.astype('float32')/255

from tensorflow.keras.utils import to_categorical

# to_categorical: converts a vector of integer class labels into a binary (0/1) matrix, i.e. one-hot encoding.
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

# Train the model
network.fit(train_images, train_labels, epochs=20, batch_size=128)

# Evaluate on the test set
test_loss, test_acc = network.evaluate(test_images, test_labels)
print(test_loss, test_acc)
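
To make the label preprocessing and the loss in the listing above more concrete, here is a minimal NumPy sketch of what one-hot encoding and categorical crossentropy compute for a small batch. The helper names one_hot and categorical_crossentropy, the toy labels, and the toy probabilities are all made up for illustration; this is not how Keras implements them internally.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical helper: turn integer class labels into one-hot rows."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.shape[0], num_classes), dtype="float32")
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Hypothetical helper: mean over samples of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

toy_labels = [5, 0, 4]                          # made-up digit labels
toy_targets = one_hot(toy_labels, num_classes=10)

# Made-up softmax output: 0.91 on the correct class, 0.01 on each other class.
toy_probs = np.full((3, 10), 0.01, dtype="float32")
toy_probs[np.arange(3), toy_labels] = 0.91

toy_loss = categorical_crossentropy(toy_targets, toy_probs)
print(toy_targets[0])   # a single 1.0 at index 5, zeros elsewhere
print(toy_loss)         # equals -log(0.91), since only the true class contributes
```

Because each one-hot row has a single 1, the loss for a sample reduces to the negative log of the probability the network assigned to the correct digit.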

2. Things to note

2-1. Ways to build the network

One option is to build it layer by layer with the add method:

network = models.Sequential()
network.add(layers.Dense(512, activation='relu', input_shape=(28*28,)))
network.add(layers.Dense(10, activation='softmax'))

Another is to pass the list of layers to the constructor:

network = models.Sequential([
    layers.Dense(512, activation='relu', input_shape=(28*28,)),
    layers.Dense(10, activation='softmax'),
])

2-2. Specifying the shape of the model's input data

The first layer must be told the shape of the input data through an argument. Later layers do not need this, because their input shape is inferred automatically from the output of the previous layer.
Using the input_shape argument:

network.add(layers.Dense(512, activation='relu', input_shape=(28*28,)))

The input_dim argument can be used instead, with the same meaning for a one-dimensional input:

network.add(layers.Dense(512, activation='relu', input_dim=28*28))

Note: input_shape=(28*28,) means that each input sample is a rank-1 tensor (a vector) with 28*28 = 784 elements. input_shape must be a tuple, so it has to be written with the trailing comma, as (28*28,).
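
As a small illustration of what this shape means, the sketch below flattens a batch of 28*28 arrays into 784-element vectors with NumPy. The random array fake_images is only a stand-in for MNIST data, chosen so the example runs without downloading anything.

```python
import numpy as np

# Stand-in for a few MNIST images: random uint8 pixels, NOT real data.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Flatten each 28x28 image into one vector; -1 lets NumPy infer the batch size.
flat_images = fake_images.reshape((-1, 28 * 28)).astype("float32") / 255

print(flat_images.shape)      # (4, 784): batch axis first, then features
print(flat_images[0].shape)   # (784,): the per-sample shape passed as input_shape
print((28 * 28,) == (784,))   # True: the trailing comma makes a one-element tuple
print(type((28 * 28)))        # <class 'int'>: without the comma it is just an int
```

The batch axis is not part of input_shape, which is why the model is given (28*28,) rather than (60000, 28*28).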

2-3. Tensor operations inside a fully connected layer

Example:

keras.layers.Dense(512, activation='relu')

Note: the layer takes a 2D tensor as input and returns another 2D tensor. The operation it performs is:
output = relu(dot(input, W) + b)
That is, take the dot product of the input tensor with the weight tensor W (randomly initialized with the given shape, then learned during training), add the bias vector b to the resulting 2D tensor, and finally apply the relu activation function, max(x, 0). Both relu and the addition are element-wise operations.
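
The same computation can be sketched in plain NumPy. This assumes inputs are stored one sample per row, with shape (batch, features), and W has shape (features, units). The helper names relu and dense_forward, the random batch, and the initialization scale 0.05 are illustrative choices, not Keras internals.

```python
import numpy as np

def relu(x):
    """Element-wise max(x, 0)."""
    return np.maximum(x, 0.0)

def dense_forward(inputs, W, b):
    """Hypothetical helper: relu(dot(inputs, W) + b) for a batch of inputs."""
    return relu(np.dot(inputs, W) + b)

rng = np.random.default_rng(0)
batch = rng.random((3, 784), dtype=np.float32)                 # 3 made-up flattened images
W = rng.normal(0.0, 0.05, size=(784, 512)).astype("float32")   # random initial weights
b = np.zeros(512, dtype="float32")                             # bias, broadcast over the batch

hidden = dense_forward(batch, W, b)
print(hidden.shape)          # (3, 512): one 512-element output per input sample
print(bool((hidden >= 0).all()))   # True: relu removes all negative values
```

The bias b has shape (512,) and is added to every row of the (3, 512) product through broadcasting.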

2-4. Notes on the dot product

keras.layers.Dense(512, activation='relu')

Note: the dot product of two vectors is a scalar. It is only defined for vectors with the same number of elements: multiply them element by element, then add up the products.

import numpy as np
np.dot([1, 2],[3,4])
#  Output 
# 11
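
As a quick check of the "multiply element by element, then add" rule, the sketch below computes the same result by hand and shows what happens when the lengths differ. The variable names are illustrative.

```python
import numpy as np

a = [1, 2]
b = [3, 4]

# Multiply matching elements, then add the products: 1*3 + 2*4.
manual = sum(x * y for x, y in zip(a, b))
print(manual, np.dot(a, b))   # 11 11

# Vectors of different lengths cannot be dotted; NumPy raises ValueError.
try:
    np.dot([1, 2], [3, 4, 5])
    mismatch_ok = True
except ValueError:
    mismatch_ok = False
print(mismatch_ok)            # False
```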

More generally, for the dot product of two matrices x and y: it is defined if and only if x.shape[1] == y.shape[0]. The result is a matrix of shape (x.shape[0], y.shape[1]), in which each entry is the dot product of a row of x with a column of y.

np.dot([[1, 2],[1,2]], [[3, 4],[3,4]])
#  Output 
# array([[ 9, 12],
#       [ 9, 12]])
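
The square example above hides the shape rule, so here is a small sketch with non-square matrices. The values in x and y are arbitrary.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)
y = np.array([[1, 0],
              [0, 1],
              [1, 1]])           # shape (3, 2)

z = np.dot(x, y)                 # allowed because x.shape[1] == y.shape[0] == 3
print(z.shape)                   # (2, 2)

# Entry [i, j] is the dot product of row i of x with column j of y.
print(z[0, 1], np.dot(x[0, :], y[:, 1]))   # 5 5

# Shapes (2, 3) and (2, 3) do not line up, so this raises ValueError.
try:
    np.dot(x, x)
except ValueError as err:
    print("shape mismatch:", err)
```

This is the same rule a Dense layer relies on: a (batch, 784) input times a (784, 512) weight matrix gives a (batch, 512) output.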


Summary

This was just a quick experiment, even though the results were terrible...

Original site

Copyright notice
This article was written by [Know the cold and the warm*]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/176/202206241943440126.html