当前位置:网站首页>MNIST handwritten digit recognition - based on Mindspore to quickly build a perceptron to achieve ten categories

MNIST handwritten digit recognition - based on Mindspore to quickly build a perceptron to achieve ten categories

2022-08-04 06:20:00 Learning Adventures

Based on the deep learning framework, the perceptron is quickly constructed to realize the binary classification of handwritten digits 

        Does not rely on mathematical knowledge,使用MindSporeDeep learning frameworks quickly implement model structure definition、损失函数定义、梯度下降的方法.

        The previous section implemented the model structure of the perceptron from scratch、The loss function and evaluation function are defined,And the formula for gradient descent was manually derived,最终经过3000个epoch的训练,Makes the perceptron model in handwritten digits0和1achieved on the binary classification task0.9以上的准确率.

        This way of implementing the entire machine learning process from scratch is laborious,Especially in the manual derivation of the gradient descent formula,需要一定的数学知识.If you replace another loss function,Then you have to re-derive the new gradient descent formula again,This is a laborious thing to tune the model.

        Fortunately, there are already many deep learning frameworks today,They have nicely encapsulated model structure definitions、损失函数定义、梯度下降实现等过程,Just make some simple function calls,The completed machine learning training process can be implemented,No need to pay attention to how the underlying gradient descent is implemented,极大地提高了模型开发的效率.

下面,用MindSporeFramework to quickly implement perceptron models,Binary classification of handwritten digits.

1.加载数据集 

As already defined in the previous sectionload_data_zeros_ones函数,So you can call it directly 

import os
import sys

import moxing as mox

datasets_dir = '../datasets'
if not os.path.exists(datasets_dir):
    os.makedirs(datasets_dir)
    
if not os.path.exists(os.path.join(datasets_dir, 'MNIST_Data.zip')):
    mox.file.copy('obs://modelarts-labs-bj4-v2/course/hwc_edu/python_module_framework/datasets/mindspore_data/MNIST_Data.zip', 
                  os.path.join(datasets_dir, 'MNIST_Data.zip'))
    os.system('cd %s; unzip MNIST_Data.zip' % (datasets_dir))
    
sys.path.insert(0, os.path.join(os.getcwd(), '../datasets/MNIST_Data'))
from load_data_zeros_ones import load_data_zeros_ones

train_images, train_labels, test_images, test_labels = load_data_zeros_ones(datasets_dir)

数字0,训练集规模: 5923 ,测试集规模: 980

数字1,训练集规模: 6742 ,测试集规模: 1135

 load_data_zeros_onesThe format of the data returned by the function isnp.ndarray格式,但是在MindSporeThe format required in is Tensor格式,Therefore, execute the following code for data format conversion

import mindspore
from mindspore import Tensor

# Reshape the dataset
train_images = train_images.reshape((-1,1,28,28))
train_labels = train_labels.flatten()

test_images = test_images.reshape((-1,1,28,28))
test_labels = test_labels.flatten()

train_size = len(train_labels)
test_size = len(test_labels)

# 转变为mindspore支持的tensor格式的数据
train_images =  Tensor(train_images, mindspore.float32)
train_labels = Tensor(train_labels, mindspore.int32)
test_images = Tensor(test_images, mindspore.float32)
test_labels = Tensor(test_labels, mindspore.int32)

 2.定义网络结构和评价函数

        使用 MindSpore Implementing a perceptron model is very simple,只需要调用nn.Dense定义一个全连接层,再加上一个Sigmoid单元即可,并且nn.Densewill weighwand threshold biasb自动进行初始化,代码如下:

import mindspore.nn as nn
import mindspore.ops as ops
from mindspore.common.initializer import Normal

class Network(nn.Cell):
    def __init__(self, num_of_weights):
        super(Network, self).__init__()
        self.fc = nn.Dense(in_channels=num_of_weights, out_channels=2)  # 定义一个全连接层
        self.nonlinearity = nn.Sigmoid()
        self.flatten = nn.Flatten()
    
    def construct(self, x):  # 加权求和单元和非线性函数单元通过定义计算过程来实现
        x = self.flatten(x)
        z = self.fc(x)
        pred_y = self.nonlinearity(z)
        return pred_y

# 评价函数
def evaluate(pred_y, true_y): 
    pred_labels = ops.Argmax(output_type=mindspore.int32)(pred_y)
    correct_num = (pred_labels == true_y).asnumpy().sum().item()
    return correct_num

 3.定义交叉熵损失函数和优化器

        要训练神经网络模型,需要定义损失函数和优化器.

        MindSpore支持的损失函数有SoftmaxCrossEntropyWithLogits、L1Loss、MSELoss等.这里使用交叉熵损失函数SoftmaxCrossEntropyWithLogits.

        MindSpore支持的优化器有Adam、AdamWeightDecay、SGD、Momentum等.这里使用Momentum优化器为例.

# 损失函数
net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')

# 创建网络
network = Network(28*28)  
lr = 0.01
momentum = 0.9

# 优化器
net_opt = nn.Momentum(network.trainable_params(), lr, momentum)

4.实现训练函数

def train(network, max_epochs= 50):
    net = WithLossCell(network, net_loss)
    train_network = TrainOneStepCell(net, net_opt)
    train_network.set_train()
    for epoch in range(1, max_epochs + 1):
        train_correct_num = 0.0
        test_correct_num = 0.0
        output = train_network(train_images,train_labels)
        pred_train_labels = network.construct(train_images)  # 前向传播
        train_correct_num = evaluate(pred_train_labels, train_labels)
        train_acc = float(train_correct_num) / train_size

        if (epoch == 1) or (epoch % 10 == 0):
            pred_test_labels = network.construct(test_images)
            test_correct_num = evaluate(pred_test_labels, test_labels)
            test_acc = test_correct_num / test_size
            print("epoch: {0}/{1}, train_losses: {2:.4f}, tain_acc: {3:.4f}, test_acc: {4:.4f}" \
                  .format(epoch, max_epochs, output.asnumpy(), train_acc, test_acc, cflush=True))

5.配置运行信息

        在正式训练前,通过context.set_context来配置运行需要的信息,譬如运行模式、后端信息、硬件等信息.

from mindspore import context
context.set_context(mode=context.GRAPH_MODE, device_target="CPU")  # device_target 可选 CPU/GPU, 当选择GPU时mindspore规格也需要切换到GPU

 6.开始训练

import time
from mindspore.nn import WithLossCell, TrainOneStepCell

max_epochs = 50
start_time = time.time()
print("*"*10 + "开始训练" + "*"*10)
train(network, max_epochs= max_epochs)
print("*"*10 + "训练完成" + "*"*10)
cost_time = round(time.time() - start_time, 1)
print("训练总耗时: %.1f s" % cost_time)
**********开始训练**********

epoch: 1/50, train_losses: 0.7050, tain_acc: 0.3516, test_acc: 0.3759

epoch: 10/50, train_losses: 0.5338, tain_acc: 0.9901, test_acc: 0.9943

epoch: 20/50, train_losses: 0.3990, tain_acc: 0.9949, test_acc: 0.9981

epoch: 30/50, train_losses: 0.3593, tain_acc: 0.9935, test_acc: 0.9972

epoch: 40/50, train_losses: 0.3468, tain_acc: 0.9934, test_acc: 0.9967

epoch: 50/50, train_losses: 0.3410, tain_acc: 0.9938, test_acc: 0.9967

**********训练完成**********

训练总耗时: 9.1 s

        从上面的结果可以看到,使用 MindSpore 实现的感知机模型,使用20秒的时间,训练了50个epochAfter that it was reached0.9967的准确率,Compared to the implementation in the previous section,Training is fast and good,这说明使用 MindSpore to develop the model,不仅开发效率更高,The results achieved are also better,This is using deep learning frameworksMindSpore带来的优势. 

Extending from binary to decile 

        使用深度学习框架MindSpore,利用其友好的封装模块,模型结构定义、损失函数定义、梯度下降实现等过程,只需简单地函数调用,就能实现模型训练,极大地提高了模型开发的效率.

 1.加载数据集

 加载完整的、十个类别的数据集

import os
import numpy as np

import moxing as mox
import mindspore.dataset as ds

datasets_dir = '../datasets'
if not os.path.exists(datasets_dir):
    os.makedirs(datasets_dir)
    
if not os.path.exists(os.path.join(datasets_dir, 'MNIST_Data.zip')):
    mox.file.copy('obs://modelarts-labs-bj4-v2/course/hwc_edu/python_module_framework/datasets/mindspore_data/MNIST_Data.zip', 
                  os.path.join(datasets_dir, 'MNIST_Data.zip'))
    os.system('cd %s; unzip MNIST_Data.zip' % (datasets_dir))
    
# 读取完整训练样本和测试样本
mnist_ds_train = ds.MnistDataset(os.path.join(datasets_dir, "MNIST_Data/train"))
mnist_ds_test = ds.MnistDataset(os.path.join(datasets_dir, "MNIST_Data/test"))
train_len = mnist_ds_train.get_dataset_size()
test_len = mnist_ds_test.get_dataset_size()
print('训练集规模:', train_len, ',测试集规模:', test_len)

训练集规模: 60000 ,测试集规模: 10000

 查看10个样本

from PIL import Image
items_train = mnist_ds_train.create_dict_iterator(output_numpy=True)

train_data = np.array([i for i in items_train])
images_train = np.array([i["image"] for i in train_data])
labels_train = np.array([i["label"] for i in train_data])

batch_size = 10  # 查看10个样本
batch_label = [lab for lab in labels_train[:10]]
print(batch_label)
batch_img = images_train[0].reshape(28, 28)
for i in range(1, batch_size):
    batch_img = np.hstack((batch_img, images_train[i].reshape(28, 28)))  # 将一批图片水平拼接起来,方便下一步进行显示
Image.fromarray(batch_img)
[0, 2, 2, 7, 8, 4, 9, 1, 8, 8]

 2.处理数据集

        数据集对于训练非常重要,好的数据集可以有效提高训练精度和效率,在使用数据集前,通常会对数据集进行一些处理.

         进行数据增强操作

import mindspore.dataset.vision.c_transforms as CV
import mindspore.dataset.transforms.c_transforms as C
from mindspore.dataset.vision import Inter
from mindspore import dtype as mstype

num_parallel_workers = 1
resize_height, resize_width = 28, 28

# according to the parameters, generate the corresponding data enhancement method
resize_op = CV.Resize((resize_height, resize_width), interpolation=Inter.LINEAR)  # 对图像数据像素进行缩放
type_cast_op = C.TypeCast(mstype.int32)  # 将数据类型转化为int32.
hwc2chw_op = CV.HWC2CHW()  # 对图像数据张量进行变换,张量形式由高x宽x通道(HWC)变为通道x高x宽(CHW),方便进行数据训练.

# using map to apply operations to a dataset
mnist_ds_train = mnist_ds_train.map(operations=resize_op, input_columns="image", num_parallel_workers=num_parallel_workers)
mnist_ds_train = mnist_ds_train.map(operations=type_cast_op, input_columns="label", num_parallel_workers=num_parallel_workers)
mnist_ds_train = mnist_ds_train.map(operations=hwc2chw_op, input_columns="image", num_parallel_workers=num_parallel_workers)

buffer_size = 10000
mnist_ds_train = mnist_ds_train.shuffle(buffer_size=buffer_size)  # 打乱训练集的顺序

        进行数据归一化

        对图像数据进行标准化、归一化操作,使得每个像素的数值大小在(0,1)范围中,可以提升训练效率. 

rescale = 1.0 / 255.0
shift = 0.0

rescale_nml = 1 / 0.3081
shift_nml = -1 * 0.1307 / 0.3081

rescale_op = CV.Rescale(rescale, shift) 
mnist_ds_train = mnist_ds_train.map(operations=rescale_op, input_columns="image", num_parallel_workers=num_parallel_workers)

rescale_nml_op = CV.Rescale(rescale_nml, shift_nml)
mnist_ds_train = mnist_ds_train.map(operations=rescale_nml_op, input_columns="image", num_parallel_workers=num_parallel_workers)

mnist_ds_train = mnist_ds_train.batch(60000, drop_remainder=True)  # 对数据集进行分批,此处加载完整的训练集

3.封装成函数

到此,完成了训练数据的准备工作,可以将以上操作封装成load_data_all函数和process_dataset函数,以便后面再次用到.

        定义数据处理操作

   定义一个函数process_dataset来进行数据增强和处理操作:

  • 定义进行数据增强和处理所需要的一些参数.

  • 根据参数,生成对应的数据增强操作.

  • 使用map映射函数,将数据操作应用到数据集.

  • 对生成的数据集进行处理.

%%writefile ../datasets/MNIST_Data/process_dataset.py
def process_dataset(mnist_ds, batch_size=32, resize= 28, repeat_size=1,
                   num_parallel_workers=1):
    """
    process_dataset for train or test

    Args:
        mnist_ds (str): MnistData path
        batch_size (int): The number of data records in each group
        resize (int): Scale image data pixels
        repeat_size (int): The number of replicated data records
        num_parallel_workers (int): The number of parallel workers
    """

    import mindspore.dataset.vision.c_transforms as CV
    import mindspore.dataset.transforms.c_transforms as C
    from mindspore.dataset.vision import Inter
    from mindspore import dtype as mstype
    
    # define some parameters needed for data enhancement and rough justification
    resize_height, resize_width = resize, resize
    rescale = 1.0 / 255.0
    shift = 0.0
    rescale_nml = 1 / 0.3081
    shift_nml = -1 * 0.1307 / 0.3081

    # according to the parameters, generate the corresponding data enhancement method
    resize_op = CV.Resize((resize_height, resize_width), interpolation=Inter.LINEAR)
    rescale_nml_op = CV.Rescale(rescale_nml, shift_nml)
    rescale_op = CV.Rescale(rescale, shift)
    hwc2chw_op = CV.HWC2CHW()
    type_cast_op = C.TypeCast(mstype.int32)
    c_trans = [resize_op, rescale_op, rescale_nml_op, hwc2chw_op]

    # using map to apply operations to a dataset
    mnist_ds = mnist_ds.map(operations=type_cast_op, input_columns="label", num_parallel_workers=num_parallel_workers)
    mnist_ds = mnist_ds.map(operations=c_trans, input_columns="image", num_parallel_workers=num_parallel_workers)

    # process the generated dataset
    buffer_size = 10000
    mnist_ds = mnist_ds.shuffle(buffer_size=buffer_size)
    mnist_ds = mnist_ds.batch(batch_size, drop_remainder=True)
    mnist_ds = mnist_ds.repeat(repeat_size)

    return mnist_ds

         定义数据加载函数

%%writefile ../datasets/MNIST_Data/load_data_all.py
def load_data_all(datasets_dir):
    import os
    if not os.path.exists(datasets_dir):
        os.makedirs(datasets_dir)
    import moxing as mox
    if not os.path.exists(os.path.join(datasets_dir, 'MNIST_Data.zip')):
        mox.file.copy('obs://modelarts-labs-bj4-v2/course/hwc_edu/python_module_framework/datasets/mindspore_data/MNIST_Data.zip', 
                      os.path.join(datasets_dir, 'MNIST_Data.zip'))
        os.system('cd %s; unzip MNIST_Data.zip' % (datasets_dir))
    
    # 读取完整训练样本和测试样本
    import mindspore.dataset as ds
    datasets_dir = '../datasets'
    mnist_ds_train = ds.MnistDataset(os.path.join(datasets_dir, "MNIST_Data/train"))
    mnist_ds_test = ds.MnistDataset(os.path.join(datasets_dir, "MNIST_Data/test"))
    train_len = mnist_ds_train.get_dataset_size()
    test_len = mnist_ds_test.get_dataset_size()
    print('训练集规模:', train_len, ',测试集规模:', test_len)
    
    return mnist_ds_train, mnist_ds_test, train_len, test_len

4.加载处理后的测试集

import os, sys 
sys.path.insert(0, os.path.join(os.getcwd(), '../datasets/MNIST_Data'))
from process_dataset import process_dataset
mnist_ds_test = process_dataset(mnist_ds_test, batch_size= 10000)

5.定义网络结构和评价函数

import mindspore
import mindspore.nn as nn
import mindspore.ops as ops
from mindspore.common.initializer import Normal

class Network(nn.Cell):
    def __init__(self, num_of_weights):
        super(Network, self).__init__()
        self.fc = nn.Dense(in_channels=num_of_weights, out_channels=10, weight_init=Normal(0.02))  # 定义一个全连接层
        self.nonlinearity = nn.Sigmoid()
        self.flatten = nn.Flatten()
    
    def construct(self, x):  # 加权求和单元和非线性函数单元通过定义计算过程来实现
        x = self.flatten(x)
        z = self.fc(x)
        pred_y = self.nonlinearity(z)
        return pred_y
    
def evaluate(pred_y, true_y): 
    pred_labels = ops.Argmax(output_type=mindspore.int32)(pred_y)
    correct_num = (pred_labels == true_y).asnumpy().sum().item()
    return correct_num

6.定义交叉熵损失函数和优化器

# 损失函数
net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')

# 创建网络
network = Network(28*28)  
lr = 0.01
momentum = 0.9

# 优化器
net_opt = nn.Momentum(network.trainable_params(), lr, momentum)

7.实现训练函数

def train(network, mnist_ds_train, max_epochs= 50):
    net = WithLossCell(network, net_loss)
    net = TrainOneStepCell(net, net_opt)
    network.set_train()
    for epoch in range(1, max_epochs + 1):
        train_correct_num = 0.0
        test_correct_num = 0.0
        for inputs_train in mnist_ds_train:
            output = net(*inputs_train)
            train_x = inputs_train[0]
            train_y = inputs_train[1]
            pred_y_train = network.construct(train_x)  # 前向传播
            train_correct_num += evaluate(pred_y_train, train_y)
        train_acc = float(train_correct_num) / train_len
        
        for inputs_test in mnist_ds_test:
            test_x = inputs_test[0]
            test_y = inputs_test[1]
            pred_y_test = network.construct(test_x)
            test_correct_num += evaluate(pred_y_test, test_y)
        test_acc = float(test_correct_num) / test_len
        if (epoch == 1) or (epoch % 10 == 0):
            print("epoch: {0}/{1}, train_losses: {2:.4f}, tain_acc: {3:.4f}, test_acc: {4:.4f}" \
                  .format(epoch, max_epochs, output.asnumpy(), train_acc, test_acc, cflush=True))

 8.配置运行信息

        在正式训练前,通过context.set_context来配置运行需要的信息,譬如运行模式、后端信息、硬件等信息.

from mindspore import context
context.set_context(mode=context.GRAPH_MODE, device_target="CPU")  # device_target 可选 CPU/GPU, 当选择GPU时mindspore规格也需要切换到GPU

9.开始训练

import time
from mindspore.nn import WithLossCell, TrainOneStepCell

max_epochs = 100
start_time = time.time()
print("*"*10 + "开始训练" + "*"*10)
train(network, mnist_ds_train, max_epochs= max_epochs)
print("*"*10 + "训练完成" + "*"*10)
cost_time = round(time.time() - start_time, 1)
print("训练总耗时: %.1f s" % cost_time)

**********开始训练**********

epoch: 1/100, train_losses: 2.2832, tain_acc: 0.1698, test_acc: 0.1626

epoch: 10/100, train_losses: 2.0465, tain_acc: 0.6343, test_acc: 0.6017

epoch: 20/100, train_losses: 1.8368, tain_acc: 0.7918, test_acc: 0.7812

epoch: 30/100, train_losses: 1.7602, tain_acc: 0.8138, test_acc: 0.8017

epoch: 40/100, train_losses: 1.7245, tain_acc: 0.8238, test_acc: 0.7972

epoch: 50/100, train_losses: 1.7051, tain_acc: 0.8337, test_acc: 0.8044

epoch: 60/100, train_losses: 1.6922, tain_acc: 0.8403, test_acc: 0.8047

epoch: 70/100, train_losses: 1.6827, tain_acc: 0.8454, test_acc: 0.8033

epoch: 80/100, train_losses: 1.6752, tain_acc: 0.8501, test_acc: 0.8051

epoch: 90/100, train_losses: 1.6689, tain_acc: 0.8536, test_acc: 0.8049

epoch: 100/100, train_losses: 1.6635, tain_acc: 0.8569, test_acc: 0.8037

**********训练完成**********

训练总耗时: 430.7 s

到目前为止,Minor modifications to the code based on binary classification of handwritten digits,就快速实现了手写数字识别的十分类;

修改的过程是非常简单的,但从上面的结果可以看到,该模型训练100个epoch,在手写数字识别十分类的任务上仅仅达到了80%的准确率,而在上一节二分类任务上,模型训练50个epoch达到了99%的准确率,说明在感知机这样简单的模型上,手写数字识别十分类要比二分类要难.

原网站

版权声明
本文为[Learning Adventures]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/216/202208040525570938.html