MNIST Handwritten Digit Recognition: Quickly Building a Perceptron for Ten-Class Classification with MindSpore
2022-08-04 05:30:00 【学习历险记】
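Quickly building a perceptron for binary handwritten digit classification with a deep learning framework
Without relying on mathematical derivations, we use the MindSpore deep learning framework to define the model structure, the loss function, and the gradient descent procedure.
The previous section implemented the perceptron from scratch: it defined the model structure, the loss function, and the evaluation function, and derived the gradient descent formulas by hand. After 3000 epochs of training, the perceptron reached an accuracy above 0.9 on the binary task of distinguishing the handwritten digits 0 and 1.
Implementing the whole machine learning pipeline from scratch is laborious, especially deriving the gradient descent formulas by hand, which requires some mathematical background. Switching to a different loss function means deriving the gradient formulas all over again, which makes tuning a model tedious.
Fortunately, many deep learning frameworks now wrap model definition, loss definition, and gradient descent behind friendly interfaces. A few simple function calls are enough to run a complete training process without worrying about how gradient descent is implemented underneath, which greatly improves model development efficiency.
Below, we use MindSpore to quickly implement a perceptron that classifies handwritten digits into two classes.
1. Load the dataset
The load_data_zeros_ones function was already defined in the previous section, so we simply call it.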
import os
import sys
import moxing as mox
datasets_dir = '../datasets'
if not os.path.exists(datasets_dir):
    os.makedirs(datasets_dir)
if not os.path.exists(os.path.join(datasets_dir, 'MNIST_Data.zip')):
    mox.file.copy('obs://modelarts-labs-bj4-v2/course/hwc_edu/python_module_framework/datasets/mindspore_data/MNIST_Data.zip',
                  os.path.join(datasets_dir, 'MNIST_Data.zip'))
    os.system('cd %s; unzip MNIST_Data.zip' % (datasets_dir))
sys.path.insert(0, os.path.join(os.getcwd(), '../datasets/MNIST_Data'))
from load_data_zeros_ones import load_data_zeros_ones
train_images, train_labels, test_images, test_labels = load_data_zeros_ones(datasets_dir)
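Digit 0, training set size: 5923, test set size: 980
Digit 1, training set size: 6742, test set size: 1135
load_data_zeros_ones returns np.ndarray data, but MindSpore expects Tensor inputs, so the following code reshapes the arrays and converts them.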
import mindspore
from mindspore import Tensor
# Reshape the dataset: images to (N, 1, 28, 28), labels to 1-D
train_images = train_images.reshape((-1,1,28,28))
train_labels = train_labels.flatten()
test_images = test_images.reshape((-1,1,28,28))
test_labels = test_labels.flatten()
train_size = len(train_labels)
test_size = len(test_labels)
# Convert to the Tensor format supported by MindSpore
train_images = Tensor(train_images, mindspore.float32)
train_labels = Tensor(train_labels, mindspore.int32)
test_images = Tensor(test_images, mindspore.float32)
test_labels = Tensor(test_labels, mindspore.int32)
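2. Define the network structure and the evaluation function
Implementing a perceptron in MindSpore is straightforward: call nn.Dense to define one fully connected layer and add a Sigmoid unit. nn.Dense automatically initializes the weights w and the bias b. The code is as follows: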
import mindspore.nn as nn
import mindspore.ops as ops
from mindspore.common.initializer import Normal
class Network(nn.Cell):
    def __init__(self, num_of_weights):
        super(Network, self).__init__()
        self.fc = nn.Dense(in_channels=num_of_weights, out_channels=2)  # one fully connected layer
        self.nonlinearity = nn.Sigmoid()
        self.flatten = nn.Flatten()
    def construct(self, x):  # the weighted-sum unit and the nonlinearity are expressed as the forward computation
        x = self.flatten(x)
        z = self.fc(x)
        pred_y = self.nonlinearity(z)
        return pred_y
# Evaluation function: returns the number of correctly classified samples
def evaluate(pred_y, true_y):
    pred_labels = ops.Argmax(output_type=mindspore.int32)(pred_y)
    correct_num = (pred_labels == true_y).asnumpy().sum().item()
    return correct_num
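To make the structure above concrete, here is a minimal pure-Python sketch of what the dense layer, the Sigmoid unit, and the argmax-based evaluation compute. It is only an illustration: the helper names (`dense`, `sigmoid`, `argmax`, `count_correct`) and the toy weights are assumptions made for this example, not part of MindSpore.

```python
import math

def dense(x, weights, biases):
    """Weighted sum per output unit: z_j = sum_i w[j][i] * x[i] + b[j]."""
    return [sum(w_i * x_i for w_i, x_i in zip(w_row, x)) + b
            for w_row, b in zip(weights, biases)]

def sigmoid(z):
    """Element-wise logistic function."""
    return [1.0 / (1.0 + math.exp(-v)) for v in z]

def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def count_correct(batch_outputs, labels):
    """Number of samples whose highest-scoring class equals the label."""
    return sum(1 for out, y in zip(batch_outputs, labels) if argmax(out) == y)

# Hypothetical toy problem: 3 input features, 2 classes.
weights = [[1.0, -1.0, 0.0], [-1.0, 1.0, 0.0]]
biases = [0.0, 0.0]
inputs = [[2.0, 0.0, 1.0], [0.0, 3.0, 1.0], [1.0, 1.0, 1.0]]
labels = [0, 1, 1]

outputs = [sigmoid(dense(x, weights, biases)) for x in inputs]
correct = count_correct(outputs, labels)
print(correct, "of", len(labels), "correct")
```

The third sample produces a tie between the two classes, so the argmax falls back to class 0 and the sample is counted as wrong.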
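3. Define the cross-entropy loss function and the optimizer
Training a neural network requires a loss function and an optimizer.
Loss functions supported by MindSpore include SoftmaxCrossEntropyWithLogits, L1Loss, and MSELoss. Here we use the cross-entropy loss SoftmaxCrossEntropyWithLogits.
Optimizers supported by MindSpore include Adam, AdamWeightDecay, SGD, and Momentum. Here we use Momentum as an example.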
# Loss function
net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
# Create the network
network = Network(28*28)
lr = 0.01
momentum = 0.9
# Optimizer
net_opt = nn.Momentum(network.trainable_params(), lr, momentum)
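As a rough illustration of what a sparse softmax cross-entropy with mean reduction computes, the following pure-Python sketch may help. The function names are made up for this example, and it is a sketch of the general formula, not MindSpore's implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one sample's logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_softmax_cross_entropy(batch_logits, labels):
    """Mean of -log(softmax(logits)[label]); labels are class indices (the 'sparse' form)."""
    losses = [-math.log(softmax(logits)[y])
              for logits, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses)

# Equal logits for two classes give a loss of log(2).
uniform_loss = sparse_softmax_cross_entropy([[0.0, 0.0]], [0])
# A large margin in favour of the true class pushes the loss towards 0.
confident_loss = sparse_softmax_cross_entropy([[5.0, -5.0]], [0])
print(uniform_loss, confident_loss)
```

With sparse labels the loss only looks at the probability assigned to the true class, which is why integer labels (int32 above) are sufficient and no one-hot encoding is needed.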
4. Implement the training function
# WithLossCell and TrainOneStepCell are imported in step 6, before train() is called
def train(network, max_epochs=50):
    net = WithLossCell(network, net_loss)           # attach the loss to the network
    train_network = TrainOneStepCell(net, net_opt)  # forward + backward + parameter update
    train_network.set_train()
    for epoch in range(1, max_epochs + 1):
        train_correct_num = 0.0
        test_correct_num = 0.0
        output = train_network(train_images, train_labels)  # one full-batch step, returns the loss
        pred_train_labels = network(train_images)  # forward pass
        train_correct_num = evaluate(pred_train_labels, train_labels)
        train_acc = float(train_correct_num) / train_size
        if (epoch == 1) or (epoch % 10 == 0):
            pred_test_labels = network(test_images)
            test_correct_num = evaluate(pred_test_labels, test_labels)
            test_acc = test_correct_num / test_size
            print("epoch: {0}/{1}, train_losses: {2:.4f}, train_acc: {3:.4f}, test_acc: {4:.4f}"
                  .format(epoch, max_epochs, float(output.asnumpy()), train_acc, test_acc), flush=True)
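Conceptually, each call to the wrapped training cell performs a forward pass, computes the loss, computes gradients, and updates the parameters. The sketch below spells out those four steps by hand for a one-feature logistic model on made-up data. It is a simplified stand-in: it uses binary cross-entropy and plain gradient descent instead of the softmax loss and Momentum optimizer used above, and all names and values are assumptions for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy data: negative inputs belong to class 0, positive to class 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

w, b, lr = 0.0, 0.0, 0.5
losses = []
for epoch in range(50):
    # 1) forward pass
    preds = [sigmoid(w * x + b) for x in xs]
    # 2) loss (binary cross-entropy, mean over the batch)
    loss = -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for p, y in zip(preds, ys)) / len(xs)
    losses.append(loss)
    # 3) gradients of the loss with respect to w and b
    grad_w = sum((p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
    grad_b = sum(p - y for p, y in zip(preds, ys)) / len(xs)
    # 4) parameter update (plain gradient descent)
    w -= lr * grad_w
    b -= lr * grad_b

accuracy = sum((sigmoid(w * x + b) > 0.5) == bool(y)
               for x, y in zip(xs, ys)) / len(xs)
print("first loss: %.4f, last loss: %.4f, accuracy: %.2f"
      % (losses[0], losses[-1], accuracy))
```

In the framework version, steps 3 and 4 are derived automatically, which is what removes the need to re-derive gradients whenever the loss changes.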
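5. Configure the runtime
Before training, use context.set_context to configure the runtime information, such as the execution mode, the backend, and the hardware.
from mindspore import context
context.set_context(mode=context.GRAPH_MODE, device_target="CPU")  # device_target can be CPU or GPU; when choosing GPU, the MindSpore environment must also be switched to a GPU build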
6. Start training
import time
from mindspore.nn import WithLossCell, TrainOneStepCell
max_epochs = 50
start_time = time.time()
print("*"*10 + "Training started" + "*"*10)
train(network, max_epochs=max_epochs)
print("*"*10 + "Training finished" + "*"*10)
cost_time = round(time.time() - start_time, 1)
print("Total training time: %.1f s" % cost_time)
**********Training started**********
epoch: 1/50, train_losses: 0.7050, train_acc: 0.3516, test_acc: 0.3759
epoch: 10/50, train_losses: 0.5338, train_acc: 0.9901, test_acc: 0.9943
epoch: 20/50, train_losses: 0.3990, train_acc: 0.9949, test_acc: 0.9981
epoch: 30/50, train_losses: 0.3593, train_acc: 0.9935, test_acc: 0.9972
epoch: 40/50, train_losses: 0.3468, train_acc: 0.9934, test_acc: 0.9967
epoch: 50/50, train_losses: 0.3410, train_acc: 0.9938, test_acc: 0.9967
**********Training finished**********
Total training time: 9.1 s
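The results above show that the perceptron implemented with MindSpore reached a test accuracy of 0.9967 after 50 epochs, in about 9 seconds of training (9.1 s in the log above). Compared with the from-scratch implementation in the previous section, which needed 3000 epochs to exceed 0.9, training is both faster and more accurate. For this task, developing the model with MindSpore was not only more efficient but also produced a better result, which is the advantage a deep learning framework brings.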
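Extending from binary classification to ten-class classification
With MindSpore and its friendly encapsulated modules, model definition, loss definition, and gradient descent only take simple function calls, which greatly improves model development efficiency.
1. Load the dataset
Load the complete dataset with all ten classes.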
import os
import numpy as np
import moxing as mox
import mindspore.dataset as ds
datasets_dir = '../datasets'
if not os.path.exists(datasets_dir):
    os.makedirs(datasets_dir)
if not os.path.exists(os.path.join(datasets_dir, 'MNIST_Data.zip')):
    mox.file.copy('obs://modelarts-labs-bj4-v2/course/hwc_edu/python_module_framework/datasets/mindspore_data/MNIST_Data.zip',
                  os.path.join(datasets_dir, 'MNIST_Data.zip'))
    os.system('cd %s; unzip MNIST_Data.zip' % (datasets_dir))
# Read the full training and test sets
mnist_ds_train = ds.MnistDataset(os.path.join(datasets_dir, "MNIST_Data/train"))
mnist_ds_test = ds.MnistDataset(os.path.join(datasets_dir, "MNIST_Data/test"))
train_len = mnist_ds_train.get_dataset_size()
test_len = mnist_ds_test.get_dataset_size()
print('Training set size:', train_len, ', test set size:', test_len)
Training set size: 60000 , test set size: 10000
View 10 samples
from PIL import Image
items_train = mnist_ds_train.create_dict_iterator(output_numpy=True)
train_data = list(items_train)
images_train = np.array([i["image"] for i in train_data])
labels_train = np.array([i["label"] for i in train_data])
batch_size = 10  # view 10 samples
batch_label = [lab for lab in labels_train[:batch_size]]
print(batch_label)
batch_img = images_train[0].reshape(28, 28)
for i in range(1, batch_size):
    batch_img = np.hstack((batch_img, images_train[i].reshape(28, 28)))  # concatenate the images horizontally for display
Image.fromarray(batch_img)
[0, 2, 2, 7, 8, 4, 9, 1, 8, 8]
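2. Process the dataset
The dataset matters a great deal for training: a good dataset can effectively improve training accuracy and efficiency, so it is usually processed before use.
Apply data augmentation operations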
import mindspore.dataset.vision.c_transforms as CV
import mindspore.dataset.transforms.c_transforms as C
from mindspore.dataset.vision import Inter
from mindspore import dtype as mstype
num_parallel_workers = 1
resize_height, resize_width = 28, 28
# according to the parameters, generate the corresponding data enhancement method
resize_op = CV.Resize((resize_height, resize_width), interpolation=Inter.LINEAR)  # resize the image to the given height and width
type_cast_op = C.TypeCast(mstype.int32)  # cast the data type to int32
hwc2chw_op = CV.HWC2CHW()  # change the tensor layout from height x width x channel (HWC) to channel x height x width (CHW) for training
# using map to apply operations to a dataset
mnist_ds_train = mnist_ds_train.map(operations=resize_op, input_columns="image", num_parallel_workers=num_parallel_workers)
mnist_ds_train = mnist_ds_train.map(operations=type_cast_op, input_columns="label", num_parallel_workers=num_parallel_workers)
mnist_ds_train = mnist_ds_train.map(operations=hwc2chw_op, input_columns="image", num_parallel_workers=num_parallel_workers)
buffer_size = 10000
mnist_ds_train = mnist_ds_train.shuffle(buffer_size=buffer_size)  # shuffle the training set
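The HWC to CHW layout change can be hard to picture, so here is a small pure-Python sketch of the same index permutation on nested lists. The function name `hwc_to_chw` and the tiny 2 x 3 x 2 "image" are made up for illustration; the real operator works on image arrays.

```python
def hwc_to_chw(image):
    """Permute a nested list from [height][width][channel] to [channel][height][width]."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    return [[[image[h][w][c] for w in range(width)]
             for h in range(height)]
            for c in range(channels)]

# Hypothetical image with H=2, W=3, C=2; channel 1 holds 10x the value of channel 0.
hwc = [[[1, 10], [2, 20], [3, 30]],
       [[4, 40], [5, 50], [6, 60]]]
chw = hwc_to_chw(hwc)
print(chw)
```

Each channel ends up as its own contiguous height x width plane. For a single-channel MNIST image this turns a 28 x 28 x 1 layout into 1 x 28 x 28, which matches the (N, 1, 28, 28) shape used in the binary example.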
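Normalize the data
The image data is rescaled and then standardized, which can improve training efficiency. The first Rescale maps each pixel value from [0, 255] into [0, 1]; the second subtracts the mean 0.1307 and divides by the standard deviation 0.3081, so the final values are centered around zero and are no longer confined to [0, 1].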
rescale = 1.0 / 255.0
shift = 0.0
rescale_nml = 1 / 0.3081
shift_nml = -1 * 0.1307 / 0.3081
rescale_op = CV.Rescale(rescale, shift)
mnist_ds_train = mnist_ds_train.map(operations=rescale_op, input_columns="image", num_parallel_workers=num_parallel_workers)
rescale_nml_op = CV.Rescale(rescale_nml, shift_nml)
mnist_ds_train = mnist_ds_train.map(operations=rescale_nml_op, input_columns="image", num_parallel_workers=num_parallel_workers)
mnist_ds_train = mnist_ds_train.batch(60000, drop_remainder=True)  # batch the dataset; here one batch holds the complete training set
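The two Rescale steps above each apply `output = input * rescale + shift`. The short sketch below works through that arithmetic for the darkest and brightest possible pixel, using the same constants as the code; the helper names are assumptions for this example.

```python
def rescale(pixel, scale, shift):
    """Same formula as a rescale operation: output = input * scale + shift."""
    return pixel * scale + shift

def normalize(pixel):
    unit = rescale(pixel, 1.0 / 255.0, 0.0)                      # [0, 255] -> [0, 1]
    return rescale(unit, 1 / 0.3081, -1 * 0.1307 / 0.3081)       # (x - 0.1307) / 0.3081

lowest = normalize(0)      # darkest pixel
highest = normalize(255)   # brightest pixel
print("range after both steps: [%.4f, %.4f]" % (lowest, highest))
```

Writing the standardization as a scale and a shift is just an algebraic rearrangement of `(x - mean) / std` into `x * (1 / std) + (-mean / std)`, which is why it can be expressed with the same Rescale operator.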
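3. Wrap the steps into functions
The training data is now ready. The operations above can be wrapped into the functions load_data_all and process_dataset so that they can be reused later.
Define the data processing operations
Define a function process_dataset that performs data augmentation and processing:
- Define the parameters needed for data augmentation and processing.
- Generate the corresponding data augmentation operations from those parameters.
- Use the map function to apply the operations to the dataset.
- Process the resulting dataset (shuffle, batch, repeat).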
%%writefile ../datasets/MNIST_Data/process_dataset.py
def process_dataset(mnist_ds, batch_size=32, resize=28, repeat_size=1,
                    num_parallel_workers=1):
    """
    process_dataset for train or test
    Args:
        mnist_ds (Dataset): MNIST dataset object to process
        batch_size (int): The number of data records in each group
        resize (int): Target height and width of the resized image
        repeat_size (int): The number of replicated data records
        num_parallel_workers (int): The number of parallel workers
    """
    import mindspore.dataset.vision.c_transforms as CV
    import mindspore.dataset.transforms.c_transforms as C
    from mindspore.dataset.vision import Inter
    from mindspore import dtype as mstype
    # define some parameters needed for data enhancement and normalization
    resize_height, resize_width = resize, resize
    rescale = 1.0 / 255.0
    shift = 0.0
    rescale_nml = 1 / 0.3081
    shift_nml = -1 * 0.1307 / 0.3081
    # according to the parameters, generate the corresponding data enhancement method
    resize_op = CV.Resize((resize_height, resize_width), interpolation=Inter.LINEAR)
    rescale_nml_op = CV.Rescale(rescale_nml, shift_nml)
    rescale_op = CV.Rescale(rescale, shift)
    hwc2chw_op = CV.HWC2CHW()
    type_cast_op = C.TypeCast(mstype.int32)
    c_trans = [resize_op, rescale_op, rescale_nml_op, hwc2chw_op]
    # using map to apply operations to a dataset
    mnist_ds = mnist_ds.map(operations=type_cast_op, input_columns="label", num_parallel_workers=num_parallel_workers)
    mnist_ds = mnist_ds.map(operations=c_trans, input_columns="image", num_parallel_workers=num_parallel_workers)
    # process the generated dataset
    buffer_size = 10000
    mnist_ds = mnist_ds.shuffle(buffer_size=buffer_size)
    mnist_ds = mnist_ds.batch(batch_size, drop_remainder=True)
    mnist_ds = mnist_ds.repeat(repeat_size)
    return mnist_ds
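The choice of batch_size together with drop_remainder=True decides how many batches one pass yields and how many samples are silently discarded. The following pure-Python sketch does that bookkeeping; `num_batches` and `dropped_samples` are hypothetical helpers written for this example, not MindSpore functions.

```python
def num_batches(num_samples, batch_size, drop_remainder=True, repeat_size=1):
    """Batches produced when batching first and repeating afterwards."""
    full, rest = divmod(num_samples, batch_size)
    per_pass = full if (drop_remainder or rest == 0) else full + 1
    return per_pass * repeat_size

def dropped_samples(num_samples, batch_size):
    """Samples discarded per pass when the incomplete last batch is dropped."""
    return num_samples % batch_size

print(num_batches(60000, 32))      # default batch size on the training set
print(num_batches(10000, 10000))   # whole test set as a single batch
print(dropped_samples(10000, 32))  # test samples lost with the default batch size
```

When the batch size divides the dataset size exactly, as with 10000 for the test set, nothing is dropped, so dividing the number of correct predictions by the full dataset size gives the true accuracy. With a batch size that does not divide evenly, the same division would slightly underestimate it.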
Define the data loading function
%%writefile ../datasets/MNIST_Data/load_data_all.py
def load_data_all(datasets_dir):
    import os
    if not os.path.exists(datasets_dir):
        os.makedirs(datasets_dir)
    import moxing as mox
    if not os.path.exists(os.path.join(datasets_dir, 'MNIST_Data.zip')):
        mox.file.copy('obs://modelarts-labs-bj4-v2/course/hwc_edu/python_module_framework/datasets/mindspore_data/MNIST_Data.zip',
                      os.path.join(datasets_dir, 'MNIST_Data.zip'))
        os.system('cd %s; unzip MNIST_Data.zip' % (datasets_dir))
    # Read the full training and test sets from the directory passed in
    import mindspore.dataset as ds
    mnist_ds_train = ds.MnistDataset(os.path.join(datasets_dir, "MNIST_Data/train"))
    mnist_ds_test = ds.MnistDataset(os.path.join(datasets_dir, "MNIST_Data/test"))
    train_len = mnist_ds_train.get_dataset_size()
    test_len = mnist_ds_test.get_dataset_size()
    print('Training set size:', train_len, ', test set size:', test_len)
    return mnist_ds_train, mnist_ds_test, train_len, test_len
4. Load the processed test set
import os, sys
sys.path.insert(0, os.path.join(os.getcwd(), '../datasets/MNIST_Data'))
from process_dataset import process_dataset
mnist_ds_test = process_dataset(mnist_ds_test, batch_size=10000)  # the whole test set in one batch
5. Define the network structure and the evaluation function
import mindspore
import mindspore.nn as nn
import mindspore.ops as ops
from mindspore.common.initializer import Normal
class Network(nn.Cell):
    def __init__(self, num_of_weights):
        super(Network, self).__init__()
        self.fc = nn.Dense(in_channels=num_of_weights, out_channels=10, weight_init=Normal(0.02))  # one fully connected layer with 10 outputs
        self.nonlinearity = nn.Sigmoid()
        self.flatten = nn.Flatten()
    def construct(self, x):  # the weighted-sum unit and the nonlinearity are expressed as the forward computation
        x = self.flatten(x)
        z = self.fc(x)
        pred_y = self.nonlinearity(z)
        return pred_y
# Evaluation function: returns the number of correctly classified samples
def evaluate(pred_y, true_y):
    pred_labels = ops.Argmax(output_type=mindspore.int32)(pred_y)
    correct_num = (pred_labels == true_y).asnumpy().sum().item()
    return correct_num
6. Define the cross-entropy loss function and the optimizer
# Loss function
net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
# Create the network
network = Network(28*28)
lr = 0.01
momentum = 0.9
# Optimizer
net_opt = nn.Momentum(network.trainable_params(), lr, momentum)
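For intuition about what a momentum optimizer does with lr and momentum, here is a minimal sketch of the classical (non-Nesterov) update on a single scalar parameter. The exact form used inside nn.Momentum is assumed here rather than confirmed by this article, so treat the function below as an illustration of the general technique.

```python
def momentum_step(param, grad, velocity, lr=0.01, momentum=0.9):
    """One classical momentum update (assumed form):
    velocity <- momentum * velocity + grad
    param    <- param - lr * velocity
    """
    velocity = momentum * velocity + grad
    param = param - lr * velocity
    return param, velocity

# Hypothetical scalar parameter with a constant gradient of 1.0.
p, v = 1.0, 0.0
history = []
for _ in range(3):
    p, v = momentum_step(p, grad=1.0, velocity=v)
    history.append(p)
print(history)
```

With a constant gradient the step size grows from 0.01 to 0.019 to 0.0271, because the velocity keeps accumulating past gradients. This accumulation is what lets momentum move faster than plain gradient descent along directions where the gradient is consistent.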
7. Implement the training function
# WithLossCell and TrainOneStepCell are imported in step 9, before train() is called;
# mnist_ds_test, train_len and test_len are the globals defined in the earlier steps
def train(network, mnist_ds_train, max_epochs=50):
    net = WithLossCell(network, net_loss)   # attach the loss to the network
    net = TrainOneStepCell(net, net_opt)    # forward + backward + parameter update
    network.set_train()
    for epoch in range(1, max_epochs + 1):
        train_correct_num = 0.0
        test_correct_num = 0.0
        for inputs_train in mnist_ds_train:
            output = net(*inputs_train)  # one training step, returns the loss
            train_x = inputs_train[0]
            train_y = inputs_train[1]
            pred_y_train = network(train_x)  # forward pass
            train_correct_num += evaluate(pred_y_train, train_y)
        train_acc = float(train_correct_num) / train_len
        for inputs_test in mnist_ds_test:
            test_x = inputs_test[0]
            test_y = inputs_test[1]
            pred_y_test = network(test_x)
            test_correct_num += evaluate(pred_y_test, test_y)
        test_acc = float(test_correct_num) / test_len
        if (epoch == 1) or (epoch % 10 == 0):
            print("epoch: {0}/{1}, train_losses: {2:.4f}, train_acc: {3:.4f}, test_acc: {4:.4f}"
                  .format(epoch, max_epochs, float(output.asnumpy()), train_acc, test_acc), flush=True)
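8. Configure the runtime
Before training, use context.set_context to configure the runtime information, such as the execution mode, the backend, and the hardware.
from mindspore import context
context.set_context(mode=context.GRAPH_MODE, device_target="CPU")  # device_target can be CPU or GPU; when choosing GPU, the MindSpore environment must also be switched to a GPU build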
9. Start training
import time
from mindspore.nn import WithLossCell, TrainOneStepCell
max_epochs = 100
start_time = time.time()
print("*"*10 + "Training started" + "*"*10)
train(network, mnist_ds_train, max_epochs=max_epochs)
print("*"*10 + "Training finished" + "*"*10)
cost_time = round(time.time() - start_time, 1)
print("Total training time: %.1f s" % cost_time)
**********Training started**********
epoch: 1/100, train_losses: 2.2832, train_acc: 0.1698, test_acc: 0.1626
epoch: 10/100, train_losses: 2.0465, train_acc: 0.6343, test_acc: 0.6017
epoch: 20/100, train_losses: 1.8368, train_acc: 0.7918, test_acc: 0.7812
epoch: 30/100, train_losses: 1.7602, train_acc: 0.8138, test_acc: 0.8017
epoch: 40/100, train_losses: 1.7245, train_acc: 0.8238, test_acc: 0.7972
epoch: 50/100, train_losses: 1.7051, train_acc: 0.8337, test_acc: 0.8044
epoch: 60/100, train_losses: 1.6922, train_acc: 0.8403, test_acc: 0.8047
epoch: 70/100, train_losses: 1.6827, train_acc: 0.8454, test_acc: 0.8033
epoch: 80/100, train_losses: 1.6752, train_acc: 0.8501, test_acc: 0.8051
epoch: 90/100, train_losses: 1.6689, train_acc: 0.8536, test_acc: 0.8049
epoch: 100/100, train_losses: 1.6635, train_acc: 0.8569, test_acc: 0.8037
**********Training finished**********
Total training time: 430.7 s
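So far, with only a few changes to the binary classification code, we have quickly implemented ten-class handwritten digit recognition.
The changes were very simple, but the results above show that after 100 epochs the model reaches only about 80% test accuracy on the ten-class task, whereas on the binary task in the first part it reached over 99% after 50 epochs. For a model as simple as the perceptron, ten-class handwritten digit recognition is considerably harder than binary classification.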
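One detail of the code as written may help when reading the loss values in both logs. Network.construct returns the Sigmoid output, and that output is then passed to SoftmaxCrossEntropyWithLogits, which applies a softmax on top of it. Because every Sigmoid output lies in (0, 1), the best case for the loss is a value near 1 for the true class and near 0 for all others, so the loss cannot fall below log(1 + (K - 1) / e) for K classes. The sketch below only evaluates that bound; the function name is made up for this example, and the closing suggestion is an untested assumption.

```python
import math

def loss_floor_for_unit_bounded_logits(num_classes):
    """Smallest softmax cross-entropy reachable when every 'logit' lies in [0, 1]:
    true class at 1, all other classes at 0, so p = e / (e + K - 1)."""
    return math.log(1.0 + (num_classes - 1) / math.e)

floor_binary = loss_floor_for_unit_bounded_logits(2)
floor_ten = loss_floor_for_unit_bounded_logits(10)
print("K=2 floor: %.4f, K=10 floor: %.4f" % (floor_binary, floor_ten))
```

The final losses in the logs, 0.3410 for two classes and 1.6635 for ten, sit above these floors of roughly 0.3133 and 1.4612, which is consistent with the loss flattening out rather than heading towards zero. Returning the pre-Sigmoid value z from construct and letting the loss apply the softmax might be worth trying, although its effect on accuracy was not measured in this article.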