Hands-on Deep Learning_LeNet
2022-08-04 11:44:00 【CV Small Rookie】
LeNet is one of the earliest published convolutional neural networks, and it attracted wide attention for its strong performance on computer vision tasks.
Even if you have never heard of LeNet, you are surely familiar with MNIST: the MNIST dataset is precisely the target that LeNet was built to recognize.
At the time, LeNet achieved results that matched the performance of support vector machines, then the dominant approach in supervised learning.
MNIST
A brief introduction: the MNIST dataset contains 70,000 images in total, 60,000 for training and 10,000 for testing. Each image is a 28×28 single-channel (black-and-white) image.
import torchvision
from torch.utils import data
from torchvision import transforms

# Number of worker processes used for data loading
def get_dataloader_workers():  #@save
    """Use 4 processes to read the data."""
    return 4

# Download the MNIST dataset and load it into memory
def load_data_mnist(batch_size, resize=None):  #@save
    trans = [transforms.ToTensor()]
    if resize:
        trans.insert(0, transforms.Resize(resize))
    trans = transforms.Compose(trans)
    mnist_train = torchvision.datasets.MNIST(root="../data", train=True, transform=trans, download=True)
    mnist_test = torchvision.datasets.MNIST(root="../data", train=False, transform=trans, download=True)
    return (data.DataLoader(mnist_train, batch_size, shuffle=True, num_workers=get_dataloader_workers()),
            data.DataLoader(mnist_test, batch_size, shuffle=False, num_workers=get_dataloader_workers()))
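As a quick sanity check, here is a minimal usage sketch (my addition, assuming the function above): pull one batch and inspect its shape.

# Usage sketch: fetch one training batch and check its shape.
train_iter, test_iter = load_data_mnist(batch_size=256)
X, y = next(iter(train_iter))
print(X.shape, y.shape)  # expected: torch.Size([256, 1, 28, 28]) torch.Size([256])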
LeNet
LeNet consists of two parts: a convolutional module at the front and a fully connected module at the back. The convolutions extract features, and the fully connected layers map them to the final classification output.
The basic unit in each convolutional block is a convolutional layer, a sigmoid activation function, and an average pooling layer (although ReLU and max pooling work better, they had not yet been invented in the 1990s). Each convolutional layer uses a 5×5 kernel and a sigmoid activation. These layers map the input to several two-dimensional feature maps, typically increasing the number of channels along the way. The first convolutional layer has 6 output channels and the second has 16. Each 2×2 pooling operation (with stride 2) reduces the dimensionality by a factor of 4 through spatial downsampling. The shape of the convolutional output is determined by the batch size, number of channels, height, and width.
To pass the output of the convolutional block to the dense block, we must flatten each example in the minibatch. In other words, we convert this four-dimensional input into the two-dimensional input expected by fully connected layers: the first dimension indexes the examples in the minibatch, and the second gives the flat vector representation of each example. LeNet's dense block has three fully connected layers with 120, 84, and 10 outputs, respectively. Since we are performing classification, the 10-dimensional output layer corresponds to the number of possible output classes.
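To see where the flatten size comes from, here is a short sketch (an illustration of my own, using the standard convolution/pooling output-size formula) that traces a 28×28 input through the network:

# Trace the spatial size through LeNet (illustrative sketch).
def conv_out(size, kernel, padding=0, stride=1):
    return (size + 2 * padding - kernel) // stride + 1

h = conv_out(28, kernel=5, padding=2)  # conv1: 28 -> 28 (padding keeps the size)
h = conv_out(h, kernel=2, stride=2)    # pool1: 28 -> 14
h = conv_out(h, kernel=5)              # conv2: 14 -> 10
h = conv_out(h, kernel=2, stride=2)    # pool2: 10 -> 5
print(16 * h * h)                      # 400 = in_features of the first Linear layer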
Straight to the code!
# Author: CV小Rookie
# Created: 2022/8/3 20:45
# Filename: train.py
import torch
from torch import nn
from d2l import torch as d2l
from download_datas import *

def get_default_device():
    if torch.cuda.is_available():
        return 'cuda'
    elif getattr(torch.backends, 'mps', None) is not None and torch.backends.mps.is_available():
        return 'mps'
    else:
        return 'cpu'

device = get_default_device()

net = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.Sigmoid(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Conv2d(6, 16, kernel_size=5), nn.Sigmoid(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 120), nn.Sigmoid(),
    nn.Linear(120, 84), nn.Sigmoid(),
    nn.Linear(84, 10))

# Sanity check: uncomment to print each layer's output shape
# X = torch.rand(size=(1, 1, 28, 28), dtype=torch.float32)
# for layer in net:
#     X = layer(X)
#     print(layer.__class__.__name__, 'output shape: \t', X.shape)
print(net)
batch_size = 256
train_iter, test_iter = load_data_mnist(batch_size=batch_size)
# train_iter, test_iter = load_data_fashion_mnist(batch_size=batch_size)
def evaluate_accuracy_gpu(net, data_iter, device=None):  #@save
    """Compute the accuracy of a model on a dataset using a GPU."""
    if isinstance(net, nn.Module):
        net.eval()  # set the model to evaluation mode
        if not device:
            device = next(iter(net.parameters())).device
    # number of correct predictions, total number of predictions
    metric = d2l.Accumulator(2)
    with torch.no_grad():
        for X, y in data_iter:
            if isinstance(X, list):
                # required for BERT fine-tuning (covered later)
                X = [x.to(device) for x in X]
            else:
                X = X.to(device)
            y = y.to(device)
            metric.add(d2l.accuracy(net(X), y), y.numel())
    return metric[0] / metric[1]
def train(net, train_iter, test_iter, num_epochs, lr, device):
    def init_weights(m):
        if type(m) == nn.Linear or type(m) == nn.Conv2d:
            nn.init.xavier_uniform_(m.weight)
    net.apply(init_weights)
    print('training on', device)
    net.to(device)
    optimizer = torch.optim.SGD(net.parameters(), lr=lr)
    loss = nn.CrossEntropyLoss()
    animator = d2l.Animator(xlabel='epoch', xlim=[1, num_epochs],
                            legend=['train loss', 'train acc', 'test acc'])
    timer, num_batches = d2l.Timer(), len(train_iter)
    for epoch in range(num_epochs):
        # sum of training loss, sum of training accuracy, number of examples
        metric = d2l.Accumulator(3)
        net.train()
        for i, (X, y) in enumerate(train_iter):
            timer.start()
            optimizer.zero_grad()
            X, y = X.to(device), y.to(device)
            y_hat = net(X)
            l = loss(y_hat, y)
            l.backward()
            optimizer.step()
            with torch.no_grad():
                metric.add(l * X.shape[0], d2l.accuracy(y_hat, y), X.shape[0])
            timer.stop()
            train_l = metric[0] / metric[2]
            train_acc = metric[1] / metric[2]
            if (i + 1) % (num_batches // 5) == 0 or i == num_batches - 1:
                animator.add(epoch + (i + 1) / num_batches,
                             (train_l, train_acc, None))
        test_acc = evaluate_accuracy_gpu(net, test_iter)
        animator.add(epoch + 1, (None, None, test_acc))
        torch.save(net.state_dict(), "module-{0}.pth".format(epoch))
    print(f'loss {train_l:.3f}, train acc {train_acc:.3f}, '
          f'test acc {test_acc:.3f}')
    print(f'{metric[2] * num_epochs / timer.sum():.1f} examples/sec '
          f'on {str(device)}')
lr, num_epochs = 0.9, 10
train(net, train_iter, test_iter, num_epochs, lr, device)
Taking a 1×28×28 input as an example, the layer-by-layer output shapes are:
Conv2d output shape: torch.Size([1, 6, 28, 28])
Sigmoid output shape: torch.Size([1, 6, 28, 28])
AvgPool2d output shape: torch.Size([1, 6, 14, 14])
Conv2d output shape: torch.Size([1, 16, 10, 10])
Sigmoid output shape: torch.Size([1, 16, 10, 10])
AvgPool2d output shape: torch.Size([1, 16, 5, 5])
Flatten output shape: torch.Size([1, 400])
Linear output shape: torch.Size([1, 120])
Sigmoid output shape: torch.Size([1, 120])
Linear output shape: torch.Size([1, 84])
Sigmoid output shape: torch.Size([1, 84])
Linear output shape: torch.Size([1, 10])
loss 0.131, train acc 0.961, test acc 0.966
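Since train() saves a checkpoint after every epoch as module-{epoch}.pth, reloading the trained model for inference could look like the following sketch (module-9.pth assumes the num_epochs=10 run above; adjust to your own run):

# Inference sketch: load the final checkpoint and classify one test batch.
net.load_state_dict(torch.load("module-9.pth", map_location=device))
net.eval()
with torch.no_grad():
    X, y = next(iter(test_iter))
    preds = net(X.to(device)).argmax(dim=1)
    print('predicted:', preds[:10].cpu().tolist())
    print('actual:   ', y[:10].tolist())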