ML17 - Neural Networks in Practice
2022-07-29 05:22:00 【十九岁的花季少女】
Neural Networks in Practice
The dataset contains 50,000 training images and 10,000 test images. To keep the run fast, we use only 5,000 for training and 500 for testing.
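A minimal sketch of the subsampling step, assuming the data is already loaded as NumPy arrays. The helper name `subsample` and the stand-in arrays are hypothetical. The post does not show how the subset is chosen, so taking the first rows is only one option.

```python
import numpy as np

def subsample(X, y, num):
    """Keep the first `num` samples (hypothetical helper, not from the post)."""
    return X[:num], y[:num]

# Small stand-in arrays so the sketch runs quickly; the real splits
# would have 50000 and 10000 rows of 32x32x3 images.
X_train_full = np.zeros((600, 32, 32, 3), dtype=np.float32)
y_train_full = np.arange(600) % 10
X_test_full = np.zeros((120, 32, 32, 3), dtype=np.float32)
y_test_full = np.arange(120) % 10

# The post uses 5000 / 500; scaled down here to match the stand-in sizes.
X_train, y_train = subsample(X_train_full, y_train_full, 500)
X_test, y_test = subsample(X_test_full, y_test_full, 50)
print(X_train.shape, X_test.shape)
```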
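Initialization
input_dim: each input is a 32*32 color image, so 3*32*32 = 3072 values. hidden_dim: the number of neurons in the hidden layer (100 by default in the code below). num_classes: the network outputs a score for each of ten classes. weight_scale: the standard deviation used to initialize the weights, kept small. reg: the strength of the regularization penalty.
import numpy as np

# Initialize w and b
def __init__(self, input_dim=3*32*32, hidden_dim=100, num_classes=10,
             weight_scale=1e-3, reg=0.0):
    self.params = {}
    self.reg = reg
    self.params['W1'] = weight_scale * np.random.randn(input_dim, hidden_dim)
    self.params['b1'] = np.zeros((1, hidden_dim))
    self.params['W2'] = weight_scale * np.random.randn(hidden_dim, num_classes)
    self.params['b2'] = np.zeros((1, num_classes))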
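One way to sanity-check this initialization is to look at the loss before any training. With weights this small the scores are close to zero, so the softmax probabilities are close to 1/10 each and the loss should be close to ln(10) ≈ 2.3026. The sketch below assumes random stand-in inputs and labels, and uses `np.random.default_rng` instead of `np.random.randn` only to make the run reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, num_classes = 3 * 32 * 32, 100, 10
weight_scale = 1e-3

# Same shapes as in the initializer above.
W1 = weight_scale * rng.standard_normal((input_dim, hidden_dim))
b1 = np.zeros((1, hidden_dim))
W2 = weight_scale * rng.standard_normal((hidden_dim, num_classes))
b2 = np.zeros((1, num_classes))

# Stand-in batch (assumption): 20 random "images" with random labels.
X = rng.standard_normal((20, input_dim))
y = rng.integers(0, num_classes, size=20)

h1 = np.maximum(0, X @ W1 + b1)   # hidden layer with ReLU
scores = h1 @ W2 + b2             # class scores, shape (20, 10)

# Softmax cross-entropy loss, averaged over the batch
shifted = scores - scores.max(axis=1, keepdims=True)
probs = np.exp(shifted)
probs /= probs.sum(axis=1, keepdims=True)
loss = -np.log(probs[np.arange(20), y]).mean()

print(scores.shape, loss, np.log(num_classes))
```

If the initial loss is far from ln(10), the initialization or the loss code is a likely place to look.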
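Forward pass
Each layer and its activation function, through to the output scores
The input passes through a fully connected layer, then a ReLU activation layer, then a second fully connected layer whose output is the scores used to compute the loss.
First fetch the initialized w and b, then pass x, w, and b into the layer functions to run the forward pass. The result is the class scores.
scores = None
N = X.shape[0]
# Unpack variables from the params dictionary
W1, b1 = self.params['W1'], self.params['b1']
W2, b2 = self.params['W2'], self.params['b2']
# The two functions differ in that the second one has no activation function
h1, cache1 = affine_relu_forward(X, W1, b1)
out, cache2 = affine_forward(h1, W2, b2)
# Class scores
scores = out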
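Step into affine_relu_forward, which computes the output of the hidden layer.
def affine_relu_forward(x, w, b):
    a, fc_cache = affine_forward(x, w, b)
    out, relu_cache = relu_forward(a)
    # Cache the intermediate values for backpropagation: the fully connected
    # layer's inputs (x, w, b) and its output before the ReLU.
    cache = (fc_cache, relu_cache)
    return out, cache
affine_forward does the computation x*w + b:
x_row = x.reshape(x.shape[0], -1)  # flatten each sample into a row, (N, D)
out = np.dot(x_row, w) + b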
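The post shows only the core line of affine_forward. A minimal sketch of the whole function follows, assuming it flattens each sample and caches (x, w, b), which is what affine_backward later unpacks. The small input arrays are made up for illustration.

```python
import numpy as np

def affine_forward(x, w, b):
    """Fully connected layer: flatten each sample, then compute x_row @ w + b."""
    x_row = x.reshape(x.shape[0], -1)   # (N, D)
    out = np.dot(x_row, w) + b          # (N, M); b of shape (1, M) broadcasts
    cache = (x, w, b)                   # saved for the backward pass
    return out, cache

# Two samples, each a 2x3 block, so D = 6 after flattening.
x = np.arange(2 * 2 * 3, dtype=float).reshape(2, 2, 3)
w = np.ones((6, 4))
b = np.zeros((1, 4))

out, cache = affine_forward(x, w, b)
print(out)   # each row holds the sum of that sample's values: 15 and 51
```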
relu_forward applies the ReLU operation, which computes max(0, x) elementwise.
def relu_forward(x):
    out = ReLU(x)
    cache = x  # keep the input for the backward pass
    return out, cache
def ReLU(x):
    """ReLU non-linearity."""
    return np.maximum(0, x)
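A short example of what this does to a small array. The input values are made up for illustration.

```python
import numpy as np

def relu_forward(x):
    out = np.maximum(0, x)   # elementwise max(0, x)
    cache = x                # the input is kept for the backward pass
    return out, cache

a = np.array([[-2.0, 0.0, 3.0],
              [1.5, -0.5, 0.0]])
out, cache = relu_forward(a)
print(out)   # negative entries become 0, positive entries are unchanged
```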
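Softmax: converting scores to probabilities
Call softmax_loss:
data_loss, dscores = softmax_loss(scores, y)
def softmax_loss(x, y):
    # Turn the scores into probabilities; subtracting the row maximum keeps exp() from overflowing
    probs = np.exp(x - np.max(x, axis=1, keepdims=True))
    probs /= np.sum(probs, axis=1, keepdims=True)
    N = x.shape[0]
    # Loss is -log(probability of the correct class), averaged over the batch
    loss = -np.sum(np.log(probs[np.arange(N), y])) / N
    dx = probs.copy()
    # Gradient with respect to the scores
    dx[np.arange(N), y] -= 1
    dx /= N
    # Return the loss and the gradient
    return loss, dx
# Regularization penalty: 0.5 * reg * sum(W^2)
reg_loss = 0.5 * self.reg * np.sum(W1*W1) + 0.5 * self.reg * np.sum(W2*W2)
# Total loss = data loss + regularization penalty
loss = data_loss + reg_loss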
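The gradient returned by softmax_loss can be checked against centered finite differences. The sketch below repeats the function so it runs on its own. `numerical_gradient` is a hypothetical helper, and the scores and labels are random stand-ins.

```python
import numpy as np

def softmax_loss(x, y):
    probs = np.exp(x - np.max(x, axis=1, keepdims=True))
    probs /= np.sum(probs, axis=1, keepdims=True)
    N = x.shape[0]
    loss = -np.sum(np.log(probs[np.arange(N), y])) / N
    dx = probs.copy()
    dx[np.arange(N), y] -= 1
    dx /= N
    return loss, dx

def numerical_gradient(f, x, h=1e-5):
    """Centered finite differences of the scalar f(x) with respect to each entry of x."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        old = x[idx]
        x[idx] = old + h
        f_plus = f(x)
        x[idx] = old - h
        f_minus = f(x)
        x[idx] = old                      # restore the original value
        grad[idx] = (f_plus - f_minus) / (2 * h)
    return grad

rng = np.random.default_rng(0)
scores = rng.standard_normal((4, 10))     # 4 samples, 10 classes
y = np.array([3, 0, 9, 5])

loss, dscores = softmax_loss(scores, y)
num = numerical_gradient(lambda s: softmax_loss(s, y)[0], scores)
print(loss, np.max(np.abs(num - dscores)))

# Shifting every score by the same large constant leaves the loss unchanged
# and does not overflow, because the row maximum is subtracted first.
big_loss, _ = softmax_loss(scores + 1000.0, y)
print(big_loss)
```

Each row of the gradient sums to zero, since the probabilities in a row sum to 1 and exactly 1 is subtracted at the correct class.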
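With the softmax gradient in hand, compute the gradients of the layer before it: W2 and b2 of the second layer.
Call affine_backward, then affine_relu_backward.
dh1, dW2, db2 = affine_backward(dscores, cache2)
dX, dW1, db1 = affine_relu_backward(dh1, cache1)
For out = x*w + b, the derivative with respect to x is w, so the gradient for x is the upstream gradient dout multiplied by w transposed (see ① in the code). The gradient for w is computed the same way, from x transposed and dout (see ②). The derivative with respect to b is 1, so db is the upstream gradient summed over the samples in the batch.
# dout is the upstream gradient (here, from the softmax layer); cache holds the (x, w, b) saved by the forward pass.
def affine_backward(dout, cache):
    x, w, b = cache
    # ①
    dx = np.dot(dout, w.T)  # (N,D)
    # Reshape dx back to the shape of x
    dx = np.reshape(dx, x.shape)  # (N,d1,...,d_k)
    x_row = x.reshape(x.shape[0], -1)  # (N,D)
    # ②
    dw = np.dot(x_row.T, dout)  # (D,M)
    db = np.sum(dout, axis=0, keepdims=True)  # (1,M)
    return dx, dw, db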
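The three gradients from affine_backward can be verified numerically. The sketch below assumes an affine_forward that flattens each sample and caches (x, w, b). The helper `numerical_gradient` and the random test arrays are hypothetical. The scalar being differentiated is sum(out * dout), whose gradient with respect to out is exactly dout.

```python
import numpy as np

def affine_forward(x, w, b):
    x_row = x.reshape(x.shape[0], -1)
    return np.dot(x_row, w) + b, (x, w, b)

def affine_backward(dout, cache):
    x, w, b = cache
    dx = np.dot(dout, w.T).reshape(x.shape)    # (N, d1, ..., d_k)
    x_row = x.reshape(x.shape[0], -1)          # (N, D)
    dw = np.dot(x_row.T, dout)                 # (D, M)
    db = np.sum(dout, axis=0, keepdims=True)   # (1, M)
    return dx, dw, db

def numerical_gradient(f, arr, h=1e-5):
    """Centered finite differences of scalar f() while perturbing arr in place."""
    grad = np.zeros_like(arr)
    for idx in np.ndindex(arr.shape):
        old = arr[idx]
        arr[idx] = old + h
        f_plus = f()
        arr[idx] = old - h
        f_minus = f()
        arr[idx] = old
        grad[idx] = (f_plus - f_minus) / (2 * h)
    return grad

rng = np.random.default_rng(1)
x = rng.standard_normal((3, 2, 2))   # N = 3 samples, each 2x2, so D = 4
w = rng.standard_normal((4, 5))      # D = 4, M = 5
b = rng.standard_normal((1, 5))
dout = rng.standard_normal((3, 5))   # stand-in upstream gradient

f = lambda: np.sum(affine_forward(x, w, b)[0] * dout)

_, cache = affine_forward(x, w, b)
dx, dw, db = affine_backward(dout, cache)

dx_num = numerical_gradient(f, x)
dw_num = numerical_gradient(f, w)
db_num = numerical_gradient(f, b)
print(np.max(np.abs(dx - dx_num)), np.max(np.abs(dw - dw_num)), np.max(np.abs(db - db_num)))
```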
affine_relu_backward first backpropagates through the ReLU, then calls affine_backward again.
def affine_relu_backward(dout, cache):
    """ Backward pass for the affine-relu convenience layer """
    fc_cache, relu_cache = cache
    da = relu_backward(dout, relu_cache)
    dx, dw, db = affine_backward(da, fc_cache)
    return dx, dw, db
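For the ReLU layer, the forward pass is max(0, x). Where x > 0 the derivative is 1, so the incoming gradient passes through unchanged; where x ≤ 0 the derivative is taken as 0, so the gradient is 0 as well.
def relu_backward(dout, cache):
    x = cache
    # Build a new array so the caller's dout is not modified in place
    dx = np.where(x > 0, dout, 0)
    return dx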
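One detail worth knowing when writing this backward pass: assigning `dx = dout` does not copy the array, so masking `dx` in place also changes the caller's `dout`. The sketch below contrasts the two approaches. The function names `relu_backward_inplace` and `relu_backward_copy` are hypothetical.

```python
import numpy as np

def relu_backward_inplace(dout, cache):
    dx, x = dout, cache        # dx is the same array object as dout
    dx[x <= 0] = 0             # this also overwrites the caller's dout
    return dx

def relu_backward_copy(dout, cache):
    x = cache
    return np.where(x > 0, dout, 0.0)   # new array; dout is left untouched

x = np.array([[-1.0, 2.0, 0.0, 3.0]])
dout_a = np.array([[10.0, 20.0, 30.0, 40.0]])
dout_b = dout_a.copy()

dx_a = relu_backward_inplace(dout_a, x)
dx_b = relu_backward_copy(dout_b, x)
print(dx_a, dout_a)   # dout_a has been overwritten
print(dx_b, dout_b)   # dout_b is unchanged
```

Both versions return the same gradient. They differ only in whether the upstream gradient survives, which matters if the caller reuses it.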
Adding the gradient of the regularization penalty completes the backward pass.
dW2 += self.reg * W2
dW1 += self.reg * W1
Store the gradients
grads['W1'] = dW1
grads['b1'] = db1
grads['W2'] = dW2
grads['b2'] = db2
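Putting the pieces together, the sketch below runs the full forward and backward pass and checks all four gradients, including the regularization term, against finite differences. It inlines the layer computations instead of calling the helper functions so that it is self-contained. The function name `loss_and_grads`, the tiny layer sizes, and the random data are assumptions made for illustration.

```python
import numpy as np

def loss_and_grads(params, X, y, reg):
    """Forward and backward pass of the affine-ReLU-affine-softmax network."""
    W1, b1, W2, b2 = params["W1"], params["b1"], params["W2"], params["b2"]
    N = X.shape[0]
    X_row = X.reshape(N, -1)

    # Forward pass
    a1 = X_row @ W1 + b1
    h1 = np.maximum(0, a1)
    scores = h1 @ W2 + b2
    probs = np.exp(scores - scores.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    data_loss = -np.log(probs[np.arange(N), y]).mean()
    reg_loss = 0.5 * reg * (np.sum(W1 * W1) + np.sum(W2 * W2))
    loss = data_loss + reg_loss

    # Backward pass
    dscores = probs.copy()
    dscores[np.arange(N), y] -= 1
    dscores /= N
    grads = {}
    grads["W2"] = h1.T @ dscores + reg * W2
    grads["b2"] = dscores.sum(axis=0, keepdims=True)
    da1 = np.where(a1 > 0, dscores @ W2.T, 0.0)   # ReLU passes gradient where a1 > 0
    grads["W1"] = X_row.T @ da1 + reg * W1
    grads["b1"] = da1.sum(axis=0, keepdims=True)
    return loss, grads

rng = np.random.default_rng(2)
N, D, H, C = 5, 12, 6, 4             # tiny sizes so the check runs quickly
params = {
    "W1": 0.1 * rng.standard_normal((D, H)), "b1": np.zeros((1, H)),
    "W2": 0.1 * rng.standard_normal((H, C)), "b2": np.zeros((1, C)),
}
X = rng.standard_normal((N, 3, 2, 2))   # each sample flattens to D = 12 values
y = rng.integers(0, C, size=N)
reg = 0.5

loss, grads = loss_and_grads(params, X, y, reg)

# Centered finite differences for every parameter entry
h = 1e-5
max_err = {}
for name, p in params.items():
    num = np.zeros_like(p)
    for idx in np.ndindex(p.shape):
        old = p[idx]
        p[idx] = old + h
        loss_plus = loss_and_grads(params, X, y, reg)[0]
        p[idx] = old - h
        loss_minus = loss_and_grads(params, X, y, reg)[0]
        p[idx] = old
        num[idx] = (loss_plus - loss_minus) / (2 * h)
    max_err[name] = np.max(np.abs(num - grads[name]))
print(max_err)
```

Small errors for all four parameters indicate that the analytic gradients and the loss agree. A large error for W1 or W2 alone would point at the regularization term.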