ML17 Neural network practice
2022-07-29 06:07:00 【19-year-old flower girl】
A hands-on neural network
The dataset contains 50,000 training images and 10,000 test images, but for speed we use only 5,000 training samples and 500 test samples.
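As a minimal sketch of how such a subsample might be taken (the array names X_train, y_train, X_test, y_test are placeholders for the already-loaded data, not code from the original post):

num_training, num_test = 5000, 500
# Keep only the first 5,000 training samples and the first 500 test samples
X_train, y_train = X_train[:num_training], y_train[:num_training]
X_test, y_test = X_test[:num_test], y_test[:num_test]
# Flatten each 3x32x32 image into a 3072-dimensional row vector
X_train = X_train.reshape(num_training, -1)
X_test = X_test.reshape(num_test, -1)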
Initialization
input_dim: the input is a 3×32×32 color image (3072 values after flattening). hidden_dim: the hidden layer has 100 neurons. num_classes: the output covers 10 possible categories. weight_scale: the weights are initialized to small random values. reg: the strength of the regularization penalty.
# Initialize w and b
def __init__(self, input_dim=3*32*32, hidden_dim=100, num_classes=10,
             weight_scale=1e-3, reg=0.0):
    self.params = {}
    self.reg = reg
    self.params['W1'] = weight_scale * np.random.randn(input_dim, hidden_dim)
    self.params['b1'] = np.zeros((1, hidden_dim))
    self.params['W2'] = weight_scale * np.random.randn(hidden_dim, num_classes)
    self.params['b2'] = np.zeros((1, num_classes))
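With the default sizes above, W1 has shape (3072, 100), b1 has shape (1, 100), W2 has shape (100, 10), and b2 has shape (1, 10); the weights start as small Gaussian noise scaled by weight_scale, and the biases start at zero.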
Forward propagation process
Each fully connected layer, followed by its activation function, maps the input toward the output scores.
The data enters the network, passes through a fully connected layer, then through an activation layer (using the ReLU function), and finally the loss value is computed from the output.
First take the initialized w and b, pass x, w, b into the forward functions; this is the forward propagation process, and at the end we obtain the scores.
scores = None
N = X.shape[0]
# Unpack variables from the params dictionary
W1, b1 = self.params['W1'], self.params['b1']
W2, b2 = self.params['W2'], self.params['b2']
# The difference between these two calls: the second one has no activation function
h1, cache1 = affine_relu_forward(X, W1, b1)
out, cache2 = affine_forward(h1, W2, b2)
# Score values
scores = out
Step into the affine_relu_forward function, which computes the output of the hidden layer.
def affine_relu_forward(x, w, b):
    a, fc_cache = affine_forward(x, w, b)
    out, relu_cache = relu_forward(a)
    # Save the intermediate values (the fully connected layer's cache and the ReLU cache)
    # so they can be reused in the back propagation calculation
    cache = (fc_cache, relu_cache)
    return out, cache
Inside affine_forward, the computation is x*w + b (with x flattened into the 2-D matrix x_row):
out = np.dot(x_row, w) + b
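Putting that line in context, here is a minimal sketch of what the full affine_forward function looks like; the reshape to x_row and the (x, w, b) cache layout are inferred from the affine_backward code shown later, so treat this as an illustration rather than the original source:

def affine_forward(x, w, b):
    # Flatten each sample of x from shape (N, d1, ..., d_k) into a row vector, giving (N, D)
    x_row = x.reshape(x.shape[0], -1)
    # Fully connected layer: output has shape (N, M)
    out = np.dot(x_row, w) + b
    # Cache the inputs for the backward pass (matches x, w, b = cache in affine_backward)
    cache = (x, w, b)
    return out, cache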
The relu_forward function performs the ReLU operation, which is simply max(0, x):
def relu_forward(x):
    out = None
    out = ReLU(x)
    cache = x
    return out, cache

def ReLU(x):
    """ReLU non-linearity."""
    return np.maximum(0, x)
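A quick sanity check of these two helpers on a small hand-made array (values chosen only for illustration):

x = np.array([[-1.0,  2.0],
              [ 3.0, -4.0]])
out, cache = relu_forward(x)
# out is [[0., 2.], [3., 0.]]; cache simply stores x for the backward pass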
The softmax function converts the scores into probabilities.
Call softmax_loss:
data_loss, dscores = softmax_loss(scores, y)
def softmax_loss(x, y):
    # Normalize the scores (subtracting the row-wise max for numerical stability)
    probs = np.exp(x - np.max(x, axis=1, keepdims=True))
    probs /= np.sum(probs, axis=1, keepdims=True)
    N = x.shape[0]
    # The loss is -log(probability of the correct class), averaged over the batch
    loss = -np.sum(np.log(probs[np.arange(N), y])) / N
    # Compute the gradient: the correct class gets its probability minus 1, averaged over the batch
    dx = probs.copy()
    dx[np.arange(N), y] -= 1
    dx /= N
    # Return the loss value and the gradient
    return loss, dx
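A small numeric check of softmax_loss (a made-up single-sample batch; numbers rounded):

scores_demo = np.array([[1.0, 2.0, 3.0]])
y_demo = np.array([2])                      # the correct class is index 2
loss_demo, dx_demo = softmax_loss(scores_demo, y_demo)
# probs ≈ [0.090, 0.245, 0.665], so loss ≈ -log(0.665) ≈ 0.41
# dx ≈ [0.090, 0.245, -0.335]: the correct class gets its probability minus 1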
# Regularization penalty term: 0.5 * reg * sum(w^2)
reg_loss = 0.5 * self.reg * np.sum(W1*W1) + 0.5 * self.reg * np.sum(W2*W2)
# Total loss = data loss + regularization penalty
loss = data_loss + reg_loss
With the softmax gradient done, we move on to the gradients of the layer before it: the second layer's W2 and b2.
Call affine_backward first, then affine_relu_backward:
dh1, dW2, db2 = affine_backward(dscores, cache2)
dX, dW1, db1 = affine_relu_backward(dh1, cache1)
For the gradient with respect to x, the local derivative is w, so dx is dout (the gradient passed down from above) multiplied by w, as in ①. The gradient with respect to w is computed as in ②, multiplying x (transposed) by dout. The derivative with respect to b is 1, so db is simply the sum of the gradient passed down from above.
# dout is the gradient passed down from the softmax layer; cache holds what the second layer saved in its forward pass
def affine_backward(dout, cache):
    x, w, b = cache
    dx, dw, db = None, None, None
    # ①
    dx = np.dot(dout, w.T)                     # (N, D)
    # Reshape dx back to the original shape of x
    dx = np.reshape(dx, x.shape)               # (N, d1, ..., d_k)
    x_row = x.reshape(x.shape[0], -1)          # (N, D)
    # ②
    dw = np.dot(x_row.T, dout)                 # (D, M)
    db = np.sum(dout, axis=0, keepdims=True)   # (1, M)
    return dx, dw, db
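A quick shape check of affine_backward with arbitrary example sizes (N is a made-up batch size; D and M match the defaults used above):

N, D, M = 4, 3072, 100
dout_demo = np.random.randn(N, M)
cache_demo = (np.random.randn(N, D), np.random.randn(D, M), np.zeros((1, M)))
dx_demo, dw_demo, db_demo = affine_backward(dout_demo, cache_demo)
# dx_demo.shape == (4, 3072), dw_demo.shape == (3072, 100), db_demo.shape == (1, 100)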
In the affine_relu_backward function, we first back-propagate through the ReLU, then call affine_backward again.
def affine_relu_backward(dout, cache):
    """Backward pass for the affine-relu convenience layer."""
    fc_cache, relu_cache = cache
    da = relu_backward(dout, relu_cache)
    dx, dw, db = affine_backward(da, fc_cache)
    return dx, dw, db
For the ReLU layer, the forward pass is max(0, x), so when x > 0 the derivative is 1 and the incoming gradient passes through unchanged; when x ≤ 0 the derivative is 0, so the gradient is 0 as well.
def relu_backward(dout, cache):
    dx, x = None, cache
    # Copy so the upstream gradient is not modified in place
    dx = dout.copy()
    # Zero the gradient wherever the input was non-positive
    dx[x <= 0] = 0
    return dx
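A tiny check of the masking behaviour (illustrative values only):

x_demo = np.array([[-1.0,  2.0],
                   [ 3.0, -4.0]])
dout_demo = np.ones_like(x_demo)
relu_backward(dout_demo, x_demo)
# returns [[0., 1.], [1., 0.]]: the gradient is zeroed wherever the input was <= 0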
Add the gradient of the regularization penalty to W1 and W2 to complete back propagation:
dW2 += self.reg * W2
dW1 += self.reg * W1
Save the gradient values in the grads dictionary:
grads['W1'] = dW1
grads['b1'] = db1
grads['W2'] = dW2
grads['b2'] = db2
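Finally, a rough sketch of how these pieces could be driven during training; the class name TwoLayerNet, the loss(X, y) method, the data arrays, and the plain SGD update are all assumptions for illustration, not code from the original post:

model = TwoLayerNet(hidden_dim=100, reg=0.0)   # hypothetical class wrapping the code above
learning_rate = 1e-3
for it in range(1000):
    # Sample a random mini-batch of 100 training examples
    idx = np.random.choice(X_train.shape[0], 100)
    loss, grads = model.loss(X_train[idx], y_train[idx])
    # Vanilla SGD update using the gradients stored in grads
    for p in model.params:
        model.params[p] -= learning_rate * grads[p]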