ML17: Neural Network Practice
2022-07-29 06:07:00 【19-year-old flower girl】
Neural network in practice
The dataset contains 50,000 training images and 10,000 test images, but for speed we use only 5,000 training samples and 500 test samples.
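A minimal sketch of this subsampling step, assuming the full arrays have already been loaded as X_train, y_train, X_test, y_test (these variable names are assumptions, not shown in the original post):

num_train, num_test = 5000, 500
small_X_train, small_y_train = X_train[:num_train], y_train[:num_train]
small_X_test,  small_y_test  = X_test[:num_test],  y_test[:num_test]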
Initialization
input_dim: the input is a 3×32×32 color image (3072 values when flattened). hidden_dim: the hidden layer has 100 neurons. num_classes: the network outputs scores for 10 categories. weight_scale: keeps the random weight initialization small. reg: the strength of the regularization penalty.
# Initialize W and b
def __init__(self, input_dim=3*32*32, hidden_dim=100, num_classes=10,
             weight_scale=1e-3, reg=0.0):
    self.params = {}
    self.reg = reg
    self.params['W1'] = weight_scale * np.random.randn(input_dim, hidden_dim)
    self.params['b1'] = np.zeros((1, hidden_dim))
    self.params['W2'] = weight_scale * np.random.randn(hidden_dim, num_classes)
    self.params['b2'] = np.zeros((1, num_classes))
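A quick usage sketch to confirm the parameter shapes after initialization (the class name TwoLayerNet is an assumption; the post does not name the class):

net = TwoLayerNet(input_dim=3*32*32, hidden_dim=100, num_classes=10)
print(net.params['W1'].shape)  # (3072, 100)
print(net.params['b1'].shape)  # (1, 100)
print(net.params['W2'].shape)  # (100, 10)
print(net.params['b2'].shape)  # (1, 10)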
Forward propagation process
Each layer plus its activation function contributes to the output score.
The input data passes through a fully connected layer, then through an activation layer (using the ReLU function), and finally the loss value is computed from the output.
First retrieve the initialized W and b, then pass x, W, and b into the forward functions; this is the forward propagation process, and at the end we obtain the scores.
scores = None
N = X.shape[0]
# Unpack variables from the params dictionary
W1, b1 = self.params['W1'], self.params['b1']
W2, b2 = self.params['W2'], self.params['b2']
# The two forward helpers differ only in that the second one applies no activation function
h1, cache1 = affine_relu_forward(X, W1, b1)
out, cache2 = affine_forward(h1, W2, b2)
# Score value
scores = out
Inside the affine_relu_forward function, which computes the output of the hidden layer:
def affine_relu_forward(x, w, b):
    a, fc_cache = affine_forward(x, w, b)
    out, relu_cache = relu_forward(a)
    # Save the intermediate values (the raw fully connected output and the ReLU output) for the backward pass
    cache = (fc_cache, relu_cache)
    return out, cache
In affine_forward, the computation is x·w + b:
out = np.dot(x_row, w) + b
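For completeness, a minimal sketch of the full affine_forward; the x_row reshape and the returned cache are assumptions, but they are consistent with the affine_backward shown later:

def affine_forward(x, w, b):
    # Flatten each sample into a row vector of length D
    x_row = x.reshape(x.shape[0], -1)   # (N, D)
    out = np.dot(x_row, w) + b          # (N, M)
    # Cache the inputs for the backward pass
    cache = (x, w, b)
    return out, cache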
The relu_forward function applies the ReLU operation, which is simply max(0, x).
def relu_forward(x):
    out = None
    out = ReLU(x)
    cache = x
    return out, cache

def ReLU(x):
    """ReLU non-linearity."""
    return np.maximum(0, x)
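A quick check of the behaviour on a hypothetical toy input (not from the original post):

x = np.array([[-2.0, 0.0, 3.0]])
out, cache = relu_forward(x)
print(out)    # [[0. 0. 3.]]
print(cache)  # the original input, kept for the backward pass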
The softmax function converts the scores into probabilities.
Call softmax_loss:
data_loss, dscores = softmax_loss(scores, y)
def softmax_loss(x, y):
    # Normalize the scores (subtract the per-row max for numerical stability)
    probs = np.exp(x - np.max(x, axis=1, keepdims=True))
    probs /= np.sum(probs, axis=1, keepdims=True)
    N = x.shape[0]
    # Loss is the mean of -log(probability of the correct class)
    loss = -np.sum(np.log(probs[np.arange(N), y])) / N
    dx = probs.copy()
    # Gradient: subtract 1 from the correct-class probability, then average over N
    dx[np.arange(N), y] -= 1
    dx /= N
    # Return the loss value and gradient
    return loss, dx
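A simple sanity check (a hypothetical test, not from the original post): with small random scores and 10 classes, the loss should be close to ln(10) ≈ 2.3:

N, C = 500, 10
scores_check = 0.001 * np.random.randn(N, C)
y_check = np.random.randint(C, size=N)
loss_check, _ = softmax_loss(scores_check, y_check)
print(loss_check)   # roughly 2.3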
# Regularization penalty term: 0.5 * reg * sum(W^2) for each weight matrix
reg_loss = 0.5 * self.reg * np.sum(W1*W1) + 0.5 * self.reg * np.sum(W2*W2)
# Loss function = data loss + regularization penalty
loss = data_loss + reg_loss
Having finished the gradient of the softmax, we now compute the gradients of the previous layer: W2 and b2 of the second (output) layer.
Call affine_backward, and then call affine_relu_backward.
dh1, dW2, db2 = affine_backward(dscores, cache2)
dX, dW1, db1 = affine_relu_backward(dh1, cache1)
For the gradient with respect to x, the local derivative is w, so dx is the upstream gradient dout multiplied by w (transposed), as in ①. The gradient with respect to w is computed as in ②. For b the local derivative is 1, so db is simply the gradient passed down from above, summed over the batch.
# dout is the gradient passed back from the softmax layer; cache holds the values saved during the second layer's forward pass
def affine_backward(dout, cache):
    x, w, b = cache
    dx, dw, db = None, None, None
    # ① gradient with respect to x
    dx = np.dot(dout, w.T)                    # (N, D)
    # Reshape dx back to the original shape of x
    dx = np.reshape(dx, x.shape)              # (N, d1, ..., d_k)
    x_row = x.reshape(x.shape[0], -1)         # (N, D)
    # ② gradient with respect to w
    dw = np.dot(x_row.T, dout)                # (D, M)
    db = np.sum(dout, axis=0, keepdims=True)  # (1, M)
    return dx, dw, db
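As a sanity check (not part of the original post), the analytic dw from affine_backward can be compared against a central-difference numerical gradient. The helper num_grad_w and the surrogate loss sum(out * dout) below are assumptions introduced only for this test:

def num_grad_w(f, w, h=1e-5):
    # Central-difference numerical gradient of the scalar function f with respect to w
    grad = np.zeros_like(w)
    it = np.nditer(w, flags=['multi_index'])
    while not it.finished:
        idx = it.multi_index
        old = w[idx]
        w[idx] = old + h
        fp = f()
        w[idx] = old - h
        fm = f()
        w[idx] = old
        grad[idx] = (fp - fm) / (2 * h)
        it.iternext()
    return grad

# Compare the analytic dw with the numeric estimate on tiny random data
x = np.random.randn(4, 6)
w = np.random.randn(6, 3)
b = np.random.randn(1, 3)
dout = np.random.randn(4, 3)

out, cache = affine_forward(x, w, b)
_, dw, _ = affine_backward(dout, cache)

# Surrogate scalar loss sum(out * dout): its gradient with respect to w is exactly dw
f = lambda: np.sum(affine_forward(x, w, b)[0] * dout)
dw_num = num_grad_w(f, w)
print(np.max(np.abs(dw - dw_num)))   # should be tiny, on the order of 1e-8 or smaller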
The affine_relu_backward function first backpropagates through the ReLU, then calls affine_backward.
def affine_relu_backward(dout, cache):
    """Backward pass for the affine-relu convenience layer."""
    fc_cache, relu_cache = cache
    da = relu_backward(dout, relu_cache)
    dx, dw, db = affine_backward(da, fc_cache)
    return dx, dw, db
For the ReLU layer, the forward pass is max(0, x). Taking the derivative: where x > 0 the derivative is 1, so the incoming gradient passes through unchanged; where x ≤ 0 the derivative is 0, so the gradient is also 0.
def relu_backward(dout, cache):
    dx, x = None, cache
    # Pass the upstream gradient through, then zero it wherever the forward input was <= 0
    dx = dout
    dx[x <= 0] = 0
    return dx
Add the gradient of the regularization penalty to complete backpropagation:
dW2 += self.reg * W2
dW1 += self.reg * W1
Save the gradient value
grads['W1'] = dW1
grads['b1'] = db1
grads['W2'] = dW2
grads['b2'] = db2
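The gradients are then used to update the parameters. The original post stops here; a minimal vanilla gradient-descent step would look like the following sketch (the learning_rate value is an assumption):

learning_rate = 1e-3
for p in ['W1', 'b1', 'W2', 'b2']:
    # Step each parameter against its gradient
    self.params[p] -= learning_rate * grads[p]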