What is a multi-layer perceptron?
1. Perceptron and Multi-Layer Perceptron
1.1 Logic gates
AND gate: implements the logical "AND" (multiplication) operation, y = AB. AND gate truth table:

| A | B | y |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |
NOT gate: implements logical negation, inverting its single input. NOT gate truth table:

| A | y |
|---|---|
| 0 | 1 |
| 1 | 0 |
OR gate: implements the logical "OR" operation, y = A + B. OR gate truth table:

| A | B | y |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 0 | 1 |
| 0 | 1 | 1 |
| 1 | 1 | 1 |
NAND gate: AND followed by NOT, i.e. the negation of AND. NAND gate truth table:

| A | B | y |
|---|---|---|
| 0 | 0 | 1 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |
1.2 perceptron
A perceptron receives multiple input signals and outputs a single signal. Here x1 and x2 are the input signals, y is the output signal, and w1 and w2 are the weights. The perceptron computes the weighted sum x1*w1 + x2*w2; when this sum exceeds a certain threshold, the output is 1, and the neuron is said to be "activated". The threshold is denoted by θ.
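As an illustration (not code from the original post), here is a minimal sketch of this perceptron rule implementing the AND, NAND, and OR gates above; the weights w1, w2 and threshold θ below are one hand-picked choice among many that satisfy the truth tables:

```python
def perceptron(x1, x2, w1, w2, theta):
    # Output 1 only if the weighted sum of the inputs exceeds the threshold theta
    return 1 if x1 * w1 + x2 * w2 > theta else 0

def AND(x1, x2):
    return perceptron(x1, x2, w1=0.5, w2=0.5, theta=0.7)

def NAND(x1, x2):
    # Negated AND: flip the signs of the AND weights and threshold
    return perceptron(x1, x2, w1=-0.5, w2=-0.5, theta=-0.7)

def OR(x1, x2):
    return perceptron(x1, x2, w1=0.5, w2=0.5, theta=0.2)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, AND(a, b), NAND(a, b), OR(a, b))
```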
The limitation of a single perceptron is that it can only represent regions separated by a straight line, i.e. linearly separable problems. The XOR gate is not linearly separable, so a single-layer perceptron cannot represent it; its two output classes can only be separated by a curved boundary.
In digital circuits, an XOR gate can be built by combining a NAND gate, an OR gate, and an AND gate. The truth table of this combination is as follows:
| x1 | x2 | s1 = NAND(x1, x2) | s2 = OR(x1, x2) | y = AND(s1, s2) |
|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 0 |
| 0 | 1 | 1 | 1 | 1 |
| 1 | 0 | 1 | 1 | 1 |
| 1 | 1 | 0 | 1 | 0 |
Expressed with perceptrons, this looks as follows:
The figure above is a two-layer perceptron: s1 and s2 form the first layer and y the second. A perceptron stacked in this way is called a multi-layer perceptron (MLP). A typical multi-layer perceptron (neural network) diagram:
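Reusing the hand-coded gate perceptrons sketched earlier (again an illustrative addition, not code from the original post), the XOR composition in the table above can be written directly as a two-layer perceptron:

```python
def XOR(x1, x2):
    s1 = NAND(x1, x2)   # first layer
    s2 = OR(x1, x2)     # first layer
    return AND(s1, s2)  # second layer

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, XOR(a, b))  # prints 0, 1, 1, 0
```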
The differences between a single-layer perceptron and a multi-layer perceptron:
<1>. A multi-layer perceptron has one or more hidden layers between the input layer and the output layer.
<2>. The neurons in each layer are fully connected to the neurons of the next layer.
<3>. The hidden-layer neurons also apply an activation function.
2. Implementing a Multi-Layer Perceptron in TensorFlow
Training a neural network in TensorFlow involves four steps:
step1: Define the forward computation
Initialize the hidden-layer weights and choose the activation function.
step2: Define the loss function and choose the optimizer
Common loss functions include squared error and cross entropy; choose the optimizer and its learning rate.
step3: Train the model
Set the number of training iterations, the number of batches, the batch size, and the dropout keep_prob.
step4: Evaluate the model's accuracy on the test set
If dropout is used, set keep_prob to 1 so that all features are used for prediction; use tf.equal to find the correctly predicted samples, tf.cast to convert [True, False] to 0/1, and tf.reduce_mean to compute the mean accuracy.
Implementation code :
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
mnist = input_data.read_data_sets("MNIST_DATA/", one_hot=True)
sess = tf.InteractiveSession()
'''
Next, initialize the hidden-layer parameters (tf.Variable):
in_units: number of input nodes
h1_units: number of hidden-layer output nodes (set to 300)
W1, b1: hidden-layer weights and biases. The weights are initialized from a truncated normal
distribution with standard deviation 0.1, tf.truncated_normal([in_units, h1_units], stddev=0.1),
and the biases are all 0. Because the model uses the ReLU activation, a little noise is added to
the weights to break complete symmetry and avoid zero gradients.
W2, b2: output-layer weights and biases (all set to 0)
'''
in_units = 784
h1_units = 300
W1 = tf.Variable(tf.truncated_normal([in_units, h1_units], stddev=0.1))
b1 = tf.Variable(tf.zeros([h1_units]))
W2 = tf.Variable(tf.zeros([h1_units, 10]))
b2 = tf.Variable(tf.zeros([10]))
'''
Define the placeholder for the input x. The dropout keep ratio keep_prob differs between training
and prediction (usually < 1 during training and equal to 1 for prediction), so it is also fed into
the computation graph and defined as a placeholder.
'''
x = tf.placeholder(tf.float32, [None, in_units])
keep_prob = tf.placeholder(tf.float32)
'''
Define the model structure:
step1: Build the hidden layer, named hidden1. tf.nn.relu(tf.matmul(x, W1) + b1) implements a hidden
layer with ReLU activation, i.e. hidden1 = relu(W1*x + b1).
step2: Use tf.nn.dropout to implement dropout, which randomly sets some nodes to 0. keep_prob is the
fraction of nodes kept (not set to 0); it is < 1 during training to add randomness, and equal to 1
for prediction so that all features are used.
'''
hidden1 = tf.nn.relu(tf.matmul(x, W1) + b1)
hidden1_drop = tf.nn.dropout(hidden1, keep_prob)
y = tf.nn.softmax(tf.matmul(hidden1_drop, W2) + b2)
''' Define the loss function and select the optimizer '''
y_ = tf.placeholder(tf.float32, [None, 10])  # 10-dimensional one-hot label: the index of the true digit is 1
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.AdagradOptimizer(0.3).minimize(cross_entropy)
''' Start training: 300 batches in total, each batch contains 100 samples; keep 75% of the nodes during training '''
tf.global_variables_initializer().run()
for i in range(300):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    train_step.run({x: batch_xs, y_: batch_ys, keep_prob: 0.75})
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))  # tf.argmax(y, 1): index of the largest value in each row of y
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(accuracy.eval({x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
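Note that the code above uses the TensorFlow 1.x API (placeholders, InteractiveSession, the tutorial MNIST reader), which was removed in TensorFlow 2.x. As a rough sketch only (an addition, not part of the original post, and assuming TF 2.x with tf.keras available), approximately the same 784 → 300 (ReLU) → dropout → 10 (softmax) network could be written as:

```python
import tensorflow as tf

# Load MNIST and flatten the 28x28 images into 784-dimensional vectors
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Dense(300, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dropout(0.25),  # dropout rate = 1 - keep_prob
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adagrad(learning_rate=0.3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=100, epochs=1)
print(model.evaluate(x_test, y_test))  # [test loss, test accuracy]
```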