What is a multi-layer perceptron?
1. Perceptron and Multi-Layer Perceptron
1.1 Logic gates
AND gate: implements the logical "AND" (multiplication) operation, y = AB. AND gate truth table:

| A | B | y |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |
NOT gate: implements logical negation, inverting its single input. NOT gate truth table:

| A | y |
|---|---|
| 0 | 1 |
| 1 | 0 |
OR gate: implements the logical "OR" operation, y = A + B. OR gate truth table:

| A | B | y |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 0 | 1 |
| 0 | 1 | 1 |
| 1 | 1 | 1 |
NAND gate: AND followed by NOT, i.e. the negation of AND. NAND gate truth table:

| A | B | y |
|---|---|---|
| 0 | 0 | 1 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |
1.2 perceptron
A perceptron receives multiple input signals and outputs a single signal. Here x1 and x2 are the input signals, y is the output signal, and w1 and w2 are the weights. The perceptron computes the weighted sum x1*w1 + x2*w2; when this sum exceeds a certain threshold, the output is 1, and the neuron is said to be "activated". The threshold is denoted by θ.
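As an illustration (not code from the original post), here is a minimal sketch of this perceptron rule implementing the AND, NAND, and OR gates above; the weights w1, w2 and threshold θ below are one hand-picked choice among many that satisfy the truth tables:

```python
def perceptron(x1, x2, w1, w2, theta):
    # Output 1 only if the weighted sum of the inputs exceeds the threshold theta
    return 1 if x1 * w1 + x2 * w2 > theta else 0

def AND(x1, x2):
    return perceptron(x1, x2, w1=0.5, w2=0.5, theta=0.7)

def NAND(x1, x2):
    # Negated AND: flip the signs of the AND weights and threshold
    return perceptron(x1, x2, w1=-0.5, w2=-0.5, theta=-0.7)

def OR(x1, x2):
    return perceptron(x1, x2, w1=0.5, w2=0.5, theta=0.2)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, AND(a, b), NAND(a, b), OR(a, b))
```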
The limitation of a single perceptron is that it can only represent regions separated by a straight line, i.e. linearly separable problems. The XOR gate is not linearly separable, so a single-layer perceptron cannot represent it; its two output classes can only be separated by a curved boundary.
In digital circuits, an XOR gate can be built by combining a NAND gate, an OR gate, and an AND gate. The truth table of this combination is as follows:
| x1 | x2 | s1 = NAND(x1, x2) | s2 = OR(x1, x2) | y = AND(s1, s2) |
|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 0 |
| 0 | 1 | 1 | 1 | 1 |
| 1 | 0 | 1 | 1 | 1 |
| 1 | 1 | 0 | 1 | 0 |
Expressed with perceptrons, this looks as follows:
The figure above is a two-layer perceptron: s1 and s2 form the first layer and y the second. A perceptron stacked in this way is called a multi-layer perceptron (MLP). A typical multi-layer perceptron (neural network) diagram:
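Reusing the hand-coded gate perceptrons sketched earlier (again an illustrative addition, not code from the original post), the XOR composition in the table above can be written directly as a two-layer perceptron:

```python
def XOR(x1, x2):
    s1 = NAND(x1, x2)   # first layer
    s2 = OR(x1, x2)     # first layer
    return AND(s1, s2)  # second layer

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, XOR(a, b))  # prints 0, 1, 1, 0
```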
The differences between a single-layer perceptron and a multi-layer perceptron:
<1>. A multi-layer perceptron has one or more hidden layers between the input layer and the output layer.
<2>. The neurons in each layer are fully connected to the neurons of the next layer.
<3>. The hidden-layer neurons also apply an activation function.
2. Implementing a Multi-Layer Perceptron in TensorFlow
Training a neural network in TensorFlow involves four steps:
step1: Define the forward computation
Initialize the hidden-layer weights and choose the activation function.
step2: Define the loss function and choose the optimizer
Common loss functions include squared error and cross entropy; choose the optimizer and its learning rate.
step3: Train the model
Set the number of training iterations, the number of batches, the batch size, and the dropout keep_prob.
step4: Evaluate the model's accuracy on the test set
If dropout is used, set keep_prob to 1 so that all features are used for prediction; use tf.equal to find the correctly predicted samples, tf.cast to convert [True, False] to 0/1, and tf.reduce_mean to compute the mean accuracy.
Implementation code :
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
mnist = input_data.read_data_sets("MNIST_DATA/", one_hot=True)
sess = tf.InteractiveSession()
'''
Next, initialize the hidden-layer parameters (tf.Variable):
in_units: number of input nodes
h1_units: number of hidden-layer output nodes (set to 300)
W1, b1: hidden-layer weights and biases. The weights are initialized from a truncated normal
distribution with standard deviation 0.1, tf.truncated_normal([in_units, h1_units], stddev=0.1),
and the biases are all 0. Because the model uses the ReLU activation, a little noise is added to
the weights to break complete symmetry and avoid zero gradients.
W2, b2: output-layer weights and biases (all set to 0)
'''
in_units = 784
h1_units = 300
W1 = tf.Variable(tf.truncated_normal([in_units, h1_units], stddev=0.1))
b1 = tf.Variable(tf.zeros([h1_units]))
W2 = tf.Variable(tf.zeros([h1_units, 10]))
b2 = tf.Variable(tf.zeros([10]))
'''
Define the placeholder for the input x. The dropout keep ratio keep_prob differs between training
and prediction (usually < 1 during training and equal to 1 for prediction), so it is also fed into
the computation graph and defined as a placeholder.
'''
x = tf.placeholder(tf.float32, [None, in_units])
keep_prob = tf.placeholder(tf.float32)
'''
Define the model structure:
step1: Build the hidden layer, named hidden1. tf.nn.relu(tf.matmul(x, W1) + b1) implements a hidden
layer with ReLU activation, i.e. hidden1 = relu(W1*x + b1).
step2: Use tf.nn.dropout to implement dropout, which randomly sets some nodes to 0. keep_prob is the
fraction of nodes kept (not set to 0); it is < 1 during training to add randomness, and equal to 1
for prediction so that all features are used.
'''
hidden1 = tf.nn.relu(tf.matmul(x, W1) + b1)
hidden1_drop = tf.nn.dropout(hidden1, keep_prob)
y = tf.nn.softmax(tf.matmul(hidden1_drop, W2) + b2)
''' Define the loss function and select the optimizer '''
y_ = tf.placeholder(tf.float32, [None, 10])  # 10-dimensional one-hot label: the index of the true digit is 1
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.AdagradOptimizer(0.3).minimize(cross_entropy)
''' Start training: 300 batches in total, each batch contains 100 samples; keep 75% of the nodes during training '''
tf.global_variables_initializer().run()
for i in range(300):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    train_step.run({x: batch_xs, y_: batch_ys, keep_prob: 0.75})
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))  # tf.argmax(y, 1): index of the largest value in each row of y
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(accuracy.eval({x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
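Note that the code above uses the TensorFlow 1.x API (placeholders, InteractiveSession, the tutorial MNIST reader), which was removed in TensorFlow 2.x. As a rough sketch only (an addition, not part of the original post, and assuming TF 2.x with tf.keras available), approximately the same 784 → 300 (ReLU) → dropout → 10 (softmax) network could be written as:

```python
import tensorflow as tf

# Load MNIST and flatten the 28x28 images into 784-dimensional vectors
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Dense(300, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dropout(0.25),  # dropout rate = 1 - keep_prob
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adagrad(learning_rate=0.3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=100, epochs=1)
print(model.evaluate(x_test, y_test))  # [test loss, test accuracy]
```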