Understanding Artificial Neural Networks in One Article
2022-07-27 23:14:00 【InfoQ】
1. Summary


2. Principle

3. Training steps
1. Network initialization
hidden_floors_num: Number of hidden layers
every_hidden_floor_num: The number of neurons in each hidden layer
learning_rate: Learning rate
activation: Activation function
regularization: Regularization approach
regularization_rate: Regularization coefficient
total_step: Total number of training steps
train_data_path: Training data path
model_save_path: Model save path

2. Hidden layer output calculation
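The original formula images did not survive, so the standard textbook BP notation is assumed in this and the following steps. With inputs x_i, input-to-hidden weights v_{ih}, and hidden-layer threshold \gamma_h, the output of hidden neuron h is

b_h = f\left(\sum_{i=1}^{d} v_{ih} x_i - \gamma_h\right)

where f is the activation function chosen during network initialization.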


3. Output layer output calculation
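In the same assumed notation, with hidden outputs b_h, hidden-to-output weights w_{hj}, and output threshold \theta_j, output neuron j computes

\hat{y}_j = f\left(\sum_{h=1}^{q} w_{hj} b_h - \theta_j\right)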

4. Error calculation
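For a training sample (x^k, y^k), the squared error on the outputs is

E_k = \frac{1}{2} \sum_{j=1}^{l} \left(\hat{y}_j^k - y_j^k\right)^2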

5. Weight update
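Each weight moves against the gradient of the error, scaled by the learning rate \eta. For sigmoid output units, the standard hidden-to-output update is

\Delta w_{hj} = \eta \, g_j \, b_h, \qquad g_j = \hat{y}_j (1 - \hat{y}_j)(y_j - \hat{y}_j)

and the input-to-hidden weights are updated analogously with the gradient back-propagated through the hidden layer.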

6. Threshold update
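The thresholds (biases) follow the same gradient rule; for sigmoid hidden units,

\Delta \theta_j = -\eta \, g_j, \qquad \Delta \gamma_h = -\eta \, e_h, \qquad e_h = b_h (1 - b_h) \sum_{j=1}^{l} w_{hj} \, g_j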

7. Judge whether the algorithm iteration is completed
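Training stops once the error falls below a preset tolerance or the maximum number of iterations is reached; otherwise the algorithm returns to step 2. The sketch below ties steps 2-7 together for a single-hidden-layer network with sigmoid activations; it is a minimal illustration, and the layer sizes, learning rate eta, and tolerance tol are assumed values, not taken from the article.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
d, q, l = 3, 5, 1                     # input, hidden, and output sizes
V = rng.normal(size=(d, q))           # input -> hidden weights
gamma = np.zeros(q)                   # hidden thresholds
W = rng.normal(size=(q, l))           # hidden -> output weights
theta = np.zeros(l)                   # output thresholds
eta, tol = 0.1, 1e-4                  # learning rate and stopping tolerance
x = rng.normal(size=d)                # one training sample
y = np.array([1.0])

for step in range(10000):
    b = sigmoid(x @ V - gamma)                 # step 2: hidden layer output
    y_hat = sigmoid(b @ W - theta)             # step 3: output layer output
    E = 0.5 * np.sum((y_hat - y) ** 2)         # step 4: error calculation
    g = y_hat * (1 - y_hat) * (y - y_hat)      # output-layer gradient
    e = b * (1 - b) * (W @ g)                  # hidden-layer gradient
    W += eta * np.outer(b, g)                  # step 5: weight updates
    V += eta * np.outer(x, e)
    theta -= eta * g                           # step 6: threshold updates
    gamma -= eta * e
    if E < tol:                                # step 7: stop when error is small enough
        break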
4. Case study
1. Generating the data
# Generate training, validation, and test data
import numpy as np
import pandas as pd
# Total number of samples across the training and validation sets
sample = 2000
train_data_path = 'train.csv'
validate_data_path = 'validate.csv'
predict_data_path = 'test.csv'
# Construct a model that generates data
X1 = np.zeros((sample, 1))
X1[:, 0] = np.random.normal(1, 1, sample)
X2 = np.zeros((sample, 1))
X2[:, 0] = np.random.normal(2, 1, sample)
X3 = np.zeros((sample, 1))
X3[:, 0] = np.random.normal(3, 1, sample)
# Ground-truth model: y = 6*x1 - 3*x2 + x3^2 + noise
Y = 6 * X1 - 3 * X2 + X3 * X3 + np.random.normal(0, 0.1, [sample, 1])
# Collect all the generated columns into a single data array
data = np.zeros((sample, 4))
data[:, 0] = X1[:, 0]
data[:, 1] = X2[:, 0]
data[:, 2] = X3[:, 0]
data[:, 3] = Y[:, 0]
# Split data into a training set and a validation set
num_traindata = int(0.8 * sample)
# Save the training data
traindata = pd.DataFrame(data[0:num_traindata, :], columns=['x1', 'x2', 'x3', 'y'])
traindata.to_csv(train_data_path, index=False)
print('Training data saved to:', train_data_path)
# Save the validation data
validate_data = pd.DataFrame(data[num_traindata:, :], columns=['x1', 'x2', 'x3', 'y'])
validate_data.to_csv(validate_data_path, index=False)
print('Validation data saved to:', validate_data_path)
# Save the prediction data (the validation features without labels)
predict_data = pd.DataFrame(data[num_traindata:, 0:-1], columns=['x1', 'x2', 'x3'])
predict_data.to_csv(predict_data_path, index=False)
print('Prediction data saved to:', predict_data_path)

2. Training and validating the model
import tensorflow as tf
import pandas as pd
import numpy as np
createVar = locals()  # used to create dynamically named variables (w2, b2, w3, b3, ...)
'''
General-purpose code for a BP neural network with a configurable structure.
Parameters used during training:
hidden_floors_num: Number of hidden layers
every_hidden_floor_num: Number of neurons in each hidden layer
learning_rate: Learning rate
activation: Activation function
regularization: Regularization approach
regularization_rate: Regularization coefficient
total_step: Total number of training steps
train_data_path: Training data path
model_save_path: Model save path
Parameters used when evaluating the trained model on the validation set:
model_save_path: Model save path
validate_data_path: Validation set path
precision: Tolerance within which a prediction counts as correct
Parameters used when predicting with the trained model:
model_save_path: Model save path
predict_data_path: Prediction data path
predict_result_save_path: Save path for the prediction results
'''
# Global parameters for training
hidden_floors_num = 1
every_hidden_floor_num = [50]
learning_rate = 0.00001
activation = 'tanh'
regularization = 'L1'
regularization_rate = 0.0001
total_step = 200000
train_data_path = 'train.csv'
model_save_path = 'model/predict_model'
# Global parameters for evaluating the model on the validation set
model_save_path = 'model/predict_model'
validate_data_path = 'validate.csv'
precision = 0.5
# Global parameters for prediction
model_save_path = 'model/predict_model'
predict_data_path = 'test.csv'
predict_result_save_path = 'test_predict.csv'
def inputs(train_data_path):
    train_data = pd.read_csv(train_data_path)
    X = np.array(train_data.iloc[:, :-1])
    Y = np.array(train_data.iloc[:, -1:])
    return X, Y
def make_hidden_layer(pre_lay_num, cur_lay_num, floor):
    createVar['w' + str(floor)] = tf.Variable(tf.random_normal([pre_lay_num, cur_lay_num], stddev=1))
    createVar['b' + str(floor)] = tf.Variable(tf.random_normal([cur_lay_num], stddev=1))
    return eval('w' + str(floor)), eval('b' + str(floor))
def initial_w_and_b(all_floors_num):
    # Initialize w and b for the hidden layers and the output layer
    for floor in range(2, hidden_floors_num + 3):
        pre_lay_num = all_floors_num[floor - 2]
        cur_lay_num = all_floors_num[floor - 1]
        w_floor, b_floor = make_hidden_layer(pre_lay_num, cur_lay_num, floor)
        createVar['w' + str(floor)] = w_floor
        createVar['b' + str(floor)] = b_floor
def cal_floor_output(x, floor):
    w_floor = eval('w' + str(floor))
    b_floor = eval('b' + str(floor))
    if activation == 'sigmoid':
        output = tf.sigmoid(tf.matmul(x, w_floor) + b_floor)
    elif activation == 'tanh':
        output = tf.tanh(tf.matmul(x, w_floor) + b_floor)
    elif activation == 'relu':
        output = tf.nn.relu(tf.matmul(x, w_floor) + b_floor)
    else:
        raise ValueError('unsupported activation: ' + str(activation))
    return output
def inference(x):
    output = x
    # Propagate through the hidden layers
    for floor in range(2, hidden_floors_num + 2):
        output = cal_floor_output(output, floor)
    # The output layer is linear (no activation)
    floor = hidden_floors_num + 2
    w_floor = eval('w' + str(floor))
    b_floor = eval('b' + str(floor))
    output = tf.matmul(output, w_floor) + b_floor
    return output
def loss(x, y_real):
    y_pre = inference(x)
    if regularization == 'None':
        total_loss = tf.reduce_sum(tf.squared_difference(y_real, y_pre))
    if regularization == 'L1':
        total_loss = 0
        for floor in range(2, hidden_floors_num + 3):
            w_floor = eval('w' + str(floor))
            total_loss = total_loss + tf.contrib.layers.l1_regularizer(regularization_rate)(w_floor)
        total_loss = total_loss + tf.reduce_sum(tf.squared_difference(y_real, y_pre))
    if regularization == 'L2':
        total_loss = 0
        for floor in range(2, hidden_floors_num + 3):
            w_floor = eval('w' + str(floor))
            total_loss = total_loss + tf.contrib.layers.l2_regularizer(regularization_rate)(w_floor)
        total_loss = total_loss + tf.reduce_sum(tf.squared_difference(y_real, y_pre))
    return total_loss
def train(total_loss):
    train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(total_loss)
    return train_op
# Train the model
def train_model(hidden_floors_num, every_hidden_floor_num, learning_rate, activation, regularization,
                regularization_rate, total_step, train_data_path, model_save_path):
    file_handle = open('acc.txt', mode='w')
    X, Y = inputs(train_data_path)
    X_dim = X.shape[1]
    all_floors_num = [X_dim] + every_hidden_floor_num + [1]
    # Save the network parameters in the same folder as model_save_path;
    # they are loaded to rebuild the network when the model is restored later
    temp = model_save_path.split('/')
    model_name = temp[-1]
    parameter_path = ''
    for i in range(len(temp) - 1):
        parameter_path = parameter_path + temp[i] + '/'
    parameter_path = parameter_path + model_name + '_parameter.txt'
    with open(parameter_path, 'w') as f:
        f.write("all_floors_num:")
        for i in all_floors_num:
            f.write(str(i) + ' ')
        f.write('\n')
        f.write('activation:')
        f.write(str(activation))
    x = tf.placeholder(dtype=tf.float32, shape=[None, X_dim])
    y_real = tf.placeholder(dtype=tf.float32, shape=[None, 1])
    initial_w_and_b(all_floors_num)
    y_pre = inference(x)
    total_loss = loss(x, y_real)
    train_op = train(total_loss)
    # Accuracy on the training set: fraction of predictions within the
    # module-level global `precision` of the target
    train_accuracy = tf.reduce_mean(tf.cast(tf.abs(y_pre - y_real) < precision, tf.float32))
    print(y_pre)
    # Saver for checkpointing the model
    saver = tf.train.Saver()
    # Launch the dataflow graph in a session
    sess = tf.Session()
    init = tf.global_variables_initializer()
    sess.run(init)
    for step in range(total_step):
        sess.run([train_op], feed_dict={x: X, y_real: Y})
        if step % 1000 == 0:
            saver.save(sess, model_save_path)
            total_loss_value = sess.run(total_loss, feed_dict={x: X, y_real: Y})
            lxacc = sess.run(train_accuracy, feed_dict={x: X, y_real: Y})
            print('train step is ', step, ', total loss value is ', total_loss_value,
                  ', train_accuracy', lxacc,
                  ', precision is ', precision)
            file_handle.write(str(lxacc) + "\n")
    saver.save(sess, model_save_path)
    file_handle.close()
    sess.close()
def validate(model_save_path, validate_data_path, precision):
    # Derive the parameter file path from model_save_path, then recover
    # all_floors_num and activation. The parsed values are assigned to globals
    # because initial_w_and_b and cal_floor_output read the globals.
    global hidden_floors_num, activation
    temp = model_save_path.split('/')
    model_name = temp[-1]
    parameter_path = ''
    for i in range(len(temp) - 1):
        parameter_path = parameter_path + temp[i] + '/'
    parameter_path = parameter_path + model_name + '_parameter.txt'
    with open(parameter_path, 'r') as f:
        lines = f.readlines()
    # Parse all_floors_num from the file contents
    temp = lines[0].split(':')[-1].split(' ')
    all_floors_num = []
    for i in range(len(temp) - 1):
        all_floors_num = all_floors_num + [int(temp[i])]
    # Parse activation from the file contents
    activation = lines[1].split(':')[-1].strip()
    hidden_floors_num = len(all_floors_num) - 2
    # Read the validation data
    X, Y = inputs(validate_data_path)
    X_dim = X.shape[1]
    # Build the neural network
    x = tf.placeholder(dtype=tf.float32, shape=[None, X_dim])
    y_real = tf.placeholder(dtype=tf.float32, shape=[None, 1])
    initial_w_and_b(all_floors_num)
    y_pre = inference(x)
    # Accuracy on the validation set
    validate_accuracy = tf.reduce_mean(tf.cast(tf.abs(y_pre - y_real) < precision, tf.float32))
    saver = tf.train.Saver()
    with tf.Session() as sess:
        # Restore the model
        try:
            saver.restore(sess, model_save_path)
            print('Model loaded successfully!')
        except Exception:
            print('Model does not exist, please train the model first!')
            return
        validate_accuracy_value = sess.run(validate_accuracy, feed_dict={x: X, y_real: Y})
        print('validate_accuracy is ', validate_accuracy_value)
        return validate_accuracy_value
def predict(model_save_path, predict_data_path, predict_result_save_path):
    # Derive the parameter file path from model_save_path, then recover
    # all_floors_num and activation (assigned to globals, as in validate)
    global hidden_floors_num, activation
    temp = model_save_path.split('/')
    model_name = temp[-1]
    parameter_path = ''
    for i in range(len(temp) - 1):
        parameter_path = parameter_path + temp[i] + '/'
    parameter_path = parameter_path + model_name + '_parameter.txt'
    with open(parameter_path, 'r') as f:
        lines = f.readlines()
    # Parse all_floors_num from the file contents
    temp = lines[0].split(':')[-1].split(' ')
    all_floors_num = []
    for i in range(len(temp) - 1):
        all_floors_num = all_floors_num + [int(temp[i])]
    # Parse activation from the file contents
    activation = lines[1].split(':')[-1].strip()
    hidden_floors_num = len(all_floors_num) - 2
    # Read the prediction data
    predict_data = pd.read_csv(predict_data_path)
    X = np.array(predict_data.iloc[:, :])
    X_dim = X.shape[1]
    # Build the neural network
    x = tf.placeholder(dtype=tf.float32, shape=[None, X_dim])
    initial_w_and_b(all_floors_num)
    y_pre = inference(x)
    saver = tf.train.Saver()
    with tf.Session() as sess:
        # Restore the model
        try:
            saver.restore(sess, model_save_path)
            print('Model loaded successfully!')
        except Exception:
            print('Model does not exist, please train the model first!')
            return
        y_pre_value = sess.run(y_pre, feed_dict={x: X})
        # Write the prediction results to a csv file
        predict_data_columns = list(predict_data.columns) + ['predict']
        data = np.column_stack([X, y_pre_value])
        result = pd.DataFrame(data, columns=predict_data_columns)
        result.to_csv(predict_result_save_path, index=False)
        print('The prediction results are saved to:', predict_result_save_path)
if __name__ == '__main__':
    mode = "train"  # switch to 'validate' or 'predict' after training
    if mode == 'train':
        # Train the model
        train_model(hidden_floors_num, every_hidden_floor_num, learning_rate, activation, regularization,
                    regularization_rate, total_step, train_data_path, model_save_path)
    if mode == 'validate':
        # Evaluate the trained model on the validation set
        validate(model_save_path, validate_data_path, precision)
    if mode == 'predict':
        # Use the trained model to predict
        predict(model_save_path, predict_data_path, predict_result_save_path)
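One portability note: this script uses the TensorFlow 1.x graph API (tf.placeholder, tf.Session, tf.train.Saver, tf.contrib). Under TensorFlow 2.x, the graph-style code can usually be run through the compatibility shim sketched below, but tf.contrib was removed in 2.x, so the tf.contrib.layers regularizers would also need a replacement (for example, tf.keras.regularizers):

# Compatibility shim for running TF1-style graph code on TensorFlow 2.x
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()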



3. Predicting with the trained model
