
[Deep Learning Skill Chapter] Chap. 1: From the Perceptron to the Artificial Neural Network (ANN)

2022-06-09 06:28:00 Jie_ MSD


This article introduces in detail the biologically inspired idea of the artificial neural network, as introductory material for deep learning. This column, [Deep Learning Theory Chapter], uses Python 3 and tries not to rely on external libraries or tools, leading readers to build a classic deep learning network from scratch so that they gradually come to understand deep learning along the way.


1. The MCP neuron: the simplest abstract model of a brain neuron

How can we make a machine think and learn the way humans do?
Why can humans think?
How do humans learn?


1.1 The nervous system of the brain

Human consciousness arises from the brain's nervous system; the human brain contains billions of neurons.

In 1943, the researchers Warren McCulloch and Walter Pitts proposed the first abstract model of the brain neuron, the McCulloch-Pitts neuron (MCP neuron for short).

Let's review some high-school biology:

  • Neuron: a nerve cell, the most basic structural and functional unit of the nervous system. It consists of a cell body and processes; the processes are of two kinds, axons and dendrites.
  • Cell body: composed of the nucleus, cell membrane, and cytoplasm; it receives and integrates input information and sends out information.
  • Dendrites: short and numerous, branching directly from widened parts of the cell body into a tree-like shape; their function is to receive impulses from the axons of other neurons and transmit them to the cell body. They receive the input information.
  • Axon: the neuron's output channel.
  • Synapse: the junction where one neuron contacts another neuron (or an effector cell) and where neurotransmitters carry the nerve impulse across; this is the part that transmits information.
  • Myelin sheath: a membrane covering the axons of nerve cells, formed by Schwann cells and their cell membranes. Its role is insulation: it prevents the electrical impulse traveling along one neuron's axon from spreading to another.

A neural network computes through the connections of a large number of artificial neurons. In most cases an artificial neural network can change its internal structure based on external information; it is an adaptive system and, generally speaking, has the ability to learn.


1.2 The artificial neuron and the perceptron

In 1957, Frank Rosenblatt, building on the MCP neuron, proposed the first perceptron learning algorithm, often described as the simplest feedforward neural network.


Imitating the MCP neuron, we build a basic artificial neuron and illustrate its signal transmission. The artificial neuron is a formal description of the biological neuron: it abstracts the information processing of the biological neuron, describes it in mathematical language, simulates the structure and function of the biological neuron, and expresses them as a model diagram.

We take a single neuron as the object of study and construct what is called the single-layer perceptron. The process of signal transmission is crucial.

A perceptron has four parts:

  1. Input values, or an input layer
  2. Weights and bias
  3. Weighted sum
  4. Activation function
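These four parts can be sketched in a few lines of Python (a minimal illustration; the weight and bias values below are arbitrary placeholders, not values from the text):

```python
import numpy as np

def perceptron(x, w, b):
    """One artificial neuron: weighted sum of inputs plus bias, then a step activation."""
    weighted_sum = np.dot(w, x) + b      # parts 1-3: inputs, weights/bias, weighted sum
    return 1 if weighted_sum > 0 else 0  # part 4: step activation function

# Placeholder parameters: this unit fires only when both inputs are 1
x = np.array([1, 1])
w = np.array([0.5, 0.5])
b = -0.7
print(perceptron(x, w, b))  # 0.5 + 0.5 - 0.7 = 0.3 > 0, so the output is 1
```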

A single nerve cell can be regarded as a machine with only two states: "yes" when activated and "no" when not.

The perceptron described in words:
The state of a nerve cell depends on the amount of input signal it receives from other nerve cells and on the synaptic strength (inhibitory or excitatory). When the sum of the signals exceeds a certain threshold, the cell body becomes excited and generates an electrical pulse. The electrical impulse travels along the axon and across synapses to other neurons.

How the perceptron's concepts simulate the behavior of the nerve cell:

  • Weight ↔ synapse
  • Bias ↔ bias (threshold)
  • Activation function ↔ cell body
  • Output ↔ axon

Multiple signals arrive at the dendrites and are integrated in the cell body; if the accumulated signal exceeds a certain threshold, an output signal is generated and transmitted along the axon.


2. The single-layer perceptron algorithm

How is this process of signal transmission implemented in an artificial neural network?


2.1 Mathematical expression of the single-layer perceptron algorithm

How can a single-layer perceptron make a "yes" / "no" judgment?
How do we describe the crucial threshold in mathematical language?
Can the perceptron implement basic logic?


Suppose the perceptron receives two input signals, x1 and x2, together with a threshold b for the judgment, and outputs one signal y that takes only the two values 0 and 1. This realizes a supervised learning algorithm for a single-layer binary linear classifier.

The input signals x1 and x2 are multiplied by the corresponding weights w1 and w2, and the neuron computes the sum of the signals sent to it. If this sum exceeds a certain threshold θ (that is, -b in the formula below), the output is y = 1; otherwise the output is y = 0. This is also described as the neuron being "activated".

  • The greater a weight, the more important the corresponding signal; weights control how easily each signal flows through.
  • b is the bias, the threshold criterion that adjusts the model.

Expressed in mathematical form:

    y = h(b + w1·x1 + w2·x2)

where h( ) is the activation function.

Activation function

  1. The activation function decides how the sum of the input signals activates the neuron.
  2. The weighted sum of the signals forms the node a; the activation function h(x) then converts the node a into the node y.

Concretely, the computation can be split into the following two formulas:

  1. a = b + w1·x1 + w2·x2
  2. y = h(a)   # the function h( ) converts a into the output y
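These two formulas translate directly into code (a sketch; h is the step function described below, and the weight and bias values are placeholders chosen so that the unit behaves like an AND gate):

```python
def h(a):
    """Step activation: 1 once the input exceeds 0, otherwise 0."""
    return 1 if a > 0 else 0

def forward(x1, x2, w1=1.0, w2=1.0, b=-1.5):
    a = b + w1 * x1 + w2 * x2  # formula 1: weighted sum plus bias
    y = h(a)                   # formula 2: the activation converts a into y
    return y

print(forward(1, 1))  # a = 0.5 > 0, so y = 1
print(forward(0, 1))  # a = -0.5 <= 0, so y = 0
```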

Choosing an appropriate activation function is also important.

The activation function in this formula switches its output at the threshold: once the input exceeds the threshold, the output flips. Such a function is called a "step function":

    h(x) = 0 if x ≤ 0;  h(x) = 1 if x > 0

So we can say that the perceptron uses the step function as its activation function. In other words, among the many candidate activation functions, the perceptron uses this one.


2.2 Implementing simple digital logic (AND, NOT, OR, and XOR gates) in Python with visual plots

0 and 1: the simplest logic

  • 0 and 1 can represent two opposite logical states: "yes" and "no".
  • AND, OR, and NOT are the three basic logical operations; all other logical operations can be composed from them. Digital logic is the foundation of the computer.

As an example, consider the truth table of the AND gate (in the table, x1 and x2 refer to the two inputs of the gate circuit, not the input signals mentioned above).

Expressed visually: the point (1,1), which lies above the line y = -x + 1.5, can be regarded as "1"; the points (0,0), (1,0), and (0,1) below the line can be regarded as "0".

import numpy as np
import matplotlib.pyplot as plt
 
x = np.linspace(-3,5, 100)
y = -x + 1.5
plt.xlim(-1,2)
plt.ylim(-1,2)

plt.scatter(0, 0,c='blue')
plt.scatter(0, 1,c='blue')
plt.scatter(1, 0,c='blue')
plt.scatter(1, 1,c='blue')

plt.plot(x, y)
 
plt.title('Linear classifier')
plt.xlabel('x')
plt.ylabel('y')
 
plt.show()

The same approach can express NAND, OR, and the rest.


How can we describe this process of digital-logic reasoning in a program?

For the AND operation, observe the following program:

import numpy as np

def AND(x1, x2):
    x = np.array([x1, x2])
    w = np.array([1, 1])   # weights
    b = 1.5                # threshold
    if np.sum(w * x) > b:  # the neuron fires only when the weighted sum exceeds b
        return 1
    else:
        return 0

if __name__ == '__main__':
    for xs in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        y = AND(xs[0], xs[1])
        print(str(xs) + " -> " + str(y))

Similarly, we can hand-build the NAND gate and the OR gate.


Single-layer perceptrons are only capable of learning linearly separable patterns: for a classification task with a step activation function, a single node produces a single line dividing the data points that form the patterns.


A single-layer perceptron can implement only the AND, NAND, and OR gates; it cannot implement the XOR gate.


Insight

Here, it is a person (me), not the computer, who determines the perceptron's parameters.

My reasoning when programming was: looking at the truth table, I decided that the model parameters (w1, w2, b) should be (1.0, 1.0, 1.5). The resulting line y = -x + 1.5 splits the four points (0,0), (0,1), (1,0), and (1,1) of the two-dimensional plane into the two classes 0 and 1. In this way the inputs x1 and x2 are classified into the output 0 or 1.

The task of machine learning is to let the computer decide the parameter values automatically. Learning is the process of determining the right parameters; what people do is design the perceptron's structure (the model), for example a neural network like the one above, and hand the training data to the computer.


2.4 The XOR gate: limitations of the single-layer perceptron

The limitation of the single-layer perceptron is that it can only represent regions divided by a single straight line.

Truth table of the XOR gate: the output is 1 only when exactly one of x1 and x2 is 1.

How can we implement the logic of the XOR gate? That is, how do we separate (0,0) and (1,1) from (0,1) and (1,0)?
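This limitation can even be checked by brute force. The following sketch (a hypothetical check, not from the original text) searches a grid of candidate weights and biases and finds no single step-activation unit that reproduces the XOR truth table:

```python
import itertools
import numpy as np

def fits_xor(w1, w2, b):
    """Does one step-activation unit with these parameters reproduce XOR?"""
    for x1, x2, target in [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]:
        y = 1 if w1 * x1 + w2 * x2 + b > 0 else 0
        if y != target:
            return False
    return True

candidates = np.arange(-2.0, 2.1, 0.1)
found = any(fits_xor(w1, w2, b)
            for w1, w2, b in itertools.product(candidates, repeat=3))
print(found)  # False: no single line separates the two XOR classes
```

Since XOR is not linearly separable, the search is guaranteed to fail no matter how fine the grid is, which is exactly the point of this section.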


3. The multilayer perceptron

The multilayer perceptron (MLP) is an artificial neural network with a feedforward structure that maps a set of input vectors to a set of output vectors.
An MLP can be seen as a directed graph consisting of multiple layers of nodes, each layer fully connected to the next. Apart from the input nodes, every node is a neuron (processing unit) with a nonlinear activation function. A supervised learning method called the backpropagation algorithm is often used to train an MLP.
The MLP is an extension of the perceptron; it overcomes the perceptron's weakness of being unable to recognize linearly inseparable data.


3.1 Implementing XOR logic by stacking existing gate circuits


3.2 A two-layer perceptron can implement the XOR gate

The method: stacking layers


Unlike the AND and NAND gates, the XOR gate needs an intermediate hidden layer to carry out a preliminary transformation in order to realize the XOR logic.

  • The XOR gate assigns weights so as to satisfy the XOR conditions.
  • It cannot be realized with a single-layer perceptron; it requires a multilayer perceptron (MLP).
  • H denotes the hidden layer, which makes XOR realizable.

import numpy as np

def NAND(x1, x2):
    x = np.array([x1, x2])
    w = np.array([1, 1])  # weights
    b = 1.5               # threshold
    if np.sum(w * x) >= b:
        return 0
    else:
        return 1

def OR(x1, x2):
    x = np.array([x1, x2])
    w = np.array([1, 1])  # weights
    b = 0.5               # threshold
    if np.sum(w * x) > b:
        return 1
    else:
        return 0

def AND(x1, x2):
    x = np.array([x1, x2])
    w = np.array([1, 1])  # weights
    b = 1.5               # threshold
    if np.sum(w * x) > b:
        return 1
    else:
        return 0

def XOR(x1, x2):
    s1 = NAND(x1, x2)  # hidden layer, first node
    s2 = OR(x1, x2)    # hidden layer, second node
    y = AND(s1, s2)    # output layer
    return y

if __name__ == '__main__':
    for xs in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        y = XOR(xs[0], xs[1])
        print(str(xs) + " -> " + str(y))

Output:

(0, 0) -> 0
(0, 1) -> 1
(1, 0) -> 1
(1, 1) -> 0


3.3 The multilayer perceptron (multi-layered perceptron)

By stacking layers, the perceptron can perform nonlinear representations; in theory it can even represent the processing performed by a computer, since gate circuits and digital logic are the foundation of modern computers.

A multilayer perceptron, or feedforward neural network with two or more layers, has greater processing power and can handle nonlinear patterns.


3.4 Neural networks: from stacked layers to a network model

  • A neural network computes through the connections of a large number of artificial neurons.
  • An artificial neural network is a mathematical model of an algorithm that imitates the behavioral characteristics of animal neural networks and performs distributed, parallel information processing. Depending on the complexity of the system, such a network processes information by adjusting the interconnections among a large number of internal nodes.

A common multilayer feedforward network consists of three parts:

  • Input layer: many neurons accept large amounts of nonlinear input information, called the input vector.
  • Output layer: information is transmitted, analyzed, and weighed through the neuron links to form the output result, called the output vector.
  • Hidden layer(s): the layers of neurons and links between the input layer and the output layer. There can be one or more hidden layers. The number of hidden-layer nodes (neurons) is not fixed, but the more there are, the more pronounced the network's nonlinearity and hence its robustness (the property of a control system to maintain certain performance under perturbations of structure, parameter size, and so on). By convention, the number of hidden nodes is chosen to be 1.2 to 1.5 times the number of input nodes.
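This three-part structure can be sketched as a forward pass in NumPy (a minimal illustration with made-up layer sizes and random, untrained parameters; the sigmoid here stands in for any nonlinear activation):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    """A smooth nonlinear activation function."""
    return 1.0 / (1.0 + np.exp(-a))

# Made-up sizes: 3 input nodes, 4 hidden nodes, 2 output nodes
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # input layer -> hidden layer
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)  # hidden layer -> output layer

def forward(x):
    hidden = sigmoid(W1 @ x + b1)       # hidden layer: weighted sums + nonlinearity
    output = sigmoid(W2 @ hidden + b2)  # output layer: the output vector
    return output

x = np.array([1.0, 0.5, -0.5])  # an input vector
print(forward(x).shape)         # (2,): one value per output node
```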

Such a network is generally called a perceptron (when it has a single hidden layer) or a multilayer perceptron (when it has multiple hidden layers). There are many types of neural networks, and this layered structure does not apply to all of them.

In the next article I will describe in detail how to train an artificial neural network.


4. Artificial neural networks: overview of the learning process

In the perceptron algorithm above, the work of setting the weights, that is, determining input-output weights that are correct and meet the expected loss, was still done by hand: in the last section we fixed the appropriate weights manually from the truth tables of the AND and OR gates.

The advantage of the neural network is that it can automatically learn the appropriate weight parameters from the data.


4.1 The learning process

The process of correcting the weights of each layer from the training samples and thereby building the model is called the learning process (the training algorithm). The specific learning method varies with the network structure and model; the commonly used backpropagation algorithm corrects the weights with the delta rule, using first-order derivatives of the output error.

4.2 The perceptron learning rule

The perceptron learning rule says that the algorithm automatically learns the optimal weight coefficients; the input features are then multiplied by these weights to decide whether the neuron fires.

The perceptron receives multiple input signals and produces an output signal if their sum exceeds a certain threshold (and no output otherwise). In the setting of supervised learning and classification, this can be used to predict the category of a sample.
If you want to know how this is implemented in code, see the next article, where I will walk through the derivation in detail.
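As a first taste of that derivation, here is a sketch of the classic perceptron learning rule applied to the AND-gate data from earlier: each weight is nudged by the prediction error times its input (the learning rate and epoch count are arbitrary choices):

```python
import numpy as np

# AND-gate training data: inputs X and target outputs T
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
T = np.array([0, 0, 0, 1])

w = np.zeros(2)  # weights, to be learned automatically
b = 0.0          # bias
eta = 0.1        # learning rate (arbitrary)

for epoch in range(20):
    for x, t in zip(X, T):
        y = 1 if np.dot(w, x) + b > 0 else 0  # current prediction
        w = w + eta * (t - y) * x             # perceptron learning rule
        b = b + eta * (t - y)

print([1 if np.dot(w, x) + b > 0 else 0 for x in X])  # [0, 0, 0, 1]
```

Because the AND data are linearly separable, the perceptron convergence theorem guarantees that this loop reaches a correct set of weights.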


Summary

This article has introduced in detail the biologically inspired idea of the artificial neural network, as introductory material for deep learning. This column, [Deep Learning Theory Chapter], uses Python 3 and tries not to rely on external libraries or tools, leading readers to build a classic deep learning network from scratch and gradually come to understand deep learning along the way.

References:

  1. Andrew Ng, the deeplearning.ai courses
  2. Koki Saito, Deep Learning from Scratch: Theory and Implementation in Python
  3. Simplilearn, "What is Perceptron: A Beginners Tutorial for Perceptron"
Original article by Jie_ MSD: https://yzsam.com/2022/03/202203021424523006.html