【Deep Learning Kung Fu Chapter】Chap.1 From the Perceptron to the Artificial Neural Network (ANN)
2022-06-09 06:28:00 【Jie_MSD】
This article introduces in detail the biologically inspired idea behind artificial neural networks. As introductory material for deep learning, this column 【Deep Learning Theory Chapter】 uses Python 3 and tries not to rely on external libraries or tools, leading readers to build a classic deep learning network from scratch and come to understand deep learning step by step along the way.
1. The MCP Neuron: the simplest abstract model of a brain neuron
How can we make a machine think and learn the way humans do?
Why can humans think?
How do humans learn?
1.1 The nervous system of the brain
Human consciousness arises from the brain's nervous system, and the human brain contains billions of neurons.
In 1943, the researchers Warren McCulloch and Walter Pitts proposed the first abstract model of a brain neuron, the McCulloch-Pitts neuron (MCP neuron for short), described in the figure below:
Let's review some high-school biology:
- Neuron: the nerve cell, the most basic structural and functional unit of the nervous system. It consists of a cell body and processes, of which there are two kinds: axons and dendrites.
- Cell body: composed of the nucleus, cell membrane, and cytoplasm; it receives and integrates input information and sends information out.
- Dendrites: short and numerous, branching directly from expansions of the cell body into a tree-like shape. Their function is to receive impulses from the axons of other neurons and transmit them to the cell body; they receive the input information.
- Axon: the neuron's output channel.
- Synapse: the junction between two neurons, or between a neuron and an effector cell, where neurotransmitters pass nerve impulses along and information is thereby transmitted.
- Myelin sheath: a membrane covering the axons of nerve cells, formed from Schwann cells and their cell membranes. Its role is insulation: it prevents the electrical impulse traveling along one neuron's axon from crossing over to another.

A neural network computes through a large number of interconnected artificial neurons. In most cases an artificial neural network can change its internal structure in response to external information; it is an adaptive system and, generally speaking, has the capacity to learn.
1.2 The Artificial Neuron and the Perceptron
In 1957, Frank Rosenblatt, building on the MCP neuron, proposed the first perceptron learning algorithm, which can be described as the simplest feedforward neural network.
Imitating the MCP neuron, we build a basic artificial neuron. Illustration of signal transmission: the artificial neuron is a formal description of the biological neuron. It abstracts the information-processing of the biological neuron, describes it in mathematical language, simulates the structure and function of the biological neuron, and expresses it with a model diagram.

We take a single neuron as the object of study and construct what is called a single-layer perceptron; the process by which it transmits signals is the crucial part.
A perceptron has four parts:
- Input values, or an input layer
- Weights and a bias
- A weighted sum
- An activation function

A single nerve cell can be regarded as a machine with only two states: 'yes' when activated, 'no' when not.
The perceptron described in words:
The state of a nerve cell depends on the amount of input signal received from other nerve cells and on the synaptic strength (inhibitory or excitatory). When the sum of the signals exceeds a certain threshold, the cell body becomes excited and generates an electrical pulse, which travels along the axon and across synapses to other neurons.
| Perceptron concept | Nerve-cell counterpart it simulates |
|---|---|
| Weight | Synapse |
| Bias | Firing threshold |
| Activation function | Cell body |
| Output | Axon |
Multiple signals arrive at the dendrites and are integrated in the cell body; if the accumulated signal exceeds a certain threshold, an output signal is generated and carried away by the axon.
2. The single-layer perceptron algorithm
How is this process of signal transmission implemented in an artificial neural network?
2.1 Mathematical expression of the single-layer perceptron algorithm
How do we make a single-layer perceptron deliver a 'yes'/'no' judgment?
How can the pivotal threshold be described in mathematical language?
Can a perceptron implement basic logic?
Suppose the perceptron receives two input signals x1 and x2 together with a judgment threshold b, and outputs a signal y that takes only the two values 0 and 1. This realizes a supervised learning algorithm for a single-layer binary linear classifier.
The input signals x1 and x2 are multiplied by the corresponding weights w1 and w2, and the neuron computes the sum of the weighted signals. If this sum exceeds a certain threshold θ (that is, -b in the formula below), the output is y = 1; otherwise the output is y = 0. This is also described as the neuron 'being activated'.
- The greater a weight, the more important the corresponding signal; weights control how easily a signal flows through.
- b is the bias, which adjusts the threshold criterion of the model.
Expressed in mathematical form: y = h(b + w1x1 + w2x2), where h is the activation function.
Activation function
- The activation function decides how the summed input signal triggers activation.
- The weighted sum of the signals forms the node a, which the activation function h(x) converts into the node y.
Concretely, this breaks into the following two formulas:
- a = b + w1x1 + w2x2
- y = h(a)  # the function h() converts a into the output y

Choosing an appropriate activation function is also important.
The activation function in this formula switches at the threshold: once the input crosses the threshold, the output flips. Such a function is called a 'step function', so we can say that the perceptron uses the step function as its activation function. In other words, among the many candidate activation functions, the perceptron uses the step function below.
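As a minimal sketch (the function and variable names here are my own), the step function and the two formulas above translate directly into code. With w1 = w2 = 1.0 and b = -1.5 (so that the threshold θ = -b = 1.5), this single neuron already behaves like the AND gate discussed next:

```python
import numpy as np

def step(a):
    # step activation: output 1 once the input crosses 0, otherwise 0
    return 1 if a > 0 else 0

def perceptron(x1, x2, w1, w2, b):
    # y = h(b + w1*x1 + w2*x2), with the step function as h
    a = b + w1 * x1 + w2 * x2
    return step(a)

for xs in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(xs, '->', perceptron(xs[0], xs[1], 1.0, 1.0, -1.5))
```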
2.2 Implementing simple digital logic gates (AND, NOT, OR, XOR, and so on) in Python, with plots for visualization
0 and 1: the simplest logic
- 0 and 1 can represent two opposite logical states: 'yes' and 'no'.
- AND, OR, and NOT are the three basic logical operations; every other logical operation can be composed from these three. Digital logic is the foundation of the computer.
For example, the AND gate (AND) truth table:
【In the table below, x1 and x2 refer to the two inputs of the gate circuit, not the input signals mentioned above】
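The AND gate outputs 1 only when both inputs are 1; its truth table is:

| x1 | x2 | y |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |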
Expressed visually: in the plot of the line y = -x + 1.5, the point above the line, (1,1), can be regarded as "1", while the points (0,0), (1,0), and (0,1) below it can be regarded as "0".
```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-3, 5, 100)
y = -x + 1.5                   # the separating line y = -x + 1.5
plt.xlim(-1, 2)
plt.ylim(-1, 2)
# the four input combinations of the AND gate
plt.scatter(0, 0, c='blue')
plt.scatter(0, 1, c='blue')
plt.scatter(1, 0, c='blue')
plt.scatter(1, 1, c='blue')
plt.plot(x, y)
plt.title('Linear classifier')
plt.xlabel('x')
plt.ylabel('y')
plt.show()
```
The same approach can represent NAND, OR, and the rest.
How do we describe this piece of human digital-logic reasoning as a program?
Take the "AND" operation.
Observe the following program:
```python
import numpy as np

def AND(x1, x2):
    x = np.array([x1, x2])
    w = np.array([1, 1])      # weights
    b = 1.5                   # threshold (bias)
    tmp = np.sum(w * x) - b   # weighted sum minus the threshold
    if tmp > 0:               # fires only when both inputs are 1
        return 1
    else:
        return 0

if __name__ == '__main__':
    for xs in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        y = AND(xs[0], xs[1])
        print(str(xs) + " -> " + str(y))
```
Similarly, we can hand-build the NAND gate and the OR gate.
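The NOT gate listed in the section title works the same way, as a one-input perceptron; here is a minimal sketch with a weight and bias I picked by hand (any w < 0 with 0 < b < -w would also do):

```python
def NOT(x1):
    w, b = -1.0, 0.5                 # hand-picked parameters, not unique
    return 1 if w * x1 + b > 0 else 0

print(NOT(0), NOT(1))                # -> 1 0
```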
A single-layer perceptron is only capable of learning linearly separable patterns: for a classification task with a step activation function, a single node produces a single line that divides the data points forming the patterns.
Hence a single-layer perceptron can implement the AND gate, NAND gate, and OR gate, but it cannot implement the XOR gate.
Insight
Here it is a person (me), not the computer, who determines the parameters of the perceptron.
My line of thought when writing the program was: by looking at the truth table, I determined the model parameters (w1, w2, b) = (1.0, 1.0, 1.5). The resulting line y = -x + 1.5 splits the four points (0,0), (0,1), (1,0), and (1,1) of two-dimensional space into the two classes 0 and 1, and thereby classifies the inputs x1, x2 into the output 0 or 1.
The task of machine learning is to let the computer decide the parameter values automatically. Learning is the process of determining the right parameters; what a person does is think about the structure of the perceptron (the model), for example a neural network like this one, and hand the training data over to the computer.
2.4 The XOR gate: limitations of the single-layer perceptron
The limitation of the single-layer perceptron is that it can only represent regions divided by a single straight line.
Truth table of the XOR gate: the output is 1 only when exactly one of x1 and x2 is 1.
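Written out as a table:

| x1 | x2 | y |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |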
How can XOR-gate logic be implemented?
That is, how do we separate (0,0) and (1,1) from (0,1) and (1,0)?
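A quick visual check, analogous to the AND plot above, makes the difficulty obvious: no single straight line can put the blue points on one side and the red points on the other.

```python
import matplotlib.pyplot as plt

# XOR outputs: 0 at (0,0) and (1,1); 1 at (0,1) and (1,0)
plt.scatter([0, 1], [0, 1], c='blue', label='XOR = 0')
plt.scatter([0, 1], [1, 0], c='red', label='XOR = 1')
plt.legend()
plt.title('XOR is not linearly separable')
plt.xlabel('x1')
plt.ylabel('x2')
plt.show()
```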
3. The multilayer perceptron
A multilayer perceptron (MLP) is an artificial neural network with a feedforward structure that maps a set of input vectors onto a set of output vectors.
An MLP can be viewed as a directed graph consisting of several layers of nodes, each layer fully connected to the next. Apart from the input nodes, every node is a neuron (processing unit) with a nonlinear activation function. A supervised learning technique called the backpropagation algorithm is commonly used to train MLPs.
The MLP is an extension of the perceptron; it overcomes the perceptron's inability to recognize linearly inseparable data.
3.1 Realizing XOR logic by stacking existing gate circuits


3.2 A two-layer perceptron can realize the XOR gate
The layer-stacking method
Unlike the AND and OR gates, the XOR gate needs an intermediate hidden layer to perform a preliminary transformation before the logic of the XOR gate can be realized.
- Assign weights to the XOR gate so that the XOR conditions are satisfied.
- It cannot be realized with a single-layer perceptron; it requires a multilayer perceptron, an MLP.
- H is the hidden layer, which makes XOR realizable.

```python
import numpy as np

def NAND(x1, x2):
    x = np.array([x1, x2])
    w = np.array([1, 1])    # weights
    b = 1.5                 # threshold
    if np.sum(w * x) < b:   # fires unless both inputs are 1
        return 1
    else:
        return 0

def OR(x1, x2):
    x = np.array([x1, x2])
    w = np.array([1, 1])    # weights
    b = 0.5                 # threshold
    if np.sum(w * x) > b:   # fires when at least one input is 1
        return 1
    else:
        return 0

def AND(x1, x2):
    x = np.array([x1, x2])
    w = np.array([1, 1])    # weights
    b = 1.5                 # threshold
    if np.sum(w * x) > b:   # fires only when both inputs are 1
        return 1
    else:
        return 0

def XOR(x1, x2):
    s1 = NAND(x1, x2)       # hidden layer, first node
    s2 = OR(x1, x2)         # hidden layer, second node
    y = AND(s1, s2)         # output layer
    return y

if __name__ == '__main__':
    for xs in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        y = XOR(xs[0], xs[1])
        print(str(xs) + " -> " + str(y))
```
Output:
```
(0, 0) -> 0
(0, 1) -> 1
(1, 0) -> 1
(1, 1) -> 0
```
3.3 The multilayer perceptron (multi-layered perceptron)

By stacking layers, perceptrons can perform nonlinear representations; in theory they can even represent the processing performed by a computer, since gate circuits and digital logic are the foundation of modern computers.
A multilayer perceptron, that is, a feedforward neural network with two or more layers, has greater processing capacity and can also handle nonlinear patterns.
3.4 Neural networks: stacking layers into a network model
- A neural network computes through a large number of interconnected artificial neurons.
- An artificial neural network is a mathematical or algorithmic model that imitates the behavioral characteristics of animal neural networks and performs distributed parallel information processing. Depending on the complexity of the system, such a network processes information by adjusting the interconnections among a large number of internal nodes.

A common multilayer feedforward network (Multilayer Feedforward Network) is made up of three parts:
- Input layer (Input layer): neurons (Neuron) that accept large amounts of nonlinear input. The incoming information is called the input vector.
- Output layer (Output layer): where the signals transmitted, analyzed, and weighed along the neuron links form the result. The outgoing information is called the output vector.
- Hidden layer (Hidden layer): the layers of neurons and links between the input layer and the output layer. There can be one or more hidden layers. The number of hidden-layer nodes (neurons) is not fixed, but the more nodes there are, the more pronounced the network's nonlinearity and therefore the more significant its robustness (the property of a control system to maintain a certain performance under perturbations of its structure, parameter sizes, and so on). A customary choice is 1.2 to 1.5 times the number of input nodes.
Such a network is generally called a perceptron (with a single hidden layer) or a multilayer perceptron (with multiple hidden layers). There are many kinds of neural networks, and this layered structure does not apply to all of them.
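As a small sketch of how these three parts fit together, here is a forward pass through one hidden layer written in matrix form. The weights are not trained values; they are the hand-chosen gate parameters from section 3.2, rewritten with the convention a = Wx + b (bias = negative threshold), so this particular network reproduces the XOR gate:

```python
import numpy as np

def step(a):
    # element-wise step activation
    return (a > 0).astype(int)

x  = np.array([1.0, 0.0])          # input layer: the input vector
W1 = np.array([[ 1.0,  1.0],       # hidden node 1: OR   (fires when x1+x2 > 0.5)
               [-1.0, -1.0]])      # hidden node 2: NAND (fires when x1+x2 < 1.5)
b1 = np.array([-0.5, 1.5])
W2 = np.array([[1.0, 1.0]])        # output node: AND of the two hidden nodes
b2 = np.array([-1.5])

h = step(W1 @ x + b1)              # hidden layer activations
y = step(W2 @ h + b2)              # output layer: the output vector
print(h, y)                        # [1 1] [1]  ->  XOR(1, 0) = 1
```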
The next article will describe in detail how to train an artificial neural network.
4. Artificial neural networks: an overview of the learning process
In the perceptron algorithms above, the work of setting the weights, that is, of finding input-to-output weights that are correct and meet the expected loss, was still done by hand: in the previous sections we determined appropriate weights manually from the truth tables of the AND and OR gates.
The advantage of neural networks is that they can learn appropriate weight parameters from the data automatically.
4.1 The learning process
The process of correcting the weights of each layer from training samples (learning) and thereby building a model is called the training process (training algorithm). The specific learning method varies with the network structure and model; the commonly used one is the backpropagation algorithm (backprop), which corrects the weights from the output error using the first-order-derivative delta rule.
4.2 The perceptron learning rule
The perceptron learning rule states that the algorithm automatically learns the optimal weight coefficients; the input features are then multiplied by these weights to decide whether the neuron fires.
A perceptron receives multiple input signals and emits an output signal if their weighted sum exceeds a certain threshold, and no output otherwise. In the setting of supervised learning and classification, this can be used to predict the category of a sample.
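The next article derives this in full, but as a minimal sketch of what "automatically learning the weights" means, here is Rosenblatt's classic update rule w ← w + η(t − y)x applied to the AND gate (the learning rate and epoch count are my own choices):

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
t = np.array([0, 0, 0, 1])                 # AND gate targets
w, b = np.zeros(2), 0.0
eta = 0.1                                  # learning rate

for _ in range(20):                        # a few epochs suffice here
    for xi, ti in zip(X, t):
        y = 1 if np.dot(w, xi) + b > 0 else 0
        w += eta * (ti - y) * xi           # nudge weights toward the target
        b += eta * (ti - y)

for xi in X:
    print(xi, '->', 1 if np.dot(w, xi) + b > 0 else 0)
```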
If you want to see how this is implemented in code, please go on to the next article, where I will walk through the derivation in detail.
summary
This article has introduced in detail the biologically inspired idea behind artificial neural networks. As introductory material for deep learning, this column 【Deep Learning Theory Chapter】 uses Python 3, tries not to rely on external libraries or tools, and leads readers to build a classic deep learning network from scratch, coming to understand deep learning step by step along the way.
References:
- Andrew Ng, the deeplearning.ai courses
- Saito Koki, "Deep Learning from Scratch: Theory and Implementation with Python"
- Simplilearn, "What is Perceptron: A Beginners Tutorial for Perceptron"