Statistical learning method -- perceptron
2022-07-07 16:14:00 【_ Spring_】
The perceptron is one of the most basic models in machine learning, and it is the foundation of both neural networks and support vector machines.
Keywords of the perceptron
- Binary classification
- Discriminative model, linear model
- The data set must be linearly separable; there are then infinitely many solutions
- Cannot solve the XOR problem
The principle of the perceptron
The perceptron is a linear model for binary classification: given the feature vector $x$ of a data instance, it outputs +1 or -1. The perceptron function is:
$$f(x) = \mathrm{sign}(w \cdot x + b)$$
where $w$ is the weight vector, $b$ is the bias, $w \cdot x$ is the inner product of $w$ and $x$, and $\mathrm{sign}$ is the sign function, namely,
$$\mathrm{sign}(x) = \begin{cases} +1 & x \ge 0 \\ -1 & x < 0 \end{cases}$$
When $w \cdot x + b$ is greater than or equal to 0, the sign function outputs +1, corresponding to the positive class; when $w \cdot x + b$ is less than 0, the output is -1, corresponding to the negative class.
Geometric interpretation of the perceptron: the linear equation $w \cdot x + b = 0$ corresponds to a hyperplane $S$ in the feature space. This hyperplane divides the feature space into two parts, and the points (feature vectors) in the two parts are classified as positive and negative respectively. The hyperplane $S$ is therefore also called the separating hyperplane.
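As a small illustration of the decision function above, here is a minimal Python sketch. The helper names (`sign`, `dot`, `predict`), the parameter values and the three sample points are assumptions chosen for the example, not part of the original text.

```python
def sign(z):
    """Sign function as defined above: +1 for z >= 0, -1 for z < 0."""
    return 1 if z >= 0 else -1


def dot(w, x):
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def predict(x, w, b):
    """Perceptron decision function f(x) = sign(w . x + b)."""
    return sign(dot(w, x) + b)


# Hypothetical parameters: the hyperplane x1 + x2 - 3 = 0 in a 2-D feature space.
w, b = [1.0, 1.0], -3.0
points = [[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]]

labels = [predict(x, w, b) for x in points]
print(labels)  # [1, 1, -1]
```

The first two points give $w \cdot x + b = 3$ and $4$, so they fall in the positive class; the third gives $-1$ and falls in the negative class.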
In a two-dimensional feature space, $w \cdot x + b = 0$ is a straight line that divides the plane into two parts. Points on one side of the line (the side that $w$ points toward) give $w \cdot x + b > 0$ and belong to the positive class, with $y = +1$; points on the other side give $w \cdot x + b < 0$ and belong to the negative class, with $y = -1$.
The learning strategy of the perceptron is to minimize the loss function:
$$\min_{w,b} L(w,b) = -\sum_{x_i \in M} y_i (w \cdot x_i + b)$$
where $M$ is the set of misclassified points.
The loss function corresponds to the total distance from the misclassified points to the separating hyperplane (up to the factor $1/\lVert w \rVert$, which is dropped). It only involves the misclassified points, not all points. Its minimum value is 0, reached when every point is classified correctly. Any hyperplane that separates the data attains this minimum, which is why there are infinitely many solutions.
The perceptron learning algorithm optimizes this loss function with stochastic gradient descent. In the original form, an initial hyperplane is chosen first, and the objective function is then minimized step by step by gradient descent; at each step, one misclassified point is selected at random and the parameters are updated along the negative gradient for that point.
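The procedure described above can be sketched in Python as follows. For a single misclassified point $(x_i, y_i)$, the gradients of the loss are $-y_i x_i$ with respect to $w$ and $-y_i$ with respect to $b$, so one descent step is $w \leftarrow w + \eta y_i x_i$, $b \leftarrow b + \eta y_i$. The learning rate `eta`, the zero initialization, the `max_updates` cap, the `seed` and the toy data are assumptions made for this sketch; a point is treated as misclassified when $y_i (w \cdot x_i + b) \le 0$.

```python
import random


def dot(w, x):
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def loss(data, w, b):
    """L(w, b) = -sum of y_i * (w . x_i + b) over the misclassified points."""
    return -sum(y * (dot(w, x) + b) for x, y in data if y * (dot(w, x) + b) <= 0)


def train_perceptron(data, eta=1.0, max_updates=1000, seed=0):
    """Original-form perceptron learning with stochastic gradient descent."""
    rng = random.Random(seed)
    w, b = [0.0] * len(data[0][0]), 0.0  # assumed initial hyperplane
    for _ in range(max_updates):
        wrong = [(x, y) for x, y in data if y * (dot(w, x) + b) <= 0]
        if not wrong:
            break  # every point is classified correctly, loss is 0
        x, y = rng.choice(wrong)  # pick one misclassified point at random
        w = [wi + eta * y * xi for wi, xi in zip(w, x)]
        b += eta * y
    return w, b


# Toy linearly separable data (assumed for illustration): (feature vector, label).
data = [([3.0, 3.0], 1), ([4.0, 3.0], 1), ([1.0, 1.0], -1)]
w, b = train_perceptron(data)
print(w, b, loss(data, w, b))
```

The returned `w` and `b` depend on the random choices, but on separable data like this the loop stops once no misclassified point remains, at which point the loss is 0.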
When the training data set is linearly separable [Add 1], the perceptron learning algorithm converges, but there are infinitely many solutions; which one is found depends on the choice of initial values and on the order in which misclassified points are selected during iteration.
To obtain a unique hyperplane, additional constraints must be imposed on the separating hyperplane; this is the idea behind the support vector machine.
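To make the dependence on initial values and selection order concrete, here is a hedged Python sketch. It swaps the random selection for a deterministic variant that repeatedly scans the points in a fixed `order`, so the runs are reproducible; the function name `train`, the learning rate `eta`, the `max_passes` cap and the toy data are assumptions for illustration.

```python
def dot(w, x):
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train(data, order, w0, b0, eta=1.0, max_passes=1000):
    """Perceptron updates, scanning the points repeatedly in the given order."""
    w, b = list(w0), b0
    for _ in range(max_passes):
        updated = False
        for i in order:
            x, y = data[i]
            if y * (dot(w, x) + b) <= 0:  # misclassified: take one gradient step
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
                updated = True
        if not updated:  # a full pass without mistakes: converged
            return w, b
    raise RuntimeError("no separating hyperplane found within max_passes")


data = [([3.0, 3.0], 1), ([4.0, 3.0], 1), ([1.0, 1.0], -1)]

sol_a = train(data, order=[0, 1, 2], w0=[0.0, 0.0], b0=0.0)
sol_b = train(data, order=[2, 1, 0], w0=[0.0, 0.0], b0=0.0)   # different order
sol_c = train(data, order=[0, 1, 2], w0=[1.0, -1.0], b0=0.0)  # different start

print(sol_a)  # ([1.0, 1.0], -3.0)
print(sol_b)  # ([1.0, 0.0], -2.0)
print(sol_c)  # ([2.0, 0.0], -3.0)
```

All three results separate the same three points, yet they are three different hyperplanes.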
Why the perceptron cannot solve the XOR problem
XOR (exclusive or) is a binary operation: the result is 0 when the two inputs are the same and 1 when they differ.
Mapped to two-dimensional space, the XOR problem can be represented as:

Image source: https://www.jianshu.com/p/853ebc9e69f6
In this two-dimensional space, no straight line can separate the points into the two classes. In other words, the perceptron model cannot put the X points on one side of a line and the O points on the other, so the perceptron cannot solve the XOR problem.
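The claim can also be checked algebraically. Suppose the inputs $(0,1)$ and $(1,0)$ are the positive class and $(0,0)$ and $(1,1)$ the negative class. A separating line would need $b < 0$, $w_1 + b > 0$, $w_2 + b > 0$ and $w_1 + w_2 + b < 0$. Adding the two middle inequalities gives $w_1 + w_2 + 2b > 0$, so $w_1 + w_2 + b > -b > 0$, which contradicts the last inequality. The Python sketch below illustrates the same thing empirically; the $\pm 1$ label encoding, the fixed scan order and the cap of 1000 passes are assumptions for the example.

```python
def dot(w, x):
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def count_errors(data, w, b):
    """Number of points with y * (w . x + b) <= 0, i.e. not strictly separated."""
    return sum(1 for x, y in data if y * (dot(w, x) + b) <= 0)


# XOR truth table, with output 1 encoded as +1 and output 0 as -1 (assumed encoding).
xor_data = [([0.0, 0.0], -1), ([0.0, 1.0], 1), ([1.0, 0.0], 1), ([1.0, 1.0], -1)]

w, b = [0.0, 0.0], 0.0
min_errors = len(xor_data)
for _ in range(1000):  # arbitrary cap: the updates never stop on their own here
    for x, y in xor_data:
        if y * (dot(w, x) + b) <= 0:
            w = [wi + y * xi for wi, xi in zip(w, x)]
            b += y
    min_errors = min(min_errors, count_errors(xor_data, w, b))

print(min_errors)  # never reaches 0
```

However long the loop runs, at least one point stays on the wrong side of the current line.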
Supplementary knowledge
- Add 1: Linear separability of data sets
Given a data set
$$T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$$
where $x_i \in X = \mathbb{R}^n$, $y_i \in Y = \{+1, -1\}$, $i = 1, 2, \ldots, N$, if there exists a hyperplane $S$
$$w \cdot x + b = 0$$
that separates the positive and negative instance points of the data set exactly onto the two sides of the hyperplane, that is, $w \cdot x_i + b > 0$ for every instance $i$ with $y_i = +1$ and $w \cdot x_i + b < 0$ for every instance $i$ with $y_i = -1$, then the data set $T$ is called a linearly separable data set; otherwise, the data set $T$ is linearly inseparable.
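The definition translates directly into a check for a candidate hyperplane. The following Python sketch uses a hypothetical helper `separates` and toy data chosen for illustration. Note that it only tests one given $(w, b)$; a `False` result says nothing about whether some other hyperplane separates the data.

```python
def dot(w, x):
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def separates(data, w, b):
    """True if w . x_i + b > 0 for every y_i = +1 and < 0 for every y_i = -1."""
    return all(
        (dot(w, x) + b > 0) if y == 1 else (dot(w, x) + b < 0)
        for x, y in data
    )


T = [([3.0, 3.0], 1), ([4.0, 3.0], 1), ([1.0, 1.0], -1)]

print(separates(T, [1.0, 1.0], -3.0))  # True
print(separates(T, [1.0, 1.0], -7.0))  # False: (3, 3) gives 3 + 3 - 7 = -1
```

Because the first hyperplane passes the check, the data set `T` is linearly separable in the sense defined above.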