Statistical learning method -- perceptron
2022-07-07 16:14:00 【_ Spring_】
The perceptron is one of the most basic models in machine learning, and it is the foundation of neural networks and support vector machines.
Keywords of the perceptron
- Binary classification
- Discriminative model, linear model
- When the data set is linearly separable, there are infinitely many solutions
- Cannot solve the XOR problem
The principle of the perceptron
The perceptron is a linear model for binary classification: given the feature vector $x$ of a data instance, it outputs +1 or -1. The perceptron function is:
$$f(x) = \mathrm{sign}(w \cdot x + b)$$
where $w$ is the weight vector, $b$ is the bias, $w \cdot x$ is the inner product of $w$ and $x$, and $\mathrm{sign}$ is the sign function, namely,
$$\mathrm{sign}(x) = \begin{cases} +1 & x \ge 0 \\ -1 & x < 0 \end{cases}$$
When $w \cdot x + b$ is greater than 0, the sign function outputs +1, corresponding to the positive class; when $w \cdot x + b$ is less than 0, the output is -1, corresponding to the negative class.
Geometric interpretation of the perceptron: the linear equation $w \cdot x + b = 0$ corresponds to a hyperplane $S$ in the feature space. This hyperplane divides the feature space into two parts, and the points (feature vectors) in the two parts are classified as positive and negative respectively. The hyperplane $S$ is therefore also called the separating hyperplane.
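The decision function above can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy is available; the function name `perceptron_predict` and the example values of `w`, `b` and `X` are chosen here for illustration and are not from the article.

```python
import numpy as np

def perceptron_predict(w, b, X):
    """Return +1 or -1 for each row of X using f(x) = sign(w . x + b)."""
    scores = X @ w + b
    # np.sign would return 0 on the hyperplane itself; the definition
    # used in this article maps a score of exactly 0 to +1.
    return np.where(scores >= 0, 1, -1)

# Illustrative parameters: the hyperplane x1 + x2 - 1 = 0.
w = np.array([1.0, 1.0])
b = -1.0
X = np.array([[2.0, 2.0],   # score  3 -> +1
              [0.0, 0.0],   # score -1 -> -1
              [0.5, 0.5]])  # score  0 -> +1 (on the hyperplane)
predictions = perceptron_predict(w, b, X)
print(predictions)
```

The third point lies exactly on the hyperplane, which shows how the $x \ge 0$ branch of the sign function is applied.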
In a two-dimensional feature space, $w \cdot x + b = 0$ is a straight line that divides the plane into two parts. Points on one side of the line (above it, in the usual picture) give $w \cdot x + b > 0$ and belong to the positive class, with label $y = +1$; points on the other side give $w \cdot x + b < 0$ and belong to the negative class, with label $y = -1$.
The strategy of perceptron learning is to minimize the loss function:
$$\min_{w,b} L(w,b) = -\sum_{x_i \in M} y_i (w \cdot x_i + b)$$
where $M$ is the set of misclassified points.
Ignoring the factor $1/\|w\|$, the loss function corresponds to the total distance from the misclassified points to the separating hyperplane. The loss function here considers only the misclassified points, not all points. Its minimum value is zero, which is reached when all points are classified correctly; every hyperplane that classifies all points correctly reaches this minimum, so there are infinitely many solutions.
The perceptron learning algorithm optimizes this loss function by stochastic gradient descent. In the original form, an initial hyperplane is chosen first, and the objective function is then minimized step by step by gradient descent: at each step, one misclassified point is selected at random and the parameters are updated along the negative gradient for that point.
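The gradient step described above can be written out explicitly. The learning rate $\eta > 0$ is an assumption introduced here for the sketch; the article does not name a step size.

```latex
% Gradient of the loss over the misclassified set M
\nabla_w L(w,b) = -\sum_{x_i \in M} y_i x_i, \qquad
\nabla_b L(w,b) = -\sum_{x_i \in M} y_i

% Stochastic update for one misclassified point (x_i, y_i),
% with an assumed learning rate \eta > 0
w \leftarrow w + \eta\, y_i x_i, \qquad
b \leftarrow b + \eta\, y_i

% Effect of the update on the same point (using y_i^2 = 1)
y_i\bigl((w + \eta y_i x_i) \cdot x_i + b + \eta y_i\bigr)
  = y_i (w \cdot x_i + b) + \eta \left( \|x_i\|^2 + 1 \right)
```

The last line shows why the update helps: for the selected point, $y_i(w \cdot x_i + b)$ increases by $\eta(\|x_i\|^2 + 1)$, so the hyperplane moves toward classifying that point correctly.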
When the training data set is linearly separable [Add 1], the perceptron learning algorithm converges, but there are infinitely many solutions. Which solution is obtained depends on the choice of initial values and on the order in which misclassified points are selected during the iterations.
To obtain a unique hyperplane, additional constraints must be placed on the separating hyperplane, as is done in support vector machines.
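The original form of the learning algorithm can be sketched as follows. This is an illustrative sketch rather than a reference implementation: the names `train_perceptron` and `perceptron_loss`, the parameters `lr`, `max_updates` and `seed`, and the small toy data set are all assumptions made for this example. Running it with different initial values and random seeds may end at different hyperplanes, each of which classifies every point correctly.

```python
import numpy as np

def train_perceptron(X, y, w0=None, b0=0.0, lr=1.0, max_updates=10_000, seed=0):
    """Original-form perceptron learning by stochastic gradient descent.

    lr, max_updates and seed are illustrative parameters (assumptions).
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1]) if w0 is None else np.asarray(w0, dtype=float).copy()
    b = float(b0)
    for _ in range(max_updates):
        # A point is misclassified when y_i * (w . x_i + b) <= 0.
        margins = y * (X @ w + b)
        wrong = np.flatnonzero(margins <= 0)
        if wrong.size == 0:
            return w, b  # loss is zero: every point is classified correctly
        i = rng.choice(wrong)  # pick one misclassified point at random
        w += lr * y[i] * X[i]
        b += lr * y[i]
    raise RuntimeError("no separating hyperplane found within max_updates")

def perceptron_loss(w, b, X, y):
    """L(w, b) = -sum of y_i * (w . x_i + b) over the misclassified points."""
    margins = y * (X @ w + b)
    return float(-margins[margins <= 0].sum())

# A small linearly separable toy data set, chosen for this example.
X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1, 1, -1])

# Two runs with different initial values and selection orders.
w_a, b_a = train_perceptron(X, y, seed=0)
w_b, b_b = train_perceptron(X, y, w0=[1.0, -1.0], b0=0.5, seed=1)
print(w_a, b_a, perceptron_loss(w_a, b_a, X, y))
print(w_b, b_b, perceptron_loss(w_b, b_b, X, y))
```

The `max_updates` cap exists only so that the function terminates on data that is not linearly separable.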
Why the perceptron cannot solve the XOR problem
XOR (exclusive or) is a binary operation: the result is 0 when the two inputs are the same and 1 when they are different.
Plotted in two-dimensional space, the XOR problem looks like this:

Image source: https://www.jianshu.com/p/853ebc9e69f6
In this two-dimensional space, no single straight line can separate the two classes. In other words, a perceptron cannot place all the X points on one side of a line and all the O points on the other side. Therefore the perceptron cannot solve the XOR problem.
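A small experiment makes this concrete. The sketch below trains a perceptron on the AND and XOR truth tables; the function name `fit_perceptron` and the cap `max_epochs` are assumptions for illustration, and the points are visited in a fixed cyclic order instead of at random to keep the run deterministic. AND is linearly separable, so training stops with no errors; for XOR, at least one point is misclassified in every pass, however many passes are allowed.

```python
import numpy as np

def fit_perceptron(X, y, max_epochs=100):
    """Cycle through the data, updating on each misclassified point.

    Returns (w, b, converged). max_epochs is an illustrative cap.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x_i, y_i in zip(X, y):
            if y_i * (w @ x_i + b) <= 0:  # misclassified point
                w += y_i * x_i
                b += y_i
                errors += 1
        if errors == 0:
            return w, b, True  # a full pass with no mistakes
    return w, b, False

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y_and = np.array([-1, -1, -1, 1])  # AND: linearly separable
y_xor = np.array([-1, 1, 1, -1])   # XOR: not linearly separable

_, _, and_converged = fit_perceptron(X, y_and)
_, _, xor_converged = fit_perceptron(X, y_xor)
print(and_converged, xor_converged)
```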
Supplementary knowledge
- Add 1: Linear separability of data sets
Given a data set
$$T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$$
where $x_i \in X = \mathbb{R}^n$, $y_i \in Y = \{+1, -1\}$, $i = 1, 2, \ldots, N$, if there exists a hyperplane $S$
$$w \cdot x + b = 0$$
that places the positive and negative instance points of the data set exactly on its two sides, that is, $w \cdot x_i + b > 0$ for every instance $i$ with $y_i = +1$ and $w \cdot x_i + b < 0$ for every instance $i$ with $y_i = -1$, then $T$ is called a linearly separable data set; otherwise, $T$ is said to be linearly inseparable.
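The condition in this definition can be checked directly for a given hyperplane. The helper below is a hypothetical sketch (the name `separates` and the sample data are assumptions); it tests whether one particular $(w, b)$ separates the data. It does not search for such a hyperplane.

```python
import numpy as np

def separates(w, b, X, y):
    """True if w . x + b = 0 puts every +1 point strictly on the positive
    side and every -1 point strictly on the negative side."""
    # Both conditions of the definition collapse into y_i * (w . x_i + b) > 0.
    return bool(np.all(y * (X @ w + b) > 0))

# Sample data chosen for this example.
X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1, 1, -1])

good = separates(np.array([1.0, 1.0]), -3.0, X, y)  # x1 + x2 - 3 = 0
bad = separates(np.array([1.0, 1.0]), 0.0, X, y)    # x1 + x2 = 0
print(good, bad)
```

A data set is linearly separable when at least one hyperplane passes this check.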