Support vector machine for machine learning
2022-07-03 06:10:00 【Master core technology】
One. The concept of the support vector machine
A support vector machine (SVM) is a generalized linear classifier that performs binary classification through supervised learning. Its decision boundary is the maximum-margin hyperplane fitted to the training samples.
SVM has three key ingredients (its "three treasures"): the margin, duality, and the kernel trick.
SVM comes in three variants: hard-margin SVM, soft-margin SVM, and kernel SVM.
Given a training set $D=\{(x_1,y_1),(x_2,y_2),\ldots,(x_m,y_m)\}$ with $y_i\in\{-1,+1\}$, the goal is to find a hyperplane in the sample space that separates the samples of different classes as well as possible, so that the classification produced by this separating hyperplane is the most robust (the thick line in the original figure).
The separating hyperplane can be described by the following linear equation:
$$w^Tx+b=0$$
where $w=(w_1;w_2;\ldots;w_d)$ is the normal vector, which determines the orientation of the hyperplane, and $b$ is the offset term, which determines the distance between the hyperplane and the origin. The separating hyperplane is therefore fully determined by the normal vector $w$ and the offset $b$.
The support vector machine $f(x)=\operatorname{sign}(w^Tx+b)$ is a classic discriminative model.
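The decision rule $f(x)=\operatorname{sign}(w^Tx+b)$ can be sketched in a few lines of plain Python. The function names, the example hyperplane, and the choice to map a score of exactly 0 to +1 are assumptions made for illustration, not part of the original text.

```python
def decision_function(w, b, x):
    """Return the raw score w^T x + b for a single sample x."""
    return sum(w_j * x_j for w_j, x_j in zip(w, x)) + b


def predict(w, b, x):
    """Return f(x) = sign(w^T x + b) as +1 or -1.

    A score of exactly 0 (a point on the hyperplane) is mapped to +1;
    this tie-breaking rule is an assumption of the sketch.
    """
    return 1 if decision_function(w, b, x) >= 0 else -1


# Hypothetical hyperplane x_1 + x_2 - 3 = 0 in two dimensions.
w = [1.0, 1.0]
b = -3.0

# One point on each side of the hyperplane.
predictions = [predict(w, b, x) for x in ([3.0, 3.0], [0.0, 1.0])]
print(predictions)
```

The first point has score 3 and lands on the positive side; the second has score -2 and lands on the negative side.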
Two. Deriving the basic form of the SVM
The SVM is also called the maximum-margin classifier. Define the model as:
$$\max\ \mathrm{margin}(w,b)=\max\ \mathrm{distance}(w,b,x_i) \qquad (1)$$
$$\text{s.t.}\left\{ \begin{aligned} w^Tx_i+b\ge 0,\ y_i=+1 \\ w^Tx_i+b \le 0,\ y_i=-1 \end{aligned} \right.,\quad i=1,2,\ldots,m \qquad (2)$$
Multiplying $w$ and $b$ by the same positive constant leaves the hyperplane unchanged, so (2) can be rescaled such that the samples closest to the hyperplane satisfy $|w^Tx_i+b|=1$. This gives:
$$\text{s.t.}\left\{ \begin{aligned} w^Tx_i+b\ge 1,\ y_i=+1 \\ w^Tx_i+b \le -1,\ y_i=-1 \end{aligned} \right. \qquad (3)$$
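The rescaling step can be checked numerically. The sketch below assumes a small hypothetical data set that the starting $(w,b)$ already separates strictly; the helper names `score` and `rescale` are illustrative.

```python
def score(w, b, x):
    """Raw score w^T x + b."""
    return sum(w_j * x_j for w_j, x_j in zip(w, x)) + b


def rescale(w, b, samples):
    """Scale (w, b) by a positive constant so that min_i |w^T x_i + b| = 1."""
    closest = min(abs(score(w, b, x)) for x in samples)
    if closest == 0:
        raise ValueError("a sample lies on the hyperplane; cannot rescale")
    return [w_j / closest for w_j in w], b / closest


# Hypothetical, strictly separated toy data.
samples = [[3.0, 3.0], [0.0, 1.0], [1.0, 1.0]]
labels = [+1, -1, -1]

# (w, b) = ([2, 2], -6) describes the same hyperplane as ([1, 1], -3).
w, b = [2.0, 2.0], -6.0
w_scaled, b_scaled = rescale(w, b, samples)

# After rescaling, every sample satisfies y_i (w^T x_i + b) >= 1.
functional_margins = [
    y * score(w_scaled, b_scaled, x) for x, y in zip(samples, labels)
]
print(w_scaled, b_scaled, functional_margins)
```

The hyperplane itself does not move; only the scale of the scores changes, so that the closest sample gets a score of magnitude exactly 1.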
Let $x'$ be a point on the separating hyperplane. Then
$$w^Tx'=-b \qquad (4)$$
The distance from any point $x$ in the sample space to the hyperplane is the length of the projection of $x-x'$ onto the unit normal $w/||w||$; substituting (4) then gives:
$$r=\left|\frac{w^T}{||w||}(x-x')\right|=\frac{1}{||w||}\left|w^Tx+b\right|$$
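As a quick numerical check of the distance formula, the sketch below computes $r$ both ways: through the projection onto the unit normal and through $|w^Tx+b|/||w||$. The hyperplane, the point $x$, and the on-plane point $x'$ are hypothetical values chosen so that the arithmetic is easy to follow.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(u_j * v_j for u_j, v_j in zip(u, v))


def distance_to_hyperplane(w, b, x):
    """Distance r = |w^T x + b| / ||w|| from x to the hyperplane w^T x + b = 0."""
    return abs(dot(w, x) + b) / math.sqrt(dot(w, w))


# Hypothetical hyperplane 3*x_1 + 4*x_2 - 5 = 0, so ||w|| = 5.
w, b = [3.0, 4.0], -5.0
x = [3.0, 4.0]

# x_on_plane satisfies w^T x' = -b (here 3*3 + 4*(-1) = 5).
x_on_plane = [3.0, -1.0]

# Projection form: |w^T (x - x')| / ||w||.
difference = [x_j - p_j for x_j, p_j in zip(x, x_on_plane)]
r_projection = abs(dot(w, difference)) / math.sqrt(dot(w, w))

# Closed form after substituting w^T x' = -b.
r_closed_form = distance_to_hyperplane(w, b, x)

print(r_projection, r_closed_form)
```

Both expressions give the same value, which is what the substitution of (4) claims.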
As shown in the figure, the training samples closest to the hyperplane make the equalities in (3) hold, so for these nearest points $|w^Tx+b|=1$. They are called "support vectors". By the distance formula, each of them lies at distance
$$r=\frac{1}{||w||} \qquad (5)$$
The sum of the distances from two support vectors of different classes to the hyperplane is:
$$\gamma=\frac{2}{||w||} \qquad (6)$$
This quantity is called the margin.
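To make the support vectors and the margin concrete, the following sketch picks out the samples with $y_i(w^Tx_i+b)=1$ and evaluates $\gamma=2/||w||$. The data set and the already-rescaled $(w,b)$ are hypothetical, and the floating-point equality test via `math.isclose` is a choice of this sketch.

```python
import math


def score(w, b, x):
    """Raw score w^T x + b."""
    return sum(w_j * x_j for w_j, x_j in zip(w, x)) + b


# Hypothetical hyperplane, already scaled so the closest samples score +1 / -1.
w, b = [1.0, 1.0], -3.0
samples = [[3.0, 3.0], [2.0, 2.0], [0.0, 1.0], [1.0, 1.0]]
labels = [+1, +1, -1, -1]

# Support vectors are the samples for which y_i (w^T x_i + b) equals 1.
support_vectors = [
    x for x, y in zip(samples, labels)
    if math.isclose(y * score(w, b, x), 1.0)
]

# Margin: the sum of the distances of one support vector from each class.
norm_w = math.sqrt(sum(w_j * w_j for w_j in w))
gamma = 2.0 / norm_w

print(support_vectors, gamma)
```

Here the support vectors are (2, 2) and (1, 1), and the margin equals the distance between them measured along the normal direction.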
To find the maximum-margin separating hyperplane, we look for the parameters $w$ and $b$ that satisfy the constraints in (3) and make $\gamma$ as large as possible. Writing the two cases of (3) as the single constraint $y_i(w^Tx_i+b)\ge 1$, the problem is
$$\begin{aligned} &\max\limits_{w,b}\ \frac{2}{||w||}\\ &\text{s.t.}\ \ y_i(w^Tx_i+b)\ge 1,\quad i=1,2,\ldots,m \end{aligned}\qquad (7)$$
Clearly, maximizing the margin only requires maximizing $||w||^{-1}$, which is equivalent to minimizing $||w||^{2}$. Problem (7) can therefore be rewritten as
$$\begin{aligned} &\min\limits_{w,b}\ \frac{1}{2}||w||^2\\ &\text{s.t.}\ \ y_i(w^Tx_i+b)\ge 1,\quad i=1,2,\ldots,m \end{aligned}\qquad (8)$$
Problem (8) is the basic form of the SVM.
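Problem (8) generally needs a quadratic-programming solver, but for the special case of exactly two samples, one per class, it can be solved by hand: summing the two constraints gives $w^T(x_+-x_-)\ge 2$, so $||w||\ge 2/||x_+-x_-||$, with equality when $w$ points along $x_+-x_-$. The sketch below uses that closed form, which is an illustration for this toy case only and not a general solver. The function names and data are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(u_j * v_j for u_j, v_j in zip(u, v))


def two_point_svm(x_pos, x_neg):
    """Solve problem (8) for one positive and one negative sample.

    The optimum is w = 2 (x_pos - x_neg) / ||x_pos - x_neg||^2 with b chosen
    so that both constraints hold with equality.
    """
    diff = [p - n for p, n in zip(x_pos, x_neg)]
    d2 = dot(diff, diff)
    if d2 == 0:
        raise ValueError("the two samples coincide; no separating hyperplane")
    w = [2.0 * d / d2 for d in diff]
    b = (dot(x_neg, x_neg) - dot(x_pos, x_pos)) / d2
    return w, b


def objective(w):
    """Objective of (8): (1/2) ||w||^2."""
    return 0.5 * dot(w, w)


x_pos, x_neg = [3.0, 3.0], [1.0, 1.0]
w_opt, b_opt = two_point_svm(x_pos, x_neg)

# Both constraints of (8) are active at the optimum.
constraint_pos = +1 * (dot(w_opt, x_pos) + b_opt)
constraint_neg = -1 * (dot(w_opt, x_neg) + b_opt)

# The margin 2 / ||w|| equals the distance between the two samples.
margin = 2.0 / math.sqrt(dot(w_opt, w_opt))

# Another feasible but non-optimal choice has a larger objective value.
w_other, b_other = [1.0, 1.0], -4.0

print(w_opt, b_opt, margin, objective(w_opt), objective(w_other))
```

For these two points the separating hyperplane is the perpendicular bisector of the segment joining them, and the comparison with `w_other` shows that a feasible $(w,b)$ with a larger norm gives a narrower margin.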