Summary of Andrew Ng's Machine Learning Course (11): Support Vector Machines
2022-06-28 00:11:00 【51CTO】
12.1 Optimization objective
(1) Below are the logistic regression hypothesis and the cost function for a single sample:


(2) First, replace the logistic cost curves with the piecewise-linear purple lines in the figure above (called cost1 and cost0). Then drop the 1/m factor, and finally use C in place of 1/λ (a useful way to think about it, although the two are not exactly equivalent). This turns the cost function of logistic regression into the cost function of the SVM.
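The substitution in (2) can be sketched in Python. This is a minimal illustration, assuming the common hinge form cost1(z) = max(0, 1 - z) and cost0(z) = max(0, 1 + z) for the two purple lines (the notes do not give their exact slope); the function names are hypothetical.

```python
import math

def logistic_cost(z, y):
    """Single-sample logistic regression cost for a label y in {0, 1}."""
    h = 1.0 / (1.0 + math.exp(-z))
    return -math.log(h) if y == 1 else -math.log(1.0 - h)

def cost1(z):
    """Replacement for the y = 1 curve (assumed hinge form): zero once z >= 1."""
    return max(0.0, 1.0 - z)

def cost0(z):
    """Replacement for the y = 0 curve (assumed hinge form): zero once z <= -1."""
    return max(0.0, 1.0 + z)

def svm_cost(theta, X, y, C):
    """C * sum of per-sample costs + 0.5 * sum(theta_j^2 for j >= 1).

    theta[0] is the intercept and is not regularized; each row of X is a
    feature vector without the leading 1.
    """
    total = 0.0
    for x_i, y_i in zip(X, y):
        z = theta[0] + sum(t * v for t, v in zip(theta[1:], x_i))
        total += cost1(z) if y_i == 1 else cost0(z)
    return C * total + 0.5 * sum(t * t for t in theta[1:])
```

Compared with the logistic regression cost, the 1/m factor is gone and C multiplies the data term instead of λ multiplying the regularization term.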

(3) The output of the SVM is no longer a probability as in logistic regression; it is directly 0 or 1:
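As a small sketch of (3), the hypothesis can be written as a hard threshold on z = θ^T x. The function name `svm_predict` is hypothetical, and the threshold at zero follows the course convention.

```python
def svm_predict(theta, x):
    """Return 1 if theta^T x >= 0, else 0 (theta[0] is the intercept)."""
    z = theta[0] + sum(t * v for t, v in zip(theta[1:], x))
    return 1 if z >= 0 else 0
```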

12.2 Large margin intuition
(1) First, the requirement on z = θ^T x is stricter. Logistic regression only requires z to be greater than zero (for y=1) or less than zero (for y=0); the SVM requires z to be greater than or equal to 1, or less than or equal to -1.

(2) When C is very large, the optimization tries to drive the first term to zero. Assuming we can find parameters that achieve this, the cost function reduces to:

That is, we minimize the remaining regularization term subject to the constraints from (1): z greater than or equal to 1 for positive samples and z less than or equal to -1 for negative samples.
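Written out, the reduced problem in (2) has the form below. This is a reconstruction in the course's usual notation, not a copy of the original figure, and it assumes that θ0 is left out of the sum:

```latex
\min_{\theta}\ \frac{1}{2}\sum_{j=1}^{n}\theta_j^{2}
\quad\text{s.t.}\quad
\begin{cases}
\theta^{T}x^{(i)} \ge 1 & \text{if } y^{(i)} = 1,\\
\theta^{T}x^{(i)} \le -1 & \text{if } y^{(i)} = 0.
\end{cases}
```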
(3) When C is very large (i.e. λ is very small), the optimizer tries hard to satisfy the constraints above for every sample. This makes the result very sensitive to outliers (overfitting), as shown below:

With a very large C you get the purple line. If C is reduced appropriately, you get the more reasonable black line. In other words, when C is not so large, some outliers can be ignored.
C is the penalty factor. It can be understood as the weight that trades off the two optimization goals (margin size and classification accuracy), i.e. the tolerance for errors. The larger C is, the less error is tolerated and the easier it is to overfit; the smaller C is, the easier it is to underfit. If C is too large or too small, generalization is poor.
(4) Support vector machines are often called large margin classifiers. This is accurate when C is very large; when C is not so large it no longer strictly holds, as the example above shows. The picture is still helpful for understanding the SVM.
(5) A larger C is equivalent to a smaller λ and tends to overfit; a smaller C tends to underfit.
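The effect of C on outliers can be illustrated with a toy comparison. This is only a sketch: it assumes hinge-shaped cost1/cost0, uses made-up 1-D data, and compares two hand-picked candidate boundaries instead of running a real optimizer.

```python
def cost1(z):
    return max(0.0, 1.0 - z)

def cost0(z):
    return max(0.0, 1.0 + z)

def objective(theta0, theta1, data, C):
    """SVM objective for 1-D inputs: C * sum(costs) + 0.5 * theta1^2."""
    loss = 0.0
    for x, y in data:
        z = theta0 + theta1 * x
        loss += cost1(z) if y == 1 else cost0(z)
    return C * loss + 0.5 * theta1 ** 2

# Hypothetical data: two clean clusters plus one positive outlier at x = -2.
data = [(-4.0, 0), (-3.0, 0), (3.0, 1), (4.0, 1), (-2.0, 1)]

# Candidate A: boundary at x = -2.5. Every constraint is met, margin is only 0.5.
narrow = (5.0, 2.0)
# Candidate B: boundary at x = 0 with margin 3. The outlier violates its constraint.
wide = (0.0, 1.0 / 3.0)

def preferred(C):
    """Return the candidate with the lower objective for a given C."""
    a = objective(*narrow, data, C)
    b = objective(*wide, data, C)
    return "narrow" if a < b else "wide"

print(preferred(100.0))  # narrow: a large C bends the boundary around the outlier
print(preferred(0.1))    # wide: a small C accepts the outlier and keeps the margin
```

The narrow candidate plays the role of the purple line and the wide candidate the role of the black line.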
12.3 The mathematics behind large margin classification (optional)
(1) The inner product of two vectors: the signed length of the projection of one vector onto the other, multiplied by the norm of the other vector. This equals the sum of the products of the corresponding coordinates.
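The two descriptions of the inner product in (1) can be checked numerically. The sketch below is limited to 2-D vectors so that the projection can be computed from angles, independently of the coordinate formula; the function names are hypothetical.

```python
import math

def dot(u, v):
    """Inner product as the sum of coordinate-wise products."""
    return sum(a * b for a, b in zip(u, v))

def projection_times_norm(u, v):
    """2-D only: (signed projection of u onto v) * ||v||, computed from angles."""
    angle = math.atan2(u[1], u[0]) - math.atan2(v[1], v[0])
    p = math.hypot(*u) * math.cos(angle)
    return p * math.hypot(*v)

u, v = (3.0, 4.0), (2.0, 1.0)
print(dot(u, v))                    # 10.0
print(projection_times_norm(u, v))  # agrees with 10.0 up to rounding
```

When the angle between the vectors exceeds 90°, the projection and therefore the inner product become negative.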

(2) The objective is to make the norm of θ as small as possible. Because θ^T x equals the projection of x onto θ multiplied by the norm of θ, the constraints can be satisfied with a smaller θ only if the projections of the samples onto θ are as large as possible. This is the mathematics behind the SVM's large margin.
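In symbols, the argument in (2) can be sketched as follows. The notation p^(i) for the signed projection of x^(i) onto θ is an assumption borrowed from the course, and θ0 = 0 is assumed for simplicity:

```latex
\min_{\theta}\ \frac{1}{2}\lVert\theta\rVert^{2}
\quad\text{s.t.}\quad
\begin{cases}
p^{(i)}\,\lVert\theta\rVert \ge 1 & \text{if } y^{(i)} = 1,\\
p^{(i)}\,\lVert\theta\rVert \le -1 & \text{if } y^{(i)} = 0,
\end{cases}
\qquad\text{where } \theta^{T}x^{(i)} = p^{(i)}\,\lVert\theta\rVert .
```

A boundary with a small margin gives small |p^(i)|, which forces a large norm of θ; a boundary with a large margin allows a small norm, so the minimization prefers it.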

(3) θ is perpendicular (at 90°) to the decision boundary. In addition, when θ0 is zero the boundary passes through the origin; otherwise it does not.
12.4 Kernels 1
(1) If polynomial features are used directly to fit the boundary below, a very high-degree polynomial may be required, which means a large number of features.

(2) Given x, compute new features f1, f2, f3 from how close x is to each of the pre-selected landmarks l(1), l(2), l(3).

The similarity function above is a Gaussian kernel. Note: despite the resemblance, this function has nothing to do with the normal distribution.
(3) The closer x is to a landmark, the closer the feature f is to 1; the farther away, the closer f is to 0.
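A minimal sketch of the Gaussian kernel and the behavior described in (3), assuming the standard form exp(-||x - l||^2 / (2σ^2)); the landmark coordinates are made up for illustration.

```python
import math

def gaussian_kernel(x, landmark, sigma):
    """Similarity exp(-||x - l||^2 / (2 * sigma^2)) between x and a landmark."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

l1 = (3.0, 5.0)  # hypothetical landmark
print(gaussian_kernel((3.0, 5.0), l1, sigma=1.0))    # 1.0, x sits on the landmark
print(gaussian_kernel((3.1, 5.0), l1, sigma=1.0))    # slightly below 1
print(gaussian_kernel((30.0, 50.0), l1, sigma=1.0))  # practically 0
```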

(4) With these features, classification becomes easy using the following formula:

(5) The value computed by the kernel function is a new feature.
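The classification step can be sketched as follows, assuming the rule "predict 1 when θ0 + θ1·f1 + θ2·f2 + θ3·f3 ≥ 0". The landmark positions and the θ values are hypothetical and chosen only to make the behavior visible.

```python
import math

def gaussian_kernel(x, landmark, sigma=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

# Hypothetical landmarks and parameters, for illustration only.
landmarks = [(1.0, 1.0), (4.0, 4.0), (9.0, 0.0)]
theta = [-0.5, 1.0, 1.0, 0.0]  # theta0, theta1, theta2, theta3

def predict(x):
    """Predict 1 when theta0 + theta1*f1 + theta2*f2 + theta3*f3 >= 0."""
    f = [gaussian_kernel(x, l) for l in landmarks]
    z = theta[0] + sum(t * fi for t, fi in zip(theta[1:], f))
    return 1 if z >= 0 else 0

print(predict((1.0, 1.2)))  # 1: near l(1), so f1 is close to 1
print(predict((9.0, 0.0)))  # 0: on l(3), whose weight theta3 is 0
```

With these parameters, points near l(1) or l(2) are classified as positive and everything else as negative, which gives a non-linear boundary without any polynomial features.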
12.5 Kernels 2
(1) The number of landmarks is set to the number of samples m, i.e. the location of each training sample is used as a landmark:
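A sketch of this feature mapping, with every training sample serving as a landmark. The helper name `kernel_features`, the sample values, and the extra intercept feature f0 = 1 are assumptions for illustration.

```python
import math

def gaussian_kernel(x, landmark, sigma):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_features(x, training_set, sigma):
    """Map x to [f0, f1, ..., fm] with f0 = 1 and one feature per training sample."""
    return [1.0] + [gaussian_kernel(x, l, sigma) for l in training_set]

X_train = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]  # hypothetical samples, m = 3
f = kernel_features(X_train[1], X_train, sigma=1.0)
print(len(f))  # 4, i.e. m + 1
print(f[2])    # 1.0, because the second sample is its own landmark
```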

(2) Applying kernels to the support vector machine:
Given x, compute the new features f. When θ^T f > 0, predict y=1; otherwise predict y=0.
The cost function is modified accordingly:
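With the kernel features substituted for x, the cost function can be written as below. This is a reconstruction in the course's notation, assuming one feature per training sample, so the regularization sum runs over m parameters:

```latex
\min_{\theta}\ C\sum_{i=1}^{m}\Big[\,y^{(i)}\,\mathrm{cost}_1\!\big(\theta^{T}f^{(i)}\big)
+\big(1-y^{(i)}\big)\,\mathrm{cost}_0\!\big(\theta^{T}f^{(i)}\big)\Big]
+\frac{1}{2}\sum_{j=1}^{m}\theta_j^{2}
```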

In a real implementation, the regularization term also needs a small adjustment: when computing the sum of θj^2 = θ^T θ, we use θ^T M θ instead of θ^T θ, where M is a matrix that depends on the chosen kernel. In practice, kernel SVMs are trained with an existing library.
An SVM without a kernel is said to use a linear kernel.
(3) The effect of the two SVM parameters C and σ:
C = 1/λ;
When C is large (equivalent to a small λ): may overfit, high variance;
When C is small (equivalent to a large λ): may underfit, high bias;
When σ is large: low variance, high bias;
When σ is small: low bias, high variance.
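The role of σ can be seen by evaluating the Gaussian kernel at a fixed distance from a landmark for several values of σ. This is a minimal 1-D sketch; the distance and the σ values are arbitrary.

```python
import math

def gaussian_kernel_1d(x, landmark, sigma):
    return math.exp(-((x - landmark) ** 2) / (2.0 * sigma ** 2))

# Feature value at distance 2 from the landmark, for several sigma values.
values = {sigma: gaussian_kernel_1d(2.0, 0.0, sigma) for sigma in (0.5, 1.0, 3.0)}
for sigma, value in values.items():
    print(sigma, round(value, 4))
```

A larger σ makes the feature fall off more slowly with distance, so the resulting decision function is smoother (higher bias, lower variance); a smaller σ makes each landmark's influence very local (lower bias, higher variance).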
12.6 Using an SVM
(1) Although you do not need to write the SVM solver yourself and can use an existing library directly, a few things still have to be done:
1. Choose the parameter C. The previous video discussed the effect of C on bias and variance.
2. Choose the kernel (similarity function) you want to use, together with its parameters.
(2) Guidelines for choosing between logistic regression and the support vector machine:
1. If the number of features n is much larger than the number of samples m, there is not enough data to train a very complex model; consider logistic regression or an SVM without a kernel.
2. If n is small and m is of medium size, for example n between 1 and 1000 and m between 10 and 10000, use an SVM with a Gaussian kernel.
3. If n is small and m is large, for example n between 1 and 1000 and m greater than 50000, an SVM with a Gaussian kernel is very slow. The solution is to create and add more features, then use logistic regression or an SVM without a kernel.
A neural network is likely to perform well in all three cases, but it can be very slow to train. A main reason for choosing the SVM is that its cost function is convex, so there are no local minima.
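The three cases above can be condensed into a rough rule of thumb. The sketch below is only illustrative: the function name is hypothetical, "n much larger than m" is simplified to n > m, and the threshold of 50000 is taken from the example in case 3.

```python
def suggest_model(n, m):
    """Rule-of-thumb model choice from feature count n and sample count m."""
    if n > m:
        # Case 1 (simplified test for "n much larger than m"): too little
        # data for a complex non-linear model.
        return "logistic regression or SVM without a kernel"
    if m > 50000:
        # Case 3: a Gaussian-kernel SVM would be slow at this scale.
        return "add features, then logistic regression or SVM without a kernel"
    # Case 2: small n, moderate m.
    return "SVM with Gaussian kernel"

print(suggest_model(n=10000, m=500))
print(suggest_model(n=100, m=5000))
print(suggest_model(n=100, m=200000))
```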
Author: Your Rego
The copyright of this article belongs to the author. Reprinting is welcome, but unless the author agrees otherwise, a link to the original must be given on the article page; otherwise the author reserves the right to pursue legal responsibility.