Summary of Andrew Ng's Machine Learning Course (11): Support Vector Machines
2022-06-28 00:11:00 【51CTO】
12.1 Optimization objective
(1) We start from the logistic regression hypothesis hθ(x) = 1 / (1 + e^(−θᵀx)) and its cost function for a single sample, −y·log hθ(x) − (1 − y)·log(1 − hθ(x)).
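As a minimal sketch of the single-sample cost, the snippet below evaluates it as a function of z = θᵀx. The names `sigmoid` and `logistic_cost` are illustrative and not taken from the course, and the code is not hardened against overflow for very large |z|.

```python
import math

def sigmoid(z):
    """Logistic function h = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_cost(z, y):
    """Cost of one sample with label y (0 or 1) and score z = theta^T x."""
    h = sigmoid(z)
    return -y * math.log(h) - (1 - y) * math.log(1.0 - h)

# For a positive sample the cost shrinks as z grows, but never reaches zero.
for z in (-2.0, 0.0, 2.0):
    print(z, round(logistic_cost(z, 1), 4))
```

The fact that the cost only approaches zero, and never reaches it, is what the SVM replaces with a curve that is exactly zero beyond a threshold.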


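(2) First, replace the two logistic cost curves with the piecewise-linear approximations drawn in purple in the lecture figure (called cost1 for y = 1 and cost0 for y = 0). Then drop the factor 1/m, which does not change the minimizer. Finally, use C in place of 1/λ (a useful way to think about it, though not an exact equivalence), so that the weight sits on the data term rather than on the regularization term. This turns the cost function of logistic regression into that of the SVM.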
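Written out, the transformation in (2) looks as follows. This is a reconstruction in the standard course notation (m samples, n features, cost_1 and cost_0 as the piecewise-linear replacements), not a copy of the original figure.

```latex
% Regularized logistic regression
\min_\theta \; \frac{1}{m}\sum_{i=1}^{m}\Big[ y^{(i)}\big(-\log h_\theta(x^{(i)})\big)
  + (1-y^{(i)})\big(-\log(1-h_\theta(x^{(i)}))\big) \Big]
  + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2

% Support vector machine
\min_\theta \; C\sum_{i=1}^{m}\Big[ y^{(i)}\,\mathrm{cost}_1(\theta^T x^{(i)})
  + (1-y^{(i)})\,\mathrm{cost}_0(\theta^T x^{(i)}) \Big]
  + \frac{1}{2}\sum_{j=1}^{n}\theta_j^2
```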

(3) The SVM hypothesis no longer outputs a probability as logistic regression does. It outputs 0 or 1 directly: hθ(x) = 1 if θᵀx ≥ 0, and 0 otherwise.
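The two replacement costs and the 0/1 hypothesis can be sketched as below. The slope of the linear part is an assumption: the standard hinge form with slope 1 is used here, and the function names are illustrative. The prediction threshold θᵀx ≥ 0 follows the usual course convention.

```python
def cost1(z):
    """Cost for a positive sample (y = 1): exactly zero once z >= 1."""
    return max(0.0, 1.0 - z)

def cost0(z):
    """Cost for a negative sample (y = 0): exactly zero once z <= -1."""
    return max(0.0, 1.0 + z)

def svm_predict(theta, x):
    """Return 1 if theta^T x >= 0, else 0. x includes the bias entry x0 = 1."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1 if z >= 0 else 0

theta = [-1.0, 2.0]                    # hypothetical parameters [theta0, theta1]
print(svm_predict(theta, [1.0, 1.5]))  # z = 2.0
print(svm_predict(theta, [1.0, 0.0]))  # z = -1.0
```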

12.2 Large margin intuition
(1) First, the requirement on z = θᵀx is stricter. Logistic regression only needs z to be greater than or less than zero; the SVM asks for z ≥ 1 when y = 1 and z ≤ −1 when y = 0.

(2) When C is very large, the optimization tries to drive the first term to zero. Assuming parameters that achieve this exist, the problem reduces to minimizing the regularization term (1/2)·Σ θj² subject to the constraints θᵀx(i) ≥ 1 when y(i) = 1 and θᵀx(i) ≤ −1 when y(i) = 0.
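In the standard course notation, the constrained problem described in (2) can be reconstructed as:

```latex
\min_\theta \; \frac{1}{2}\sum_{j=1}^{n}\theta_j^2
\quad \text{s.t.} \quad
\begin{cases}
\theta^T x^{(i)} \ge 1  & \text{if } y^{(i)} = 1 \\
\theta^T x^{(i)} \le -1 & \text{if } y^{(i)} = 0
\end{cases}
```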
(3) When C is very large (that is, λ is very small), the optimizer tries hard to satisfy the constraints above for every sample, which makes it very sensitive to outliers (overfitting), as in the lecture figure:

A single outlier produces the purple boundary. If C is reduced appropriately, we get the more reasonable black boundary instead; when C is not so large, some outliers can be ignored.
C is the penalty factor. It sets the relative weight of the two goals being optimized (margin size and classification accuracy), in other words the tolerance for errors. The larger C is, the less error is tolerated and the easier it is to overfit; the smaller C is, the easier it is to underfit. If C is too large or too small, generalization suffers.
(4) Support vector machines are often called large margin classifiers. This holds when C is very large; when C is not so large it no longer holds strictly, as the example above shows. The picture is still helpful for understanding the SVM.
(5) A larger C is equivalent to a smaller λ and tends to overfit; a smaller C tends to underfit.
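The role of C can be illustrated by comparing two candidate boundaries on a small data set with one outlier. This is only a sketch: the 1-D data, the two hand-picked candidates and the hinge-style costs are assumptions made for illustration, and neither candidate is claimed to be the true minimizer.

```python
def cost1(z):
    return max(0.0, 1.0 - z)

def cost0(z):
    return max(0.0, 1.0 + z)

def svm_objective(theta0, theta1, samples, C):
    """C * (sum of per-sample costs) + 0.5 * theta1^2; theta0 is not regularized."""
    data_term = 0.0
    for x, y in samples:
        z = theta0 + theta1 * x
        data_term += cost1(z) if y == 1 else cost0(z)
    return C * data_term + 0.5 * theta1 ** 2

# Hypothetical 1-D data: negatives on the left, positives on the right,
# plus one positive outlier at x = -1 sitting next to the negatives.
samples = [(-3.0, 0), (-2.0, 0), (2.0, 1), (3.0, 1), (-1.0, 1)]

wide = (0.0, 0.5)    # boundary at x = 0: large margin, outlier on the wrong side
tight = (3.0, 2.0)   # boundary at x = -1.5: small margin, every constraint met

for C in (0.1, 100.0):
    j_wide = svm_objective(*wide, samples, C)
    j_tight = svm_objective(*tight, samples, C)
    print(C, "prefers", "wide" if j_wide < j_tight else "tight")
```

With these numbers the wide-margin candidate costs 1.5·C + 0.125 and the tight one costs 2, so the preference flips at C = 1.25: a small C ignores the outlier, a large C bends the boundary to accommodate it.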
12.3 The mathematics behind large margin classification (optional)
(1) The inner product of two vectors: the signed length of the projection of one vector onto the other, multiplied by the norm of the vector being projected onto. Equivalently, multiply the corresponding coordinates and add them up.
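A short check that the two descriptions of the inner product agree. The vectors are arbitrary examples and the helper names are illustrative.

```python
import math

def dot(u, v):
    """Inner product: multiply corresponding coordinates and add."""
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

def projection_length(v, u):
    """Signed length of the projection of v onto u."""
    return dot(u, v) / norm(u)

u = [4.0, 2.0]
v = [1.0, 3.0]
p = projection_length(v, u)
# Coordinate form and projection form give the same number.
print(dot(u, v), p * norm(u))
```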

(2) The objective is to make ‖θ‖ as small as possible. Since θᵀx(i) = p(i)·‖θ‖, where p(i) is the projection of x(i) onto θ, the constraints p(i)·‖θ‖ ≥ 1 (or ≤ −1) can be satisfied with a small ‖θ‖ only if the projections are large in absolute value. Large projections mean a large margin, and this is the mathematics behind the SVM.

(3) θ is perpendicular (at 90°) to the decision boundary. When θ0 is zero the boundary passes through the origin; otherwise it does not.
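Point (2) can be made concrete: if every sample's projection onto θ has absolute value at least p, the constraints need ‖θ‖ ≥ 1/p. The sketch below compares two directions for θ on hypothetical 2-D samples, assuming θ0 = 0 and that each direction puts the two classes on the correct sides.

```python
import math

def smallest_norm(direction, samples):
    """Smallest ||theta|| along `direction` with |theta^T x| >= 1 for all samples."""
    length = math.hypot(*direction)
    unit = [d / length for d in direction]
    # Signed projection of each sample onto the unit vector along theta.
    projections = [sum(u * xi for u, xi in zip(unit, x)) for x in samples]
    return 1.0 / min(abs(p) for p in projections)

# Hypothetical samples: two positives (upper right), two negatives (lower left).
samples = [[2.0, 0.5], [0.5, 2.0], [-2.0, -0.5], [-0.5, -2.0]]

print(smallest_norm([1.0, 1.0], samples))  # large projections, small norm needed
print(smallest_norm([1.0, 0.0], samples))  # small projections, large norm needed
```

The direction with the larger projections needs the smaller ‖θ‖, so minimizing ‖θ‖ selects the boundary with the larger margin.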
12.4 Kernels 1
(1) If we fit a nonlinear decision boundary like the one in the lecture figure directly with polynomial features, we may need a polynomial of very high degree, and therefore a very large number of features.

(2) Instead, compute new features f1, f2, f3 from how close x is to pre-selected landmarks l(1), l(2), l(3):
fi = similarity(x, l(i)) = exp(−‖x − l(i)‖² / (2σ²))
This similarity function is the Gaussian kernel. Note: it has nothing to do with the normal distribution; it just looks like it.
(3) The closer x is to a landmark, the closer the corresponding feature f is to 1; the farther away, the closer f is to 0.

(4) With these features, classification becomes easy: predict y = 1 when θ0 + θ1f1 + θ2f2 + θ3f3 ≥ 0.

(5) The output of the kernel function is a new feature.
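A minimal sketch of the Gaussian kernel features and of the thresholded prediction. The landmark positions and the values of θ are hypothetical, chosen only so that points near the first two landmarks are classified as positive; the threshold rule is the usual course convention.

```python
import math

def gaussian_kernel(x, landmark, sigma):
    """similarity(x, l) = exp(-||x - l||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def predict(x, landmarks, theta, sigma):
    """Predict 1 when theta0 + theta1*f1 + theta2*f2 + ... >= 0, else 0."""
    features = [gaussian_kernel(x, l, sigma) for l in landmarks]
    score = theta[0] + sum(t * f for t, f in zip(theta[1:], features))
    return 1 if score >= 0 else 0

# Hypothetical landmarks and parameters, for illustration only.
landmarks = [[1.0, 1.0], [4.0, 1.0], [2.5, 5.0]]
theta = [-0.5, 1.0, 1.0, 0.0]

print(predict([1.1, 0.9], landmarks, theta, sigma=1.0))  # close to l(1)
print(predict([8.0, 8.0], landmarks, theta, sigma=1.0))  # far from every landmark
```

A point near l(1) gets f1 close to 1, which outweighs θ0 = −0.5; a point far from all landmarks has every feature near 0, so the score stays negative.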
12.5 Kernels 2
(1) The number of landmarks is set to the number of training samples m, and each training sample's location is used as a landmark: l(1) = x(1), l(2) = x(2), ..., l(m) = x(m).

(2) Applying the kernel to the support vector machine:
Given x, compute the new features f; predict y = 1 when θᵀf > 0, and y = 0 otherwise.
The cost function is modified accordingly:
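A reconstruction of the modified cost function in the standard course notation, where f^(i) is the kernel feature vector of sample i. Because there is one feature per landmark, the regularization sum runs to m rather than n.

```latex
\min_\theta \; C\sum_{i=1}^{m}\Big[ y^{(i)}\,\mathrm{cost}_1(\theta^T f^{(i)})
  + (1-y^{(i)})\,\mathrm{cost}_0(\theta^T f^{(i)}) \Big]
  + \frac{1}{2}\sum_{j=1}^{m}\theta_j^2
```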

In a practical implementation, the final regularization term is also adjusted slightly: when computing Σ θj² = θᵀθ, the expression θᵀMθ is used instead, where M is a matrix that depends on the chosen kernel. In practice, use an existing library to train an SVM with a kernel rather than implementing this yourself.
An SVM without a kernel function is said to use a linear kernel.
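The feature mapping in (1) and the prediction rule in (2) can be sketched as follows. The training set and θ are hypothetical and hand-picked; in practice θ comes from a library's optimizer.

```python
import math

def gaussian_kernel(x, landmark, sigma):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_features(x, training_samples, sigma):
    """Map x to f = [1, f1, ..., fm], with one feature per training sample."""
    return [1.0] + [gaussian_kernel(x, l, sigma) for l in training_samples]

def predict(x, training_samples, theta, sigma):
    """Predict 1 when theta^T f > 0, otherwise 0."""
    f = kernel_features(x, training_samples, sigma)
    return 1 if sum(t * fi for t, fi in zip(theta, f)) > 0 else 0

# Hypothetical training set (m = 4) and a hand-picked theta of length m + 1.
X_train = [[0.0, 0.0], [0.5, 0.2], [3.0, 3.0], [3.2, 2.7]]
theta = [0.0, -1.0, -1.0, 1.0, 1.0]

f = kernel_features(X_train[0], X_train, sigma=1.0)
print(len(f), f[1])  # m + 1 features; a sample has similarity 1 to itself
print(predict([2.9, 3.1], X_train, theta, sigma=1.0))
```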
(3) The effect of the two SVM parameters C and σ:
C = 1/λ;
When C is large (λ small), the model may overfit: low bias, high variance;
When C is small (λ large), the model may underfit: high bias, low variance;
When σ is large, the features vary more smoothly, which may give high bias and low variance;
When σ is small, the features vary less smoothly, which may give low bias and high variance.
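The effect of σ can be seen directly in the kernel values. The sketch below tabulates the Gaussian kernel at fixed distances from a single landmark for a few arbitrary values of σ.

```python
import math

def gaussian_kernel(x, landmark, sigma):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

landmark = [0.0]
for sigma in (0.5, 1.0, 3.0):
    # Feature value at distances 0, 1 and 2 from the landmark.
    row = [round(gaussian_kernel([d], landmark, sigma), 3) for d in (0.0, 1.0, 2.0)]
    print(sigma, row)
```

With a small σ the feature falls off sharply, so each landmark influences only its immediate neighborhood and the boundary can become very wiggly. With a large σ the feature decays slowly and the resulting boundary is smoother.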
12.6 Using support vector machines
(1) You do not need to write the SVM optimizer yourself; use an existing library. A few things are still up to you:
1. Choose the parameter C. Its effect on bias and variance was discussed in the previous video.
2. Choose the kernel (similarity function) you want to use, along with its parameters.
(2) Guidelines for choosing between logistic regression and support vector machines:
1. If the number of features n is much larger than the number of samples m, there is not enough data to train a very complex model; use logistic regression or an SVM without a kernel.
2. If n is small and m is intermediate, for example n between 1 and 1000 and m between 10 and 10000, use an SVM with a Gaussian kernel.
3. If n is small and m is large, for example n between 1 and 1000 and m greater than 50000, an SVM with a Gaussian kernel is very slow to train. The remedy is to create more features, then use logistic regression or an SVM without a kernel.
A neural network is likely to perform well in all three cases, but it may be slow to train. A major reason for choosing an SVM is that its cost function is convex, so there are no local minima to get stuck in.
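The rules of thumb above can be written as a small helper. This is a sketch only: the function name, the factor of 10 used for "n much larger than m", and the handling of sample counts between the quoted ranges are assumptions, not part of the course.

```python
def suggest_model(n, m):
    """Rule-of-thumb model choice from feature count n and sample count m."""
    if n >= 10 * m:
        # Far more features than samples: keep the model simple.
        return "logistic regression or SVM without a kernel"
    if m > 50000:
        # Kernel SVM training is slow at this scale.
        return "add features, then logistic regression or SVM without a kernel"
    # Small n with a moderate number of samples.
    return "SVM with a Gaussian kernel"

print(suggest_model(n=10000, m=100))
print(suggest_model(n=100, m=5000))
print(suggest_model(n=100, m=100000))
```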
author : Your Rego
The copyright of this article belongs to the author. Reprints are welcome, but unless the author agrees otherwise, a link to the original must be given on the article page; otherwise the right to pursue legal liability is reserved.