Summary of Andrew Ng's Machine Learning Course (11): Support Vector Machines
2022-06-28 00:11:00 【51CTO】
12.1 Optimization objective
(1) We start from the logistic regression hypothesis and its cost function for a single sample.
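In the course's usual notation these two expressions are written as follows (a restatement of the standard forms, with z = θ^T x):

$$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$$

$$\text{cost}\bigl(h_\theta(x), y\bigr) = -y \log h_\theta(x) - (1 - y) \log\bigl(1 - h_\theta(x)\bigr)$$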


(2) First, replace each logistic cost curve with the piecewise-linear curve drawn as the purple line in the figure (called cost1 for y = 1 and cost0 for y = 0). Next, drop the 1/m factor, which does not change the minimizing θ. Finally, use C in place of 1/λ (a useful way to think about it, although the two are not exactly equivalent). This turns the logistic regression cost function into the SVM cost function.
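Putting the three changes together gives the usual SVM objective:

$$\min_\theta \; C \sum_{i=1}^{m} \left[ y^{(i)} \, \text{cost}_1(\theta^T x^{(i)}) + (1 - y^{(i)}) \, \text{cost}_0(\theta^T x^{(i)}) \right] + \frac{1}{2} \sum_{j=1}^{n} \theta_j^2$$

A minimal Python sketch of this objective follows. It assumes the linear part of cost1 and cost0 has slope 1 (the hinge form); the notes do not fix the slope, and the function names and the layout of `theta` are illustrative choices.

```python
def cost1(z):
    """Cost used when y = 1: zero once z >= 1, growing linearly below that.

    The slope of 1 is an assumption (standard hinge loss).
    """
    return max(0.0, 1.0 - z)


def cost0(z):
    """Cost used when y = 0: zero once z <= -1, growing linearly above that."""
    return max(0.0, 1.0 + z)


def svm_objective(theta, X, y, C):
    """C * (sum of per-sample costs) + 0.5 * sum(theta_j^2 for j >= 1).

    theta[0] is the intercept and is not regularized; each row of X holds
    the features of one sample without the constant 1.
    """
    data_term = 0.0
    for x_i, y_i in zip(X, y):
        z = theta[0] + sum(t * v for t, v in zip(theta[1:], x_i))
        data_term += y_i * cost1(z) + (1 - y_i) * cost0(z)
    reg_term = 0.5 * sum(t * t for t in theta[1:])
    return C * data_term + reg_term
```

Note that there is no 1/m factor and that C multiplies the data term, mirroring the changes described above.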

(3) The SVM hypothesis no longer outputs a probability as logistic regression does; it outputs 0 or 1 directly.
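Written as a formula, the hypothesis is commonly stated as:

$$h_\theta(x) = \begin{cases} 1 & \text{if } \theta^T x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$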

12.2 Intuition for large margins
(1) First, the requirement on z = θ^T x is stricter. Logistic regression only needs z to be greater than or less than zero; the SVM wants z ≥ 1 when y = 1 and z ≤ -1 when y = 0.

(2) When C is very large, the optimization will try to drive the first term to zero. Assuming parameters exist that achieve this, the cost function reduces to the second (regularization) term alone.

That is, we minimize that remaining term subject to the margin constraints from (1).
(3) When C is very large (that is, λ is very small), the optimizer tries hard to satisfy these constraints for every sample, which makes the boundary very sensitive to outliers (overfitting).
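Written out, under the assumption that the first term can be driven to zero, the problem becomes:

$$\min_\theta \; \frac{1}{2} \sum_{j=1}^{n} \theta_j^2 \quad \text{subject to} \quad \theta^T x^{(i)} \ge 1 \ \text{if } y^{(i)} = 1, \qquad \theta^T x^{(i)} \le -1 \ \text{if } y^{(i)} = 0$$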

With a very large C you get the purple line. If you reduce C appropriately, you get the more satisfactory black line. In other words, when C is not too large, some outliers can be ignored.
C is the penalty factor. It sets the relative weight of the two goals being optimized (margin size and classification accuracy), that is, the tolerance for errors. The larger C is, the less error is tolerated and the easier it is to overfit; the smaller C is, the easier it is to underfit. If C is too large or too small, generalization is poor.
(4) Support vector machines are often called large margin classifiers. That description holds when C is very large; when C is not so large it no longer holds exactly, as the example above shows. The large-margin picture is still helpful for understanding SVMs.
(5) A larger C is equivalent to a smaller λ and tends to overfit; a smaller C tends to underfit.
12.3 The mathematics behind large margin classification (optional)
(1) Inner product of two vectors: u^T v equals the signed length of the projection of v onto u multiplied by the norm of u. Equivalently, it is the sum of the products of corresponding coordinates.
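The two views of the inner product can be checked numerically. The sketch below uses plain Python; the vectors and function names are arbitrary illustrations.

```python
import math


def dot(u, v):
    """Inner product as the sum of products of corresponding coordinates."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean norm of u."""
    return math.sqrt(dot(u, u))


def signed_projection(v, u):
    """Signed length of the projection of v onto u (negative if the angle exceeds 90 degrees)."""
    return dot(u, v) / norm(u)


u = [3.0, 4.0]  # norm is 5
v = [2.0, 1.0]

coordinate_form = dot(u, v)                          # 3*2 + 4*1
projection_form = signed_projection(v, u) * norm(u)  # projection length times norm of u
```

Both forms give the same number, which is the identity the next step relies on.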

(2) The objective is to make the norm of θ as small as possible. Since θ^T x equals the projection of x onto θ multiplied by the norm of θ, the constraints can be met with a small norm only if the projections of the samples onto θ are large in magnitude. Large projections correspond to a large margin, and this is the mathematics behind the SVM.
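To make the argument concrete, write p^{(i)} for the signed projection of x^{(i)} onto θ (a symbol introduced here for illustration). The constraints then read:

$$\theta^T x^{(i)} = p^{(i)} \lVert \theta \rVert, \qquad p^{(i)} \lVert \theta \rVert \ge 1 \ \text{if } y^{(i)} = 1, \qquad p^{(i)} \lVert \theta \rVert \le -1 \ \text{if } y^{(i)} = 0$$

For a positive sample with p^{(i)} = 2, the constraint only needs \lVert \theta \rVert \ge 0.5; with p^{(i)} = 0.1 it needs \lVert \theta \rVert \ge 10. Minimizing \lVert \theta \rVert therefore favors boundaries that keep every sample's projection large.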

(3) θ is perpendicular (at 90°) to the decision boundary. In addition, when θ0 is zero the boundary passes through the origin; otherwise it does not.
12.4 Kernel function 1
(1) If we fit a complicated non-linear boundary directly with polynomial features, we may need a polynomial of very high degree, which means a very large number of features.

(2) Instead, define the new features f1, f2, f3 as the similarity between x and each of a set of pre-selected landmarks l(1), l(2), l(3).

The similarity function used here, fi = exp(-||x - l(i)||² / (2σ²)), is the Gaussian kernel. Note: this function has no real connection to the normal distribution; it just looks like one.
(3) The closer x is to a landmark, the closer the corresponding f is to 1; the farther away, the closer f is to 0.
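A minimal sketch of the Gaussian kernel in plain Python shows this behavior. The landmark position, the test points, and σ = 1 are arbitrary values chosen for illustration.

```python
import math


def gaussian_kernel(x, landmark, sigma):
    """Similarity exp(-||x - landmark||^2 / (2 * sigma^2)), a value in (0, 1]."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


landmark = [3.0, 5.0]  # hypothetical landmark

f_at_landmark = gaussian_kernel([3.0, 5.0], landmark, sigma=1.0)  # distance 0, so f is 1
f_nearby = gaussian_kernel([3.5, 5.0], landmark, sigma=1.0)       # slightly below 1
f_far = gaussian_kernel([10.0, 10.0], landmark, sigma=1.0)        # very close to 0
```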

(4) With these features, classification becomes easy: predict y = 1 when θ0 + θ1f1 + θ2f2 + θ3f3 ≥ 0, and y = 0 otherwise.

(5) The values computed by the kernel function are the new features.
12.5 Kernel function 2
(1) The number of landmarks is set to the number of training samples m; that is, each training sample's location serves as a landmark, l(i) = x(i).

(2) Applying kernel functions to the support vector machine:
Given x, compute the new features f. Predict y = 1 when θ^T f ≥ 0, and y = 0 otherwise.
The cost function is modified accordingly, with θ^T f(i) taking the place of θ^T x(i).
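The feature mapping and prediction rule can be sketched as follows, assuming a Gaussian kernel and one landmark per training sample. The training points and the hand-picked `theta` are hypothetical; in practice θ comes from training.

```python
import math


def gaussian_kernel(x, landmark, sigma):
    """Similarity exp(-||x - landmark||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_features(x, X_train, sigma):
    """Map x to f = [1, f_1, ..., f_m], one similarity per training sample.

    The leading 1 is the intercept feature f_0.
    """
    return [1.0] + [gaussian_kernel(x, l, sigma) for l in X_train]


def predict(x, X_train, theta, sigma):
    """Predict 1 when theta^T f >= 0, otherwise 0."""
    f = kernel_features(x, X_train, sigma)
    score = sum(t * v for t, v in zip(theta, f))
    return 1 if score >= 0 else 0


X_train = [[0.0, 0.0], [4.0, 4.0]]  # hypothetical training samples, used as landmarks
theta = [-0.5, 1.0, 0.0]            # hypothetical weights, not learned

near_first = predict([0.2, 0.1], X_train, theta, sigma=1.0)
far_from_first = predict([4.0, 4.0], X_train, theta, sigma=1.0)
```

With these weights only closeness to the first landmark pushes the score above zero, so `near_first` is 1 and `far_from_first` is 0.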
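With one feature per training sample (so n = m), the modified objective is usually written as:

$$\min_\theta \; C \sum_{i=1}^{m} \left[ y^{(i)} \, \text{cost}_1(\theta^T f^{(i)}) + (1 - y^{(i)}) \, \text{cost}_0(\theta^T f^{(i)}) \right] + \frac{1}{2} \sum_{j=1}^{m} \theta_j^2$$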

In an actual implementation, the final regularization term also needs a slight adjustment: when computing the sum of θj² (that is, θ^T θ), use θ^T M θ instead, where M is a matrix that depends on the chosen kernel. In practice, use an existing library to run a kernel SVM rather than implementing this yourself.
An SVM without a kernel function is said to use a linear kernel.
(3) The effect of the two SVM parameters C and σ:
C = 1/λ;
When C is large (λ small): may overfit, high variance;
When C is small (λ large): may underfit, high bias;
When σ is large: may give low variance, high bias;
When σ is small: may give low bias, high variance.
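The effect of σ can be seen directly from the kernel value at a fixed distance. The sketch below is plain Python; the distance of 2 and the three σ values are arbitrary illustrations.

```python
import math


def gaussian_similarity(distance, sigma):
    """Gaussian kernel value for a point at the given distance from a landmark."""
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


# Same distance from the landmark, three hypothetical values of sigma.
distance = 2.0
similarities = {
    sigma: gaussian_similarity(distance, sigma)
    for sigma in (0.5, 1.0, 3.0)
}
# The similarity grows with sigma: close to 0 for sigma = 0.5, about 0.8 for sigma = 3.0.
```

A small σ makes each feature respond only to points very near its landmark, so the model can bend around individual samples (low bias, high variance). A large σ makes the features vary slowly, giving a smoother boundary (high bias, low variance).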
12.6 Using support vector machines
(1) You do not need to write your own SVM solver and can use an existing library directly, but you still need to do a few things:
1. Choose the parameter C. The previous video discussed how C affects bias and variance.
2. Choose the kernel (similarity function) you want to use, along with its parameters.
(2) Guidelines for choosing between logistic regression and support vector machines:
1. If the number of features n is much larger than the number of samples m, there is not enough data to train a very complex model; consider logistic regression or an SVM without a kernel (linear kernel).
2. If n is small and m is of medium size, for example n between 1 and 1000 and m between 10 and 10,000, use a support vector machine with a Gaussian kernel.
3. If n is small and m is large, for example n between 1 and 1000 and m greater than 50,000, an SVM with a Gaussian kernel is very slow. The solution is to create or add more features, then use logistic regression or an SVM without a kernel.
A neural network can perform well in all three cases, but it may be slow to train. A main reason for choosing a support vector machine is that its cost function is convex, so there are no local minima to get stuck in.
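These rules of thumb can be sketched as a small helper. This is only an illustration: the function name, the `n >= m` test standing in for "n much larger than m", and treating every sample count below 50,000 as "medium" are simplifying assumptions, not thresholds fixed by the course. The first branch follows the idea that a simple linear model suits the case with few samples and many features.

```python
def suggest_model(n_features, n_samples, large_m=50_000):
    """Return a rough model suggestion based on the sizes of n and m.

    The cutoffs are assumptions for illustration, not exact rules.
    """
    if n_features >= n_samples:
        # Many features, few samples: prefer a simple linear model.
        return "logistic regression or linear-kernel SVM"
    if n_samples < large_m:
        # Few features, moderate amount of data: a Gaussian kernel is affordable.
        return "SVM with Gaussian kernel"
    # Few features, lots of data: kernel SVM training becomes slow.
    return "add features, then logistic regression or linear-kernel SVM"


small_data = suggest_model(n_features=10_000, n_samples=1_000)
medium_data = suggest_model(n_features=100, n_samples=5_000)
big_data = suggest_model(n_features=100, n_samples=100_000)
```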
Author: Your Rego
Copyright of this article belongs to the author. Reprinting is welcome, but unless the author agrees otherwise, a link to the original must be given on the article page; the author otherwise reserves the right to pursue legal liability.