
[CV-Learning] Linear Classifier (SVM Basics)

2022-08-04 06:14:00 【Xiao Liang has to work hard】

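Dataset Introduction (Used in This Article)

The CIFAR10 Dataset

It contains 50,000 training samples and 10,000 test samples, divided into ten categories: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. The images are color images of size 32×32.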

Image Types (Pixel Representation)

Binary images (0/1)
Grayscale images
Each pixel is represented by one byte, with values from 0 to 255.
Intensity: black (0) → white (255)
Color images
An image has three channels: red, green, and blue. Each pixel in each channel is represented by one byte, with values from 0 to 255. The three channel images combine to represent one color image. For example, if the image is 500×500 pixels, a 500×500×3 array is needed to represent it.
Intensity (red channel, for example): black (0) → red (255)
Ps: Most classification algorithms require a vector as input, so the image matrix is converted to a vector.

Therefore, each image in the dataset used in this article is converted into a 3072-dimensional (32×32×3) column vector.
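The conversion from image array to column vector can be sketched in a few lines of NumPy. This is only an illustration: the random array below is a hypothetical stand-in for a real CIFAR10 image, and the row-major flattening order is an assumption, since the article does not say which order is used.

```python
import numpy as np

# Hypothetical stand-in for one CIFAR10 image: 32 x 32 pixels, 3 channels,
# one byte (0-255) per channel value.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

# Flatten the 32 x 32 x 3 array into a 3072 x 1 column vector
# (row-major order, assumed here for illustration).
x = image.reshape(-1, 1).astype(np.float64)

print(x.shape)  # (3072, 1)
```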

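Linear Classifiers

Definition: a linear classifier is a linear mapping that maps the input image features to class scores.
Characteristics: simple in form and easy to understand.
Extensions: through hierarchical structures (neural networks) or high-dimensional mappings (support vector machines), linear classifiers can be built into powerful nonlinear models. Support vector machines are commonly used when samples are few; neural networks are commonly used when samples are plentiful.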

Decisions of Linear Classifiers

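Decision rule: with class scores $f_i(x) = w_i^T x + b_i$, the image $x$ is assigned to class $i$ if $f_i(x) > f_j(x)$ for all $j \neq i$, that is, to the class with the highest score.
Decision steps:

Represent the image as a vector
Compute the score of the current image for each class
Determine the class of the current image from the class scores
Matrix representation of the classifier: $f(x, W) = Wx + b$, where row $i$ of $W$ is the weight vector $w_i$ of class $i$ and $b$ is the vector of biases $b_i$.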
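The three decision steps can be sketched as follows, computing scores as W x + b and picking the class with the highest score. The function name `predict`, the toy sizes (3 classes, 4 input dimensions), and all numbers are assumptions chosen for illustration, not values from the article.

```python
import numpy as np

def predict(W, b, x):
    """Return (scores, predicted class index) for one flattened image x."""
    scores = W @ x + b              # one score per class
    return scores, int(np.argmax(scores))

# Toy sizes (assumed): 3 classes, 4-dimensional input.
W = np.array([[0.2, -0.5, 0.1, 2.0],
              [1.5, 1.3, 2.1, 0.0],
              [0.0, 0.25, 0.2, -0.3]])
b = np.array([1.1, 3.2, -1.2])
x = np.array([56.0, 231.0, 24.0, 2.0])

scores, label = predict(W, b, x)
print(scores, label)
```

For CIFAR10 as described in this article, x would have 3072 entries and W would have 10 rows, one per class.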

Weight vector for the linear classifier

[Figure: weight vectors of the linear classifier]

Decision boundary for a linear classifier

[Figure: decision boundary of a linear classifier]

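Loss Function

To find the optimal classification model, we also need a loss function and an optimization algorithm. The loss function builds a bridge between model performance and the model parameters (W, b), and guides the optimization of those parameters.

Definition of the Loss Function

A loss function measures how much a given classifier's predictions disagree with the true values; its output is usually a non-negative real number.
This non-negative value serves as a feedback signal for adjusting the classifier parameters, so as to reduce the loss on the current examples and improve classification.
The loss value describes the model's performance.
Over N training samples, the loss is the average of the per-sample losses: $L = \frac{1}{N}\sum_i L_i(f(x_i, W), y_i)$, where $x_i$ is the i-th image and $y_i$ its true label.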

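Multiclass Support Vector Machine Loss

Let $s_{ij}$ be the score the classifier gives sample $i$ for class $j$, and $s_{y_i}$ the score for its true class $y_i$. The multiclass SVM loss of sample $i$ is $$L_i = \sum_{j \neq y_i} \max(0,\ s_{ij} - s_{y_i} + 1)$$ Each term is zero when $s_{y_i} \geq s_{ij} + 1$, that is, when the true class outscores class $j$ by at least 1.

Ps: the +1 in this condition reduces the influence of noise near the decision boundary.
Worked example of the loss L:
[Figure: worked example of the loss computation]
After computing $L_i$ for each sample, take the average to obtain $L$.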
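As a small worked example, the sketch below computes the multiclass SVM loss, L_i = sum over j ≠ y_i of max(0, s_ij - s_yi + 1), for three samples and then averages. The score matrix and labels are illustrative toy numbers, and the function names are hypothetical.

```python
import numpy as np

def svm_loss_single(scores, y, margin=1.0):
    """Multiclass SVM loss for one sample; y is the index of the true class."""
    margins = np.maximum(0.0, scores - scores[y] + margin)
    margins[y] = 0.0                # the true class is excluded from the sum
    return float(margins.sum())

def svm_loss(S, y):
    """Average loss over all samples; S has one row of class scores per sample."""
    return float(np.mean([svm_loss_single(s, yi) for s, yi in zip(S, y)]))

# Toy scores (assumed): 3 samples, 3 classes; row i holds the scores of sample i.
S = np.array([[3.2, 5.1, -1.7],
              [1.3, 4.9, 2.0],
              [2.2, 2.5, -3.1]])
y = np.array([0, 1, 2])             # true class of each sample

per_sample = [svm_loss_single(s, yi) for s, yi in zip(S, y)]
total = svm_loss(S, y)
print(per_sample, total)
```

With these numbers the per-sample losses come out to roughly 2.9, 0 and 12.9, so the average is about 5.27.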

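Q&A

What are the maximum and minimum values of the multiclass SVM loss $L_i$?
A: The maximum is unbounded (infinity); the minimum is 0.
If w and b are very small at initialization, what is the loss L?
A: In this case all scores $s_{ij}$ and $s_{y_i}$ are small, so $s_{ij} - s_{y_i}$ is approximately 0 and every term contributes about 1. $L_i$ and $L$ are therefore approximately the number of classes minus one. This can be used to check the correctness of the implementation.
If the sum runs over all classes (including $j = y_i$), does this affect the choice of optimal parameters?
A: No. The $j = y_i$ term always equals 1, so every loss value increases by the same constant.
If the total loss L is computed as a sum instead of an average, does this affect the choice of optimal parameters?
A: No. The loss is only scaled by the constant factor N.
Suppose there is a W for which the loss L = 0. Is this W unique?
A: No. For example, doubling W (and b) doubles every score difference, so the loss is still 0.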
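The initialization check from the second question can be sketched as follows: with very small weights, the average multiclass SVM loss should come out close to the number of classes minus one. The data here is random and purely hypothetical; only the shapes (10 classes, 3072 inputs) follow the CIFAR10 setup described earlier.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, dim, n = 10, 3072, 5

# Very small random weights and zero biases (assumed initialization).
W = 1e-6 * rng.standard_normal((num_classes, dim))
b = np.zeros(num_classes)

# Hypothetical data: 5 random "images" with random labels.
X = rng.random((n, dim))
y = rng.integers(0, num_classes, size=n)

S = X @ W.T + b                               # scores, shape (n, num_classes)
correct = S[np.arange(n), y][:, None]         # score of the true class per sample
margins = np.maximum(0.0, S - correct + 1.0)
margins[np.arange(n), y] = 0.0                # exclude j = y_i
L = margins.sum(axis=1).mean()

print(L)  # close to num_classes - 1 = 9
```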

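Regularization and Hyperparameters

What Is a Regularization Loss

To keep the model from fitting the training set too closely (overfitting), a regularization loss can be added to L. The regularization term gives the model a preference: among the many parameter settings that make the data loss 0, it selects the preferred one, so the minimizer of L can be unique.
$$L(W) = \frac{1}{N}\sum_i L_i(f(x_i, W), y_i) + \lambda R(W)$$ Here the first term is the data loss, $R(W)$ is the regularization loss, and $\lambda$ controls its strength.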

L2 Regularization

$$R(W) = \sum_k \sum_l W_{k,l}^2$$
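To see what preference the L2 term expresses, the sketch below compares two weight vectors that produce the same score on the same input but have different L2 penalties (the sum of squared weights). The vectors are illustrative toy values and `l2_penalty` is a hypothetical helper name.

```python
import numpy as np

def l2_penalty(W):
    """Sum of the squares of all weights."""
    return float(np.sum(W ** 2))

x = np.ones(4)
w1 = np.array([1.0, 0.0, 0.0, 0.0])   # relies on a single input dimension
w2 = np.full(4, 0.25)                 # spreads the weight over all dimensions

score1, score2 = float(w1 @ x), float(w2 @ x)
penalty1, penalty2 = l2_penalty(w1), l2_penalty(w2)
print(score1, score2, penalty1, penalty2)
```

Both vectors give the same score, but the spread-out vector has the smaller penalty, so L2 regularization favours it.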

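What Is a Hyperparameter

A hyperparameter is a parameter whose value is set before the learning process starts, rather than learned from the data.
Hyperparameters generally have a significant impact on model performance. The regularization strength $\lambda$, which weights the regularization loss against the data loss, is one example.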

Commonly Used Regularization Losses

[Figure: commonly used regularization losses]

Optimization Algorithms

Optimization uses the output of the loss function as a feedback signal to adjust the classifier parameters, improving the classifier's predictions on the training samples. The goal is to find the set of parameters W for which the loss function L is minimal.

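Gradient Descent

A simple and efficient iterative optimization algorithm. Each iteration moves the parameters a small step in the direction of the negative gradient of the loss:
$$W \leftarrow W - \mu \nabla_W L$$ where $\mu$ is the learning rate (step size). The gradient can be computed in two ways.

Numerical method
Computationally expensive, and yields only an approximation.
Analytic method
Exact and fast, but deriving the derivative by hand is error-prone.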
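A minimal sketch of the gradient descent loop is shown below. To keep it short it minimizes a toy quadratic loss with a known analytic gradient instead of a classifier loss; the function name, learning rate, and step count are all assumptions for illustration.

```python
import numpy as np

def gradient_descent(grad_fn, w0, learning_rate=0.1, num_steps=100):
    """Repeatedly step against the gradient returned by grad_fn."""
    w = np.asarray(w0, dtype=np.float64).copy()
    for _ in range(num_steps):
        w -= learning_rate * grad_fn(w)
    return w

# Toy loss (assumed): squared distance to a fixed target, minimized at w = target.
target = np.array([3.0, -2.0])

def loss(w):
    return float(np.sum((w - target) ** 2))

def grad(w):
    return 2.0 * (w - target)       # analytic gradient of the toy loss

w_final = gradient_descent(grad, np.zeros(2))
print(w_final, loss(w_final))
```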

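Q: What is the numerical gradient used for?
A: Gradients are normally computed analytically; the numerical gradient is mainly used to verify that the analytic gradient is correct (gradient checking).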
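A gradient check might look like the sketch below: it computes the analytic gradient of the average multiclass SVM loss, approximates the same gradient with central differences, and compares the two. The bias is omitted to keep the example short, and the data, sizes, and function names are hypothetical.

```python
import numpy as np

def svm_loss_and_grad(W, X, y):
    """Average multiclass SVM loss and its analytic gradient with respect to W.
    W: (C, D) weights, X: (N, D) samples, y: (N,) integer labels."""
    n = X.shape[0]
    S = X @ W.T                                   # scores, shape (N, C)
    correct = S[np.arange(n), y][:, None]
    margins = np.maximum(0.0, S - correct + 1.0)
    margins[np.arange(n), y] = 0.0
    loss = margins.sum() / n
    # Each positive margin adds x_i to row j and subtracts x_i from row y_i.
    mask = (margins > 0).astype(np.float64)
    mask[np.arange(n), y] = -mask.sum(axis=1)
    grad = mask.T @ X / n                         # shape (C, D)
    return loss, grad

def numerical_grad(f, W, h=1e-5):
    """Central-difference approximation of the gradient of f at W."""
    grad = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        old = W[idx]
        W[idx] = old + h
        f_plus = f(W)
        W[idx] = old - h
        f_minus = f(W)
        W[idx] = old                              # restore the original value
        grad[idx] = (f_plus - f_minus) / (2 * h)
    return grad

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))
X = rng.standard_normal((5, 4))
y = rng.integers(0, 3, size=5)

_, analytic = svm_loss_and_grad(W, X, y)
numeric = numerical_grad(lambda W_: svm_loss_and_grad(W_, X, y)[0], W)
max_diff = float(np.max(np.abs(analytic - numeric)))
print(max_diff)
```

The numerical version needs two loss evaluations per weight, which is why it is reserved for checking rather than for training.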

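Computational Efficiency

Gradient descent

Every update uses all N training samples. When N is large, each computation of the weight gradient is very expensive, slow and inefficient.
Stochastic gradient descent

Every update uses a single randomly chosen sample. Training on one sample can introduce a lot of noise: not every iteration moves toward the overall optimum, but over a large number of iterations the updates move in the optimal direction on the whole.
Mini-batch gradient descent

Every update uses a batch of m randomly chosen samples.
In papers, the epoch is generally used to describe how many samples have been iterated over.
One epoch requires N/m iterations, where N is the total number of samples and m is the batch size.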
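The relation between epochs, iterations, and batch size can be sketched with a small index generator. N = 50000 matches the CIFAR10 training set size given earlier; the batch size m = 100 and the helper name `minibatches` are assumptions for illustration.

```python
import numpy as np

def minibatches(n_samples, batch_size, rng):
    """Yield index arrays that together cover one epoch in shuffled order."""
    order = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        yield order[start:start + batch_size]

N, m = 50000, 100                   # m is an assumed batch size
rng = np.random.default_rng(0)

batches = list(minibatches(N, m, rng))
iterations_per_epoch = len(batches)

print(iterations_per_epoch)  # N / m = 500
```

In a training loop, each index array would select the samples used for one gradient update.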

Training Process

Dataset Splitting

Dataset = training set + validation set + test set
Training set: used to learn the classifier parameters for a given set of hyperparameters.
Validation set: used to select the hyperparameters.
Test set: used to evaluate generalization ability.
Note: when data is scarce, the validation set may contain very few samples and therefore fail to represent the data in a statistical sense. In that case, cross-validation can be used.

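K-Fold Cross-Validation

The data is split into K folds of roughly equal size. Each fold is used once as the validation set while the remaining K-1 folds are used for training, and the K validation scores are averaged.
To reduce the dependence on one particular split, the data can be shuffled before each split and the procedure repeated, which gives a more reliable final average score. This method is called repeated K-fold validation with shuffling.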
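A K-fold split can be sketched with plain NumPy as below. The helper names and the `score_fn(train_idx, val_idx)` callback are hypothetical; in practice the callback would train a classifier on the training indices and return its validation score.

```python
import numpy as np

def k_fold_indices(n_samples, k, rng=None):
    """Yield (train_idx, val_idx) pairs; shuffle first if rng is given."""
    idx = np.arange(n_samples) if rng is None else rng.permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val

def cross_validate(score_fn, n_samples, k, rng=None):
    """Average score_fn(train_idx, val_idx) over the k folds."""
    scores = [score_fn(tr, va) for tr, va in k_fold_indices(n_samples, k, rng)]
    return float(np.mean(scores))

splits = list(k_fold_indices(10, 5, np.random.default_rng(0)))
for train_idx, val_idx in splits:
    print(sorted(train_idx.tolist()), sorted(val_idx.tolist()))
```

Calling `cross_validate` several times with differently seeded generators and averaging the results corresponds to the repeated, shuffled variant.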

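Data Preprocessing

Mean subtraction
x = x - mean; this reduces the range over which the data varies and highlights relative differences.
Normalization
x = (x - mean) / standard deviation; this removes the effect of measurement units (scale).
Decorrelation
Sometimes we want to consider x alone or y alone, which requires removing the correlation between x and y so that y no longer varies with x. This makes the features uncorrelated and can be used for dimensionality reduction.
Whitening
Normalization performed on top of decorrelation.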
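The four preprocessing steps can be sketched on toy 2-D data as follows. Decorrelation is done here by rotating the centered data onto the eigenvectors of its covariance matrix (a PCA-style approach, assumed because the article does not name a method), and the small constant added before the square root is an assumption to avoid division by zero.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical correlated 2-D data with a non-zero mean.
mix = np.array([[2.0, 1.2],
                [0.0, 0.5]])
X = rng.standard_normal((500, 2)) @ mix + np.array([5.0, -3.0])

# Mean subtraction: center every feature at zero.
X_centered = X - X.mean(axis=0)

# Normalization: divide each centered feature by its standard deviation.
X_normalized = X_centered / X.std(axis=0)

# Decorrelation: rotate onto the eigenvectors of the covariance matrix.
cov = X_centered.T @ X_centered / X.shape[0]
eigvals, eigvecs = np.linalg.eigh(cov)
X_decorrelated = X_centered @ eigvecs

# Whitening: rescale each decorrelated direction to unit variance.
X_whitened = X_decorrelated / np.sqrt(eigvals + 1e-8)

print(np.round(np.cov(X_whitened, rowvar=False, bias=True), 3))
```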

Beijing University of Posts and Telecommunications – Lu Peng – Computer Vision and Deep Learning