
[CV] Andrew Ng (Wu Enda) machine learning course notes, Chapter 13

2022-06-29 04:46:00 【Fannnnf】

Unless otherwise noted, in this series of articles each passage of text explains the picture directly above it.
Machine Learning | Coursera
Andrew Ng (Wu Enda) machine learning series, bilibili

13 Clustering algorithms

13-1 Unsupervised learning

[Figure]
In unsupervised learning the data set is a collection of unlabeled data: the samples have no $y$ values, only $x$ values.

13-2 K-means algorithm

[Figure]
Step 1 of K-means (cluster assignment): choose two initial cluster centroids (the blue cross and the red cross in the figure), go through every sample (the green dots in the figure), determine which centroid each sample is closer to, and split the samples into two clusters accordingly. The result of the assignment is shown in the figure below.
[Figure]
Step 2 of K-means (move centroids): compute the mean of all points in each cluster and move that cluster's centroid to the mean. The result of the move is shown in the figure below.

Then repeat step 1: determine which centroid each sample is now closest to and change its color (its cluster assignment) accordingly. After the reassignment, repeat step 2.
Alternating the two steps in this way gives the final result:
[Figure]
At this point further iterations no longer change the clusters, and K-means is said to have converged.

The algorithm takes two inputs: $K$, the number of clusters the data should be divided into, and an unlabeled training set.
Each training sample is an $n$-dimensional vector (by convention the $x_0=1$ term is left out).

As shown in the figure above:
$K$ is the number of clusters the data is to be divided into.
$\mu_k$ is the position of the $k$-th cluster centroid (a vector with the same dimension as the samples). It is obtained by random initialization.
$c^{(i)}$ is the index of the cluster centroid closest to the $i$-th sample. In other words, sample $i$ is closest to centroid number $c^{(i)}$ and therefore belongs to cluster $c^{(i)}$. It is computed as shown in the blue handwriting in the figure above: $c^{(i)}$ is the value of $k$ that minimizes $\Vert x^{(i)}-\mu_k\Vert^2$.
Once these values have been computed, take the mean of the points assigned to each cluster and assign it to the corresponding $\mu_k$. This gives the new centroid positions.
If a cluster centroid ends up with no points, the usual approach is to remove it, which leaves $K-1$ clusters. If $K$ clusters are really needed, randomly reinitialize the centroid that has no points instead.
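The two alternating steps and the empty-cluster rule described above can be sketched in plain Python as follows. This is a minimal illustration rather than a reference implementation: the function names (`sq_dist`, `assign_clusters`, `move_centroids`, `kmeans`), the `init`, `max_iters` and `seed` parameters, and the toy data are all assumptions made for this example. An empty cluster is handled here by re-initializing its centroid at a random sample, which corresponds to the "keep K clusters" option.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def assign_clusters(X, mu):
    """Cluster assignment step: c[i] is the index of the centroid closest to X[i]."""
    return [min(range(len(mu)), key=lambda k: sq_dist(x, mu[k])) for x in X]


def move_centroids(X, c, K, rng):
    """Move-centroid step: mu[k] becomes the mean of the samples assigned to cluster k."""
    mu = []
    for k in range(K):
        members = [x for x, ci in zip(X, c) if ci == k]
        if members:
            mu.append([sum(col) / len(members) for col in zip(*members)])
        else:
            # Empty cluster: keep K clusters by re-initializing at a random sample.
            mu.append(list(rng.choice(X)))
    return mu


def kmeans(X, K, init=None, max_iters=100, seed=0):
    """Run K-means and return (c, mu). `init` optionally fixes the starting centroids."""
    rng = random.Random(seed)
    if init is not None:
        mu = [list(p) for p in init]
    else:
        mu = [list(x) for x in rng.sample(X, K)]  # random initialization from samples
    c = None
    for _ in range(max_iters):
        new_c = assign_clusters(X, mu)
        if new_c == c:  # assignments no longer change: converged
            break
        c = new_c
        mu = move_centroids(X, c, K, rng)
    return c, mu


# Toy data: two well-separated groups of 2-D samples (hypothetical values).
X = [[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]]
c, mu = kmeans(X, K=2, init=[[0, 0], [10, 10]])
print(c)   # [0, 0, 0, 1, 1, 1]
print(mu)  # centroids near (1.33, 1.33) and (8.33, 8.33)
```

The starting centroids are passed explicitly in the usage lines only to make the run reproducible. Leaving `init` unset falls back to picking `K` random samples.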
[Figure]
As shown in the figure above, K-means is sometimes also applied to data sets that do not separate into obvious clusters. For example, suppose the heights and weights of many people are collected as a data set. These data are essentially continuous, yet a clustering algorithm can still divide them into three groups corresponding to the sizes S, M and L. Clustering algorithms can likewise be used for market segmentation.

13-3 Optimization objective

$\mu_{c^{(i)}}$ denotes the position of the centroid of the cluster to which the $i$-th sample is assigned.
[Figure]
The cost function (optimization objective) of the K-means clustering algorithm is
$J(c^{(1)},...,c^{(m)},\mu_1,...,\mu_K)=\frac{1}{m}\sum_{i=1}^m\Vert x^{(i)}-\mu_{c^{(i)}}\Vert^2$
That is, for each sample take the difference between its position and the position of its cluster centroid, take the norm and square it, then sum over all $m$ samples and take the average.
This cost function is sometimes called the distortion cost function, or the distortion of the K-means algorithm.
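As a small worked example of the formula, the distortion can be computed directly from the samples, the assignments $c^{(i)}$ and the centroids $\mu_k$. The function name `distortion` and the numbers below are assumptions chosen so the result is easy to check by hand.

```python
def distortion(X, c, mu):
    """J = (1/m) * sum over i of ||x_i - mu_{c_i}||^2."""
    m = len(X)
    total = 0.0
    for x, ci in zip(X, c):
        # Squared Euclidean norm of the difference between a sample and its centroid.
        total += sum((xj - mj) ** 2 for xj, mj in zip(x, mu[ci]))
    return total / m


# Hypothetical data: four samples on a line, two clusters.
X = [[0.0, 0.0], [2.0, 0.0], [10.0, 0.0], [12.0, 0.0]]
c = [0, 0, 1, 1]                 # index of the cluster each sample belongs to
mu = [[1.0, 0.0], [11.0, 0.0]]   # centroid positions
print(distortion(X, c, mu))      # every sample is 1 unit from its centroid, so J = 1.0
```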

13-4 Random initialization (K-means clustering algorithm)

[Figure]

Randomly pick $K$ samples from the training set and set the cluster centroids $\mu_1,\dots,\mu_K$ equal to those $K$ samples.
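A minimal sketch of this initialization, using only the Python standard library. The helper name `init_centroids` and its `rng` parameter are assumptions made for this example. Picking $K$ distinct samples requires $K$ to be no larger than the number of samples, so the sketch checks for that.

```python
import random


def init_centroids(X, K, rng=None):
    """Pick K distinct samples from X and use copies of them as the initial centroids."""
    if not 1 <= K <= len(X):
        raise ValueError("K must be between 1 and the number of samples")
    if rng is None:
        rng = random.Random()
    # sample() draws without replacement, so no sample is picked twice.
    return [list(x) for x in rng.sample(X, K)]


# Hypothetical training set of four 2-D samples.
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
mu = init_centroids(X, K=2, rng=random.Random(42))  # seeded only for reproducibility
print(mu)  # two distinct rows of X
```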

As shown in the figure above, because the initial centroids are chosen at random, the result may be the global optimum (the top plot in the figure) or it may get stuck in a local optimum (the two lower plots).
For this reason, K-means is run with multiple random initializations, which makes it more likely that the global optimum, or something close to it, is found.

As shown in the figure above, the procedure for multiple random initializations is:
Run the K-means clustering algorithm 50 to 1000 times, each time from a new random initialization. This yields many different cost function values, and the clustering with the smallest cost is taken as the best one.
If $K$ is between 2 and 10, multiple random initializations can noticeably improve the clustering result. If $K$ is greater than 10, multiple runs may not bring a particularly significant improvement.
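The procedure above can be sketched as a loop that keeps the run with the lowest distortion. This is an illustrative sketch under stated assumptions: `kmeans_once` and `best_of_restarts` are hypothetical names, the data are made up, and in this compact version an empty cluster simply keeps its previous centroid.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans_once(X, K, rng, max_iters=100):
    """One K-means run from a random initialization; returns (J, c, mu)."""
    mu = [list(x) for x in rng.sample(X, K)]
    c = None
    for _ in range(max_iters):
        new_c = [min(range(K), key=lambda k: sq_dist(x, mu[k])) for x in X]
        if new_c == c:  # converged
            break
        c = new_c
        for k in range(K):
            members = [x for x, ci in zip(X, c) if ci == k]
            if members:  # an empty cluster keeps its old centroid in this sketch
                mu[k] = [sum(col) / len(members) for col in zip(*members)]
    J = sum(sq_dist(x, mu[ci]) for x, ci in zip(X, c)) / len(X)
    return J, c, mu


def best_of_restarts(X, K, n_restarts=100, seed=0):
    """Run K-means n_restarts times and keep the result with the smallest cost J."""
    rng = random.Random(seed)
    runs = (kmeans_once(X, K, rng) for _ in range(n_restarts))
    return min(runs, key=lambda run: run[0])


# Hypothetical 1-D data with three obvious groups.
X = [[0.0], [1.0], [10.0], [11.0], [20.0], [21.0]]
J, c, mu = best_of_restarts(X, K=3, n_restarts=100)
print(J, c, mu)
```

A single run can end in a poor local optimum, for example when two of the initial centroids fall in the same group. Taking the minimum over many runs makes that outcome much less likely to be the one that is kept.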

13-5 How to select the number of clusters K

In general, $K$ is chosen by hand.
[Figure]

As shown in the figure above, one option is the elbow method. The x-axis is the number of clusters $K$ and the y-axis is the value of the cost function. After plotting the curve, as in the left plot, you can see that at $K=3$ the curve goes from a very steep slope to a very shallow one. This point is regarded as the "elbow", and choosing $K=3$ is appropriate. However, the curve may also look like the one in the right plot, where no clear elbow, and hence no clearly appropriate number of clusters, can be identified.
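The curve for the elbow method can be produced by recording the lowest cost found for each candidate $K$. The sketch below is one possible way to do this and is not taken from the course material: `kmeans_cost` and `elbow_curve` are hypothetical names, the data are made up, and each $K$ is run from several random initializations so that a bad local optimum does not distort the curve.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans_cost(X, K, rng, max_iters=100):
    """One K-means run from a random initialization; returns the final distortion J."""
    mu = [list(x) for x in rng.sample(X, K)]
    c = None
    for _ in range(max_iters):
        new_c = [min(range(K), key=lambda k: sq_dist(x, mu[k])) for x in X]
        if new_c == c:  # converged
            break
        c = new_c
        for k in range(K):
            members = [x for x, ci in zip(X, c) if ci == k]
            if members:  # an empty cluster keeps its old centroid in this sketch
                mu[k] = [sum(col) / len(members) for col in zip(*members)]
    return sum(sq_dist(x, mu[ci]) for x, ci in zip(X, c)) / len(X)


def elbow_curve(X, k_values, n_restarts=50, seed=0):
    """Return [(K, best J found for that K)] for plotting cost against K."""
    rng = random.Random(seed)
    return [
        (K, min(kmeans_cost(X, K, rng) for _ in range(n_restarts)))
        for K in k_values
    ]


# Hypothetical 1-D data with three obvious groups.
X = [[0.0], [1.0], [10.0], [11.0], [20.0], [21.0]]
for K, J in elbow_curve(X, k_values=[1, 2, 3, 4, 5]):
    print(K, round(J, 4))
```

On a toy set like this one, with three well-separated groups, the cost should fall sharply up to $K=3$ and only slightly after that, which is the elbow shape described above. On real data the bend is often much less pronounced.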

Alternatively, choose the number of clusters by hand according to the downstream purpose.

Original site

Copyright notice
This article was written by [Fannnnf]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202161031267140.html