K-means introduction
2022-06-21 10:32:00 【I'm afraid I'm not retarded】
K-Means
K-Means, or K-means clustering, is a partition-based clustering method.
Working principle:
Starting from the initial cluster centers, compute the distance from each sample to every center and assign each sample to the cluster of its nearest center. Then update the cluster centers, recompute the distance from each sample to the new centers, and reassign the samples to the clusters of the new centers. Repeat until a termination condition is met.
Given N sample points, the steps for clustering them with K-Means are:
- Determine the number of clusters k and specify the k cluster centers C1, C2, …, Ck;
- Compute the distance from each sample point Si to the k centers and assign the point to the class of the nearest center Cj, where i = 1, …, N and j = 1, …, k;
- Recompute the center point of each of the k clusters and update the original center positions C1, C2, …, Ck;
- Repeat steps 2 and 3 until the center positions no longer change or change by less than an agreed threshold, or until the predefined maximum number of iterations is reached. The result is the final clustering.
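Steps 2 and 3 above can be sketched in Python as one assignment-and-update pass. This is a minimal illustration, not a reference implementation: the function names `assign_to_nearest` and `update_centers`, the toy data, and the choice to leave the center of an empty cluster where it was are all assumptions made for the example. Distances are Euclidean, computed with the standard library's `math.dist`.

```python
import math

def assign_to_nearest(samples, centers):
    """Step 2: for each sample, return the index of its nearest center."""
    labels = []
    for s in samples:
        distances = [math.dist(s, c) for c in centers]
        labels.append(distances.index(min(distances)))
    return labels

def update_centers(samples, labels, centers):
    """Step 3: move each center to the mean of the samples assigned to it."""
    new_centers = []
    for j, old_center in enumerate(centers):
        members = [s for s, lab in zip(samples, labels) if lab == j]
        if not members:
            # Assumption: an empty cluster keeps its previous center.
            new_centers.append(old_center)
            continue
        new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
    return new_centers

samples = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
centers = [(0.0, 0.0), (10.0, 10.0)]

labels = assign_to_nearest(samples, centers)
centers = update_centers(samples, labels, centers)
print(labels)   # [0, 0, 1, 1]
print(centers)  # [(1.0, 1.5), (8.5, 8.0)]
```

Step 4 would repeat these two calls until the centers stop moving or an iteration limit is reached.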
Implementation steps
Step 1: determine the number of clusters, the initial cluster centers, and the distance formula.
Ways to determine the number of clusters:
- Observation
- Enumeration
- Other technical means
Distance formula: the Euclidean distance is the usual choice.
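For reference, the Euclidean distance between a sample Si and a cluster center Cj can be written as follows. The symbol M (the number of features per sample) and the second subscript m (the feature index) are notation introduced here for illustration; they do not appear in the original text.

```latex
\lVert S_i - C_j \rVert_2 = \sqrt{\sum_{m=1}^{M} \left( S_{i,m} - C_{j,m} \right)^2}
```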
Step 2: compute the distance from each point to every cluster center and assign the point to the nearest cluster.
Step 3: compute the center of each current cluster and update the position of cluster center Ck.
Repeat step 2: reassign each sample Si according to the new cluster centers Ck.
Repeat step 3: compute the cluster centers from the latest clusters and update the values of the centers Ck.
Repeat steps 2 and 3 until the positions of the cluster centers no longer change or the number of iterations exceeds the preset threshold. The result is the final clustering.
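One possible reading of the "Enumeration" option in step 1 is to run K-Means for several candidate values of k and compare a quality score for each. The sketch below uses the sum of squared distances from each sample to its nearest center as that score. This criterion is an assumption chosen for illustration, as are the helper name `kmeans_sse`, the toy data, and the use of the first k samples as initial centers.

```python
import math

def kmeans_sse(samples, k, max_iter=100):
    """Run a basic K-Means and return the sum of squared distances to the centers."""
    centers = list(samples[:k])  # assumption: first k samples as initial centers
    for _ in range(max_iter):
        # Assign every sample to the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for s in samples:
            nearest = min(range(k), key=lambda j: math.dist(s, centers[j]))
            clusters[nearest].append(s)
        # Recompute each center as the mean of its cluster (empty clusters keep theirs).
        new_centers = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centers[j]
            for j, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(math.dist(s, c) for c in centers) ** 2 for s in samples)

samples = [(1, 1), (8, 8), (1.5, 2), (9, 9), (1, 0.5), (8.5, 9.5)]
for k in range(1, 5):
    print(k, round(kmeans_sse(samples, k), 2))
```

On this toy data the score drops sharply from k = 1 to k = 2 and only slightly afterwards, which suggests that two clusters describe the data well.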
Pseudocode
choose k points as the initial cluster centers
repeat
    assign each sample point to the nearest cluster center, forming k clusters
    recompute the center of each cluster
until the clusters do not change or the maximum number of iterations is reached
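The pseudocode above could be turned into runnable Python along the following lines. This is a sketch under stated assumptions rather than a definitive implementation: the function name `kmeans`, the parameters `max_iter`, `tol` and `seed`, the random choice of k samples as initial centers, and the handling of empty clusters are all choices made for this example. The loop stops when no center moves by more than `tol`, or when `max_iter` is reached.

```python
import math
import random

def kmeans(samples, k, max_iter=100, tol=1e-6, seed=0):
    """Basic K-Means with Euclidean distance; returns (centers, labels)."""
    rng = random.Random(seed)
    # Choose k points as the initial cluster centers (assumption: random samples).
    centers = rng.sample(samples, k)
    labels = []
    for _ in range(max_iter):
        # Assign each sample point to the nearest cluster center.
        labels = [min(range(k), key=lambda j: math.dist(s, centers[j]))
                  for s in samples]
        # Recompute the center of each cluster as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(col) / len(members)
                                         for col in zip(*members)))
            else:
                # Assumption: an empty cluster keeps its previous center.
                new_centers.append(centers[j])
        # Largest distance any center moved in this iteration.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, labels

samples = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
centers, labels = kmeans(samples, k=2)
print(centers)
print(labels)
```

Which cluster receives index 0 depends on the random initial centers, so only the grouping of the labels is meaningful, not the numeric values.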
Advantages and disadvantages of K-Means
- Advantages
  - The principle is simple, easy to understand, and easy to implement
  - The clustering results are easy to interpret
  - Clustering quality is generally good, particularly for compact, roughly spherical clusters
- Disadvantages
  - The number of clusters k must be specified in advance, and different values of k can give quite different clustering results
  - The initial choice of the k cluster centers affects the final result; different choices may give different results
  - Only spherical clusters are recognized well; results on non-spherical clusters are poor
  - With many sample points, the computational cost is high
  - Sensitive to outliers; outlying values need special handling
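The sensitivity to outliers follows from the center being a plain mean of the cluster members. The small sketch below shows how far one distant point drags the center of an otherwise compact cluster. The helper name `centroid` and the data are made up for illustration.

```python
def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

cluster = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
outlier = (50.0, 50.0)

print(centroid(cluster))              # (1.5, 1.5)
print(centroid(cluster + [outlier]))  # (11.2, 11.2)
```

With the outlier included, the center lies far from every one of the four original points, so the next assignment step can hand those points to a different cluster.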