Multivariate Cluster Analysis
2022-07-06 08:49:00 【亦是远方】
1. Code
import pandas as pd
from pandas import DataFrame
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
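# Read the input file
datafile = u'student-mat.xlsx' # path to the input file; the u prefix guards against Chinese characters in the path, there are none here, so it can be omitted
outfile = 'stu.xlsx'
data = pd.read_excel(datafile) # datafile is an Excel file, so use read_excel; for a CSV file use read_csv
d = DataFrame(data)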
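# Clustering
n = 5 # group the data into 5 clusters
mod = KMeans(n_clusters=n)
y_pred = mod.fit_predict(d) # y_pred holds the cluster label of each sample (same values as mod.labels_)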
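# With the data grouped into 5 clusters, count the samples in each cluster and get the cluster centers
r1 = pd.Series(mod.labels_).value_counts() # number of samples in each cluster
r2 = pd.DataFrame(mod.cluster_centers_) # cluster centers
r = pd.concat([r2, r1], axis=1) # rows are aligned on the cluster label
r.columns = list(d.columns) + [u'类别数目'] # '类别数目' = number of samples in the cluster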
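# Label each record with the cluster it was assigned to
r = pd.concat([d, pd.Series(mod.labels_, index=d.index)], axis=1)
r.columns = list(d.columns) + [u'聚类类别'] # '聚类类别' = cluster label
print(r)
r.to_excel(outfile) # keep this line only if the result should be saved locally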
# Visualization
ts = TSNE()
ts.fit_transform(d) # embed the features only; r also holds the cluster label column, which should not shape the layout
ts = pd.DataFrame(ts.embedding_, index=r.index) # 2-D coordinates, one row per sample
# One marker style per cluster, in label order 0 to 4
styles = ['r.', 'go', 'g*', 'b.', 'b*']
for label, style in enumerate(styles):
    a = ts[r[u'聚类类别'] == label]
    plt.plot(a[0], a[1], style)
plt.show()
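The script above passes the whole spreadsheet to KMeans, which only works if every column is numeric, and columns with large ranges will dominate the distance calculation. The following is a minimal sketch of one possible preprocessing step, not the method used above: it keeps the numeric columns and standardizes them before clustering. The `demo` frame and its column names (`age`, `absences`, `school`) are hypothetical stand-ins for the contents of student-mat.xlsx, and `random_state=0` and `n_init=10` are assumptions added only to make the run reproducible.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the spreadsheet: two numeric columns and one text column
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    'age': rng.integers(15, 23, size=60),
    'absences': rng.integers(0, 30, size=60),
    'school': ['GP', 'MS'] * 30,  # non-numeric, KMeans cannot use it directly
})

# Keep only the numeric columns, then scale each one to zero mean and unit variance
numeric = demo.select_dtypes(include='number')
scaled = StandardScaler().fit_transform(numeric)

# Fixed seed and explicit n_init so repeated runs give the same labels
model = KMeans(n_clusters=5, n_init=10, random_state=0)
labels = model.fit_predict(scaled)

# Attach the cluster label to each record and count the samples per cluster
result = demo.assign(cluster=labels)
print(result['cluster'].value_counts().sort_index())
```

With real data, `demo` would be replaced by the frame returned from `pd.read_excel`, and text columns would need to be encoded (or dropped, as here) depending on whether they should influence the clusters.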
2. Results
3. Dataset