Multivariate Cluster Analysis
2022-07-06 08:49:00 【亦是远方】
1. Code
import pandas as pd
from pandas import DataFrame
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
# Read the input file
datafile = 'student-mat.xlsx'  # path to the input file; the u prefix is not needed in Python 3
outfile = 'stu.xlsx'
data = pd.read_excel(datafile)  # datafile is an Excel file, so use read_excel; for a CSV file use read_csv
d = DataFrame(data)
# Clustering
n = 5  # group the data into 5 clusters
mod = KMeans(n_clusters=n)
y_pred = mod.fit_predict(d)  # y_pred holds the cluster label assigned to each row
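# For the 5 clusters, count the samples in each one and get the cluster centers
r1 = pd.Series(mod.labels_).value_counts()  # number of samples in each cluster
r2 = pd.DataFrame(mod.cluster_centers_)  # cluster centers
r = pd.concat([r2, r1], axis=1)
r.columns = list(d.columns) + ['cluster size']
print(r)  # show the summary before r is reused below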
# Label each row with the cluster it was assigned to
r = pd.concat([d, pd.Series(mod.labels_, index=d.index)], axis=1)
r.columns = list(d.columns) + ['cluster']
print(r)
r.to_excel(outfile)  # keep this line only if the result should be saved locally
# Visualization
ts = TSNE()
ts.fit_transform(d)  # embed the features only, so the cluster label does not influence the layout
ts = pd.DataFrame(ts.embedding_, index=r.index)
# One marker style per cluster, in order of cluster label 0 to 4
styles = ['r.', 'go', 'g*', 'b.', 'b*']
for label, style in enumerate(styles):
    a = ts[r['cluster'] == label]
    plt.plot(a[0], a[1], style)
plt.show()
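The script above depends on a local spreadsheet, so it cannot be run without that file. As a rough, self-contained sketch of the same steps (cluster, summarize sizes and centers, embed with t-SNE), the block below swaps in synthetic data from `make_blobs`. The feature names `f1` to `f4`, the sample count, and the fixed `random_state` values are assumptions made for illustration and do not come from the original data.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE

# Synthetic stand-in for the spreadsheet: 150 rows, 4 numeric features (assumption).
X, _ = make_blobs(n_samples=150, n_features=4, centers=5, random_state=0)
d = pd.DataFrame(X, columns=["f1", "f2", "f3", "f4"])

n = 5  # number of clusters, as in the script above
mod = KMeans(n_clusters=n, n_init=10, random_state=0)
labels = pd.Series(mod.fit_predict(d), index=d.index, name="cluster")

# Cluster centers with the matching cluster sizes, aligned on the cluster id.
centers = pd.DataFrame(mod.cluster_centers_, columns=d.columns)
summary = centers.assign(count=labels.value_counts().sort_index())

# 2-D t-SNE embedding of the features only; labels are attached afterwards.
embedding = pd.DataFrame(
    TSNE(n_components=2, random_state=0).fit_transform(d.to_numpy()),
    index=d.index,
    columns=["x", "y"],
)
embedding["cluster"] = labels

print(summary)
print(embedding.head())
```

Each cluster can then be drawn by filtering `embedding` on its `cluster` column and passing the `x` and `y` columns to `plt.plot`, in the same way the script does.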
2. Results
3. Dataset
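The script passes the contents of `student-mat.xlsx` straight to `KMeans`, which only works when every column is numeric. The file itself is not shown here, so the following is only a sketch of one common way to prepare a mixed-type table, under the assumption that some columns hold text: one-hot encode the text columns and standardize all features. The table and its column names (`age`, `absences`, `school`) are hypothetical.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical mixed-type table; column names and values are illustrative only.
raw = pd.DataFrame({
    "age": [15, 16, 17, 18, 15, 16, 17, 18],
    "absences": [0, 4, 10, 2, 6, 0, 14, 3],
    "school": ["A", "A", "B", "B", "A", "B", "A", "B"],
})

# One-hot encode text columns so that every feature is numeric.
encoded = pd.get_dummies(raw, dtype=float)

# Standardize the features; K-means relies on Euclidean distance,
# so columns with large ranges would otherwise dominate.
scaled = pd.DataFrame(
    StandardScaler().fit_transform(encoded),
    index=encoded.index,
    columns=encoded.columns,
)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scaled)
print(encoded.columns.tolist())
print(labels)
```

Whether scaling is appropriate depends on the data; if all columns already share a comparable range, clustering the raw values as the script does may be sufficient.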