Multivariate Cluster Analysis
2022-07-06 08:49:00 【亦是远方】
1. Code
import pandas as pd
from pandas import DataFrame
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
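# Read the input file
datafile = u'student-mat.xlsx' # path to the input file; the u prefix guards against non-ASCII characters in the path and can be omitted here
outfile = 'stu.xlsx'
data = pd.read_excel(datafile) # datafile is an Excel file, so use read_excel; for a CSV file use read_csv
d = DataFrame(data)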
# Clustering
n = 5 # number of clusters
mod = KMeans(n_clusters=n)
y_pred = mod.fit_predict(d) # y_pred holds the cluster label assigned to each row
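# Count the samples in each of the 5 clusters and collect the cluster centers
r1 = pd.Series(mod.labels_).value_counts() # number of samples in each cluster
r2 = pd.DataFrame(mod.cluster_centers_) # cluster centers
r = pd.concat([r2, r1], axis=1)
r.columns = list(d.columns) + [u'类别数目'] # '类别数目' = number of samples in the cluster
print(r) # show the summary now, because r is reassigned below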
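# Label every row with the cluster it was assigned to
r = pd.concat([d, pd.Series(mod.labels_, index=d.index)], axis=1)
r.columns = list(d.columns) + [u'聚类类别'] # '聚类类别' = cluster label
print(r)
r.to_excel(outfile) # keep this line only if the result should be saved locally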
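# Visualization: reduce the data to two dimensions with t-SNE
ts = TSNE()
ts.fit_transform(d) # embed the original features only, so the cluster label column does not influence the layout
ts = pd.DataFrame(ts.embedding_, index=r.index)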
a = ts[r[u'聚类类别'] == 0]
plt.plot(a[0], a[1], 'r.')
a = ts[r[u'聚类类别'] == 1]
plt.plot(a[0], a[1], 'go')
a = ts[r[u'聚类类别'] == 2]
plt.plot(a[0], a[1], 'g*')
a = ts[r[u'聚类类别'] == 3]
plt.plot(a[0], a[1], 'b.')
a = ts[r[u'聚类类别'] == 4]
plt.plot(a[0], a[1], 'b*')
plt.show()
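The listing above feeds the spreadsheet to KMeans as it is. KMeans works on Euclidean distances, so it needs numeric input, and columns with large ranges tend to dominate the result. The sketch below shows one possible way to prepare the data first. It is not the author's method: the `demo` table and its column names are made up for illustration, because the real contents of student-mat.xlsx are not shown here, and one-hot encoding plus standardization is just one reasonable choice.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the spreadsheet: two numeric columns and one text column.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    'age': rng.integers(15, 23, size=120),
    'absences': rng.integers(0, 30, size=120),
    'school': rng.choice(['GP', 'MS'], size=120),
})

# One-hot encode text columns so that every feature is numeric.
features = pd.get_dummies(demo, dtype=float)

# Standardize each column so that no single one dominates the distance.
scaled = StandardScaler().fit_transform(features)

# A fixed random_state and an explicit n_init make the run repeatable.
model = KMeans(n_clusters=5, n_init=10, random_state=0)
labels = model.fit_predict(scaled)

# One row per cluster: its center (in standardized units) and its sample count.
sizes = pd.Series(labels).value_counts().sort_index()
centers = pd.DataFrame(model.cluster_centers_, columns=features.columns)
summary = centers.assign(n_samples=sizes)
print(summary)
```

Because the centers are reported in standardized units, they are best read as "above or below average" for each column, not as raw values.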
2. Results


3. Dataset