Multivariate cluster analysis
2022-07-06 09:04:00 【Also far away】
1. Code
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
# Read the file
datafile = 'student-mat.xlsx'  # Input file path (the u'' prefix is only needed in Python 2 for paths with Chinese characters)
outfile = 'stu.xlsx'  # Output file path
# The input is an Excel file, so use read_excel; for a CSV file, use read_csv.
# read_excel already returns a DataFrame, so no extra conversion is needed.
d = pd.read_excel(datafile)
# Clustering
n = 5  # Number of clusters
mod = KMeans(n_clusters=n)
mod.fit(d)  # The cluster label of each row is stored in mod.labels_
# Count the samples in each cluster and find the cluster centers
r1 = pd.Series(mod.labels_).value_counts()  # Number of samples in each cluster
r2 = pd.DataFrame(mod.cluster_centers_)  # Cluster centers
summary = pd.concat([r2, r1], axis=1)  # Rows are aligned on the cluster label
summary.columns = list(d.columns) + ['Cluster size']
print(summary)
# Label each row with the cluster it was assigned to
r = pd.concat([d, pd.Series(mod.labels_, index=d.index)], axis=1)
r.columns = list(d.columns) + ['Cluster']
print(r)
r.to_excel(outfile)  # Only needed if you want to save the result locally
# Visualization: project the data to 2-D with t-SNE and plot each cluster
tsne = TSNE()
# Embed the features only, so the label column does not influence the layout
emb = pd.DataFrame(tsne.fit_transform(d), index=d.index)
styles = ['r.', 'go', 'g*', 'b.', 'b*']  # One marker style per cluster
for label, style in enumerate(styles):
    a = emb[r['Cluster'] == label]
    plt.plot(a[0], a[1], style)
plt.show()
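The script fixes the number of clusters at 5 without showing how that value was chosen. One common way to compare candidate values is the silhouette score. The following is only a minimal sketch: it uses synthetic data from make_blobs as a stand-in for the spreadsheet, and the range of k, the feature scaling step, and the random_state values are assumptions made for illustration, not part of the original post.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data (assumption): 300 samples, 4 features.
X, _ = make_blobs(n_samples=300, n_features=4, centers=3, random_state=0)

# K-means is distance based, so put every feature on the same scale first.
X_scaled = StandardScaler().fit_transform(X)

# Fit K-means for several candidate cluster counts and score each result.
scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_scaled)
    scores[k] = silhouette_score(X_scaled, labels)

# A higher silhouette score means tighter, better separated clusters.
best_k = max(scores, key=scores.get)
for k, s in scores.items():
    print(f"k={k}: silhouette={s:.3f}")
print("best k:", best_k)
```

To try this on the real data, X would be replaced by the DataFrame read from the Excel file, provided all of its columns are numeric.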
2. Results


3. Dataset
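KMeans only accepts numeric input, so any text columns in the spreadsheet have to be encoded before clustering. The post does not show the contents of student-mat.xlsx, so the sketch below uses a small made-up table; the column names sex, age and absences are hypothetical and chosen only for illustration. It shows one possible preparation: one-hot encoding with pd.get_dummies, then standardizing with StandardScaler.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Toy frame with hypothetical columns; this is not the real student-mat.xlsx.
df = pd.DataFrame({
    "sex": ["F", "M", "F", "M"],
    "age": [15, 16, 17, 18],
    "absences": [0, 4, 10, 2],
})

# One-hot encode the text column so that every column is numeric.
encoded = pd.get_dummies(df, columns=["sex"], dtype=float)

# Standardize so that no single column dominates the Euclidean distance.
scaled = pd.DataFrame(
    StandardScaler().fit_transform(encoded),
    columns=encoded.columns,
    index=encoded.index,
)
print(scaled.round(2))
```

The resulting frame can be passed to KMeans in place of the raw data. Whether one-hot encoding suits a given column is a judgment call that depends on the data.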