Plant Leaf Recognition with Machine Learning
2022-07-06 06:29:00 【啥也不会(sybh)】
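Plant leaf recognition: given the leaf dataset "叶子形状.csv", each of the three leaf characteristics (margin, shape, and texture) is described by 64 numeric variables, for 64*3 = 192 variables in total. In addition, one categorical variable records the plant species of each leaf, giving 193 variables overall. Apply feature selection methods and compare the similarities and differences among their results (20 points). Then build a model to recognize the leaf shapes (30 points).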
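Approach
1. Data analysis and visualization
2. Feature engineering (select features based on the correlation matrix; this includes preprocessing the data, filling in missing values, and normalizing the data)
3. Validate and analyze with machine learning models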
1 Import packages
import pandas as pd
import numpy as np
# matplotlib and seaborn are needed for the correlation heatmap below
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import svm
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
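2 Plot the correlation matrix (features are selected for feature engineering based on it)
Train = pd.read_csv("叶子形状.csv")
# map_dic is not defined in the original post; it is assumed here to map
# each species name to an integer code so that species can enter the correlation matrix
map_dic = {name: code for code, name in enumerate(sorted(Train['species'].unique()))}
Train['species'] = Train['species'].map(map_dic)
# 'id' is only a row identifier, so drop it before building the feature matrix
Train.drop(['id'], inplace=True, axis=1)
X = Train.drop(['species'], axis=1)
Y = Train['species']
Train_ture = Train['species']
# Plot the correlation matrix
corr = Train.corr()
f, ax = plt.subplots(figsize=(25, 25))
cmap = sns.diverging_palette(220, 10, as_cmap=True)
sns.heatmap(corr, cmap=cmap, vmax=.3, center=0,
            square=True, linewidths=.5)
plt.show()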
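The heading above mentions selecting features from the correlation matrix, but the post does not show that step. One common way to do it is to drop one feature from every highly correlated pair. The following is a minimal sketch under stated assumptions: the helper `drop_correlated`, the 0.95 threshold, and the toy columns `a`, `b`, `c` are all hypothetical and are not taken from the original post.

```python
import numpy as np
import pandas as pd

def drop_correlated(features: pd.DataFrame, threshold: float = 0.95) -> list:
    """Return column names to drop so that no remaining pair has |r| > threshold."""
    corr = features.corr().abs()
    # Keep only the strict upper triangle so each pair is considered once
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    # A column is dropped if it is highly correlated with any earlier column
    return [col for col in upper.columns if (upper[col] > threshold).any()]

# Toy data (hypothetical): "b" is almost a copy of "a", "c" is independent noise
rng = np.random.default_rng(0)
a = rng.normal(size=200)
demo = pd.DataFrame({
    "a": a,
    "b": a + rng.normal(scale=0.01, size=200),
    "c": rng.normal(size=200),
})

to_drop = drop_correlated(demo, threshold=0.95)
reduced = demo.drop(columns=to_drop)
print("dropped:", to_drop)
print("kept:", list(reduced.columns))
```

On the leaf data the same helper could presumably be called on `X`, with the threshold tuned by checking validation accuracy.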
Check for missing values
# True if any cell in the table is missing
Train.isnull().values.any()
# False
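The check above returns False, so nothing has to be filled in for this dataset. If missing values were present, mean imputation would be one simple option. This is a sketch on toy data; the column names `margin1` and `shape1` and their values are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Toy table with two missing cells (hypothetical values)
toy = pd.DataFrame({
    "margin1": [0.1, np.nan, 0.3],
    "shape1": [1.0, 2.0, np.nan],
})
print("has missing values:", toy.isnull().values.any())

# Replace each missing cell with the mean of its column
imputer = SimpleImputer(strategy="mean")
filled = pd.DataFrame(imputer.fit_transform(toy), columns=toy.columns)
print(filled)
```

In a real train/test workflow the imputer, like the scaler, should be fitted on the training set only.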
Train/test split (80% training set, 20% test set)
x_train,x_test,y_train,y_test=train_test_split(X,Y,test_size=0.2,random_state=123)
Standardize the data
standerScaler = StandardScaler()
# Fit the scaler on the training set only, then apply the same statistics to the test set
x_train = standerScaler.fit_transform(x_train)
x_test = standerScaler.transform(x_test)
3 Dimensionality reduction with PCA
pca = PCA(n_components=0.9)
x_train_1 = pca.fit_transform(x_train)
x_test_1 = pca.transform(x_test)
## 44 features remain
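Passing a float such as `n_components=0.9` asks scikit-learn's `PCA` to keep the smallest number of components whose cumulative explained variance ratio exceeds that fraction. The sketch below reproduces the count by hand; it uses synthetic data from `make_classification` as a stand-in for the leaf features, so the resulting number is not the 44 reported above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the standardized leaf features (assumption)
X_demo, _ = make_classification(n_samples=300, n_features=40, n_informative=10,
                                n_redundant=20, random_state=0)
X_demo = StandardScaler().fit_transform(X_demo)

# Fit all components and accumulate the explained variance ratios
pca_full = PCA(svd_solver="full").fit(X_demo)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)

# Smallest k whose cumulative ratio exceeds 0.9
k_manual = int(np.searchsorted(cumulative, 0.9, side="right") + 1)

# Let PCA choose the count itself from the 0.9 threshold
pca_90 = PCA(n_components=0.9).fit(X_demo)
print("manual:", k_manual, "PCA(n_components=0.9):", pca_90.n_components_)
```

Plotting `cumulative` against the component index is a quick way to judge whether 0.9 is a sensible cut-off for a given dataset.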
4 KNN with grid search optimization, before and after PCA
from sklearn.neighbors import KNeighborsClassifier
knn_clf0 = KNeighborsClassifier()
knn_clf0.fit(x_train, y_train)
print('KNeighborsClassifier')
y_predict = knn_clf0.predict(x_test)
score = accuracy_score(y_test, y_predict)
print("Accuracy: {:.4%}".format(score))
print("After PCA")
knn_clf1 = KNeighborsClassifier()
knn_clf1.fit(x_train_1, y_train)
print('KNeighborsClassifier')
y_predict = knn_clf1.predict(x_test_1)
score = accuracy_score(y_test, y_predict)
print("Accuracy: {:.4%}".format(score))
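The heading for this step mentions grid search, while the code above fits KNN with its default parameters. A grid search over KNN hyperparameters might look like the following sketch. The parameter grid, the 5-fold setting, and the synthetic data are assumptions chosen for illustration and are not values from the original post.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the leaf data (assumption)
X_demo, y_demo = make_classification(n_samples=400, n_features=20, n_informative=8,
                                     n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=123, stratify=y_demo)

# Scaling sits inside the pipeline so each CV fold is scaled on its own training part
pipe = Pipeline([("scale", StandardScaler()), ("knn", KNeighborsClassifier())])

# Hypothetical grid: number of neighbors and vote weighting
param_grid = {
    "knn__n_neighbors": [1, 3, 5, 7, 9],
    "knn__weights": ["uniform", "distance"],
}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_tr, y_tr)

test_accuracy = search.score(X_te, y_te)
print("Best parameters:", search.best_params_)
print("Accuracy: {:.4%}".format(test_accuracy))
```

Keeping the scaler inside the pipeline avoids leaking validation-fold statistics into the preprocessing during cross-validation.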
5 SVC
svc_clf = SVC(probability=True)
svc_clf.fit(x_train, y_train)
print("*"*30)
print('SVC')
y_predict = svc_clf.predict(x_test)
score = accuracy_score(y_test, y_predict)
print("Accuracy: {:.4%}".format(score))
svc_clf1 = SVC(probability=True)
svc_clf1.fit(x_train_1, y_train)
print("*"*30)
print('SVC')
y_predict1 = svc_clf1.predict(x_test_1)
score = accuracy_score(y_test, y_predict1)
print("Accuracy: {:.4%}".format(score))
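A single 80/20 split gives only one accuracy figure per model, so a difference between the two SVC runs may come from the split itself. As a rough sketch, cross-validation can compare SVC with and without PCA more robustly. The data here is synthetic and the 5-fold setting is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the leaf data (assumption)
X_demo, y_demo = make_classification(n_samples=300, n_features=30, n_informative=10,
                                     n_classes=3, random_state=0)

# Same classifier, with and without the PCA step
svc_plain = make_pipeline(StandardScaler(), SVC())
svc_pca = make_pipeline(StandardScaler(), PCA(n_components=0.9), SVC())

scores_plain = cross_val_score(svc_plain, X_demo, y_demo, cv=5)
scores_pca = cross_val_score(svc_pca, X_demo, y_demo, cv=5)

print("SVC:       {:.4%} (+/- {:.4%})".format(scores_plain.mean(), scores_plain.std()))
print("PCA + SVC: {:.4%} (+/- {:.4%})".format(scores_pca.mean(), scores_pca.std()))
```

Comparing the mean scores together with their spread shows whether PCA helps, hurts, or makes no clear difference for the model.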

6 Logistic regression
from sklearn.linear_model import LogisticRegressionCV
# One-vs-rest logistic regression; C is chosen by 2-fold CV over 20 log-spaced values
lr = LogisticRegressionCV(multi_class="ovr",
                          fit_intercept=True,
                          Cs=np.logspace(-2,2,20),
                          cv=2,
                          penalty="l2",
                          solver="lbfgs",
                          tol=0.01)
lr.fit(x_train,y_train)
print('Logistic regression')
y_predict = lr.predict(x_test)
score = accuracy_score(y_test, y_predict)
print("Accuracy: {:.4%}".format(score))

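Logistic regression achieved the highest accuracy, 98.65%.
Feature selection and principal component analysis do not necessarily improve accuracy.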
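The task statement also asks for a comparison of feature selection methods, which the code in this post does not show. One possible way to compare them is to select the top k features with several methods and measure how much the selected sets overlap. The sketch below makes several assumptions: the three methods (ANOVA F-test, mutual information, random forest importance), k = 10, and the synthetic data are illustrative choices and not the author's.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif

# Synthetic stand-in for the leaf data (assumption)
X_demo, y_demo = make_classification(n_samples=400, n_features=30, n_informative=6,
                                     n_redundant=4, n_classes=3, random_state=0)
k = 10  # number of features each method keeps (hypothetical)

# Method 1: ANOVA F-test filter
anova = SelectKBest(f_classif, k=k).fit(X_demo, y_demo)
anova_idx = set(np.flatnonzero(anova.get_support()).tolist())

# Method 2: mutual information filter
mi_scores = mutual_info_classif(X_demo, y_demo, random_state=0)
mi_idx = set(np.argsort(mi_scores)[-k:].tolist())

# Method 3: random forest feature importance
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_demo, y_demo)
rf_idx = set(np.argsort(forest.feature_importances_)[-k:].tolist())

def jaccard(a: set, b: set) -> float:
    """Overlap of two index sets: 1.0 means identical, 0.0 means disjoint."""
    return len(a & b) / len(a | b)

print("ANOVA vs MI:     {:.2f}".format(jaccard(anova_idx, mi_idx)))
print("ANOVA vs forest: {:.2f}".format(jaccard(anova_idx, rf_idx)))
print("MI vs forest:    {:.2f}".format(jaccard(mi_idx, rf_idx)))
print("Selected by all three:", sorted(anova_idx & mi_idx & rf_idx))
```

Features chosen by all methods are the most defensible to keep, while features picked by only one method deserve a closer look.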