Plant Leaf Recognition with Machine Learning
2022-07-06 06:29:00 【啥也不会(sybh)】
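Plant leaf recognition: given the leaf dataset "叶子形状.csv", the margin, shape, and texture of each leaf are each described by 64 numeric variables (64 × 3 = 192 variables in total). In addition, one categorical variable records the plant species each leaf belongs to, for 193 variables in all. Apply feature selection methods and compare the similarities and differences among their results (20 points). Then build a model to recognize the leaves (30 points).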
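Approach
1. Data analysis and visualization
2. Feature engineering (select features based on the correlation matrix; this includes preprocessing the data, handling missing values, and standardizing the data)
3. Validate and analyze with machine learning models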
1 Import packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt  # needed for the correlation heatmap figure
import seaborn as sns            # needed for sns.heatmap
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
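2 Plot the correlation matrix (features are selected for feature engineering based on it)
Train = pd.read_csv("叶子形状.csv")
# Drop the row identifier first so that it is not used as a feature
Train.drop(['id'], inplace=True, axis=1)
# map_dic is not defined in the original post; it is assumed here to map
# each species name to an integer code
map_dic = {name: code for code, name in enumerate(sorted(Train['species'].unique()))}
Train['species'] = Train['species'].map(map_dic)
X = Train.drop(['species'], axis=1)
Y = Train['species']
Train_true = Train['species']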
# Plot the correlation matrix
corr = Train.corr()
f, ax = plt.subplots(figsize=(25, 25))
cmap = sns.diverging_palette(220, 10, as_cmap=True)
sns.heatmap(corr, cmap=cmap, vmax=.3, center=0,
            square=True, linewidths=.5)
plt.show()
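One common way to turn a correlation matrix into a feature selection step is to drop one feature from every highly correlated pair. The post does not show how the features were actually chosen, so the following is only a minimal sketch: the helper `drop_correlated_features`, the 0.95 threshold, and the synthetic columns are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def drop_correlated_features(features: pd.DataFrame, threshold: float = 0.95) -> list:
    """Return columns to drop so that no remaining pair has |corr| > threshold."""
    corr = features.corr().abs()
    # Keep only the strict upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    # A column is dropped if it is highly correlated with any earlier column.
    return [col for col in upper.columns if (upper[col] > threshold).any()]


# Synthetic stand-in for the leaf features (hypothetical column names).
rng = np.random.default_rng(0)
demo = pd.DataFrame({"margin1": rng.normal(size=200),
                     "texture1": rng.normal(size=200)})
# shape1 is a near-duplicate of margin1, so it should be flagged.
demo["shape1"] = demo["margin1"] * 2.0 + rng.normal(scale=0.01, size=200)

to_drop = drop_correlated_features(demo, threshold=0.95)
reduced = demo.drop(columns=to_drop)
print(to_drop)
```

On the real data the same helper could be called on `X`; the threshold would need to be tuned, since a stricter value removes more columns.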
Check for missing values
Train.isnull().any().any()
# False
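The check above finds no missing values in this dataset, so nothing needs to be filled. If gaps were present, one option would be median imputation. The sketch below uses a small synthetic frame with hypothetical column names; the choice of the median strategy is an assumption, not something the post prescribes.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Synthetic frame with gaps; the column names are hypothetical.
demo = pd.DataFrame({"margin1": [0.1, np.nan, 0.3],
                     "shape1": [1.0, 2.0, np.nan]})

# Count the missing values in each column.
print(demo.isnull().sum())

# Replace each gap with the median of its column.
imputer = SimpleImputer(strategy="median")
filled = pd.DataFrame(imputer.fit_transform(demo), columns=demo.columns)
print(filled)
```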
Split into training and test sets (80% training, 20% test)
x_train,x_test,y_train,y_test=train_test_split(X,Y,test_size=0.2,random_state=123)
Standardize the data
standardScaler = StandardScaler()
# Fit the scaler on the training set only, then reuse its statistics on the test set
x_train = standardScaler.fit_transform(x_train)
x_test = standardScaler.transform(x_test)
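The scaler should learn its mean and standard deviation from the training rows only; refitting it on the test rows would scale the two sets differently and leak test information into the evaluation. A minimal sketch on synthetic data (the array names and sizes are assumptions for illustration):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the leaf features.
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
y_demo = rng.integers(0, 2, size=100)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=123)

scaler = StandardScaler()
X_tr_std = scaler.fit_transform(X_tr)  # learn mean/std from the training rows only
X_te_std = scaler.transform(X_te)      # reuse those statistics for the test rows

# The training columns are centered at zero; the test columns are only
# approximately centered, because they use the training statistics.
print(X_tr_std.mean(axis=0))
print(X_te_std.mean(axis=0))
```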
3 Dimensionality reduction with PCA
pca = PCA(n_components=0.9)
x_train_1 = pca.fit_transform(x_train)
x_test_1 = pca.transform(x_test)
## 44 features (principal components) remain after PCA
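Passing a float such as `n_components=0.9` asks PCA to keep the smallest number of components whose cumulative explained variance exceeds 90%. The sketch below reproduces that count by hand from `explained_variance_ratio_`; the synthetic data is an assumption for illustration, so the count it prints is unrelated to the 44 obtained on the leaf data.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic data: 10 columns with decreasing variance.
X_demo = rng.normal(size=(300, 10)) * np.linspace(5.0, 0.5, 10)

# Fit a full PCA and accumulate the explained variance ratios.
full = PCA().fit(X_demo)
cumulative = np.cumsum(full.explained_variance_ratio_)

# Smallest k whose cumulative explained variance exceeds 90%.
k = int(np.searchsorted(cumulative, 0.9, side="right") + 1)

# Let PCA pick the number of components from the same 90% target.
auto = PCA(n_components=0.9).fit(X_demo)
print(k, auto.n_components_)
```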
4 KNN with grid search optimization, before and after PCA
from sklearn.neighbors import KNeighborsClassifier
knn_clf0 = KNeighborsClassifier()
knn_clf0.fit(x_train, y_train)
print('KNeighborsClassifier')
y_predict = knn_clf0.predict(x_test)
score = accuracy_score(y_test, y_predict)
print("Accuracy: {:.4%}".format(score))
print("After PCA")
knn_clf1 = KNeighborsClassifier()
knn_clf1.fit(x_train_1, y_train)
print('KNeighborsClassifier')
y_predict = knn_clf1.predict(x_test_1)
score = accuracy_score(y_test, y_predict)
print("Accuracy: {:.4%}".format(score))
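The code above fits `KNeighborsClassifier` with its default hyperparameters. A grid search over the number of neighbors and the weighting scheme could be sketched as follows; the parameter grid, the 5-fold setting, and the synthetic data from `make_classification` are assumptions for illustration rather than the settings used in the post.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the leaf data.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=20, n_informative=10, n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=123, stratify=y_demo)

# Scaling lives inside the pipeline so it is refit on each training fold.
pipe = Pipeline([("scale", StandardScaler()), ("knn", KNeighborsClassifier())])
param_grid = {
    "knn__n_neighbors": [1, 3, 5, 7],
    "knn__weights": ["uniform", "distance"],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_tr, y_tr)

test_accuracy = search.score(X_te, y_te)
print(search.best_params_)
print("Accuracy: {:.4%}".format(test_accuracy))
```

Keeping the scaler inside the pipeline means each cross-validation fold is standardized with its own training statistics, which avoids the leakage that scaling the whole training set up front would introduce.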
5 SVC
svc_clf = SVC(probability=True)
svc_clf.fit(x_train, y_train)
print("*"*30)
print('SVC')
y_predict = svc_clf.predict(x_test)
score = accuracy_score(y_test, y_predict)
print("Accuracy: {:.4%}".format(score))
svc_clf1 = SVC(probability=True)
svc_clf1.fit(x_train_1, y_train)
print("*"*30)
print('SVC')
y_predict1 = svc_clf1.predict(x_test_1)
score = accuracy_score(y_test, y_predict1)
print("Accuracy: {:.4%}".format(score))
6 Logistic regression
from sklearn.linear_model import LogisticRegressionCV
lr = LogisticRegressionCV(multi_class="ovr",
                          fit_intercept=True,
                          Cs=np.logspace(-2, 2, 20),
                          cv=2,
                          penalty="l2",
                          solver="lbfgs",
                          tol=0.01)
lr.fit(x_train,y_train)
print('Logistic regression')
y_predict = lr.predict(x_test)
score = accuracy_score(y_test, y_predict)
print("Accuracy: {:.4%}".format(score))
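Logistic regression achieved the highest accuracy of the models tried, 98.65%.
Feature selection and principal component analysis do not necessarily improve accuracy.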
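One way to check this kind of claim is to cross-validate the same classifier on all features, on a filtered subset, and on PCA components. The sketch below does this on synthetic data; `SelectKBest` with `f_classif`, `k=10`, and the 5-fold setting are assumptions for illustration, and the scores it prints say nothing about the leaf dataset itself.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the leaf data.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=30, n_informative=10, n_classes=3, random_state=0)

# Same classifier, three different feature treatments.
candidates = {
    "all features": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)),
    "SelectKBest(k=10)": make_pipeline(
        StandardScaler(), SelectKBest(f_classif, k=10),
        LogisticRegression(max_iter=1000)),
    "PCA(0.9)": make_pipeline(
        StandardScaler(), PCA(n_components=0.9),
        LogisticRegression(max_iter=1000)),
}

results = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X_demo, y_demo, cv=5, scoring="accuracy")
    results[name] = scores.mean()
    print("{}: {:.4%}".format(name, results[name]))
```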