Machine learning plant leaf recognition
2022-07-06 06:39:00 【Nothing (sybh)】
Plant leaf identification: you are given the leaf data set "Leaf shape.csv". Each leaf is described by three groups of features (edge, shape, and texture), with 64 numeric variables per group (64*3 = 192 variables in total). In addition, there is 1 categorical variable recording the plant species to which each leaf belongs, for 193 variables in all. Apply feature selection methods to the data and compare the similarities and differences between their results (20 points). Then, through data modeling, complete the leaf recognition task (30 points).
Contents
3 PCA dimensionality reduction
4 KNN grid search optimization, before and after PCA
Approach
1. Analyze and visualize the data.
2. Feature engineering: select features according to the correlation matrix. This includes data preprocessing such as filling in missing values and normalizing the data.
3. Train machine learning models and compare their results.
1 Import packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt  # needed for plt.subplots / plt.show below
import seaborn as sns            # needed for the correlation heatmap below
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
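2 Draw the correlation matrix (features are selected for feature engineering according to the correlation matrix)
Train = pd.read_csv("Leaf shape.csv")
# Drop the id column first so that it is not used as a feature
Train.drop(['id'], inplace=True, axis=1)
# Encode species names as integers.
# map_dic was not defined in the original post; this definition is an assumption.
map_dic = {name: idx for idx, name in enumerate(sorted(Train['species'].unique()))}
Train['species'] = Train['species'].map(map_dic)
X = Train.drop(['species'], axis=1)
Y = Train['species']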
# Draw the correlation matrix
corr = Train.corr()
f, ax = plt.subplots(figsize=(25, 25))
cmap = sns.diverging_palette(220, 10, as_cmap=True)
sns.heatmap(corr, cmap=cmap, vmax=.3, center=0,
            square=True, linewidths=.5)
plt.show()
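One common way to turn a correlation matrix into an actual feature selection step is to drop one feature from every highly correlated pair. The following is a minimal sketch of that idea, not the method used in this post: the helper `drop_correlated`, the 0.9 threshold, and the column names in the synthetic demo frame are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def drop_correlated(features: pd.DataFrame, threshold: float = 0.9) -> list:
    """Return the columns to keep after dropping one column of each highly correlated pair."""
    corr = features.corr().abs()
    # Keep only the strict upper triangle so that each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return [col for col in features.columns if col not in to_drop]


# Synthetic stand-in for the leaf features (hypothetical column names).
rng = np.random.default_rng(0)
base = rng.normal(size=200)
demo = pd.DataFrame({
    "margin1": base,
    "margin2": base + rng.normal(scale=0.01, size=200),  # near-duplicate of margin1
    "shape1": rng.normal(size=200),
})

kept = drop_correlated(demo, threshold=0.9)
print(kept)
```

On the real data the same helper could be called on `X`, and the classifiers below could then be trained on `X[kept]` to compare against the full feature set.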
Check for missing values
print(Train.isnull().values.any())
# False: the data contains no missing values, so nothing needs to be filled in
Split into training and test sets (80% training set, 20% test set)
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=123)
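Standardize the data
standerScaler = StandardScaler()
# Fit the scaler on the training set only
x_train = standerScaler.fit_transform(x_train)
# Reuse the training-set mean and variance for the test set to avoid data leakage
x_test = standerScaler.transform(x_test)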
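Fitting the scaler on the training data alone keeps test-set statistics out of the model. A scikit-learn pipeline enforces this automatically. The sketch below illustrates the idea on scikit-learn's built-in wine data set, which stands in for the leaf data here; the variable names are illustrative only.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data set; replace with the leaf features X and labels Y.
X_demo, y_demo = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=123
)

# The pipeline fits the scaler on the training data only and
# applies the same transformation when predicting on the test data.
pipe = make_pipeline(StandardScaler(), KNeighborsClassifier())
pipe.fit(X_tr, y_tr)

acc = pipe.score(X_te, y_te)
scaler = pipe.named_steps["standardscaler"]
print("Accuracy: {:.4%}".format(acc))
```

After fitting, `scaler.mean_` holds the column means of `X_tr` only, which is the same behavior as calling `fit_transform` on the training set and `transform` on the test set by hand.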
3 PCA dimensionality reduction
# Keep as many principal components as needed to explain 90% of the variance
pca = PCA(n_components=0.9)
x_train_1 = pca.fit_transform(x_train)
x_test_1 = pca.transform(x_test)
# 44 components are retained
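When `n_components` is a float between 0 and 1, PCA keeps the smallest number of components whose cumulative explained variance exceeds that fraction. The sketch below reproduces that count by hand. It uses scikit-learn's built-in digits data set as a stand-in for the leaf features, so the number it prints is not the 44 reported above.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Stand-in data set; replace with the standardized leaf features.
X_demo, _ = load_digits(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X_demo)

# Fit PCA with all components and accumulate the explained variance ratios.
pca_full = PCA(svd_solver="full").fit(X_scaled)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)

# Smallest number of components whose cumulative ratio exceeds 0.9.
n_needed = int(np.searchsorted(cumulative, 0.9, side="right") + 1)

# Let PCA choose the number of components from the variance threshold.
pca_90 = PCA(n_components=0.9, svd_solver="full").fit(X_scaled)
print(n_needed, pca_90.n_components_)
```

Plotting `cumulative` against the component index is a quick way to judge whether 0.9 is a sensible threshold or whether a lower value would cost little variance.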
4 KNN grid search optimization, before and after PCA
from sklearn.neighbors import KNeighborsClassifier
knn_clf0 = KNeighborsClassifier()
knn_clf0.fit(x_train, y_train)
print('KNeighborsClassifier')
y_predict = knn_clf0.predict(x_test)
score = accuracy_score(y_test, y_predict)
print("Accuracy: {:.4%}".format(score))
print("After PCA")
knn_clf1 = KNeighborsClassifier()
knn_clf1.fit(x_train_1, y_train)
print('KNeighborsClassifier')
y_predict = knn_clf1.predict(x_test_1)
score = accuracy_score(y_test, y_predict)
print("Accuracy: {:.4%}".format(score))
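The two KNN models above use the default hyperparameters. The grid search named in the section heading could be sketched as follows. This is an assumption about what such a search might look like, not the author's code: the parameter grid, the 5-fold setting, and the stand-in wine data set are all illustrative choices.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data set; replace with the leaf features X and labels Y.
X_demo, y_demo = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=123, stratify=y_demo
)

# Scaling lives inside the pipeline so each CV fold is scaled independently.
pipe = Pipeline([("scale", StandardScaler()), ("knn", KNeighborsClassifier())])

# Hypothetical search space for illustration.
param_grid = {
    "knn__n_neighbors": [1, 3, 5, 7, 9],
    "knn__weights": ["uniform", "distance"],
    "knn__p": [1, 2],  # 1 = Manhattan distance, 2 = Euclidean distance
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_tr, y_tr)

test_acc = search.score(X_te, y_te)
print(search.best_params_)
print("Test accuracy: {:.4%}".format(test_acc))
```

With many classes and few samples per class, the number of folds has to stay at or below the smallest class count in the training set, so `cv` may need to be reduced for the leaf data.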
5 SVC
# SVC on the standardized features (before PCA)
svc_clf = SVC(probability=True)
svc_clf.fit(x_train, y_train)
print("*"*30)
print('SVC')
y_predict = svc_clf.predict(x_test)
score = accuracy_score(y_test, y_predict)
print("Accuracy: {:.4%}".format(score))
# SVC on the PCA-reduced features
svc_clf1 = SVC(probability=True)
svc_clf1.fit(x_train_1, y_train)
print("*"*30)
print('SVC (after PCA)')
y_predict1 = svc_clf1.predict(x_test_1)
score = accuracy_score(y_test, y_predict1)
print("Accuracy: {:.4%}".format(score))
6 Logistic regression
from sklearn.linear_model import LogisticRegressionCV
# One-vs-rest logistic regression; the regularization strength C is chosen by 2-fold cross-validation
lr = LogisticRegressionCV(multi_class="ovr",
                          fit_intercept=True,
                          Cs=np.logspace(-2, 2, 20),
                          cv=2,
                          penalty="l2",
                          solver="lbfgs",
                          tol=0.01)
lr.fit(x_train, y_train)
print('Logistic regression')
y_predict = lr.predict(x_test)
score = accuracy_score(y_test, y_predict)
print("Accuracy: {:.4%}".format(score))
Logistic regression achieves the highest accuracy, 98.65%.
Feature selection and principal component analysis do not necessarily improve accuracy.
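The task statement also asks for a comparison of feature selection results. One possible way to make such a comparison is to pick the top k features with two different methods and look at the overlap. The sketch below contrasts a univariate ANOVA F-test filter with random forest importances. Both methods, the value of k, and the synthetic data are assumptions for illustration, and none of this comes from the original post.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic stand-in for the leaf features and species labels.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=20, n_informative=5,
    n_redundant=2, n_classes=3, random_state=0,
)
k = 8  # number of features to keep with each method (arbitrary choice)

# Method 1: filter method, ranks each feature by its ANOVA F-score.
filter_sel = SelectKBest(score_func=f_classif, k=k).fit(X_demo, y_demo)
filter_idx = set(int(i) for i in np.flatnonzero(filter_sel.get_support()))

# Method 2: embedded method, ranks features by random forest importance.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_demo, y_demo)
forest_idx = set(int(i) for i in np.argsort(forest.feature_importances_)[-k:])

shared = sorted(filter_idx & forest_idx)
only_filter = sorted(filter_idx - forest_idx)
only_forest = sorted(forest_idx - filter_idx)

print("Selected by both:     ", shared)
print("Only by F-test filter:", only_filter)
print("Only by random forest:", only_forest)
```

Features chosen by both methods are the safest candidates to keep. Features chosen by only one method show where the two criteria disagree, for example when a feature is only useful in combination with others.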