Master iris classification with logistic regression in 5 minutes
2022-07-06 14:35:00 【ブリンク】
This article takes about 5 minutes and walks you through the most classic machine learning example: classifying iris flowers.
Overview
The scikit-learn library, together with NumPy and pandas, makes machine learning simple, and seaborn, which is built on Matplotlib, makes visualization easier.
First, import the libraries you will use:
from sklearn import datasets
# sklearn ships with several built-in data sets, including iris
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.linear_model import LogisticRegression
# The logistic regression estimator used for learning
from sklearn.model_selection import train_test_split
# Splits the data into a training set and a test set
Loading the data
sklearn provides a few data sets for practice, including the iris data used here. Calling the load_iris() function of the datasets module is all it takes:
iris = datasets.load_iris()
The returned iris object is a scikit-learn Bunch, a dictionary-like container. Its keys() method shows what it contains:
>>> iris.keys()
dict_keys(['data', 'target', 'frame',
'target_names', 'DESCR', 'feature_names',
'filename', 'data_module'])
The data set contains 150 samples. data holds the measurements, and target holds the labels, that is, which kind of iris each flower is. There are 3 species in the data: setosa, versicolor and virginica. Their names are stored in target_names. feature_names holds the names of the features, that is, the measurements that describe each flower: sepal length, sepal width, petal length and petal width. The remaining keys are not used in this example, so they are not covered here.
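To make the description above concrete, here is a minimal sketch that prints the shape and names of each part of the data set. It only uses attributes listed in the keys() output above.

```python
from sklearn import datasets

iris = datasets.load_iris()

# Feature matrix: one row per flower, one column per measurement
print(iris.data.shape)    # (150, 4)
# Label vector: one integer class id (0, 1 or 2) per flower
print(iris.target.shape)  # (150,)
# Species names, indexed by class id
print(", ".join(iris.target_names))  # setosa, versicolor, virginica
# Column names for the four measurements
print(iris.feature_names)
```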
Next, extract the data and store it in a pandas DataFrame (the labels stay in iris.target):
>>> data = iris.data
>>> data = pd.DataFrame(data, columns=iris.feature_names)
# Use the feature names as the column names
>>> data.head()
   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
0                5.1               3.5                1.4               0.2
1                4.9               3.0                1.4               0.2
2                4.7               3.2                1.3               0.2
3                4.6               3.1                1.5               0.2
4                5.0               3.6                1.4               0.2
Data visualization
seaborn's pairplot() function gives a quick view of the relationship between every pair of variables. The panels on the diagonal, where a variable would be paired with itself, show that variable's own distribution:
sns.pairplot(data)
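As a numeric complement to the plot, the following sketch attaches the species name to each row and prints the mean of every feature per species. The names labeled, species and class_means are illustrative and do not appear in the original article.

```python
import pandas as pd
from sklearn import datasets

iris = datasets.load_iris()
data = pd.DataFrame(iris.data, columns=iris.feature_names)

# Add a readable species column by indexing the name array with the integer labels
labeled = data.copy()
labeled["species"] = iris.target_names[iris.target]

# Mean of every feature within each species
class_means = labeled.groupby("species").mean()
print(class_means)
```

With such a column in place, passing hue="species" to sns.pairplot(labeled, hue="species") would color the points by species, which makes the separation between the classes easier to see.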

Model
Use a sklearn estimator to build a logistic regression model:
model = LogisticRegression()
Data preprocessing
First, split the data into a training set and a test set so that the model can be tested on data it has not seen. The train_test_split() function does this easily. It returns, in order, the training features, the test features, the training labels and the test labels:
x_train, x_test, y_train, y_test = train_test_split(data, iris.target, train_size=0.8)
# 80% is used as the training set, the rest as the test set
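Because the split is random, each run gives a different result. A minimal sketch of a reproducible split follows. The random_state and stratify arguments are additions not used in the original article, and the value 0 is an arbitrary choice.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()

x_train, x_test, y_train, y_test = train_test_split(
    iris.data,
    iris.target,
    train_size=0.8,
    random_state=0,        # assumption: any fixed integer makes the split repeatable
    stratify=iris.target,  # keep the class proportions equal in both parts
)

print(x_train.shape, x_test.shape)  # (120, 4) (30, 4)
print(np.bincount(y_test))          # [10 10 10]
```

Stratifying is optional here, but on a small data set it avoids a test set that happens to contain very few flowers of one species.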
Training the model
The estimator's fit() method trains the model:
model.fit(x_train,y_train)
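After fitting, the estimator exposes what it has learned. The sketch below fits on all 150 samples for brevity and inspects the learned parameters. Setting max_iter=1000 is an assumption made here to give the solver room to converge. The original article uses the default.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()

# max_iter=1000 is an assumption, not part of the original article
model = LogisticRegression(max_iter=1000)
model.fit(iris.data, iris.target)

print(model.classes_)          # [0 1 2]
print(model.coef_.shape)       # (3, 4): one weight per class and feature
print(model.intercept_.shape)  # (3,): one bias term per class
```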
Evaluating the model
The estimator's score() method directly computes the model's accuracy on the test set:
>>> model.score(x_test,y_test)
0.9333333333333333
# The accuracy here is 93.33%. The exact value depends on how the data was split into the training set and the test set
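For a classifier, score() reports the fraction of test samples that are predicted correctly. The sketch below computes that fraction by hand and compares it with score(). The fixed random_state=0 and max_iter=1000 are assumptions added for repeatability, so the printed value need not match the 93.33% above.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, train_size=0.8, random_state=0
)
model = LogisticRegression(max_iter=1000).fit(x_train, y_train)

# Accuracy by hand: share of test samples whose prediction equals the true label
predictions = model.predict(x_test)
manual_accuracy = np.mean(predictions == y_test)

print(manual_accuracy, model.score(x_test, y_test))
```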
Making predictions
The trained model can predict labels for the test set. More generally, once you know the four measurements of an iris, you can use the model to tell which species it belongs to:
>>> model.predict(x_test)
array([1, 2, 1, 2, 1, 0, 2, 1, 0,
       0, 0, 2, 1, 0, 2, 0, 1, 2,
       1, 1, 2, 2, 1, 2, 0, 2, 1, 2, 0, 0])
# 0 means setosa, 1 means versicolor, 2 means virginica
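The integer labels can be turned back into species names by indexing target_names. The sketch below predicts a single flower and also prints the class probabilities. The name new_flower is illustrative, its measurements are copied from the first row of the data set, and the model is fitted on all samples with max_iter=1000 as an assumption for brevity.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
model = LogisticRegression(max_iter=1000).fit(iris.data, iris.target)

# One flower: sepal length, sepal width, petal length, petal width (cm)
new_flower = np.array([[5.1, 3.5, 1.4, 0.2]])

predicted = model.predict(new_flower)          # integer class id
species = str(iris.target_names[predicted][0]) # matching species name
probabilities = model.predict_proba(new_flower)

print(species)
print(probabilities.round(3))  # one probability per class, summing to 1
```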