
5. Logistic regression

2022-07-05 23:38:00 CGOMG

What is logistic regression

Logistic regression is a classification algorithm: it estimates the probability that a sample belongs to the positive class, and despite its name it is used for classification rather than regression.

Application scenarios


The principle of logistic regression

To master logistic regression, you must grasp the following two points:

  • What the input to logistic regression is
  • How the output of logistic regression is interpreted

Input

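The original figure is not available here. As a standard formulation (an assumption rather than content recovered from the figure), the input to logistic regression is the output of a linear model:

h(w) = w1*x1 + w2*x2 + ... + wn*xn + b

In other words, logistic regression takes the result of a linear regression as its input.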

Activation function

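The figure is missing; the activation function used by logistic regression is the sigmoid function, whose standard form (not taken from the figure) is:

g(z) = 1 / (1 + e^(-z))

It squashes the linear output into the interval (0, 1), which is interpreted as the probability of belonging to the positive class; a common default is to predict the positive class when this probability exceeds 0.5.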

Loss and optimization


Loss

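The figures are missing. The loss used by logistic regression is the log-likelihood (cross-entropy) loss; for m samples its standard form (not recovered from the figures) is:

cost = -Σ [ yi * log(h(xi)) + (1 - yi) * log(1 - h(xi)) ]

When the true label yi is 1, the loss is small only if the predicted probability h(xi) is close to 1, and when yi is 0 the loss is small only if h(xi) is close to 0.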

Optimization

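The figure is missing. As in linear regression, the loss is typically minimized with gradient descent (a standard sketch, not content from the figure): the weights are repeatedly updated in the direction that reduces the loss,

w := w - α * ∂cost/∂w

where α is the learning rate.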

API

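The figure describing the API is missing. Below is a minimal sketch of how the scikit-learn estimator is usually constructed; the parameter values shown are current scikit-learn defaults and are listed as an assumption, not as content from the figure.

from sklearn.linear_model import LogisticRegression

estimator = LogisticRegression(
    penalty="l2",    # type of regularization
    C=1.0,           # inverse of the regularization strength (smaller C means stronger regularization)
    solver="lbfgs",  # optimization algorithm used to fit the model
    max_iter=100     # maximum number of iterations for the solver
)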

Tumor prediction case

Dataset introduction

The case uses the Breast Cancer Wisconsin dataset from the UCI Machine Learning Repository: the Class column is 2 for benign and 4 for malignant tumours, and missing values are marked with "?".

Code implementation

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Get the data
names = ['Sample code number', 'Clump Thickness', 'Uniformity of Cell Size', 'Uniformity of Cell Shape','Marginal Adhesion', 'Single Epithelial Cell Size', 'Bare Nuclei', 'Bland Chromatin','Normal Nucleoli', 'Mitoses', 'Class']
data = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data",names=names)
data.head()


# Basic data processing
# Handle missing values (marked with "?" in this dataset)
data = data.replace(to_replace="?", value=np.nan)
data = data.dropna()
# Determine the features and the target
x = data.iloc[:, 1:-1]
y = data["Class"]
# Split the data
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=22, test_size=0.2)
# Feature engineering: standardization
transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)
# Machine learning: train the model
estimator = LogisticRegression()
estimator.fit(x_train, y_train)
# Model evaluation
print("Accuracy:\n", estimator.score(x_test, y_test))
print("Predictions:\n", estimator.predict(x_test))


Evaluation method

Precision and recall

Confusion matrix

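The figure is missing; the standard 2x2 confusion matrix layout (reconstructed from common usage, not from the figure) is:

                     Predicted positive      Predicted negative
Actual positive      TP (true positive)      FN (false negative)
Actual negative      FP (false positive)     TN (true negative)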
The accuracy formula we used before is: Accuracy = (TP + TN) / (TP + TN + FP + FN)

Precision and recall

Precision = TP / (TP + FP)  (of the samples predicted as positive, the proportion that really are positive)

Recall = TP / (TP + FN)  (of the samples that really are positive, the proportion that are predicted as positive)

F1-score

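The figure is missing; the standard F1-score formula (not taken from the figure) is the harmonic mean of precision and recall:

F1 = 2 * Precision * Recall / (Precision + Recall) = 2TP / (2TP + FN + FP)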

Classification report API

sklearn.metrics.classification_report(y_true, y_pred, labels=..., target_names=...) returns a text report with the precision, recall and F1-score for each class.

from sklearn.metrics import classification_report
y_pre = estimator.predict(x_test)
ret = classification_report(y_test, y_pre, labels=(2, 4), target_names=("Benign", "Malignant"))
print(ret)


ROC curve and AUC metric

TPR and FPR

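The figure is missing; the standard definitions (not recovered from the figure) are:

TPR = TP / (TP + FN)  (the true positive rate, identical to recall)
FPR = FP / (FP + TN)  (the proportion of actual negatives that are wrongly predicted as positive)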

ROC curve

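The figure is missing. The ROC curve plots TPR against FPR as the classification threshold varies; the diagonal corresponds to random guessing. Below is a minimal plotting sketch, assuming the y_test and y_pre variables from the tumor example above; it is an illustration added here, not code from the original post.

from sklearn.metrics import roc_curve
import matplotlib.pyplot as plt
import numpy as np

# Map the labels to 0/1 so that class 4 (malignant) is the positive class
y_true = np.where(y_test > 3, 1, 0)
fpr, tpr, thresholds = roc_curve(y_true, y_pre)

plt.plot(fpr, tpr, label="logistic regression")
plt.plot([0, 1], [0, 1], linestyle="--", label="random guess")
plt.xlabel("FPR")
plt.ylabel("TPR")
plt.legend()
plt.show()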

AUC metric

The AUC is the area under the ROC curve. It ranges from 0.5 (equivalent to random guessing) to 1.0 (a perfect classifier), so the closer it is to 1 the better the model ranks positives above negatives.

AUC calculation API

sklearn.metrics.roc_auc_score(y_true, y_score) computes the AUC from the true 0/1 labels and the predicted scores.

from sklearn.metrics import roc_auc_score
# Map the original labels to 0/1: class 4 (malignant) becomes the positive class 1
y_test = np.where(y_test > 3, 1, 0)
roc_auc_score(y_test, y_pre)


Handling class imbalance

pip3 install imbalanced-learn

Prepare class-imbalanced data

from sklearn.datasets import make_classification
import matplotlib.pylab as plt
from collections import Counter

X,Y = make_classification(n_samples=5000,
                          n_features=2,  # total number of features = n_informative + n_redundant + n_repeated
                          n_informative=2,  # number of informative features
                          n_redundant=0,  # number of redundant features (random linear combinations of the informative ones)
                          n_repeated=0,  # number of repeated features (duplicated at random from the informative and redundant ones)
                          n_classes=3,  # number of classes
                          n_clusters_per_class=1,  # number of clusters each class is made up of
                          weights=[0.01,0.05,0.94],  # proportion of samples assigned to each class
                          random_state=0)
X,Y,X.shape


Counter(Y)


#  Data visualization 
plt.scatter(X[:,0],X[:,1],c=Y)
plt.show()


Solutions

The two common approaches, both shown below, are to oversample the minority classes or to undersample the majority class.

Oversampling method

Oversampling balances the classes by increasing the number of minority-class samples.

Random oversampling method

Random oversampling balances the classes by randomly duplicating existing minority-class samples.

from imblearn.over_sampling import RandomOverSampler
ros = RandomOverSampler(random_state=0)
X_resampled,Y_resampled = ros.fit_resample(X,Y)
Counter(Y_resampled)


#  Data visualization 
plt.scatter(X_resampled[:,0],X_resampled[:,1],c=Y_resampled)
plt.show()


Representative oversampling algorithm: SMOTE

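The figures are missing. In brief, SMOTE (Synthetic Minority Over-sampling Technique) does not simply duplicate minority samples; it creates new synthetic samples by interpolating between a minority sample and one of its nearest minority-class neighbours (a standard description, not recovered from the figures):

x_new = x_i + rand(0, 1) * (x_neighbour - x_i)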

from imblearn.over_sampling import SMOTE

X_resampled,Y_resampled = SMOTE().fit_resample(X,Y)
Counter(Y_resampled)


#  Data visualization 
plt.scatter(X_resampled[:,0],X_resampled[:,1],c=Y_resampled)
plt.show()


Undersampling method

Undersampling balances the classes by reducing the number of majority-class samples.

Random undersampling method

Random undersampling randomly discards majority-class samples until the classes are balanced.

from imblearn.under_sampling import RandomUnderSampler
rus = RandomUnderSampler(random_state=0)
X_resampled,Y_resampled = rus.fit_resample(X,Y)
Counter(Y_resampled)


#  Data visualization 
plt.scatter(X_resampled[:,0],X_resampled[:,1],c=Y_resampled)
plt.show()


Original site

Copyright notice
This article was written by [CGOMG]. Please include a link to the original when reprinting. Thank you.
https://yzsam.com/2022/02/202202140307289382.html