5. Logistic regression

2022-07-05 23:38:00 【CGOMG】

What is logistic regression

Application scenarios

The principle of logistic regression

To master logistic regression, you need to understand two points:

What the input to logistic regression is
How to interpret the output of logistic regression

Input

Activation function

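As a minimal sketch, assuming the usual formulation in which the linear-regression output is passed through the sigmoid function, the following shows how an input score becomes a probability and then a class label. The weights, bias, sample values, and the 0.5 threshold are hypothetical and chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    """Map any real-valued score to a value in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights, bias, and samples, chosen only for illustration.
w = np.array([0.8, -0.4])
b = 0.1
X_demo = np.array([[2.0, 1.0], [-1.0, 3.0]])

z = X_demo @ w + b               # linear-regression output, used as the input
p = sigmoid(z)                   # estimated probability of the positive class
labels = (p >= 0.5).astype(int)  # assumed decision threshold of 0.5
print(z, p, labels)
```

Scores above zero map to probabilities above 0.5, so the threshold on the probability is equivalent to checking the sign of the linear score.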

Measuring the loss

Loss

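A small sketch of the log-likelihood (cross-entropy) loss that logistic regression is normally trained with may help here. The function name `log_loss_mean`, the clipping constant, and the sample probabilities are assumptions made for this example.

```python
import numpy as np

def log_loss_mean(y_true, p_pred, eps=1e-15):
    """Mean log loss for binary labels in {0, 1} and predicted probabilities."""
    p = np.clip(p_pred, eps, 1 - eps)  # avoid log(0)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

y_true = np.array([1, 0, 1, 0])
mostly_right = np.array([0.9, 0.1, 0.8, 0.2])  # confident and correct
mostly_wrong = np.array([0.1, 0.9, 0.2, 0.8])  # confident and incorrect

loss_right = log_loss_mean(y_true, mostly_right)
loss_wrong = log_loss_mean(y_true, mostly_wrong)
print(loss_right, loss_wrong)
```

The loss is small when the predicted probability agrees with the true label and grows quickly when a confident prediction is wrong.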

Optimization

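One common way to reduce the loss is gradient descent. The sketch below assumes plain batch gradient descent on the mean log loss; the helper `fit_logistic_gd`, the learning rate, the iteration count, and the toy data are all hypothetical. scikit-learn uses its own solvers, so this illustrates the idea and is not the library's implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_gd(X, y, lr=0.1, n_iter=1000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # last column carries the bias
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ w)
        grad = Xb.T @ (p - y) / len(y)  # gradient of the mean log loss
        w -= lr * grad
    return w

# Toy, already-centered feature: negatives on the left, positives on the right.
X_toy = np.array([[-2.5], [-1.5], [-0.5], [0.5], [1.5], [2.5]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

w_fit = fit_logistic_gd(X_toy, y_toy)
scores = np.hstack([X_toy, np.ones((len(X_toy), 1))]) @ w_fit
pred_toy = (sigmoid(scores) >= 0.5).astype(int)
print(w_fit, pred_toy)
```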

API

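As a hedged illustration of the scikit-learn estimator, the sketch below fits `LogisticRegression` on synthetic data and inspects the fitted attributes. The data comes from `make_classification`, and the parameter values (`C=1.0`, `solver="lbfgs"`, `max_iter=200`) and the `*_demo` variable names are assumptions for this example, not settings taken from the article.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic binary data, used only so the example is self-contained.
X_demo, y_demo = make_classification(n_samples=300, n_features=5, random_state=22)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=22)

scaler = StandardScaler()
X_tr = scaler.fit_transform(X_tr)
X_te = scaler.transform(X_te)

# C is the inverse regularization strength: smaller values regularize more.
clf = LogisticRegression(C=1.0, solver="lbfgs", max_iter=200)
clf.fit(X_tr, y_tr)

proba = clf.predict_proba(X_te)  # one column of probabilities per class
print(clf.coef_.shape, clf.intercept_.shape)
print(proba[:3])
print(clf.score(X_te, y_te))
```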

Tumor prediction case study

Data introduction

Code implementation

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Load the data; the file has no header row, so the column names are supplied explicitly
names = ['Sample code number', 'Clump Thickness', 'Uniformity of Cell Size', 'Uniformity of Cell Shape',
         'Marginal Adhesion', 'Single Epithelial Cell Size', 'Bare Nuclei', 'Bland Chromatin',
         'Normal Nucleoli', 'Mitoses', 'Class']
data = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data", names=names)
data.head()

# Basic data processing
# Handle missing values: the dataset marks them with "?"
data = data.replace(to_replace="?", value=np.nan)
data = data.dropna()
# Select the feature columns and the target column
x = data.iloc[:, 1:-1]
y = data["Class"]
# Split into training and test sets
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=22, test_size=0.2)
# Feature engineering: standardization (fit on the training set, then apply to the test set)
transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)
# Train the model
estmator = LogisticRegression()
estmator.fit(x_train, y_train)
# Evaluate the model
print("Accuracy:\n", estmator.score(x_test, y_test))
print("Predicted values:\n", estmator.predict(x_test))

Evaluation method

Precision and recall

Confusion matrix

The accuracy formula we used earlier is: (TP + TN) / (TP + FN + FP + TN)
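
A minimal sketch of how the four confusion-matrix counts and the accuracy formula fit together. The helper `confusion_counts` and the label lists are hypothetical; treating malignant (label 4) as the positive class is an assumption that matches the labels used in this dataset.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count TP, FP, FN, TN for one class treated as positive."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn

# Hypothetical labels: 4 = malignant (positive), 2 = benign (negative).
y_true_demo = [4, 4, 4, 2, 2, 2, 2, 2, 2, 2]
y_pred_demo = [4, 4, 2, 4, 2, 2, 2, 2, 2, 2]

tp, fp, fn, tn = confusion_counts(y_true_demo, y_pred_demo, positive=4)
accuracy = (tp + tn) / (tp + fn + fp + tn)
print(tp, fp, fn, tn, accuracy)
```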

Precision and recall

Precision: TP / (TP + FP)

Recall: TP / (TP + FN)
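
The two formulas can be checked with a few lines of code. The counts below are made up for illustration, and `precision_recall` is a hypothetical helper name.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from confusion-matrix counts."""
    precision = tp / (tp + fp)  # of everything predicted positive, how much is truly positive
    recall = tp / (tp + fn)     # of everything truly positive, how much was found
    return precision, recall

# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives.
precision, recall = precision_recall(tp=8, fp=2, fn=4)
print(precision, recall)
```

In a tumor-screening setting, recall is often the figure to watch, because a false negative means a malignant case was missed.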

F1-score

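The F1-score is the harmonic mean of precision and recall. A short sketch, using hypothetical counts and a hypothetical helper name:

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2 * precision * recall / (precision + recall)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives.
f1 = f1_from_counts(tp=8, fp=2, fn=4)
print(round(f1, 4))
```

The same value can be written directly in terms of counts as 2TP / (2TP + FP + FN).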

Classification report API

from sklearn.metrics import classification_report
y_pre = estmator.predict(x_test)
ret = classification_report(y_test, y_pre, labels=(2, 4), target_names=("Benign", "Malignant"))
print(ret)

ROC curve and AUC metric

TPR and FPR

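As a quick illustration, assuming the standard definitions TPR = TP / (TP + FN) and FPR = FP / (FP + TN), both rates can be computed from the confusion-matrix counts. The counts and the helper name `tpr_fpr` are hypothetical.

```python
def tpr_fpr(tp, fp, fn, tn):
    """True positive rate and false positive rate from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # share of actual positives predicted positive (same as recall)
    fpr = fp / (fp + tn)  # share of actual negatives wrongly predicted positive
    return tpr, fpr

# Hypothetical counts for an imbalanced test set of 100 samples.
tpr, fpr = tpr_fpr(tp=8, fp=2, fn=4, tn=86)
print(tpr, fpr)
```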

ROC curve

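A rough sketch of how an ROC curve can be traced: sweep the decision threshold over the predicted scores, record (FPR, TPR) at each threshold, and take the area under the resulting points with the trapezoid rule. The helper `roc_points` and the four-sample data are assumptions for illustration; in practice `sklearn.metrics.roc_curve` does this work.

```python
import numpy as np

def roc_points(y_true, scores):
    """Return FPR and TPR arrays obtained by sweeping the threshold from high to low."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    thresholds = np.sort(np.unique(scores))[::-1]
    n_pos = y_true.sum()
    n_neg = len(y_true) - n_pos
    fpr, tpr = [0.0], [0.0]  # start at the origin: nothing predicted positive
    for th in thresholds:
        pred = scores >= th
        tpr.append(float((pred & (y_true == 1)).sum() / n_pos))
        fpr.append(float((pred & (y_true == 0)).sum() / n_neg))
    return np.array(fpr), np.array(tpr)

# Hypothetical labels and scores.
y_small = [0, 0, 1, 1]
s_small = [0.1, 0.4, 0.35, 0.8]

fpr_pts, tpr_pts = roc_points(y_small, s_small)
# Trapezoid rule over the ROC points gives the AUC.
auc_small = float(np.sum(np.diff(fpr_pts) * (tpr_pts[1:] + tpr_pts[:-1]) / 2))
print(fpr_pts, tpr_pts, auc_small)
```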

AUC metric

AUC calculation API

from sklearn.metrics import roc_auc_score
# roc_auc_score expects binary ground truth: map malignant (4) to 1 and benign (2) to 0
y_test = np.where(y_test > 3, 1, 0)
roc_auc_score(y_test, y_pre)
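
The call above scores hard class labels. AUC is more commonly computed from continuous scores, such as the positive-class probability from `predict_proba`, because that lets every threshold contribute to the curve. The sketch below shows both variants on synthetic data; the data and the `*_demo` names are assumptions for this example, not part of the tumor case.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary data, used only so the example is self-contained.
X_demo, y_demo = make_classification(n_samples=400, n_features=5, random_state=22)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=22)

clf = LogisticRegression(max_iter=200).fit(X_tr, y_tr)

auc_from_labels = roc_auc_score(y_te, clf.predict(X_te))             # hard 0/1 predictions
auc_from_proba = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])  # positive-class probability
print(auc_from_labels, auc_from_proba)
```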

Solving the class imbalance problem

pip3 install imbalanced-learn

Preparing class-imbalanced data

from sklearn.datasets import make_classification
import matplotlib.pyplot as plt
from collections import Counter

X, Y = make_classification(n_samples=5000,
                           n_features=2,  # total number of features; here n_informative + n_redundant + n_repeated
                           n_informative=2,  # number of informative features
                           n_redundant=0,  # redundant features: random linear combinations of the informative features
                           n_repeated=0,  # repeated features: drawn at random from the informative and redundant features
                           n_classes=3,  # number of classes
                           n_clusters_per_class=1,  # number of clusters that make up each class
                           weights=[0.01, 0.05, 0.94],  # proportion of samples assigned to each class
                           random_state=0)

X,Y,X.shape

Counter(Y)

#  Data visualization 
plt.scatter(X[:,0],X[:,1],c=Y)
plt.show()

Solutions

Oversampling method

Random oversampling method

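To make the mechanism concrete, here is a minimal NumPy sketch of random oversampling: samples of each smaller class are redrawn with replacement until every class matches the largest one. The helper `random_oversample` and the toy arrays are hypothetical, and this is only an illustration of the idea, not how imbalanced-learn implements it.

```python
import numpy as np
from collections import Counter

def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples of smaller classes up to the largest class size."""
    rng = np.random.default_rng(seed)
    counts = Counter(y.tolist())
    target = max(counts.values())
    keep_parts = []
    for label, count in counts.items():
        idx = np.flatnonzero(y == label)
        extra = rng.choice(idx, size=target - count, replace=True)  # drawn with replacement
        keep_parts.append(np.concatenate([idx, extra]))
    keep = np.concatenate(keep_parts)
    return X[keep], y[keep]

# Toy data: 8 samples of class 0 and 2 samples of class 1.
X_toy = np.arange(20, dtype=float).reshape(10, 2)
y_toy = np.array([0] * 8 + [1] * 2)

X_over, y_over = random_oversample(X_toy, y_toy)
print(Counter(y_over.tolist()))
```

Because the added points are exact copies, a scatter plot of the resampled data looks the same as the original even though the class counts are now equal.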

from imblearn.over_sampling import RandomOverSampler
ros = RandomOverSampler(random_state=0)
X_resampled,Y_resampled = ros.fit_resample(X,Y)
Counter(Y_resampled)

#  Data visualization 
plt.scatter(X_resampled[:,0],X_resampled[:,1],c=Y_resampled)
plt.show()

Representative oversampling algorithm: SMOTE

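The core idea of SMOTE is to create new minority samples by interpolating between a minority sample and one of its nearest minority-class neighbors. The sketch below illustrates only that interpolation step; the function `smote_like_samples`, the neighbor count `k=3`, and the four toy points are assumptions for illustration, and the real `imblearn` implementation differs in its details.

```python
import numpy as np

def smote_like_samples(X_min, n_new, k=3, seed=0):
    """Generate n_new synthetic points between minority samples and their neighbors."""
    rng = np.random.default_rng(seed)
    new_points = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))                 # pick a minority sample
        dists = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbors = np.argsort(dists)[1:k + 1]       # k nearest neighbors, skipping the point itself
        j = rng.choice(neighbors)                    # pick one neighbor at random
        lam = rng.random()                           # interpolation factor in [0, 1)
        new_points.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(new_points)

# Hypothetical minority-class samples at the corners of the unit square.
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_like_samples(X_min, n_new=5)
print(synthetic)
```

Each synthetic point lies on the line segment between two existing minority samples, which is why SMOTE output fills in the minority region instead of stacking copies on the same coordinates.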

from imblearn.over_sampling import SMOTE

X_resampled,Y_resampled = SMOTE().fit_resample(X,Y)
Counter(Y_resampled)

#  Data visualization 
plt.scatter(X_resampled[:,0],X_resampled[:,1],c=Y_resampled)
plt.show()

Undersampling method

Random undersampling method

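Random undersampling works in the opposite direction: samples of the larger classes are discarded at random until every class matches the smallest one. A minimal NumPy sketch, with a hypothetical helper `random_undersample` and toy data that are assumptions for illustration only:

```python
import numpy as np
from collections import Counter

def random_undersample(X, y, seed=0):
    """Keep a random subset of each class, sized to the smallest class."""
    rng = np.random.default_rng(seed)
    counts = Counter(y.tolist())
    target = min(counts.values())
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == label), size=target, replace=False)  # no duplicates
        for label in counts
    ])
    return X[keep], y[keep]

# Toy data: 8 samples of class 0 and 2 samples of class 1.
X_toy = np.arange(20, dtype=float).reshape(10, 2)
y_toy = np.array([0] * 8 + [1] * 2)

X_under, y_under = random_undersample(X_toy, y_toy)
print(Counter(y_under.tolist()))
```

The trade-off is that information from the discarded majority samples is lost, which matters most when the minority class is very small.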

from imblearn.under_sampling import RandomUnderSampler
rus = RandomUnderSampler(random_state=0)
X_resampled,Y_resampled = rus.fit_resample(X,Y)
Counter(Y_resampled)

#  Data visualization 
plt.scatter(X_resampled[:,0],X_resampled[:,1],c=Y_resampled)
plt.show()

Original site

Copyright notice
This article was written by [CGOMG]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202140307289382.html
