Simple example of logistic regression for machine learning
2022-06-11 22:06:00 【Chrn morning】
Logistic regression is a linear classifier, suited to linearly separable problems. The main idea of classifying with logistic regression is to fit a regression formula for the classification boundary from the existing data, and then use that formula to classify. The word "regression" here comes from "best fit": it means finding the best-fitting set of parameters. Training a logistic regression classifier therefore amounts to searching for the best-fitting parameters with an optimization method.
For example, in the two-class case the output is 0 or 1, and the function used to produce it is the sigmoid function.


On the sigmoid curve, the value is 0.5 when x is 0. As x increases, the sigmoid value approaches 1; as x decreases, it approaches 0.
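The behaviour described above can be checked numerically. The following is a minimal sketch using only the standard library; the helper name `sigmoid` and the sample points are illustrative choices, not part of the original post.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Evaluate at zero and at a few points on either side of it
for x in (-6, -2, 0, 2, 6):
    print(x, round(sigmoid(x), 4))
```

The printed values rise from close to 0 on the left, through exactly 0.5 at x = 0, to close to 1 on the right.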
To implement a logistic regression classifier, multiply each feature by a regression coefficient, add up all the products, and substitute the sum into the sigmoid function to obtain a number between 0 and 1. Data with a value greater than 0.5 is assigned to class 1, and data with a value less than 0.5 is assigned to class 0. Logistic regression can therefore also be regarded as a form of probability estimation.
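The decision rule just described can be sketched in a few lines of plain Python. The function names, the coefficients and the intercept below are hypothetical values chosen only for illustration; in practice the coefficients come from training.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, coefficients, intercept=0.0):
    """Weighted sum of the features, squashed into (0, 1) by the sigmoid."""
    z = intercept + sum(w * x for w, x in zip(coefficients, features))
    return sigmoid(z)

def predict(features, coefficients, intercept=0.0, threshold=0.5):
    """Return class 1 if the estimated probability exceeds the threshold, else class 0."""
    return 1 if predict_proba(features, coefficients, intercept) > threshold else 0

# Hypothetical coefficients, not learned from any data
coefficients = [0.8, -0.4]
intercept = -1.0

print(predict([3.0, 1.0], coefficients, intercept))  # weighted sum is positive
print(predict([1.0, 2.0], coefficients, intercept))  # weighted sum is negative
```

Because the sigmoid equals 0.5 exactly when its input is 0, thresholding the probability at 0.5 is the same as checking the sign of the weighted sum.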
For example, to compress the target value of the function f(x) = a*x + b into the interval (0, 1), introduce the logistic function, which gives
$g(x) = \frac{1}{1 + e^{-(a x + b)}}$
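One common way to find the best-fitting a and b is batch gradient descent on the log loss. The sketch below is only one possible optimization method, not necessarily the one a given library uses; the toy data, the learning rate and the iteration count are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit(xs, ys, learning_rate=0.1, n_iterations=5000):
    """Fit a and b in sigmoid(a*x + b) by batch gradient descent on the log loss."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iterations):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(a * x + b) - y  # predicted probability minus label
            grad_a += error * x
            grad_b += error
        a -= learning_rate * grad_a / n
        b -= learning_rate * grad_b / n
    return a, b

# Toy one-dimensional data (hypothetical): larger x tends to mean class 1
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

a, b = fit(xs, ys)
print('a =', round(a, 3), 'b =', round(b, 3))
print('P(class 1 | x=0.5) =', round(sigmoid(a * 0.5 + b), 3))
print('P(class 1 | x=4.0) =', round(sigmoid(a * 4.0 + b), 3))
```

The toy classes overlap slightly on purpose: with perfectly separable data the unregularized coefficients would keep growing instead of settling.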
The general steps of logistic regression are: collect the data, prepare the data, analyze the data, train the algorithm, test the algorithm, and use the algorithm.
The following Python code applies logistic regression to predicting whether a tumor is benign or malignant.
The original data can be downloaded from: https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data
import pandas as pd
import numpy as np
# Create the list of column names
column_names=['Sample code number','Clump Thickness','Uniformity of Cell Size',
'Uniformity of Cell Shape','Marginal Adhesion','Single Epithelial Cell Size',
'Bare Nuclei','Bland Chromatin','Normal Nucleoli','Mitoses','Class']
data=pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data',names=column_names)
data.describe()
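# Data preprocessing
# The raw file marks missing values with '?'; replace them with the standard missing value NaN
data=data.replace(to_replace='?',value=np.nan)
# Delete rows with missing values
data=data.dropna()
# 'Bare Nuclei' was read as text because of the '?' marks; convert it back to integers
data['Bare Nuclei']=data['Bare Nuclei'].astype(int)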
# Descriptive statistics of the cleaned data
data.describe()
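# Use 25% of the data as the test set and 75% as the training set
# (train_test_split lives in sklearn.model_selection; sklearn.cross_validation has been removed)
from sklearn.model_selection import train_test_split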
x_train,x_test,y_train,y_test=train_test_split(data[column_names[1:10]],data[column_names[10]],
test_size=0.25,random_state=33)
# Query the number and category of training samples
y_train.value_counts()
# Query the number and category of test samples
y_test.value_counts()
# Train a logistic regression model on the data above
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
# Standardize the data
# StandardScaler rescales each column (each feature) to zero mean and unit variance
s=StandardScaler()
# Fit the scaler on the training set only, then apply the same scaling to the test set
x_train=s.fit_transform(x_train)
x_test=s.transform(x_test)
# Initialize the LogisticRegression classifier
lr=LogisticRegression()
# Train the model
lr.fit(x_train,y_train)
# Predict the classes of the test samples
lr_pred=lr.predict(x_test)
# Model performance evaluation
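from sklearn.metrics import classification_report
# Use the model's built-in scoring function to get the accuracy
print('Accuracy of LR Classifier:',lr.score(x_test,y_test))
# classification_report prints a text report of the main classification metrics
# (in this dataset, class 2 is benign and class 4 is malignant)
print(classification_report(y_test,lr_pred,target_names=['Benign','Malignant']))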
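Accuracy alone can hide how the classifier treats each class, which is why a classification report also lists precision and recall. The following is a minimal standard-library sketch of how those two metrics are computed for one class; the helper `precision_recall` and the toy labels are hypothetical and do not come from the dataset above.

```python
def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class, computed from raw counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy labels using the dataset's coding: 2 = benign, 4 = malignant
y_true = [2, 2, 2, 4, 4, 4]
y_pred = [2, 2, 4, 4, 4, 2]

precision, recall = precision_recall(y_true, y_pred, positive=4)
accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
print('precision:', precision, 'recall:', recall, 'accuracy:', accuracy)
```

For a cancer screen, recall on the malignant class is usually the number to watch, since a missed malignant case costs more than a false alarm.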