Machine Learning - Notes and Implementation of Linear Regression and Logistic Regression
2022-07-31 07:45:00 【Miracle Fan】
Regression Problems
Overview:
A regression problem predicts the value of a continuous quantity, for example …. Building on such a regression, the Sigmoid function (logistic regression) can turn the predicted value into a probability of whether something holds, mapping the continuous regression output into the interval (0, 1). That probability can then be used to handle binary classification problems.
Simple Linear Regression
The linear regression equation is:

$$\hat{y} = ax + b$$
For example, given the following set of data, we can draw the scatter plot below.
import numpy as np

x = np.array([1, 2, 4, 6, 8])
y = np.array([2, 5, 7, 8, 9])
Linear regression amounts to fitting a straight line that connects the samples in the figure above as well as possible. In general a perfect fit cannot be achieved; we only hope for something like the figure below, where the green segments mark the errors between the predicted and true points. We want those errors to be as small as possible, which means a better fit.
import matplotlib.pyplot as plt

# a and b are the fitted slope and intercept (computed below)
y_pred = lambda x: a * x + b
plt.scatter(x, y, color='b')                 # true samples
plt.plot(x, y_pred(x), color='r')            # fitted line
plt.plot([x, x], [y, y_pred(x)], color='g')  # vertical error segments
plt.show()
That is, a loss function can be defined:
$$L=\frac{1}{n}\sum^n_{i=1}(y^i-y_{pred}^i)$$
But with this function, the prediction is sometimes larger and sometimes smaller than the true value, so $y-y_{pred}$ takes both positive and negative values; summing them lets the errors cancel each other out. We therefore use the mean squared error instead:
$$L=\frac{1}{n}\sum^n_{i=1}(y^i-y_{pred}^i)^2$$
Substituting the fitted line gives:
$$L=\frac{1}{n}\sum^n_{i=1}(y^i-ax^i-b)^2$$
Minimizing this loss with the least squares method yields:
$$a=\frac{\sum_{i=1}^n(x^i-\bar{x})(y^i-\bar{y})}{\sum_{i=1}^n(x^i-\bar{x})^2},\qquad b=\bar{y}-a\bar{x}$$
def Linear_Regression(x, y):
    x_mean = np.mean(x)
    y_mean = np.mean(y)
    # num = np.sum((x - np.tile(x_mean, x.shape)) * (y - np.tile(y_mean, y.shape)))
    num = np.sum((x - x_mean) * (y - y_mean))  # numerator of a
    den = np.sum((x - x_mean) ** 2)            # denominator of a
    a = num / den
    b = y_mean - a * x_mean
    return a, b
Thanks to NumPy's broadcasting, there is no need to adjust the dimensions of x_mean here (the commented-out np.tile line would do that explicitly).
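As a quick check, fitting the sample data above gives a slope of roughly 0.91 and an intercept of roughly 2.38:

a, b = Linear_Regression(x, y)
print(a, b)  # approximately 0.9085 2.3841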
Multiple Linear Regression
For multiple linear regression, the general expression is:
$$y=\theta_0+\theta_1x_1+\theta_2x_2+\dots+\theta_nx_n$$
In matrix form this simplifies to:

$$Y=X\cdot\theta$$

where

$$X=\begin{pmatrix} 1 & x_{11} & \cdots & x_{1p} \\ 1 & x_{21} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & \cdots & x_{np} \end{pmatrix},\qquad \theta=\begin{pmatrix} \theta_0\\ \theta_1\\ \theta_2\\ \vdots\\ \theta_n \end{pmatrix}$$
Solving for $\theta$ with the same least squares approach gives the normal equation:

$$\theta=(X^TX)^{-1}X^Ty$$
from numpy import linalg

# Generate a column of ones to act as the intercept term
ones = np.ones((X_train.shape[0], 1))
# Stack in the horizontal direction: X_b is X_train with a first column of 1s
X_b = np.hstack((ones, X_train))
# Normal equation: theta = (X_b^T X_b)^{-1} X_b^T y
theta = linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y_train)
interception = theta[0]  # intercept
coef = theta[1:]         # coefficients
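A minimal sketch to sanity-check the normal equation; the synthetic X_train, y_train, and true_theta below are made up for illustration:

rng = np.random.default_rng(0)
X_train = rng.random((100, 3))
true_theta = np.array([4.0, 3.0, -2.0, 0.5])  # intercept plus three coefficients
y_train = np.hstack((np.ones((100, 1)), X_train)).dot(true_theta)

X_b = np.hstack((np.ones((X_train.shape[0], 1)), X_train))
theta = linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y_train)
print(theta)  # should recover true_theta up to numerical error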
Logistic Regression
Simple logistic regression adds a Sigmoid function on top of linear regression, compressing the prediction into (0, 1) so that it can be read as a probability. Usually 0.5 serves as the decision boundary: a probability above 0.5 means class 1, otherwise class 0. In other words, the Sigmoid output turns the continuous value produced by linear regression into a classification.
$$p=\frac{1}{1+e^{-z}}$$
- $z$: usually the output of a linear regression equation
- $p$: the predicted probability; the class is decided by the 0.5 boundary
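To visualize the squashing, a small sketch (not from the original post) plots the Sigmoid and the 0.5 decision boundary:

z = np.linspace(-10, 10, 200)
p = 1.0 / (1.0 + np.exp(-z))
plt.plot(z, p, color='b')
plt.axhline(0.5, color='gray', linestyle='--')  # decision boundary
plt.xlabel('z')
plt.ylabel('p')
plt.show()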
Gradient Descent
Gradient descent is mainly used to optimize the loss function, i.e. to find the parameter value at which the loss is smallest.
For example, suppose the loss function is

$$L=(x-2.5)^2-1$$
We define this loss function and its derivative in code:
def J(theta):
    try:
        return (theta - 2.5) ** 2 - 1
    except OverflowError:
        return float('inf')

def dJ(theta):
    return 2 * (theta - 2.5)
Each iteration updates the parameter in the direction opposite to the gradient:

$$\theta=\theta-\eta\frac{dJ}{d\theta}$$
def CalGradient(eta):
    theta = 0.0
    theta_history = [theta]
    epsilon = 1e-8  # used to decide when to terminate the gradient descent
    while True:
        gradient = dJ(theta)
        last_theta = theta
        theta = theta - eta * gradient
        theta_history.append(theta)
        if abs(J(theta) - J(last_theta)) < epsilon:
            break
    x = np.linspace(-1, 6, 141)  # range over which to draw the loss curve
    plt.title('lr:' + str(eta))
    plt.plot(x, J(x), color='r')
    plt.plot(np.array(theta_history), J(np.array(theta_history)), color='b', marker='x')
    plt.show()
    print(len(theta_history))
Running this with different learning rates produces the descent plots below. The learning rate is generally chosen in (0, 1); as the figures show, with a learning rate of 1 the method fails to converge (it oscillates), and with a learning rate greater than 1 it diverges.
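For instance, a minimal sketch that reproduces this behaviour for a few rates (avoid rates above 1 here, since the loop has no iteration cap and may never terminate):

for eta in (0.01, 0.1, 0.8):
    CalGradient(eta)  # smaller eta -> more iterations, smoother descent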
The Loss Function of Logistic Regression
Plugging the linear regression model into the Sigmoid, logistic regression's expression becomes:

$$p=\frac{1}{1+e^{-X\theta}}$$
For logistic regression, the logarithmic (cross-entropy) loss is generally used to fit the parameters:
$$cost=\begin{cases} -\log(p_{pred}) & \text{if } y=1\\ -\log(1-p_{pred}) & \text{if } y=0 \end{cases}$$
With a little tidying up, the two branches combine into a single loss function:

$$cost=-y\log(p_{pred})-(1-y)\log(1-p_{pred})$$
import numpy as np

class LogisticRegression:
    def __init__(self):
        self.coef_ = None
        self.intercept_ = None
        self._theta = None

    def _sigmoid(self, x):
        return 1.0 / (1.0 + np.exp(-x))

    def fit(self, x_train, y_train, eta=0.01, n_iters=1e4):
        assert x_train.shape[0] == y_train.shape[0], \
            'the number of training samples must match the number of labels'

        def J(theta, x, y):
            # Log loss averaged over the samples
            p_pred = self._sigmoid(x.dot(theta))
            try:
                return -np.sum(y * np.log(p_pred) + (1 - y) * np.log(1 - p_pred)) / len(y)
            except:
                return float('inf')

        def dJ(theta, x, y):
            # Gradient of the log loss: x^T (sigmoid(x.theta) - y) / m
            p_pred = self._sigmoid(x.dot(theta))
            return x.T.dot(p_pred - y) / len(y)

        # Simulate gradient descent
        def gradient_descent(X_b, y, initial_theta, eta, n_iters=1e4, epsilon=1e-8):
            theta = initial_theta
            i_iter = 0
            while i_iter < n_iters:
                gradient = dJ(theta, X_b, y)
                last_theta = theta
                theta = theta - eta * gradient
                i_iter += 1
                if abs(J(theta, X_b, y) - J(last_theta, X_b, y)) < epsilon:
                    break
            return theta

        X_b = np.hstack([np.ones((len(x_train), 1)), x_train])
        initial_theta = np.zeros(X_b.shape[1])  # one parameter per column of X_b
        self._theta = gradient_descent(X_b, y_train, initial_theta, eta, n_iters)
        self.intercept_ = self._theta[0]  # intercept
        self.coef_ = self._theta[1:]      # coefficients
        return self

    def predict_proba(self, X_predict):
        X_b = np.hstack([np.ones((len(X_predict), 1)), X_predict])
        return self._sigmoid(X_b.dot(self._theta))

    def predict(self, X_predict):
        proba = self.predict_proba(X_predict)
        return np.array(proba > 0.5, dtype='int')
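A minimal usage sketch on synthetic two-class data; the clusters, learning rate, and accuracy check here are illustrative and not part of the original post:

rng = np.random.default_rng(0)
X0 = rng.normal(loc=0.0, scale=1.0, size=(50, 2))  # class 0 cluster
X1 = rng.normal(loc=3.0, scale=1.0, size=(50, 2))  # class 1 cluster
X = np.vstack([X0, X1])
y = np.hstack([np.zeros(50), np.ones(50)])

clf = LogisticRegression().fit(X, y, eta=0.1, n_iters=1e4)
print('intercept:', clf.intercept_)
print('coefficients:', clf.coef_)
print('train accuracy:', np.mean(clf.predict(X) == y))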