Sklearn Library -- linear regression model
2022-06-26 04:44:00 【I am a little monster】
Contents
Converting categorical variables into dummy variables
The same models are fitted with the statsmodels library in a companion post, which can be read side by side with this sklearn version for reference:
statsmodels library: linear regression model (I am a little monster's blog, CSDN)
Simple linear regression
# Assumption: tips is the seaborn tips dataset; the original post does not show how it is loaded
import seaborn as sns
from sklearn import linear_model  # Import the required module
tips = sns.load_dataset('tips')
lr = linear_model.LinearRegression()  # Create the model
# Pass the predictor and response variables to fit; note the uppercase X and lowercase y
# sklearn expects X to be 2-D, so the single column is converted to a NumPy array and reshaped to (n_samples, 1)
predicted = lr.fit(X=tips['total_bill'].values.reshape(-1, 1), y=tips['tip'])  # fit returns the fitted model itself
print(predicted.coef_)  # The coef_ attribute of the fitted model holds the coefficient
print('----' * 6)  # Print a horizontal line to separate the outputs
print(predicted.intercept_)  # The intercept_ attribute of the fitted model holds the intercept
print('----' * 6)
y_pre = lr.predict(tips['total_bill'][0:2].values.reshape(-1, 1))  # Predict from the fitted linear model
print(y_pre)  # Print the predicted values
print(tips['tip'][0:2])  # Print the actual values for comparison
Output:
[0.10502452]
------------------------
0.9202696135546731
------------------------
[2.70463616 2.00622312]
0    1.01
1    1.66
Name: tip, dtype: float64
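The reshape step above is needed only because a single column selected with single brackets is 1-D. Selecting it with double brackets keeps a 2-D DataFrame, which sklearn accepts directly. The following is a minimal sketch on a small made-up table (the `toy` DataFrame and its values are assumptions for illustration, not the tips data); it also cross-checks the fit against `numpy.polyfit`, which solves the same least-squares problem.

```python
import numpy as np
import pandas as pd
from sklearn import linear_model

# Hypothetical toy data standing in for tips; the values are made up
toy = pd.DataFrame({
    "total_bill": [10.0, 20.0, 30.0, 40.0, 50.0],
    "tip": [1.5, 3.1, 4.4, 6.2, 7.3],
})

# Double brackets return a DataFrame of shape (n_samples, 1), so no reshape is needed
X = toy[["total_bill"]]
y = toy["tip"]

lr = linear_model.LinearRegression().fit(X, y)

# A degree-1 polynomial fit is the same ordinary least-squares line
slope, intercept = np.polyfit(toy["total_bill"], toy["tip"], deg=1)

print(X.shape)  # (5, 1)
print(np.isclose(lr.coef_[0], slope), np.isclose(lr.intercept_, intercept))
```

Both approaches should agree up to floating-point error; the double-bracket form is simply less to type than `.values.reshape(-1, 1)`.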
Multiple linear regression
from sklearn import linear_model  # Import the required module
lr = linear_model.LinearRegression()
predicted = lr.fit(X=tips[['total_bill', 'size']], y=tips['tip'])  # To use several predictors, select them with a list of column names
print(predicted.coef_)
print('----' * 6)
print(predicted.intercept_)
Output:
[0.09271334 0.19259779]
------------------------
0.6689447408125031
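To make the meaning of `coef_` and `intercept_` concrete, the predictions of a fitted multiple regression can be rebuilt by hand as the intercept plus the sum of each coefficient times its feature. The sketch below uses a small made-up table (the `toy` DataFrame and its values are assumptions, not the tips data) and also calls `score`, which returns the coefficient of determination R² on the data it is given.

```python
import numpy as np
import pandas as pd
from sklearn import linear_model

# Hypothetical toy data; the column names mirror tips but the values are made up
toy = pd.DataFrame({
    "total_bill": [12.0, 25.5, 18.3, 40.1, 31.7, 22.4],
    "size": [2, 3, 2, 4, 4, 3],
    "tip": [2.0, 3.8, 2.9, 6.1, 5.0, 3.5],
})

X = toy[["total_bill", "size"]]
y = toy["tip"]
lr = linear_model.LinearRegression().fit(X, y)

# Rebuild the predictions by hand: intercept + X @ coefficients
manual = lr.intercept_ + X.to_numpy() @ lr.coef_
same = np.allclose(manual, lr.predict(X))

# score returns R^2 computed on the data passed in (here the training data)
r2 = lr.score(X, y)
print(same, round(r2, 3))
```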
Converting categorical variables into dummy variables
sklearn does not create dummy variables automatically, so categorical variables have to be encoded by hand before fitting. The get_dummies function in pandas does this.
import pandas as pd
# Create dummy variables; the drop_first parameter controls whether the reference (first) level of each variable is dropped
tips_dummy = pd.get_dummies(tips[['total_bill', 'size', 'sex', 'smoker', 'day', 'time']], drop_first=False)
print(tips_dummy.head())
Output:
   total_bill  size  sex_Male  sex_Female  smoker_Yes  smoker_No  day_Thur  \
0       16.99     2         0           1           0          1         0
1       10.34     3         1           0           0          1         0
2       21.01     3         1           0           0          1         0
3       23.68     2         1           0           0          1         0
4       24.59     4         0           1           0          1         0

   day_Fri  day_Sat  day_Sun  time_Lunch  time_Dinner
0        0        0        1           0            1
1        0        0        1           0            1
2        0        0        1           0            1
3        0        0        1           0            1
4        0        0        1           0            1
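The effect of `drop_first` is easiest to see by comparing the column names with and without it. The sketch below uses a made-up four-row table (the `toy` DataFrame is an assumption for illustration). Note one difference from tips: these columns are plain strings, so pandas orders the levels alphabetically and drops `Female` and `Dinner`, whereas the categorical columns in tips carry their own level order, which is why `Male` is the first level there.

```python
import pandas as pd

# Hypothetical toy data; categories mirror the tips columns but the rows are made up
toy = pd.DataFrame({
    "total_bill": [16.0, 10.5, 21.0, 23.5],
    "sex": ["Female", "Male", "Male", "Female"],
    "time": ["Dinner", "Lunch", "Dinner", "Lunch"],
})

# dtype=int gives 0/1 columns regardless of the pandas version's default
full = pd.get_dummies(toy, drop_first=False, dtype=int)
reduced = pd.get_dummies(toy, drop_first=True, dtype=int)

print(list(full.columns))
# ['total_bill', 'sex_Female', 'sex_Male', 'time_Dinner', 'time_Lunch']
print(list(reduced.columns))
# ['total_bill', 'sex_Male', 'time_Lunch']

# With every level kept, the dummies of one variable always sum to 1,
# which duplicates the information already carried by the intercept
print((full["sex_Female"] + full["sex_Male"]).tolist())  # [1, 1, 1, 1]
```

That redundancy is the reason for dropping one level per variable before fitting a model with an intercept.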
from sklearn import linear_model  # Import the required module
import pandas as pd
# Create dummy variables; drop_first controls whether the reference level is dropped.
# For example, sex has the levels Male and Female; the first level, Male, is taken as the reference,
# and once it is dropped no sex_Male column is created
tips_dummy_drop = pd.get_dummies(tips[['total_bill', 'size', 'sex', 'smoker', 'day', 'time']], drop_first=True)
print(tips_dummy_drop.head())
print('----' * 6)
lr = linear_model.LinearRegression()
predicted = lr.fit(X=tips_dummy_drop, y=tips['tip'])  # The whole dummy-encoded DataFrame is passed as X
print(predicted.coef_)
print('----' * 6)
print(predicted.intercept_)
Output:
   total_bill  size  sex_Female  smoker_No  day_Fri  day_Sat  day_Sun  \
0       16.99     2           1          1        0        0        1
1       10.34     3           0          1        0        0        1
2       21.01     3           0          1        0        0        1
3       23.68     2           0          1        0        0        1
4       24.59     4           1          1        0        0        1

   time_Dinner
0            1
1            1
2            1
3            1
4            1
------------------------
[ 0.09448701  0.175992    0.03244094  0.08640832  0.1622592   0.04080082
  0.13677854 -0.0681286 ]
------------------------
0.590837425951376
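The encoding can also be done inside sklearn itself rather than with pandas. This is not the approach the post uses; it is an alternative sketch, assuming a made-up `toy` DataFrame, that combines `OneHotEncoder(drop="first")` with `ColumnTransformer` so that the encoding and the regression live in one pipeline and the same encoding is applied again at prediction time.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data; the values are made up for illustration
toy = pd.DataFrame({
    "total_bill": [15.0, 11.2, 20.4, 24.1, 26.3, 18.7],
    "size": [2, 3, 3, 2, 4, 4],
    "sex": ["Female", "Male", "Male", "Male", "Female", "Male"],
    "smoker": ["No", "No", "Yes", "No", "Yes", "No"],
    "tip": [1.8, 1.5, 3.2, 3.4, 3.9, 2.6],
})

categorical = ["sex", "smoker"]

# One-hot encode the categorical columns, dropping the first level of each,
# and pass the remaining numeric columns through unchanged
encode = ColumnTransformer(
    [("dummies", OneHotEncoder(drop="first"), categorical)],
    remainder="passthrough",
)

model = make_pipeline(encode, LinearRegression())
X = toy.drop(columns="tip")
model.fit(X, toy["tip"])

# One coefficient per encoded column: 2 dummies + 2 numeric features
coef = model.named_steps["linearregression"].coef_
print(coef.shape)  # (4,)
print(model.predict(X.head(2)))
```

The pandas route is shorter for a one-off fit; the pipeline route is worth considering when new data has to be encoded consistently later.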