Machine learning: a detailed introduction to StandardScaler(), transform(), and fit() in the sklearn package
2022-07-25 09:28:00 【Bubble Yi】
- sklearn (scikit-learn) is an extension of SciPy, built on the NumPy and matplotlib libraries. Since its release in 2007, sklearn has become an important machine learning library for Python.
- sklearn supports four major families of machine learning algorithms: classification, regression, dimensionality reduction, and clustering. It also includes modules for feature extraction, data processing, and model evaluation.
I. StandardScaler() in the sklearn package
from sklearn.preprocessing import StandardScaler  # import the standardization class from the preprocessing module
SS = StandardScaler()  # create a scaler instance
scaler = SS.fit(X_train)  # in essence, compute the mean and variance of each column
X_train = scaler.transform(X_train)  # standardize each column of the training set
# Use the per-column mean and variance of the training set to standardize each column of the test sets.
test1 = scaler.transform(X_test1)
test2 = scaler.transform(X_test2)
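The steps above can be tried end to end with a minimal sketch. The toy arrays and the names `X_test`, `train_mean`, `train_var`, `train_scale`, `X_train_std`, and `X_test_std` are made up for illustration and are not from the original post; `mean_`, `var_`, and `scale_` are the attributes that `fit()` stores on the scaler.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: 4 training samples and 2 test samples, 2 features each.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
X_test = np.array([[2.5, 25.0], [5.0, 50.0]])

# fit() learns the per-column statistics from the training set only.
scaler = StandardScaler().fit(X_train)
train_mean = scaler.mean_    # per-column mean
train_var = scaler.var_      # per-column (population) variance
train_scale = scaler.scale_  # per-column standard deviation, sqrt(var_)

# transform() applies (X - mean_) / scale_ column by column.
X_train_std = scaler.transform(X_train)
# The test set is scaled with the training statistics, not its own.
X_test_std = scaler.transform(X_test)

print(train_mean, train_var)
print(X_train_std.mean(axis=0), X_train_std.std(axis=0))
print(X_test_std)
```

After the transform, each training column has mean 0 and standard deviation 1. The test columns generally do not, because they are scaled with statistics learned from the training set.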
II. Method 2 (manual data standardization):
mean = train_data.mean(axis=0)  # per-column mean of the training data
train_data -= mean
std = train_data.std(axis=0)  # per-column standard deviation of the training data
train_data /= std
test_data -= mean  # reuse the training mean and standard deviation for the test data
test_data /= std
Note that the mean and standard deviation used to standardize the test data are computed on the training data. In your workflow, you should never compute any quantity on the test data, not even something as simple as the statistics for data standardization.
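As a quick check, the manual approach can be compared against StandardScaler. This is a sketch with made-up toy data; the names `expected_train`, `expected_test`, `same_train`, and `same_test` are illustrative only. It relies on the fact that NumPy's `std()` defaults to `ddof=0`, which is the same population standard deviation that StandardScaler uses.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data; float dtype so the in-place operations below work.
train_data = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
test_data = np.array([[2.5, 25.0], [5.0, 50.0]])

# Reference result from StandardScaler, fitted on the training data only.
# transform() returns new arrays, so the inputs are left unchanged here.
scaler = StandardScaler().fit(train_data)
expected_train = scaler.transform(train_data)
expected_test = scaler.transform(test_data)

# Manual standardization (Method 2), done in place.
mean = train_data.mean(axis=0)
train_data -= mean
std = train_data.std(axis=0)  # default ddof=0 matches StandardScaler
train_data /= std
test_data -= mean
test_data /= std

same_train = np.allclose(train_data, expected_train)
same_test = np.allclose(test_data, expected_test)
print(same_train, same_test)
```

One difference worth keeping in mind: the manual version modifies `train_data` and `test_data` in place, while `transform()` returns a copy by default.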









