The difference between StratifiedKFold (classification) and KFold (regression)
2022-07-03 13:14:00 【Levi Bebe】
1. Main differences between StratifiedKFold and KFold
StratifiedKFold: stratified sampling. The proportion of samples of each class in every training set and test set matches the proportion in the original data (classification problems).
KFold: plain, non-stratified splitting. The data are divided into training and test sets without checking whether the class proportions in the two sets match those of the original data (regression problems).
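The effect is easiest to see on imbalanced labels. The sketch below assumes a made-up label vector (8 samples of class 0 followed by 4 of class 1) and a hypothetical helper, fold_class_counts, that counts how many samples of each class land in every test fold. It is an illustration only, not part of the original post.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

# Hypothetical imbalanced labels: 8 samples of class 0 followed by 4 of class 1.
y = np.array([0] * 8 + [1] * 4)
X = np.zeros((len(y), 1))  # feature values do not influence how folds are built


def fold_class_counts(splitter):
    """Return [n_class0, n_class1] for the test fold of every split."""
    return [np.bincount(y[test], minlength=2).tolist()
            for _, test in splitter.split(X, y)]


kfold_counts = fold_class_counts(KFold(n_splits=4))
stratified_counts = fold_class_counts(StratifiedKFold(n_splits=4))

print('KFold:          ', kfold_counts)       # [[3, 0], [3, 0], [2, 1], [0, 3]]
print('StratifiedKFold:', stratified_counts)  # [[2, 1], [2, 1], [2, 1], [2, 1]]
```

Without shuffling, KFold simply cuts the data into contiguous blocks, so two of its test folds contain no class 1 samples at all, while every StratifiedKFold test fold keeps the 2:1 ratio of the full data.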
from sklearn.model_selection import KFold, StratifiedKFold
KFold(n_splits=5, shuffle=False, random_state=None)
Parameters:
n_splits: the number of folds, i.e. how many roughly equal parts the data are divided into
shuffle: whether to shuffle the samples before splitting them into folds
if False, the samples are split in their original order and the folds are the same on every run; random_state has no effect
if True, the samples are shuffled first, so the folds differ from run to run unless random_state is fixed
random_state: random seed, used only when shuffle=True. Fixing it (0 is a common choice) makes the generated folds identical on every run, which is convenient when tuning parameters
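A small sketch of how shuffle and random_state interact, assuming 8 made-up samples and a hypothetical helper, collect_test_folds, that gathers the test indices of each fold:

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(8).reshape(-1, 1)  # 8 hypothetical samples


def collect_test_folds(splitter):
    """Collect the test indices of every fold as plain Python lists."""
    return [test.tolist() for _, test in splitter.split(X)]


# shuffle=False (the default): contiguous folds, identical on every call.
ordered = collect_test_folds(KFold(n_splits=4))

# shuffle=True with a fixed seed: shuffled folds that are reproducible.
seeded_a = collect_test_folds(KFold(n_splits=4, shuffle=True, random_state=0))
seeded_b = collect_test_folds(KFold(n_splits=4, shuffle=True, random_state=0))

print(ordered)               # [[0, 1], [2, 3], [4, 5], [6, 7]]
print(seeded_a == seeded_b)  # True
```

Leaving random_state=None with shuffle=True would generally give a different set of folds on each run, which makes results harder to compare between experiments.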
StratifiedKFold(n_splits=5, shuffle=False, random_state=None)
Parameters:
n_splits: the number of folds, i.e. how many roughly equal parts the data are divided into
shuffle: whether to shuffle the samples of each class before splitting them into folds
if False, the samples are split in their original order and the folds are the same on every run; random_state has no effect
if True, the samples are shuffled first, so the folds differ from run to run unless random_state is fixed
random_state: random seed, used only when shuffle=True. Fixing it (0 is a common choice) makes the generated folds identical on every run, which is convenient when tuning parameters
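The two classes take the same constructor parameters; where they differ is in how split() treats y. The sketch below, using made-up continuous targets, suggests why StratifiedKFold is tied to classification: it needs discrete class labels to stratify on and raises a ValueError when given continuous values, whereas KFold never looks at y.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

X = np.arange(8).reshape(-1, 1)
y_continuous = np.linspace(0.5, 4.0, num=8)  # hypothetical regression targets

# KFold ignores y entirely, so continuous targets are accepted.
n_kfold_splits = sum(1 for _ in KFold(n_splits=4).split(X, y_continuous))

# StratifiedKFold must group samples by label, so it rejects continuous y.
try:
    list(StratifiedKFold(n_splits=4).split(X, y_continuous))
    stratified_error = None
except ValueError as exc:
    stratified_error = str(exc)

print(n_kfold_splits)    # 4
print(stratified_error)  # message explaining that the target type is unsupported
```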
2. Example: StratifiedKFold vs. KFold
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

# 8 samples with 4 features each
X = np.array([
    [1, 2, 3, 4],
    [11, 12, 13, 14],
    [21, 22, 23, 24],
    [31, 32, 33, 34],
    [41, 42, 43, 44],
    [51, 52, 53, 54],
    [61, 62, 63, 64],
    [71, 72, 73, 74]
])
# Balanced binary labels: four samples of class 1 and four of class 0
y = np.array([1, 1, 0, 0, 1, 1, 0, 0])

# Use separate variable names so the KFold and StratifiedKFold classes are not overwritten
kf = KFold(n_splits=4, shuffle=True, random_state=2021)
skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=2021)

print('---------------------KFold---------------------------')
for train, test in kf.split(X, y):
    print('Train: %s | test: %s' % (train, test))
    print('Training set labels: %s' % y[train])
    print('Test set labels: %s' % y[test])

print('----------------StratifiedKFold----------------------')
for train, test in skf.split(X, y):
    print('Train: %s | test: %s' % (train, test))
    print('Training set labels: %s' % y[train])
    print('Test set labels: %s' % y[test])

# The output is as follows
'''
---------------------KFold---------------------------
Train: [0 1 2 4 5 6] | test: [3 7]
Training set labels: [1 1 0 1 1 0]
Test set labels: [0 0]
Train: [0 1 3 4 5 7] | test: [2 6]
Training set labels: [1 1 0 1 1 0]
Test set labels: [0 0]
Train: [2 3 4 5 6 7] | test: [0 1]
Training set labels: [0 0 1 1 0 0]
Test set labels: [1 1]
Train: [0 1 2 3 6 7] | test: [4 5]
Training set labels: [1 1 0 0 0 0]
Test set labels: [1 1]
----------------StratifiedKFold----------------------
Train: [0 1 2 3 4 6] | test: [5 7]
Training set labels: [1 1 0 0 1 0]
Test set labels: [1 0]
Train: [0 1 2 3 5 7] | test: [4 6]
Training set labels: [1 1 0 0 1 0]
Test set labels: [1 0]
Train: [0 3 4 5 6 7] | test: [1 2]
Training set labels: [1 0 1 1 0 0]
Test set labels: [1 0]
Train: [1 2 4 5 6 7] | test: [0 3]
Training set labels: [1 0 1 1 0 0]
Test set labels: [1 0]
'''
With KFold, every test fold in this run contains samples of a single class, whereas every StratifiedKFold test fold contains one sample of each class, matching the 1:1 class ratio of the full data.
Summary:
KFold is suited to splitting data for regression problems
StratifiedKFold is suited to splitting data for classification problems
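In practice the splitter is usually handed to a cross-validation helper rather than looped over by hand. The following is a minimal sketch of that usage, assuming synthetic data from make_classification and make_regression and two simple linear models chosen only for illustration; none of these choices come from the original post.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import KFold, StratifiedKFold, cross_val_score

# Synthetic data, generated only for this illustration.
X_clf, y_clf = make_classification(n_samples=100, n_features=5, random_state=0)
X_reg, y_reg = make_regression(n_samples=100, n_features=5, noise=1.0, random_state=0)

# Classification: stratified folds keep the class ratio in every fold.
clf_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
clf_scores = cross_val_score(LogisticRegression(max_iter=1000), X_clf, y_clf, cv=clf_cv)

# Regression: there are no classes to stratify on, so plain KFold is used.
reg_cv = KFold(n_splits=5, shuffle=True, random_state=0)
reg_scores = cross_val_score(LinearRegression(), X_reg, y_reg, cv=reg_cv)

print('Accuracy per fold:', clf_scores)
print('R^2 per fold:     ', reg_scores)
```

Passing the splitter explicitly through cv keeps the fold definition (number of folds, shuffling, seed) visible and reproducible.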