The difference between StratifiedKFold (classification) and KFold (regression)
2022-07-03 13:14:00 【Levi Bebe】
1. The main difference between StratifiedKFold and KFold
StratifiedKFold: stratified sampling. The proportion of samples from each class in every training set and test set matches the proportion in the original data. (Classification problems)
KFold: not stratified. It divides the data into training and test sets without checking whether the class proportions in them match the original data. (Regression problems)
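One practical consequence of the distinction above is that StratifiedKFold needs class labels to stratify on, while KFold ignores the target entirely. The following is a minimal sketch, assuming a small made-up continuous target (`y_continuous` is purely illustrative), of what happens when each splitter is given regression-style data:

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

X = np.arange(8).reshape(8, 1)
# Made-up continuous (regression-style) target, for illustration only.
y_continuous = np.array([0.5, 1.7, 2.2, 3.9, 4.1, 5.6, 6.3, 7.8])

# KFold never looks at y, so a continuous target is fine.
n_kfold_splits = sum(1 for _ in KFold(n_splits=4).split(X, y_continuous))

# StratifiedKFold needs discrete class labels, so this raises ValueError.
try:
    list(StratifiedKFold(n_splits=4).split(X, y_continuous))
    stratified_error = None
except ValueError as exc:
    stratified_error = str(exc)

print(n_kfold_splits)
print(stratified_error)
```

This is why StratifiedKFold is paired with classification and KFold with regression.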
from sklearn.model_selection import KFold,StratifiedKFold
KFold(n_splits, shuffle, random_state)
Parameters:
n_splits: the number of folds, i.e. how many roughly equal parts the data is divided into
shuffle: whether to shuffle the samples before dividing them into folds
if False, the samples are split in their original order, so the folds are the same on every run and random_state has no effect
if True, the samples are shuffled before splitting (random sampling), so the folds differ from run to run unless random_state is fixed
random_state: the random seed used when shuffle=True. Setting it to a fixed value (0 is a common choice) produces the same split every time, which is convenient when tuning hyperparameters
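The effect of shuffle and random_state described above can be checked with a small sketch. The toy array and the variable names are assumptions made for illustration:

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(12).reshape(6, 2)  # 6 toy samples, 2 features each

# shuffle=False (the default): test folds are consecutive blocks of indices.
plain = KFold(n_splits=3)
plain_tests = [test.tolist() for _, test in plain.split(X)]
print(plain_tests)  # [[0, 1], [2, 3], [4, 5]]

# shuffle=True with a fixed seed: shuffled, but reproducible across runs.
seeded_a = KFold(n_splits=3, shuffle=True, random_state=0)
seeded_b = KFold(n_splits=3, shuffle=True, random_state=0)
tests_a = [test.tolist() for _, test in seeded_a.split(X)]
tests_b = [test.tolist() for _, test in seeded_b.split(X)]
print(tests_a == tests_b)  # True
```

Two splitters built with the same seed produce identical folds, which is what makes results comparable between tuning runs.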
StratifiedKFold(n_splits, shuffle, random_state)
Parameters:
n_splits: the number of folds, i.e. how many roughly equal parts the data is divided into
shuffle: whether to shuffle the samples of each class before dividing them into folds
if False, the samples are split in their original order, so the folds are the same on every run and random_state has no effect
if True, the samples are shuffled before splitting (random sampling), so the folds differ from run to run unless random_state is fixed
random_state: the random seed used when shuffle=True. Setting it to a fixed value (0 is a common choice) produces the same split every time, which is convenient when tuning hyperparameters
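To see the stratification itself, here is a minimal sketch on a hypothetical imbalanced label vector (8 samples of class 0 and 4 of class 1, invented for this illustration). It counts how many samples of each class land in every test fold:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical imbalanced data: 8 samples of class 0, 4 of class 1.
X = np.zeros((12, 1))
y = np.array([0] * 8 + [1] * 4)

skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)
test_counts = []
for train_idx, test_idx in skf.split(X, y):
    # Count how many samples of each class fall into this test fold.
    test_counts.append(np.bincount(y[test_idx], minlength=2).tolist())

print(test_counts)  # every fold holds 2 samples of class 0 and 1 of class 1
```

Each test fold keeps the 2:1 class ratio of the full data set, whichever samples the shuffle happens to pick.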
2. An example showing how StratifiedKFold and KFold differ
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

X = np.array([
    [1, 2, 3, 4],
    [11, 12, 13, 14],
    [21, 22, 23, 24],
    [31, 32, 33, 34],
    [41, 42, 43, 44],
    [51, 52, 53, 54],
    [61, 62, 63, 64],
    [71, 72, 73, 74]
])
y = np.array([1, 1, 0, 0, 1, 1, 0, 0])

# Use instance names that do not shadow the imported classes
kf = KFold(n_splits=4, shuffle=True, random_state=2021)
skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=2021)

print('---------------------KFold---------------------------')
for train, test in kf.split(X, y):
    print('Train: %s | test: %s' % (train, test))
    print(' Training set labels: %s' % y[train])
    print(' Test set labels: %s' % y[test])

print('----------------StratifiedKFold----------------------')
for train, test in skf.split(X, y):
    print('Train: %s | test: %s' % (train, test))
    print(' Training set labels: %s' % y[train])
    print(' Test set labels: %s' % y[test])

# The output is as follows
'''
---------------------KFold---------------------------
Train: [0 1 2 4 5 6] | test: [3 7]
 Training set labels: [1 1 0 1 1 0]
 Test set labels: [0 0]
Train: [0 1 3 4 5 7] | test: [2 6]
 Training set labels: [1 1 0 1 1 0]
 Test set labels: [0 0]
Train: [2 3 4 5 6 7] | test: [0 1]
 Training set labels: [0 0 1 1 0 0]
 Test set labels: [1 1]
Train: [0 1 2 3 6 7] | test: [4 5]
 Training set labels: [1 1 0 0 0 0]
 Test set labels: [1 1]
----------------StratifiedKFold----------------------
Train: [0 1 2 3 4 6] | test: [5 7]
 Training set labels: [1 1 0 0 1 0]
 Test set labels: [1 0]
Train: [0 1 2 3 5 7] | test: [4 6]
 Training set labels: [1 1 0 0 1 0]
 Test set labels: [1 0]
Train: [0 3 4 5 6 7] | test: [1 2]
 Training set labels: [1 0 1 1 0 0]
 Test set labels: [1 0]
Train: [1 2 4 5 6 7] | test: [0 3]
 Training set labels: [1 0 1 1 0 0]
 Test set labels: [1 0]
'''


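Summary:
In the example above, every KFold test fold contains samples of a single class, while every StratifiedKFold test fold contains one sample of each class, matching the 1:1 ratio of the full data set.
KFold is suitable for splitting regression data.
StratifiedKFold is suitable for splitting classification data.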
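As a rough sketch of how this pairing might look in practice, either splitter can be passed as the cv argument of cross_val_score. The synthetic data sets and the choice of LinearRegression and LogisticRegression are assumptions made for illustration, not something taken from the original post:

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import KFold, StratifiedKFold, cross_val_score

# Regression: continuous target, so use plain KFold.
X_reg, y_reg = make_regression(n_samples=100, n_features=5, noise=0.1, random_state=0)
reg_cv = KFold(n_splits=5, shuffle=True, random_state=0)
reg_scores = cross_val_score(LinearRegression(), X_reg, y_reg, cv=reg_cv)

# Classification: class labels, so use StratifiedKFold to keep class ratios.
X_clf, y_clf = make_classification(n_samples=100, n_features=5, random_state=0)
clf_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
clf_scores = cross_val_score(LogisticRegression(max_iter=1000), X_clf, y_clf, cv=clf_cv)

print(reg_scores.mean(), clf_scores.mean())
```

scikit-learn follows the same convention by default: when cv is given as an integer, cross_val_score uses StratifiedKFold for classifiers with binary or multiclass targets and KFold otherwise.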
References:
https://blog.csdn.net/qq_34107425/article/details/105548800
https://blog.csdn.net/wqh_jingsong/article/details/77896449