Shuffling pandas data and splitting it into training and test sets
2020-11-06 01:27:00 【Elementary school students in IT field】
Description
In machine learning, once we have a batch of training data, we usually need to split it into a training set and a test set, or into a training set, a cross-validation set, and a test set. To keep the feature distributions of the resulting subsets from being biased (for example, when the rows are stored in some sorted order), we first shuffle the data so that the row order is random, and only then split it.
The methods to use are as follows.
Note: df is a pd.DataFrame.
df = df.sample(frac=1.0): sampling 100% of the rows (without replacement, the default) returns every row in random order, which shuffles the data.
df = df.reset_index(drop=True): after shuffling, the index is scrambled as well. If the index carries no meaning, reset it with drop=True; a plain reset_index() would keep the old index as a new, meaningless column.
train = df.loc[0:a]: split the data. The proportions, and therefore a and b, depend on your situation. Note that .loc slices by label and includes both endpoints.
cv = df.loc[a+1:b]: the cross-validation set.
test = df.loc[b+1:]: the test set. Leave the end of the slice open; .loc treats -1 as a label, not as "the last row", so df.loc[b+1:-1] does not select the remaining rows.
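The steps above can be put together in a short sketch. The helper name shuffle_and_split, the 60/20/20 proportions, the seed, and the toy DataFrame are assumptions made for illustration and are not part of the original post.

```python
import pandas as pd

def shuffle_and_split(df, train_frac=0.6, cv_frac=0.2, seed=0):
    """Shuffle df, then cut it into train / cross-validation / test parts.

    Hypothetical helper: the name, fractions and seed are illustrative only.
    """
    # frac=1.0 samples every row exactly once, i.e. a random permutation.
    shuffled = df.sample(frac=1.0, random_state=seed)
    # drop=True discards the old, now scrambled, index instead of
    # keeping it as an extra column.
    shuffled = shuffled.reset_index(drop=True)

    n = len(shuffled)
    a = int(n * train_frac) - 1   # label of the last training row
    b = a + int(n * cv_frac)      # label of the last cross-validation row

    # .loc slices by label and includes both endpoints.
    train = shuffled.loc[0:a]
    cv = shuffled.loc[a + 1:b]
    test = shuffled.loc[b + 1:]   # open-ended: everything that is left
    return train, cv, test

# Toy data, made up for this example.
df = pd.DataFrame({"x": range(10), "y": [i % 2 for i in range(10)]})
train, cv, test = shuffle_and_split(df)
print(len(train), len(cv), len(test))
```

Passing random_state makes the shuffle reproducible; omit it if a different order on each run is wanted. Because the index is reset before slicing, the labels 0 to n-1 line up with row positions, which is what makes the label-based .loc slices behave like a positional split.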
Copyright notice
This article was written by [Elementary school students in IT field]. Please include a link to the original when reposting. Thank you.