Shuffling pandas data and splitting it into training and test sets
2020-11-06 01:27:00 【Elementary school students in IT field】
Description
In machine learning, once we have a batch of training data we usually split it into a training set and a test set, or into a training set, a cross-validation set, and a test set. To keep the feature distributions of the resulting subsets from being biased, we first shuffle the data so that the row order is random, and only then split it.
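The methods used are as follows:
Note: df denotes a pd.DataFrame.
df = df.sample(frac=1.0): sampling 100% of the rows (without replacement, the default) returns every row in random order, which shuffles the data.
df = df.reset_index(drop=True): after shuffling, the index is out of order as well. If the index carries no meaning, reset it with drop=True. A plain df.reset_index() also builds a fresh 0..n-1 index, but it keeps the old index as a new, meaningless column.
train = df.loc[0:a]: split off the training set; the proportions depend on the situation. .loc slices by label and includes both endpoints, so this selects rows 0 through a.
cv = df.loc[a+1:b]: the cross-validation set, rows a+1 through b.
test = df.loc[b+1:]: the test set, all remaining rows. Leave the end of the slice open: with .loc, -1 is a label rather than a position, so df.loc[b+1:-1] does not select through the last row.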
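The steps above can be put together in a minimal sketch. The function name shuffle_and_split, the 60/20/20 proportions, the fixed random_state, and the toy DataFrame are assumptions made for illustration; they are not part of the original post.

```python
import pandas as pd


def shuffle_and_split(df, train_frac=0.6, cv_frac=0.2, seed=0):
    """Shuffle df, then cut it into train / cv / test by label slices.

    Hypothetical helper: the fractions and seed are illustrative defaults.
    """
    # Shuffle all rows, then replace the scrambled index with 0..n-1.
    # drop=True discards the old index instead of keeping it as a column.
    shuffled = df.sample(frac=1.0, random_state=seed).reset_index(drop=True)

    n = len(shuffled)
    a = int(round(n * train_frac)) - 1   # last label of the training set
    b = a + int(round(n * cv_frac))      # last label of the cross-validation set

    # .loc slices are label-based and include both endpoints.
    train = shuffled.loc[0:a]
    cv = shuffled.loc[a + 1:b]
    test = shuffled.loc[b + 1:]          # open-ended: everything that is left
    return train, cv, test


# Toy data, invented for the example.
df = pd.DataFrame({"x": range(10), "y": [i % 2 for i in range(10)]})
train, cv, test = shuffle_and_split(df)
print(len(train), len(cv), len(test))
```

Passing random_state makes the shuffle reproducible; leave it out if a different order on every run is wanted.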
Copyright notice
This article was written by [Elementary school students in IT field]. Please include a link to the original when reposting. Thank you.