Shuffling pandas data and splitting it into training and test sets
2020-11-06 01:27:00 【Elementary school students in IT field】
In machine learning, once we have a batch of training data, we usually need to split it into a training set and a test set, or into training, cross-validation, and test sets. To keep the feature distributions of the resulting subsets from being biased, we first shuffle the data so that the row order is random, and only then split it.
The methods used are as follows:
Note: df denotes a pd.DataFrame
df = df.sample(frac=1.0): Sampling 100% of the rows without replacement returns every row in random order, which shuffles the data
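df = df.reset_index(): After shuffling, the index is out of order as well. By default this call keeps the old index as a new column and generates a fresh index that carries no meaning, which is what you want if the old index is meaningful as a feature. If the old index has no meaning, discard it with df.reset_index(drop=True) instead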
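train = df.loc[0:a]: Split the data; the proportions depend on your situation. Note that .loc slices by label and includes both endpoints, so this selects rows 0 through a
cv = df.loc[a+1:b]: The cross-validation set, rows a+1 through b
test = df.loc[b+1:]: The test set, from row b+1 to the end. Leave the end of the slice open: after the index reset, -1 is not an existing label, so df.loc[b+1:-1] would not select the remaining rows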
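The steps above can be put together in a short sketch. The toy DataFrame, its column names, the 60/20/20 proportions, and the random_state seed are all assumptions made for illustration; they are not part of the original post.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "feature": range(10),
    "label": [i % 2 for i in range(10)],
})

# Shuffle every row. random_state is an added assumption that makes
# the shuffle reproducible; omit it for a different order on each run.
df = df.sample(frac=1.0, random_state=0)

# Replace the shuffled index with 0..n-1 and discard the old values.
df = df.reset_index(drop=True)

# Assumed 60/20/20 split. a and b are the LAST labels of the training
# and cross-validation sets, because .loc slices include both endpoints.
n = len(df)
a = n * 6 // 10 - 1
b = n * 8 // 10 - 1

train = df.loc[0:a]      # rows 0..a
cv = df.loc[a + 1:b]     # rows a+1..b
test = df.loc[b + 1:]    # rows b+1..end

print(len(train), len(cv), len(test))  # prints: 6 2 2
```

Because the three slices use consecutive, non-overlapping label ranges, every row lands in exactly one subset.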
This article was written by [Elementary school students in IT field]. If you repost it, please include a link to the original. Thank you.