PyCaret source code analysis: downloading datasets (\Lib\site-packages\pycaret\datasets.py)
2022-08-01 00:00:00 【人工智能曾小健】

def get_data(
    dataset="index",
    save_copy=False,
    profile=False,
    verbose=True,
    address="https://raw.githubusercontent.com/pycaret/pycaret/master/datasets/",
):
    """
    This function loads sample datasets from git repository. List of available
    datasets can be checked using ``get_data('index')``.


    Example
    -------
    >>> from pycaret.datasets import get_data
    >>> all_datasets = get_data('index')
    >>> juice = get_data('juice')


    dataset: str, default = 'index'
        Index value of dataset.


    save_copy: bool, default = False
        When set to true, it saves a copy in current working directory.


    profile: bool, default = False
        When set to true, an interactive EDA report is displayed.


    verbose: bool, default = True
        When set to False, head of data is not displayed.


    address: string, default = "https://raw.githubusercontent.com/pycaret/pycaret/master/datasets/"
        Download url of dataset. For people have difficulty linking to github, they can change
        the default address to their own (e.g. "https://gitee.com/IncubatorShokuhou/pycaret/raw/master/datasets/")


    Returns:
        pandas.DataFrame


    Warnings
    --------
    - Use of ``get_data`` requires internet connection.

    """

    import pandas as pd
    import os.path
    from IPython.display import display, HTML, clear_output, update_display

    extension = ".csv"
    filename = str(dataset) + extension

    complete_address = address + filename

    if os.path.isfile(filename):
        data = pd.read_csv(filename)
    else:
        data = pd.read_csv(complete_address)

    # create a copy for pandas profiler
    data_for_profiling = data.copy()

    if save_copy:
        save_name = filename
        data.to_csv(save_name, index=False)

    if dataset == "index":
        display(data)

    else:
        if profile:
            import pandas_profiling

            pf = pandas_profiling.ProfileReport(data_for_profiling)
            display(pf)

        else:
            if verbose:
                display(data.head())

    return data
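
One detail in the listing is easy to miss: before downloading anything, get_data checks whether a file named <dataset>.csv already exists in the current working directory and, if it does, reads that file instead of the URL. The sketch below isolates just that lookup so it can be tried offline. The helper name resolve_source and its directory parameter are assumptions made for illustration; they are not part of pycaret, which always checks the current working directory. The file written in the demonstration is stand-in content, not the real juice dataset.

```python
import os
import tempfile

DEFAULT_ADDRESS = "https://raw.githubusercontent.com/pycaret/pycaret/master/datasets/"


def resolve_source(dataset, address=DEFAULT_ADDRESS, directory="."):
    """Return the local path or URL that would be handed to pd.read_csv.

    Hypothetical helper (not a pycaret function): it mirrors the
    os.path.isfile check in get_data, but takes an explicit directory
    instead of relying on the current working directory.
    """
    filename = str(dataset) + ".csv"
    local_path = os.path.join(directory, filename)
    if os.path.isfile(local_path):
        # A local copy exists (for example, one written by save_copy=True).
        return local_path
    # No local copy: fall back to the remote address.
    return address + filename


with tempfile.TemporaryDirectory() as tmp:
    # Nothing is cached yet, so the remote URL is chosen.
    before = resolve_source("juice", directory=tmp)

    # Simulate what save_copy=True leaves behind (stand-in content).
    with open(os.path.join(tmp, "juice.csv"), "w") as f:
        f.write("a,b\n1,2\n")

    # The local file now takes precedence over the URL.
    after = resolve_source("juice", directory=tmp)

print(before)
print(after)
```

A practical consequence: once save_copy=True has written a CSV into the working directory, later calls for the same dataset read that file rather than the network, and any unrelated file that happens to be named, say, juice.csv in the working directory will be loaded in place of the sample dataset.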