pycaret source code analysis: downloading datasets (\Lib\site-packages\pycaret\datasets.py)
2022-08-01 00:00:00 【人工智能曾小健】
def get_data(
    dataset="index",
    save_copy=False,
    profile=False,
    verbose=True,
    address="https://raw.githubusercontent.com/pycaret/pycaret/master/datasets/",
):
    """
    This function loads sample datasets from git repository. List of available
    datasets can be checked using ``get_data('index')``.

    Example
    -------
    >>> from pycaret.datasets import get_data
    >>> all_datasets = get_data('index')
    >>> juice = get_data('juice')

    dataset: str, default = 'index'
        Index value of dataset.

    save_copy: bool, default = False
        When set to true, it saves a copy in current working directory.

    profile: bool, default = False
        When set to true, an interactive EDA report is displayed.

    verbose: bool, default = True
        When set to False, head of data is not displayed.

    address: string, default = "https://raw.githubusercontent.com/pycaret/pycaret/master/datasets/"
        Download url of dataset. For people have difficulty linking to github, they can change
        the default address to their own (e.g. "https://gitee.com/IncubatorShokuhou/pycaret/raw/master/datasets/")

    Returns:
        pandas.DataFrame

    Warnings
    --------
    - Use of ``get_data`` requires internet connection.
    """
    import pandas as pd
    import os.path
    from IPython.display import display, HTML, clear_output, update_display

    extension = ".csv"
    filename = str(dataset) + extension
    complete_address = address + filename

    if os.path.isfile(filename):
        data = pd.read_csv(filename)
    else:
        data = pd.read_csv(complete_address)

    # create a copy for pandas profiler
    data_for_profiling = data.copy()

    if save_copy:
        save_name = filename
        data.to_csv(save_name, index=False)

    if dataset == "index":
        display(data)
    else:
        if profile:
            import pandas_profiling

            pf = pandas_profiling.ProfileReport(data_for_profiling)
            display(pf)
        else:
            if verbose:
                display(data.head())

    return data
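The lookup order in `get_data` is worth isolating: a CSV with the matching name in the current working directory takes precedence over the remote copy. The sketch below is illustrative only. `resolve_source`, `DEFAULT_ADDRESS`, and the `directory` parameter are hypothetical names, not part of pycaret. The helper only returns the path or URL that would be handed to `pd.read_csv`, so it runs without network access.

```python
import os.path

# Same default as the `address` parameter of get_data.
DEFAULT_ADDRESS = "https://raw.githubusercontent.com/pycaret/pycaret/master/datasets/"


def resolve_source(dataset="index", address=DEFAULT_ADDRESS, directory="."):
    """Return the location that get_data-style logic would read from.

    Hypothetical helper (not a pycaret API). It mirrors the
    ``os.path.isfile(filename)`` check and the fallback to
    ``address + filename``. The ``directory`` parameter is an addition for
    illustration; the original always checks the current working directory.
    """
    filename = str(dataset) + ".csv"
    local_path = os.path.join(directory, filename)
    if os.path.isfile(local_path):
        # A local copy exists, so the remote address is never contacted.
        return local_path
    return address + filename


if __name__ == "__main__":
    print(resolve_source("juice"))
```

One consequence of this ordering: after a call with `save_copy=True`, later calls for the same dataset from the same working directory read the saved file and ignore `address`.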