pycaret source code analysis: downloading datasets (\Lib\site-packages\pycaret\datasets.py)
2022-08-01 00:00:00 【人工智能曾小健】

def get_data(
    dataset="index",
    save_copy=False,
    profile=False,
    verbose=True,
    address="https://raw.githubusercontent.com/pycaret/pycaret/master/datasets/",
):
    """
    This function loads sample datasets from git repository. List of available
    datasets can be checked using ``get_data('index')``.

    Example
    -------
    >>> from pycaret.datasets import get_data
    >>> all_datasets = get_data('index')
    >>> juice = get_data('juice')

    dataset: str, default = 'index'
        Index value of dataset.

    save_copy: bool, default = False
        When set to true, it saves a copy in current working directory.

    profile: bool, default = False
        When set to true, an interactive EDA report is displayed.

    verbose: bool, default = True
        When set to False, head of data is not displayed.

    address: string, default = "https://raw.githubusercontent.com/pycaret/pycaret/master/datasets/"
        Download url of dataset. For people have difficulty linking to github, they can change
        the default address to their own (e.g. "https://gitee.com/IncubatorShokuhou/pycaret/raw/master/datasets/")

    Returns:
        pandas.DataFrame

    Warnings
    --------
    - Use of ``get_data`` requires internet connection.
    """

    import pandas as pd
    import os.path
    from IPython.display import display, HTML, clear_output, update_display

    extension = ".csv"
    filename = str(dataset) + extension

    complete_address = address + filename

    if os.path.isfile(filename):
        data = pd.read_csv(filename)
    else:
        data = pd.read_csv(complete_address)

    # create a copy for pandas profiler
    data_for_profiling = data.copy()

    if save_copy:
        save_name = filename
        data.to_csv(save_name, index=False)

    if dataset == "index":
        display(data)

    else:
        if profile:
            import pandas_profiling

            pf = pandas_profiling.ProfileReport(data_for_profiling)
            display(pf)

        else:
            if verbose:
                display(data.head())

    return data
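
The key step in get_data is how it chooses where to read from: if a file named <dataset>.csv already exists in the current working directory, it is read locally; otherwise the file name is appended to address and pandas downloads it. The sketch below isolates only that decision so it can be tried without a network connection. The helper resolve_source, the constant DEFAULT_ADDRESS, and the directory parameter are hypothetical names introduced here for illustration; they are not part of pycaret, and the real function always checks the current working directory.

```python
import os
import tempfile

# Same default as the `address` parameter of get_data.
DEFAULT_ADDRESS = "https://raw.githubusercontent.com/pycaret/pycaret/master/datasets/"


def resolve_source(dataset="index", address=DEFAULT_ADDRESS, directory="."):
    """Return the path or URL that get_data would hand to pd.read_csv.

    Hypothetical helper (not in pycaret). `directory` is an assumption
    added for testability; get_data itself only looks in the current
    working directory.
    """
    filename = str(dataset) + ".csv"
    local_path = os.path.join(directory, filename)
    # A local file with the same name takes priority over the remote copy.
    if os.path.isfile(local_path):
        return local_path
    # Otherwise fall back to plain string concatenation of address + filename.
    return address + filename


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # No local file yet: the remote URL is chosen.
        print(resolve_source("juice", directory=tmp))
        # Simulate an earlier call with save_copy=True by creating the file.
        open(os.path.join(tmp, "juice.csv"), "w").close()
        # Now the local path is chosen instead.
        print(resolve_source("juice", directory=tmp))
```

Two consequences follow from this logic. After one call with save_copy=True, later calls for the same dataset from the same directory read the saved copy and no longer need a connection. Conversely, any unrelated file that happens to be named, for example, juice.csv in the working directory will be loaded in place of the remote dataset. Because the URL is built by simple concatenation, a custom address should end with a trailing slash.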