pycaret source code analysis: downloading datasets (Lib\site-packages\pycaret\datasets.py)
2022-08-01 00:03:00 【Artificial Intelligence Zeng Xiaojian】
def get_data(
    dataset="index",
    save_copy=False,
    profile=False,
    verbose=True,
    address="https://raw.githubusercontent.com/pycaret/pycaret/master/datasets/",
):
    """
    This function loads sample datasets from git repository. List of available
    datasets can be checked using ``get_data('index')``.

    Example
    -------
    >>> from pycaret.datasets import get_data
    >>> all_datasets = get_data('index')
    >>> juice = get_data('juice')

    dataset: str, default = 'index'
        Index value of dataset.

    save_copy: bool, default = False
        When set to true, it saves a copy in current working directory.

    profile: bool, default = False
        When set to true, an interactive EDA report is displayed.

    verbose: bool, default = True
        When set to False, head of data is not displayed.

    address: string, default = "https://raw.githubusercontent.com/pycaret/pycaret/master/datasets/"
        Download url of dataset. For people have difficulty linking to github, they can change
        the default address to their own (e.g. "https://gitee.com/IncubatorShokuhou/pycaret/raw/master/datasets/")

    Returns:
        pandas.DataFrame

    Warnings
    --------
    - Use of ``get_data`` requires internet connection.
    """

    import pandas as pd
    import os.path
    from IPython.display import display, HTML, clear_output, update_display

    extension = ".csv"
    filename = str(dataset) + extension

    complete_address = address + filename

    if os.path.isfile(filename):
        data = pd.read_csv(filename)
    else:
        data = pd.read_csv(complete_address)

    # create a copy for pandas profiler
    data_for_profiling = data.copy()

    if save_copy:
        save_name = filename
        data.to_csv(save_name, index=False)

    if dataset == "index":
        display(data)

    else:
        if profile:
            import pandas_profiling

            pf = pandas_profiling.ProfileReport(data_for_profiling)
            display(pf)

        else:
            if verbose:
                display(data.head())

    return data
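The listing makes two decisions that are easy to miss: where the CSV is read from (a file in the current working directory takes priority over the remote address) and what is displayed (the 'index' dataset is always shown in full, regardless of `profile` and `verbose`). The sketch below isolates just that logic so it can be checked without pandas or a network connection. The helper names `resolve_dataset_source` and `display_mode`, and the `local_dir` parameter, are hypothetical and are not part of pycaret.

```python
import os
import tempfile

# Default address copied from the get_data signature above.
DEFAULT_ADDRESS = "https://raw.githubusercontent.com/pycaret/pycaret/master/datasets/"


def resolve_dataset_source(dataset, address=DEFAULT_ADDRESS, local_dir="."):
    """Hypothetical helper: return the path or URL that would be read.

    Mirrors the os.path.isfile check in get_data. get_data always looks in
    the current working directory; local_dir is an assumption added here so
    the behaviour can be exercised in a temporary folder.
    """
    filename = str(dataset) + ".csv"
    local_path = os.path.join(local_dir, filename)
    if os.path.isfile(local_path):
        return local_path        # a local copy wins, no download happens
    return address + filename    # otherwise fall back to the remote URL


def display_mode(dataset, profile=False, verbose=True):
    """Hypothetical helper: name the display branch get_data would take."""
    if dataset == "index":
        return "full"            # the index table is always shown in full
    if profile:
        return "profile"         # pandas_profiling report
    return "head" if verbose else "none"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        # No local file yet: the remote URL is chosen.
        print(resolve_dataset_source("juice", local_dir=folder))
        # Create an empty juice.csv: the local path is chosen instead.
        open(os.path.join(folder, "juice.csv"), "w").close()
        print(resolve_dataset_source("juice", local_dir=folder))
    print(display_mode("juice", profile=False, verbose=False))
```

One practical consequence: after a call with `save_copy=True`, later calls for the same dataset from the same working directory read the saved file rather than the remote copy, so a stale or edited local CSV silently shadows the repository version.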