Data Inspector is an open-source python library that brings 15++ types of different functions to make EDA, data cleaning easier.

Overview

Data Inspector

Author MIT Contributions welcome Stars Downloads

Data Inspector is an open-source python library that brings 15 types of different functions to make EDA, data cleaning easier.

Author: Kazi Amit Hasan

Project Description:

Data Inspector brings 15++ essential exploratory data analysis, data cleaning automations to make a dataset understandable. This is a perfect tool to get started with you data.

Latest Added Feature:

Added regplots in the library

Installation:

pip install data-inspector

Package available at https://pypi.org/project/data-inspector/

Available automation:

  1. Line plot : line_plot(data, x_data, y_data, x_label="", y_label="", title="")
  2. Skew feature: plot_skewed_feature(data, column)
  3. Showing data distribution: show_distribution(data, column)
  4. Scatter plot: plot_scatter(data,x_data, y_data)
  5. Correlation plot: plot_correlation(data)
  6. Create histogram: histogram(data,column, x_label, y_label, title)
  7. Create bar plot: plot_bar(data, column, xlabel, ylabel, title)
  8. Create boxplots of all features: box_plot(data)
  9. Checking dataset's shape: datasetShape(data)
  10. Get dataset's diagnostic plots: diagnostic_plots(data, variable)
  11. Divide numerical and categorical features: divideFeatures(data)
  12. Fill NaN values: fillNan(data, column, value)
  13. Get pearson's correlation between two variables: get_correlation(column_1, column_2, data)
  14. Plotting kde plots: plot_cont_kde(data, var)
  15. Automatic calculating the missing values and their percentage along with visualization : calculating_missing_values(data)
  16. Regression plot with 95% CI : plot_regplot(data,x_data, y_data)

Tutorial:

Link: https://github.com/AmitHasanShuvo/data-inspector/blob/main/notebook/example%20notebook.ipynb
Colab link: https://colab.research.google.com/drive/1mj9gz2XyQprSYdKMUKlKkJ9Qi8XmleHW?usp=sharing

Some visualizations:



How to cite:

@online{data-inspector,
title={data-inspector},
url={https://pypi.org/project/data-inspector/},
urldate = {2021-08-21}, 
publisher={Kazi Amit Hasan}
}

Future Works:

  1. Add some automations for time series data.

How to contribute:

Any contribution would be highly appreciated. Kindly go through the guidelines for contributing in github.

You might also like...
Sphinx-performance - CLI tool to measure the build time of different, free configurable Sphinx-Projects
Sphinx-performance - CLI tool to measure the build time of different, free configurable Sphinx-Projects

CLI tool to measure the build time of different, free configurable Sphinx-Projec

A module filled with many useful functions and modules in various subjects.
A module filled with many useful functions and modules in various subjects.

Usefulpy Check out the Usefulpy site Usefulpy site is not always up to date Download and Import download and install with with pip download usefulpyth

Template repo to quickly make a tested and documented GitHub action in Python with Poetry

Python + Poetry GitHub Action Template Getting started from the template Rename the src/action_python_poetry package. Globally replace instances of ac

Make posters from Markdown files.
Make posters from Markdown files.

MkPosters Create posters using Markdown. Supports icons, admonitions, and LaTeX mathematics. At the moment it is restricted to the specific layout of

A tutorial for people to run synthetic data replica's from source healthcare datasets
A tutorial for people to run synthetic data replica's from source healthcare datasets

Synthetic-Data-Replica-for-Healthcare Description What is this? A tailored hands-on tutorial showing how to use Python to create synthetic data replic

A Python library for setting up projects using tabular data.

A Python library for setting up projects using tabular data. It can create project folders, standardize delimiters, and convert files to CSV from either individual files or a directory.

Source Code for 'Practical Python Projects' (video) by Sunil Gupta
Source Code for 'Practical Python Projects' (video) by Sunil Gupta

Apress Source Code This repository accompanies %Practical Python Projects by Sunil Gupta (Apress, 2021). Download the files as a zip using the green b

Automatically open a pull request for repositories that have no CONTRIBUTING.md file

automatic-contrib-prs Automatically open a pull request for repositories that have no CONTRIBUTING.md file for a targeted set of repositories. What th

The source code that powers readthedocs.org

Welcome to Read the Docs Purpose Read the Docs hosts documentation for the open source community. It supports Sphinx docs written with reStructuredTex

Releases(eda)
  • eda(Aug 19, 2021)

    Data Inspector brings a total of 15 essential exploratory data analysis, data cleaning automations to make a dataset understandable. This is a perfect tool to get started with you data.

    PYPI link: https://pypi.org/project/data-inspector/

    Source code(tar.gz)
    Source code(zip)
Owner
Kazi Amit Hasan
ML Engineer at ACI Limited | Kaggle Competition Expert (x4) | Researcher
Kazi Amit Hasan
The OpenAPI Specification Repository

The OpenAPI Specification The OpenAPI Specification is a community-driven open specification within the OpenAPI Initiative, a Linux Foundation Collabo

OpenAPI Initiative 25.5k Dec 29, 2022
SamrSearch - SamrSearch can get user info and group info with MS-SAMR

SamrSearch SamrSearch can get user info and group info with MS-SAMR.like net use

knight 10 Oct 06, 2022
LotteryBuyPredictionWebApp - Lottery Purchase Prediction Model

Lottery Purchase Prediction Model Objective and Goal Predict the lottery type th

Wanxuan Zhang 2 Feb 14, 2022
SCTYMN is a GitHub repository that includes some simple scripts(currently only python scripts) that can be useful.

Simple Codes That You Might Need SCTYMN is a GitHub repository that includes some simple scripts(currently only python scripts) that can be useful. In

CodeWriter21 2 Jan 21, 2022
Plugins for MkDocs.

Plugins for MkDocs and Python Markdown pip install neoteroi-mkdocs This package includes the following plugins and extensions: Name Description Type m

35 Dec 23, 2022
A document format conversion service based on Pandoc.

reformed Document format conversion service based on Pandoc. Usage The API specification for the Reformed server is as follows: GET /api/v1/formats: L

David Lougheed 3 Jul 18, 2022
DeltaPy - Tabular Data Augmentation (by @firmai)

DeltaPy⁠⁠ — Tabular Data Augmentation & Feature Engineering Finance Quant Machine Learning ML-Quant.com - Automated Research Repository Introduction T

Derek Snow 470 Dec 28, 2022
Explain yourself! Interrogate a codebase for docstring coverage.

interrogate: explain yourself Interrogate a codebase for docstring coverage. Why Do I Need This? interrogate checks your code base for missing docstri

Lynn Root 435 Dec 29, 2022
layout-parser 3.4k Dec 30, 2022
More detailed upload statistics for Nicotine+

More Upload Statistics A small plugin for Nicotine+ 3.1+ to create more detailed upload statistics. ⚠ No data previous to enabling this plugin will be

Nick 1 Dec 17, 2021
Numpy's Sphinx extensions

numpydoc -- Numpy's Sphinx extensions This package provides the numpydoc Sphinx extension for handling docstrings formatted according to the NumPy doc

NumPy 234 Dec 26, 2022
Mkdocs obsidian publish - Publish your obsidian vault through a python script

Mkdocs Obsidian Mkdocs Obsidian is an association between a python script and a

Mara 49 Jan 09, 2023
Plover jyutping - Plover plugin for Jyutping input

Plover plugin for Jyutping Installation Navigate to the repo directory: cd plove

Samuel Lo 1 Mar 17, 2022
Fast, efficient Blowfish cipher implementation in pure Python (3.4+).

blowfish This module implements the Blowfish cipher using only Python (3.4+). Blowfish is a block cipher that can be used for symmetric-key encryption

Jashandeep Sohi 41 Dec 31, 2022
An open source utility for creating publication quality LaTex figures generated from OpenFOAM data files.

foamTEX An open source utility for creating publication quality LaTex figures generated from OpenFOAM data files. Explore the docs » Report Bug · Requ

1 Dec 19, 2021
MkDocs Plugin allowing your visitors to *File > Print > Save as PDF* the entire site.

mkdocs-print-site-plugin MkDocs plugin that adds a page to your site combining all pages, allowing your site visitors to File Print Save as PDF th

Tim Vink 67 Jan 04, 2023
Python For Finance Cookbook - Code Repository

Python For Finance Cookbook - Code Repository

Packt 544 Dec 25, 2022
A `:github:` role for Sphinx

sphinx-github-role A github role for Sphinx. Usage Basic usage MyST: :caption: index.md See {github}`astrojuanlu/sphinx-github-role#1`. reStructuredT

Juan Luis Cano Rodríguez 4 Nov 22, 2022
An awesome Data Science repository to learn and apply for real world problems.

AWESOME DATA SCIENCE An open source Data Science repository to learn and apply towards solving real world problems. This is a shortcut path to start s

Academic.io 20.3k Jan 09, 2023
Main repository for the Sphinx documentation builder

Sphinx Sphinx is a tool that makes it easy to create intelligent and beautiful documentation for Python projects (or other documents consisting of mul

5.1k Jan 04, 2023