Notebooks are hard to maintain. Teams often prototype projects in notebooks, but maintaining them is an error-prone process that slows progress down. Ploomber overcomes the challenges of working with .ipynb files, allowing teams to develop collaborative, production-ready pipelines using JupyterLab or any text editor.
Main Features
- Scripts as notebooks. Open .py files as notebooks, then execute them from the terminal and generate an output notebook to review results.
- Dependency resolution. Quickly build a DAG by referring to previous tasks in your code; Ploomber infers execution order and orchestrates execution.
- Incremental builds. Speed up iterations by skipping tasks whose source code hasn't changed since the last execution.
- Production-ready. Deploy to Kubernetes (via Argo Workflows), Airflow, and AWS Batch without code changes.
- Parallelization. Run independent tasks in parallel.
- Testing. Import pipelines in any testing framework and test them with any CI service (e.g. GitHub Actions).
- Flexible. Use Jupyter notebooks, Python scripts, R scripts, SQL scripts, Python functions, or a combination of them as pipeline tasks. Write pipelines using a pipeline.yaml file or with Python.
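To make the dependency resolution and incremental build ideas concrete, here is a minimal conceptual sketch that uses only the Python standard library (graphlib requires Python 3.9 or higher). It is not Ploomber's implementation or API: the task names, the TASKS dictionary, and the build function are hypothetical, and the rule that a task also re-runs when one of its upstream tasks ran is an assumption made for this illustration.

```python
import hashlib
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical tasks: name -> (source code, names of upstream tasks)
TASKS = {
    "get": ("load raw data", []),
    "clean": ("drop missing rows", ["get"]),
    "features": ("compute features", ["clean"]),
    "plot": ("plot cleaned data", ["clean"]),
}


def execution_order(tasks):
    """Return task names so that every task comes after its upstream tasks."""
    graph = {name: set(upstream) for name, (_, upstream) in tasks.items()}
    return list(TopologicalSorter(graph).static_order())


def source_hash(source):
    """Fingerprint a task's source code so changes can be detected."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def build(tasks, previous_hashes):
    """Select outdated tasks; return (names selected to run, new hashes)."""
    executed, hashes = [], {}
    for name in execution_order(tasks):
        source, upstream = tasks[name]
        hashes[name] = source_hash(source)
        changed = previous_hashes.get(name) != hashes[name]
        # Assumption for this sketch: re-run when an upstream task ran
        upstream_ran = any(dep in executed for dep in upstream)
        if changed or upstream_ran:
            executed.append(name)  # a real runner would execute the task here
    return executed, hashes


first_run, hashes = build(TASKS, {})       # nothing built yet: all tasks run
second_run, hashes = build(TASKS, hashes)  # no source changes: nothing runs

# Edit the source of one task, then build again
TASKS["plot"] = ("plot cleaned data with new colors", ["clean"])
third_run, hashes = build(TASKS, hashes)

print(first_run)
print(second_run)
print(third_run)
```

In this sketch the first build selects all four tasks in dependency order, the second build selects none, and the third selects only plot, since only its source changed.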
Resources
- Documentation
- Examples (Machine Learning pipeline, ETL, among others)
- Blog
- Guest blog post on the official Jupyter blog
- Comparison with other tools
- JupyterCon 2020 talk
- Argo Community Meeting talk
- Pangeo Showcase talk (AWS Batch demo)
Installation
Compatible with Python 3.6 and higher.
Install with pip:
pip install ploomber
Or with conda:
conda install ploomber -c conda-forge
Getting started
Use Binder to try out Ploomber without setting up an environment.
Or run an example locally:
# ML pipeline example
ploomber examples --name ml-basic
cd ml-basic
# if using pip
pip install -r requirements.txt
# if using conda
conda env create --file environment.yml
conda activate ml-basic
# run pipeline
ploomber build
The pipeline output is saved in the output/ folder. Check out the pipeline definition in the pipeline.yaml file.
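To review what the build produced, a short script like the following can list the files under the output/ folder. This is a sketch using only the standard library; list_outputs is a hypothetical helper (not part of Ploomber), and the default folder name is taken from the text above.

```python
from pathlib import Path


def list_outputs(folder="output"):
    """Return sorted relative paths of files under folder, or [] if it is missing."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


# Print every file found under output/ (prints nothing if the folder is absent)
for path in list_outputs():
    print(path)
```

Run it from the example's root directory (ml-basic in the commands above) after the build finishes.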
To get a list of examples, run ploomber examples.
More examples are available in our examples repository.