A lossless neural compression framework built on top of JAX.

Overview

Kompressor

GitHub

Branch CI Coverage
main (active) Build codecov
main Build codecov
development Build codecov

A neural compression framework built on top of JAX.

Install

setup.py assumes a compatible version of JAX and JAXLib are already installed. Automated build is tested for a cuda:11.1-cudnn8-runtime-ubuntu20.04 environment with jaxlib==0.1.76+cuda11.cudnn82.

git clone https://github.com/rosalindfranklininstitute/kompressor.git
cd kompressor
pip install -e .

# Run tests
python -m pytest --cov=src/kompressor tests/

Install & Run through Docker environment

Docker image for the Kompressor dependencies are provided in the quay.io/rosalindfranklininstitute/kompressor:main Quay.io image.

# Run the container for the Kompressor environment
docker run --rm quay.io/rosalindfranklininstitute/kompressor:main \
    python -m pytest --cov=/usr/local/kompressor/src/kompressor /usr/local/kompressor/tests

Install & Run through Singularity environment

Singularity image for the Kompressor dependencies are provided in the rosalindfranklininstitute/kompressor/kompressor:main cloud.sylabs.io image.

singularity pull library://rosalindfranklininstitute/kompressor/kompressor:main
singularity run kompressor_main.sif \
    python -m pytest --cov=/usr/local/kompressor/src/kompressor /usr/local/kompressor/tests
Comments
  • Refactor map tuples to dicts

    Refactor map tuples to dicts

    Closes #14. Functions which currently return an ordered tuple of maps (lrmap, udmap, cmap, ...) now return keyed dictionaries { 'lrmap': lrmap, 'udmap': udmap, 'cmap': cmap, ... } so that order/usage is explicitly enforced.

    List comprehensions over the tuples now use jax.tree_map and jax.tree_multimap to ensure key safety.

    @GMW99, this will break the current implementation of the Metrics Callback class which iterates over a zip of the hardcoded map names and the maps tuple. This iteration can be replaced by iterating over maps.items() since it is now a dict already.

    enhancement 
    opened by JossWhittle 1
  • Ensure jax.jit static_argnums is refactored to static_argnames

    Ensure jax.jit static_argnums is refactored to static_argnames

    Functions that currently mark static_argnums=(0, 1, 2) should be updated to use the safer static_argnames=('tom', 'dick', 'harry') that is now available.

    enhancement high priority 
    opened by JossWhittle 1
  • Update development examples

    Update development examples

    • Splits docker image into JAX base image and Kompressor dependency and install image
    • JAX image installs JAX from source to ensure correct CUDA / CUDNN versions
    • Adjust setup.py to install dependencies from requirement.txt
    • Refactors a how submodules are imported (within the kom.image submodule. Need to check volumes matches)
    • Add kom.image.data submodule for dealing with tensorflow data pipelines
    • Fixed pooling in the total variation losses (used as metrics in the example notebooks)
    • Move all the encoding/decoding functions for the maps into a kom.mapping submodule
    • Add within-k and run-length metrics to kom.image.metrics for example notebooks
    • Added example notebooks for interacting with the maps and training a basic Haiku compression model
    feature 
    opened by JossWhittle 0
  • Add mapping encode/decode functions for float32 data

    Add mapping encode/decode functions for float32 data

    Will need a bit of thinking to get right. We probably need to consider similar tricks that we used for applying Radix Sort on float32 data to make the compression numerically stable and portable between machines.

    enhancement low priority 
    opened by JossWhittle 0
  • Add mapping encode/decode functions for uint32 data

    Add mapping encode/decode functions for uint32 data

    Some of our data is uint32 volumes.

    Will need to trace through the full compression implementation and make sure intermediate value dtypes are large enough to avoid uint32 overflow when needed.

    enhancement low priority 
    opened by JossWhittle 0
  • Modify core encode decode functions to pass a dict to the prediction function

    Modify core encode decode functions to pass a dict to the prediction function

    Currently the lowres inputs are passed directly to the prediction_fn as the only input.

    • Modify to accept a dict that has at least one key for the lowres input.

    • Provide boolean flag to also pass a positional encoding tensor along with the lowres which the model can use if needed.

    • Chunked encode decode will need to generate the correct chunks of the positional encoding for the current chunk.

    • Model can choose how to use positional encodings.

      • Image case would receive (B, H, W, 2) tensor containing the Y and X coordinates of each pixel in the trailing axis.
      • Volume case would receive (B, D, H, W, 3) tensor containing the Z, Y, and X coordinates of each voxel in the trailing axis.
    enhancement high priority 
    opened by JossWhittle 0
  • Look at decompressing sliced chunks

    Look at decompressing sliced chunks

    Decompress sliced chunk of image or volume without needing to decompress the entire data element.

    • May require applying secondary compression in blocks to avoid needing to decompress the full level maps, only to apply the predictor to the target slice.

    • Instead unpack just the blocks needed for the slice then trim.

    • A kompressor (or stack of) trained to secondary compress the maps from the primary kompressor (or stack of) would be able to naturally handle slice chunked decoding.

      • Could such a secondary compressor be shared between levels? Between multiple kompressors in the primary stack?
    experiment low priority 
    opened by JossWhittle 0
  • Look at compressing timeseries data

    Look at compressing timeseries data

    • Experiment with implementing the 1D case for compressing signals.
    • Video as sequence of 2D frames using the 3D volume code directly.
    • Look at compressing within timestep using information from neighbouring timesteps without actually compressing (dropping frames) the temporal axis.
    experiment low priority 
    opened by JossWhittle 0
Releases(v0.0.0)
Owner
Rosalind Franklin Institute
The Rosalind Franklin Institute is dedicated to transforming life science through interdisciplinary research and technology development
Rosalind Franklin Institute
Clustergram - Visualization and diagnostics for cluster analysis in Python

Clustergram Visualization and diagnostics for cluster analysis Clustergram is a diagram proposed by Matthias Schonlau in his paper The clustergram: A

Martin Fleischmann 96 Dec 26, 2022
Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research

Welcome to AirSim AirSim is a simulator for drones, cars and more, built on Unreal Engine (we now also have an experimental Unity release). It is open

Microsoft 13.8k Jan 03, 2023
The repository offers the official implementation of our BMVC 2021 paper in PyTorch.

CrossMLP Cascaded Cross MLP-Mixer GANs for Cross-View Image Translation Bin Ren1, Hao Tang2, Nicu Sebe1. 1University of Trento, Italy, 2ETH, Switzerla

Bingoren 16 Jul 27, 2022
An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models.

An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models. Hyperactive: is very easy to lear

Simon Blanke 422 Jan 04, 2023
Code for MarioNette: Self-Supervised Sprite Learning, in NeurIPS 2021

MarioNette | Webpage | Paper | Video MarioNette: Self-Supervised Sprite Learning Dmitriy Smirnov, Michaël Gharbi, Matthew Fisher, Vitor Guizilini, Ale

Dima Smirnov 28 Nov 18, 2022
Implementation for paper "Towards the Generalization of Contrastive Self-Supervised Learning"

Contrastive Self-Supervised Learning on CIFAR-10 Paper "Towards the Generalization of Contrastive Self-Supervised Learning", Weiran Huang, Mingyang Yi

Weiran Huang 13 Nov 30, 2022
Official PyTorch implementation of the paper "Graph-based Generative Face Anonymisation with Pose Preservation" in ICIAP 2021

Contents AnonyGAN Installation Dataset Preparation Generating Images Using Pretrained Model Train and Test New Models Evaluation Acknowledgments Citat

Nicola Dall'Asen 10 May 24, 2022
A Pytorch Implementation of ClariNet

ClariNet A Pytorch Implementation of ClariNet (Mel Spectrogram -- Waveform) Requirements PyTorch 0.4.1 & python 3.6 & Librosa Examples Step 1. Downlo

Sungwon Kim 286 Sep 15, 2022
The Medical Detection Toolkit contains 2D + 3D implementations of prevalent object detectors such as Mask R-CNN, Retina Net, Retina U-Net, as well as a training and inference framework focused on dealing with medical images.

The Medical Detection Toolkit contains 2D + 3D implementations of prevalent object detectors such as Mask R-CNN, Retina Net, Retina U-Net, as well as a training and inference framework focused on dea

MIC-DKFZ 1.2k Jan 04, 2023
The official implementation of Variable-Length Piano Infilling (VLI).

Variable-Length-Piano-Infilling The official implementation of Variable-Length Piano Infilling (VLI). (paper: Variable-Length Music Score Infilling vi

29 Sep 01, 2022
code for EMNLP 2019 paper Text Summarization with Pretrained Encoders

PreSumm This code is for EMNLP 2019 paper Text Summarization with Pretrained Encoders Updates Jan 22 2020: Now you can Summarize Raw Text Input!. Swit

Yang Liu 1.2k Dec 28, 2022
Bling's Object detection tool

BriVL for Building Applications This repo is used for illustrating how to build applications by using BriVL model. This repo is re-implemented from fo

chuhaojin 47 Nov 01, 2022
Apply our monocular depth boosting to your own network!

MergeNet - Boost Your Own Depth Boost custom or edited monocular depth maps using MergeNet Input Original result After manual editing of base You can

Computational Photography Lab @ SFU 142 Dec 17, 2022
Trading Strategies for Freqtrade

Freqtrade Strategies Strategies for Freqtrade, developed primarily in a partnership between @werkkrew and @JimmyNixx from the Freqtrade Discord. Use t

Bryan Chain 242 Jan 07, 2023
Generate images from texts. In Russian

ruDALL-E Generate images from texts pip install rudalle==1.1.0rc0 🤗 HF Models: ruDALL-E Malevich (XL) ruDALL-E Emojich (XL) (readme here) ruDALL-E S

AI Forever 1.6k Dec 31, 2022
Source code for NAACL 2021 paper "TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference"

TR-BERT Source code and dataset for "TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference". The code is based on huggaface's transformers.

THUNLP 37 Oct 30, 2022
Code repo for "Cross-Scale Internal Graph Neural Network for Image Super-Resolution" (NeurIPS'20)

IGNN Code repo for "Cross-Scale Internal Graph Neural Network for Image Super-Resolution" [paper] [supp] Prepare datasets 1 Download training dataset

Shangchen Zhou 278 Jan 03, 2023
Keras implementation of AdaBound

AdaBound for Keras Keras port of AdaBound Optimizer for PyTorch, from the paper Adaptive Gradient Methods with Dynamic Bound of Learning Rate. Usage A

Somshubra Majumdar 132 Sep 23, 2022
Publication describing 3 ML examples at NSLS-II and interfacing into Bluesky

Machine learning enabling high-throughput and remote operations at large-scale user facilities. Overview This repository contains the source code and

BNL 4 Sep 24, 2022
Face detection using deep learning.

Face Detection Docker Solution Using Faster R-CNN Dockerface is a deep learning face detector. It deploys a trained Faster R-CNN network on Caffe thro

Nataniel Ruiz 181 Dec 19, 2022