Jax/Flax implementation of Variational-DiffWave.

Last update: Dec 16, 2022

Overview

jax-variational-diffwave

Jax/Flax implementation of Variational-DiffWave. (Zhifeng Kong et al., 2020, Diederik P. Kingma et al., 2021.)

DiffWave with Continuous-time Variational Diffusion Models.
DiffWave: A Versatile Diffusion Model for Audio Synthesis, Zhifeng Kong et al., 2020. [arXiv:2009.09761]
Variational Diffusion Models, Diederik P. Kingma et al., 2021. [arXiv:2107.00630]
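
As a rough illustration of the continuous-time formulation in Variational Diffusion Models, the sketch below samples a noised waveform z_t from q(z_t | x) using a variance-preserving process parameterized by gamma(t) = -log SNR(t). It is not this repository's code: the linear schedule, the endpoint values, and the names `gamma` and `diffuse` are assumptions chosen for illustration, and the actual model may learn its schedule instead.

```python
import jax
import jax.numpy as jnp

# Illustrative schedule endpoints (assumed values, not taken from this repository).
GAMMA_MIN, GAMMA_MAX = -13.3, 5.0


def gamma(t):
    """Linear schedule for gamma(t) = -log SNR(t), with t in [0, 1]."""
    return GAMMA_MIN + (GAMMA_MAX - GAMMA_MIN) * t


def diffuse(key, x, t):
    """Sample z_t ~ N(alpha_t * x, sigma_t^2 * I) for a scalar time t."""
    g = gamma(t)
    alpha = jnp.sqrt(jax.nn.sigmoid(-g))  # alpha_t^2 = sigmoid(-gamma(t))
    sigma = jnp.sqrt(jax.nn.sigmoid(g))   # sigma_t^2 = sigmoid(gamma(t))
    eps = jax.random.normal(key, x.shape)
    return alpha * x + sigma * eps, eps


key = jax.random.PRNGKey(0)
x = jnp.zeros((2, 16000))  # placeholder batch of waveforms, [B, T]
z, eps = diffuse(key, x, t=0.5)
```

Because alpha_t^2 + sigma_t^2 = 1 for every t, the marginal variance stays bounded as the signal-to-noise ratio decreases from t = 0 to t = 1.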

Requirements

Tested in a Python 3.7.9 conda environment; dependencies are listed in requirements.txt.

Usage

To train the model, run train.py.
Checkpoints are written to TrainConfig.ckpt and TensorBoard summaries to TrainConfig.log.

python train.py --data-dir /datasets/ljspeech --from-raw
tensorboard --logdir ./log/

To resume training from a previous checkpoint, use --load-epoch.

python train.py --load-epoch 10 --config ./ckpt/l1.json

[WIP] To synthesize the test set, run synth.py.

python synth.py

[WIP] Pretrained checkpoints are released on the repository's releases page.

To use a pretrained model, download the files and unzip them.
Check out the git repository at the matching commit tag; the following is a sample script.

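import json

import jax

# Config and VLBDiffWaveApp are defined in this repository;
# import them from the modules of the commit you checked out.
with open('l1.json') as f:
    config = Config.load(json.load(f))

diffwave = VLBDiffWaveApp(config.model)
diffwave.restore('./l1/l1_99.ckpt')

# mel: [B, T, mel]
audio, _ = diffwave(mel, timesteps=50, key=jax.random.PRNGKey(0))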
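
To listen to the result, the generated audio can be written to disk. The sketch below is one way to do that, under several assumptions: the output has shape [B, T], holds floats in [-1, 1], and uses a 22050 Hz sample rate (the LJSpeech rate). The helper `write_wav` and the file names are hypothetical and not part of this repository, and a synthetic tone stands in for the model output.

```python
import wave

import numpy as np


def write_wav(path, audio, sample_rate=22050):
    """Write a mono float waveform in [-1, 1] to a 16-bit PCM WAV file."""
    audio = np.clip(np.asarray(audio, dtype=np.float32), -1.0, 1.0)
    pcm = (audio * 32767.0).astype('<i2')  # little-endian int16 samples
    with wave.open(path, 'wb') as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())


# Stand-in for `audio` from the sample script: one 1-second 440 Hz tone, [B, T].
t = np.arange(22050) / 22050
audio = 0.5 * np.sin(2 * np.pi * 440 * t)[None]
for i, wav in enumerate(audio):
    write_wav(f'sample_{i}.wav', wav)
```

`np.asarray` also accepts a JAX array, so the `audio` returned by the sample script could be passed to the same loop if the shape and range assumptions hold.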

Owner

YoungJoong Kim
