Can we visualize a large scientific data set with a surrogate model? We're building a GAN for the Earth's Mantle Convection data set to see if we can!

Overview

EarthGAN - Earth Mantle Surrogate Modeling

Can a surrogate model of the Earth's Mantle Convection data set be built such that it can be readily run in a web-browser and produce high-fidelity results? We're trying to do just that through the use of a generative adversarial network -- we call ours EarthGAN. We are in active research.

See how EarthGAN currently works! Open up the Colab notebook and create results from the preliminary generator: Open In Colab

[Figure: compare_epoch41_rindex165_moll]

Progress updates, along with my thoughts, can be found in the devlog. The preliminary results were presented at VIS 2021 as part of the SciVis contest. See the paper on arXiv, here.

This is active research. If you have any thoughts, suggestions, or would like to collaborate, please reach out! You can also post questions/ideas in the discussions section.


Current Approach

We're leveraging the excellent work of Li et al. who have implemented a GAN for creating super-resolution cosmological simulations. The general method is in their map2map repository. We've used their GAN implementation as it works on 3D data. Please cite their work if you find it useful!

The current approach is based on the StyleGAN2 model. In addition, a conditional-GAN (cGAN) is used to produce results that are partially deterministic.

Setup

Works best in an HPC environment (I used Compute Canada). Also tested locally on Linux (macOS should also work). If you run Windows, you'll have to do much of the environment setup and data download/preprocessing manually.
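
The make targets in the steps below are described as detecting an HPC environment automatically. How the repository does this is not shown here. As a rough sketch only, a script could separate the three cases (HPC, local Linux/macOS, everything else) by checking for scheduler environment variables and the operating system name. The function name `classify_environment` and the choice of SLURM variables are assumptions for illustration, not the project's actual logic.

```python
import os
import platform

# Assumption: on the cluster of interest at least one of these SLURM
# variables is set. Other schedulers would need different checks.
SLURM_HINTS = ("SLURM_JOB_ID", "SLURM_CLUSTER_NAME", "SLURM_CONF")


def classify_environment(env=None, system=None):
    """Return 'hpc', 'local', or 'manual' for the current machine."""
    env = os.environ if env is None else env
    system = platform.system() if system is None else system

    if any(name in env for name in SLURM_HINTS):
        return "hpc"
    if system in ("Linux", "Darwin"):
        return "local"
    # Windows and anything else: set up the environment by hand.
    return "manual"


if __name__ == "__main__":
    print(classify_environment())
```

Passing `env` and `system` explicitly makes the function easy to try out without being on a cluster, for example `classify_environment({}, "Windows")`.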

To reproduce the data pipeline and begin training: *

  1. Clone this repo - git clone https://github.com/tvhahn/EarthGAN.git

  2. Create virtual environment. Assumes that Conda is installed when on a local computer.

    • HPC: make create_environment will detect HPC environment and automatically create environment from make_hpc_venv.sh. Tested on Compute Canada. Modify make_hpc_venv.sh for your own HPC cluster.

    • Linux/MacOS: use command from Makefile - `make create_environment`

  3. Download raw data.

    • HPC: use make download. Will automatically detect HPC environment.

    • Linux/MacOS: use make download. Will automatically download to appropriate data/raw directory.

  4. Extract raw data.

    • HPC: use make extract. Will automatically detect HPC environment. Again, modify for your HPC cluster.
    • Linux/MacOS: use make extract. Will automatically extract to appropriate data/raw directory.
  5. Ensure the virtual environment is activated: conda activate earth

  6. From root directory of EarthGAN, run pip install -e . -- this will give the python scripts access to the src folders.

  7. Create the processed data that will be used for training.

    • HPC: use make data. Will automatically detect HPC environment and create the processed data.

      ๐Ÿ“ Note: You will have to modify the make_hpc_data.sh in the ./bash_scripts/ folder to match the requirements of your HPC environment

    • Linux/MacOS: use make data.

  8. Copy the processed data to the scratch folder if you're on the HPC. Modify copy_processed_data_to_scratch.sh in ./bash_scripts/ folder.

  9. Train!

    • HPC: use make train. Again, modify for your HPC cluster. Not yet optimized for multi-GPU training, so be warned, it will be SLOW!

    • Linux/MacOS: use make train.

* Let me know if you run into any problems! This is still in development.
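
Once the repository is cloned, the steps above can be summarized as a small dry-run helper. This is only a sketch: the names `BEFORE_ACTIVATION`, `AFTER_ACTIVATION`, and `run_phase` are made up for illustration and are not part of the repository. The HPC-only copy to scratch (step 8) is left out because how that script is launched depends on your cluster.

```python
import shlex
import subprocess

# Commands as listed in the steps above, split at the point where the
# virtual environment has to be activated by hand (step 5).
BEFORE_ACTIVATION = ["make create_environment", "make download", "make extract"]
AFTER_ACTIVATION = ["pip install -e .", "make data", "make train"]


def run_phase(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    Returns the commands that were handled, in order.
    """
    handled = []
    for command in commands:
        print(f"$ {command}")
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(shlex.split(command), check=True)
        handled.append(command)
    return handled


if __name__ == "__main__":
    run_phase(BEFORE_ACTIVATION)
    print("# now activate the environment: conda activate earth")
    run_phase(AFTER_ACTIVATION)
```

By default nothing is executed; the helper only prints the order. Calling `run_phase(AFTER_ACTIVATION, dry_run=False)` from the repository root, in a shell where the environment is already active, would run the last three steps for real.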

Project Organization

├── Makefile           <- Makefile with commands like `make data` or `make train`
│
├── bash_scripts       <- Bash scripts used for training models or setting up environment
│   ├── train_model_hpc.sh       <- Bash/SLURM script used to train models on HPC (you will need to modify this to work on your HPC). Called with `make train`
│   └── train_model_local.sh     <- Bash script used to train models locally. Called with `make train`
│
├── data
│   ├── interim        <- Intermediate data before we've applied any scaling.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- Original data from Earth Mantle Convection simulation.
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│   ├── interim        <- Interim models and summaries
│   └── final          <- Final, canonical models
│
├── notebooks          <- Jupyter notebooks. Generally used for explaining various components
│   │                     of the code base.
│   └── scratch        <- Rough-draft notebooks, of questionable quality. Be warned!
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- Recommend using `make create_environment`. However, can use this file
│                         to recreate environment with pip
├── envearth.yml       <- Used to create conda environment. Use `make create_environment` when
│                         on local computer
│
├── setup.py           <- makes project pip installable (pip install -e .) so src can be imported
├── src                <- Source code for use in this project.
│   ├── __init__.py    <- Makes src a Python module
│   │
│   ├── data           <- Scripts to download or generate data
│   │   ├── make_dataset.py      <- Script for making downsampled data from the original
│   │   ├── data_prep_utils.py   <- Misc functions used in data prep
│   │   ├── download.sh          <- Bash script to download entire Earth Mantle data set
│   │   │                           (used when `make download` called)
│   │   └── download.sh          <- Bash script to extract all Earth Mantle data set files
│   │                               from zip (used when `make extract` called)
│   │
│   ├── models         <- Scripts to train models and then use trained models to make
│   │   │                 predictions
│   │   │
│   │   └── train_model.py
│   │
│   └── visualization  <- Scripts to create exploratory and results oriented visualizations
│       └── visualize.py
│
├── LICENSE
└── README.md          <- README describing project.
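
Several make targets write into fixed locations in this layout (for example, `make download` fills data/raw). A quick way to see which of the data and model directories exist yet is sketched below. `EXPECTED_DIRS` and `missing_dirs` are illustrative names and not part of the repository.

```python
from pathlib import Path

# Data and model directories named in the project tree above.
EXPECTED_DIRS = [
    "data/raw",
    "data/interim",
    "data/processed",
    "models/interim",
    "models/final",
]


def missing_dirs(root):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Run from the repository root.
    for name in missing_dirs("."):
        print(f"missing: {name}")
```

An empty result means every listed directory is present. It says nothing about whether the data inside is complete.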
Releases(v1.0.0)
  • v1.0.0(Nov 4, 2021)

Owner
Tim
Data science. Innovation. ML practitioner.