AugLiChem - The augmentation library for chemical systems.

Overview

AugLiChem

Build Status codecov

Welcome to AugLiChem! The augmentation library for chemical systems. This package supports augmentation for both crystaline and molecular systems, as well as provides automatic downloading for our benchmark datasets, and easy to use model implementations. In depth documentation about how to use AugLiChem, make use of transformations, and train models is given on our website.

Installation

AugLiChem is a python3.8+ package.

Linux

It is recommended to use an environment manager such as conda to install AugLiChem. Instructions can be found here. If using conda, creating a new environment is ideal and can be done simply by running the following command:

conda create -n auglichem python=3.8

Then activating the new environment with

conda activate auglichem

AugLiChem is built primarily with pytorch and that should be installed independently according to your system specifications. After activating your conda environment, pytorch can be installed easily and instructions are found here.

torch_geometric needs to be installed with conda install pyg -c pyg -c conda-forge.

Once you have pytorch and torch_geometric installed, installing AugLiChem can be done using PyPI:

pip install auglichem

MacOS ARM64 Architecture

A more involved install is required to run on the new M1 chips since some of the packages do not have official support yet. We are working on a more elegant solution given the current limitations.

First, download this repo.

If you do not have it yet,, conda for ARM64 architecture needs to be installed. This can be done with Miniforge (which contains conda installer) which is installed by following the guide here

Once you have miniforge compatible with ARM64 architecture, a new environment with rdkit can be i nstalled. If you do not specify python=3.8 it will default to python=3.9.6 as of the time of writing th is.

conda create -n auglichem python=3.8 rdkit

Now activate the environment:

conda activate auglichem

From here, individual packages can be installed:

conda install -c pytorch pytorch

conda install -c fastchan torchvision

conda install scipy

conda install cython

conda install scikit-learn

pip install torch-scatter -f https://data.pyg.org/whl/torch-1.10.0+cpu.html

pip install torch-sparse -f https://data.pyg.org/whl/torch-1.10.0+cpu.html

pip install torch-geometric

Before installing the package, you must go into setup.py in the main directory and comment out rdkit-pypi and tensorboard from the install_requires list since they are already installed. Not commenting these packages out will result in an error during installation.

Finally, run:

pip install .

Usage guides are provided in the examples/ directory and provide useful guides for using both the molecular and crystal sides of the package. Make sure to install jupyter before working with examples, using conda install jupyter. After installing the package as described above, the example notebooks can be downloaded separately and run locally.

Authors

Rishikesh Magar*, Yuyang Wang*, Cooper Lorsung*, Hariharan Ramasubramanian, Chen Liang, Peiyuan Li, Amir Barati Farimani

*Equal contribution

Paper

Our paper can be found here

Citation

If you use AugLiChem in your work, please cite:

@misc{magar2021auglichem,
      title={AugLiChem: Data Augmentation Library ofChemical Structures for Machine Learning}, 
      author={Rishikesh Magar and Yuyang Wang and Cooper Lorsung and Chen Liang and Hariharan Ramasubramanian and Peiyuan Li and Amir Barati Farimani},
      year={2021},
      eprint={2111.15112},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

License

AugLiChem is MIT licensed, as found in the LICENSE file. Please note that some of the dependencies AugLiChem uses may be licensed under different terms.

Owner
BaratiLab
BaratiLab
Code for Understanding Pooling in Graph Neural Networks

Select, Reduce, Connect This repository contains the code used for the experiments of: "Understanding Pooling in Graph Neural Networks" Setup Install

Daniele Grattarola 37 Dec 13, 2022
BlueFog Tutorials

BlueFog Tutorials Welcome to the BlueFog tutorials! In this repository, we've put together a collection of awesome Jupyter notebooks. These notebooks

4 Oct 27, 2021
text_recognition_toolbox: The reimplementation of a series of classical scene text recognition papers with Pytorch in a uniform way.

text recognition toolbox 1. 项目介绍 该项目是基于pytorch深度学习框架,以统一的改写方式实现了以下6篇经典的文字识别论文,论文的详情如下。该项目会持续进行更新,欢迎大家提出问题以及对代码进行贡献。 模型 论文标题 发表年份 模型方法划分 CRNN 《An End-t

168 Dec 24, 2022
General Multi-label Image Classification with Transformers

General Multi-label Image Classification with Transformers Jack Lanchantin, Tianlu Wang, Vicente Ordóñez Román, Yanjun Qi Conference on Computer Visio

QData 154 Dec 21, 2022
Baseline and template code for node21 detection track

Nodule Detection Algorithm This codebase implements a baseline model, Faster R-CNN, for the nodule detection track in NODE21. It contains all necessar

node21challenge 11 Jan 15, 2022
Attention Probe: Vision Transformer Distillation in the Wild

Attention Probe: Vision Transformer Distillation in the Wild Jiahao Wang, Mingdeng Cao, Shuwei Shi, Baoyuan Wu, Yujiu Yang In ICASSP 2022 This code is

Wang jiahao 3 Oct 31, 2022
CSAC - Collaborative Semantic Aggregation and Calibration for Separated Domain Generalization

CSAC Introduction This repository contains the implementation code for paper: Co

ScottYuan 5 Jul 22, 2022
Build Low Code Automated Tensorflow, What-IF explainable models in just 3 lines of code.

Build Low Code Automated Tensorflow explainable models in just 3 lines of code.

Hasan Rafiq 170 Dec 26, 2022
The repo contains the code of the ACL2020 paper `Dice Loss for Data-imbalanced NLP Tasks`

Dice Loss for NLP Tasks This repository contains code for Dice Loss for Data-imbalanced NLP Tasks at ACL2020. Setup Install Package Dependencies The c

223 Dec 17, 2022
This is the repo for the paper "Improving the Accuracy-Memory Trade-Off of Random Forests Via Leaf-Refinement".

Improving the Accuracy-Memory Trade-Off of Random Forests Via Leaf-Refinement This is the repository for the paper "Improving the Accuracy-Memory Trad

3 Dec 29, 2022
Learning View Priors for Single-view 3D Reconstruction (CVPR 2019)

Learning View Priors for Single-view 3D Reconstruction (CVPR 2019) This is code for a paper Learning View Priors for Single-view 3D Reconstruction by

Hiroharu Kato 38 Aug 17, 2022
Aligning Latent and Image Spaces to Connect the Unconnectable

About This repo contains the official implementation of the Aligning Latent and Image Spaces to Connect the Unconnectable paper. It is a GAN model whi

Ivan Skorokhodov 203 Jan 03, 2023
A Repository of Community-Driven Natural Instructions

A Repository of Community-Driven Natural Instructions TLDR; this repository maintains a community effort to create a large collection of tasks and the

AI2 244 Jan 04, 2023
This repository contains code released by Google Research.

This repository contains code released by Google Research.

Google Research 26.6k Dec 31, 2022
Training DiffWave using variational method from Variational Diffusion Models.

Variational DiffWave Training DiffWave using variational method from Variational Diffusion Models. Quick Start python train_distributed.py discrete_10

Chin-Yun Yu 26 Dec 13, 2022
EMNLP'2021: Simple Entity-centric Questions Challenge Dense Retrievers

EntityQuestions This repository contains the EntityQuestions dataset as well as code to evaluate retrieval results from the the paper Simple Entity-ce

Princeton Natural Language Processing 119 Sep 28, 2022
Code for the paper BERT might be Overkill: A Tiny but Effective Biomedical Entity Linker based on Residual Convolutional Neural Networks

Biomedical Entity Linking This repo provides the code for the paper BERT might be Overkill: A Tiny but Effective Biomedical Entity Linker based on Res

Tuan Manh Lai 24 Oct 24, 2022
We present a regularized self-labeling approach to improve the generalization and robustness properties of fine-tuning.

Overview This repository provides the implementation for the paper "Improved Regularization and Robustness for Fine-tuning in Neural Networks", which

NEU-StatsML-Research 21 Sep 08, 2022
Convolutional neural network that analyzes self-generated images in a variety of languages to find etymological similarities

This project is a convolutional neural network (CNN) that analyzes self-generated images in a variety of languages to find etymological similarities. Specifically, the goal is to prove that computer

1 Feb 03, 2022
A comprehensive list of published machine learning applications to cosmology

ml-in-cosmology This github attempts to maintain a comprehensive list of published machine learning applications to cosmology, organized by subject ma

George Stein 290 Dec 29, 2022