PyTorch implementation of Decoupling Value and Policy for Generalization in Reinforcement Learning

Related tags

Deep Learningidaac
Overview

IDAAC: Invariant Decoupled Advantage Actor-Critic

This is a PyTorch implementation of the methods proposed in

Decoupling Value and Policy for Generalization in Reinforcement Learning by

Roberta Raileanu and Rob Fergus.

Citation

If you use this code in your own work, please cite our paper:

@article{Raileanu2021DecouplingVA,
  title={Decoupling Value and Policy for Generalization in Reinforcement Learning},
  author={Roberta Raileanu and R. Fergus},
  journal={ArXiv},
  year={2021},
  volume={abs/2102.10330}
}

Requirements

To install all the required dependencies:

conda create -n idaac python=3.7
conda activate idaac

cd idaac
pip install -r requirements.txt

pip install procgen

git clone https://github.com/openai/baselines.git
cd baselines 
python setup.py install 

Instructions

This repo provides instructions for training IDAAC, DAAC, and PPO on the Procgen benchmark.

Train IDAAC on CoinRun

python train.py --env_name coinrun --algo idaac

Train DAAC on CoinRun

python train.py --env_name coinrun --algo daac

Train PPO on CoinRun

python train.py --env_name coinrun --algo ppo --ppo_epoch 3

Note: The default code uses the same set of hyperparameters (HPs) for all environments, which are the best ones overall. In our studies, we've found some of the games can further benefit from slightly different HPs, so we provide those as well. To use the best hyperparameters for each environment, use the flag --use_best_hps.

Overview of DAAC and IDAAC

IDAAC Overview

Procgen Results

IDAAC achieves state-of-the-art performance on the Procgen benchmark (easy mode), significantly improving the agent's generalization ability over standard RL methods such as PPO.

Test Results on Procgen

Procgen Test Results

Acknowledgements

This code was based on an open sourced PyTorch implementation of PPO.

Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance

Models for natural language understanding (NLU) tasks often rely on the idiosyncratic biases of the dataset, which make them brittle against test cases outside the training distribution.

Ubiquitous Knowledge Processing Lab 22 Jan 02, 2023
Automatic caption evaluation metric based on typicality analysis.

SeMantic and linguistic UndeRstanding Fusion (SMURF) Automatic caption evaluation metric described in the paper "SMURF: SeMantic and linguistic UndeRs

Joshua Feinglass 6 Jan 09, 2022
💃 VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena

💃 VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena.

Heidelberg-NLP 17 Nov 07, 2022
DeRF: Decomposed Radiance Fields

DeRF: Decomposed Radiance Fields Daniel Rebain, Wei Jiang, Soroosh Yazdani, Ke Li, Kwang Moo Yi, Andrea Tagliasacchi Links Paper Project Page Abstract

UBC Computer Vision Group 24 Dec 02, 2022
3D Generative Adversarial Network

Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling This repository contains pre-trained models and sampling

Chengkai Zhang 791 Dec 20, 2022
Code accompanying "Evolving spiking neuron cellular automata and networks to emulate in vitro neuronal activity," accepted to IEEE SSCI ICES 2021

Evolving-spiking-neuron-cellular-automata-and-networks-to-emulate-in-vitro-neuronal-activity Code accompanying "Evolving spiking neuron cellular autom

SOCRATES: Self-Organizing Computational substRATES 2 Dec 02, 2022
Open source Python module for computer vision

About PCV PCV is a pure Python library for computer vision based on the book "Programming Computer Vision with Python" by Jan Erik Solem. More details

Jan Erik Solem 1.9k Jan 06, 2023
A library for implementing Decentralized Graph Neural Network algorithms.

decentralized-gnn A package for implementing and simulating decentralized Graph Neural Network algorithms for classification of peer-to-peer nodes. De

Multimedia Knowledge and Social Analytics Lab 5 Nov 07, 2022
Transformers based fully on MLPs

Awesome MLP-based Transformers papers An up-to-date list of Transformers based fully on MLPs without attention! Why this repo? After transformers and

Fawaz Sammani 35 Dec 30, 2022
MEDS: Enhancing Memory Error Detection for Large-Scale Applications

MEDS: Enhancing Memory Error Detection for Large-Scale Applications Prerequisites cmake and clang Build MEDS supporting compiler $ make Build Using Do

Secomp Lab at Purdue University 34 Dec 14, 2022
The code for our NeurIPS 2021 paper "Kernelized Heterogeneous Risk Minimization".

Kernelized-HRM Jiashuo Liu, Zheyuan Hu The code for our NeurIPS 2021 paper "Kernelized Heterogeneous Risk Minimization"[1]. This repo contains the cod

Liu Jiashuo 8 Nov 20, 2022
PyTorch implementation of EigenGAN

PyTorch Implementation of EigenGAN Train python train.py [image_folder_path] --name [experiment name] Test python test.py [ckpt path] --traverse FFH

62 Nov 12, 2022
🍀 Pytorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which is helpful to further understand papers.⭐⭐⭐

🍀 Pytorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which is helpful to further understand papers.⭐⭐⭐

xmu-xiaoma66 7.7k Jan 05, 2023
3 Apr 20, 2022
Class-Attentive Diffusion Network for Semi-Supervised Classification [AAAI'21] (official implementation)

Class-Attentive Diffusion Network for Semi-Supervised Classification Official Implementation of AAAI 2021 paper Class-Attentive Diffusion Network for

Jongin Lim 7 Sep 20, 2022
Rl-quickstart - Reinforcement Learning Quickstart

Reinforcement Learning Quickstart To get setup with the repository, git clone ht

UCLA DataRes 3 Jun 16, 2022
The MATH Dataset

Measuring Mathematical Problem Solving With the MATH Dataset This is the repository for Measuring Mathematical Problem Solving With the MATH Dataset b

Dan Hendrycks 267 Dec 26, 2022
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

JAX: Autograd and XLA Quickstart | Transformations | Install guide | Neural net libraries | Change logs | Reference docs | Code search News: JAX tops

Google 21.3k Jan 01, 2023
Predictive Modeling on Electronic Health Records(EHR) using Pytorch

Predictive Modeling on Electronic Health Records(EHR) using Pytorch Overview Although there are plenty of repos on vision and NLP models, there are ve

81 Jan 01, 2023
Official Pytorch Implementation of 3DV2021 paper: SAFA: Structure Aware Face Animation.

SAFA: Structure Aware Face Animation (3DV2021) Official Pytorch Implementation of 3DV2021 paper: SAFA: Structure Aware Face Animation. Getting Started

QiulinW 122 Dec 23, 2022