PAIRED in PyTorch 🔥

PAIRED

This codebase provides a PyTorch implementation of Protagonist Antagonist Induced Regret Environment Design (PAIRED), first introduced in "Emergent Complexity and Zero-Shot Transfer via Unsupervised Environment Design" (Dennis et al., 2020). The implementation comes integrated with custom adversarial maze environments based on the MiniGrid environment (Chevalier-Boisvert et al., 2018), as used in Dennis et al., 2020.

Unsupervised environment design (UED) methods propose a curriculum of tasks or environment instances (levels) that aims to foster more sample-efficient learning and more robust policies. PAIRED performs UED using a three-player game among two student agents (the protagonist and the antagonist) and an adversary. The adversary, which is allied with the antagonist, proposes new levels that aim to maximize the protagonist's regret. This regret is estimated as the difference between the returns achieved by the antagonist and the protagonist across a batch of rollouts on the proposed levels.

PAIRED offers a strong robustness guarantee: at Nash equilibrium, it provably induces a minimax regret policy for the protagonist, meaning that the protagonist optimally trades off regret across all levels the adversary can propose.
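The regret estimate described above can be sketched in a few lines. This is a minimal illustration and not the code in this repository. The function name `estimate_regret` is hypothetical, and the specific estimator (maximum antagonist return minus mean protagonist return over the batch) is an assumption based on how the estimate is commonly described for PAIRED. The codebase may compute it differently.

```python
from statistics import mean
from typing import Sequence


def estimate_regret(
    antagonist_returns: Sequence[float],
    protagonist_returns: Sequence[float],
) -> float:
    """Estimate the protagonist's regret on one proposed level.

    Assumed estimator (not confirmed by this codebase): the best return
    the antagonist achieved minus the protagonist's average return,
    both taken over a batch of rollouts on the same level.
    """
    if not antagonist_returns or not protagonist_returns:
        raise ValueError("both agents need at least one rollout")
    return max(antagonist_returns) - mean(protagonist_returns)


# Hypothetical episode returns for a single adversary-proposed level.
regret = estimate_regret(
    antagonist_returns=[1.0, 0.5],
    protagonist_returns=[0.25, 0.75],
)
print(f"estimated regret: {regret:.2f}")
```

Under this reading, the adversary is rewarded for levels where the estimate is large, while the protagonist is trained to make it small.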

UED algorithms included

  • PAIRED (Protagonist Antagonist Induced Regret Environment Design)
  • Minimax
  • Domain randomization

Set up

To install the necessary dependencies, run the following commands:

conda create --name paired python=3.8
conda activate paired
pip install -r requirements.txt

git clone https://github.com/openai/baselines.git
cd baselines
pip install -e .
cd ..

Configuration

Detailed descriptions of the command-line arguments for the main training script, train.py, can be found in arguments.py.

Experiments

MiniGrid benchmark results

For convenience, configuration JSON files are provided to generate the commands that run the specific experimental settings featured in Dennis et al., 2020. To generate the command that launches one run of the experiment defined by the configuration file config.json in the local folder train_scripts/configs, run the following and paste the output into your command line.

python train_scripts/make_cmd.py --json config --num_trials 1

Alternatively, on macOS you can run the following to copy the command directly to your clipboard:

python train_scripts/make_cmd.py --json config --num_trials 1 | pbcopy

By default, each experiment run generates a folder in ~/logs/paired named after the --xpid argument passed to the train command. This folder contains log outputs in logs.csv and periodic screenshots of generated levels in the screenshots directory. Each screenshot follows the naming convention update_<number of PPO updates>.png. The latest model checkpoint is written to model.tar, and archived model checkpoints are saved using the naming convention model_<number of PPO updates>.tar.
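As an illustration of the checkpoint naming convention, the sketch below picks the archived checkpoints out of a list of file names and orders them by PPO update count. The helper names are hypothetical and are not part of this codebase. Only the model_<number of PPO updates>.tar pattern comes from the description above.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Naming convention from the text: model_<number of PPO updates>.tar
ARCHIVE_PATTERN = re.compile(r"model_(\d+)\.tar")


def archived_update_count(file_name: str) -> Optional[int]:
    """Return the update count encoded in an archived checkpoint name, else None."""
    match = ARCHIVE_PATTERN.fullmatch(file_name)
    return int(match.group(1)) if match else None


def sort_archived_checkpoints(file_names: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (update count, file name) pairs, oldest first.

    Files that do not follow the archive convention, including the
    latest checkpoint model.tar, are ignored.
    """
    pairs = []
    for name in file_names:
        updates = archived_update_count(name)
        if updates is not None:
            pairs.append((updates, name))
    return sorted(pairs)
```

To apply this to a real run, one could pass the file names from the run folder, for example `(p.name for p in pathlib.Path("~/logs/paired/<xpid>").expanduser().iterdir())`, where `<xpid>` stands for the run's --xpid value.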

The JSON files for reproducing the MiniGrid experiments from Dennis et al., 2020 are listed below:

Method     JSON config
PAIRED     minigrid/paired.json
Minimax    minigrid/minimax.json
DR         minigrid/dr.json
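The sketch below prints one make_cmd.py invocation per config in the table. It only builds and prints the command strings and does not run anything. It assumes the --json value is the config path relative to train_scripts/configs with the .json extension dropped, mirroring the `--json config` example above. That convention is an inference from the example, so check it against make_cmd.py before relying on it. The helper names are hypothetical.

```python
import shlex
from typing import Dict, List

# Config names from the table, with the .json extension dropped.
# Dropping the extension is an assumption based on the `--json config` example.
CONFIG_NAMES: Dict[str, str] = {
    "PAIRED": "minigrid/paired",
    "Minimax": "minigrid/minimax",
    "DR": "minigrid/dr",
}


def make_cmd_args(json_name: str, num_trials: int = 1) -> List[str]:
    """Build the argument list for train_scripts/make_cmd.py."""
    return [
        "python",
        "train_scripts/make_cmd.py",
        "--json",
        json_name,
        "--num_trials",
        str(num_trials),
    ]


def shell_line(json_name: str, num_trials: int = 1) -> str:
    """Render the argument list as a single shell-safe command line."""
    return shlex.join(make_cmd_args(json_name, num_trials))


for method, json_name in CONFIG_NAMES.items():
    print(f"{method}: {shell_line(json_name)}")
```

Each printed line can be run from the repository root. Its output is the train command to paste into your command line, as described above.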

Evaluation

You can use the following command to batch-evaluate all trained models whose output directories share the same <xpid_prefix> before the indexing suffix _[0-9]+:

python -m eval \
--base_path "~/logs/paired" \
--prefix '<xpid_prefix>' \
--num_processes 2 \
--env_names \
'MultiGrid-SixteenRooms-v0,MultiGrid-Labyrinth-v0,MultiGrid-Maze-v0' \
--num_episodes 100 \
--model_tar model
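
To make the prefix rule concrete, the sketch below selects the directory names that consist of a given prefix followed by an _[0-9]+ index. It is one reading of the rule stated above, not the matching logic of the eval script, and the function name `matching_run_dirs` is hypothetical.

```python
import re
from typing import Iterable, List


def matching_run_dirs(dir_names: Iterable[str], xpid_prefix: str) -> List[str]:
    """Return the names of the form <xpid_prefix>_<digits>, sorted by name.

    The prefix is escaped so that any regex metacharacters it contains
    are matched literally.
    """
    pattern = re.compile(re.escape(xpid_prefix) + r"_[0-9]+")
    return sorted(name for name in dir_names if pattern.fullmatch(name))


# Hypothetical contents of the log directory.
runs = ["paired_0", "paired_1", "paired_extra_0", "paired", "minimax_0"]
print(matching_run_dirs(runs, "paired"))
```

With these example names, only the two indexed `paired` runs are selected. `paired_extra_0` is excluded because its prefix before the index is `paired_extra`.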
Owner
UCL DARK Lab
UCL Deciding, Acting, and Reasoning with Knowledge (DARK) Lab