PantheonRL is a package for training and testing multi-agent reinforcement learning environments.

Overview

PantheonRL

PantheonRL is a package for training and testing multi-agent reinforcement learning environments. The goal of PantheonRL is to provide a modular and extensible framework for training agent policies, fine-tuning agent policies, ad-hoc pairing of agents, and more. PantheonRL also provides a web user interface suitable for lightweight experimentation and prototyping.

PantheonRL is built on top of StableBaselines3 (SB3), allowing direct access to many of SB3's standard RL training algorithms such as PPO. PantheonRL currently follows a decentralized training paradigm -- each agent is equipped with its own replay buffer and update algorithm. The agents objects are designed to be easily manipulable. They can be saved, loaded and plugged into different training procedures such as self-play, ad-hoc / cross-play, round-robin training, or finetuning.

This package will be presented as a demo at the AAAI-22 Demonstrations Program.

Demo Paper

Demo Video

"PantheonRL: A MARL Library for Dynamic Training Interactions"
Bidipta Sarkar*, Aditi Talati*, Andy Shih*, Dorsa Sadigh
In Proceedings of the 36th AAAI Conference on Artificial Intelligence (Demo Track), 2022

@inproceedings{sarkar2021pantheonRL,
  title={PantheonRL: A MARL Library for Dynamic Training Interactions},
  author={Sarkar, Bidipta and Talati, Aditi and Shih, Andy and Sadigh Dorsa},
  booktitle = {Proceedings of the 36th AAAI Conference on Artificial Intelligence (Demo Track)},
  year={2022}
}

Installation

# Optionally create conda environments
conda create -n PantheonRL python=3.7
conda activate PantheonRL

# Clone and install PantheonRL
git clone https://github.com/Stanford-ILIAD/PantheonRL.git
cd PantheonRL
pip install -e .

Overcooked Installation

# Optionally install Overcooked environment
git submodule update --init --recursive
pip install -e overcookedgym/human_aware_rl/overcooked_ai

PettingZoo Installation

# Optionally install PettingZoo environments
pip install pettingzoo

# to install a group of pettingzoo environments
pip install "pettingzoo[classic]"

Command Line Invocation

Example

python3 trainer.py LiarsDice-v0 PPO PPO --seed 10 --preset 1
# requires Overcooked installation (see above instructions)
python3 trainer.py OvercookedMultiEnv-v0 PPO PPO --env-config '{"layout_name":"simple"}' --seed 10 --preset 1

For examples on round-robin training followed by partner adaptation, check out these instructions.

For more examples, check out the examples/ directory.

Web User Interface

The first time the web interface is being run in a new location, the database must be initialized. After that, the init-db command should not be called again, because this will clear all user account data.

Set environment variables and (re)inititalize the database

export FLASK_APP=website
export FLASK_ENV=development
flask init-db

Start the web user interface. Make sure that ports 5000 and 5001 (used for Tensorboard) are not taken.

flask run --host=0.0.0.0 --port=5000


Agent selection screen. Users can customize the ego and partner agents.


Training screen. Users can view basic information, or spawn a Tensorboard tab for full monitoring.

Features

General Features PantheonRL
Documentation ✔️
Web user interface ✔️
Built on top of SB3 ✔️
Supports PettingZoo Envs ✔️
Environment Features PantheonRL
Frame stacking (recurrence) ✔️
Simultaneous multiagent envs ✔️
Turn-based multiagent envs ✔️
2-player envs ✔️
N-player envs ✔️
Custom environments ✔️
Training Features PantheonRL
Self-play ✔️
Ad-hoc / cross-play ✔️
Round-robin training ✔️
Finetune / adapt to new partners ✔️
Custom policies ✔️

Current Environments

Name Environment Type Reward Type Players Visualization
Rock Paper Scissors SimultaneousEnv Competitive 2
Liar's Dice TurnBasedEnv Competitive 2
Block World [1] TurnBasedEnv Cooperative 2 ✔️
Overcooked [2] SimultaneousEnv Cooperative 2 ✔️
PettingZoo [3] Mixed Mixed N ✔️

[1] Adapted from the block construction task from https://github.com/cogtoolslab/compositional-abstractions

[2] Adapted from the Human_Aware_Rl / Overcooked AI package from https://github.com/HumanCompatibleAI/human_aware_rl

[3] PettingZoo environments from https://github.com/Farama-Foundation/PettingZoo

Owner
Stanford Intelligent and Interactive Autonomous Systems Group
Stanford Intelligent and Interactive Autonomous Systems Group
Stanford Intelligent and Interactive Autonomous Systems Group
Memory efficient transducer loss computation

Introduction This project implements the optimization techniques proposed in Improving RNN Transducer Modeling for End-to-End Speech Recognition to re

Fangjun Kuang 51 Nov 25, 2022
Train DeepLab for Semantic Image Segmentation

Train DeepLab for Semantic Image Segmentation Martin Kersner, [email protected]

Martin Kersner 172 Dec 14, 2022
Keywords : Streamlit, BertTokenizer, BertForMaskedLM, Pytorch

Next Word Prediction Keywords : Streamlit, BertTokenizer, BertForMaskedLM, Pytorch 🎬 Project Demo ✔ Application is hosted on Streamlit. You can see t

Vivek7 3 Aug 26, 2022
Code for "LASR: Learning Articulated Shape Reconstruction from a Monocular Video". CVPR 2021.

LASR Installation Build with conda conda env create -f lasr.yml conda activate lasr # install softras cd third_party/softras; python setup.py install;

Google 157 Dec 26, 2022
A flexible framework of neural networks for deep learning

Chainer: A deep learning framework Website | Docs | Install Guide | Tutorials (ja) | Examples (Official, External) | Concepts | ChainerX Forum (en, ja

Chainer 5.8k Jan 06, 2023
Based on Stockfish neural network(similar to LcZero)

MarcoEngine Marco Engine - interesnaya neyronnaya shakhmatnaya set', kotoraya ispol'zuyet metod samoobucheniya(dostizheniye khoroshoy igy putem proboy

Marcus Kemaul 4 Mar 12, 2022
AISTATS 2019: Confidence-based Graph Convolutional Networks for Semi-Supervised Learning

Confidence-based Graph Convolutional Networks for Semi-Supervised Learning Source code for AISTATS 2019 paper: Confidence-based Graph Convolutional Ne

MALL Lab (IISc) 56 Dec 03, 2022
An end-to-end machine learning library to directly optimize AUC loss

LibAUC An end-to-end machine learning library for AUC optimization. Why LibAUC? Deep AUC Maximization (DAM) is a paradigm for learning a deep neural n

Andrew 75 Dec 12, 2022
Sarus implementation of classical ML models. The models are implemented using the Keras API of tensorflow 2. Vizualization are implemented and can be seen in tensorboard.

Sarus published models Sarus implementation of classical ML models. The models are implemented using the Keras API of tensorflow 2. Vizualization are

Sarus Technologies 39 Aug 19, 2022
P-Tuning v2: Prompt Tuning Can Be Comparable to Finetuning Universally Across Scales and Tasks

P-tuning v2 P-Tuning v2: Prompt Tuning Can Be Comparable to Finetuning Universally Across Scales and Tasks An optimized prompt tuning strategy for sma

THUDM 540 Dec 30, 2022
Domain Generalization with MixStyle, ICLR'21.

MixStyle This repo contains the code of our ICLR'21 paper, "Domain Generalization with MixStyle". The OpenReview link is https://openreview.net/forum?

Kaiyang 208 Dec 28, 2022
Official PyTorch Implementation for InfoSwap: Information Bottleneck Disentanglement for Identity Swapping

InfoSwap: Information Bottleneck Disentanglement for Identity Swapping Code usage Please check out the user manual page. Paper Gege Gao, Huaibo Huang,

Grace Hešeri 56 Dec 20, 2022
Deconfounding Temporal Autoencoder: Estimating Treatment Effects over Time Using Noisy Proxies

Deconfounding Temporal Autoencoder (DTA) This is a repository for the paper "Deconfounding Temporal Autoencoder: Estimating Treatment Effects over Tim

Milan Kuzmanovic 3 Feb 04, 2022
This repository holds code and data for our PETS'22 article 'From "Onion Not Found" to Guard Discovery'.

From "Onion Not Found" to Guard Discovery (PETS'22) This repository holds the code and data for our PETS'22 paper titled 'From "Onion Not Found" to Gu

Lennart Oldenburg 3 May 04, 2022
The (Official) PyTorch Implementation of the paper "Deep Extraction of Manga Structural Lines"

MangaLineExtraction_PyTorch The (Official) PyTorch Implementation of the paper "Deep Extraction of Manga Structural Lines" Usage model_torch.py [sourc

Miaomiao Li 82 Jan 02, 2023
✔️ Visual, reactive testing library for Julia. Time machine included.

PlutoTest.jl (alpha release) Visual, reactive testing library for Julia A macro @test that you can use to verify your code's correctness. But instead

Pluto 68 Dec 20, 2022
OBBDetection is a oriented object detection library, which is based on MMdetection.

OBBDetection news: We are now updating OBBDetection to new vision based on MMdetection v2.10, which has more advanced models and more efficient featur

jbwang1997 401 Jan 02, 2023
Dense Prediction Transformers

Vision Transformers for Dense Prediction This repository contains code and models for our paper: Vision Transformers for Dense Prediction René Ranftl,

Intelligent Systems Lab Org 1.3k Jan 02, 2023
Deploy optimized transformer based models on Nvidia Triton server

🤗 Hugging Face Transformer submillisecond inference 🤯 and deployment on Nvidia Triton server Yes, you can perfom inference with transformer based mo

Lefebvre Sarrut Services 1.2k Jan 05, 2023
The pytorch implementation of the paper "text-guided neural image inpainting" at MM'2020

TDANet: Text-Guided Neural Image Inpainting, MM'2020 (Oral) MM | ArXiv This repository implements the paper "Text-Guided Neural Image Inpainting" by L

LisaiZhang 75 Dec 22, 2022