This repository contains source code for the Situated Interactive Language Grounding (SILG) benchmark

Related tags

Deep Learningsilg
Overview

SILG

This repository contains source code for the Situated Interactive Language Grounding (SILG) benchmark. If you find this work helpful, please consider citing this work:

@inproceedings{ zhong2021silg,
  title={ {SILG}: The Multi-environment Symbolic InteractiveLanguage Grounding Benchmark },
  author={ Victor Zhong and Austin W. Hanjie and Karthik Narasimhan and Luke Zettlemoyer },
  booktitle={ NeurIPS },
  year={ 2021 }
}

Please also consider citing the individual tasks included in SILG. They are RTFM, Messenger, NetHack Learning Environment, AlfWorld, and Touchdown.

RTFM

RTFM

Messenger

Messenger

SILGNethack

SILGNethack

ALFWorld

ALFWorld

SILGSymTouchdown

SILGSymTouchdown

How to install

You have to install the individual environments in order for SILG to work. The GitHub repository for each environment are found at

Our dockerfile also provides an example of how to install the environments in Ubuntu. You can also try using our install_envs.sh, which has only been tested in Ubuntu and MacOS.

bash install_envs.sh

Once you have installed the individual environments, install SILG as follows

pip install -r requirements.txt
pip install -e .

Some environments have (potentially a large quantity of) data files. Please download these via

bash download_env_data.sh  # if you do not want to use VisTouchdown, feel free to comment out its very large feature file

As a part of this download, we will symlink a ./cache directory from ./mycache. SILG environments will pull data files from this directory. If you are on NFS, you might want to move mycache to local disk and then relink the cache directory to avoid hitting NFS.

Docker

We provide a Docker container for this project. You can build the Docker image via docker build -t vzhong/silg . -f docker/Dockerfile. Alternatively you can pull my build from docker pull vzhong/silg. This contains the environments as well as SILG, but doesn't contain the large data download. You will still have to download the environment data and then mount the cache folder to the container. You may need to specify --platform linux/amd64 to Docker if you are running a M1 Mac.

Because some of the environments require that you install them first before downloading their data files, you want to download using the Docker container as well. You can do

docker run --rm --user "$(id -u):$(id -g)" -v $PWD/download_env_data.sh:/opt/silg/download_env_data.sh -v $PWD/mycache:/opt/silg/cache vzhong/silg bash download_env_data.sh

Once you have downloaded the environment data, you can use the container by doing something like

docker run --rm --user "$(id -u):$(id -g)" -it -v $PWD/mycache:/opt/silg/cache vzhong/silg /bin/bash

Visualizing environments

We provide a script to play SILG environments in the terminal. You can access it via

silg_play --env silg:rtfm_train_s1-v0  # use -h to see options

# docker variant
docker run --rm -it -v $PWD/mycache:/opt/silg/cache vzhong/silg silg_play --env silg:rtfm_train_s1-v0

These recordings are shown at the start of this document and are created using asciinema.

How to run experiments

The entrypoint to experiments is run_exp.py. We provide a slurm script to run experiments in launch.py. These scripts can also run jobs locally (e.g. without slurm). For example, to run RTFM:

python launch.py --local --envs rtfm

You can also log to WanDB with the --wandb option. For more, use the -h flag.

How to add a new environment

First, create a wrapper class in silg/envs/ .py . This wrapper will wrap the real environment and provide APIs used by the baseline models and the training script. silg/envs/rtfm.py contains an example of how to do this for RTFM. Once you have made the wrapper, don't forget to include its file in silg/envs/__init__.py.

The wrapper class must subclass silg.envs.base.SILGEnv and implement:

# return the list of text fields in the observation space
def get_text_fields(self):
    ...

# return max number of actions
def get_max_actions(self):
    ...

# return observation space
def get_observation_space(self):
    ...

# resets the environment
def my_reset(self):
    ...

# take a step in the environment
def my_step(self, action):
    ...

Additionally, you may want to implemnt rendering functions such as render_grid, parse_user_action, and get_user_actions so that it can be played with silg_play.

Note There is an implementation detail right now in that the Torchbeast code considers a "win" to be equivalent to the environment returning a reward >0.8. We hope to change this in the future (likely by adding another tensor field denoting win state) but please keep this in mind when implementing your environment. You likely want to keep the reward between -1 and +1, which high rewards >0.8 reserved for winning if you would like to use the training code as-is.

Changelog

Version 1.0

Initial release.

Owner
Victor Zhong
I am a PhD student at the University of Washington. Formerly Salesforce Research / MetaMind, @stanfordnlp, and ECE at UToronto.
Victor Zhong
Online Pseudo Label Generation by Hierarchical Cluster Dynamics for Adaptive Person Re-identification

Online Pseudo Label Generation by Hierarchical Cluster Dynamics for Adaptive Person Re-identification

TANG, shixiang 6 Nov 25, 2022
Implementation of "Fast and Flexible Temporal Point Processes with Triangular Maps" (Oral @ NeurIPS 2020)

Fast and Flexible Temporal Point Processes with Triangular Maps This repository includes a reference implementation of the algorithms described in "Fa

Oleksandr Shchur 20 Dec 02, 2022
Keeping it safe - AI Based COVID-19 Tracker using Deep Learning and facial recognition

Keeping it safe - AI Based COVID-19 Tracker using Deep Learning and facial recognition

Vansh Wassan 15 Jun 17, 2021
Indonesian Car License Plate Character Recognition using Tensorflow, Keras and OpenCV.

Monopol Indonesian Car License Plate (Indonesia Mobil Nomor Polisi) Character Recognition using Tensorflow, Keras and OpenCV. Background This applicat

Jayaku Briliantio 3 Apr 07, 2022
DrWhy is the collection of tools for eXplainable AI (XAI). It's based on shared principles and simple grammar for exploration, explanation and visualisation of predictive models.

Responsible Machine Learning With Great Power Comes Great Responsibility. Voltaire (well, maybe) How to develop machine learning models in a responsib

Model Oriented 590 Dec 26, 2022
Read and write layered TIFF ImageSourceData and ImageResources tags

Read and write layered TIFF ImageSourceData and ImageResources tags Psdtags is a Python library to read and write the Adobe Photoshop(r) specific Imag

Christoph Gohlke 4 Feb 05, 2022
🎯 A comprehensive gradient-free optimization framework written in Python

Solid is a Python framework for gradient-free optimization. It contains basic versions of many of the most common optimization algorithms that do not

Devin Soni 565 Dec 26, 2022
🚩🚩🚩

My CTF Challenges 2021 AIS3 Pre-exam / MyFirstCTF Name Category Keywords Difficulty β’Έβ“„β“‹β’Ύβ’Ή-①⑨ (MyFirstCTF Only) Reverse Baby β˜… Piano Reverse C#, .NET β˜…

6 Oct 28, 2021
Python-based Informatics Kit for Analysing Chemical Units

INSTALLATION Python-based Informatics Kit for the Analysis of Chemical Units Step 1: Make a conda environment: conda create -n pikachu python=3.9 cond

47 Dec 23, 2022
Official Implementation of SWAD (NeurIPS 2021)

SWAD: Domain Generalization by Seeking Flat Minima (NeurIPS'21) Official PyTorch implementation of SWAD: Domain Generalization by Seeking Flat Minima.

Junbum Cha 97 Dec 20, 2022
Block-wisely Supervised Neural Architecture Search with Knowledge Distillation (CVPR 2020)

DNA This repository provides the code of our paper: Blockwisely Supervised Neural Architecture Search with Knowledge Distillation. Illustration of DNA

Changlin Li 215 Dec 19, 2022
Pretrained models for Jax/Flax: StyleGAN2, GPT2, VGG, ResNet.

Pretrained models for Jax/Flax: StyleGAN2, GPT2, VGG, ResNet.

Matthias Wright 169 Dec 26, 2022
РСшСния, подсказки, тСсты ΠΈ ΡƒΡ‚ΠΈΠ»ΠΈΡ‚Ρ‹ для Ρ‚Ρ€Π΅Π½ΠΈΡ€ΠΎΠ²ΠΊΠΈ ΠΏΠΎ Π°Π»Π³ΠΎΡ€ΠΈΡ‚ΠΌΠ°ΠΌ ΠΎΡ‚ ЯндСкса.

РСшСния ΠΈ подсказки ΠΊ Ρ‚Ρ€Π΅Π½ΠΈΡ€ΠΎΠ²ΠΊΠ΅ ΠΏΠΎ Π°Π»Π³ΠΎΡ€ΠΈΡ‚ΠΌΠ°ΠΌ ΠΎΡ‚ ЯндСкса Π§Ρ‚ΠΎ Π΅ΡΡ‚ΡŒ Π²Π½ΡƒΡ‚Ρ€ΠΈ РСшСния с подсказками ΠΈ коммСнтариями; Ρ€Π΅ΠΊΠΎΠΌΠ΅Π½Π΄ΡƒΡŽ сначала ΡΠΌΠΎΡ‚Ρ€Π΅Ρ‚ΡŒ md Ρ„Π°ΠΉΠ» ΠΏ

Yankovsky Andrey 50 Dec 26, 2022
Nested cross-validation is necessary to avoid biased model performance in embedded feature selection in high-dimensional data with tiny sample sizes

Pruner for nested cross-validation - Sphinx-Doc Nested cross-validation is necessary to avoid biased model performance in embedded feature selection i

1 Dec 15, 2021
Vision Transformer and MLP-Mixer Architectures

Vision Transformer and MLP-Mixer Architectures Update (2.7.2021): Added the "When Vision Transformers Outperform ResNets..." paper, and SAM (Sharpness

Google Research 6.4k Jan 04, 2023
A quick recipe to learn all about Transformers

Transformers have accelerated the development of new techniques and models for natural language processing (NLP) tasks.

DAIR.AI 772 Dec 31, 2022
DI-HPC is an acceleration operator component for general algorithm modules in reinforcement learning algorithms

DI-HPC: Decision Intelligence - High Performance Computation DI-HPC is an acceleration operator component for general algorithm modules in reinforceme

OpenDILab 185 Dec 29, 2022
MWPToolkit is a PyTorch-based toolkit for Math Word Problem (MWP) solving.

MWPToolkit is a PyTorch-based toolkit for Math Word Problem (MWP) solving. It is a comprehensive framework for research purpose that integrates popular MWP benchmark datasets and typical deep learnin

119 Jan 04, 2023
Collapse by Conditioning: Training Class-conditional GANs with Limited Data

Collapse by Conditioning: Training Class-conditional GANs with Limited Data Moha

Mohamad Shahbazi 33 Dec 06, 2022
People Interaction Graph

Gihan Jayatilaka*, Jameel Hassan*, Suren Sritharan*, Janith Senananayaka, Harshana Weligampola, et. al., 2021. Holistic Interpretation of Public Scenes Using Computer Vision and Temporal Graphs to Id

University of Peradeniya : COVID Research Group 1 Aug 24, 2022