[ICLR 2021] Rank the Episodes: A Simple Approach for Exploration in Procedurally-Generated Environments.

[ICLR 2021] RAPID: A Simple Approach for Exploration in Reinforcement Learning

This is the TensorFlow implementation of the ICLR 2021 paper Rank the Episodes: A Simple Approach for Exploration in Procedurally-Generated Environments. We propose RAPID, a simple exploration method that scores previous episodes and reproduces the good exploration behaviors with imitation learning.

The implementation is based on OpenAI baselines. For any experiment, add the option --disable_rapid to see the baseline result. RAPID can achieve better performance and sample efficiency than state-of-the-art exploration methods on MiniGrid environments.

Cite This Work

@inproceedings{
zha2021rank,
title={Rank the Episodes: A Simple Approach for Exploration in Procedurally-Generated Environments},
author={Daochen Zha and Wenye Ma and Lei Yuan and Xia Hu and Ji Liu},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=MtEE0CktZht}
}

Installation

Please make sure that you have Python 3.5+ installed. First, clone the repo with

git clone https://github.com/daochenzha/rapid.git
cd rapid

Then install the dependencies with pip:

pip install -r requirements.txt
pip install -e .

To run MuJoCo experiments, you need a MuJoCo license. Install mujoco-py with

pip install mujoco-py==1.50.1.68

How to run the code

The entry point is main.py. Some important hyperparameters are as follows.

  • --env: the environment to use
  • --num_timesteps: the number of timesteps to run
  • --w0: the weight of the extrinsic reward score
  • --w1: the weight of the local score
  • --w2: the weight of the global score
  • --sl_until: the timestep until which RAPID updates are performed
  • --disable_rapid: disable RAPID to compare with the PPO baseline
  • --log_dir: the directory in which to save logs
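
The three weights suggest that each episode receives a weighted sum of an extrinsic, a local, and a global score, and that highly ranked episodes are kept for imitation. The following is a minimal sketch of that idea, not the code in this repository. The score definitions (fraction of distinct states for the local score, mean inverse square root of lifetime visit counts for the global score), the names episode_score and RankingBuffer, and the episode-level buffer are all assumptions made for illustration.

```python
import heapq
import math
from collections import Counter


def episode_score(states, episode_return, visit_counts, w0, w1, w2):
    """Weighted sum of extrinsic, local, and global scores (assumed form).

    states: hashable state keys visited in one episode.
    visit_counts: Counter of lifetime visits per state, updated in place.
    """
    visit_counts.update(states)
    # Local score (assumption): fraction of distinct states in the episode.
    local = len(set(states)) / len(states)
    # Global score (assumption): mean of 1/sqrt(lifetime visit count).
    global_ = sum(1.0 / math.sqrt(visit_counts[s]) for s in states) / len(states)
    return w0 * episode_return + w1 * local + w2 * global_


class RankingBuffer:
    """Keeps the `capacity` highest-scoring episodes (hypothetical helper)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []   # min-heap of (score, insertion index, episode)
        self._count = 0   # insertion index breaks ties between equal scores

    def add(self, score, episode):
        item = (score, self._count, episode)
        self._count += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, item)
        elif score > self._heap[0][0]:
            # Evict the current lowest-scoring episode.
            heapq.heapreplace(self._heap, item)

    def episodes(self):
        """Return stored episodes, best score first."""
        return [ep for _, _, ep in sorted(self._heap, reverse=True)]
```

In this sketch, an imitation-learning step would sample state-action pairs from buffer.episodes() and fit the policy to them. The weight values are left to the caller, because the defaults used by main.py are not listed here.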

Reproducing the result of MiniGrid environments

For MiniGrid-KeyCorridorS3R2, run

python main.py --env MiniGrid-KeyCorridorS3R2-v0 --sl_until 1200000

For MiniGrid-KeyCorridorS3R3, run

python main.py --env MiniGrid-KeyCorridorS3R3-v0 --sl_until 3000000

For other environments, run

python main.py --env $ENV

where $ENV is the environment name.
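
To compare RAPID against the baseline on the same environment, it can help to generate both commands together. The sketch below only assembles the argument lists from the flags shown above. The helper name build_commands and the SL_UNTIL table are illustrative and not part of the repository.

```python
# --sl_until values taken from the commands above; other environments
# fall back to whatever default main.py uses.
SL_UNTIL = {
    "MiniGrid-KeyCorridorS3R2-v0": 1200000,
    "MiniGrid-KeyCorridorS3R3-v0": 3000000,
}


def build_commands(env):
    """Return (rapid_cmd, baseline_cmd) argument lists for one environment."""
    cmd = ["python", "main.py", "--env", env]
    if env in SL_UNTIL:
        cmd += ["--sl_until", str(SL_UNTIL[env])]
    # The baseline run is the same command with --disable_rapid appended.
    return cmd, cmd + ["--disable_rapid"]


if __name__ == "__main__":
    for env in SL_UNTIL:
        for cmd in build_commands(env):
            # Print only; pass cmd to subprocess.run(cmd, check=True) to launch.
            print(" ".join(cmd))
```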

Run MiniWorld Maze environment

  1. Clone the latest master branch of MiniWorld and install it
git clone -b master --single-branch --depth=1 https://github.com/maximecb/gym-miniworld.git
cd gym-miniworld
pip install -e .
cd ..
  2. Start training with
python main.py --env MiniWorld-MazeS5-v0 --num_timesteps 5000000 --nsteps 512 --w1 0.00001 --w2 0.0 --log_dir results/MiniWorld-MazeS5-v0

On a server without a display, you can install xvfb with

apt-get install xvfb

Then start training with

xvfb-run -a -s "-screen 0 1024x768x24 -ac +extension GLX +render -noreset" python main.py --env MiniWorld-MazeS5-v0 --num_timesteps 5000000 --nsteps 512 --w1 0.00001 --w2 0.0 --log_dir results/MiniWorld-MazeS5-v0

Run MuJoCo experiments

Run

python main.py --seed 0 --env $ENV --num_timesteps 5000000 --lr 5e-4 --w1 0.001 --w2 0.0 --log_dir logs/$ENV/rapid

where $ENV can be EpisodeSwimmer-v2, EpisodeHopper-v2, EpisodeWalker2d-v2, EpisodeInvertedPendulum-v2, DensityEpisodeSwimmer-v2, or ViscosityEpisodeSwimmer-v2.
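
Running several seeds per environment is a common way to get less noisy results. The sketch below expands the command above over the listed environments and a set of seeds. The seed values and the per-seed subdirectory under --log_dir are assumptions added so that runs do not overwrite each other. The original command uses logs/$ENV/rapid with seed 0.

```python
import itertools

MUJOCO_ENVS = [
    "EpisodeSwimmer-v2",
    "EpisodeHopper-v2",
    "EpisodeWalker2d-v2",
    "EpisodeInvertedPendulum-v2",
    "DensityEpisodeSwimmer-v2",
    "ViscosityEpisodeSwimmer-v2",
]


def mujoco_command(env, seed):
    """Argument list for one MuJoCo run, mirroring the command above."""
    # Assumption: a seed-specific subdirectory keeps logs of different seeds apart.
    log_dir = "logs/{}/rapid/seed{}".format(env, seed)
    return [
        "python", "main.py",
        "--seed", str(seed),
        "--env", env,
        "--num_timesteps", "5000000",
        "--lr", "5e-4",
        "--w1", "0.001",
        "--w2", "0.0",
        "--log_dir", log_dir,
    ]


def sweep(envs=MUJOCO_ENVS, seeds=(0, 1, 2)):
    """All (environment, seed) combinations; the seeds are illustrative."""
    return [mujoco_command(e, s) for e, s in itertools.product(envs, seeds)]


if __name__ == "__main__":
    for cmd in sweep():
        print(" ".join(cmd))
```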

Owner
Daochen Zha
PhD student in Machine Learning and Data Mining