PyTorch implementation of SMODICE: Versatile Offline Imitation Learning via State Occupancy Matching

Related tags

Deep LearningSMODICE
Overview

SMODICE: Versatile Offline Imitation Learning via State Occupancy Matching

This is the official PyTorch implementation of SMODICE: Versatile Offline Imitation Learning via State Occupancy Matching.

SMODICE Demos

Tabular Experiments

  1. Offline Imitation Learning from Mismatched Experts
python smodice_tabular/run_tabular_mismatched.py
  1. Offline Imitation Learning from Examples
python smodice_tabular/run_tabular_example.py

Deep IL Experiments

Setup

  1. Create conda environment and activate it:
    conda env create -f environment.yml
    conda activate smodice
    pip install --upgrade numpy
    pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio===0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
    git clone https://github.com/rail-berkeley/d4rl
    cd d4rl
    pip install -e .
    
    

Offline IL from Observations

  1. Run the following command with variable ENV set to any of hopper, walker2d, halfcheetah, ant, kitchen.
python run_oil_observations.py --env_name $ENV
  1. For the AntMaze environment, first generate the random dataset:
cd envs
python generate_antmaze_random.py --noise

Then, run

python run_oil_antmaze.py

Offline IL from Mismatched Experts

  1. For halfcheetah and ant, run
python run_oil_observations.py --env_name halfcheetah --dataset 0.5 --mismatch True

and

python run_oil_observations.py --env_name ant --dataset disabled --mismatch True

respectively. 2. For AntMaze, run

python run_oil_antmaze.py --mismatch True

Offline IL from Examples

  1. For the PointMass-4Direction task, run
python run_oil_examples_pointmass.py
  1. For the AntMaze task, run
python run_oil_antmaze.py --mismatch False --example True
  1. For the Franka Kitchen based tasks, run
python run_oil_examples_kitchen.py --dataset $DATASET

where DATASET can be one of microwave, kettle.

Baselines

For any task, the BC baseline can be run by appending --disc_type bc to the above commands.

For RCE-TD3-BC and ORIL baselines, on the appropriate tasks, append --algo_type $ALGO where ALGO can be one of rce, oril.

Citation

If you find this repository useful for your research, please cite

@article{ma2022smodice,
      title={SMODICE: Versatile Offline Imitation Learning via State Occupancy Matching}, 
      author={Yecheng Jason Ma and Andrew Shen and Dinesh Jayaraman and Osbert Bastani},
      year={2022},
      url={https://arxiv.org/abs/2202.02433}
}

Contact

If you have any questions regarding the code or paper, feel free to contact me at [email protected].

Acknowledgment

This codebase is partially adapted from optidice, rce, relay-policy-learning, and d4rl ; We thank the authors and contributors for open-sourcing their code.

Owner
Jason Ma
Jason Ma
MoCoGAN: Decomposing Motion and Content for Video Generation

MoCoGAN: Decomposing Motion and Content for Video Generation This repository contains an implementation and further details of MoCoGAN: Decomposing Mo

Sergey Tulyakov 514 Dec 18, 2022
MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification

MixText This repo contains codes for the following paper: Jiaao Chen, Zichao Yang, Diyi Yang: MixText: Linguistically-Informed Interpolation of Hidden

GT-SALT 309 Dec 12, 2022
Official code for Next Check-ins Prediction via History and Friendship on Location-Based Social Networks (MDM 2018)

MUC Next Check-ins Prediction via History and Friendship on Location-Based Social Networks (MDM 2018) Performance Details for Accuracy: | Dataset

Yijun Su 3 Oct 09, 2022
Semi-supervised Implicit Scene Completion from Sparse LiDAR

Semi-supervised Implicit Scene Completion from Sparse LiDAR Paper Created by Pengfei Li, Yongliang Shi, Tianyu Liu, Hao Zhao, Guyue Zhou and YA-QIN ZH

114 Nov 30, 2022
Locally Differentially Private Distributed Deep Learning via Knowledge Distillation (LDP-DL)

Locally Differentially Private Distributed Deep Learning via Knowledge Distillation (LDP-DL) A preprint version of our paper: Link here This is a samp

Di Zhuang 3 Jan 08, 2023
Application of K-means algorithm on a music dataset after a dimensionality reduction with PCA

PCA for dimensionality reduction combined with Kmeans Goal The Goal of this notebook is to apply a dimensionality reduction on a big dataset in order

Arturo Ghinassi 0 Sep 17, 2022
Reinforcement Learning Theory Book (rus)

Reinforcement Learning Theory Book (rus)

qbrick 206 Nov 27, 2022
Escaping the Gradient Vanishing: Periodic Alternatives of Softmax in Attention Mechanism

Period-alternatives-of-Softmax Experimental Demo for our paper 'Escaping the Gradient Vanishing: Periodic Alternatives of Softmax in Attention Mechani

slwang9353 0 Sep 06, 2021
Danfeng Hong, Lianru Gao, Jing Yao, Bing Zhang, Antonio Plaza, Jocelyn Chanussot. Graph Convolutional Networks for Hyperspectral Image Classification, IEEE TGRS, 2021.

Graph Convolutional Networks for Hyperspectral Image Classification Danfeng Hong, Lianru Gao, Jing Yao, Bing Zhang, Antonio Plaza, Jocelyn Chanussot T

Danfeng Hong 154 Dec 13, 2022
Spatial Contrastive Learning for Few-Shot Classification (SCL)

This repo contains the official implementation of Spatial Contrastive Learning for Few-Shot Classification (SCL), which presents of a novel contrastive learning method applied to few-shot image class

Yassine 34 Dec 25, 2022
Azua - build AI algorithms to aid efficient decision-making with minimum data requirements.

Project Azua 0. Overview Many modern AI algorithms are known to be data-hungry, whereas human decision-making is much more efficient. The human can re

Microsoft 197 Jan 06, 2023
Pytorch implementation of the paper Time-series Generative Adversarial Networks

TimeGAN-pytorch Pytorch implementation of the paper Time-series Generative Adversarial Networks presented at NeurIPS'19. Jinsung Yoon, Daniel Jarrett

Zhiwei ZHANG 21 Nov 24, 2022
An Exact Solver for Semi-supervised Minimum Sum-of-Squares Clustering

PC-SOS-SDP: an Exact Solver for Semi-supervised Minimum Sum-of-Squares Clustering PC-SOS-SDP is an exact algorithm based on the branch-and-bound techn

Antonio M. Sudoso 1 Nov 13, 2022
Implementations of polygamma, lgamma, and beta functions for PyTorch

lgamma Implementations of polygamma, lgamma, and beta functions for PyTorch. It's very hacky, but that's usually ok for research use. To build, run: .

Rachit Singh 24 Nov 09, 2021
This is the official pytorch implementation of the BoxEL for the description logic EL++

BoxEL: Box EL++ Embedding This is the official pytorch implementation of the BoxEL for the description logic EL++. BoxEL++ is a geometric approach bas

1 Nov 03, 2022
Happywhale - Whale and Dolphin Identification Silver🥈 Solution (26/1588)

Kaggle-Happywhale Happywhale - Whale and Dolphin Identification Silver 🥈 Solution (26/1588) 竞赛方案思路 图像数据预处理-标志性特征图片裁剪:首先根据开源的标注数据训练YOLOv5x6目标检测模型,将训练集

Franxx 20 Nov 14, 2022
BookMyShowPC - Movie Ticket Reservation App made with Tkinter

Book My Show PC What is this? Movie Ticket Reservation App made with Tkinter. Tk

The Nithin Balaji 3 Dec 09, 2022
Code release of paper "Deep Multi-View Stereo gone wild"

Deep MVS gone wild Pytorch implementation of "Deep MVS gone wild" (Paper | website) This repository provides the code to reproduce the experiments of

François Darmon 53 Dec 24, 2022
HMLLDB is a collection of LLDB commands to assist in the debugging of iOS apps.

HMLLDB is a collection of LLDB commands to assist in the debugging of iOS apps. 中文介绍 Features Non-intrusive. Your iOS project does not need to be modi

mao2020 47 Oct 22, 2022
diablo2 resurrected loot filter

Only For Chinese and Traditional Chinese The filter only for Chinese and Traditional Chinese, i didn't change it for other language.Maybe you could mo

elmagnifico 249 Dec 04, 2022