Conjugated Discrete Distributions for Distributional Reinforcement Learning (C2D)

Code & Data Appendix for Conjugated Discrete Distributions for Distributional Reinforcement Learning.

Björn Lindenberg, Jonas Nordqvist, Karl-Olof Lindahl

Citation

If you use C2D in your research, please cite the following:

@misc{lindenberg2021conjugated,
      title={Conjugated Discrete Distributions for Distributional Reinforcement Learning}, 
      author={Björn Lindenberg and Jonas Nordqvist and Karl-Olof Lindahl},
      year={2021},
      eprint={2112.07424},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Data

  • Agent scores are available in the data folder.
  • Raw experiment data for each seed is available in the folder data/supplementary.
  • Each seed was run on an Ubuntu 20.04 virtual machine with 64 GB RAM, a single Nvidia Quadro P4000 GPU and TensorFlow 2.5.

Code

  • The C++20 source code that handles ALE and transition buffering resides in src.
  • The agent code, including the algorithms, is written in Python with TensorFlow and resides in c2d.
  • Requirements: cuDNN, TensorFlow 2.X, Python 3, the Arcade Learning Environment (ALE), a C++20 compiler and LZ4. For the full list of dependencies, see the VM setup files in install_scripts.

Atari Games

  • To avoid legal issues, our Atari 2600 ROM directory ale_roms is left empty. The corresponding binaries are widely available elsewhere; for example, Breakout (breakout.bin) can be extracted from the atari-py Python package.
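
As a minimal sketch of populating ale_roms, the helper below copies ROM binaries from an existing directory, for instance the ROM folder of an installed atari-py package. The name copy_roms and the assumption that ROMs use a .bin extension are illustrative and not part of this repository. The ROM location inside atari-py depends on the package version, so the source directory is passed in explicitly.

```python
import shutil
from pathlib import Path


def copy_roms(src_dir, dst_dir="ale_roms", games=None):
    """Copy *.bin ROM files from src_dir into dst_dir (hypothetical helper).

    games: optional iterable of game names (e.g. ["breakout"]); when given,
    only <name>.bin files for those games are copied.
    Returns the sorted list of copied file names.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    wanted = None if games is None else {g.lower() for g in games}
    copied = []
    for rom in sorted(src.glob("*.bin")):
        # Skip ROMs that were not requested.
        if wanted is not None and rom.stem.lower() not in wanted:
            continue
        shutil.copy2(rom, dst / rom.name)
        copied.append(rom.name)
    return copied
```

For example, copy_roms("/path/to/roms", games=["breakout"]) would place breakout.bin in ale_roms; the source path here is a placeholder.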

Library

  • The directory ale_roms must be populated with the binaries of the Atari games you want to run. ALE's checksum file md5.txt, used to check binary compatibility, is in the root directory.

  • The initial library setup, and any later change to settings.cmake, requires compiling with:

    bash build_lib.sh
    
  • One can train for one iteration (1M frames) in Breakout with:

    python3 run.py --game breakout --tag test --iterations 1
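
Before compiling and training, it can help to check the binaries in ale_roms against md5.txt. The sketch below is one way to do that and is not part of this repository: the function names are hypothetical, and it assumes each relevant line of md5.txt has the form "<32 hex characters> <file name>", skipping any line that does not match.

```python
import hashlib
import re
from pathlib import Path

# Assumed line format: "<md5 hex digest> <rom file name>".
_MD5_LINE = re.compile(r"^([0-9a-fA-F]{32})\s+(\S+)\s*$")


def load_md5_table(md5_file="md5.txt"):
    """Map ROM file name -> expected MD5 hex digest (lower case)."""
    table = {}
    for line in Path(md5_file).read_text().splitlines():
        m = _MD5_LINE.match(line.strip())
        if m:  # header or blank lines are ignored
            table[m.group(2)] = m.group(1).lower()
    return table


def check_roms(rom_dir="ale_roms", md5_file="md5.txt"):
    """Return {file name: 'ok' | 'mismatch' | 'unknown'} for each *.bin file."""
    table = load_md5_table(md5_file)
    report = {}
    for rom in sorted(Path(rom_dir).glob("*.bin")):
        digest = hashlib.md5(rom.read_bytes()).hexdigest()
        if rom.name not in table:
            report[rom.name] = "unknown"    # no checksum listed under this name
        elif table[rom.name] == digest:
            report[rom.name] = "ok"
        else:
            report[rom.name] = "mismatch"   # likely a different ROM revision
    return report
```

Calling check_roms() from the repository root returns a per-file status. An "unknown" entry may only mean the file is named differently than in md5.txt.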
    

Figures

Performance Profile (Deep Reinforcement Learning at the Edge of the Statistical Precipice, Agarwal et al. 2021)

[Figures: Performance Profile; Aggregate Metrics]

Sampling Efficiency: Mean and Median

[Figures: Mean; Median]

Training Graphs

[Figure: All Games]

Strong/Weak Examples

Support Evolution

[Figure: Support]
