Asynchronous Advantage Actor-Critic (A3C) in PyTorch
@inproceedings{mnih2016asynchronous,
title={Asynchronous methods for deep reinforcement learning},
author={Mnih, Volodymyr and Badia, Adria Puigdomenech and Mirza, Mehdi and Graves, Alex and Lillicrap, Timothy P and Harley, Tim and Silver, David and Kavukcuoglu, Koray},
booktitle={International Conference on Machine Learning},
year={2016}}
This repository contains an implementation of Asynchronous Advantage Actor-Critic (A3C) in PyTorch, based on the original paper (Mnih et al., 2016) and the PyTorch implementation by Ilya Kostrikov.
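A3C is a state-of-the-art deep reinforcement learning method in which several actor-learner workers run in parallel, each on its own copy of the environment, and asynchronously update a shared model.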
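To illustrate the "advantage" in the method's name, the sketch below computes n-step discounted returns for a short rollout and subtracts the critic's value estimates from them. It is a minimal, dependency-free illustration and is not taken from this repository: the function names `discounted_returns` and `advantages`, the plain-list inputs, and the default `gamma=0.99` are all assumptions made for the example.

```python
def discounted_returns(rewards, bootstrap_value, gamma=0.99):
    """Compute the discounted return R_t for every step of a rollout.

    rewards:         list of rewards r_t collected by one worker
    bootstrap_value: critic estimate V(s_T) for the state after the last
                     step (use 0.0 if the episode terminated)
    gamma:           discount factor (0.99 is an assumed default)
    """
    returns = []
    running = bootstrap_value
    # Walk the rollout backwards: R_t = r_t + gamma * R_{t+1}
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Compute A_t = R_t - V(s_t) for every step of a rollout.

    values: list of critic estimates V(s_t), one per reward
    """
    returns = discounted_returns(rewards, bootstrap_value, gamma)
    return [ret - value for ret, value in zip(returns, values)]
```

In an actor-critic update, the advantage typically scales the policy-gradient term (actions that did better than the critic expected are reinforced), while the returns serve as regression targets for the critic.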
Dependencies
- Python 2.7
- PyTorch
- gym (OpenAI)
- universe (OpenAI)
- opencv (for env state processing)
- visdom (for visualization)
Training
./train_lstm.sh
Test with the trained weights after 169000 updates for PongDeterministic-v3:
./test_lstm.sh 169000
A test result video is available.