A clean and robust PyTorch implementation of PPO for continuous action spaces.

Last update: Dec 16, 2022

Overview

PPO-Continuous-Pytorch

I found that the existing implementations of PPO for continuous action spaces are either somewhat complicated or unstable.
This is a clean and robust PyTorch implementation of PPO for continuous action spaces. Here is the result:

All experiments are trained with the same hyperparameters.

Dependencies

gym==0.18.3
box2d==2.3.10
numpy==1.21.2
pytorch==1.8.1
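
To confirm that an environment matches the pinned versions above, a small check along these lines may help. This is only a sketch: the distribution names in `PINNED` are assumptions (for example, PyTorch is published on PyPI as `torch`, and the Box2D bindings may be installed under a different name), and `check_versions` is a hypothetical helper, not part of this repository.

```python
from importlib import metadata

# Pinned versions from the list above. The keys are ASSUMED distribution
# names; adjust them to whatever your package manager actually installed.
PINNED = {
    "gym": "0.18.3",
    "box2d": "2.3.10",   # assumption: the distribution name may differ
    "numpy": "1.21.2",
    "torch": "1.8.1",    # PyTorch is distributed on PyPI as "torch"
}


def check_versions(pinned, get_version=metadata.version):
    """Return {name: (wanted, found, matches)} for each pinned package."""
    report = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Ignore local build suffixes such as "+cu111" when comparing.
        base = found.split("+")[0] if found is not None else None
        report[name] = (wanted, found, base == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in check_versions(PINNED).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: wanted {wanted}, found {found} [{status}]")
```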

How to use my code

Play with trained model

Run 'python main.py --write False --render True --Loadmodel True --ModelIdex 400'
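
The command above passes booleans as the literal strings 'True' and 'False'. With plain `argparse`, `type=bool` would treat any non-empty string (including 'False') as true, so such flags need an explicit converter. The sketch below shows one common way to do this. It is an illustration only: `str2bool`, `build_parser`, and the default values are assumptions, and the actual parsing in 'main.py' may differ.

```python
import argparse


def str2bool(value):
    """Convert strings such as 'True' / 'False' into real booleans."""
    if isinstance(value, bool):
        return value
    lowered = value.lower()
    if lowered in ("true", "t", "yes", "y", "1"):
        return True
    if lowered in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    # Flag names are taken from the command above; defaults are assumed.
    parser = argparse.ArgumentParser()
    parser.add_argument("--write", type=str2bool, default=True)
    parser.add_argument("--render", type=str2bool, default=False)
    parser.add_argument("--Loadmodel", type=str2bool, default=False)
    parser.add_argument("--ModelIdex", type=int, default=0)
    return parser


# Parse the same arguments as in the command above.
argv = "--write False --render True --Loadmodel True --ModelIdex 400".split()
args = build_parser().parse_args(argv)
print(args.write, args.render, args.Loadmodel, args.ModelIdex)
```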

Train from scratch

Run 'python main.py'. The default environment is Pendulum-v0.

Change Environment

If you want to train on a different environment, just run 'python main.py --EnvIdex 0'.
--EnvIdex can be set to a value from 0 to 5, where
'--EnvIdex 0' for 'BipedalWalker-v3'
'--EnvIdex 1' for 'BipedalWalkerHardcore-v3'
'--EnvIdex 2' for 'LunarLanderContinuous-v2'
'--EnvIdex 3' for 'Pendulum-v0'
'--EnvIdex 4' for 'Humanoid-v2'
'--EnvIdex 5' for 'HalfCheetah-v2'
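
The index-to-environment mapping above can be sketched as a simple lookup. `ENV_NAMES` and `env_name_from_index` are hypothetical names used for illustration; 'main.py' may organize this differently.

```python
# Environment names in the order listed above (index = --EnvIdex value).
ENV_NAMES = [
    "BipedalWalker-v3",
    "BipedalWalkerHardcore-v3",
    "LunarLanderContinuous-v2",
    "Pendulum-v0",
    "Humanoid-v2",
    "HalfCheetah-v2",
]


def env_name_from_index(env_index):
    """Return the gym environment id for a given --EnvIdex value."""
    if not 0 <= env_index < len(ENV_NAMES):
        raise ValueError(
            f"EnvIdex must be between 0 and {len(ENV_NAMES) - 1}, got {env_index}"
        )
    return ENV_NAMES[env_index]


# With gym installed, the environment could then be created with
# gym.make(env_name_from_index(3)).
print(env_name_from_index(3))
```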

Visualize the training curve

You can use TensorBoard to visualize the training curves. Past training curves are saved in '\runs'.
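
As a rough sketch, TensorBoard can be started from Python by invoking its command-line tool with the `--logdir` option. The helper names below are hypothetical, and the log directory is assumed to be a `runs` folder relative to where 'main.py' was run.

```python
import subprocess


def tensorboard_command(logdir="runs", port=6006):
    """Build the TensorBoard command line for a given log directory."""
    return ["tensorboard", "--logdir", logdir, "--port", str(port)]


def launch_tensorboard(logdir="runs", port=6006):
    """Start TensorBoard in the background (requires tensorboard installed)."""
    return subprocess.Popen(tensorboard_command(logdir, port))


# Example: print the equivalent shell command instead of launching it.
print(" ".join(tensorboard_command()))
```

Calling `launch_tensorboard()` and then opening http://localhost:6006 in a browser should show the logged curves, provided the assumed directory is correct.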

Hyperparameter Setting

For more details on the hyperparameter settings, please check 'main.py'.

Owner

XinJingHao
