Pytorch implementation of Distributed Proximal Policy Optimization: https://arxiv.org/abs/1707.02286

Overview

Pytorch-DPPO

Pytorch implementation of Distributed Proximal Policy Optimization: https://arxiv.org/abs/1707.02286 Using PPO with clip loss (from https://arxiv.org/pdf/1707.06347.pdf).

I finally fixed what was wrong with the gradient descent step, using previous log-prob from rollout batches. At least ppo.py is fixed, the rest is going to be corrected as well very soon.

In the following example I was not patient enough to wait for million iterations, I just wanted to check if the model is properly learning:

Progress of single PPO:

InvertedPendulum

InvertedPendulum

InvertedDoublePendulum

InvertedDoublePendulum

HalfCheetah

HalfCheetah

hopper (PyBullet)

hopper (PyBullet)

halfcheetah (PyBullet)

halfcheetah (PyBullet)

Progress of DPPO (4 agents) [TODO]

Acknowledgments

The structure of this code is based on https://github.com/ikostrikov/pytorch-a3c.

Hyperparameters and loss computation has been taken from https://github.com/openai/baselines

Owner
Alexis David Jacq
Alexis David Jacq
Paper: De-rendering Stylized Texts

Paper: De-rendering Stylized Texts Wataru Shimoda1, Daichi Haraguchi2, Seiichi Uchida2, Kota Yamaguchi1 1CyberAgent.Inc, 2 Kyushu University Accepted

CyberAgent AI Lab 55 Dec 18, 2022
Feature board for ERPNext

ERPNext Feature Board Feature board for ERPNext Development Prerequisites k3d kubectl helm bench Install K3d Cluster # export K3D_FIX_CGROUPV2=1 # use

Revant Nandgaonkar 16 Nov 09, 2022
Neural Fixed-Point Acceleration for Convex Optimization

Licensing The majority of neural-scs is licensed under the CC BY-NC 4.0 License, however, portions of the project are available under separate license

Facebook Research 27 Oct 06, 2022
Source code for CIKM 2021 paper for Relation-aware Heterogeneous Graph for User Profiling

RHGN Source code for CIKM 2021 paper for Relation-aware Heterogeneous Graph for User Profiling Dependencies torch==1.6.0 torchvision==0.7.0 dgl==0.7.1

Big Data and Multi-modal Computing Group, CRIPAC 6 Nov 29, 2022
Bare bones use-case for deploying a containerized web app (built in streamlit) on AWS.

Containerized Streamlit web app This repository is featured in a 3-part series on Deploying web apps with Streamlit, Docker, and AWS. Checkout the blo

Collin Prather 62 Jan 02, 2023
A knowledge base construction engine for richly formatted data

Fonduer is a Python package and framework for building knowledge base construction (KBC) applications from richly formatted data. Note that Fonduer is

HazyResearch 386 Dec 05, 2022
Improving Calibration for Long-Tailed Recognition (CVPR2021)

MiSLAS Improving Calibration for Long-Tailed Recognition Authors: Zhisheng Zhong, Jiequan Cui, Shu Liu, Jiaya Jia [arXiv] [slide] [BibTeX] Introductio

Jia Research Lab 116 Dec 20, 2022
Pretrained Cost Model for Distributed Constraint Optimization Problems

Pretrained Cost Model for Distributed Constraint Optimization Problems Requirements PyTorch 1.9.0 PyTorch Geometric 1.7.1 Directory structure baseline

2 Aug 28, 2022
SiT: Self-supervised vIsion Transformer

This repository contains the official PyTorch self-supervised pretraining, finetuning, and evaluation codes for SiT (Self-supervised image Transformer).

Sara Ahmed 275 Dec 28, 2022
PyTorch Implementation of Backbone of PicoDet

PicoDet-Backbone PyTorch Implementation of Backbone of PicoDet Original Implementation is implemented on PaddlePaddle. Example picodet_l_backbone = ES

Yonghye Kwon 7 Jul 12, 2022
Gradient Step Denoiser for convergent Plug-and-Play

Source code for the paper "Gradient Step Denoiser for convergent Plug-and-Play"

Samuel Hurault 11 Sep 17, 2022
Code for CVPR 2021 paper TransNAS-Bench-101: Improving Transferrability and Generalizability of Cross-Task Neural Architecture Search.

TransNAS-Bench-101 This repository contains the publishable code for CVPR 2021 paper TransNAS-Bench-101: Improving Transferrability and Generalizabili

Yawen Duan 17 Nov 20, 2022
A deep learning framework for historical document image analysis

DIVA-DAF Description A deep learning framework for historical document image analysis. How to run Install dependencies # clone project git clone https

9 Aug 04, 2022
Code for our CVPR2021 paper coordinate attention

Coordinate Attention for Efficient Mobile Network Design (preprint) This repository is a PyTorch implementation of our coordinate attention (will appe

Qibin (Andrew) Hou 726 Jan 05, 2023
ConformalLayers: A non-linear sequential neural network with associative layers

ConformalLayers: A non-linear sequential neural network with associative layers ConformalLayers is a conformal embedding of sequential layers of Convo

Prograf-UFF 5 Sep 28, 2022
Dialect classification

Dialect-Classification This repository presents the data that was used in a talk at ICKL-5 (5th International Conference on Kurdish Linguistics) at th

Kurdish-BLARK 0 Nov 12, 2021
Vector Neurons: A General Framework for SO(3)-Equivariant Networks

Vector Neurons: A General Framework for SO(3)-Equivariant Networks Created by Congyue Deng, Or Litany, Yueqi Duan, Adrien Poulenard, Andrea Tagliasacc

Congyue Deng 332 Dec 29, 2022
My implementation of Fully Convolutional Neural Networks in Keras

Keras-FCN This repository contains my implementation of Fully Convolutional Networks in Keras (Tensorflow backend). Currently, semantic segmentation c

The Duy Nguyen 15 Jan 13, 2020
Unofficial TensorFlow implementation of the Keyword Spotting Transformer model

Keyword Spotting Transformer This is the unofficial TensorFlow implementation of the Keyword Spotting Transformer model. This model is used to train o

Intelligent Machines Limited 8 May 11, 2022
A non-linear, non-parametric Machine Learning method capable of modeling complex datasets

Fast Symbolic Regression Symbolic Regression is a non-linear, non-parametric Machine Learning method capable of modeling complex data sets. fastsr aim

VAMSHI CHOWDARY 3 Jun 22, 2022