Safe Policy Optimization with Local Features

Overview

Safe Policy Optimization with Local Feature (SPO-LF)

This is the source-code for implementing the algorithms in the paper "Safe Policy Optimization with Local Generalized Linear Function Approximations" which was presented in NeurIPS-21.

Installation

There is requirements.txt in this repository. Except for the common modules (e.g., numpy, scipy), our source code depends on the following modules.

We also provide Dockerfile in this repository, which can be used for reproducing our grid-world experiment.

Simulation configuration

We manage the simulation configuration using hydra. Configurations are listed in config.yaml. For example, the algorithm to run should be chosen from the ones we implemented:

sim_type: {safe_glm, unsafe_glm, random, oracle, safe_gp_state, safe_gp_feature, safe_glm_stepwise}

Grid World Experiment

The source code necessary for our grid-world experiment is contained in /grid_world folder. To run the simulation, for example, use the following commands.

cd grid_world
python main.py sim_type=safe_glm env.reuse_env=False

For the monte carlo simulation while comparing our proposed method with baselines, use the shell file, run.sh.

We also provide a script for visualization. If you want to render how the agent behaves, use the following command.

python main.py sim_type=safe_glm env.reuse_env=True

Safety-Gym Experiment

The source code necessary for our safety-gym experiment is contained in /safety_gym_discrete folder. Our experiment is based on safety-gym. Our proposed method utilize dynamic programming algorithms to solve Bellman Equation, so we modified engine.py to discrtize the environment. We attach modified safety-gym source code in /safety_gym_discrete/engine.py. To use the modified library, please clone safety-gym, then replace safety-gym/safety_gym/envs/engine.py using /safety_gym_discrete/engine.py in our repo. Using the following commands to install the modified library:

cd safety_gym
pip install -e .

Note that MuJoCo licence is needed for installing Safety-Gym. To run the simulation, use the folowing commands.

cd safety_gym_discrete
python main.py sim_idx=0

We compare our proposed method with three notable baselines: CPO, PPO-Lagrangian, and TRPO-Lagrangian. The baseline implementation depends on safety-starter-agents. We modified run_agent.py in the repo source code.

To run the baseline, use the folowing commands.

cd safety_gym_discrete/baseline
python baseline_run.py sim_type=cpo

The environment that agent runs on is generated using generate_env.py. We provide 10 50*50 environments. If you want to generate other environments, you can change the world shape in safety_gym_discrete.py, and running the following commands:

cd safety_gym_discrete
python generate_env.py

Citation

If you find this code useful in your research, please consider citing:

@inproceedings{wachi_yue_sui_neurips2021,
  Author = {Wachi, Akifumi and Wei, Yunyue and Sui, Yanan},
  Title = {Safe Policy Optimization with Local Generalized Linear Function Approximations},
  Booktitle  = {Neural Information Processing Systems (NeurIPS)},
  Year = {2021}
}
Owner
Akifumi Wachi
Akifumi Wachi
A simple way to store your passwords without requiring third party applications

SimplePasswordManager A simple way to store your passwords without requiring third party applications Simple To Use. Store Your Passwords For Each Web

Leone Odinga 1 Dec 23, 2021
User-friendly reference finder in IDA

IDARefHunter Updated: This project's been introduced on IDA Plugin Contest 2021! Why do we need RefHunter? Getting reference information in one specif

Jiwon 29 Dec 04, 2022
Auto Tor Ip Changer

AutoTor Auto Tor Ip Changer for Linux! git clone https://github.com/Arest7/AutoTor cd AutoTor pip install -r requirements.txt python3 AutoTor.py follo

Ken Ryuguji 3 Jan 23, 2022
SubFind - Subdomain Finder Tools

SubFind (Subdomain Finder Tools) Info Tools Result Of Subdomain Command In Termi

LangMurpY 2 Jan 25, 2022
KeyKatcher is a keylogger that records keystrokes made on a computer and sends to the E-Mail.

What is a keylogger? A keylogger is a software application or piece of hardware that monitors and records keystrokes made on a computer keyboard. The

Himank_Jain 7 Sep 19, 2022
Tool To generate Stable Undetected Payload

windowsPayload Tool To generate Stable Undetected Payload Don t Upload to Virus Total :) Follow on Social Media Platforms ScreenShots How to install +

youhacker55 117 Dec 30, 2022
Safe Policy Optimization with Local Features

Safe Policy Optimization with Local Feature (SPO-LF) This is the source-code for implementing the algorithms in the paper "Safe Policy Optimization wi

Akifumi Wachi 6 Jun 05, 2022
Northwave Log4j CVE-2021-44228 checker

Northwave Log4j CVE-2021-44228 checker Friday 10 December 2021 a new Proof-of-Concept 1 addressing a Remote code Execution (RCE) vulnerability in the

Northwave 125 Dec 09, 2022
This is a simple PoC for the newly found Polkit error names PwnKit

A Python3 and a BASH PoC for CVE-2021-4034 by Kim Schulz

Kim Schulz 16 Sep 06, 2022
An forensics tool to help aid in the investigation of spoofed emails based off the email headers.

A forensic tool to make analysis of email headers easy to aid in the quick discovery of the attacker. Table of Contents About mailMeta Installation Us

Syed Modassir Ali 59 Nov 26, 2022
Grafana-0Day-Vuln-POC

Grafana V8.0+版本存在未授权任意文件读取 0Day漏洞 - POC 1 漏洞信息 1.1 基本信息 漏洞厂商:Grafana 厂商官网:https://grafana.com/ 1.2 漏洞描述 Grafana是一个跨平台、开源的数据可视化网络应用程序平台。用户配置连接的数据源之后,Gr

mik1th0n 3 Dec 13, 2021
PyExtractor is a decompiler that can fully decompile exe's compiled with pyinstaller or py2exe

PyExtractor is a decompiler that can fully decompile exe's compiled with pyinstaller or py2exe with additional features such as malware checker/detector! Also checks file(s) for suspicious words, dis

Rdimo 56 Jul 31, 2022
Gefilte Fish GMail filter creator

Gefilte Fish: GMail filter maker Gefilte Fish automates the creation of GMail filters. Use it like this: from gefilte import GefilteFish,

Ned Batchelder 31 Sep 28, 2022
labsecurity is a tool that brings together python scripts made for ethical hacking, in a single tool, through a console interface

labsecurity labsecurity is a tool that brings together python scripts made for ethical hacking, in a single tool, through a console interface. Warning

Dylan Meca 16 Dec 08, 2022
Laravel RCE (CVE-2021-3129)

CVE-2021-3129 - Laravel RCE About The script has been made for exploiting the Laravel RCE (CVE-2021-3129) vulnerability. This script allows you to wri

Joshua van der Poll 21 Dec 27, 2022
A Python r2pipe script to automatically create a Frida hook to intercept TLS traffic for Flutter based apps

boring-flutter A Python r2pipe script to automatically create a Frida hook to intercept TLS traffic for Flutter based apps. Currently only supporting

Hamza 64 Oct 18, 2022
Simple tool to create passwords.

PasswordGenerator Simple password generator: -Simplisitc Window Application -Allows Numbers, Symbols & letters upper and lowercase -Restricts rows of

DM 1 Jan 10, 2022
A deobfuscator for multiple python obfuscators

PY4COC A deobfuscator for multiple python obfuscators, supports exe's packed with pyinstaller too. How to use python3 py4coc.py exe file or py file o

svenskithesource 16 Dec 03, 2022
This is the fuzzer I made to fuzz Preview on macOS and iOS like 8years back when I just started fuzzing things.

Fuzzing PDFs like its 1990s This is the fuzzer I made to fuzz Preview on macOS and iOS like 8years back when I just started fuzzing things. Some discl

Chaithu 14 Sep 30, 2022
A Python application to predict what is cooking

ez-cuisine-classifier A Python application to predict what is cooking Environment Python 3.9 Windows 10 Install python -m venv venv .\venv\Scripts\act

Zeheng Li 1 Jun 21, 2022