ManiSkill-Learn is a framework for training agents on SAPIEN Open-Source Manipulation Skill Challenge (ManiSkill Challenge), a large-scale learning-from-demonstrations benchmark for object manipulation.

Overview

ManiSkill-Learn

ManiSkill-Learn is a framework for training agents on SAPIEN Open-Source Manipulation Skill Challenge, a large-scale learning-from-demonstrations benchmark for object manipulation. In this challenge, an agent is aimed at generalizing its manipulation skills to unseen objects of the same category given demonstrations and inputs.

An important feature of this package is that it supports visual inputs, especially point-cloud inputs. Such visual data is widely obtainable and applicable in real-world settings, such as self-driving and robotics. Point cloud features also contain explicit and accurate positional information, which could be challenging to be inferred purely through RGB-D images.

ManiSkill-Learn implements various point cloud-based network architectures (e.g. PointNet, PointNet Transformer) adapted to various learning-from-demonstrations algorithms (e.g. Behavior Cloning(BC), Offline/Batch RL(BCQ, CQL, TD3-BC)). It is easy for everyone to design new network architectures and new learning-from-demonstrations algorithms, change the observation processing framework, and generate new demonstrations.

Getting Started

Installation

ManiSkill-Learn requires the python version to be >= 3.6 and torch version to be >= 1.5.0. We suggest users to use python 3.8 and pytorch 1.9.0 with cuda 11.1. The evaluation system of ManiSkill challenge uses python 3.8.10. To create the corresponding Anaconda environment, run conda create -n "myenv" python=3.8.10.

To get started, enter the parent directory of where you installed ManiSkill (mani_skill) and clone this repo, then install with pip under the same environment where mani_skill is installed.

cd {parent_directory_of_mani_skill}
conda activate mani_skill #(activate the anaconda env or virtual env where mani_skill is installed)
git clone https://github.com/haosulab/ManiSkill-Learn.git
cd ManiSkill-Learn/
pip install -e .

Due to variations in CUDA versions, we did not include pytorch or torchvision in requirements.txt. Therefore, please install torch and torchvision before using this package. A version reference can be found in this url.

Simple Workflow

Download Data - Quick Example

Enter this repo, then download example demonstration data from Google Drive and store the data under ./example_mani_skill_data.

Evaluation on Simple Pretrained Models

This section provides a simple example of our evaluation pipeline. To evaluate the example pre-trained models, please check the scripts in scripts/simple_mani_skill_example/eval_bc_example.sh. For example, you can run

python -m tools.run_rl configs/bc/mani_skill_point_cloud_transformer.py \
--gpu-ids=0 --cfg-options "env_cfg.env_name=OpenCabinetDrawer_1045_link_0-v0" \
"eval_cfg.save_video=True" "eval_cfg.num=10" "eval_cfg.use_log=True" \
--work-dir=./test/OpenCabinetDrawer_1045_link_0-v0_pcd \
--resume-from=./example_mani_skill_data/OpenCabinetDrawer_1045_link_0-v0_PN_Transformer.ckpt --evaluation

to test the performance of PointNet + Transformer on environment OpenCabinetDrawer_1045_link_0-v0 in ManiSkill.

Train a Simple Agent with Behavior Cloning (BC).

This section provides a simple example of our training pipeline. To train with example demonstration data, please check the scripts in scripts/simple_mani_skill_example/. For example, you can run

python -m tools.run_rl configs/bc/mani_skill_point_cloud_transformer.py \
--gpu-ids=0 --cfg-options "env_cfg.env_name=OpenCabinetDrawer_1045_link_0-v0" \
--work-dir=./work_dirs/OpenCabinetDrawer_1045_link_0-v0 --clean-up

to train PointNet + Transformer with demonstration data generated on OpenCabinetDrawer_1045_link_0-v0 in ManiSkill.

There can be some noise between different runs (which is very common in RL), so you might want to train multiple times and choose the best model.

Download Full Demonstration Data and Pretrained Models

The simple examples above only consider training and evaluation on one single object instance, but our challenge aims at training on many object instances and generalizing the learned manipulation skills to novel object instances.

All the data used in this repository is stored on Google Drive.

You can download the full point cloud demonstration dataset (which is stored here) along with models (which is stored here) pre-trained on all of our training object instances. The demonstration data and pre-trained models need to be stored under ./full_mani_skill_data to run the provided scripts in scripts/full_mani_skill_example.

If you want to render point cloud demonstrations from state demonstrations by yourself, you can download the demonstration data with environment state here and put them in ./full_mani_skill_state_data/. Please read Demonstrations and Generating Custom Point Cloud Demonstrations.

You can run python tools/check_md5.py to check if you download all the files correctly.

Download Data from ScienceDB.cn

If you cannot access Google easily, you can download data from here.

Full Training and Evaluation Pipeline

After you download the full data, you are ready to start the ManiSkill challenge. For the full training and evaluation pipeline, see Workflow.

Submit a Model

After you train a full model, you can submit it to our ManiSkill challenge. Please see Challenge Submission in Workflow for more details.

Demonstrations

To download our demonstration dataset, please see Download Full Demonstration Data and Pretrained Models.

This section introduces important details about our point cloud demonstration data. The format of our demonstrations is explained in Demonstrations Format. The provided point cloud demonstrations are downsampled and processed using an existing processing function, which is explained in more detail in Observation Processing. If you want to generate point cloud demonstrations using custom post-processing functions, please refer to Generating Custom Point Cloud Demonstrations.

We did not generate RGB-D demonstration since downsampling an RGB-D image can easily lose important information, while downsampling a point cloud is a lot easier. If you want to, please see Generating RGB-D Demonstrations.

Demonstrations Format

The demonstrations are stored in HDF5 format. The point cloud demonstrations have the following structure:

>>> from h5py import File
>>> f = File('./example_mani_skill_data/OpenCabinetDrawer_1045_link_0-v0_pcd.h5', 'r')
# f is a h5py.Group with keys traj_0 ... traj_n
>>> f['traj_0'].keys()
<KeysViewHDF5 ['actions', 'dones', 'next_obs', 'obs', 'rewards']>
# Let the length of the trajectory 0 be l,
# then f['traj_0']['actions'] is h5py.Dataset with shape == [l, action_dim]
# f['traj_0']['dones'] is a h5py.Dataset with shape == (l,), and the last element is True
# f['traj_0']['rewards'] is a h5py.Dataset with shape == (l,)
# f['traj_0']['obs'] and f['traj_0']['next_obs'] are h5py.Group,
both have the following structure:
  {
  state: h5py.Dataset, shape (l, state_shape); agent's state, including pose, velocity, angular velocity of the moving platform of the robot, joint angles and joint velocities of all robot joints, positions and velocities of end-effectors
  pointcloud: h5py.Group
    {
    xyz: h5py.Dataset, shape (l, n_points, 3); position for each point, recorded in world frame
    rgb: h5py.Dataset, shape (l, n_points, 3); RGB values for each point
    seg: h5py.Dataset, shape (l, n_points, n_seg); some task-relevant segmentaion masks, e.g. handle of a cabinet door
    }
  }
# For the description of segmentation masks (`seg`) for each task, 
please refer to the ManiSkill repo.

You may notice that the structure of obs and next_obs is different from the raw observation structure of the ManiSkill environments. The raw observation structure of ManiSkill environments can be obtained from running the script below:

>>> import mani_skill.env, gym
>>> env=gym.make('OpenCabinetDoor-v0')
>>> env.seg_env_mode(obs_mode='pointcloud')
>>> obs=env.reset()
{
  'agent': ... , # a vector that describes agent's state, including pose, velocity, angular velocity of the moving platform of the robot, joint angles and joint velocities of all robot joints, positions and velocities of end-effectors
  'pointcloud': {
    'rgb': ... , # (N, 3) array, RGB values for each point
    'xyz': ... , # (N, 3) array, position for each point, recorded in world frame
    'seg': ... , # (N, k) array, some task-relevant segmentaion masks, e.g. handle of a cabinet door
  }
}

Our demonstration data has the following differences. The differences are due to the observation post-processing function in mani_skill_learn/env/observation_process.py called by the Gym environment wrapper (mani_skill_learn/env/wrappers.py), which is explained in more detail in Observation Processing.

  • The demonstrations are downsampled and processed from the raw point cloud when utilizing the wrapper (see Observation Processing). Therefore, our demonstration has much fewer points (l=1200) than the raw point cloud (N=160*400*3).
  • The state key in our demonstration corresponds to the agent key in the raw observation of the ManiSkill environments. An important thing to keep in mind is that the vector only represents the robot state of the agent and does not contain ground truth information of the environment (i.e. does not contain information such as object location, distance from the robot to the object, etc, and such information should be inferred from the point cloud)

Even though our demonstration format differs from the raw observation structure of ManiSkill environments, since our environments are wrapped by the SapienRLWrapper class in mani_skill_learn/env/wrappers.py, and we always utilize the wrapper to call the observation processing function whenever we advance a step in the environment and obtain the corresponding observations (i.e. observation, reward, done, info = env.step(action)) during model inference, we do NOT need to do anything special in our repo.

Observation Processing

For demonstration generation, online data collection, and model inference, the observation processing is done in the environment wrappers in mani_skill_learn/env/wrapper.py. For ManiSkill environments, the wrapper calls process_sapien_rl_base in mani_skill_learn/env/observation_process.py to downsample and process the raw point cloud. In our implementation, we selectively downsample from 160*400*3 points to 1200 points. Specifically, for each dimension of the segmentation mask, we first sample 50 points where the mask is True (if there are fewer than 50 points, then we keep all the points). We then randomly sample from the rest of points where at least one of the segmentation masks is true, such that we obtain a total of 800 points (if there are fewer than 800, then we keep all of them). Finally, we randomly sample from the points where none of the segmentation masks are true and where the points are not on the ground (i.e. have positive $z$-coordinate value), such that we obtain a total of 1200 points.

Point clouds in the demonstration data we provided have already been processed by process_sapien_rl_base our repo. If you want to write custom processing function and you want to use any demonstration data during training, you need to first re-render the demonstrations. See Generating Custom Point Cloud Demonstrations.

Generating Custom Point Cloud Demonstrations

To generate custom point cloud demonstrations using custom post-processing functions (by replacing process_sapien_rl_base, see Observation Processing), we have provided demonstrations containing all the necessary information to precisely reconstruct the environment at each time step. You should have downloaded the files in ./full_mani_skill_state_data/

The demonstration has the following format:

>>> from h5py import File
>>> f = File(path, 'r')
>>> f.keys()
dict_keys(['actions', 'dones', 'env_levels', 'env_scene_states', 
'env_states', 'episode_dones', ... , 'next_env_scene_states', 
'next_env_states', 'next_obs', 'obs', 'rewards'])

The API for rendering and converting the above demonstration to point cloud data is provided in tools/convert_state.py.

To run convert_state.py, example script is provided in scripts/simple_mani_skill_example/convert_state_to_pcd.sh. The script generates point cloud for only one of the environments, so you need to repeat so for all environments (see available_environments.txt in the ManiSkill repo).

Chunking Demonstration Dataset

The sum of size of generated point cloud demonstrations for all environments of a task is larger than 10G. After loading and converting into internal dataset, the total consumed memory during agent training will be 60 - 120G. If your computer does not have enough memory, you can use a chunked dataset. Please refer to Demonstration Loading and Replay Buffer for more details.

Generating RGB-D Demonstrations

We did not provide pre-generated RGB-D demonstrations because, unlike point cloud demonstrations, they cannot be easily downsampled without losing important information, which means they have a much larger size and would be in the scale of terabytes (300 trajs/env * 170 training envs * about 30 steps per traj * 160 * 400 * 3 * 4 * 4bytes/float = 4.7TB). If you would like to train models using RGB-D demonstrations, you could use tools/convert_state.py by passing --obs_mode=rgbd to generate the demonstrations. In addition, you need to also implement custom network architectures that process RGB-D images (see "Network Architectures" below).

Workflow

Training

We have provided example training scripts in scripts/full_mani_skill_example/. If you want to directly evaluate a pretrained model (such as those we provided), see Evaluation.

To train an agent, run tools/run_rl.py with appropriate arguments. Among the arguments, config requires a user to specify a path to the config file. cfg-options overrides some configs in an existing config file. The config file specifies information regarding the algorithm, the hyperparameters, the network architectures, the training process, and the evaluation process. For more details about the config, see Detailed Functionalities and Config Settings. Example config files for learning-from-demonstrations algorithms are in configs/.

There can be some noise between different runs (which is very common in RL), so you might want to train multiple times and choose the best model.

If you are interested in designing any custom functionalities, such as writing custom Reinforcement Learning algorithms / training pipelines or designing custom network architectures, please also see Detailed Functionalities and Config Settings for specific information.

If you are interested in designing custom computer vision architectures which map point cloud input to action output, please also pay attention to Demonstrations Format, Observation Processing, and Network Architectures.

Evaluation

Evaluating an agent can be done during training or using an existing model.

To evaluate an agent during training, run tools/run_rl.py with appropriate configs (see the Evaluation config settings below).

To evaluate an agent solely through an existing model, set appropriate eval_cfg and run tools/run_rl.py with --evaluation and --resume-from options.

Example evaluation scripts using existing pretrained models can be found in scripts/full_mani_skill_example/.

Challenge Submission

To create a submission for the ManiSkill challenge, please read the instructions in mani-skill-submission-example first.

If you want to include functionalities in our ManiSkill-Learn repo for your submission, example user_solution.py and environment.yml can be found in the submission_example directory. In this case, please first move these files directly under the repo (i.e. move them to {this_repo}/user_solution.py and {this_repo}/environment.yaml). Also please ensure that the correct versions of torch and torchvision are in environment.yml.

Before submitting to our server, you can utilize the ManiSkill (local) evaluation tool to ensure that things are set up correctly (please refer to "Evaluation" section in the ManiSkill repo).

Detailed Functionalities and Config Settings

This section introduces different modules and functionalities in ManiSkill-Learn, along with how the config settings are parsed in our scripts. This section is especially useful if you want to implement any new functionalities or make changes to the default config settings we have given.

General Config Settings and Config in Command Line

At a high level, the configs for learning-from-demonstrations algorithms are in configs/. The configs/_base_/ subdirectory contains configs that can be reused across different environments, such as network architectures and evaluation configs. To train or evaluate an agent, run tools/run_rl.py with appropriate arguments and path to the config file (--config=path_to_config_file).

The config files are processed as iterated dictionaries when running tools/run_rl.py, with rules as follows: when a config file is imported as _base_ config in another config file, in case there is a conflict in the value of the same dictionary key in the two files, the value in the latter file overrides the value in the former file. If the latter file contains dictionary keys which the _base_ config doesn't have, then the keys are merged together.

One can also specify cfg-options in the command lines when running tools/run_rl.py. In this case, the configs passed to the command line override the existing configs in the config files.

General Training / Evaluation Process Management

During training, tools/run_rl.py calls mani_skill_learn/apis/train_rl.py to manage the training process, evaluation during training, and result logging. mani_skill_learn/apis/train_rl.py calls the implementations in mani_skill_learn/methods/ for algorithm-specific model forward functions and parameter update functions. These functions then invoke files under mani_skill_learn/networks/policy_network/ for forwarding policy networks and mani_skill_learn/networks/value_network/ for forwarding value networks (if the algorithm has them). When evaluating during training, mani_skill_learn/apis/train_rl.py calls the evaluator in mani_skill_learn/env/evaluation.py.

During evaluating pre-trained models, tools/run_rl.py directly calls the evaluator in mani_skill_learn/env/evaluation.py.

Learning-from-Demonstrations (LfD) Algorithms

We implemented several learning-from-demonstration algorithms in mani_skill_learn/methods/: BC (Behavior Cloning), BCQ (Batch-Constrained Q-Learning), CQL (Conservative Q-Learning), and TD3+BC.

Besides learning-from-demonstrations algorithms, we also provide implementations of online model-free agents such as SAC and TD3 (which don't use any demonstration data) in mani_skill_learn/methods/mfrl/.

The algorithm hyperparameters, along with the policy and value network architectures, are specified in the agent entry in the config files. The train_mfrl_cfg entry specifies training parameters. Specifically,

  • total_steps is the total number of training gradient steps.
  • For replay buffer related configs, i.e. init_replay_buffers and init_replay_with_split, please see Demonstration Loading and Replay Buffer. The reason we call the data structure for storing loaded demonstration data "replay buffer" is that we can share the LfD algorithm interface with the online algorithm interface; in pure imitation learning / offline RL settings, since we are not collecting online data and adding it to the replay buffer, when we load the demonstrations, the replay buffer becomes the entire dataset.
  • n_steps is set to 0 for pure imitation learning / offline settings. If one wants to combine offline algorithms with online data collection, then one needs to set some nonzero n_steps similar to the configs in configs/mfrl/. In addition, rollout_cfg needs to be set, agent.policy needs to be implemented explicitly for algorithms, and you might also want to change the implementation of replay buffer (since the original demonstrations might be overwritten when the number of samples reaches the buffer capacity). However, keep in mind that online data collection is expensive in point cloud setting.
  • n_updates refers to the number of agent updates for every sampled batch.
  • num_trajs_per_demo_file refers to the number of trajectories to load per demo file,

Network Architectures

Our network architectures are implemented in mani_skill_learn/networks. The mani_skill_learn/networks/backbones directory contains point cloud-based architectures such as PointNet and PointNet + Transformer. These specific architectures are built from configs during the policy and value network building processes.

We did not implement RGB-D based architectures because the RGB-D demonstrations would be a lot larger than the point cloud demonstrations (since downsampling an RGB-D image can easily lose important information, while downsampling a point cloud is a lot easier). Thus you need to implement custom image-based architectures if you want to train a model using RGB-D demonstrations.

Architecture-specific configurations are specified in the agent/{policy or value network}/nn_cfg entries of the config files. Note that some algorithms have multiple policy or value networks.

During model training, the inputs to the model (sampled from the replay buffer, see init_replay_buffers) have the following format (which is different from the raw observation structure in ManiSkill environments, recall Demonstrations Format):

let b = batch_size
dict(
  actions: (b, action_dim)
  dones: (b,)
  rewards: (b,)
  obs: dict(
    state: (b, state_shape)
    pointcloud: dict(
      xyz: (b, n_points, 3)
      rgb: (b, n_points, 3)
      seg: (b, n_points, n_seg)
    )
  )
  next_obs has the same format as obs 
)

During model inference, the inputs to the model have the following format (since we are not training the model, we don't have information such as next_obs):

let b = batch_size
dict(
  obs: dict(
    state: (b, state_shape)
    pointcloud: dict(
      xyz: (b, n_points, 3)
      rgb: (b, n_points, 3)
      seg: (b, n_points, n_seg)
    )
  )
  actions: (b, action_dim)
)

Environments

Source files for the environment utilities are stored in mani_skill_learn/env/.

Configs

In the config file, environment-related configurations can be set similar to the format below:

env_cfg = dict(
  type='gym',
  unwrapped=False,
  stack_frame=1,
  obs_mode='pointcloud',
  reward_type='dense',
  env_name={env_name},
)

Here stack_frame refers to the number of observation frames to stack (similar to Atari, except that we are now working in a point cloud). env_name refers to the environment name. Other arguments are typically unchanged.

Getting Environment Info

The environment info can be obtained through the script below. Note that the environment is wrapped and the point cloud observation has been post-processed.

>>> from mani_skill_learn.env import get_env_info
>>> env = gym.make(env_name)
>>> env.set_env_mode(obs_mode='pointcloud')
>>> obs_shape, action_shape, action_space = \
get_env_info({'env_name': env_name, 'obs_mode': 'pointcloud', 'type': 'gym'})
>>> obs_shape
{'pointcloud': {'rgb': (1200, 3), 'xyz': (1200, 3), 'seg': (1200, 3)}, 'state': some_int1}
>>> action_shape
some_int2
>>> action_space
Box(-1.0, 1.0, (some_int2,), float32)

Demonstration Loading and Replay Buffer

We have provided functionalities on loading demonstrations into the replay buffer in mani_skill_learn/env/replay_buffer.py. During agent training, the replay buffer is loaded through buffer.restore in mani_skill_learn/apis/train_rl.py. The raw utilities on HDF5 file operations are in mani_skill_learn/utils/fileio/h5_utils.py.

Config for the replay buffer is set in both train_mfrl_cfg and replay_cfg entries. In train_mfrl_cfg, init_replay_buffer refers to the path to a single demonstration data / a list of demonstration data/directory to the chunked datasets (see the paragraphs below for explanations) to be loaded into the replay buffer. init_replay_with_split takes a list of exactly two strings, where the first string is the directory of the demonstration dataset, and the second string is the path to the YAML file containing a list of training door/drawer/chair/bucket environment ids (refer to scripts/full_mani_skill_example/ for examples). This argument loads all demonstration files that correspond to the environment indices in the second file, under the directory specified by the first file. One should only use one of init_replay_buffer or init_replay_with_split arguments. You may set init_replay_buffer='' to ignore it.

For replay_cfg, the capacity config refers to the limit on the number of (observation, action, reward, next_observation, info) samples. The type config specifies the type of replay buffer (ReplayMemory if loading all demonstration data into the replay buffer at once and/or doing any online sampling; ReplayDisk if loading only a chunk of demonstration data as we go in order to save memory, in this case, we need to set a smaller capacity that is divisible by the batch size)

If your computer does not have enough memory to load all demonstrations at once, you can generate a chunked dataset by using tools/split_datasets.py. The demonstrations from different environments will be randomly shuffled and stored into several files under the specified folder. To load a chunked dataset for agent training, you need let replay_cfg.type='ReplayDisk' and train_mfrl_cfg.init_replay_buffers='the folder that stores the chunked dataset'. To find more details, you can check out configs/bc/mani_skill_point_cloud_transformer_disk.py. Example scripts are in scripts/simple_mani_skill_example/run_with_chunked_dataset.sh and scripts/full_mani_skill_example/run_with_chunked_dataset.sh.

Evaluation

Most of evaluation configs are specified in the eval_cfg entry in the config files. Inside the entry, the num parameter specifies the number of trajectories to evaluate per environment. The num_proc parameter controls the parallelization.

In addition, the n_eval parameter in the train_mfrl_cfg entry specifies the number of gradient steps between two evaluations during agent training (if n_eval = None then no evaluation is performed during training). Use save_video=True if you want to see the video generated by the trained agent. It may slow down the evaluation speed.

Acknowledgements

Some functions (e.g. config system, checkpoint) are adopted from MMCV:

Citation

@article{mu2021maniskill,
  title={ManiSkill: Learning-from-Demonstrations Benchmark for Generalizable Manipulation Skills},
  author={Mu, Tongzhou and Ling, Zhan and Xiang, Fanbo and Yang, Derek and Li, Xuanlin, and Tao, Stone and Huang, Zhiao and Jia, Zhiwei and Su, Hao},
  journal={arXiv preprint arXiv:2107.14483},
  year={2021}
}

License

ManiSkill-Learn is released under the Apache 2.0 license, while some specific operations in this library are with other licenses.

Owner
Hao Su's Lab, UCSD
Hao Su's Lab, UCSD
Eth brownie struct encoding example

eth-brownie struct encoding example Overview This repository contains an example of encoding a struct, so that it can be used in a function call, usin

Ittai Svidler 2 Mar 04, 2022
Fader Networks: Manipulating Images by Sliding Attributes - NIPS 2017

FaderNetworks PyTorch implementation of Fader Networks (NIPS 2017). Fader Networks can generate different realistic versions of images by modifying at

Facebook Research 753 Dec 23, 2022
Kroomsa: A search engine for the curious

Kroomsa A search engine for the curious. It is a search algorithm designed to en

Wingify 7 Jun 20, 2022
Source code for the Paper: CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints}

CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints Installation Run pipenv install (at your own risk with --skip-lo

Autonomous Learning Group 65 Dec 27, 2022
Summary Explorer is a tool to visually explore the state-of-the-art in text summarization.

Summary Explorer Summary Explorer is a tool to visually inspect the summaries from several state-of-the-art neural summarization models across multipl

Webis 42 Aug 14, 2022
This project provides a stock market environment using OpenGym with Deep Q-learning and Policy Gradient.

Stock Trading Market OpenAI Gym Environment with Deep Reinforcement Learning using Keras Overview This project provides a general environment for stoc

Kim, Ki Hyun 769 Dec 25, 2022
This is the official implementation of 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object Detection, built on SECOND.

3D-CVF This is the official implementation of 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object

YecheolKim 97 Dec 20, 2022
Prototype for Baby Action Detection and Classification

Baby Action Detection Table of Contents About Install Run Predictions Demo About An attempt to harness the power of Deep Learning to come up with a so

Shreyas K 30 Dec 16, 2022
Negative Interactions for Improved Collaborative Filtering:

Negative Interactions for Improved Collaborative Filtering: Don’t go Deeper, go Higher This notebook provides an implementation in Python 3 of the alg

Harald Steck 21 Mar 05, 2022
[ICCV 2021] FaPN: Feature-aligned Pyramid Network for Dense Image Prediction

FaPN: Feature-aligned Pyramid Network for Dense Image Prediction [arXiv] [Project Page] @inproceedings{ huang2021fapn, title={{FaPN}: Feature-alig

EMI-Group 175 Dec 30, 2022
Official PyTorch implementation of "Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble" (NeurIPS'21)

Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble This is the code for reproducing the results of the paper Uncertainty-Bas

43 Nov 23, 2022
DECAF: Deep Extreme Classification with Label Features

DECAF DECAF: Deep Extreme Classification with Label Features @InProceedings{Mittal21, author = "Mittal, A. and Dahiya, K. and Agrawal, S. and Sain

46 Nov 06, 2022
Python TFLite scripts for detecting objects of any class in an image without knowing their label.

Python TFLite scripts for detecting objects of any class in an image without knowing their label.

Ibai Gorordo 42 Oct 07, 2022
Implement the Pareto Optimizer and pcgrad to make a self-adaptive loss for multi-task

multi-task_losses_optimizer Implement the Pareto Optimizer and pcgrad to make a self-adaptive loss for multi-task 已经实验过了,不会有cuda out of memory情况 ##Par

14 Dec 25, 2022
unofficial pytorch implement of "Squareplus: A Softplus-Like Algebraic Rectifier"

SquarePlus (Pytorch implement) unofficial pytorch implement of "Squareplus: A Softplus-Like Algebraic Rectifier" SquarePlus Squareplus is a Softplus-L

SeeFun 3 Dec 29, 2021
This repository provides data for the VAW dataset as described in the CVPR 2021 paper titled "Learning to Predict Visual Attributes in the Wild"

Visual Attributes in the Wild (VAW) This repository provides data for the VAW dataset as described in the CVPR 2021 Paper: Learning to Predict Visual

Adobe Research 36 Dec 30, 2022
Simple sinc interpolation in PyTorch.

Kazane: simple sinc interpolation for 1D signal in PyTorch Kazane utilize FFT based convolution to provide fast sinc interpolation for 1D signal when

Chin-Yun Yu 10 May 03, 2022
Sinkformers: Transformers with Doubly Stochastic Attention

Code for the paper : "Sinkformers: Transformers with Doubly Stochastic Attention" Paper You will find our paper here. Compat This package has been dev

Michael E. Sander 31 Dec 29, 2022
Multivariate Time Series Forecasting with efficient Transformers. Code for the paper "Long-Range Transformers for Dynamic Spatiotemporal Forecasting."

Spacetimeformer Multivariate Forecasting This repository contains the code for the paper, "Long-Range Transformers for Dynamic Spatiotemporal Forecast

QData 440 Jan 02, 2023
Official Pytorch implementation for video neural representation (NeRV)

NeRV: Neural Representations for Videos (NeurIPS 2021) Project Page | Paper | UVG Data Hao Chen, Bo He, Hanyu Wang, Yixuan Ren, Ser-Nam Lim, Abhinav S

hao 214 Dec 28, 2022