SAAVN - Sound Adversarial Audio-Visual Navigation,ICLR2022 (In PyTorch)

Last update: Aug 30, 2022

Related tags

Deep Learning SAAVN

Overview

SAAVN

SAAVN Code release for paper "Sound Adversarial Audio-Visual Navigation,ICLR2022" (In PyTorch)

These code are under cleaning! Some of bugs maybe happen, please tell me if you have any trouble.

Thanks

These codes are based on the SoundSpaces code base.

Usage

This repo supports AudioGoal Task on Replica and Matterport3D datasets.

Below we show the commands for training and evaluating AudioGoal with Depth sensor on Replica, but it applies to Matterport dataset as well.

Training

python main.py --default av_nav --run-type train --exp-config [exp_config_file] --model-dir data/models/replica/av_nav/e0000/audiogoal_depth --tag-config [tag_config_file] TORCH_GPU_ID 0 SIMULATOR_GPU_ID 0

Validation (evaluate each checkpoint and generate a validation curve)

python main.py --default av_nav --run-type eval --exp-config [exp_config_file] --model-dir data/models/replica/av_nav/e0000/audiogoal_depth --tag-config [tag_config_file] TORCH_GPU_ID 0 SIMULATOR_GPU_ID 0

Test the best validation checkpoint based on validation curve

python main.py --default av_nav --run-type eval --exp-config [exp_config_file] --model-dir data/models/replica/av_nav/e0000/audiogoal_depth --tag-config [tag_config_file] TORCH_GPU_ID 0 SIMULATOR_GPU_ID 0

Generate demo video with audio

python main.py --default av_nav --run-type eval --exp-config [exp_config_file] --model-dir data/models/replica/av_nav/e0000/audiogoal_depth --tag-config [tag_config_file] TORCH_GPU_ID 0 SIMULATOR_GPU_ID 0

Note: [exp_config_file] is the main parameter configuration file of the experiment, while [tag_config_file] is special parameter configuration file for abalation experiments.

Citation

If you use this model in your research, please cite the following paper:

@inproceedings{YinfengICLR2022saavn,
	title = {Sound Adversarial Audio-Visual Navigation},
	author = {Yinfeng Yu, Wenbing Huang, Fuchun Sun, Changan Chen, Yikai Wang, Xiaohong Liu},
	year = {2022},
        booktitle={ICLR},
}

SAAVN - Sound Adversarial Audio-Visual Navigation,ICLR2022 (In PyTorch)

Related tags

Overview

SAAVN

SAAVN Code release for paper "Sound Adversarial Audio-Visual Navigation,ICLR2022" (In PyTorch)

These code are under cleaning! Some of bugs maybe happen, please tell me if you have any trouble.

Thanks

Usage

Citation

Owner

YinfengYu

Codes and scripts for "Explainable Semantic Space by Grounding Languageto Vision with Cross-Modal Contrastive Learning"

Train Yolov4 using NBX-Jobs

Code for "Training Neural Networks with Fixed Sparse Masks" (NeurIPS 2021).

End-to-end machine learning project for rices detection

Yolov5 deepsort inference，使用YOLOv5+Deepsort实现车辆行人追踪和计数，代码封装成一个Detector类，更容易嵌入到自己的项目中

Official implementation of Few-Shot and Continual Learning with Attentive Independent Mechanisms

Example scripts for the detection of lanes using the ultra fast lane detection model in ONNX.

Learned Token Pruning for Transformers

On Nonlinear Latent Transformations for GAN-based Image Editing - PyTorch implementation

Copy Paste positive polyp using poisson image blending for medical image segmentation

[Link]mareteutral - pars tradg wth M []

Si Adek Keras is software VR dangerous object detection.

This is 2nd term discrete maths project done by UCU students that uses backtracking to solve various problems.

PyTorch implementation for paper StARformer: Transformer with State-Action-Reward Representations.

Readings for "A Unified View of Relational Deep Learning for Polypharmacy Side Effect, Combination Therapy, and Drug-Drug Interaction Prediction."

A convolutional recurrent neural network for classifying A/B phases in EEG signals recorded for sleep analysis.

Decision Transformer: A brand new Offline RL Pattern

Code for "Unsupervised Source Separation via Bayesian inference in the latent domain"

Augmented CLIP - Training simple models to predict CLIP image embeddings from text embeddings, and vice versa.

Azua - build AI algorithms to aid efficient decision-making with minimum data requirements.