Code for "AutoMTL: A Programming Framework for Automated Multi-Task Learning"

AutoMTL: A Programming Framework for Automated Multi-Task Learning

This is the website for our paper "AutoMTL: A Programming Framework for Automated Multi-Task Learning", submitted to MLSys 2022. The arXiv version will be public on Tue, 26 Oct 2021.

Abstract

Multi-task learning (MTL) jointly learns a set of tasks. It is a promising approach to reduce the training and inference time and storage costs while improving prediction accuracy and generalization performance for many computer vision tasks. However, a major barrier preventing the widespread adoption of MTL is the lack of systematic support for developing compact multi-task models given a set of tasks. In this paper, we aim to remove the barrier by developing the first programming framework, AutoMTL, that automates MTL model development. AutoMTL takes as inputs an arbitrary backbone convolutional neural network and a set of tasks to learn, then automatically produces a multi-task model that achieves high accuracy and has a small memory footprint simultaneously. As a programming framework, AutoMTL could facilitate the development of MTL-enabled computer vision applications and even further improve task performance.

Cite

You are welcome to cite our work if you find it helpful to your research. [TODO: cite info]

Description

Environment

conda install pytorch==1.6.0 torchvision==0.7.0 -c pytorch # Or higher
conda install protobuf
pip install opencv-python
pip install scikit-learn
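
Before running the code, it can help to confirm that the installed packages meet the minimum versions listed above. The following is a minimal sketch using only the Python standard library (Python 3.8 or newer for importlib.metadata). The helper names parse_version, meets_minimum, and check_environment are hypothetical and not part of AutoMTL, and the parsing only handles simple dotted numeric versions, ignoring a local suffix such as +cu101.

```python
from importlib import metadata

# Minimum versions taken from the install commands above.
MINIMUM_VERSIONS = {"torch": "1.6.0", "torchvision": "0.7.0"}


def parse_version(text):
    """Turn '1.6.0' or '1.7.1+cu101' into a tuple of ints such as (1, 6, 0)."""
    core = text.split("+")[0]  # drop a local build suffix if present
    parts = []
    for piece in core.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, required):
    """Compare two version strings after padding them to the same length."""
    a, b = parse_version(installed), parse_version(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUM_VERSIONS):
    """Return 'ok', 'too old', or 'missing' for each required package."""
    report = {}
    for package, required in minimums.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = "missing"
            continue
        report[package] = "ok" if meets_minimum(installed, required) else "too old"
    return report


if __name__ == "__main__":
    print(check_environment())
```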

Datasets

We conducted experiments on three popular datasets in multi-task learning (MTL): CityScapes [1], NYUv2 [2], and Tiny-Taskonomy [3]. You can download them here. For Tiny-Taskonomy, you will need to contact the authors directly; see their official website.

File Structure

├── data
│   ├── dataloader
│   │   ├── *_dataloader.py
│   ├── heads
│   │   ├── pixel2pixel.py
│   ├── metrics
│   │   ├── pixel2pixel_loss/metrics.py
├── framework
│   ├── layer_containers.py
│   ├── base_node.py
│   ├── layer_node.py
│   ├── mtl_model.py
│   ├── trainer.py
├── models
│   ├── *.prototxt
└── utils
    └── pytorch_to_caffe.py

Code Description

Our code can be divided into three parts: code for data, code for AutoMTL, and others.

  • For Data

    • Dataloaders *_dataloader.py: For each dataset, we offer a corresponding PyTorch dataloader with a specific task variable.
    • Heads pixel2pixel.py: The ASPP head [4] is implemented for the pixel-to-pixel vision tasks.
    • Metrics pixel2pixel_loss/metrics.py: Each task has its own criterion and metric.
  • AutoMTL

    • Multi-Task Model Generator mtl_model.py: Transforms the given backbone model, provided as a prototxt file, together with the task-specific model head dictionary into a multi-task supermodel.
    • Trainer Tools trainer.py: Implements a three-stage training pipeline that searches for a good multi-task model for the given tasks.
  • Others

    • Input Backbone *.prototxt: Typical vision backbone models including Deeplab-ResNet34 [4], MobileNetV2, and MNasNet.
    • Transfer to Prototxt pytorch_to_caffe.py: If you define your own customized backbone model with the PyTorch API, we also provide a tool to convert it to a prototxt file.

How to Use

Set up Data

Each task will have its own dataloader for both training and validation, task-specific criterion (loss), evaluation metric, and model head. Here we take CityScapes as an example.

tasks = ['segment_semantic', 'depth_zbuffer']
task_cls_num = {'segment_semantic': 19, 'depth_zbuffer': 1} # the number of classes in each task

You can also define your own dataloader, criterion, and evaluation metrics. Please refer to the files in data/ to make sure your customized classes have the same output format as ours so that they fit our framework.

dataloader dictionary

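from torch.utils.data import DataLoader

# CityScapes is the dataset class provided under data/dataloader/
trainDataloaderDict = {}
valDataloaderDict = {}
for task in tasks:
    # training split, cropped to 224x224
    dataset = CityScapes(dataroot, 'train', task, crop_h=224, crop_w=224)
    trainDataloaderDict[task] = DataLoader(dataset, <batch_size>, shuffle=True)

    # validation uses the 'test' split, with no crop arguments
    dataset = CityScapes(dataroot, 'test', task)
    valDataloaderDict[task] = DataLoader(dataset, <batch_size>, shuffle=True)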

criterion dictionary

criterionDict = {}
for task in tasks:
    criterionDict[task] = CityScapesCriterions(task)

evaluation metric dictionary

metricDict = {}
for task in tasks:
    metricDict[task] = CityScapesMetrics(task)

task-specific heads dictionary

import torch.nn as nn

headsDict = nn.ModuleDict() # must be nn.ModuleDict(), not a plain Python dictionary
for task in tasks:
    headsDict[task] = ASPPHeadNode(<feature_dim>, task_cls_num[task])
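
Because all four dictionaries are keyed by task name, a missing or misspelled key would only surface once training starts. The helper below is a minimal sketch of an up-front check. The function name check_task_coverage is hypothetical and not part of AutoMTL; it accepts any mapping with a keys() method, which covers plain dictionaries as well as nn.ModuleDict. The stand-in dictionaries exist only to keep the example self-contained.

```python
def check_task_coverage(tasks, **named_dicts):
    """Raise ValueError if any dictionary lacks a task or has an extra key."""
    expected = set(tasks)
    problems = []
    for name, mapping in named_dicts.items():
        keys = set(mapping.keys())
        missing = sorted(expected - keys)
        extra = sorted(keys - expected)
        if missing:
            problems.append(f"{name} is missing {missing}")
        if extra:
            problems.append(f"{name} has unexpected {extra}")
    if problems:
        raise ValueError("; ".join(problems))


# Illustrative stand-ins; in practice pass the real dictionaries built above.
tasks = ["segment_semantic", "depth_zbuffer"]
criterion_stub = {"segment_semantic": object(), "depth_zbuffer": object()}
metric_stub = {"segment_semantic": object(), "depth_zbuffer": object()}
check_task_coverage(tasks, criterionDict=criterion_stub, metricDict=metric_stub)
```

With the real objects, the call would look like check_task_coverage(tasks, trainDataloaderDict=trainDataloaderDict, valDataloaderDict=valDataloaderDict, criterionDict=criterionDict, metricDict=metricDict, headsDict=headsDict).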

Construct Multi-Task Supermodel

prototxt = 'models/deeplab_resnet34_adashare.prototxt' # can be any CNN model
mtlmodel = MTLModel(prototxt, headsDict)

3-stage Training

define the trainer

trainer = Trainer(mtlmodel, trainDataloaderDict, valDataloaderDict, criterionDict, metricDict)

pre-train phase

trainer.pre_train(iters=<total_iter>, lr=<init_lr>, savePath=<save_path>)

policy-train phase

# the weights for each task and the policy regularization term from the paper
loss_lambda = {'segment_semantic': 1, 'depth_zbuffer': 1, 'policy': 0.0005}
trainer.alter_train_with_reg(iters=<total_iter>, policy_network_iters=<alter_iters>, policy_lr=<policy_lr>, network_lr=<network_lr>, 
                             loss_lambda=loss_lambda, savePath=<save_path>)
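
One way to read the loss_lambda dictionary above is as the coefficients of a weighted sum: each task loss is scaled by its weight, and the policy regularization term is scaled by the 'policy' entry. The sketch below assumes exactly that plain weighted sum. The function combine_losses and the numeric loss values are hypothetical, and the trainer's internal computation may differ.

```python
def combine_losses(task_losses, policy_reg, loss_lambda):
    """Weighted sum of per-task losses plus the weighted policy regularizer."""
    total = 0.0
    for task, loss in task_losses.items():
        total += loss_lambda[task] * loss
    return total + loss_lambda["policy"] * policy_reg


loss_lambda = {"segment_semantic": 1, "depth_zbuffer": 1, "policy": 0.0005}
# Made-up loss values, used only to exercise the function.
task_losses = {"segment_semantic": 0.5, "depth_zbuffer": 0.25}
total = combine_losses(task_losses, policy_reg=100.0, loss_lambda=loss_lambda)
print(total)
```

Since the function only uses multiplication and addition, the same code would also accept scalar tensors in place of floats.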

Note that when training the policy and the model weights together, we train them alternately, each for the number of iterations specified in policy_network_iters.
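
The alternating schedule can be pictured with a small generator. This is a sketch under two assumptions that the text does not confirm: that policy_network_iters is a pair (policy_iters, network_iters), and that the policy is trained first in each cycle. The function name alternating_schedule is hypothetical.

```python
def alternating_schedule(total_iters, policy_network_iters):
    """Yield 'policy' or 'network' for each iteration of the alternating phase."""
    policy_iters, network_iters = policy_network_iters
    if policy_iters <= 0 or network_iters <= 0:
        raise ValueError("both iteration counts must be positive")
    cycle = policy_iters + network_iters
    for step in range(total_iters):
        # The first policy_iters steps of each cycle update the policy,
        # the remaining steps update the network weights.
        yield "policy" if step % cycle < policy_iters else "network"


schedule = list(alternating_schedule(10, (2, 3)))
print(schedule)
```

With 10 iterations and the pair (2, 3), the schedule is two policy steps followed by three network steps, repeated twice.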

post-train phase

trainer.post_train(iters=<total_iter>, lr=<init_lr>,
                   loss_lambda=loss_lambda, savePath=<save_path>, reload=<policy_train_model_name>)

Note: Please refer to Example.ipynb for more details.
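
The order of the three stages can be wrapped in one function and dry-run without a GPU. The sketch below reuses the method and keyword names shown above, but run_three_stages, RecordingTrainer, the cfg keys, and every value in cfg (including the checkpoint file name) are hypothetical placeholders rather than recommended settings.

```python
def run_three_stages(trainer, cfg, loss_lambda):
    """Call the three training stages in the order described above."""
    trainer.pre_train(iters=cfg["pre_iters"], lr=cfg["pre_lr"],
                      savePath=cfg["save_path"])
    trainer.alter_train_with_reg(iters=cfg["policy_iters"],
                                 policy_network_iters=cfg["policy_network_iters"],
                                 policy_lr=cfg["policy_lr"],
                                 network_lr=cfg["network_lr"],
                                 loss_lambda=loss_lambda,
                                 savePath=cfg["save_path"])
    # The post-train stage reloads the model saved by the policy-train stage.
    trainer.post_train(iters=cfg["post_iters"], lr=cfg["post_lr"],
                       loss_lambda=loss_lambda, savePath=cfg["save_path"],
                       reload=cfg["policy_checkpoint"])


class RecordingTrainer:
    """Hypothetical stand-in that records stage calls instead of training."""

    def __init__(self):
        self.calls = []

    def pre_train(self, **kwargs):
        self.calls.append(("pre_train", kwargs))

    def alter_train_with_reg(self, **kwargs):
        self.calls.append(("alter_train_with_reg", kwargs))

    def post_train(self, **kwargs):
        self.calls.append(("post_train", kwargs))


# Every value below is an illustrative placeholder.
cfg = {
    "pre_iters": 100, "pre_lr": 0.001,
    "policy_iters": 200, "policy_network_iters": (10, 40),
    "policy_lr": 0.01, "network_lr": 0.0001,
    "post_iters": 300, "post_lr": 0.001,
    "save_path": "checkpoints/",
    "policy_checkpoint": "policy_stage.model",
}
loss_lambda = {"segment_semantic": 1, "depth_zbuffer": 1, "policy": 0.0005}

dry_run = RecordingTrainer()
run_three_stages(dry_run, cfg, loss_lambda)
print([name for name, _ in dry_run.calls])
```

Passing the real Trainer instance instead of RecordingTrainer would run the actual stages, provided the configuration values are replaced with ones suited to the dataset.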

References

[1] Cordts, Marius and Omran, Mohamed and Ramos, Sebastian and Rehfeld, Timo and Enzweiler, Markus and Benenson, Rodrigo and Franke, Uwe and Roth, Stefan and Schiele, Bernt. The Cityscapes dataset for semantic urban scene understanding. CVPR, 3213-3223, 2016.

[2] Silberman, Nathan and Hoiem, Derek and Kohli, Pushmeet and Fergus, Rob. Indoor segmentation and support inference from RGBD images. ECCV, 746-760, 2012.

[3] Zamir, Amir R and Sax, Alexander and Shen, William and Guibas, Leonidas J and Malik, Jitendra and Savarese, Silvio. Taskonomy: Disentangling task transfer learning. CVPR, 3712-3722, 2018.

[4] Chen, Liang-Chieh and Papandreou, George and Kokkinos, Iasonas and Murphy, Kevin and Yuille, Alan L. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. PAMI, 834-848, 2017.
