PixelPick This is an official implementation of the paper "All you need are a few pixels: semantic segmentation with PixelPick."

Overview

PixelPick

This is an official implementation of the paper "All you need are a few pixels: semantic segmentation with PixelPick."

[Project page] [Paper]

Table of contents

Abstract

A central challenge for the task of semantic segmentation is the prohibitive cost of obtaining dense pixel-level annotations to supervise model training. In this work, we show that in order to achieve a good level of segmentation performance, all you need are a few well-chosen pixel labels. We make the following contributions: (i) We investigate the novel semantic segmentation setting in which labels are supplied only at sparse pixel locations, and show that deep neural networks can use a handful of such labels to good effect; (ii) We demonstrate how to exploit this phenomena within an active learning framework, termed PixelPick, to radically reduce labelling cost, and propose an efficient “mouse-free” annotation strategy to implement our approach; (iii) We conduct extensive experiments to study the influence of annotation diversity under a fixed budget, model pretraining, model capacity and the sampling mechanism for picking pixels in this low annotation regime; (iv) We provide comparisons to the existing state of the art in semantic segmentation with active learning, and demonstrate comparable performance with up to two orders of magnitude fewer pixel annotations on the CamVid, Cityscapes and PASCAL VOC 2012 benchmarks; (v) Finally, we evaluate the efficiency of our annotation pipeline and its sensitivity to annotator error to demonstrate its practicality. Our code, models and annotation tool will be made publicly available.

Installation

Prerequisites

Our code is based on Python 3.8 and uses the following Python packages.

torch>=1.8.1
torchvision>=0.9.1
tqdm>=4.59.0
cv2>=4.5.1.48
Clone this repository
git clone https://github.com/NoelShin/PixelPick.git
cd PixelPick
Download dataset

Follow one of the instructions below to download a dataset you are interest in. Then, set the dir_dataset variable in args.py to the directory path which contains the downloaded dataset.

  • For CamVid, you need to download SegNet-Tutorial codebase as a zip file and use CamVid directory which contains images/annotations for training and test after unzipping it. You don't need to change the directory structure. [CamVid]

  • For Cityscapes, first visit the link and login to download. Once downloaded, you need to unzip it. You don't need to change the directory structure. It is worth noting that, if you set downsample variable in args.py (4 by default), it will first downsample train and val images of Cityscapes and store them within {dir_dataset}_d{downsample} folder which will be located in the same directory of dir_dataset. This is to enable a faster dataloading during training. [Cityscapes]

  • For PASCAL VOC 2012, the dataset will be automatically downloaded via torchvision.datasets.VOCSegmentation. You just need to specify which directory you want to download it with dir_dataset variable. If the automatic download fails, you can manually download through the following page (you don't need to untar VOCtrainval_11-May-2012.tar file which will be downloaded). [PASCAL VOC 2012 segmentation]

For more details about the data we used to train/validate our model, please visit datasets directory and find {camvid, cityscapes, voc}_{train, val}.txt file.

Train and validate

By default, the current code validates the model every epoch while training. To train a MobileNetv2-based DeepLabv3+ network, follow the below lines. (The pretrained MobileNetv2 will be loaded automatically.)

cd scripts
sh pixelpick-dl-cv.sh

Benchmark results

For CamVid and Cityscapes, we report the average of 5 different runs and 3 different runs for PASCAL VOC 2012. Please refer to our paper for details. ± one std of mean IoU is denoted.

CamVid
model backbone (encoder) # labelled pixels per img (% annotation) mean IoU (%)
PixelPick MobileNetv2 20 (0.012) 50.8 ± 0.2
PixelPick MobileNetv2 40 (0.023) 53.9 ± 0.7
PixelPick MobileNetv2 60 (0.035) 55.3 ± 0.5
PixelPick MobileNetv2 80 (0.046) 55.2 ± 0.7
PixelPick MobileNetv2 100 (0.058) 55.9 ± 0.1
Fully-supervised MobileNetv2 360x480 (100) 58.2 ± 0.6
PixelPick ResNet50 20 (0.012) 59.7 ± 0.9
PixelPick ResNet50 40 (0.023) 62.3 ± 0.5
PixelPick ResNet50 60 (0.035) 64.0 ± 0.3
PixelPick ResNet50 80 (0.046) 64.4 ± 0.6
PixelPick ResNet50 100 (0.058) 65.1 ± 0.3
Fully-supervised ResNet50 360x480 (100) 67.8 ± 0.3
Cityscapes

Note that to make training time manageable, we train on the quarter resolution (256x512) of the original Cityscapes images (1024x2048).

model backbone (encoder) # labelled pixels per img (% annotation) mean IoU (%)
PixelPick MobileNetv2 20 (0.015) 52.0 ± 0.6
PixelPick MobileNetv2 40 (0.031) 54.7 ± 0.4
PixelPick MobileNetv2 60 (0.046) 55.5 ± 0.6
PixelPick MobileNetv2 80 (0.061) 56.1 ± 0.3
PixelPick MobileNetv2 100 (0.076) 56.5 ± 0.3
Fully-supervised MobileNetv2 256x512 (100) 61.4 ± 0.5
PixelPick ResNet50 20 (0.015) 56.1 ± 0.4
PixelPick ResNet50 40 (0.031) 60.0 ± 0.3
PixelPick ResNet50 60 (0.046) 61.6 ± 0.4
PixelPick ResNet50 80 (0.061) 62.3 ± 0.4
PixelPick ResNet50 100 (0.076) 62.8 ± 0.4
Fully-supervised ResNet50 256x512 (100) 68.5 ± 0.3
PASCAL VOC 2012
model backbone (encoder) # labelled pixels per img (% annotation) mean IoU (%)
PixelPick MobileNetv2 10 (0.009) 51.7 ± 0.2
PixelPick MobileNetv2 20 (0.017) 53.9 ± 0.8
PixelPick MobileNetv2 30 (0.026) 56.7 ± 0.3
PixelPick MobileNetv2 40 (0.034) 56.9 ± 0.7
PixelPick MobileNetv2 50 (0.043) 57.2 ± 0.3
Fully-supervised MobileNetv2 N/A (100) 57.9 ± 0.5
PixelPick ResNet50 10 (0.009) 59.7 ± 0.8
PixelPick ResNet50 20 (0.017) 65.6 ± 0.5
PixelPick ResNet50 30 (0.026) 66.4 ± 0.2
PixelPick ResNet50 40 (0.034) 67.2 ± 0.1
PixelPick ResNet50 50 (0.043) 67.4 ± 0.5
Fully-supervised ResNet50 N/A (100) 69.4 ± 0.3

Models

model dataset backbone (encoder) # labelled pixels per img (% annotation) mean IoU (%) Download
PixelPick CamVid MobileNetv2 100 (0.058) 56.1 Link
PixelPick CamVid ResNet50 100 (0.058) TBU TBU
PixelPick Cityscapes MobileNetv2 100 (0.076) 56.8 Link
PixelPick Cityscapes ResNet50 100 (0.076) 63.3 Link
PixelPick VOC 2012 MobileNetv2 50 (0.043) 57.4 Link
PixelPick VOC 2012 ResNet50 50 (0.043) 68.0 Link

PixelPick mouse-free annotation tool

Code for the annotation tool will be made available.

Citation

To be updated.

Acknowledgements

We borrowed code for the MobileNetv2-based DeepLabv3+ network from https://github.com/Shuai-Xie/DEAL.

If you have any questions, please contact us at {gyungin, weidi, samuel}@robots.ox.ac.uk.

Owner
Gyungin Shin
Serving others
Gyungin Shin
Official pytorch implementation of Active Learning for deep object detection via probabilistic modeling (ICCV 2021)

Active Learning for Deep Object Detection via Probabilistic Modeling This repository is the official PyTorch implementation of Active Learning for Dee

NVIDIA Research Projects 130 Jan 06, 2023
RoIAlign & crop_and_resize for PyTorch

RoIAlign for PyTorch This is a PyTorch version of RoIAlign. This implementation is based on crop_and_resize and supports both forward and backward on

Long Chen 530 Jan 07, 2023
[ICML'21] Estimate the accuracy of the classifier in various environments through self-supervision

What Does Rotation Prediction Tell Us about Classifier Accuracy under Varying Testing Environments? [Paper] [ICML'21 Project] PyTorch Implementation T

24 Oct 26, 2022
DiSECt: Differentiable Simulator for Robotic Cutting

DiSECt: Differentiable Simulator for Robotic Cutting Website | Paper | Dataset | Video | Blog post DiSECt is a simulator for the cutting of deformable

NVIDIA Research Projects 73 Oct 29, 2022
Heart Arrhythmia Classification

This program takes and input of an ECG in European Data Format (EDF) and outputs the classification for heartbeats into normal vs different types of arrhythmia . It uses a deep learning model for cla

4 Nov 02, 2022
The mini-AlphaStar (mini-AS, or mAS) - mini-scale version (non-official) of the AlphaStar (AS)

A mini-scale reproduction code of the AlphaStar program. Note: the original AlphaStar is the AI proposed by DeepMind to play StarCraft II.

Ruo-Ze Liu 216 Jan 04, 2023
Lipstick ain't enough: Beyond Color-Matching for In-the-Wild Makeup Transfer (CVPR 2021)

Table of Content Introduction Datasets Getting Started Requirements Usage Example Training & Evaluation CPM: Color-Pattern Makeup Transfer CPM is a ho

VinAI Research 248 Dec 13, 2022
OCR Streamlit App is used to extract text from images using python's easyocr, pytorch and streamlit packages

OCR-Streamlit-App OCR Streamlit App is used to extract text from images using python's easyocr, pytorch and streamlit packages OCR app gets an image a

Siva Prakash 5 Apr 05, 2022
Official repository with code and data accompanying the NAACL 2021 paper "Hurdles to Progress in Long-form Question Answering" (https://arxiv.org/abs/2103.06332).

Hurdles to Progress in Long-form Question Answering This repository contains the official scripts and datasets accompanying our NAACL 2021 paper, "Hur

Kalpesh Krishna 41 Nov 08, 2022
A PyTorch implementation of the paper Mixup: Beyond Empirical Risk Minimization in PyTorch

Mixup: Beyond Empirical Risk Minimization in PyTorch This is an unofficial PyTorch implementation of mixup: Beyond Empirical Risk Minimization. The co

Harry Yang 121 Dec 17, 2022
[ICCV21] Self-Calibrating Neural Radiance Fields

Self-Calibrating Neural Radiance Fields, ICCV, 2021 Project Page | Paper | Video Author Information Yoonwoo Jeong [Google Scholar] Seokjun Ahn [Google

381 Dec 30, 2022
Code for paper "Multi-level Disentanglement Graph Neural Network"

Multi-level Disentanglement Graph Neural Network (MD-GNN) This is a PyTorch implementation of the MD-GNN, and the code includes the following modules:

Lirong Wu 6 Dec 29, 2022
Industrial knn-based anomaly detection for images. Visit streamlit link to check out the demo.

Industrial KNN-based Anomaly Detection ⭐ Now has streamlit support! ⭐ Run $ streamlit run streamlit_app.py This repo aims to reproduce the results of

aventau 102 Dec 26, 2022
Compare neural networks by their feature similarity

PyTorch Model Compare A tiny package to compare two neural networks in PyTorch. There are many ways to compare two neural networks, but one robust and

Anand Krishnamoorthy 181 Jan 04, 2023
Flexible-Modal Face Anti-Spoofing: A Benchmark

Flexible-Modal FAS This is the official repository of "Flexible-Modal Face Anti-

Zitong Yu 22 Nov 10, 2022
[CVPR 2021] Official PyTorch Implementation for "Iterative Filter Adaptive Network for Single Image Defocus Deblurring"

IFAN: Iterative Filter Adaptive Network for Single Image Defocus Deblurring Checkout for the demo (GUI/Google Colab)! The GUI version might occasional

Junyong Lee 173 Dec 30, 2022
Unofficial PyTorch implementation of Neural Additive Models (NAM) by Agarwal, et al.

nam-pytorch Unofficial PyTorch implementation of Neural Additive Models (NAM) by Agarwal, et al. [abs, pdf] Installation You can access nam-pytorch vi

Rishabh Anand 11 Mar 14, 2022
Simulated garment dataset for virtual try-on

Simulated garment dataset for virtual try-on This repository contains the dataset used in the following papers: Self-Supervised Collision Handling via

33 Dec 20, 2022
Implementation of paper "Self-supervised Learning on Graphs:Deep Insights and New Directions"

SelfTask-GNN A PyTorch implementation of "Self-supervised Learning on Graphs: Deep Insights and New Directions". [paper] In this paper, we first deepe

Wei Jin 85 Oct 13, 2022
Unofficial PyTorch implementation of "RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving" (ECCV 2020)

RTM3D-PyTorch The PyTorch Implementation of the paper: RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving (ECCV 2020

Nguyen Mau Dzung 271 Nov 29, 2022