Generalized Category Discovery

This repo is a placeholder for code for our paper: Generalized Category Discovery

Abstract: In this paper, we consider a highly general image recognition setting wherein, given a labelled and an unlabelled set of images, the task is to categorize all images in the unlabelled set. Here, the unlabelled images may come from labelled classes or from novel ones. Existing recognition methods cannot deal with this setting because they make several restrictive assumptions, such as the unlabelled instances coming only from known classes, or only from unknown ones, and the number of unknown classes being known a priori. We address the more unconstrained setting, naming it 'Generalized Category Discovery', and challenge all these assumptions. We first establish strong baselines by taking state-of-the-art algorithms from novel category discovery and adapting them for this task. Next, we propose the use of vision transformers with contrastive representation learning for this open-world setting. We then introduce a simple yet effective semi-supervised k-means method to cluster the unlabelled data into seen and unseen classes automatically, substantially outperforming the baselines. Finally, we also propose a new approach to estimate the number of classes in the unlabelled data. We thoroughly evaluate our approach on public datasets for generic object classification, including CIFAR10, CIFAR100 and ImageNet-100, and for fine-grained visual recognition, including CUB, Stanford Cars and Herbarium19, benchmarking on this new setting to foster future research.
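Until the official code is released, the idea of semi-supervised k-means mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes feature vectors have already been extracted, that the total number of clusters `k` is given, and that the usual constrained k-means recipe applies: labelled points are pinned to the cluster of their class, while unlabelled points are free to move. The function name, its parameters, and the farthest-point seeding of the extra centroids are all hypothetical choices made for illustration.

```python
import numpy as np


def semi_supervised_kmeans(x_lab, y_lab, x_unlab, k, n_iter=100):
    """Constrained k-means sketch (hypothetical helper, not the paper's code).

    Labelled points always stay in the cluster of their class; unlabelled
    points are assigned to the nearest of the k centroids.
    """
    classes = np.unique(y_lab)
    n_known = len(classes)
    if k < n_known:
        raise ValueError("k must be at least the number of labelled classes")

    # Map arbitrary label values to cluster indices 0..n_known-1.
    lab_assign = np.searchsorted(classes, y_lab)

    # Centroids of known classes start at the labelled class means.
    centroids = np.stack(
        [x_lab[lab_assign == c].mean(axis=0) for c in range(n_known)]
    )

    # Assumption: seed each remaining centroid at the unlabelled point that
    # is farthest from all centroids chosen so far.
    for _ in range(k - n_known):
        d = ((x_unlab[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        centroids = np.vstack([centroids, x_unlab[np.argmax(d.min(axis=1))]])

    x_all = np.concatenate([x_lab, x_unlab])
    unlab_assign = np.full(len(x_unlab), -1)
    for _ in range(n_iter):
        # Assignment step: only unlabelled points may change cluster.
        d = ((x_unlab[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        new_assign = d.argmin(axis=1)
        if np.array_equal(new_assign, unlab_assign):
            break  # converged
        unlab_assign = new_assign

        # Update step: each centroid is the mean of labelled and unlabelled
        # members; empty clusters keep their previous centroid.
        all_assign = np.concatenate([lab_assign, unlab_assign])
        for c in range(k):
            members = x_all[all_assign == c]
            if len(members) > 0:
                centroids[c] = members.mean(axis=0)

    return unlab_assign, centroids
```

Calling `semi_supervised_kmeans(x_lab, y_lab, x_unlab, k)` returns one cluster index per unlabelled point plus the final centroids. Indices below the number of labelled classes correspond to seen classes (in sorted label order), and the rest are candidate novel categories.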

Code Coming Soon!
