Learning to Prompt for Continual Learning

Overview

Learning to Prompt for Continual Learning (L2P) Official Jax Implementation

L2P is a novel continual learning technique which learns to dynamically prompt a pre-trained model to learn tasks sequentially under different task transitions. Different from mainstream rehearsal-based or architecture-based methods, L2P requires neither a rehearsal buffer nor test-time task identity. L2P can be generalized to various continual learning settings including the most challenging and realistic task-agnostic setting. L2P consistently outperforms prior state-of-the-art methods. Surprisingly, L2P achieves competitive results against rehearsal-based methods even without a rehearsal buffer.

Code is written by Zifeng Wang. Acknowledgement to https://github.com/google-research/nested-transformer.

This is not an officially supported Google product.

Enviroment setup

pip install -r requirements.txt

Getting pretrained ViT model

ViT-B/16 model used in this paper can be downloaded at here.

Instructions on running L2P

We provide the configuration file to train and evaluate L2P on multiple benchmarks in configs.

To run our method on the Split CIFAR-100 dataset (class-incremental setting):

python -m main.py --my_config configs/cifar100_l2p.py --workdir=./cifar100_l2p --my_config.init_checkpoint=<ViT-saved-path/ViT-B_16.npz>

To run our method on the more complex Gaussian Scheduled CIFAR-100 dataset (task-agnostic setting):

python -m main.py --my_config configs/cifar100_gaussian_l2p.py --workdir=./cifar100_gaussian_l2p --my_config.init_checkpoint=<ViT-saved-path/ViT-B_16.npz>

Note: we run our experiments using 8 V100 GPUs or 4 TPUs, and we specify a per device batch size of 16 in the config files. This indicates that we use a total batch size of 128.

Visualize results

We use tensorboard to visualize the result. For example, if the working directory specified to run L2P is workdir=./cifar100_l2p, the command to check result is as follows:

tensorboard --logdir ./cifar100_l2p

Here are the important metrics to keep track of, and their corresponding meanings:

Metric Description
accuracy_n Accuracy of the n-th task
forgetting Average forgetting up until the current task
avg_acc Average evaluation accuracy up until the current task

Cite

@inproceedings{wang2021learning,
  title={Learning to Prompt for Continual Learning},
  author={Zifeng Wang and Zizhao Zhang and Chen-Yu Lee and Han Zhang and Ruoxi Sun and Xiaoqi Ren and Guolong Su and Vincent Perot and Jennifer Dy and Tomas Pfister},
  booktitle={arXiv preprint arXiv:2112.08654},
  year={2021}
}
Fast and exact ILP-based solvers for the Minimum Flow Decomposition (MFD) problem, and variants of it.

MFD-ILP Fast and exact ILP-based solvers for the Minimum Flow Decomposition (MFD) problem, and variants of it. The solvers are implemented using Pytho

Algorithmic Bioinformatics Group @ University of Helsinki 4 Oct 23, 2022
OpenCVのGrabCut()を利用したセマンティックセグメンテーション向けアノテーションツール(Annotation tool using GrabCut() of OpenCV. It can be used to create datasets for semantic segmentation.)

[Japanese/English] GrabCut-Annotation-Tool GrabCut-Annotation-Tool.mp4 OpenCVのGrabCut()を利用したアノテーションツールです。 セマンティックセグメンテーション向けのデータセット作成にご使用いただけます。 ※Grab

KazuhitoTakahashi 30 Nov 18, 2022
NumQMBasic - A mini-course offered to Undergrad physics students

The best way to use this material is by forking it by click the Fork button at the top, right corner. Then you will get your own copy to play with! Th

Raghu 35 Dec 05, 2022
Repo 4 basic seminar §How to make human machine readable"

WORK IN PROGRESS... Notebooks from the Seminar: Human Machine Readable WS21/22 Introduction into programming Georg Trogemann, Christian Heck, Mattis

experimental-informatics 3 May 29, 2022
[ACL-IJCNLP 2021] "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets"

EarlyBERT This is the official implementation for the paper in ACL-IJCNLP 2021 "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets" by

VITA 13 May 11, 2022
MicRank is a Learning to Rank neural channel selection framework where a DNN is trained to rank microphone channels.

MicRank: Learning to Rank Microphones for Distant Speech Recognition Application Scenario Many applications nowadays envision the presence of multiple

Samuele Cornell 20 Nov 10, 2022
Tensorflow Implementation of Pixel Transposed Convolutional Networks (PixelTCN and PixelTCL)

Pixel Transposed Convolutional Networks Created by Hongyang Gao, Hao Yuan, Zhengyang Wang and Shuiwang Ji at Texas A&M University. Introduction Pixel

Hongyang Gao 95 Jul 24, 2022
Look Closer: Bridging Egocentric and Third-Person Views with Transformers for Robotic Manipulation

Look Closer: Bridging Egocentric and Third-Person Views with Transformers for Robotic Manipulation Official PyTorch implementation for the paper Look

Rishabh Jangir 20 Nov 24, 2022
The Python ensemble sampling toolkit for affine-invariant MCMC

emcee The Python ensemble sampling toolkit for affine-invariant MCMC emcee is a stable, well tested Python implementation of the affine-invariant ense

Dan Foreman-Mackey 1.3k Dec 31, 2022
Measure WWjj polarization fraction

WlWl Polarization Measure WWjj polarization fraction Paper: arXiv:2109.09924 Notice: This code can only be used for the inference process, if you want

4 Apr 10, 2022
Python scripts form performing stereo depth estimation using the high res stereo model in PyTorch .

PyTorch-High-Res-Stereo-Depth-Estimation Python scripts form performing stereo depth estimation using the high res stereo model in PyTorch. Stereo dep

Ibai Gorordo 26 Nov 24, 2022
🍀 Pytorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which is helpful to further understand papers.⭐⭐⭐

🍀 Pytorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which is helpful to further understand papers.⭐⭐⭐

xmu-xiaoma66 7.7k Jan 05, 2023
[ICCV21] Self-Calibrating Neural Radiance Fields

Self-Calibrating Neural Radiance Fields, ICCV, 2021 Project Page | Paper | Video Author Information Yoonwoo Jeong [Google Scholar] Seokjun Ahn [Google

381 Dec 30, 2022
This is a Keras implementation of a CNN for estimating age, gender and mask from a camera.

face-detector-age-gender This is a Keras implementation of a CNN for estimating age, gender and mask from a camera. Before run face detector app, expr

Devdreamsolution 2 Dec 04, 2021
IDM: An Intermediate Domain Module for Domain Adaptive Person Re-ID,

Intermediate Domain Module (IDM) This repository is the official implementation for IDM: An Intermediate Domain Module for Domain Adaptive Person Re-I

Yongxing Dai 87 Nov 22, 2022
Official code for Next Check-ins Prediction via History and Friendship on Location-Based Social Networks (MDM 2018)

MUC Next Check-ins Prediction via History and Friendship on Location-Based Social Networks (MDM 2018) Performance Details for Accuracy: | Dataset

Yijun Su 3 Oct 09, 2022
IA for recognising Traffic Signs using Keras [Tensorflow]

Traffic Signs Recognition ⚠️ 🚦 Fundamentals of Intelligent Systems Introduction 📄 Development of a neural network capable of recognizing nine differ

Sebastián Fernández García 2 Dec 19, 2022
Reduce end to end training time from days to hours (or hours to minutes), and energy requirements/costs by an order of magnitude using coresets and data selection.

COResets and Data Subset selection Reduce end to end training time from days to hours (or hours to minutes), and energy requirements/costs by an order

decile-team 244 Jan 09, 2023
Our solution for SSN Invente 2021's Hackathon

Our solution for SSN Invente 2021's Hackathon. To help maitain godowns in a pristine and safe condition using raspberry pi.

1 Jan 12, 2022
[CVPR2021] The source code for our paper 《Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning》.

TBE The source code for our paper "Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Le

Jinpeng Wang 150 Dec 28, 2022