Learning where to learn - Gradient sparsity in meta and continual learning

Last update: Dec 09, 2022

Overview

In this paper, we investigate gradient sparsity found by MAML in various continual and few-shot learning scenarios.
Instead of only learning the initialization of the neural network parameters, we additionally meta-learn parameters underneath a step function that stops gradient descent on a parameter when the underlying value is smaller than 0.

We term this variant Sparse-MAML. Link to the paper here.
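
As a rough illustration of the gating described above, the following sketch applies a single masked gradient step in plain Python. It is not code from this repository: the names `step_mask`, `masked_inner_step` and `mask_params`, the learning rate, and the toy values are all assumptions made for the example, and treating a value of exactly 0 as "keep learning" is one reading of "smaller than 0".

```python
def step_mask(mask_params):
    """Binary gate per parameter: 1.0 if the underlying value is >= 0, else 0.0."""
    return [1.0 if m >= 0.0 else 0.0 for m in mask_params]


def masked_inner_step(weights, grads, mask_params, lr=0.1):
    """One gradient-descent step that skips parameters whose gate is closed."""
    mask = step_mask(mask_params)
    return [w - lr * g * k for w, g, k in zip(weights, grads, mask)]


# Toy values (hypothetical): three scalar weights with their gradients.
weights = [0.5, -1.0, 2.0]
grads = [1.0, 1.0, -2.0]
# Values underneath the step function; the second one is negative,
# so the second weight is frozen during this step.
mask_params = [0.3, -0.7, 1.2]

updated = masked_inner_step(weights, grads, mask_params, lr=0.1)
print(updated)
```

Only the inner-loop update is shown. Because the step function has zero derivative almost everywhere, meta-learning `mask_params` in the outer loop needs some surrogate gradient, which this sketch leaves out.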

Interestingly, we see that structured sparsity emerges in both the classic 4-layer ConvNet and a ResNet-12 for few-shot learning. This is accompanied by improved robustness and generalisation across many hyperparameters.

Note that Sparse-MAML is an extremely simple variant of MAML: it can only switch the training of specific parameters on or off, in contrast to proper gradient modulation.

This codebase implements the few-shot learning experiments presented in the paper. To reproduce the results, please follow these instructions:

Installation

#1. Create a conda env:

conda create -n sparse-MAML

#2. Activate the env:

source activate sparse-MAML

#3. Install anaconda:

conda install anaconda

#4. Install extra requirements (make sure you use the correct pip3):

pip3 install -r requirements.txt

#5. Make the run script executable:

chmod u+x run_sparse_MAML.sh

#6. Execute:

./run_sparse_MAML.sh

Results

| MiniImageNet Few-Shot (accuracy, %) | MAML | ANIL | BOIL | sparse-MAML | sparse-ReLU-MAML |
|---|---|---|---|---|---|
| 5-way 5-shot, ConvNet | 63.15 | 61.50 | 66.45 | 67.03 | 64.84 |
| 5-way 1-shot, ConvNet | 48.07 | 46.70 | 49.61 | 50.35 | 50.39 |
| 5-way 5-shot, ResNet12 | 69.36 | 70.03 | 70.50 | 70.02 | 73.01 |
| 5-way 1-shot, ResNet12 | 53.91 | 55.25 | - | 55.02 | 56.39 |

BOIL results are taken from the original paper.
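
To read the table as differences rather than absolute numbers, the short sketch below computes the accuracy gain of sparse-MAML over MAML for each setting. The numbers are copied from the table above; the dictionary layout and variable names are only assumptions for this example.

```python
# Accuracies (%) copied from the results table.
accuracy = {
    "5-way 5-shot, ConvNet": {"MAML": 63.15, "sparse-MAML": 67.03},
    "5-way 1-shot, ConvNet": {"MAML": 48.07, "sparse-MAML": 50.35},
    "5-way 5-shot, ResNet12": {"MAML": 69.36, "sparse-MAML": 70.02},
    "5-way 1-shot, ResNet12": {"MAML": 53.91, "sparse-MAML": 55.02},
}

# Gain of sparse-MAML over MAML in percentage points, rounded to two decimals.
gains = {
    setting: round(row["sparse-MAML"] - row["MAML"], 2)
    for setting, row in accuracy.items()
}

for setting, gain in gains.items():
    print(f"{setting}: {gain:+.2f}")
```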

This codebase is heavily built on top of torchmeta.

Owner

Johannes Oswald
