Convnet transfer - Code for paper How transferable are features in deep neural networks?

Overview

How transferable are features in deep neural networks?

This repository contains source code necessary to reproduce the results presented in the following paper:

@inproceedings{yosinski_2014_NIPS,
  title={How transferable are features in deep neural networks?},
  author={Yosinski, Jason and Clune, Jeff and Bengio, Yoshua and Lipson, Hod},
  booktitle={Advances in Neural Information Processing Systems 27 (NIPS '14)},
  editor = {Z. Ghahramani and M. Welling and C. Cortes and N.D. Lawrence and K.Q. Weinberger},
  publisher = {Curran Associates, Inc.},
  pages = {3320--3328},
  year={2014}
}

There are four steps to using this codebase to reproduce the results in the paper.

  • Assemble prerequisites
  • Create datasets
  • Train models
  • Gather and plot results

Each is described below. Training results are also provided in the results directory for those just wishing to compare results to their own work without undertaking the arduous training process.

Assemble prerequisites

Several dependencies should be installed.

  • To run experiments: Caffe and its relevant dependencies (see install tutorial).
  • To produce plots: the IPython, numpy, and matplotlib packages for Python. Depending on your setup, it may be possible to install these via pip install ipython numpy matplotlib.

Create datasets

1. Obtain ILSVRC 2012 dataset

The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012 dataset can be downloaded here (registration required).

2. Create derivative dataset splits

The necessary smaller derivative datasets (random halves, natural and man-made halves, and reduced volume versions) can be created from the raw ILSVRC12 dataset.

$ cd ilsvrc12
$ ./make_reduced_datasets.sh

The script does most of the work, including setting random seeds so that, ideally, it reproduces the exact random splits used in the paper. MD5 checksums for each dataset file are listed at the bottom of make_reduced_datasets.sh and can be used to verify the match. Results may vary across platforms, though, so don't worry too much if your sums don't match.
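One way to compare a generated split file against a published checksum is sketched below. This is a minimal Python sketch, not part of the repository: the function names md5_of_file and matches_expected are made up for illustration, and the expected sums themselves have to be copied from the bottom of make_reduced_datasets.sh.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_md5):
    """Compare a file's MD5 digest with an expected hex string (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

A mismatch is not necessarily an error, since the splits may differ across platforms.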

3. Convert datasets to databases

The datasets created above are so far just text files providing a list of image filenames and class ids. To train a Caffe model, they should be converted to a LevelDB or LMDB database, one per dataset. See the Caffe ImageNet Tutorial for a more in-depth look at this process.

First, edit create_all_leveldbs.sh and set the IMAGENET_DIR and CAFFE_TOOLS_DIR to point to the directories containing the ImageNet image files and compiled caffe tools (like convert_imageset.bin), respectively. Then run:

$ ./create_all_leveldbs.sh

This step takes a lot of space (and time): approximately 230 GB for the base training dataset and, on average, 115 GB for each of the 10 split versions, for a total of about 1.5 TB. If this is prohibitive, you might consider using a different type of Caffe data layer that loads images directly from a single shared directory.
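Before starting the conversion, it may be worth checking free disk space against these figures. The following is a rough Python sketch under stated assumptions: the constants are the sizes quoted above, and the function names are hypothetical helpers, not part of the repository. The itemized figures sum to 1380 GB, in the same range as the rounded total quoted above, so some headroom is advisable.

```python
import shutil

BASE_DATASET_GB = 230   # base training dataset, figure quoted above
SPLIT_DATASET_GB = 115  # average size of one split version, figure quoted above
NUM_SPLITS = 10


def estimated_database_gb():
    """Sum the itemized database sizes quoted in the text, in GB."""
    return BASE_DATASET_GB + NUM_SPLITS * SPLIT_DATASET_GB


def has_enough_space(directory, required_gb):
    """Return True if the filesystem holding directory has required_gb free."""
    free_gb = shutil.disk_usage(directory).free / 1e9
    return free_gb >= required_gb
```

For example, has_enough_space("/path/to/databases", estimated_database_gb()) reports whether the target filesystem can hold the itemized databases; the path here is a placeholder.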

4. Compute the mean of each dataset

Again, edit the paths in the script to point to the appropriate locations, and then run:

$ ./create_all_means.sh

This just computes the mean of each dataset and saves it in the dataset directory. Means are subtracted from input images during training and inference.
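To illustrate what mean subtraction does, here is a toy Python sketch. It assumes an element-wise (per-pixel) mean image and represents each image as a flat list of pixel values; the actual computation in this codebase is done by the script above on the databases, and the function names below are illustrative only.

```python
def mean_image(images):
    """Element-wise mean of equally sized images given as flat lists of pixel values."""
    if not images:
        raise ValueError("need at least one image")
    count = len(images)
    # zip(*images) groups the values found at the same pixel position.
    return [sum(pixels) / count for pixels in zip(*images)]


def subtract_mean(image, mean):
    """Center one image by subtracting the dataset mean at each pixel position."""
    return [pixel - m for pixel, m in zip(image, mean)]
```

The same mean has to be used at training and at inference time, which is why it is saved alongside each dataset.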

Train models

A total of 163 networks were trained to produce the results in the paper. Many of these networks can be trained in parallel, but because weights are transferred from one network to another, some must be trained serially. In particular, all networks in the first block below must be trained before any in the second block can be trained. All networks within a block may be trained at the same time. The "whenever" block has no dependencies and can be trained at any time.

Block: one
  half*       (10 nets)

Block: two
  transfer*   (140 nets)

Block: whenever
  netbase     (1 net)
  reduced-*   (12 nets)
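
The ordering constraint can be written down as a small dependency table, which is handy when generating cluster jobs. This is an illustrative Python sketch: the block names and counts come from the listing above, while the dictionary layout and function names are assumptions made for the example.

```python
# Network counts per block, taken from the listing above.
BLOCKS = {
    "one": {"half*": 10},
    "two": {"transfer*": 140},
    "whenever": {"netbase": 1, "reduced-*": 12},
}

# Block "two" loads weights from block "one"; the others have no prerequisites.
DEPENDS_ON = {"one": [], "two": ["one"], "whenever": []}


def total_networks():
    """Total number of networks across all blocks."""
    return sum(count for nets in BLOCKS.values() for count in nets.values())


def trainable_blocks(finished):
    """Return the unfinished blocks whose prerequisites are all finished."""
    done = set(finished)
    return sorted(
        name for name, deps in DEPENDS_ON.items()
        if name not in done and all(dep in done for dep in deps)
    )
```

With nothing trained yet, trainable_blocks([]) lists blocks one and whenever; block two only appears once block one is marked finished.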

To train a given network, change to its directory, copy (or symlink) the required caffe executable, and run the training procedure. This can be accomplished using the following commands, demonstrated for the half0A network:

$ cd results/half0A
$ cp /path/to/caffe/build/tools/caffe.bin .
$ ./caffe.bin train -solver imagenet_solver.prototxt

Repeat this process for all networks in block: one and block: whenever above. Once the networks in block: one are trained, train all the networks in block: two similarly. This time the command is slightly different, because we need to load the base network in order to fine-tune it on the target task. Here's an example for the transfer0A0A_1_1 network:

$ cd results/transfer0A0A_1_1
$ cp /path/to/caffe/build/tools/caffe.bin .
$ ./caffe.bin train -solver imagenet_solver.prototxt -weights basenet/caffe_imagenet_train_iter_450000

The basenet symlinks have been added to point to the appropriate base network, but the basenet/caffe_imagenet_train_iter_450000 file will not exist until the relevant block: one network has been trained.
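
A launcher script may therefore want to confirm that the base weights exist before starting a transfer run. The sketch below is a hypothetical Python helper, not part of the repository. It assumes the directory layout and file names used in the commands above, and it assumes that transfer networks can be recognized by the "transfer" prefix of their directory name.

```python
import os

SOLVER = "imagenet_solver.prototxt"
BASE_WEIGHTS = "basenet/caffe_imagenet_train_iter_450000"


def build_train_command(net_dir):
    """Return the training command for the network in net_dir as an argument list.

    The command is meant to be run from inside net_dir. Raises FileNotFoundError
    if a transfer network's base weights are not there yet.
    """
    command = ["./caffe.bin", "train", "-solver", SOLVER]
    net_name = os.path.basename(os.path.normpath(net_dir))
    if net_name.startswith("transfer"):  # assumption: naming marks transfer nets
        weights = os.path.join(net_dir, BASE_WEIGHTS)
        if not os.path.exists(weights):
            raise FileNotFoundError("base network not trained yet: " + weights)
        command += ["-weights", BASE_WEIGHTS]
    return command
```

Failing early this way avoids queueing a multi-day job that would stop immediately because its base network is missing.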

Training notes: the above procedure should work if followed literally, but because each network takes about 9.5 days to train (on a K20 GPU), it will be much faster to train networks in parallel in a cluster environment. To do so, create and submit jobs as appropriate for your system. You'll also want to ensure that the output of the training procedure is logged, either by redirecting it to a file

$ ./caffe.bin train ... > log_file 2>&1

or via whatever logging facilities are supplied by your cluster or job manager setup.
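
If jobs are started from a Python script rather than a shell, the same redirection can be approximated with the standard subprocess module. This is a minimal sketch; run_logged is a hypothetical helper name, and the command passed to it is whatever training command applies to the network at hand.

```python
import subprocess


def run_logged(command, log_path, cwd=None):
    """Run command, writing both stdout and stderr to log_path; return the exit code."""
    with open(log_path, "w") as log_file:
        # stderr=subprocess.STDOUT merges the two streams, like "> log_file 2>&1".
        result = subprocess.run(
            command, cwd=cwd, stdout=log_file, stderr=subprocess.STDOUT
        )
    return result.returncode
```

Note that log_path is resolved relative to the launcher's working directory, not relative to cwd, so an absolute path is the safer choice when cwd is set.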

Plot results

Once the networks are trained, the results can be plotted using the included IPython notebook plots/transfer_plots.ipynb. Start the IPython Notebook server:

$ cd plots
$ ipython notebook

Select the transfer_plots.ipynb notebook and execute the included code. Note that without modification, the code will load results from the cached log files included in this repository. If you've run your own training and wish to plot those log files, change the paths in the "Load all the data" section to point to your log files instead.

Shortcut: to skip all the work and just see the results, take a look at this notebook with cached plots.

Questions?

Please drop me a line if you have any questions!

Owner
Jason Yosinski