How Do Adam and Training Strategies Help BNNs Optimization? In ICML 2021.

Last update: Sep 20, 2022

Overview

AdamBNN

This is the PyTorch implementation of our paper "How Do Adam and Training Strategies Help BNNs Optimization?", published in ICML 2021.

In this work, we explore the intrinsic reasons why Adam is superior to other optimizers like SGD for BNN optimization and provide analytical explanations that support specific training strategies. By visualizing the optimization trajectory, we show that BNN optimization takes place in an extremely rugged loss landscape, and that the second-order momentum in Adam is crucial for revitalizing weights that are dead due to activation saturation in BNNs. Based on this analysis, we derive a specific training scheme that achieves 70.5% top-1 accuracy on the ImageNet dataset using the same architecture as ReActNet, 1.1% higher than the ReActNet baseline.
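The role of the second-order momentum can be illustrated with a toy calculation. The sketch below is not taken from the repository or the paper's experiments. It assumes that "second-order momentum" refers to Adam's running average of squared gradients (v in the standard Adam update), and it uses hypothetical function names and default hyperparameters. It compares the update size that plain SGD and Adam give to a weight whose gradient is consistently tiny, as could happen for a weight behind saturated activations.

```python
import math


def sgd_step(grad, lr=0.1):
    """Plain SGD update: the step is directly proportional to the gradient."""
    return -lr * grad


def adam_steps(grads, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Standard Adam updates for one scalar weight, given a gradient sequence."""
    m = 0.0  # first moment: running average of gradients
    v = 0.0  # second moment: running average of squared gradients
    updates = []
    for t, g in enumerate(grads, start=1):
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction
        v_hat = v / (1 - beta2 ** t)
        # Dividing by sqrt(v_hat) rescales the step by the gradient's own magnitude.
        updates.append(-lr * m_hat / (math.sqrt(v_hat) + eps))
    return updates


if __name__ == "__main__":
    tiny, large = 1e-4, 1.0  # hypothetical gradient magnitudes
    print("SGD  step ratio (tiny / large):", sgd_step(tiny) / sgd_step(large))
    print("Adam step ratio (tiny / large):",
          adam_steps([tiny] * 10)[-1] / adam_steps([large] * 10)[-1])
```

Under these assumptions, the SGD step shrinks in proportion to the gradient, while the Adam step for the tiny-gradient weight stays close to the step for the large-gradient weight. This only shows the normalization effect of the second moment on a single scalar; it does not reproduce the paper's analysis.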

Citation

If you find our code useful for your research, please consider citing:

@conference{liu2021how,
title = {How do adam and training strategies help bnns optimization?},
author = {Liu, Zechun and Shen, Zhiqiang and Li, Shichao and Helwegen, Koen and Huang, Dong and Cheng, Kwang-Ting},
booktitle = {International Conference on Machine Learning},
year = {2021},
organization={PMLR}
}

Run

1. Requirements:

Python 3, PyTorch 1.7.1, torchvision 0.8.2
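The pinned versions above can be checked before training. The following is a minimal sketch, not part of the repository: the helper name check_versions is hypothetical, and it assumes the packages are installed under their usual distribution names torch and torchvision.

```python
from importlib import metadata

# Versions listed in the requirements; the distribution names are an assumption.
REQUIRED = {"torch": "1.7.1", "torchvision": "0.8.2"}


def check_versions(required=REQUIRED):
    """Return {package: (wanted, installed_or_None, matches)}."""
    report = {}
    for package, wanted in required.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        # Builds may carry a local suffix such as "+cu110"; compare the public part only.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[package] = (wanted, installed, matches)
    return report


if __name__ == "__main__":
    for package, (wanted, installed, ok) in check_versions().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{package}: wanted {wanted}, found {installed} ({status})")
```

A mismatch does not necessarily mean the code will fail; the listed versions are simply the ones stated in the requirements.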

2. Data:

Download the ImageNet dataset.

3. Steps to run:

(1) Step 1: binarizing activations

Change directory to ./step1/
Run bash run.sh

(2) Step 2: binarizing weights + activations

Change directory to ./step2/
Run bash run.sh
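The two steps above can also be driven from a single script. This is a hedged sketch rather than anything shipped with the repository: run_steps and its dry_run flag are hypothetical, and it only assumes the ./step1/ and ./step2/ directories each contain the run.sh mentioned above. Any dataset paths or other settings that run.sh needs still have to be configured separately.

```python
import subprocess
from pathlib import Path

# Directory names taken from the steps listed above.
STEPS = ["step1", "step2"]


def run_steps(repo_root=".", dry_run=True):
    """Run `bash run.sh` in each step directory, in order.

    With dry_run=True nothing is executed; the planned
    (working_directory, command) pairs are only returned.
    """
    planned = []
    for step in STEPS:
        cwd = Path(repo_root) / step
        command = ["bash", "run.sh"]
        planned.append((str(cwd), command))
        if not dry_run:
            # check=True stops at the first failing step instead of continuing.
            subprocess.run(command, cwd=cwd, check=True)
    return planned


if __name__ == "__main__":
    for cwd, command in run_steps(dry_run=True):
        print(f"cd {cwd} && {' '.join(command)}")
```

Calling run_steps(dry_run=False) from the repository root would execute both scripts in sequence and raise an error if either one exits with a non-zero status.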

Models

Methods	Backbone	Top1-Acc	FLOPs	Trained Model
ReActNet	ReActNet-A	69.4%	0.87 x 10^8	Model-ReAct
AdamBNN	ReActNet-A	70.5%	0.87 x 10^8	Model-ReAct-AdamBNN-Training

Contact

Zechun Liu, HKUST and CMU (zliubq at connect.ust.hk / zechunl at andrew.cmu.edu)

Zhiqiang Shen, CMU (zhiqians at andrew.cmu.edu)
