🔪 Elimination based Lightweight Neural Net with Pretrained Weights

Overview

ELimNet

ELimNet: Eliminating Layers in a Neural Network Pretrained with Large Dataset for Downstream Task

  • Removed top layers from pretrained EfficientNetB0 and ResNet18 to construct lightweight CNN model with less than 1M #params.
  • Assessed on Trash Annotations in Context(TACO) Dataset sampled for 6 classes with 20,851 images.
  • Compared performance with lightweight models generated with Optuna's Neural Architecture Search(NAS) constituted with same convolutional blocks.

Quickstart

Installation

# clone the repository
git clone https://github.com/snoop2head/elimnet

# fetch image dataset and unzip
!wget -cq https://aistages-prod-server-public.s3.amazonaws.com/app/Competitions/000081/data/data.zip
!unzip ./data.zip -d ./

Train

# finetune on the dataset with pretrained model
python train.py --model ./model/efficientnet_b0.yaml

# finetune on the dataset with ElimNet
python train.py --model ./model/efficientnet_b0_elim_3.yaml

Inference

# inference with the lastest ran model
python inference.py --model_dir ./exp/latest/

Performance

Performance is compared with (1) original pretrained model and (2) Optuna NAS constructed models with no pretrained weights.

  • Indicates that top convolutional layers eliminated pretrained CNN models outperforms empty Optuna NAS models generated with same convolutional blocks.
  • Suggests that eliminating top convolutional layers creates lightweight model that shows similar(or better) classifcation performance with original pretrained model.
  • Reduces parameters to 7%(or less) of its original parameters while maintaining(or improving) its performance. Saves inference time by 20% or more by eliminating top convolutional layters.

ELimNet vs Pretrained Models (Train)

[100 epochs] # of Parameters # of Layers Train Validation Test F1
Pretrained EfficientNet B0 4.0M 352 Loss: 0.43
Acc: 81.23%
F1: 0.84
Loss: 0.469
Acc: 82.17%
F1: 0.76
0.7493
EfficientNet B0 Elim 2 0.9M 245 Loss:0.652
Acc: 87.22%
F1: 0.84
Loss: 0.622
Acc: 87.22%
F1: 0.77
0.7603
EfficientNet B0 Elim 3 0.30M 181 Loss: 0.602
Acc: 78.17%
F1: 0.74
Loss: 0.661
Acc: 77.41%
F1: 0.74
0.7349
Resnet18 11.17M 69 Loss: 0.578
Acc: 78.90%
F1: 0.76
Loss: 0.700
Acc: 76.17%
F1: 0.719
-
Resnet18 Elim 2 0.68M 37 Loss: 0.447
Acc: 83.73%
F1: 0.71
Loss: 0.712
Acc: 75.42%
F1: 0.71
-

ELimNet vs Pretrained Models (Inference)

# of Parameters # of Layers CPU times (sec) CUDA time (sec) Test Inference Time (sec)
Pretrained EfficientNet B0 4.0M 352 3.9s 4.0s 105.7s
EfficientNet B0 Elim 2 0.9M 245 4.1s 13.0s 83.4s
EfficientNet B0 Elim 3 0.30M 181 3.0s 9.0s 73.5s
Resnet18 11.17M 69 - - -
Resnet18 Elim 2 0.68M 37 - - -

ELimNet vs Empty Optuna NAS Models (Train)

[100 epochs] # of Parameters # of Layers Train Valid Test F1
Empty MobileNet V3 4.2M 227 Loss 0.925
Acc: 65.18%
F1: 0.58
Loss 0.993
Acc: 62.83%
F1: 0.56
-
Empty EfficientNet B0 1.3M 352 Loss 0.867
Acc: 67.28%
F1: 0.61
Loss 0.898
Acc: 66.80%
F1: 0.61
0.6337
Empty DWConv & InvertedResidualv3 NAS 0.08M 66 - Loss: 0.766
Acc: 71.71%
F1: 0.68
0.6740
Empty MBConv NAS 0.33M 141 Loss: 0.786
Acc: 70.72%
F1: 0.66
Loss: 0.866
Acc: 68.09%
F1: 0.62
0.6245
Resnet18 Elim 2 0.68M 37 Loss: 0.447
Acc: 83.73%
F1: 0.71
Loss: 0.712
Acc: 75.42%
F1: 0.71
-
EfficientNet B0 Elim 3 0.30M 181 Loss: 0.602
Acc: 78.17%
F1: 0.74
Loss: 0.661
Acc: 77.41%
F1: 0.74
0.7603

ELimNet vs Empty Optuna NAS Models (Inference)

# of Parameters # of Layers CPU times (sec) CUDA time (sec) Test Inference Time (sec)
Empty MobileNet V3 4.2M 227 4 13 -
Empty EfficientNet B0 1.3M 352 3.780 3.782 68.4s
Empty DWConv &
InvertedResidualv3 NAS
0.08M 66 1 3.5 61.1s
Empty MBConv NAS 0.33M 141 2.14 7.201 67.1s
Resnet18 Elim 2 0.68M 37 - - -
EfficientNet B0 Elim 3 0.30M 181 3.0s 9s 73.5s

Background & WiP

Background

Work in Progress

  • Will test the performance of replacing convolutional blocks with pretrained weights with a single convolutional layer without pretrained weights.
  • Will add ResNet18's inference time data and compare with Optuna's NAS constructed lightweight model.
  • Will test on pretrained MobileNetV3, MnasNet on torchvision with elimination based lightweight model architecture search.
  • Will be applied on other small datasets such as Fashion MNIST dataset and Plant Village dataset.

Others

  • "Empty" stands for model with no pretrained weights.
  • "EfficientNet B0 Elim 2" means 2 convolutional blocks have been eliminated from pretrained EfficientNet B0. Number next to "Elim" annotates how many convolutional blocks have been removed.
  • Table's performance illustrates best performance out of 100 epochs of finetuning on TACO Dataset.

Authors

Owner
snoop2head
break, compose, display
snoop2head
A GPT, made only of MLPs, in Jax

MLP GPT - Jax (wip) A GPT, made only of MLPs, in Jax. The specific MLP to be used are gMLPs with the Spatial Gating Units. Working Pytorch implementat

Phil Wang 53 Sep 27, 2022
Discriminative Condition-Aware PLDA

DCA-PLDA This repository implements the Discriminative Condition-Aware Backend described in the paper: L. Ferrer, M. McLaren, and N. Brümmer, "A Speak

Luciana Ferrer 31 Aug 05, 2022
A very lightweight monitoring system for Raspberry Pi clusters running Kubernetes.

OMNI A very lightweight monitoring system for Raspberry Pi clusters running Kubernetes. Why? When I finished my Kubernetes cluster using a few Raspber

Matias Godoy 148 Dec 29, 2022
Sparse Physics-based and Interpretable Neural Networks

Sparse Physics-based and Interpretable Neural Networks for PDEs This repository contains the code and manuscript for research done on Sparse Physics-b

28 Jan 03, 2023
PyTorch implementation of the Crafting Better Contrastive Views for Siamese Representation Learning

Crafting Better Contrastive Views for Siamese Representation Learning This is the official PyTorch implementation of the ContrastiveCrop paper: @artic

249 Dec 28, 2022
Code repository for "Stable View Synthesis".

Stable View Synthesis Code repository for "Stable View Synthesis". Setup Install the following Python packages in your Python environment - numpy (1.1

Intelligent Systems Lab Org 195 Dec 24, 2022
Code of the paper "Multi-Task Meta-Learning Modification with Stochastic Approximation".

Multi-Task Meta-Learning Modification with Stochastic Approximation This repository contains the code for the paper "Multi-Task Meta-Learning Modifica

Andrew 3 Jan 05, 2022
Official codebase for ICLR oral paper Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling

CLIORA This is the official codebase for ICLR oral paper: Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling. We introduce

Bo Wan 32 Dec 23, 2022
Official implementations of PSENet, PAN and PAN++.

News (2021/11/03) Paddle implementation of PAN, see Paddle-PANet. Thanks @simplify23. (2021/04/08) PSENet and PAN are included in MMOCR. Introduction

395 Dec 14, 2022
Awesome Monocular 3D detection

Awesome Monocular 3D detection Paper list of 3D detetction, keep updating! Contents Paper List 2022 2021 2020 2019 2018 2017 2016 KITTI Results Paper

Zhikang Zou 184 Jan 04, 2023
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

Microsoft 8.4k Jan 01, 2023
Model Quantization Benchmark

Introduction MQBench is an open-source model quantization toolkit based on PyTorch fx. The envision of MQBench is to provide: SOTA Algorithms. With MQ

500 Jan 06, 2023
Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr\"om Method (NeurIPS 2021)

Skyformer This repository is the official implementation of Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr"om Method (NeurIPS 2021).

Qi Zeng 46 Sep 20, 2022
《LightXML: Transformer with dynamic negative sampling for High-Performance Extreme Multi-label Text Classification》(AAAI 2021) GitHub:

LightXML: Transformer with dynamic negative sampling for High-Performance Extreme Multi-label Text Classification

76 Dec 05, 2022
ML-Decoder: Scalable and Versatile Classification Head

ML-Decoder: Scalable and Versatile Classification Head Paper Official PyTorch Implementation Tal Ridnik, Gilad Sharir, Avi Ben-Cohen, Emanuel Ben-Baru

189 Jan 04, 2023
Official Pytorch implementation of C3-GAN

Official pytorch implemenation of C3-GAN Contrastive Fine-grained Class Clustering via Generative Adversarial Networks [Paper] Authors: Yunji Kim, Jun

NAVER AI 114 Dec 02, 2022
Optical Character Recognition + Instance Segmentation for russian and english languages

Распознавание рукописного текста в школьных тетрадях Соревнование, проводимое в рамках олимпиады НТО, разработанное Сбером. Платформа ODS. Результаты

Gerasimov Maxim 21 Dec 19, 2022
This repository contains the code used in the paper "Prompt-Based Multi-Modal Image Segmentation".

Prompt-Based Multi-Modal Image Segmentation This repository contains the code used in the paper "Prompt-Based Multi-Modal Image Segmentation". The sys

Timo Lüddecke 305 Dec 30, 2022
UltraGCN: An Ultra Simplification of Graph Convolutional Networks for Recommendation

UltraGCN This is our Pytorch implementation for our CIKM 2021 paper: Kelong Mao, Jieming Zhu, Xi Xiao, Biao Lu, Zhaowei Wang, Xiuqiang He. UltraGCN: A

XUEPAI 93 Jan 03, 2023
Self-supervised Label Augmentation via Input Transformations (ICML 2020)

Self-supervised Label Augmentation via Input Transformations Authors: Hankook Lee, Sung Ju Hwang, Jinwoo Shin (KAIST) Accepted to ICML 2020 Install de

hankook 96 Dec 29, 2022