[NeurIPS-2021] Slow Learning and Fast Inference: Efficient Graph Similarity Computation via Knowledge Distillation

Overview

Efficient Graph Similarity Computation - (EGSC)

This repo contains the source code and dataset for our paper:

Slow Learning and Fast Inference: Efficient Graph Similarity Computation via Knowledge Distillation
2021 Conference on Neural Information Processing Systems (NeurIPS 2021)
[Paper]

@inproceedings{qin2021slow,
              title={Slow Learning and Fast Inference: Efficient Graph Similarity Computation via Knowledge Distillation},
              author={Qin, Can and Zhao, Handong and Wang, Lichen and Wang, Huan and Zhang, Yulun and Fu, Yun},
              booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
              year={2021}
            }
    

EGSC Illustration of knowledge distillation to achieve a fast model given a early-fusion model. Such the fast/individual model enables the online inference.

Introduction

Graph Similarity Computation (GSC) is essential to wide-ranging graph appli- cations such as retrieval, plagiarism/anomaly detection, etc. The exact computation of graph similarity, e.g., Graph Edit Distance (GED), is an NP-hard problem that cannot be exactly solved within an adequate time given large graphs. Thanks to the strong representation power of graph neural network (GNN), a variety of GNN-based inexact methods emerged. To capture the subtle difference across graphs, the key success is designing the dense interaction with features fusion at the early stage, which, however, is a trade-off between speed and accuracy. For Slow Learning of graph similarity, this paper proposes a novel early-fusion approach by designing a co-attention-based feature fusion network on multilevel GNN features. To further improve the speed without much accuracy drop, we introduce an efficient GSC solution by distilling the knowledge from the slow early-fusion model to the student one for Fast Inference. Such a student model also enables the offline collection of individual graph embeddings, speeding up the inference time in orders. To address the instability through knowledge transfer, we decompose the dynamic joint embedding into the static pseudo individual ones for precise teacher-student alignment. The experimental analysis on the real-world datasets demonstrates the superiority of our approach over the state-of-the-art methods on both accuracy and efficiency. Particularly, we speed up the prior art by more than 10x on the benchmark AIDS data.

Dataset

We have used the standard dataloader, i.e., ‘GEDDataset’, directly provided in the PyG, whose downloading link can be referred below.

AIDS700nef

LINUX

ALKANE

IMDBMulti

The code takes pairs of graphs for training from an input folder where each pair of graph is stored as a JSON. Pairs of graphs used for testing are also stored as JSON files. Every node id and node label has to be indexed from 0. Keys of dictionaries are stored strings in order to make JSON serialization possible.

Every JSON file has the following key-value structure:

{"graph_1": [[0, 1], [1, 2], [2, 3], [3, 4]],
 "graph_2":  [[0, 1], [1, 2], [1, 3], [3, 4], [2, 4]],
 "labels_1": [2, 2, 2, 2],
 "labels_2": [2, 3, 2, 2, 2],
 "ged": 1}

The **graph_1** and **graph_2** keys have edge list values which descibe the connectivity structure. Similarly, the **labels_1** and **labels_2** keys have labels for each node which are stored as list - positions in the list correspond to node identifiers. The **ged** key has an integer value which is the raw graph edit distance for the pair of graphs.

Requirements

The codebase is implemented in Python 3.6.12. package versions used for development are just below.

matplotlib        3.3.4
networkx          2.4
numpy             1.19.5
pandas            1.1.2
scikit-learn      0.23.2
scipy             1.4.1
texttable         1.6.3
torch             1.6.0
torch-cluster     1.5.9
torch-geometric   1.7.0
torch-scatter     2.0.6
torch-sparse      0.6.9
tqdm              4.60.0

The installation of pytorch-geometric (PyG) please refers to its official tutorial.

File Structure

.
├── README.md
├── LICENSE                            
├── EGSC-T
│   ├── src
│   │    ├── egsc.py 
│   │    ├── layers.py
│   │    ├── main.py
│   │    ├── parser.py        
│   │    └── utils.py                             
│   ├── README.md                      
│   └── train.sh                        
└── GSC_datasets
    ├── AIDS700nef
    ├── ALKANE
    ├── IMDBMulti
    └── LINUX

To Do

- [x] GED Datasets Processing
- [x] Teacher Model Training
- [ ] Student Model Training
- [ ] Knowledge Distillation
- [ ] Online Inference

The remaining implementations are coming soon.

Acknowledgement

We would like to thank the SimGNN and Extended-SimGNN which we used for this implementation.

Owner
PhD student in Northeastern University, Boston, USA
SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data (AAAI 2021)

SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data (AAAI 2021) PyTorch implementation of SnapMix | paper Method Overview Cite

DavidHuang 126 Dec 30, 2022
Dataset and Source code of paper 'Enhancing Keyphrase Extraction from Academic Articles with their Reference Information'.

Enhancing Keyphrase Extraction from Academic Articles with their Reference Information Overview Dataset and code for paper "Enhancing Keyphrase Extrac

15 Nov 24, 2022
Transformer part of 12th place solution in Riiid! Answer Correctness Prediction

kaggle_riiid Transformer part of 12th place solution in Riiid! Answer Correctness Prediction. Please see here for more information. Execution You need

Sakami Kosuke 2 Apr 23, 2022
Code and models for ICCV2021 paper "Robust Object Detection via Instance-Level Temporal Cycle Confusion".

Robust Object Detection via Instance-Level Temporal Cycle Confusion This repo contains the implementation of the ICCV 2021 paper, Robust Object Detect

Xin Wang 69 Oct 13, 2022
Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.

Faster R-CNN and Mask R-CNN in PyTorch 1.0 maskrcnn-benchmark has been deprecated. Please see detectron2, which includes implementations for all model

Facebook Research 9k Jan 04, 2023
PyTorch implementation of "A Simple Baseline for Low-Budget Active Learning".

A Simple Baseline for Low-Budget Active Learning This repository is the implementation of A Simple Baseline for Low-Budget Active Learning. In this pa

10 Nov 14, 2022
A ssl analyzer which could analyzer target domain's certificate.

ssl_analyzer A ssl analyzer which could analyzer target domain's certificate. Analyze the domain name ssl certificate information according to the inp

vincent 17 Dec 12, 2022
The offcial repository for 'CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos', SIGIR2022

CharacterBERT-DR The offcial repository for CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos, Sh

ielab 11 Nov 15, 2022
StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks

StackGAN Pytorch implementation Inception score evaluation StackGAN-v2-pytorch Tensorflow implementation for reproducing main results in the paper Sta

Han Zhang 1.8k Dec 21, 2022
CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images

Code and result about CCAFNet(IEEE TMM) 'CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images' IEE

zyrant丶 14 Dec 29, 2021
Gated-Shape CNN for Semantic Segmentation (ICCV 2019)

GSCNN This is the official code for: Gated-SCNN: Gated Shape CNNs for Semantic Segmentation Towaki Takikawa, David Acuna, Varun Jampani, Sanja Fidler

859 Dec 26, 2022
SPCL: A New Framework for Domain Adaptive Semantic Segmentation via Semantic Prototype-based Contrastive Learning

SPCL SPCL: A New Framework for Domain Adaptive Semantic Segmentation via Semantic Prototype-based Contrastive Learning Update on 2021/11/25: ArXiv Ver

Binhui Xie (谢斌辉) 11 Oct 29, 2022
Code for “ACE-HGNN: Adaptive Curvature ExplorationHyperbolic Graph Neural Network”

ACE-HGNN: Adaptive Curvature Exploration Hyperbolic Graph Neural Network This repository is the implementation of ACE-HGNN in PyTorch. Environment pyt

9 Nov 28, 2022
Code for CVPR 2021 oral paper "Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts"

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts The rapid progress in 3D scene understanding has come with growing dem

Facebook Research 182 Dec 30, 2022
📚 A collection of Jupyter notebooks for learning and experimenting with OpenVINO 👓

A collection of ready-to-run Python* notebooks for learning and experimenting with OpenVINO developer tools. The notebooks are meant to provide an introduction to OpenVINO basics and teach developers

OpenVINO Toolkit 840 Jan 03, 2023
Diffusion Probabilistic Models for 3D Point Cloud Generation (CVPR 2021)

Diffusion Probabilistic Models for 3D Point Cloud Generation [Paper] [Code] The official code repository for our CVPR 2021 paper "Diffusion Probabilis

Shitong Luo 323 Jan 05, 2023
Official PyTorch Implementation of paper "Deep 3D Mask Volume for View Synthesis of Dynamic Scenes", ICCV 2021.

Deep 3D Mask Volume for View Synthesis of Dynamic Scenes Official PyTorch Implementation of paper "Deep 3D Mask Volume for View Synthesis of Dynamic S

Ken Lin 17 Oct 12, 2022
KoRean based ELECTRA pre-trained models (KR-ELECTRA) for Tensorflow and PyTorch

KoRean based ELECTRA (KR-ELECTRA) This is a release of a Korean-specific ELECTRA model with comparable or better performances developed by the Computa

12 Jun 03, 2022
PConv-Keras - Unofficial implementation of "Image Inpainting for Irregular Holes Using Partial Convolutions". Try at: www.fixmyphoto.ai

Partial Convolutions for Image Inpainting using Keras Keras implementation of "Image Inpainting for Irregular Holes Using Partial Convolutions", https

Mathias Gruber 871 Jan 05, 2023
Code for the paper "Training GANs with Stronger Augmentations via Contrastive Discriminator" (ICLR 2021)

Training GANs with Stronger Augmentations via Contrastive Discriminator (ICLR 2021) This repository contains the code for reproducing the paper: Train

Jongheon Jeong 174 Dec 29, 2022