MPLP: Metapath-Based Label Propagation for Heterogenous Graphs

Related tags

Deep Learningmplp
Overview

MPLP: Metapath-Based Label Propagation for Heterogenous Graphs

Results on MAG240M

Here, we demonstrate the following performance on the MAG240M dataset from [email protected] 2021.

Model Test Acc Validation Acc Parameters Hardware
Our Model 0.7447 0.7669 ± 0.0003 (ensemble 0.7696) 743,449 Tesla V100 (21GB)

Reproducing results

0. Requirements

Here just list python3 packages we used in this competition:

numpy==1.19.2
torch==1.5.1+cu101
dgl-cu101==0.6.0.post1
ogb==1.3.1
sklearn==0.23.2
tqdm==4.46.1

1. Prepare Graph and Features

The preprocess code modifed from dgl baseline. We created graph with 6 different edge types instead of 5.

# Time cost: 3hours,30mins

python3 $MAG_CODE_PATH/preprocess.py
        --rootdir $MAG_INPUT_PATH \
        --author-output-path $MAG_PREP_PATH/author.npy \
        --inst-output-path $MAG_PREP_PATH/inst.npy \
        --graph-output-path $MAG_PREP_PATH \
        --graph-as-homogeneous \
        --full-output-path $MAG_PREP_PATH/full_feat.npy

The graphs and features will be saved in MAG_PREP_PATH , where the MAG_PREP_PATH is specified in run.sh.

Calculate features

The meta-path based features are generated by this script. Details can be found in our technical report.

# Time cost: 2hours,20mins (only generate label related features)

python3 $MAG_CODE_PATH/feature.py
        $MAG_INPUT_PATH \
        $MAG_PREP_PATH/dgl_graph_full_heterogeneous_csr.bin \
        $MAG_FEAT_PATH \
        --seed=42

Train RGAT model and prepare RGAT features

The RGAT model is modifed from dgl baseline. The validation accuracy is 0.701 , as same as described in the dgl baseline github.

# Time cost: 33hours,40mins (20mins for each epoch)

python3 $MAG_CODE_PATH/rgat.py
        --rootdir $MAG_INPUT_PATH \
        --graph-path $MAG_PREP_PATH/dgl_graph_full_homogeneous_csc.bin \
        --full-feature-path $MAG_PREP_PATH/full_feat.npy \
        --output-path $MAG_RGAT_PATH/ \
        --epochs=100 \
        --model-path $MAG_RGAT_PATH/model.pt \
        --submission-path $MAG_RGAT_PATH/

You will get embeddings as input features of the following MPLP models.

2. Train MPLP models

The train process splits to two steps:

  1. train the model with full train samples at a large learning rate (here we use StepLR(lr=0.01, step_size=100, gamma=0.25))
  2. then fine tune the model with latest train samples (eg, paper with year >= 2018) with a small learning rate (0.000625)

You can train the MPLP model by running the following commands:

# Time cost: 2hours,40mins for each seed

for seed in $(seq 0 7);
do
    python3 $MAG_CODE_PATH/mplp.py \
            $MAG_INPUT_PATH \
            $MAG_MPLP_PATH/data/ \
            $MAG_MPLP_PATH/output/seed${seed} \
            --gpu \
            --seed=${seed} \
            --batch_size=10240 \
            --epochs=200 \
            --num_layers=2 \
            --learning_rate=0.01 \
            --dropout=0.5 \
            --num_splits=5
done

3. Ensemble MPLP results

While having all the results with k-fold cross validation training under 8 different seeds, you can average the results by running code below:

python3 $MAG_CODE_PATH/ensemble.py $MAG_MPLP_PATH/output/ $MAG_SUBM_PATH
Owner
Qiuying Peng
Astrophysics -> Data Science
Qiuying Peng
Computing Shapley values using VAEAC

Shapley values and the VAEAC method In this GitHub repository, we present the implementation of the VAEAC approach from our paper "Using Shapley Value

3 Nov 23, 2022
Code release for NeX: Real-time View Synthesis with Neural Basis Expansion

NeX: Real-time View Synthesis with Neural Basis Expansion Project Page | Video | Paper | COLAB | Shiny Dataset We present NeX, a new approach to novel

538 Jan 09, 2023
CV backbones including GhostNet, TinyNet and TNT, developed by Huawei Noah's Ark Lab.

CV Backbones including GhostNet, TinyNet, TNT (Transformer in Transformer) developed by Huawei Noah's Ark Lab. GhostNet Code TinyNet Code TNT Code Pyr

HUAWEI Noah's Ark Lab 3k Jan 08, 2023
Generative Models for Graph-Based Protein Design

Graph-Based Protein Design This repo contains code for Generative Models for Graph-Based Protein Design by John Ingraham, Vikas Garg, Regina Barzilay

John Ingraham 159 Dec 15, 2022
Voice Conversion by CycleGAN (语音克隆/语音转换):CycleGAN-VC3

CycleGAN-VC3-PyTorch 中文说明 | English This code is a PyTorch implementation for paper: CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectr

Kun Ma 110 Dec 24, 2022
Reliable probability face embeddings

ProbFace, arxiv This is a demo code of training and testing [ProbFace] using Tensorflow. ProbFace is a reliable Probabilistic Face Embeddging (PFE) me

Kaen Chan 34 Dec 31, 2022
Jupyter Dock is a set of Jupyter Notebooks for performing molecular docking protocols interactively, as well as visualizing, converting file formats and analyzing the results.

Molecular Docking integrated in Jupyter Notebooks Description | Citation | Installation | Examples | Limitations | License Table of content Descriptio

Angel J. Ruiz Moreno 173 Dec 25, 2022
[ICML 2022] The official implementation of Graph Stochastic Attention (GSAT).

Graph Stochastic Attention (GSAT) The official implementation of GSAT for our paper: Interpretable and Generalizable Graph Learning via Stochastic Att

85 Nov 27, 2022
[AAAI2022] Source code for our paper《Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning》

SSVC The source code for paper [Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning] samples of the

7 Oct 26, 2022
CaFM-pytorch ICCV ACCEPT Introduction of dataset VSD4K

CaFM-pytorch ICCV ACCEPT Introduction of dataset VSD4K Our dataset VSD4K includes 6 popular categories: game, sport, dance, vlog, interview and city.

96 Jul 05, 2022
The code for the NeurIPS 2021 paper "A Unified View of cGANs with and without Classifiers".

Energy-based Conditional Generative Adversarial Network (ECGAN) This is the code for the NeurIPS 2021 paper "A Unified View of cGANs with and without

sianchen 22 May 28, 2022
Natural Posterior Network: Deep Bayesian Predictive Uncertainty for Exponential Family Distributions

Natural Posterior Network This repository provides the official implementation o

Oliver Borchert 54 Dec 06, 2022
PyTorch implementation of "Representing Shape Collections with Alignment-Aware Linear Models" paper.

deep-linear-shapes PyTorch implementation of "Representing Shape Collections with Alignment-Aware Linear Models" paper. If you find this code useful i

Romain Loiseau 27 Sep 24, 2022
Real-time Object Detection for Streaming Perception, CVPR 2022

StreamYOLO Real-time Object Detection for Streaming Perception Jinrong Yang, Songtao Liu, Zeming Li, Xiaoping Li, Sun Jian Real-time Object Detection

Jinrong Yang 237 Dec 27, 2022
《DeepViT: Towards Deeper Vision Transformer》(2021)

DeepViT This repo is the official implementation of "DeepViT: Towards Deeper Vision Transformer". The repo is based on the timm library (https://githu

109 Dec 02, 2022
CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images

Code and result about CCAFNet(IEEE TMM) 'CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images' IEE

zyrant丶 14 Dec 29, 2021
Code of the lileonardo team for the 2021 Emotion and Theme Recognition in Music task of MediaEval 2021

Emotion and Theme Recognition in Music The repository contains code for the submission of the lileonardo team to the 2021 Emotion and Theme Recognitio

Vincent Bour 8 Aug 02, 2022
Class-Balanced Loss Based on Effective Number of Samples. CVPR 2019

Class-Balanced Loss Based on Effective Number of Samples Tensorflow code for the paper: Class-Balanced Loss Based on Effective Number of Samples Yin C

Yin Cui 546 Jan 08, 2023
Code of the paper "Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition"

SEW (Squeezed and Efficient Wav2vec) The repo contains the code of the paper "Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speec

ASAPP Research 67 Dec 01, 2022
A modern pure-Python library for reading PDF files

pdf A modern pure-Python library for reading PDF files. The goal is to have a modern interface to handle PDF files which is consistent with itself and

6 Apr 06, 2022