Boost learning for GNNs from the graph structure under challenging heterophily settings. (NeurIPS'20)

Overview

Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs

Jiong Zhu, Yujun Yan, Lingxiao Zhao, Mark Heimann, Leman Akoglu, and Danai Koutra. 2020. Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs. Advances in Neural Information Processing Systems 33 (2020).

[Paper] [Poster] [Slides]

Requirements

Basic Requirements

  • Python >= 3.7 (tested on 3.8)

  • signac: this package utilizes signac to manage experiment data and jobs. signac can be installed with the following command:

    pip install signac==1.1 signac-flow==0.7.1 signac-dashboard

    Note that the latest version of signac may cause incompatibility.

  • numpy (tested on 1.18.5)

  • scipy (tested on 1.5.0)

  • networkx >= 2.4 (tested on 2.4)

  • scikit-learn (tested on 0.23.2)

For H2GCN

  • TensorFlow >= 2.0 (tested on 2.2)

Note that it is possible to use H2GCN without signac and scikit-learn on your own data and experimental framework.

For baselines

We also include the code for the baseline methods in the repository. These code are mostly the same as the reference implementations provided by the authors, with our modifications to add JK-connections, interoperability with our experimental pipeline, etc. For the requirements to run these baselines, please refer to the instructions provided by the original authors of the corresponding code, which could be found in each folder under /baselines.

As a general note, TensorFlow 1.15 can be used for all code requiring TensorFlow 1.x; for PyTorch, it is usually fine to use PyTorch 1.6; all code should be able to run under Python >= 3.7. In addition, the basic requirements must also be met.

Usage

Download Datasets

The datasets can be downloaded using the bash scripts provided in /experiments/h2gcn/scripts, which also prepare the datasets for use in our experimental framework based on signac.

We make use of signac to index and manage the datasets: the datasets and experiments are stored in hierarchically organized signac jobs, with the 1st level storing different graphs, 2nd level storing different sets of features, and 3rd level storing different training-validation-test splits. Each level contains its own state points and job documents to differentiate with other jobs.

Use signac schema to list all available properties in graph state points; use signac find to filter graphs using properties in the state points:

cd experiments/h2gcn/

# List available properties in graph state points
signac schema

# Find graphs in syn-products with homophily level h=0.1
signac find numNode 10000 h 0.1

# Find real benchmark "Cora"
signac find benchmark true datasetName\.\$regex "cora"

/experiments/h2gcn/utils/signac_tools.py provides helpful functions to iterate through the data space in Python; more usages of signac can be found in these documents.

Replicate Experiments with signac

  • To replicate our experiments of each model on specific datasets, use Python scripts in /experiments/h2gcn, and the corresponding JSON config files in /experiments/h2gcn/configs. For example, to run H2GCN on our synthetic benchmarks syn-cora:

    cd experiments/h2gcn/
    python run_hgcn_experiments.py -c configs/syn-cora/h2gcn.json [-i] run [-p PARALLEL_NUM]
    • Files and results generated in experiments are also stored with signac on top of the hierarchical order introduced above: the 4th level separates different models, and the 5th level stores files and results generated in different runs with different parameters of the same model.

    • By default, stdout and stderr of each model are stored in terminal_output.log in the 4th level; use -i if you want to see them through your terminal.

    • Use -p if you want to run experiments in parallel on multiple graphs (1st level).

    • Baseline models can be run through the following scripts:

      • GCN, GCN-Cheby, GCN+JK and GCN-Cheby+JK: run_gcn_experiments.py
      • GraphSAGE, GraphSAGE+JK: run_graphsage_experiments.py
      • MixHop: run_mixhop_experiments.py
      • GAT: run_gat_experiments.py
      • MLP: run_hgcn_experiments.py
  • To summarize experiment results of each model on specific datasets to a CSV file, use Python script /experiments/h2gcn/run_experiments_summarization.py with the corresponding model name and config file. For example, to summarize H2GCN results on our synthetic benchmark syn-cora:

    cd experiments/h2gcn/
    python run_experiments_summarization.py h2gcn -f configs/syn-cora/h2gcn.json
  • To list all paths of the 3rd level datasets splits used in a experiment (in planetoid format) without running experiments, use the following command:

    cd experiments/h2gcn/
    python run_hgcn_experiments.py -c configs/syn-cora/h2gcn.json --check_paths run

Standalone H2GCN Package

Our implementation of H2GCN is stored in the h2gcn folder, which can be used as a standalone package on your own data and experimental framework.

Example usages:

  • H2GCN-2

    cd h2gcn
    python run_experiments.py H2GCN planetoid \
      --dataset ind.citeseer \
      --dataset_path ../baselines/gcn/gcn/data/
  • H2GCN-1

    cd h2gcn
    python run_experiments.py H2GCN planetoid \
      --network_setup M64-R-T1-G-V-C1-D0.5-MO \
      --dataset ind.citeseer \
      --dataset_path ../baselines/gcn/gcn/data/
  • Use --help for more advanced usages:

    python run_experiments.py H2GCN planetoid --help

We only support datasets stored in planetoid format. You could also add support to different data formats and models beyond H2GCN by adding your own modules to /h2gcn/datasets and /h2gcn/models, respectively; check out ou code for more details.

Contact

Please contact Jiong Zhu ([email protected]) in case you have any questions.

Citation

Please cite our paper if you make use of this code in your own work:

@article{zhu2020beyond,
  title={Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs},
  author={Zhu, Jiong and Yan, Yujun and Zhao, Lingxiao and Heimann, Mark and Akoglu, Leman and Koutra, Danai},
  journal={Advances in Neural Information Processing Systems},
  volume={33},
  year={2020}
}
Owner
GEMS Lab: Graph Exploration & Mining at Scale, University of Michigan
Code repository for work by the GEMS Lab: https://gemslab.github.io/research/
GEMS Lab: Graph Exploration & Mining at Scale, University of Michigan
YOLOv7 - Framework Beyond Detection

🔥🔥🔥🔥 YOLO with Transformers and Instance Segmentation, with TensorRT acceleration! 🔥🔥🔥

JinTian 3k Jan 01, 2023
Residual Pathway Priors for Soft Equivariance Constraints

Residual Pathway Priors for Soft Equivariance Constraints This repo contains the implementation and the experiments for the paper Residual Pathway Pri

Marc Finzi 13 Oct 12, 2022
[ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators

AMOS This repository contains the scripts for fine-tuning AMOS pretrained models on GLUE and SQuAD 2.0 benchmarks. Paper: Pretraining Text Encoders wi

Microsoft 22 Sep 15, 2022
Decensoring Hentai with Deep Neural Networks. Formerly named DeepMindBreak.

DeepCreamPy Decensoring Hentai with Deep Neural Networks. Formerly named DeepMindBreak. A deep learning-based tool to automatically replace censored a

616 Jan 06, 2023
An addon uses SMPL's poses and global translation to drive cartoon character in Blender.

Blender addon for driving character The addon drives the cartoon character by passing SMPL's poses and global translation into model's armature in Ble

犹在镜中 153 Dec 14, 2022
[ICCV 2021] Deep Hough Voting for Robust Global Registration

Deep Hough Voting for Robust Global Registration, ICCV, 2021 Project Page | Paper | Video Deep Hough Voting for Robust Global Registration Junha Lee1,

Junha Lee 10 Dec 02, 2022
Pixel-level Crack Detection From Images Of Levee Systems : A Comparative Study

PIXEL-LEVEL CRACK DETECTION FROM IMAGES OF LEVEE SYSTEMS : A COMPARATIVE STUDY G

Manisha Panta 2 Jul 23, 2022
Image-Scaling Attacks and Defenses

Image-Scaling Attacks & Defenses This repository belongs to our publication: Erwin Quiring, David Klein, Daniel Arp, Martin Johns and Konrad Rieck. Ad

Erwin Quiring 163 Nov 21, 2022
Using Random Effects to Account for High-Cardinality Categorical Features and Repeated Measures in Deep Neural Networks

LMMNN Using Random Effects to Account for High-Cardinality Categorical Features and Repeated Measures in Deep Neural Networks This is the working dire

Giora Simchoni 10 Nov 02, 2022
A Benchmark For Measuring Systematic Generalization of Multi-Hierarchical Reasoning

Orchard Dataset This repository contains the code used for generating the Orchard Dataset, as seen in the Multi-Hierarchical Reasoning in Sequences: S

Bill Pung 1 Jun 05, 2022
Mercury: easily convert Python notebook to web app and share with others

Mercury Share your Python notebooks with others Easily convert your Python notebooks into interactive web apps by adding parameters in YAML. Simply ad

MLJAR 2.2k Dec 27, 2022
Like ThreeJS but for Python and based on wgpu

pygfx A render engine, inspired by ThreeJS, but for Python and targeting Vulkan/Metal/DX12 (via wgpu). Introduction This is a Python render engine bui

139 Jan 07, 2023
Cleaned up code for DSTC 10: SIMMC 2.0 track: subtask 2: multimodal coreference resolution

UNITER-Based Situated Coreference Resolution with Rich Multimodal Input: arXiv MMCoref_cleaned Code for the MMCoref task of the SIMMC 2.0 dataset. Pre

Yichen (William) Huang 2 Dec 05, 2022
PyTorch/TorchScript compiler for NVIDIA GPUs using TensorRT

PyTorch/TorchScript compiler for NVIDIA GPUs using TensorRT

NVIDIA Corporation 1.8k Dec 30, 2022
PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset

PyTorch Large-Scale Language Model A Large-Scale PyTorch Language Model trained on the 1-Billion Word (LM1B) / (GBW) dataset Latest Results 39.98 Perp

Ryan Spring 114 Nov 04, 2022
MetaBalance: Improving Multi-Task Recommendations via Adapting Gradient Magnitudes of Auxiliary Tasks

MetaBalance: Improving Multi-Task Recommendations via Adapting Gradient Magnitudes of Auxiliary Tasks Introduction This repo contains the pytorch impl

Meta Research 38 Oct 10, 2022
💛 Code and Dataset for our EMNLP 2021 paper: "Perspective-taking and Pragmatics for Generating Empathetic Responses Focused on Emotion Causes"

Perspective-taking and Pragmatics for Generating Empathetic Responses Focused on Emotion Causes Official PyTorch implementation and EmoCause evaluatio

Hyunwoo Kim 51 Jan 06, 2023
Spectral normalization (SN) is a widely-used technique for improving the stability and sample quality of Generative Adversarial Networks (GANs)

Why Spectral Normalization Stabilizes GANs: Analysis and Improvements [paper (NeurIPS 2021)] [paper (arXiv)] [code] Authors: Zinan Lin, Vyas Sekar, Gi

Zinan Lin 32 Dec 16, 2022
Sibur challange 2021 competition - 6 place

sibur challange 2021 Решение на 6 место: https://sibur.ai-community.com/competitions/5/tasks/13 Скор 1.4066/1.4159 public/private. Архитектура - однос

Ivan 5 Jan 11, 2022
PyTorch implementation of the implicit Q-learning algorithm (IQL)

Implicit-Q-Learning (IQL) PyTorch implementation of the implicit Q-learning algorithm IQL (Paper) Currently only implemented for online learning. Offl

Sebastian Dittert 27 Dec 30, 2022