A Broader Picture of Random-walk Based Graph Embedding

Overview

Random-walk Embedding Framework

This repository is a reference implementation of the random-walk embedding framework as described in the paper:

A Broader Picture of Random-walk Based Graph Embedding.
Zexi Huang, Arlei Silva, Ambuj Singh.
ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2021.

The framework decomposes random-walk based graph embedding into three major components: random-walk process, similarity function, and embedding algorithm. By tuning the components, it not only covers many existing approaches such as DeepWalk but naturally motivates novel ones that have shown superior performance on certain downstream tasks.

Usage

Example

To use the framework with default settings to embed the BlogCatalog network:
python src/embedding.py --graph graph/blogcatalog.edges --embeddings emb/blogcatalog.embeddings
where graph/blogcatalog.edges stores the input graph and emb/blogcatalog.embeddings is the target file for output embeddings.

Options

You can check out all the available options (framework components, Markov time parameters, graph types, etc.) with:
python src/embedding.py --help

Input Graph

The supported input graph format is a list of edges:

node1_id_int node2_id_int <weight_float, optional>

where node ids are should be consecutive integers starting from 1. The graph is by default undirected and unweighted, which can be changed by setting appropriate flags.

Output Embeddings

The output embedding file has n lines where n is the number of nodes in the graph. Each line stores the learned embedding of the node with its id equal to the line number:

emb_dim1 emb_dim2 ... emb_dimd

Evaluating

Here, we show by examples how to evaluate and compare different settings of our framework on node classification, link prediction, and community detection tasks. Full evaluation options are can be found with:
python src/evaluating.py --help

Note that the results shown below may not be identical to those in the paper due to different random seeds, but the conclusions are the same.

Node Classification

Once we generate the embedding with the script in previous section, we can call
python src/evaluating.py --task node-classification --embeddings emb/blogcatalog.embeddings --training-ratio 0.5
to compute the Micro-F1 and Macro-F1 scores of the node classification.

The results for comparing Pointwise Mutual Information (PMI) and Autocovariance (AC) similarity metrics with the best Markov times and varying training ratios are as follows:

Training Ratio 10% 20% 30% 40% 50% 60% 70% 80% 90%
PMI Micro-F1 0.3503 0.3814 0.3993 0.4106 0.4179 0.4227 0.4255 0.4222 0.4228
(time=4) Macro-F1 0.2212 0.2451 0.2575 0.2669 0.2713 0.2772 0.2768 0.2689 0.2678
AC Micro-F1 0.3547 0.3697 0.3785 0.3837 0.3872 0.3906 0.3912 0.3927 0.3930
(time=5) Macro-F1 0.2137 0.2299 0.2371 0.2406 0.2405 0.2413 0.2385 0.2356 0.2352

Link Prediction

Prepare

To evaluate the embedding method on link prediction, we first have to remove a ratio of edges in the original graph:
python src/evaluating.py --task link-prediction --mode prepare --graph graph/blogcatalog.edges --remaining-edges graph/blogcatalog.remaining-edges --removed-edges graph/blogcatalog.removed-edges

This takes the original graph graph/blogcatalog.edges as input and output the removed and remaining edges to graph/blogcatalog.removed-edges and graph/blogcatalog.remaining-edges.

Embed

Then, we embed based on the remaining edges of the network with the embedding script. For example:
python src/embedding.py --graph graph/blogcatalog.remaining-edges --embeddings emb/blogcatalog.residual-embeddings

Evaluate

Finally, we evaluate the performance of link prediction in terms of [email protected] based on the embeddings of the residual graph and the removed edges:
python src/evaluating.py --task link-prediction --mode evaluate --embeddings emb/blogcatalog.residual-embeddings --remaining-edges graph/blogcatalog.remaining-edges --removed-edges graph/blogcatalog.removed-edges --k 1.0

The results for comparing PMI and autocovariance similarity metrics with the best Markov times and varying k are as follows:

k 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
PMI (time=1) 0.2958 0.2380 0.2068 0.1847 0.1678 0.1560 0.1464 0.1382 0.1315 0.1260
AC (time=3) 0.4213 0.3420 0.2982 0.2667 0.2434 0.2253 0.2112 0.2000 0.1893 0.1802

Community Detection

Assume the embeddings for the Airport network emb/airport.embeddings have been generated. The following computes the Normalized Mutual Information (NMI) between the ground-truth country communities and the k-means clustering of embeddings:
python src/evaluating.py --task community-detection --embeddings emb/airport.embeddings --communities graph/airport.country-labels

Citing

If you find our framework useful, please consider citing the following paper:

@inproceedings{random-walk-embedding,
author = {Huang, Zexi and Silva, Arlei and Singh, Ambuj},
 title = {A Broader Picture of Random-walk Based Graph Embedding},
 booktitle = {SIGKDD},
 year = {2021}
}
Owner
Zexi Huang
Zexi Huang
Unofficial PyTorch Implementation of AHDRNet (CVPR 2019)

AHDRNet-PyTorch This is the PyTorch implementation of Attention-guided Network for Ghost-free High Dynamic Range Imaging (CVPR 2019). The official cod

Yutong Zhang 4 Sep 08, 2022
OpenMMLab Text Detection, Recognition and Understanding Toolbox

Introduction English | 简体中文 MMOCR is an open-source toolbox based on PyTorch and mmdetection for text detection, text recognition, and the correspondi

OpenMMLab 3k Jan 07, 2023
Decentralized Reinforcment Learning: Global Decision-Making via Local Economic Transactions (ICML 2020)

Decentralized Reinforcement Learning This is the code complementing the paper Decentralized Reinforcment Learning: Global Decision-Making via Local Ec

40 Oct 30, 2022
a reimplementation of Optical Flow Estimation using a Spatial Pyramid Network in PyTorch

pytorch-spynet This is a personal reimplementation of SPyNet [1] using PyTorch. Should you be making use of this work, please cite the paper according

Simon Niklaus 269 Jan 02, 2023
Implementation of the 😇 Attention layer from the paper, Scaling Local Self-Attention For Parameter Efficient Visual Backbones

HaloNet - Pytorch Implementation of the Attention layer from the paper, Scaling Local Self-Attention For Parameter Efficient Visual Backbones. This re

Phil Wang 189 Nov 22, 2022
Simple sinc interpolation in PyTorch.

Kazane: simple sinc interpolation for 1D signal in PyTorch Kazane utilize FFT based convolution to provide fast sinc interpolation for 1D signal when

Chin-Yun Yu 10 May 03, 2022
Auditing Black-Box Prediction Models for Data Minimization Compliance

Data-Minimization-Auditor An auditing tool for model-instability based data minimization that is introduced in "Auditing Black-Box Prediction Models f

Bashir Rastegarpanah 2 Mar 24, 2022
NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework

NLP From Scratch Without Large-Scale Pretraining This repository contains the code, pre-trained model checkpoints and curated datasets for our paper:

Xingcheng Yao 224 Dec 08, 2022
《LXMERT: Learning Cross-Modality Encoder Representations from Transformers》(EMNLP 2020)

The Most Important Thing. Our code is developed based on: LXMERT: Learning Cross-Modality Encoder Representations from Transformers

53 Dec 16, 2022
PyTorch implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"

DiscoGAN in PyTorch PyTorch implementation of Learning to Discover Cross-Domain Relations with Generative Adversarial Networks. * All samples in READM

Taehoon Kim 1k Jan 04, 2023
A trusty face recognition research platform developed by Tencent Youtu Lab

Introduction TFace: A trusty face recognition research platform developed by Tencent Youtu Lab. It provides a high-performance distributed training fr

Tencent 956 Jan 01, 2023
SEOVER: Sentence-level Emotion Orientation Vector based Conversation Emotion Recognition Model

SEOVER-Master This code is the implementation of paper: SEOVER: Sentence-level Emotion Orientation Vector based Conversation Emotion Recognition Model

4 Feb 24, 2022
Pytorch implementation of the paper: "SAPNet: Segmentation-Aware Progressive Network for Perceptual Contrastive Image Deraining"

SAPNet This repository contains the official Pytorch implementation of the paper: "SAPNet: Segmentation-Aware Progressive Network for Perceptual Contr

11 Oct 17, 2022
An implementation of the 1. Parallel, 2. Streaming, 3. Randomized SVD using MPI4Py

PYPARSVD This implementation allows for a singular value decomposition which is: Distributed using MPI4Py Streaming - data can be shown in batches to

Romit Maulik 44 Dec 31, 2022
Pytorch implementation of Masked Auto-Encoder

Masked Auto-Encoder (MAE) Pytorch implementation of Masked Auto-Encoder: Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross Girshick

Jiyuan 22 Dec 13, 2022
Pytorch GUI(demo) for iVOS(interactive VOS) and GIS (Guided iVOS)

GUI for iVOS(interactive VOS) and GIS (Guided iVOS) GUI Implementation of CVPR2021 paper "Guided Interactive Video Object Segmentation Using Reliabili

Yuk Heo 13 Dec 09, 2022
Code from Daniel Lemire, A Better Alternative to Piecewise Linear Time Series Segmentation

PiecewiseLinearTimeSeriesApproximation code from Daniel Lemire, A Better Alternative to Piecewise Linear Time Series Segmentation, SIAM Data Mining 20

Daniel Lemire 21 Oct 27, 2022
Official public repository of paper "Intention Adaptive Graph Neural Network for Category-Aware Session-Based Recommendation"

Intention Adaptive Graph Neural Network (IAGNN) This is the official repository of paper Intention Adaptive Graph Neural Network for Category-Aware Se

9 Nov 22, 2022
Embracing Single Stride 3D Object Detector with Sparse Transformer

SST: Single-stride Sparse Transformer This is the official implementation of paper: Embracing Single Stride 3D Object Detector with Sparse Transformer

TuSimple 385 Dec 28, 2022
学习 python3 以来写的一些垃圾玩具……

和东哥做兄弟 Author: chiupam 版权 未经本人同意,仓库内所有资源文件,禁止任何公众号、自媒体、开发者进行任何形式的转载、发布、搬运。 声明 这不是一个开源项目,只是把 GitHub 当作一个代码的存储空间,本项目不接受任何开源要求。 仅用于学习研究,禁止用于商业用途,不能保证其合法性

Chiupam 67 Mar 26, 2022