PyG (PyTorch Geometric) - A library built upon PyTorch to easily write and train Graph Neural Networks (GNNs)

Last update: Jan 08, 2023

Overview

Documentation | Paper | Colab Notebooks and Video Tutorials | External Resources | OGB Examples

PyG (PyTorch Geometric) is a library built upon PyTorch to easily write and train Graph Neural Networks (GNNs) for a wide range of applications related to structured data.

It consists of various methods for deep learning on graphs and other irregular structures, also known as geometric deep learning, from a variety of published papers. In addition, it provides easy-to-use mini-batch loaders for operating on many small graphs or on a single giant graph, multi-GPU support, distributed graph learning via Quiver, a large number of common benchmark datasets (based on simple interfaces for creating your own), the GraphGym experiment manager, and helpful transforms, both for learning on arbitrary graphs and on 3D meshes or point clouds. Click here to join our Slack community!

Library Highlights
Quick Tour for New Users
Architecture Overview
Implemented GNN Models
Installation

Library Highlights

Whether you are a machine learning researcher or first-time user of machine learning toolkits, here are some reasons to try out PyG for machine learning on graph-structured data.

Easy-to-use and unified API: All it takes is 10-20 lines of code to get started with training a GNN model (see the next section for a quick tour). PyG is PyTorch-on-the-rocks: It utilizes a tensor-centric API and keeps design principles close to vanilla PyTorch. If you are already familiar with PyTorch, utilizing PyG is straightforward.
Comprehensive and well-maintained GNN models: Most of the state-of-the-art Graph Neural Network architectures have been implemented by library developers or authors of research papers and are ready to be applied.
Great flexibility: Existing PyG models can easily be extended for conducting your own research with GNNs. Making modifications to existing models or creating new architectures is simple, thanks to its easy-to-use message passing API, and a variety of operators and utility functions.
Large-scale real-world GNN models: We focus on the needs of GNN applications in challenging real-world scenarios, and support learning on diverse types of graphs, including but not limited to: scalable GNNs for graphs with millions of nodes; dynamic GNNs for node predictions over time; heterogeneous GNNs with multiple node types and edge types.
GraphGym integration: GraphGym lets users easily reproduce GNN experiments, is able to launch and analyze thousands of different GNN configurations, and is customizable by registering new modules to a GNN learning pipeline.

Quick Tour for New Users

In this quick tour, we highlight the ease of creating and training a GNN model with only a few lines of code.

Train your own GNN model

In the first glimpse of PyG, we implement the training of a GNN for classifying papers in a citation graph. For this, we load the Cora dataset, and create a simple 2-layer GCN model using the pre-defined GCNConv:

import torch
from torch import Tensor
from torch_geometric.nn import GCNConv
from torch_geometric.datasets import Planetoid

dataset = Planetoid(root='.', name='Cora')

class GCN(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels):
        super().__init__()
        self.conv1 = GCNConv(in_channels, hidden_channels)
        self.conv2 = GCNConv(hidden_channels, out_channels)

    def forward(self, x: Tensor, edge_index: Tensor) -> Tensor:
        # x: Node feature matrix of shape [num_nodes, in_channels]
        # edge_index: Graph connectivity matrix of shape [2, num_edges]
        x = self.conv1(x, edge_index).relu()
        x = self.conv2(x, edge_index)
        return x

model = GCN(dataset.num_features, 16, dataset.num_classes)

We can now optimize the model in a training loop, similar to the standard PyTorch training procedure.

import torch.nn.functional as F

data = dataset[0]
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for epoch in range(200):
    pred = model(data.x, data.edge_index)
    loss = F.cross_entropy(pred[data.train_mask], data.y[data.train_mask])

    # Backpropagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

More information about evaluating final model performance can be found in the corresponding example.
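
As a rough illustration of what such an evaluation involves, the sketch below computes accuracy over a held-out mask in plain Python. The function name `masked_accuracy` and the toy values are assumptions made for illustration and are not part of PyG; with tensors, the same idea is usually expressed with an `argmax` over class scores and boolean-mask indexing (for example with `data.test_mask`).

```python
def masked_accuracy(scores, labels, mask):
    """Fraction of masked nodes whose highest-scoring class equals the label."""
    correct = 0
    total = 0
    for node_scores, label, keep in zip(scores, labels, mask):
        if not keep:
            continue  # node is outside the evaluation split
        predicted = max(range(len(node_scores)), key=node_scores.__getitem__)
        correct += int(predicted == label)
        total += 1
    return correct / total if total else 0.0


# Toy class scores for four nodes and three classes (illustrative values).
scores = [[2.0, 0.1, 0.3], [0.2, 1.5, 0.1], [0.9, 0.8, 0.1], [0.1, 0.2, 3.0]]
labels = [0, 1, 1, 2]
test_mask = [False, True, True, True]
accuracy = masked_accuracy(scores, labels, test_mask)
```

Only the three masked nodes count towards the result, so the training nodes never leak into the reported number.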

Create your own GNN layer

In addition to the easy application of existing GNNs, PyG makes it simple to implement custom Graph Neural Networks (see here for the accompanying tutorial). For example, this is all it takes to implement the edge convolutional layer from Wang et al.:

import torch
from torch import Tensor
from torch.nn import Sequential, Linear, ReLU
from torch_geometric.nn import MessagePassing

class EdgeConv(MessagePassing):
    def __init__(self, in_channels, out_channels):
        super().__init__(aggr="max")  # "Max" aggregation.
        self.mlp = Sequential(
            Linear(2 * in_channels, out_channels),
            ReLU(),
            Linear(out_channels, out_channels),
        )

    def forward(self, x: Tensor, edge_index: Tensor) -> Tensor:
        # x: Node feature matrix of shape [num_nodes, in_channels]
        # edge_index: Graph connectivity matrix of shape [2, num_edges]
        return self.propagate(edge_index, x=x)  # shape [num_nodes, out_channels]

    def message(self, x_j: Tensor, x_i: Tensor) -> Tensor:
        # x_j: Source node features of shape [num_edges, in_channels]
        # x_i: Target node features of shape [num_edges, in_channels]
        edge_features = torch.cat([x_i, x_j - x_i], dim=-1)
        return self.mlp(edge_features)  # shape [num_edges, out_channels]
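
To make the data flow of the layer above easier to follow, here is a minimal plain-Python sketch of what propagation with "max" aggregation amounts to: one message per edge, then an element-wise maximum over all messages arriving at the same target node. The helper name `edge_conv_max`, the identity message function, and the zero fill for nodes without incoming edges are assumptions of this sketch, not PyG's implementation.

```python
def edge_conv_max(x, edge_index, message_fn):
    """Aggregate per-edge messages at each target node with an element-wise max.

    x: list of node feature lists, shape [num_nodes, in_channels]
    edge_index: pair of lists (sources, targets), each of length num_edges
    message_fn: maps the edge feature [x_i, x_j - x_i] to an output list
    """
    sources, targets = edge_index
    out = [None] * len(x)
    for j, i in zip(sources, targets):
        x_i, x_j = x[i], x[j]
        # Concatenate the target features with the source-target difference.
        edge_features = x_i + [b - a for a, b in zip(x_i, x_j)]
        msg = message_fn(edge_features)
        out[i] = msg if out[i] is None else [max(a, b) for a, b in zip(out[i], msg)]
    # Nodes without incoming edges receive zeros in this sketch.
    width = next((len(o) for o in out if o is not None), 0)
    return [o if o is not None else [0.0] * width for o in out]


# Identity "MLP" so the aggregation can be checked by hand.
x = [[1.0], [3.0], [0.0]]
edge_index = ([1, 2, 0], [0, 0, 1])  # edges 1->0, 2->0, 0->1
out = edge_conv_max(x, edge_index, message_fn=lambda e: e)
```

Node 0 receives the messages `[1.0, 2.0]` and `[1.0, -1.0]` and keeps their element-wise maximum, which is the role played by `aggr="max"` in the layer above.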

Manage experiments with GraphGym

GraphGym allows you to manage and launch GNN experiments, using a highly modularized pipeline (see here for the accompanying tutorial).

git clone https://github.com/pyg-team/pytorch_geometric.git
cd pytorch_geometric/graphgym
bash run_single.sh  # run a single GNN experiment (node/edge/graph-level)
bash run_batch.sh   # run a batch of GNN experiments, using different GNN designs/datasets/tasks

Users are highly encouraged to check out the documentation, which contains additional tutorials on the essential functionalities of PyG, including data handling, creation of datasets and a full list of implemented methods, transforms, and datasets. For a quick start, check out our examples in examples/.

Architecture Overview

PyG provides a multi-layer framework that enables users to build Graph Neural Network solutions on both low and high levels. It comprises the following components:

The PyG engine utilizes the powerful PyTorch deep learning framework, as well as additions of efficient CUDA libraries for operating on sparse data, e.g., torch-scatter, torch-sparse and torch-cluster.
The PyG storage handles data processing, transformation and loading pipelines. It is capable of handling and processing large-scale graph datasets, and provides effective solutions for heterogeneous graphs. It further provides a variety of sampling solutions, which enable training of GNNs on large-scale graphs.
The PyG operators bundle essential functionalities for implementing Graph Neural Networks. PyG supports important GNN building blocks that can be combined and applied to various parts of a GNN model, ensuring rich flexibility of GNN design.
Finally, PyG provides an abundant set of GNN models, and examples that showcase GNN models on standard graph benchmarks. Thanks to its flexibility, users can easily build and modify custom GNN models to fit their specific needs.

Implemented GNN Models

We list currently supported PyG models, layers and operators according to category:

GNN layers: All Graph Neural Network layers are implemented via the nn.MessagePassing interface. A GNN layer specifies how to perform message passing, i.e. by designing different message, aggregation and update functions as defined here. These GNN layers can be stacked together to create Graph Neural Network models.

GCNConv from Kipf and Welling: Semi-Supervised Classification with Graph Convolutional Networks (ICLR 2017) [Example]
ChebConv from Defferrard et al.: Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering (NIPS 2016) [Example]
GATConv from Veličković et al.: Graph Attention Networks (ICLR 2018) [Example]

GCN2Conv from Chen et al.: Simple and Deep Graph Convolutional Networks (ICML 2020) [Example1, Example2]
SplineConv from Fey et al.: SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels (CVPR 2018) [Example1, Example2]
NNConv from Gilmer et al.: Neural Message Passing for Quantum Chemistry (ICML 2017) [Example1, Example2]
CGConv from Xie and Grossman: Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties (Physical Review Letters 120, 2018)
ECConv from Simonovsky and Komodakis: Edge-Conditioned Convolution on Graphs (CVPR 2017)
EGConv from Tailor et al.: Adaptive Filters and Aggregator Fusion for Efficient Graph Convolutions (GNNSys 2021) [Example]
GATv2Conv from Brody et al.: How Attentive are Graph Attention Networks? (CoRR 2021)
TransformerConv from Shi et al.: Masked Label Prediction: Unified Message Passing Model for Semi-Supervised Classification (CoRR 2020)
SAGEConv from Hamilton et al.: Inductive Representation Learning on Large Graphs (NIPS 2017) [Example1, Example2, Example3]
GraphConv from, e.g., Morris et al.: Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks (AAAI 2019)
GatedGraphConv from Li et al.: Gated Graph Sequence Neural Networks (ICLR 2016)
ResGatedGraphConv from Bresson and Laurent: Residual Gated Graph ConvNets (CoRR 2017)
GINConv from Xu et al.: How Powerful are Graph Neural Networks? (ICLR 2019) [Example]
GINEConv from Hu et al.: Strategies for Pre-training Graph Neural Networks (ICLR 2020)
ARMAConv from Bianchi et al.: Graph Neural Networks with Convolutional ARMA Filters (CoRR 2019) [Example]
SGConv from Wu et al.: Simplifying Graph Convolutional Networks (CoRR 2019) [Example]
APPNP from Klicpera et al.: Predict then Propagate: Graph Neural Networks meet Personalized PageRank (ICLR 2019) [Example]
MFConv from Duvenaud et al.: Convolutional Networks on Graphs for Learning Molecular Fingerprints (NIPS 2015)
AGNNConv from Thekumparampil et al.: Attention-based Graph Neural Network for Semi-Supervised Learning (CoRR 2017) [Example]
TAGConv from Du et al.: Topology Adaptive Graph Convolutional Networks (CoRR 2017) [Example]
PNAConv from Corso et al.: Principal Neighbourhood Aggregation for Graph Nets (CoRR 2020) [Example]
FAConv from Bo et al.: Beyond Low-Frequency Information in Graph Convolutional Networks (AAAI 2021)
PDNConv from Rozemberczki et al.: Pathfinder Discovery Networks for Neural Message Passing (WWW 2021)
RGCNConv from Schlichtkrull et al.: Modeling Relational Data with Graph Convolutional Networks (ESWC 2018) [Example1, Example2]
FiLMConv from Brockschmidt: GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation (ICML 2020) [Example]
SignedConv from Derr et al.: Signed Graph Convolutional Network (ICDM 2018) [Example]
DNAConv from Fey: Just Jump: Dynamic Neighborhood Aggregation in Graph Neural Networks (ICLR-W 2019) [Example]
PANConv from Ma et al.: Path Integral Based Convolution and Pooling for Graph Neural Networks (NeurIPS 2020)
PointConv (including Iterative Farthest Point Sampling, dynamic graph generation based on nearest neighbor or maximum distance, and k-NN interpolation for upsampling) from Qi et al.: PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation (CVPR 2017) and PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space (NIPS 2017) [Example1, Example2]
EdgeConv from Wang et al.: Dynamic Graph CNN for Learning on Point Clouds (CoRR, 2018) [Example1, Example2]
XConv from Li et al.: PointCNN: Convolution On X-Transformed Points (NeurIPS 2018) [Example]
PPFConv from Deng et al.: PPFNet: Global Context Aware Local Features for Robust 3D Point Matching (CVPR 2018)
GMMConv from Monti et al.: Geometric Deep Learning on Graphs and Manifolds using Mixture Model CNNs (CVPR 2017)
FeaStConv from Verma et al.: FeaStNet: Feature-Steered Graph Convolutions for 3D Shape Analysis (CVPR 2018)
PointTransformerConv from Zhao et al.: Point Transformer (2020)
HypergraphConv from Bai et al.: Hypergraph Convolution and Hypergraph Attention (CoRR 2019)
GravNetConv from Qasim et al.: Learning Representations of Irregular Particle-detector Geometry with Distance-weighted Graph Networks (European Physics Journal C, 2019)
SuperGAT from Kim and Oh: How To Find Your Friendly Neighborhood: Graph Attention Design With Self-Supervision (ICLR 2021) [Example]
HGTConv from Hu et al.: Heterogeneous Graph Transformer (WWW 2020) [Example]
HEATConv from Mo et al.: Heterogeneous Edge-Enhanced Graph Attention Network For Multi-Agent Trajectory Prediction (CoRR 2021)
A MetaLayer for building any kind of graph network similar to the TensorFlow Graph Nets library from Battaglia et al.: Relational Inductive Biases, Deep Learning, and Graph Networks (CoRR 2018)

Pooling layers: Graph pooling layers combine the vectorial representations of a set of nodes in a graph (or a subgraph) into a single vector representation that summarizes the properties of those nodes. They are commonly applied to graph-level tasks, which require combining node features into a single graph representation.

Top-K Pooling from Gao and Ji: Graph U-Nets (ICML 2019), Cangea et al.: Towards Sparse Hierarchical Graph Classifiers (NeurIPS-W 2018) and Knyazev et al.: Understanding Attention and Generalization in Graph Neural Networks (ICLR-W 2019) [Example]
DiffPool from Ying et al.: Hierarchical Graph Representation Learning with Differentiable Pooling (NeurIPS 2018) [Example]

GlobalAttention from Li et al.: Gated Graph Sequence Neural Networks (ICLR 2016) [Example]
Set2Set from Vinyals et al.: Order Matters: Sequence to Sequence for Sets (ICLR 2016) [Example]
Sort Pool from Zhang et al.: An End-to-End Deep Learning Architecture for Graph Classification (AAAI 2018) [Example]
Dense MinCUT Pooling from Bianchi et al.: MinCUT Pooling in Graph Neural Networks (CoRR 2019) [Example]
Graclus Pooling from Dhillon et al.: Weighted Graph Cuts without Eigenvectors: A Multilevel Approach (PAMI 2007) [Example]
Voxel Grid Pooling from, e.g., Simonovsky and Komodakis: Dynamic Edge-Conditioned Filters in Convolutional Neural Networks on Graphs (CVPR 2017) [Example]
SAG Pooling from Lee et al.: Self-Attention Graph Pooling (ICML 2019) and Knyazev et al.: Understanding Attention and Generalization in Graph Neural Networks (ICLR-W 2019) [Example]
Edge Pooling from Diehl et al.: Towards Graph Pooling by Edge Contraction (ICML-W 2019) and Diehl: Edge Contraction Pooling for Graph Neural Networks (CoRR 2019) [Example]
ASAPooling from Ranjan et al.: ASAP: Adaptive Structure Aware Pooling for Learning Hierarchical Graph Representations (AAAI 2020) [Example]
PANPooling from Ma et al.: Path Integral Based Convolution and Pooling for Graph Neural Networks (NeurIPS 2020)
MemPooling from Khasahmadi et al.: Memory-Based Graph Networks (ICLR 2020) [Example]
GraphMultisetTransformer from Baek et al.: Accurate Learning of Graph Representations with Graph Multiset Pooling (ICLR 2021) [Example]
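
The simplest form of the idea behind these layers, combining node vectors into one vector per graph, can be sketched in plain Python as follows. The function `global_pool` and the `batch` assignment list are illustrative assumptions of this sketch; the layers listed above are considerably more elaborate (learned scores, clustering, attention), so treat this only as the baseline readout they build on.

```python
def global_pool(x, batch, reduce="mean"):
    """Pool node features into one vector per graph.

    x: list of node feature lists, shape [num_nodes, channels]
    batch: list assigning each node to a graph index, length num_nodes
    """
    groups = {}
    for features, graph_id in zip(x, batch):
        groups.setdefault(graph_id, []).append(features)
    pooled = []
    for graph_id in sorted(groups):
        columns = list(zip(*groups[graph_id]))  # transpose to per-channel tuples
        if reduce == "mean":
            pooled.append([sum(c) / len(c) for c in columns])
        elif reduce == "max":
            pooled.append([max(c) for c in columns])
        else:
            raise ValueError(f"unsupported reduce: {reduce}")
    return pooled


x = [[1.0, 4.0], [3.0, 0.0], [5.0, 2.0]]
batch = [0, 0, 1]  # the first two nodes belong to graph 0, the last to graph 1
mean_pooled = global_pool(x, batch, "mean")
max_pooled = global_pool(x, batch, "max")
```

Each graph in the mini-batch ends up with a fixed-size vector regardless of how many nodes it has, which is what a graph-level classifier needs as input.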

GNN models: Our supported GNN models incorporate multiple message passing layers, and users can directly use these pre-defined models to make predictions on graphs. Unlike simple stacking of GNN layers, these models could involve pre-processing, additional learnable parameters, skip connections, graph coarsening, etc.

SchNet from Schütt et al.: SchNet: A Continuous-filter Convolutional Neural Network for Modeling Quantum Interactions (NIPS 2017) [Example]
DimeNet from Klicpera et al.: Directional Message Passing for Molecular Graphs (ICLR 2020) [Example]
Node2Vec from Grover and Leskovec: node2vec: Scalable Feature Learning for Networks (KDD 2016) [Example]
Deep Graph Infomax from Veličković et al.: Deep Graph Infomax (ICLR 2019) [Example1, Example2]

Jumping Knowledge from Xu et al.: Representation Learning on Graphs with Jumping Knowledge Networks (ICML 2018) [Example]
MetaPath2Vec from Dong et al.: metapath2vec: Scalable Representation Learning for Heterogeneous Networks (KDD 2017) [Example]
All variants of Graph Autoencoders and Variational Autoencoders from:
- Variational Graph Auto-Encoders from Kipf and Welling (NIPS-W 2016) [Example]
- Adversarially Regularized Graph Autoencoder for Graph Embedding from Pan et al. (IJCAI 2018) [Example]
- Simple and Effective Graph Autoencoders with One-Hop Linear Models from Salha et al. (ECML 2020) [Example]
RENet from Jin et al.: Recurrent Event Network for Reasoning over Temporal Knowledge Graphs (ICLR-W 2019) [Example]
GraphUNet from Gao and Ji: Graph U-Nets (ICML 2019) [Example]
AttentiveFP from Xiong et al.: Pushing the Boundaries of Molecular Representation for Drug Discovery with the Graph Attention Mechanism (J. Med. Chem. 2020) [Example]
DeepGCN and the GENConv from Li et al.: DeepGCNs: Can GCNs Go as Deep as CNNs? (ICCV 2019) and DeeperGCN: All You Need to Train Deeper GCNs (CoRR 2020) [Example]
RECT from Wang et al.: Network Embedding with Completely-imbalanced Labels (TKDE 2020) [Example]
GNNExplainer from Ying et al.: GNNExplainer: Generating Explanations for Graph Neural Networks (NeurIPS 2019) [Example1, Example2]
SEAL from Zhang and Chen: Link Prediction Based on Graph Neural Networks (NeurIPS 2018) [Example]
Graph-less Neural Networks from Zhang et al.: Graph-less Neural Networks: Teaching Old MLPs New Tricks via Distillation (CoRR 2021) [Example]
LINKX from Lim et al.: Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods (NeurIPS 2021) [Example]

GNN operators and utilities: PyG comes with a rich set of neural network operators that are commonly used in many GNN models. They follow an extensible design: It is easy to apply these operators and graph utilities to existing GNN layers and models to further enhance model performance.

DropEdge from Rong et al.: DropEdge: Towards Deep Graph Convolutional Networks on Node Classification (ICLR 2020)
GraphNorm from Cai et al.: GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training (ICML 2021)
GDC from Klicpera et al.: Diffusion Improves Graph Learning (NeurIPS 2019) [Example]

GraphSizeNorm from Dwivedi et al.: Benchmarking Graph Neural Networks (CoRR 2020)
PairNorm from Zhao and Akoglu: PairNorm: Tackling Oversmoothing in GNNs (ICLR 2020)
DiffGroupNorm from Zhou et al.: Towards Deeper Graph Neural Networks with Differentiable Group Normalization (NeurIPS 2020)
Tree Decomposition from Jin et al.: Junction Tree Variational Autoencoder for Molecular Graph Generation (ICML 2018)
TGN from Rossi et al.: Temporal Graph Networks for Deep Learning on Dynamic Graphs (GRL+ 2020) [Example]
Weisfeiler Lehman Algorithm from Weisfeiler and Lehman: A Reduction of a Graph to a Canonical Form and an Algebra Arising During this Reduction (Nauchno-Technicheskaya Informatsia 1968) [Example]
Label Propagation from Zhu and Ghahramani: Learning from Labeled and Unlabeled Data with Label Propagation (CMU-CALD 2002) [Example]
Local Degree Profile from Cai and Wang: A Simple yet Effective Baseline for Non-attribute Graph Classification (CoRR 2018)
CorrectAndSmooth from Huang et al.: Combining Label Propagation And Simple Models Out-performs Graph Neural Networks (CoRR 2020) [Example]
Gini and BRO regularization from Henderson et al.: Improving Molecular Graph Neural Network Explainability with Orthonormalization and Induced Sparsity (ICML 2021)

Scalable GNNs: PyG supports the implementation of Graph Neural Networks that can scale to large graphs. Such applications are challenging since the entire graph, its associated features and the GNN parameters cannot fit into GPU memory. Many state-of-the-art scalability approaches tackle this challenge by sampling neighborhoods for mini-batch training, by graph clustering and partitioning, or by using simplified GNN models. These approaches have been implemented in PyG, and can benefit from the above GNN layers, operators and models.

NeighborSampler and NeighborLoader from Hamilton et al.: Inductive Representation Learning on Large Graphs (NIPS 2017) [Example1, Example2, Example3, Example4]
ClusterGCN from Chiang et al.: Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks (KDD 2019) [Example1, Example2]
GraphSAINT from Zeng et al.: GraphSAINT: Graph Sampling Based Inductive Learning Method (ICLR 2020) [Example]

ShaDow from Zeng et al.: Deep Graph Neural Networks with Shallow Subgraph Samplers (CoRR 2020) [Example]
SIGN from Rossi et al.: SIGN: Scalable Inception Graph Neural Networks (CoRR 2020) [Example]
HGTLoader from Hu et al.: Heterogeneous Graph Transformer (WWW 2020) [Example]
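
The neighborhood-sampling idea mentioned above can be sketched in a few lines of plain Python: starting from the seed nodes of a mini-batch, keep at most a fixed number of neighbors per node and per hop. The function `sample_neighborhood`, its arguments, and the toy graph are assumptions made for illustration; this is not the algorithm used by PyG's loaders, which differ in details such as how revisited nodes are handled.

```python
import random


def sample_neighborhood(adjacency, seeds, fanouts, rng):
    """Sample a multi-hop neighborhood around seed nodes.

    adjacency: dict mapping each node to a list of its neighbors
    seeds: nodes of the mini-batch
    fanouts: maximum number of neighbors to keep per node, one entry per hop
    rng: random.Random instance used for sampling
    Returns the sampled edges as (neighbor, node) pairs and all visited nodes.
    """
    visited = set(seeds)
    frontier = list(seeds)
    edges = []
    for fanout in fanouts:
        next_frontier = []
        for node in frontier:
            neighbors = adjacency.get(node, [])
            chosen = rng.sample(neighbors, min(fanout, len(neighbors)))
            for neighbor in chosen:
                edges.append((neighbor, node))  # message flows neighbor -> node
                if neighbor not in visited:
                    visited.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return edges, visited


adjacency = {0: [1, 2, 3, 4], 1: [0, 5], 2: [0], 3: [0, 5], 4: [0], 5: [1, 3]}
edges, nodes = sample_neighborhood(
    adjacency, seeds=[0], fanouts=[2, 1], rng=random.Random(0)
)
```

Because the number of sampled edges is bounded by the fan-outs rather than by the node degrees, the memory needed per mini-batch stays predictable even on very large graphs.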

Installation

Anaconda

Update: You can now install PyG via Anaconda for all major OS/PyTorch/CUDA combinations 🤗 Given that you have PyTorch >= 1.8.0 installed, simply run

conda install pyg -c pyg -c conda-forge

Pip Wheels

We alternatively provide pip wheels for all major OS/PyTorch/CUDA combinations, see here.

PyTorch 1.10.0

To install the binaries for PyTorch 1.10.0, simply run

pip install torch-scatter -f https://data.pyg.org/whl/torch-1.10.0+${CUDA}.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-1.10.0+${CUDA}.html
pip install torch-geometric

where ${CUDA} should be replaced by either cpu, cu102, or cu113 depending on your PyTorch installation (torch.version.cuda).

	`cpu`	`cu102`	`cu113`
Linux	✅	✅	✅
Windows	✅	✅	✅
macOS	✅
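
If you prefer to derive the `${CUDA}` placeholder programmatically, a small helper along the following lines may be convenient. The function names `cuda_tag` and `wheel_index_url` are hypothetical and not part of PyG; the sketch assumes the version string has the form reported by `torch.version.cuda` (for example `"11.3"`, or `None` for a CPU-only build).

```python
def cuda_tag(cuda_version):
    """Map a CUDA version string such as "11.3" to a wheel tag such as "cu113".

    `None` stands for a CPU-only PyTorch build.
    """
    if cuda_version is None:
        return "cpu"
    major, minor = cuda_version.split(".")[:2]
    return f"cu{major}{minor}"


def wheel_index_url(torch_version, cuda_version):
    """Build the wheel index URL used in the pip commands above."""
    return f"https://data.pyg.org/whl/torch-{torch_version}+{cuda_tag(cuda_version)}.html"


url = wheel_index_url("1.10.0", "11.3")
```

In an environment with PyTorch installed, `torch.version.cuda` can be passed as the second argument; check the resulting tag against the table above, since not every combination has prebuilt wheels.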

For additional but optional functionality, run

pip install torch-cluster -f https://data.pyg.org/whl/torch-1.10.0+${CUDA}.html
pip install torch-spline-conv -f https://data.pyg.org/whl/torch-1.10.0+${CUDA}.html

PyTorch 1.9.0/1.9.1

To install the binaries for PyTorch 1.9.0 and 1.9.1, simply run

pip install torch-scatter -f https://data.pyg.org/whl/torch-1.9.0+${CUDA}.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-1.9.0+${CUDA}.html
pip install torch-geometric

where ${CUDA} should be replaced by either cpu, cu102, or cu111 depending on your PyTorch installation (torch.version.cuda).

	`cpu`	`cu102`	`cu111`
Linux	✅	✅	✅
Windows	✅	✅	✅
macOS	✅

For additional but optional functionality, run

pip install torch-cluster -f https://data.pyg.org/whl/torch-1.9.0+${CUDA}.html
pip install torch-spline-conv -f https://data.pyg.org/whl/torch-1.9.0+${CUDA}.html

Note: Binaries of older versions are also provided for PyTorch 1.4.0, PyTorch 1.5.0, PyTorch 1.6.0, PyTorch 1.7.0/1.7.1 and PyTorch 1.8.0/1.8.1 (following the same procedure).

From master

In case you want to experiment with the latest PyG features which are not fully released yet, ensure that torch-scatter and torch-sparse are installed by following the steps mentioned above, and install PyG from master via:

pip install git+https://github.com/pyg-team/pytorch_geometric.git

Cite

Please cite our paper (and the respective papers of the methods used) if you use this code in your own work:

@inproceedings{Fey/Lenssen/2019,
  title={Fast Graph Representation Learning with {PyTorch Geometric}},
  author={Fey, Matthias and Lenssen, Jan E.},
  booktitle={ICLR Workshop on Representation Learning on Graphs and Manifolds},
  year={2019},
}

Feel free to email us if you wish your work to be listed in the external resources. If you notice anything unexpected, please open an issue and let us know. If you have any questions or are missing a specific feature, feel free to discuss them with us. We are motivated to constantly make PyG even better.

Comments

Issue reproducing the results of the original ecc implementation. Pooling layer and conv layer are giving different results from the original implementation
As I mentioned in #319, I have problems reproducing the ecc implementation using pytorch_geometric. I found some differences between the results obtained: first, the two convolution operations give different results when using the same weights. Moreover, the results of the pooling layers also differ.

I created a test that checks these things. Basically, the script loads the same weights into both implementations. These weights are obtained by training a network using the ecc_implementation. Below you can see the output of my test.

ECC Weights and PyGeometric weights are equal: True #I am only doing a re-check in order to be sure that both weights are equal before to load to the models.
Loading weights
Starting validation:
ecc features conv1: (997, 16) #Shape of the output of first conv in ecc implementation
pygeometric features conv1: (997, 16) #Shape of the output of first conv in pygeometric implementation
Max difference between features of first conv 2.549824
Output of ecc pooling: (398, 32)
Output of PyGeometric pooling: (385, 32)
Pygeomtric Acc: 41.51982378854625
Ecc accuracy: 63.65638766519823
Pygeomtric Loss: 2.435516586519023
Ecc Loss: 0.9878960176974138

As you can observe, this difference has an impact on the accuracy when using the same weights. You can find the source code here. One important thing: the data used for these tests is obtained from the original code of the ecc.
opened by dhorka 93
spspmm cuda bugfix

❓ Questions & Help

I used the introductory two-layer GCN example and changed the data to my own, which has an input feature matrix of shape (100020, 6), 2 labels, and 3074376 edges. The GCN example worked well on the processed Cora dataset you provided and reached a high accuracy, but when I change the input to my own data, the model predicts only one class:

acc: 0.55698860228
f1_score: tensor([0.0000, 0.7155])
recall: tensor([0., 1.])
precision: tensor([0.0000, 0.5570])

I have been working on this problem for 2 days with the same result. Could you help me?

opened by zhangcaifu 78

enzymes_topk_pool model is not learning

❓ Questions & Help

Hi, I am using the enzymes_topk_pool (ETP) example for medical image classification. I created features from the images and converted them into the data format accepted by the PyTorch Geometric data loader. However, when I feed these features to the ETP model, it does not learn anything: the training and test loss stay constant from the first epoch until the end. More info: it is a binary classification problem. Below I attach a small script so that you get an idea.

# Assumed imports (not shown in the original post).
import torch
import torch.nn.functional as F
from torch_geometric.nn import GraphConv, TopKPooling
from torch_geometric.nn import global_max_pool as gmp, global_mean_pool as gap

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # 41 = number of features
        self.conv1 = GraphConv(dataset.num_node_features, 64)
        self.pool1 = TopKPooling(64, ratio=0.8)
        self.conv2 = GraphConv(64, 64)
        self.pool2 = TopKPooling(64, ratio=0.8)
        self.conv3 = GraphConv(64, 64)
        self.pool3 = TopKPooling(64, ratio=0.8)

        self.lin1 = torch.nn.Linear(128, 128)
        self.lin2 = torch.nn.Linear(128, 64)
        self.lin3 = torch.nn.Linear(64, 1)
        self.bn1 = torch.nn.BatchNorm1d(128)
        self.bn2 = torch.nn.BatchNorm1d(64)
        #self.act1 = torch.nn.ReLU()
        #self.act2 = torch.nn.ReLU()

    def forward(self, data):

        x, edge_index, batch = data.x, data.edge_index, data.batch
        #edge_index, _ = remove_self_loops(edge_index)
        #edge_index, _ = add_self_loops(edge_index, num_nodes=x.size(0))

        x = F.relu(self.conv1(x, edge_index))
        x, edge_index, _, batch, _ = self.pool1(x, edge_index, None, batch)
        x1 = torch.cat([gmp(x, batch), gap(x, batch)], dim=1)

        x = F.relu(self.conv2(x, edge_index))
        x, edge_index, _, batch, _ = self.pool2(x, edge_index, None, batch)
        x2 = torch.cat([gmp(x, batch), gap(x, batch)], dim=1)

        x = F.relu(self.conv3(x, edge_index))
        x, edge_index, _, batch, _ = self.pool3(x, edge_index, None, batch)
        x3 = torch.cat([gmp(x, batch), gap(x, batch)], dim=1)

        x = x1 + x2 + x3

        x = F.relu(self.lin1(x))
        x = F.relu(self.lin2(x))
        #x = F.dropout(x, p=0.5, training=self.training)
        #x = torch.sigmoid(self.lin3(x)).squeeze(1)
        x = torch.sigmoid(self.lin3(x)).squeeze(1)
        #print('x', x.shape)
        #x = F.log_softmax(self.lin3(x), dim=-1)
        return x

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = Net().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, verbose=True)

crit = torch.nn.BCELoss()
import pdb

def train(epoch):
    model.train()

    loss_all = 0
    for data in train_loader:
        data = data.to(device)
        optimizer.zero_grad()
        output = model(data)
        #print('o/ps',output)

        #print(output)
        #print('len',output.shape)
        label = data.y.to(device).cuda()
        label = torch.tensor(label, dtype=torch.float).to(device)

        #print('lbls',label)
        # label = torch.tensor(label, dtype=torch.float)
        #print('lbl', label.shape)
        loss = crit(output, label)
        #print('loss',loss)
        #loss = crit(output, data.y)
        loss.backward(retain_graph=True)
        loss_all += data.num_graphs * loss.item()
        optimizer.step()
    scheduler.step(loss_all)
    return loss_all / len(train_data_list)

import numpy as np  # assumed import (not shown in the original post)
from sklearn.metrics import roc_auc_score

def evaluate(loader):
    model.eval()

    predictions = []
    labels = []

    with torch.no_grad():
        for data in loader:

            data = data.to(device)
            pred = model(data).detach().cpu().numpy()

            #print('pred ', pred)

            label = data.y.detach().cpu().numpy()

            #print('label ',label)
            predictions.append(pred)
            labels.append(label)

    predictions = np.hstack(predictions)
    #predictions = torch.cat(predictions)
    #predictions = torch.tensor(predictions)
    labels = np.hstack(labels)
    #labels = torch.tensor(labels)
    #labels = torch.cat(labels)

    return roc_auc_score(labels, predictions)

for epoch in range(1, 201):
    loss = train(epoch)
    train_auc = evaluate(train_loader)
    test_auc = evaluate(test_loader)
    #train_acc = test(train_loader)
    #test_acc = test(test_loader)
    print('Epoch: {:03d}, Loss: {:.5f}, Train Auc: {:.5f}, Test AUC: {:.5f}'.
          format(epoch, loss, train_auc, test_auc))

Note: For feature extraction from images I used the code from your Master's thesis. I used only the Form_feature_extration file and the adjacency.py file, but not the feature_selection and coarsening files. Are they also needed to create features? Currently, I have 41 features for every node in the image.

Thanks in advance!

opened by sachinsharma9780 55

Neighborhood Sampling

Hi Matthias,

I wrote my own dataset and dataloader, and I used an adjacency matrix instead of edge_index. When I tried to convert adj_matrix to edge_index, I got confused because I have multiple graphs (multiple samples, which may have different numbers of nodes) in one batch. I went over some of the examples and found that most of them use a batch size of 1. How should I prepare the edge_index in the mini-batch setting? I can easily use DenseSAGEConv, but I want to try other networks.

Thanks, Ming
feature

opened by tbright17 46
Please help me with OSError: libcusparse.so.10: cannot open shared object file: No such file or directory

❓ Questions & Help

this is the traceback

Traceback (most recent call last):
  File "/home/yrwang/.local/lib/python3.6/site-packages/torch_sparse/__init__.py", line 15, in <module>
    library, [osp.dirname(__file__)]).origin)
  File "/home/yrwang/.local/lib/python3.6/site-packages/torch/_ops.py", line 106, in load_library
    ctypes.CDLL(path)
  File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcusparse.so.10: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "", line 1, in <module>
  File "/home/yrwang/.local/lib/python3.6/site-packages/torch_geometric/__init__.py", line 2, in <module>
    import torch_geometric.nn
  File "/home/yrwang/.local/lib/python3.6/site-packages/torch_geometric/nn/__init__.py", line 2, in <module>
    from .data_parallel import DataParallel
  File "/home/yrwang/.local/lib/python3.6/site-packages/torch_geometric/nn/data_parallel.py", line 5, in <module>
    from torch_geometric.data import Batch
  File "/home/yrwang/.local/lib/python3.6/site-packages/torch_geometric/data/__init__.py", line 1, in <module>
    from .data import Data
  File "/home/yrwang/.local/lib/python3.6/site-packages/torch_geometric/data/data.py", line 7, in <module>
    from torch_sparse import coalesce
  File "/home/yrwang/.local/lib/python3.6/site-packages/torch_sparse/__init__.py", line 23, in <module>
    raise OSError(e)
OSError: libcusparse.so.10: cannot open shared object file: No such file or directory

My CUDA and cuDNN are correctly installed:

nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243

My torch version:

>>> print(torch.__version__)
1.4.0

I used

pip3 install torch-scatter==2.0.4+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html
pip3 install torch-sparse==0.6.1+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html
pip3 install torch-cluster==1.5.4+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html
pip3 install torch-spline-conv==1.2.0+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html
pip3 install torch-geometric

to install torch-geometric, but the problem still occurs. Thanks for helping me.

opened by yrwangxd 39

Not found error for torch_sparse::ptr2ind in torchscript

❓ Questions & Help

I tried to use a PyTorch model with a MessagePassing layer in C++ code. As described in the pytorch_geometric documentation, I generated a torch model with my own MP layer and successfully converted the model.

But when executing the C++ code, I get the error below:

Unknown builtin op: torch_sparse::ptr2ind.
Could not find any similar ops to torch_sparse::ptr2ind. This op may not exist or may not be currently supported in TorchScript.
:
  File "/home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch_sparse/storage.py", line 166
        rowptr = self._rowptr
        if rowptr is not None:
            row = torch.ops.torch_sparse.ptr2ind(rowptr, self._col.numel())
                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            self._row = row
            return row
Serialized   File "code/__torch__/torch_sparse/storage.py", line 825
      if torch.__isnot__(rowptr, None):
        rowptr13 = unchecked_cast(Tensor, rowptr)
        row15 = ops.torch_sparse.ptr2ind(rowptr13, torch.numel(self._col))
                ~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        self._row = row15
        _150, _151 = True, row15
'SparseStorage.row' is being compiled since it was called from 'SparseStorage.__init__'
  File "/home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch_sparse/storage.py", line 133
        if not is_sorted:
            idx = self._col.new_zeros(self._col.numel() + 1)
            idx[1:] = self._sparse_sizes[1] * self.row() + self._col
                                              ~~~~~~~~ <--- HERE
            if (idx[1:] < idx[:-1]).any():
                perm = idx[1:].argsort()
Serialized   File "code/__torch__/torch_sparse/storage.py", line 267
      idx = torch.new_zeros(self._col, [_29], dtype=None, layout=None, device=None, pin_memory=None)
      _30 = (self._sparse_sizes)[1]
      _31 = torch.add(torch.mul((self).row(), _30), self._col, alpha=1)
                                 ~~~~~~~~~~ <--- HERE
      _32 = torch.slice(idx, 0, 1, 9223372036854775807, 1)
      _33 = torch.copy_(_32, _31, False)
'SparseStorage.__init__' is being compiled since it was called from 'GINLayerJittable_d54f76.__check_input____1'
Serialized   File "code/__torch__/GINLayerJittable_d54f76.py", line 40
      pass
    return the_size
  def __check_input____1(self: __torch__.GINLayerJittable_d54f76.GINLayerJittable_d54f76,
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~...  <--- HERE
    edge_index: __torch__.torch_sparse.tensor.SparseTensor,
    size: Optional[Tuple[int, int]]) -> List[Optional[int]]:

Aborted (core dumped)

Since I have no experience with PyTorch JIT, I cannot find any clue to solve this. How can I handle this error?

bug feature help wanted

opened by Nanco-L 32

Segmentation Fault in Forward Loop of Edge Conv

Hey there!

I'm testing out edge conv and am running into some issues. I'm getting a segmentation fault during the knn_graph generation:

line 13: 14361 Segmentation fault      (core dumped) CUDA_VISIBLE_DEVICES="$gpuNum" python ...

Here's output using pysnooper, I traced the error to this location:

Starting var:.. batch = None
Starting var:.. pos = tensor([[-3.1472e-01, -7.1309e-01, -1.5181e-01,  1.3493e+00,  1.0879e+00,          4.9691e-01, -1.54...
Starting var:.. self = Net(  (conv1): EdgeConv(nn=Sequential(    (0): Linear(in_features=50, out_features=64, bias=True)   ...
17:39:03.629809 call        41  def forward(self, pos, batch):
17:39:03.680347 line        42          edge_index = knn_graph(pos, k=20, batch=batch)
~

Last line is where error is happening.

As more context, I'm generating my point cloud using a CNN, batch and positions are shown above, cannot seem to make it through the generation of the edge index.

Can you please help me out here?

opened by jlevy44 31

`RandLA-Net` example
The paper: RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds

Context

There is no good PyTorch implementation of RandLA-Net that leverages PyTorch Geometric standards and modules. In torch-points3d, the current modules are outdated, leading to some confusion among users.

The implementation with the most stars on github is aRI0U/RandLA-Net-pytorch, which has nasty dependencies (torch_points or torch_points_kernels), makes slow back-and-forth between cpu and gpu when calling knns, and only accepts fixed size point clouds.

Proposal

I would like to implement RandLA-Net as part of pyg's examples. For now I would tackle the ModelNet classification task, and would follow the structure of other examples (pointnet2_classification in particular).

The RandLA-Net paper focuses on segmentation, but for classification I would simply add an MLP + Global Max Pooling after the first DilatedResidualBlocks.

The RandLA-Net architecture is conceptually close to PointNet++, augmented with different tricks to speed things up (random sampling instead of FPS), use more context (with a sort of dilated KNN), and encode local information better (by explicitly calculating positions, distances, and Euclidean distance between points in a neighborhood, and by using self-attention on these features).

If I have some success, I will take on the segmentation task as well (which is what interests me anyway for my own project)

Where I am at

I have a working implementation at examples/randlanet_classification.py. I still have to review it to make sure that I am following the paper as closely as possible, but I think I am on the right track.

I would love some guidance on how to move forward. In particular:

Am I using MessagePassing modules correctly?

What should I aim for in terms of accuracy on ModelNet?

Should I stick strictly to the paper? Or adapt the architecture to ModelNet.

Indeed, the hyperparameters were not chosen by the author for small objects but rather for large-scale Lidar data, which could make convergence take much longer than needed.

With 4 DilatedResidualBlocks (like in the paper), we reach ~57% accuracy at epoch 200.

With 3 DilatedResidualBlocks, we reach up to 75% accuracy at the 20th epoch

With only 2 DilatedResidualBlocks, we reach 90% accuracy at the 81st epoch, getting closer to the leaderboard for the ModelNet10 challenge.
feature 1 - Priority P1 example
opened by CharlesGaydon 30
Link-level `NeighborLoader`
🚀 The feature, motivation and pitch

Currently, NeighborLoader is designed to be applied in node-level tasks and there exists no option for mini-batching in link-level tasks.

To achieve this, users currently rely on a simple but hacky workaround, first utilized in ogbl-citation2 in this example.

The idea is straightforward and simple: For input_nodes, we pass in both the source and destination nodes for every link we want to do link prediction on (both positive and negative):

loader = NeighborLoader(data, input_nodes=edge_label_index.view(-1), ...)

Nonetheless, PyG should provide a dedicated class to perform mini-batch on link-level tasks, re-using functionality from NeighborLoader under-the-hood. An API could look like:

class LinkLevelNeighborLoader(
    data,
    input_edges=...
    input_edge_labels=...
    with_negative_sampling=True,
    **kwargs,
)

NOTE: This workaround currently only works for homogeneous graphs!

@RexYing @JiaxuanYou
feature 0 - Priority P0
opened by rusty1s 30
Data Batch problem in PyG
🐛 Describe the bug

Hi. I am a computational physics researcher and have been using PyG without problems. My PyG code was working well a few weeks ago, but now it no longer works, even though I have not changed anything.

The problem is as follows. I have many material structures, and in my "custom_dataset" class these are preprocessed and all graph information (node features, edge features, edge index, etc.) is inserted into a "Data" object in PyTorch Geometric. Each preprocessed sample with index $i$ is printed as a normal "Data" object in PyG.

But when I pass my custom dataset class to the PyG DataLoader and do the following,

sample = next(iter(train_loader)) # batch sample

the batch sample is shown as "DataDataBatch". I have not seen this kind of object name before, and I cannot use "sample.x" or "sample.edge_index". Instead I need to do it like this

I want to use expressions like "sample.x", "sample.edge_index" or "sample.edge_attr" as before. I would appreciate an explanation. Thank you.

Environment

PyG version: 2.0.5

PyTorch version: 1.11.0+cu113

OS: GoogleColab Pro Plus

Python version: Python 3.7.13 in colab

CUDA/cuDNN version:

How you installed PyTorch and PyG (conda, pip, source):

# Install required packages.
import os
import torch
os.environ['TORCH'] = torch.__version__
print(torch.__version__)

!pip install -q torch-scatter -f https://data.pyg.org/whl/torch-${TORCH}.html
!pip install -q torch-sparse -f https://data.pyg.org/whl/torch-${TORCH}.html
!pip install -q git+https://github.com/pyg-team/pytorch_geometric.git
!pip install -q pymatgen==2020.11.11

Any other relevant information (e.g., version of torch-scatter):

bug
opened by Amadeus-System 29
AttributeError: 'NoneType' object has no attribute 'origin'
📚 Installation

Traceback (most recent call last):
  File "/home/shelly/bourne/reimp_paper/MTAG-main/test/t1.py", line 24, in <module>
    import torch_sparse
  File "/home/shelly/anaconda3/envs/pyt/lib/python3.6/site-packages/torch_sparse/__init__.py", line 15, in <module>
    f'{library}_{suffix}', [osp.dirname(__file__)]).origin)
AttributeError: 'NoneType' object has no attribute 'origin'

Environment

OS: Ubuntu20.04

Python version:3.6.3

PyTorch version:1.8.0

CUDA/cuDNN version:11.1

GCC version:

How did you try to install PyTorch Geometric and its extensions (wheel, source): followed https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html, section "Installation via Binaries"

Any other relevant information:

Checklist

[1 ] I followed the installation guide.

[1 ] I cannot find my error message in the FAQ.

[1] I set up CUDA correctly and can compile CUDA code via nvcc.

[0 ] I do have multiple CUDA versions on my machine.

Additional context
opened by bourne-3 29
[Explainability Evaluation] - GNN model (Un)Faithfulness v2

An attempt to implement the faithfulness measure from "Probing GNN Explainers: A Rigorous Theoretical and Empirical Analysis of GNN Explanation Methods".
feature 1 - Priority P1 explain

opened by ZeynepP 0
`HeteroDataBatch.subgraph()` doesn't preserve `num_graphs`
🐛 Describe the bug

After calling the subgraph method of HeteroData on a batched version, the object loses its num_graphs property:

import torch
from torch_geometric.data import Batch, HeteroData

# Build a batch that contains a single heterogeneous graph.
d = Batch.from_data_list([HeteroData({"agent": {"x": torch.randn(3,3)}})])
sub_d = d.subgraph({"agent": [0]})

print("Does d has the num_graphs:")
print(hasattr(d, "num_graphs"))
print("Does sub_d has the num_graphs:")
print(hasattr(sub_d, "num_graphs"))

Output:

Does d has the num_graphs:
True
Does sub_d has the num_graphs:
False

Environment

PyG version: 2.1.0

PyTorch version: 1.11.0

OS:

Python version: 3.8

CUDA/cuDNN version:

How you installed PyTorch and PyG (conda, pip, source): conda

Any other relevant information (e.g., version of torch-scatter):

bug
opened by ekosman 1

RuntimeError: pseudo.size(1) == kernel_size.numel() INTERNAL ASSERT FAILED. Input mismatch

🐛 Describe the bug

I tried to train a SplineCNN as provided in the example named mnist_nn_conv.py. I got the following error:

RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "...conda\envs\dl23\lib\site-packages\torch_spline_conv\basis.py", line 10, in spline_basis
                 is_open_spline: torch.Tensor, degree: int) -> Tuple[torch.Tensor, torch.Tensor]:
    return torch.ops.torch_spline_conv.spline_basis(pseudo, kernel_size,
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
                                                    is_open_spline, degree)
RuntimeError: pseudo.size(1) == kernel_size.numel() INTERNAL ASSERT FAILED at "D:\a\pytorch_spline_conv\pytorch_spline_conv\csrc\cuda\basis_cuda.cu":104, please report a bug to PyTorch. Input mismatch

My Code

import os.path as osp

import torch
import torch.nn as nn
import torch.nn.functional as F

import torch_geometric.transforms as T
from torch_geometric.datasets import MNISTSuperpixels
from torch_geometric.loader import DataLoader
from torch_geometric.nn import (
    SplineConv,
    global_mean_pool,
    graclus,
    max_pool,
    max_pool_x,
)
from torch_geometric.utils import normalized_cut


# Datasets
path = osp.join(osp.dirname(osp.realpath("/")), '..', 'data', 'MNIST')
transform = T.Cartesian(cat=False)
train_dataset = MNISTSuperpixels(path, True, transform=transform)
test_dataset = MNISTSuperpixels(path, False, transform=transform)
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)
d = train_dataset

# Normalized Cut
def normalized_cut_2d(edge_index, pos):
    row, col = edge_index
    edge_attr = torch.norm(pos[row] - pos[col], p=2, dim=1)
    return normalized_cut(edge_index, edge_attr, num_nodes=pos.size(0))

# SplineCNN
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = SplineConv(in_channels = d.num_features, out_channels= 32,dim=1, kernel_size = 3)
        self.conv2 = SplineConv(in_channels = 32, out_channels= 64, dim=1, kernel_size = 3)
        self.fc1 = torch.nn.Linear(64, 128)
        self.fc2 = torch.nn.Linear(128, d.num_classes)

    def forward(self, data):
        data.x = F.elu(self.conv1(data.x, data.edge_index, data.edge_attr))
        weight = normalized_cut_2d(data.edge_index, data.pos)
        cluster = graclus(data.edge_index, weight, data.x.size(0))
        data.edge_attr = None
        data = max_pool(cluster, data, transform=transform)

        data.x = F.elu(self.conv2(data.x, data.edge_index, data.edge_attr))
        weight = normalized_cut_2d(data.edge_index, data.facepos)
        cluster = graclus(data.edge_index, weight, data.x.size(0))
        x, batch = max_pool_x(cluster, data.x, data.batch)

        x = global_mean_pool(x, batch)
        x = F.elu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        return F.log_softmax(self.fc2(x), dim=1)

# Create Model
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = Net().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# Train Function
def train(epoch):
    model.train()

    if epoch == 16:
        for param_group in optimizer.param_groups:
            param_group['lr'] = 0.001

    if epoch == 26:
        for param_group in optimizer.param_groups:
            param_group['lr'] = 0.0001

    for data in train_loader:
        data = data.to(device)
        optimizer.zero_grad()
        F.nll_loss(model(data), data.y).backward()
        optimizer.step()

# Test Function
def test():
    model.eval()
    correct = 0

    for data in test_loader:
        data = data.to(device)
        pred = model(data).max(1)[1]
        correct += pred.eq(data.y).sum().item()
    return correct / len(test_dataset)

# Run epoch
for epoch in range(1, 31):
    train(epoch)
    test_acc = test()
    print(f'Epoch: {epoch:02d}, Test: {test_acc:.4f}')

Environment

PyG version: 2.1.0
PyTorch version: 1.13.0
OS: Windows
Python version:3.10.8
CUDA/cuDNN version: 11.7
How you installed PyTorch and PyG (conda, pip, source):pip
Any other relevant information (e.g., version of torch-scatter):pip install torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.13.0+cu117.html

bug

opened by Amirtmgr 1

Questions about the weight sharing method in GNN/PyG

🚀 The feature, motivation and pitch

Hi, I intend to implement the weight sharing method in PyG. I noticed that there is a post related to this work: https://github.com/pyg-team/pytorch_geometric/issues/1503. However, for other modules like GATConv and TransformerConv, I did not find conv.weight or conv.bias. Could you please help me figure out why? Thanks.

Alternatives

No response

Additional context

No response
feature

opened by HelloWorldLTY 7
Explainability for GNNs

This PR contains an implementation of how to compute layer-wise weights for each edge in order to produce explanations for node-level, edge-level, and graph-level tasks. This implementation differs from the authors' original implementation and is faster and more memory-efficient. Tests and examples of the proposed implementation have been added to make the overall approach understandable.
feature 0 - Priority P0 explain

opened by fork123aniket 1

Releases(2.2.0)

2.2.0(Dec 1, 2022)
We are excited to announce the release of PyG 2.2 🎉🎉🎉

Highlights

Breaking Changes

Deprecations

Features

Bugfixes

Full Changelog

PyG 2.2 is the culmination of work from 78 contributors who have worked on features and bug-fixes for a total of over 320 commits since torch-geometric==2.1.0.

Highlights

pyg-lib Integration

We are proud to release and integrate pyg-lib==0.1.0 into PyG, the first stable version of our new low-level Graph Neural Network library to drive all CPU and GPU acceleration needs of PyG (#5330, #5347, #5384, #5388).

You can install pyg-lib as described in our README.md:

pip install pyg-lib -f https://data.pyg.org/whl/torch-${TORCH}+${CUDA}.html

import pyg_lib
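The `${TORCH}` and `${CUDA}` placeholders in the install command stand for the installed PyTorch version and its CUDA tag. As a small sketch of how such a wheel index URL could be assembled, the hypothetical helper `pyg_wheel_index` below (not part of PyG) takes a version string of the kind printed by `torch.__version__`, for example `1.13.0+cu117`, and falls back to an assumed `cpu` tag when no build suffix is present. The wheel index may only be published for specific PyTorch releases, so the resulting URL should be checked before use.

```python
def pyg_wheel_index(torch_version: str, cuda_tag: str = "cpu") -> str:
    """Build a wheel index URL of the form used in the install command.

    torch_version: e.g. "1.13.0+cu117" or "1.13.0".
    cuda_tag: used only when torch_version carries no "+<tag>" suffix.
    """
    # Split "1.13.0+cu117" into the release part and the build tag.
    release, _, build_tag = torch_version.partition("+")
    tag = build_tag or cuda_tag
    return f"https://data.pyg.org/whl/torch-{release}+{tag}.html"


print(pyg_wheel_index("1.13.0+cu117"))
print(pyg_wheel_index("1.13.0"))
```

The returned string can then be passed to `pip install pyg-lib -f <url>`.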

Once pyg-lib is installed, it will get automatically picked up by PyG, e.g., to accelerate neighborhood sampling routines or to accelerate heterogeneous GNN execution:

pyg-lib provides fast and optimized CPU routines to iteratively sample neighbors in homogeneous and heterogeneous graphs, and heavily improves upon the previously used neighborhood sampling techniques utilized in PyG.

pyg-lib provides efficient GPU-based routines to parallelize workloads in heterogeneous graphs across different node types and edge types. We achieve this by leveraging type-dependent transformations via NVIDIA CUTLASS integration, which is flexible to implement most heterogeneous GNNs with, and efficient, even for sparse edge types or a large number of different node types.

GraphStore and FeatureStore Abstractions

PyG 2.2 includes numerous primitives to easily integrate with simple paradigms for scalable graph machine learning, enabling users to train GNNs on graphs far larger than the size of their machine's available memory. It does so by introducing simple, easy-to-use, and extensible abstractions of a FeatureStore and a GraphStore that plug directly into existing familiar PyG interfaces (see here for the accompanying tutorial).

feature_store = CustomFeatureStore()
feature_store['paper', 'x', None] = ...  # Add paper features
feature_store['author', 'x', None] = ...  # Add author features

graph_store = CustomGraphStore()
graph_store['edge', 'coo'] = ...  # Add edges in "COO" format

# `CustomGraphSampler` knows how to sample on `CustomGraphStore`:
graph_sampler = CustomGraphSampler(
    graph_store=graph_store,
    num_neighbors=[10, 20],
    ...
)

from torch_geometric.loader import NodeLoader
loader = NodeLoader(
    data=(feature_store, graph_store),
    node_sampler=graph_sampler,
    batch_size=20,
    input_nodes='paper',
)

for batch in loader:
    pass
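To make the `feature_store[group, attribute, index]` access pattern more concrete, here is a toy, dictionary-backed sketch. `ToyFeatureStore` is a hypothetical class written for illustration only: it does not subclass or reproduce PyG's `FeatureStore`, and it stores plain Python lists rather than tensors. It only shows the idea that features are addressed by a (group, attribute) key and optionally sliced by node index, which is what lets a loader fetch just the rows a mini-batch needs.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Key = Tuple[str, str, Optional[Sequence[int]]]


class ToyFeatureStore:
    """Illustrative key-value store addressed by (group, attr, index)."""

    def __init__(self) -> None:
        self._features: Dict[Tuple[str, str], List[List[float]]] = {}

    def __setitem__(self, key: Key, value: List[List[float]]) -> None:
        group, attr, index = key
        if index is not None:
            raise NotImplementedError("this toy only supports full writes")
        self._features[(group, attr)] = value

    def __getitem__(self, key: Key) -> List[List[float]]:
        group, attr, index = key
        rows = self._features[(group, attr)]
        if index is None:
            return rows  # full feature matrix
        return [rows[i] for i in index]  # only the requested nodes


store = ToyFeatureStore()
store['paper', 'x', None] = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

# A sampler would hand over the node ids of one mini-batch:
print(store['paper', 'x', [0, 2]])
```

In a real backend the dictionary would be replaced by whatever holds the data (a database, memory-mapped files, a remote service), which is the point of hiding storage behind this kind of interface.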

Data loading and sampling routines are refactored and decomposed into torch_geometric.loader and torch_geometric.sampler modules, respectively (#5563, #5820, #5456, #5457, #5312, #5365, #5402, #5404, #5418).

Optimized and Fused Aggregations

PyG 2.2 further accelerates scatter aggregations based on CPU/GPU and with/without backward computation paths (requires torch>=1.12.0 and torch-scatter>=2.1.0) (#5232, #5241, #5353, #5386, #5399, #6051, #6052).

We also optimized the usage of nn.aggr.MultiAggregation by fusing the computation of multiple aggregations together (see here for more details) (#6036, #6040).

Here are some benchmarking results on PyTorch 1.12 (summed over 1000 runs):

| Aggregators             | Vanilla | Fusion  |
|-------------------------|---------|---------|
| [sum, mean]             | 0.3325s | 0.1996s |
| [sum, mean, min, max]   | 0.7139s | 0.5037s |
| [sum, mean, var]        | 0.6849s | 0.3871s |
| [sum, mean, var, std]   | 1.0955s | 0.3973s |
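The relative gain is easier to read as a speedup factor (vanilla time divided by fused time). The short script below simply recomputes that ratio from the timings listed in the table; the variable names are chosen here for illustration.

```python
# (vanilla seconds, fused seconds), copied from the benchmark table.
timings = {
    "[sum, mean]": (0.3325, 0.1996),
    "[sum, mean, min, max]": (0.7139, 0.5037),
    "[sum, mean, var]": (0.6849, 0.3871),
    "[sum, mean, var, std]": (1.0955, 0.3973),
}

# Speedup = time without fusion / time with fusion.
speedups = {
    name: vanilla / fused for name, (vanilla, fused) in timings.items()
}

for name, ratio in speedups.items():
    print(f"{name:<24} {ratio:.2f}x")
```

On these numbers the fused path is faster in every row, with the largest gain for the `[sum, mean, var, std]` combination.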

Lastly, we have incorporated "fused" GNN operators via the dgNN package, starting with a FusedGATConv implementation (#5140).

Community Sprint: Type Hints and TorchScript Support

We are running regular community sprints to get our community more involved in building PyG. Whether you are just beginning to use graph learning or have been leveraging GNNs in research or production, the community sprints welcome members of all levels with different types of projects.

We had our first community sprint on 10/12 to fully-incorporate type hints and TorchScript support over the entire code base. The goal was to improve usability and cleanliness of our codebase. We had 20 contributors participating, contributing to 120 type hints within 2 weeks, adding around 2400 lines of code (#5842, #5603, #5659, #5664, #5665, #5666, #5667, #5668, #5669, #5673, #5675, #5673, #5678, #5682, #5683, #5684, #5685, #5687, #5688, #5695, #5699, #5701, #5702, #5703, #5706, #5707, #5710, #5714, #5715, #5716, #5722, #5724, #5725, #5726, #5729, #5730, #5731, #5732, #5733, #5743, #5734, #5735, #5736, #5737, #5738, #5747, #5752, #5753, #5754, #5756, #5757, #5758, #5760, #5766, #5767, #5768, #5781, #5778, #5797, #5798, #5799, #5800, #5806, #5810, #5811, #5828, #5847, #5851, #5852).

Explainability

Our second community sprint began on 11/15 with the goal to improve the explainability capabilities of PyG. With this, we introduce the torch_geometric.explain module to provide a unified set of tools to explain the predictions of a PyG model or to explain the underlying phenomenon of a dataset.

Some of the features developed in the sprint are incorporated into this release:

Added the torch_geometric.explain module (#5804, #6054, #6089)

Moved and adapted the GNNExplainer module to torch_geometric.explain (#5967, #6065). See here and here for the accompanying examples.

Extended GNNExplainer to support edge level explanations (#6056)

Added explainability support for heterogeneous GNNs via to_captum_model and to_captum_input (#5886, #5934)

data = HeteroData(...)
model = HeteroGNN(...)

# Explain predictions on heterogeneous graphs for output node 10:
captum_model = to_captum_model(model, mask_type, output_idx, metadata)
inputs, additional_forward_args = to_captum_input(data.x_dict, data.edge_index_dict, mask_type)

ig = IntegratedGradients(captum_model)
ig_attr = ig.attribute(
    inputs=inputs,
    target=int(y[output_idx]),
    additional_forward_args=additional_forward_args,
    internal_batch_size=1,
)

Breaking Changes

Renamed drop_unconnected_nodes to drop_unconnected_node_types and drop_orig_edges to drop_orig_edge_types in AddMetapaths (#5490)

Deprecations

The usage of nn.models.GNNExplainer is now deprecated in favor of explain.GNNExplainer

The usage of utils.dropout_adj is now deprecated in favor of utils.dropout_edge

The usage of loader.RandomNodeSampler is now deprecated in favor of loader.RandomNodeLoader

The usage of to_captum is now deprecated in favor of to_captum_model.

Features

Layers, Models and Examples

Added a "Link Prediction on MovieLens" Colab notebook (#5823)

Added a bipartite link-prediction example (#5834)

Added the SSGConv layer (#5599)

Added the WLConvContinuous layer for performing WL-refinement with continuous attributes (#5316)

Added the PositionalEncoding module (#5381)

Added a node classification example instrumented with Weights and Biases (#5192)

Data Loaders

Added support for triplet sampling in LinkNeighborLoader (#6004)

Added temporal_strategy = uniform/last option to NeighborLoader and LinkNeighborLoader (#5576)

Added a disjoint option to NeighborLoader and LinkNeighborLoader (#5717, #5775)

Added HeteroData support in RandomNodeLoader (#6007)

Added int32-based edge_index support in NeighborLoader (#5948)

Added support for input_time in NeighborLoader (#5763)

Added np.memmap support in NeighborLoader (#5696)

Added CPU affinitization support to NeighborLoader (#6005)

Transformations

Added a FeaturePropagation transform (#5387)

Added IndexToMask and MaskToIndex transforms (#5375, #5455)

Added shuffle_node, mask_feature and add_random_edge augmentations (#5548)

Added dropout_node, dropout_edge and dropout_path augmentations (#5481, #5495, #5531)

Added an AddRandomMetaPaths transform that adds edges based on random walks along a metapath (#5397)

Added a utils.to_smiles function (#6038)

Added HeteroData support for transforms.Constant (#5700)
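As an illustration of what the index/mask conversion named in the list above amounts to, here is a plain-Python sketch. The two functions are stand-ins written for this note: they work on Python lists, whereas the actual transforms operate on tensor attributes of a data object, and their exact arguments are not shown here.

```python
from typing import List, Sequence


def index_to_mask(index: Sequence[int], size: int) -> List[bool]:
    """Return a boolean mask of length `size` that is True at `index`."""
    mask = [False] * size
    for i in index:
        mask[i] = True
    return mask


def mask_to_index(mask: Sequence[bool]) -> List[int]:
    """Return the positions at which `mask` is True, in ascending order."""
    return [i for i, flag in enumerate(mask) if flag]


train_index = [0, 3, 4]
train_mask = index_to_mask(train_index, size=6)
print(train_mask)
print(mask_to_index(train_mask))
```

The two representations carry the same information; masks are convenient for indexing per-node tensors, while index lists are more compact when only a few nodes are selected.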

Datasets

Added the LRGBDataset to include 5 datasets from the Long Range Graph Benchmark (#5935)

Added the HydroNet water cluster dataset (#5537, #5902, #5903)

Added the DGraphFin dynamic graph dataset (#5504)

Added the official splits to the MalNetTiny dataset (#5078)

Added a print_summary method to torch_geometric.data.Dataset (#5438)

General Improvements

Added training and inference benchmark scripts (#5774, #5830, #5878, #5293, #5341, #5242, #5258, #5881, #5254)

Added the utils.assortativity function to compute the degree assortativity coefficient (#5587)

Add support for filling labels with dummy values in HeteroData.to_homogeneous() (#5540)

Added torch.onnx.export support (see here for an example) (#5877, #5997)

Added option to make normalization coefficients trainable in PNAConv (#6039)

Added a semi_grad option in VarAggregation and StdAggregation (#6042)

Added a warning for invalid node and edge type names in HeteroData (#5990)

Added lr_scheduler_solver and customized lr_scheduler classes (#5942)

Added to_fixed_size graph transformer (#5939)

Added support for symbolic tracing in the SchNet model (#5938)

Added support for customizing the interaction graph in the SchNet model (#5919)

Added SparseTensor support to SuperGATConv (#5888)

Added TorchScript support for AttentiveFP (#5868)

Added a return_semantic_attention_weights argument to HANConv (#5787)

Added temperature value customization in dense_mincut_pool (#5908)

Added support for a tuple of in_channels in GENConv for bipartite message passing (#5627, #5641)

Added Aggregation.set_validate_args option to skip validation of dim_size (#5290)

Added BaseStorage.get() functionality (#5240)

Added support for batches of size one in BatchNorm (#5530, #5614)

The AttentionalAggregation module can now be applied to compute attention on a per-feature level (#5449)

Added TorchScript support to ASAPooling (#5395)

Updated the unsupervised GraphSAGE example to leverage LinkNeighborLoader (#5317)

Added better out-of-bounds error message in MessagePassing (#5339)

Added support to customize the activation function in PNAConv (#5262)

Bugfixes

Fixed a bug in TUDataset, in which node features were wrongly constructed whenever node_attributes only hold a single feature (e.g., in PROTEINS) (#5441)

Fixed a bug in the VirtualNode transform, in which node features were mistakenly treated as edge features (#5819)

Fixed a bug when applying several scalers with PNAConv (#5514)

Fixed setter and getter handling in BaseStorage (#5815)

Fixed the auto_select_device routine in GraphGym for pytorch_lightning>=1.7 (#5677)

Fixed RandomLinkSplit in case there aren't enough negative edges to sample (#5642)

Fixed the in-place modification to mode_kwargs in MultiAggregation (#5601)

Fixed the utils.to_dense_adj routine in case edge_index is empty (#5476)

Fixed the PointTransformerConv to now correctly use sum aggregation (#5332)

Fixed the output of InMemoryDataset.num_classes in case a transform modifies data.y (#5274)

Fail gracefully on GLIBC errors within torch-spline-conv (#5276)

Full Changelog

Added

Extended GNNExplainer to support edge level explanations (#6056)

Added CPU affinitization for NodeLoader (#6005)

Added triplet sampling in LinkNeighborLoader (#6004)

Added FusedAggregation of simple scatter reductions (#6036)

Added a to_smiles function (#6038)

Added option to make normalization coefficients trainable in PNAConv (#6039)

Added semi_grad option in VarAggregation and StdAggregation (#6042)

Allow for fused aggregations in MultiAggregation (#6036, #6040)

Added HeteroData support for to_captum_model and added to_captum_input (#5934)

Added HeteroData support in RandomNodeLoader (#6007)

Added bipartite GraphSAGE example (#5834)

Added LRGBDataset to include 5 datasets from the Long Range Graph Benchmark (#5935)

Added a warning for invalid node and edge type names in HeteroData (#5990)

Added PyTorch 1.13 support (#5975)

Added int32 support in NeighborLoader (#5948)

Add dgNN support and FusedGATConv implementation (#5140)

Added lr_scheduler_solver and customized lr_scheduler classes (#5942)

Add to_fixed_size graph transformer (#5939)

Add support for symbolic tracing of SchNet model (#5938)

Add support for customizable interaction graph in SchNet model (#5919)

Started adding torch.sparse support to PyG (#5906, #5944, #6003)

Added HydroNet water cluster dataset (#5537, #5902, #5903)

Added explainability support for heterogeneous GNNs (#5886)

Added SparseTensor support to SuperGATConv (#5888)

Added TorchScript support for AttentiveFP (#5868)

Added num_steps argument to training and inference benchmarks (#5898)

Added torch.onnx.export support (#5877, #5997)

Enable VTune ITT in inference and training benchmarks (#5830, #5878)

Add training benchmark (#5774)

Added a "Link Prediction on MovieLens" Colab notebook (#5823)

Added custom sampler support in LightningDataModule (#5820)

Added a return_semantic_attention_weights argument to HANConv (#5787)

Added disjoint argument to NeighborLoader and LinkNeighborLoader (#5775)

Added support for input_time in NeighborLoader (#5763)

Added disjoint mode for temporal LinkNeighborLoader (#5717)

Added HeteroData support for transforms.Constant (#5700)

Added np.memmap support in NeighborLoader (#5696)

Added assortativity, which computes the degree assortativity coefficient (#5587)

Added SSGConv layer (#5599)

Added shuffle_node, mask_feature and add_random_edge augmentation methods (#5548)

Added dropout_path augmentation that drops edges from a graph based on random walks (#5531)

Add support for filling labels with dummy values in HeteroData.to_homogeneous() (#5540)

Added temporal_strategy option to neighbor_sample (#5576)

Added torch_geometric.sampler package to docs (#5563)

Added the DGraphFin dynamic graph dataset (#5504)

Added dropout_edge augmentation that randomly drops edges from a graph - the usage of dropout_adj is now deprecated (#5495)

Added dropout_node augmentation that randomly drops nodes from a graph (#5481)

Added AddRandomMetaPaths that adds edges based on random walks along a metapath (#5397)

Added WLConvContinuous for performing WL refinement with continuous attributes (#5316)

Added print_summary method for the torch_geometric.data.Dataset interface (#5438)

Added sampler support to LightningDataModule (#5456, #5457)

Added official splits to MalNetTiny dataset (#5078)

Added IndexToMask and MaskToIndex transforms (#5375, #5455)

Added FeaturePropagation transform (#5387)

Added PositionalEncoding (#5381)

Consolidated sampler routines behind torch_geometric.sampler, enabling ease of extensibility in the future (#5312, #5365, #5402, #5404, #5418)

Added pyg-lib neighbor sampling (#5384, #5388)

Added pyg_lib.segment_matmul integration within HeteroLinear (#5330, #5347)

Enabled bf16 support in benchmark scripts (#5293, #5341)

Added Aggregation.set_validate_args option to skip validation of dim_size (#5290)

Added SparseTensor support to inference and training benchmark suite (#5242, #5258, #5881)

Added experimental mode in inference benchmarks (#5254)

Added node classification example instrumented with Weights and Biases (W&B) logging and W&B Sweeps (#5192)

Added experimental mode for utils.scatter (#5232, #5241, #5386)

Added missing test labels in HGBDataset (#5233)

Added BaseStorage.get() functionality (#5240)

Added a test to confirm that to_hetero works with SparseTensor (#5222)

Added torch_geometric.explain module with base functionality for explainability methods (#5804, #6054, #6089)

Changed

Moved and adapted GNNExplainer from torch_geometric.nn to torch_geometric.explain.algorithm (#5967, #6065)

Optimized scatter implementations for CPU/GPU, both with and without backward computation (#6051, #6052)

Support temperature value in dense_mincut_pool (#5908)

Fixed a bug in which VirtualNode mistakenly treated node features as edge features (#5819)

Fixed setter and getter handling in BaseStorage (#5815)

Fixed path in hetero_conv_dblp.py example (#5686)

Fix auto_select_device routine in GraphGym for PyTorch Lightning>=1.7 (#5677)

Support in_channels with tuple in GENConv for bipartite message passing (#5627, #5641)

Handle cases of not having enough possible negative edges in RandomLinkSplit (#5642)

Fix RGCN+pyg-lib for LongTensor input (#5610)

Improved type hint support (#5842, #5603, #5659, #5664, #5665, #5666, #5667, #5668, #5669, #5673, #5675, #5678, #5682, #5683, #5684, #5685, #5687, #5688, #5695, #5699, #5701, #5702, #5703, #5706, #5707, #5710, #5714, #5715, #5716, #5722, #5724, #5725, #5726, #5729, #5730, #5731, #5732, #5733, #5743, #5734, #5735, #5736, #5737, #5738, #5747, #5752, #5753, #5754, #5756, #5757, #5758, #5760, #5766, #5767, #5768, #5781, #5778, #5797, #5798, #5799, #5800, #5806, #5810, #5811, #5828, #5847, #5851, #5852)

Avoid modifying mode_kwargs in MultiAggregation (#5601)

Changed BatchNorm to allow for batches of size one during training (#5530, #5614)

Integrated better temporal sampling support by requiring that local neighborhoods are sorted according to time (#5516, #5602)

Fixed a bug when applying several scalers with PNAConv (#5514)

Allow . in ParameterDict key names (#5494)

Renamed drop_unconnected_nodes to drop_unconnected_node_types and drop_orig_edges to drop_orig_edge_types in AddMetaPaths (#5490)

Improved utils.scatter performance by explicitly choosing better implementation for add and mean reduction (#5399)

Fix to_dense_adj with empty edge_index (#5476)

The AttentionalAggregation module can now be applied to compute attention on a per-feature level (#5449)

Ensure equal lengths of num_neighbors across edge types in NeighborLoader (#5444)

Fixed a bug in TUDataset in which node features were wrongly constructed whenever node_attributes only hold a single feature (e.g., in PROTEINS) (#5441)

Breaking change: removed num_neighbors as an attribute of loader (#5404)

ASAPooling is now jittable (#5395)

Updated unsupervised GraphSAGE example to leverage LinkNeighborLoader (#5317)

Replace in-place operations with out-of-place ones to align with torch.scatter_reduce API (#5353)

Breaking bugfix: PointTransformerConv now correctly uses sum aggregation (#5332)

Improve out-of-bounds error message in MessagePassing (#5339)

Allow file names of a Dataset to be specified as either a property or a method (#5338)

Fixed separating a list of SparseTensor within InMemoryDataset (#5299)

Improved name resolving of normalization layers (#5277)

Fail gracefully on GLIBC errors within torch-spline-conv (#5276)

Fixed Dataset.num_classes in case a transform modifies data.y (#5274)

Allow customization of the activation function within PNAConv (#5262)

Do not fill InMemoryDataset cache on dataset.num_features (#5264)

Changed tests relying on dblp datasets to instead use synthetic data (#5250)

Fixed a bug for the initialization of activation function examples in custom_graphgym (#5243)

Allow any integer tensors when checking edge_index input to message passing (#5281)

Removed

Removed scatter_reduce option from experimental mode (#5399)

Full commit list: https://github.com/pyg-team/pytorch_geometric/compare/2.1.0...2.2.0
2.1.0 (Aug 17, 2022)
We are excited to announce the release of PyG 2.1.0 🎉🎉🎉

Highlights

Breaking Changes

Deprecations

Features

Bugfixes

Full Changelog

PyG 2.1.0 is the culmination of work from over 60 contributors who have worked on features and bug-fixes for a total of over 320 commits since torch-geometric==2.0.4.

Highlights

Principled Aggregations

See here for the accompanying tutorial.

Aggregation functions play an important role in the message passing framework and the readout functions of Graph Neural Networks. Specifically, many works in the literature (Hamilton et al. (2017), Xu et al. (2018), Corso et al. (2020), Li et al. (2020), Tailor et al. (2021), Bartunov et al. (2022)) demonstrate that the choice of aggregation functions contributes significantly to the representational power and performance of the model.

To facilitate further experimentation and unify the concepts of aggregation within GNNs across both MessagePassing and global readouts, we have made the concept of Aggregation a first-class principle in PyG (#4379, #4522, #4687, #4721, #4731, #4762, #4749, #4779, #4863, #4864, #4865, #4866, #4872, #4927, #4934, #4935, #4957, #4973, #4973, #4986, #4995, #5000, #5021, #5034, #5036, #5039, #4522, #5033, #5085, #5097, #5099, #5104, #5113, #5130, #5098, #5191). As of now, PyG provides support for various aggregations — from simple ones (e.g., mean, max, sum), to advanced ones (e.g., median, var, std), learnable ones (e.g., SoftmaxAggregation, PowerMeanAggregation), and exotic ones (e.g., LSTMAggregation, SortAggregation, EquilibriumAggregation). Furthermore, multiple aggregations can be combined and stacked together:

    from torch_geometric.nn import MessagePassing, SoftmaxAggregation

    class MyConv(MessagePassing):
        def __init__(self, ...):
            # Combines a set of aggregations and concatenates their results.
            # The interface also supports automatic resolution.
            super().__init__(aggr=['mean', 'std', SoftmaxAggregation(learn=True)])
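To make the idea of combining aggregations concrete, the following library-free sketch groups scalar messages by their target node, applies several reductions per group, and concatenates the results. The helper `multi_aggregate` is hypothetical and is not part of PyG. It also assumes a population standard deviation, which may differ from what the library computes.

```python
import math
from collections import defaultdict


def multi_aggregate(values, index, aggrs=("mean", "std", "max")):
    """Group scalar `values` by `index` and concatenate several reductions.

    Hypothetical helper for illustration only; it is not part of PyG.
    """
    groups = defaultdict(list)
    for value, idx in zip(values, index):
        groups[idx].append(value)

    def reduce(name, xs):
        mean = sum(xs) / len(xs)
        if name == "mean":
            return mean
        if name == "std":
            # Population standard deviation (an assumption of this sketch).
            return math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
        if name == "max":
            return max(xs)
        raise ValueError(f"Unknown aggregation: {name}")

    # One output row per group, one column per aggregation.
    return {idx: [reduce(name, xs) for name in aggrs]
            for idx, xs in sorted(groups.items())}


# Messages 1.0 and 3.0 arrive at node 0; message 5.0 arrives at node 1.
out = multi_aggregate([1.0, 3.0, 5.0], [0, 0, 1])
```

Each node ends up with one value per aggregation, which mirrors the "concatenates their results" behavior described in the comment of the snippet above.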

Link-level Neighbor Loader

We added a new LinkNeighborLoader class for training scalable GNNs that perform edge-level predictions on giant graphs (#4396, #4439, #4441, #4446, #4508, #4509, #4868). LinkNeighborLoader comes with automatic support for both homogeneous and heterogeneous data, and supports link prediction via automatic negative sampling as well as edge-level classification and regression models:

    from torch_geometric.loader import LinkNeighborLoader

    loader = LinkNeighborLoader(
        data,
        num_neighbors=[30] * 2,  # Sample 30 neighbors for each node for 2 iterations
        batch_size=128,  # Use a batch size of 128 for sampling training links
        edge_label_index=data.edge_index,  # Use the entire graph for supervision
        negative_sampling_ratio=1.0,  # Sample negative edges
    )

    sampled_data = next(iter(loader))
    print(sampled_data)
    >>> Data(x=[1368, 1433], edge_index=[2, 3103], edge_label_index=[2, 256], edge_label=[256])
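The printed batch holds 256 supervision edges for a batch size of 128 because a negative sampling ratio of 1.0 adds one negative edge per positive edge. The sketch below shows that bookkeeping in plain Python. The function `sample_negative_edges` is a hypothetical stand-in, not the sampler PyG uses, and it assumes that any node pair absent from the positive set (self-loops included) counts as a valid negative.

```python
import random


def sample_negative_edges(pos_edges, num_nodes, ratio=1.0, seed=0):
    """Sample `ratio * len(pos_edges)` node pairs that are not positive edges.

    Hypothetical helper for illustration only; it is not the PyG sampler.
    """
    rng = random.Random(seed)
    existing = set(pos_edges)
    num_neg = int(ratio * len(pos_edges))
    max_neg = num_nodes * num_nodes - len(existing)
    if num_neg > max_neg:
        raise ValueError("Not enough negative edges available.")
    negatives = set()
    while len(negatives) < num_neg:
        edge = (rng.randrange(num_nodes), rng.randrange(num_nodes))
        if edge not in existing:
            negatives.add(edge)
    return sorted(negatives)


pos_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
neg_edges = sample_negative_edges(pos_edges, num_nodes=5, ratio=1.0)

# Positive edges are labeled 1 and sampled negatives 0, so a ratio of 1.0
# doubles the number of supervision edges.
edge_label_index = pos_edges + neg_edges
edge_label = [1] * len(pos_edges) + [0] * len(neg_edges)
```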

Neighborhood Sampling based on Temporal Constraints

Both NeighborLoader and LinkNeighborLoader now support temporal sampling via the time_attr argument (#4025, #4877, #4908, #5137, #5173). If set, temporal sampling will be used such that neighbors are guaranteed to fulfill temporal constraints, i.e. neighbors have an earlier timestamp than the center node:

    import torch
    from torch_geometric.loader import NeighborLoader

    # Assign a timestamp to every paper node.
    data['paper'].time = torch.arange(data['paper'].num_nodes)

    loader = NeighborLoader(
        data,
        input_nodes='paper',
        time_attr='time',  # Only sample papers that appeared before the seed paper
        num_neighbors=[30] * 2,
        batch_size=128,
    )

Note that this feature requires torch-sparse>=0.6.14.
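The temporal constraint can be pictured with a small library-free sketch: before sampling, the candidate neighbors of a seed node are filtered by timestamp. The helper `sample_temporal_neighbors` is hypothetical. It assumes a strictly earlier timestamp, following the wording above; the library's exact handling of equal timestamps may differ.

```python
import random


def sample_temporal_neighbors(adj, time, seed_node, num_neighbors, rng=None):
    """Sample up to `num_neighbors` neighbors that are older than `seed_node`.

    Hypothetical helper for illustration only; it is not the PyG sampler.
    """
    rng = rng or random.Random(0)
    # Keep only neighbors with a strictly earlier timestamp than the seed.
    candidates = [n for n in adj.get(seed_node, [])
                  if time[n] < time[seed_node]]
    if len(candidates) <= num_neighbors:
        return candidates
    return rng.sample(candidates, num_neighbors)


# Node 3 is connected to nodes 0, 1, 2, 4 and 5; timestamps equal node ids.
adj = {3: [0, 1, 2, 4, 5]}
time = [0, 1, 2, 3, 4, 5]
sampled = sample_temporal_neighbors(adj, time, seed_node=3, num_neighbors=2)
```

Nodes 4 and 5 can never be returned for seed node 3, since they carry later timestamps.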

Functional DataPipes

See here for the accompanying example.

PyG now fully supports data loading using the newly introduced concept of DataPipes in PyTorch for easily constructing flexible and performant data pipelines (#4302, #4345, #4349). PyG provides DataPipe support for batching multiple PyG data objects together and for applying any PyG transform:

    # FileOpener and FileLister are provided by the torchdata package.
    from torchdata.datapipes.iter import FileLister, FileOpener

    datapipe = FileOpener(['SMILES_HIV.csv'])
    datapipe = datapipe.parse_csv_as_dict()
    datapipe = datapipe.parse_smiles(target_key='HIV_active')
    datapipe = datapipe.in_memory_cache()  # Cache graph instances in-memory.
    datapipe = datapipe.shuffle()
    datapipe = datapipe.batch_graphs(batch_size=32)

    datapipe = FileLister([root_dir], masks='*.off', recursive=True)
    datapipe = datapipe.read_mesh()
    datapipe = datapipe.in_memory_cache()  # Cache graph instances in-memory.
    datapipe = datapipe.sample_points(1024)  # Use PyG transforms from here.
    datapipe = datapipe.knn_graph(k=8)
    datapipe = datapipe.shuffle()
    datapipe = datapipe.batch_graphs(batch_size=32)
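The chained, functional style of the pipelines above can be mimicked in a few lines of plain Python, which may help when reasoning about the order in which steps run. The class `MiniPipe` below is a hypothetical toy and is unrelated to the actual torchdata implementation. It only shows how each call wraps the previous pipe and defers work until iteration.

```python
import random


class MiniPipe:
    """A tiny chainable, re-iterable pipeline (illustration only)."""

    def __init__(self, make_iter):
        # `make_iter` is a zero-argument callable returning a fresh iterator.
        self._make_iter = make_iter

    def __iter__(self):
        return self._make_iter()

    def map(self, fn):
        # Lazily apply `fn` to every element of the upstream pipe.
        return MiniPipe(lambda: (fn(item) for item in self))

    def shuffle(self, seed=0):
        def make_iter():
            items = list(self)
            random.Random(seed).shuffle(items)
            return iter(items)
        return MiniPipe(make_iter)

    def batch(self, batch_size):
        def make_iter():
            batch = []
            for item in self:
                batch.append(item)
                if len(batch) == batch_size:
                    yield batch
                    batch = []
            if batch:  # Emit the final, possibly smaller, batch.
                yield batch
        return MiniPipe(make_iter)


pipe = MiniPipe(lambda: iter(range(10)))
pipe = pipe.map(lambda x: x * x)
pipe = pipe.shuffle(seed=0)
pipe = pipe.batch(batch_size=4)
batches = list(pipe)
```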

Breaking Changes

The torch_geometric.utils.metric package has been removed. We now recommend using the torchmetrics package instead.

len(batch) of the data.Batch class will now return the number of graphs inside the batch, not the number of attributes (#4931)

Deprecations

The usage of the torch_geometric.nn.glob package is now deprecated in favor of torch_geometric.nn.aggr

The usage of RandomTranslate is now deprecated in favor of RandomJitter (#4828)

Features

Layers, Models and Examples

Added the GroupAddRev module with support for reducing training GPU memory (#4671, #4701, #4715, #4730) [Example]

Added the MaskLabel module for performing masked label propagation (#4197) [Example]

Added the DimeNetPlusPlus module (#4432, #4699, #4700, #4800)

Added the MeanSubtractionNorm module (#5068)

Added the DynamicBatchSampler for filling a mini-batch with a variable number of samples up to a maximum size (#4972)

Added PyTorch Lightning support in GraphGym (#4511, #4516, #4531, #4689, #4843)

Added an example of using PyG with PyTorch Ignite (#4487)

Added an example for unsupervised heterogeneous graph learning (#3189)

Added an example for unsupervised GraphSAGE on the PPI dataset (#4416)

Added the EdgeCNN model (#4991)

Added an example to load a trained PyG model in C++ (#4307)
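As an illustration of the idea behind the DynamicBatchSampler entry in the list above (filling a mini-batch with a variable number of samples up to a maximum size), here is a library-free sketch. The generator `dynamic_batches` is hypothetical and not the PyG class. It assumes that size is measured by a per-sample count such as the number of nodes, and that oversized samples are simply skipped.

```python
def dynamic_batches(sizes, max_size):
    """Yield lists of sample indices whose total size stays within `max_size`.

    Hypothetical helper for illustration only; it is not the PyG sampler.
    """
    batch, total = [], 0
    for idx, size in enumerate(sizes):
        if size > max_size:
            # A single oversized sample can never fit; skip it in this sketch.
            continue
        if total + size > max_size:
            # The current batch is full: emit it and start a new one.
            yield batch
            batch, total = [], 0
        batch.append(idx)
        total += size
    if batch:
        yield batch


num_nodes_per_graph = [30, 50, 40, 10, 120, 60]
batches = list(dynamic_batches(num_nodes_per_graph, max_size=100))
```

With these sizes, the batches contain two, two and one graph respectively, and the graph with 120 nodes is dropped because it exceeds the limit on its own.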

Transformations

Added the AddPositionalEncoding transforms with two implementations: AddLaplacianEigenvectorPE and AddRandomWalkPE (#4521)

Added the Rooted transform with two implementations: RootedEgoNets and RootedRWSubgraph (#3926)

Added support for computing weighted metapaths in AddMetaPaths (#5049)

Datasets

Added the Genius and Wiki datasets to the LINKXDataset (#4570, #4600)

Added the AQSOL dataset (#4626)

Added Geom-GCN splits to the Planetoid datasets (#4442)

General Improvements

Added support for GATv2Conv in the GAT model (#4357)

Added support for projecting features before propagation in SAGEConv (#4437)

Added a MessagePassing.explain_message() method to customize making explanations on messages (#4278, #4448)

Added the MLP.plain_last = False option (#4652)

Added support for graph-level attributes in networkx conversion (#4343)

Added Data.validate() and HeteroData.validate() functionality to validate the correctness of the data (#4885)

Added TorchScript support to JumpingKnowledge module (#4805)

Added predict() support to the LightningNodeData module (#4884)

Added support for renaming node types via HeteroData.rename() (#4329)

Added HeteroData.num_features functionality (#4504)

Added HeteroData.subgraph, HeteroData.node_type_subgraph and HeteroData.edge_type_subgraph functionality (#4243)

Added HeteroData support to the RemoveIsolatedNodes transform (#4479)

Added support for graph-level outputs in to_hetero (#4582)

Added HeteroData.is_undirected() support (#4604)

Added HeteroData.node_items() and HeteroData.edge_items() functionality (#4644)

Added HeteroData.subgraph() support (#4635)

Added node-wise normalization mode in LayerNorm (#4944)

Added utils.unbatch and utils.unbatch_edge_index functionality for splitting an edge_index tensor according to a batch vector (#4628, #4903)

Added scalable inference mode in BasicGNN with layer-wise neighbor loading (#4977)

Added fine grained options for setting bias and dropout per layer in the MLP model (#4981)

Added support for BasicGNN models within to_hetero (#5091)

Let ImbalancedSampler accept torch.Tensor as input (#5138)

Allow edge_type == rev_edge_type argument in RandomLinkSplit (#4757)
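The utils.unbatch_edge_index entry in the list above splits an edge_index tensor according to a batch vector. A plain-Python sketch of that operation follows. The function `split_edges_by_graph` is hypothetical. It assumes that the batch vector is sorted (nodes of one graph are contiguous) and that node indices are shifted back so that each graph starts at zero.

```python
def split_edges_by_graph(edges, batch):
    """Split batched `edges` into per-graph edge lists with local node ids.

    Hypothetical helper for illustration only; it is not the PyG function.
    """
    num_graphs = max(batch) + 1 if batch else 0
    # Index of the first node of each graph (assumes `batch` is sorted).
    offsets = [None] * num_graphs
    for node, graph in enumerate(batch):
        if offsets[graph] is None:
            offsets[graph] = node
    per_graph = [[] for _ in range(num_graphs)]
    for src, dst in edges:
        graph = batch[src]  # Both endpoints belong to the same graph.
        per_graph[graph].append((src - offsets[graph], dst - offsets[graph]))
    return per_graph


# Two graphs batched together: nodes 0-2 form graph 0, nodes 3-4 form graph 1.
batch = [0, 0, 0, 1, 1]
edges = [(0, 1), (1, 2), (3, 4), (4, 3)]
per_graph = split_edges_by_graph(edges, batch)
```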

Bugfixes

Fixed a bug in RGATConv that produced device mismatches for "f-scaled" mode (#5187)

Fixed a bug in GINEConv for non-Sequential neural network layers (#5154)

Fixed a bug in HGTLoader which produced outputs with missing edge types; the fix requires torch-sparse>=0.6.15 (#5067)

Fixed a bug in load_state_dict for Linear with strict=False mode (#5094)

Fixed data.num_node_features computation for sparse matrices (#5089)

Fixed a bug in which GraphGym did not create new non-linearity functions but re-used existing ones (#4978)

Fixed BasicGNN for num_layers=1, which now respects a desired number of out_channels (#4943)

Fixed a bug in data.subgraph for 0-dim tensors (#4932)

Fixed a bug in InMemoryDataset inferring wrong length for lists of tensors (#4837)

Fixed a bug in TUDataset where pre_filter was not applied whenever pre_transform was present (#4842)

Fixed access of edge types in HeteroData via two node types when there exist multiple relations between them (#4782)

Fixed a bug in HANConv in which destination node features rather than source node features were propagated (#4753)

Fixed a ranking protocol bug in the RGCN link prediction example (#4688)

Fixed the interplay between TUDataset and pre_transform transformations that modify node features (#4669)

The bias argument in TAGConv is now correctly applied (#4597)

Fixed filtering of attributes in samplers in case __cat_dim__ != 0 (#4629)

Fixed SparseTensor support in NeighborLoader (#4320)

Fixed average degree handling in PNAConv (#4312)

Fixed a bug in from_networkx in case some attributes are PyTorch tensors (#4486)

Fixed a missing clamp in the DimeNet model (#4506, #4562)

Fixed the download link in DBP15K (#4428)

Fixed an autograd bug in DimeNet when resetting parameters (#4424)

Fixed bipartite message passing in case flow="target_to_source" (#4418)

Fixed a bug in which num_nodes was not properly updated in the FixedPoints transform (#4394)

Fixed a bug in which GATConv was not jittable (#4347)

Fixed a bug in which nn.models.GAT did not produce out_channels-many output channels (#4299)

Fixed a bug in mini-batching with empty lists as attributes (#4293)

Fixed a bug in which GCNConv could not be combined with to_hetero on heterogeneous graphs with one node type (#4279)

Full Changelog

Added

Added edge_label_time argument to LinkNeighborLoader (#5137, #5173)

Let ImbalancedSampler accept torch.Tensor as input (#5138)

Added flow argument to gcn_norm to correctly normalize the adjacency matrix in GCNConv (#5149)

NeighborSampler supports graphs without edges (#5072)

Added the MeanSubtractionNorm layer (#5068)

Added pyg_lib.segment_matmul integration within RGCNConv (#5052, #5096)

Support SparseTensor as edge label in LightGCN (#5046)

Added support for BasicGNN models within to_hetero (#5091)

Added support for computing weighted metapaths in AddMetaPaths (#5049)

Added inference benchmark suite (#4915)

Added a dynamically sized batch sampler for filling a mini-batch with a variable number of samples up to a maximum size (#4972)

Added fine grained options for setting bias and dropout per layer in the MLP model (#4981)

Added EdgeCNN model (#4991)

Added scalable inference mode in BasicGNN with layer-wise neighbor loading (#4977)

Added inference benchmarks (#4892, #5107)

Added PyTorch 1.12 support (#4975)

Added unbatch_edge_index functionality for splitting an edge_index tensor according to a batch vector (#4903)

Added node-wise normalization mode in LayerNorm (#4944)

Added support for normalization_resolver (#4926, #4951, #4958, #4959)

Added notebook tutorial for torch_geometric.nn.aggr package to documentation (#4927)

Added support for follow_batch for lists or dictionaries of tensors (#4837)

Added Data.validate() and HeteroData.validate() functionality (#4885)

Added LinkNeighborLoader support to LightningDataModule (#4868)

Added predict() support to the LightningNodeData module (#4884)

Added time_attr argument to LinkNeighborLoader (#4877, #4908)

Added a filter_per_worker argument to data loaders to allow filtering of data within sub-processes (#4873)

Added a NeighborLoader benchmark script (#4815, #4862)

Added support for FeatureStore and GraphStore in NeighborLoader (#4817, #4851, #4854, #4856, #4857, #4882, #4883, #4929, #4992, #4962, #4968, #5037, #5088)

Added a normalize parameter to dense_diff_pool (#4847)

Added size=None explanation to jittable MessagePassing modules in the documentation (#4850)

Added documentation to the DataLoaderIterator class (#4838)

Added GraphStore support to Data and HeteroData (#4816)

Added FeatureStore support to Data and HeteroData (#4807, #4853)

Added FeatureStore and GraphStore abstractions (#4534, #4568)

Added support for dense aggregations in global_*_pool (#4827)

Added Python version requirement (#4825)

Added TorchScript support to JumpingKnowledge module (#4805)

Added a max_sample argument to AddMetaPaths in order to tackle very dense metapath edges (#4750)

Test HANConv with empty tensors (#4756, #4841)

Added the bias vector to the GCN model definition in the "Create Message Passing Networks" tutorial (#4755)

Added transforms.RootedSubgraph interface with two implementations: RootedEgoNets and RootedRWSubgraph (#3926)

Added ptr vectors for follow_batch attributes within Batch.from_data_list (#4723)

Added torch_geometric.nn.aggr package (#4687, #4721, #4731, #4762, #4749, #4779, #4863, #4864, #4865, #4866, #4872, #4934, #4935, #4957, #4973, #4973, #4986, #4995, #5000, #5034, #5036, #5039, #4522, #5033, #5085, #5097, #5099, #5104, #5113, #5130, #5098, #5191)

Added the DimeNet++ model (#4432, #4699, #4700, #4800)

Added an example of using PyG with PyTorch Ignite (#4487)

Added GroupAddRev module with support for reducing training GPU memory (#4671, #4701, #4715, #4730)

Added benchmarks via wandb (#4656, #4672, #4676)

Added unbatch functionality (#4628)

Confirm that to_hetero() works with custom functions, e.g., dropout_adj (#4653)

Added the MLP.plain_last=False option (#4652)

Added a check in HeteroConv and to_hetero() to ensure that MessagePassing.add_self_loops is disabled (#4647)

Added HeteroData.subgraph() support (#4635)

Added the AQSOL dataset (#4626)

Added HeteroData.node_items() and HeteroData.edge_items() functionality (#4644)

Added PyTorch Lightning support in GraphGym (#4511, #4516, #4531, #4689, #4843)

Added support for returning embeddings in MLP models (#4625)

Added faster initialization of NeighborLoader in case edge indices are already sorted (via is_sorted=True) (#4620, #4702)

Added AddPositionalEncoding transform (#4521)

Added HeteroData.is_undirected() support (#4604)

Added the Genius and Wiki datasets to nn.datasets.LINKXDataset (#4570, #4600)

Added nn.aggr.EquilibriumAggregation implicit global layer (#4522)

Added support for graph-level outputs in to_hetero (#4582)

Added CHANGELOG.md (#4581)

Added HeteroData support to the RemoveIsolatedNodes transform (#4479)

Added HeteroData.num_features functionality (#4504)

Added support for projecting features before propagation in SAGEConv (#4437)

Added Geom-GCN splits to the Planetoid datasets (#4442)

Added a LinkNeighborLoader for training scalable link prediction models (#4396, #4439, #4441, #4446, #4508, #4509)

Added an unsupervised GraphSAGE example on PPI (#4416)

Added support for LSTM aggregation in SAGEConv (#4379)

Added support for floating-point labels in RandomLinkSplit (#4311, #4383)

Added support for torch.data DataPipes (#4302, #4345, #4349)

Added support for the cosine argument in the KNNGraph/RadiusGraph transforms (#4344)

Added support for graph-level attributes in networkx conversion (#4343)

Added support for renaming node types via HeteroData.rename (#4329)

Added an example to load a trained PyG model in C++ (#4307)

Added a MessagePassing.explain_message method to customize making explanations on messages (#4278, #4448)

Added support for GATv2Conv in the nn.models.GAT model (#4357)

Added HeteroData.subgraph functionality (#4243)

Added the MaskLabel module and a corresponding masked label propagation example (#4197)

Added temporal sampling support to NeighborLoader (#4025)

Added an example for unsupervised heterogeneous graph learning based on "Deep Multiplex Graph Infomax" (#3189)

Changed

Changed docstring for RandomLinkSplit (#5190)

Switched to PyTorch scatter_reduce implementation - experimental feature (#5120)

Fixed RGATConv device mismatches for f-scaled mode (#5187)

Allow for multi-dimensional edge_labels in LinkNeighborLoader (#5186)

Fixed GINEConv bug with non-sequential input (#5154)

Improved error message (#5095)

Fixed HGTLoader bug which produced outputs with missing edge types (#5067)

Fixed dynamic inheritance issue in data batching (#5051)

Fixed load_state_dict in Linear with strict=False mode (#5094)

Fixed typo in MaskLabel.ratio_mask (#5093)

Fixed data.num_node_features computation for sparse matrices (#5089)

Fixed torch.fx bug with torch.nn.aggr package (#5021)

Fixed GENConv test (#4993)

Fixed packaging tests for Python 3.10 (#4982)

Changed act_dict (part of graphgym) to create individual instances instead of reusing the same ones everywhere (#4978)

Fixed issue where one-hot tensors were passed to F.one_hot (#4970)

Fixed bool arguments in argparse in benchmark/ (#4967)

Fixed BasicGNN for num_layers=1, which now respects a desired number of out_channels (#4943)

len(batch) will now return the number of graphs inside the batch, not the number of attributes (#4931)

Fixed data.subgraph generation for 0-dim tensors (#4932)

Removed unnecessary inclusion of self-loops when sampling negative edges (#4880)

Fixed InMemoryDataset inferring wrong len for lists of tensors (#4837)

Fixed Batch.separate when using it for lists of tensors (#4837)

Correct docstring for SAGEConv (#4852)

Fixed a bug in TUDataset where pre_filter was not applied whenever pre_transform was present (#4842)

Renamed RandomTranslate to RandomJitter - the usage of RandomTranslate is now deprecated (#4828)

Do not allow accessing edge types in HeteroData with two node types when there exist multiple relations between these types (#4782)

Allow edge_type == rev_edge_type argument in RandomLinkSplit (#4757)

Fixed a numerical instability in the GeneralConv and neighbor_sample tests (#4754)

Fixed a bug in HANConv in which destination node features rather than source node features were propagated (#4753)

Fixed versions of checkout and setup-python in CI (#4751)

Fixed protobuf version (#4719)

Fixed the ranking protocol bug in the RGCN link prediction example (#4688)

Math support in Markdown (#4683)

Allow for setter properties in Data (#4682, #4686)

Allow for optional edge_weight in GCN2Conv (#4670)

Fixed the interplay between TUDataset and pre_transform transformations that modify node features (#4669)

Make use of the pyg_sphinx_theme documentation template (#4664, #4667)

Refactored reading molecular positions from sdf file for qm9 datasets (#4654)

Fixed MLP.jittable() bug in case return_emb=True (#4645, #4648)

The generated node features of StochasticBlockModelDataset are now ordered with respect to their labels (#4617)

Fixed typos in the documentation (#4616, #4824, #4895, #5161)

The bias argument in TAGConv is now actually applied (#4597)

Fixed subclass behaviour of process and download in Dataset (#4586)

Fixed filtering of attributes for loaders in case __cat_dim__ != 0 (#4629)

Fixed SparseTensor support in NeighborLoader (#4320)

Fixed average degree handling in PNAConv (#4312)

Fixed a bug in from_networkx in case some attributes are PyTorch tensors (#4486)

Added a missing clamp in DimeNet (#4506, #4562)

Fixed the download link in DBP15K (#4428)

Fixed an autograd bug in DimeNet when resetting parameters (#4424)

Fixed bipartite message passing in case flow="target_to_source" (#4418)

Fixed a bug in which num_nodes was not properly updated in the FixedPoints transform (#4394)

PyTorch Lightning >= 1.6 support (#4377)

Fixed a bug in which GATConv was not jittable (#4347)

Fixed a bug in which the GraphGym config was not stored in each specific experiment directory (#4338)

Fixed a bug in which nn.models.GAT did not produce out_channels-many output channels (#4299)

Fixed mini-batching with empty lists as attributes (#4293)

Fixed a bug in which GCNConv could not be combined with to_hetero on heterogeneous graphs with one node type (#4279)

Removed

Remove internal metrics in favor of torchmetrics (#4287)

Full commit list: https://github.com/pyg-team/pytorch_geometric/compare/2.0.4...2.1.0
2.0.4 (Mar 12, 2022)
PyG 2.0.4 🎉

A new minor PyG version release, bringing PyTorch 1.11 support to PyG. It further includes a variety of new features and bugfixes:

Features

Added Quiver examples for multi-GPU training using GraphSAGE (#4103), thanks to @eedalong and @luomai

nn.model.to_captum: Full integration of explainability methods provided by the Captum library (#3990, #4076), thanks to @RBendias

nn.conv.RGATConv: The relational graph attentional operator (#4031, #4110), thanks to @fork123aniket

nn.pool.DMoNPooling: The spectral modularity pooling operator (#4166, #4242), thanks to @fork123aniket

nn.*: Support for shape information in the documentation (#3739, #3889, #3893, #3946, #3981, #4009, #4120, #4158), thanks to @saiden89 and @arunppsg and @konstantinosKokos

loader.TemporalDataLoader: A dataloader to load a TemporalData object in mini-batches (#3985, #3988), thanks to @otaviocx

loader.ImbalancedSampler: A weighted random sampler that randomly samples elements according to class distribution (#4198)

transforms.VirtualNode: A transform that adds a virtual node to a graph (#4163)

transforms.LargestConnectedComponents: Selects the subgraph that corresponds to the largest connected components in the graph (#3949), thanks to @abojchevski

utils.homophily: Support for class-insensitive edge homophily (#3977, #4152), thanks to @hash-ir and @jinjh0123

utils.get_mesh_laplacian: Mesh Laplacian computation (#4187), thanks to @daniel-unyi-42
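To give an intuition for the loader.ImbalancedSampler entry in the list above, the sketch below draws sample indices with weights derived from the class distribution. The function `imbalanced_sample` is hypothetical and is not the PyG class. It assumes inverse class frequency weighting, so that every class is drawn about equally often regardless of its size.

```python
import random
from collections import Counter


def imbalanced_sample(labels, num_samples, seed=0):
    """Draw indices with probability inversely proportional to class size.

    Hypothetical helper for illustration only; it is not the PyG sampler.
    """
    counts = Counter(labels)
    # Each element of a class with n members gets weight 1 / n, so every
    # class receives the same total weight.
    weights = [1.0 / counts[label] for label in labels]
    rng = random.Random(seed)
    return rng.choices(range(len(labels)), weights=weights, k=num_samples)


# 90 samples of class 0 and 10 samples of class 1.
labels = [0] * 90 + [1] * 10
indices = imbalanced_sample(labels, num_samples=1000)
minority_share = sum(labels[i] for i in indices) / len(indices)
```

Although class 1 makes up only 10% of the data, it accounts for roughly half of the drawn indices under this weighting.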

Datasets

Added a dataset cheatsheet to the documentation that collects important graph statistics across a variety of datasets supported in PyG (#3807, #3817) (please consider helping us fill in its remaining content)

datasets.EllipticBitcoinDataset: A dataset of Bitcoin transactions (#3815), thanks to @shravankumar147

Minor Changes

nn.models.MLP: MLPs can now either be initialized via a list of channels or by specifying hidden_channels and num_layers (#3957)

nn.models.BasicGNN: Final Linear transformations are now always applied (except for jk=None) (#4042)

nn.conv.MessagePassing: Message passing modules that make use of edge_updater are now jittable (#3765), thanks to @Padarn

nn.conv.MessagePassing: (Official) support for min and mul aggregations (#4219)

nn.LightGCN: Initialize embeddings via xavier_uniform for better model performance (#4083), thanks to @nishithshowri006

nn.conv.ChebConv: Automatic eigenvalue approximation (#4106), thanks to @daniel-unyi-42

nn.conv.APPNP: Added support for optional edge_weight (690a01d), thanks to @YueeXiang

nn.conv.GravNetConv: Support for torch.jit.script (#3885), thanks to @RobMcH

nn.pool.global_*_pool: The batch vector is now optional (#4161)

nn.to_hetero: Added a warning in case to_hetero is used on HeteroData metadata with unused destination node types (#3775)

nn.to_hetero: Support for nested modules (ea135bf)

nn.Sequential: Support for indexing (#3790)

nn.Sequential: Support for OrderedDict as input (#4075)

datasets.ZINC: Added an in-depth description of the task (#3832), thanks to @gasteigerjo

datasets.FakeDataset: Support for different feature distributions across different labels (#4065), thanks to @arunppsg

datasets.FakeDataset: Support for custom global attributes (#4074), thanks to @arunppsg

transforms.NormalizeFeatures: Features will no longer be transformed in-place (ada5b9a)

transforms.NormalizeFeatures: Support for negative feature values (6008e30)

utils.is_undirected: Improved efficiency (#3789)

utils.dropout_adj: Improved efficiency (#4059)

utils.contains_isolated_nodes: Improved efficiency (970de13)

utils.to_networkx: Support for to_undirected options (upper triangle vs. lower triangle) (#3901, #3948), thanks to @RemyLau

graphgym: Support for custom metrics and loggers (#3494), thanks to @RemyLau

graphgym.register: Register operations can now be used as class decorators (#3779, #3782)

Documentation: Added a few exercises at the end of documentation tutorials (#3780), thanks to @PabloAMC

Documentation: Added better installation instructions to CONTRIBUTING.md (#3803, #3991, #3995), thanks to @Cho-Geonwoo and @RBendias and @RodrigoVillatoro

Refactor: Clean-up dependencies (#3908, #4133, #4172), thanks to @adelizer

CI: Improved test runtimes (#4241)

CI: Additional linting check via yamllint (#3886)

CI: Additional linting check via isort (66b1780), thanks to @mananshah99

torch.package: Model packaging via torch.package (#3997)

Bugfixes

data.HeteroData: Fixed a bug in data.{attr_name}_dict in case data.{attr_name} does not exist (#3897)

data.Data: Fixed data.is_edge_attr in case data.num_edges == 1 (#3880)

data.Batch: Fixed a device mismatch bug in case a batch object was indexed that was created from GPU tensors (e6aa4c9, c549b3b)

data.InMemoryDataset: Fixed a bug in which copy did not respect the underlying slice (d478dcb, #4223)

nn.conv.MessagePassing: Fixed message passing with zero nodes/edges (#4222)

nn.conv.MessagePassing: Fixed bipartite message passing with flow="target_to_source" (#3907)

nn.conv.GeneralConv: Fixed an issue in case skip_linear=False and in_channels=out_channels (#3751), thanks to @danielegrattarola

nn.to_hetero: Fixed model transformation in case node type names or edge type names contain whitespaces or dashes (#3882, b63a660)

nn.dense.Linear: Fixed a bug in lazy initialization for PyTorch < 1.8.0 (973d17d, #4086)

nn.norm.LayerNorm: Fixed a bug in the shape of weights and biases (#4030), thanks to @marshka

nn.pool: Fixed torch.jit.script support for torch-cluster functions (#4047)

datasets.TOSCA: Fixed a bug in which indices of faces started at 1 rather than 0 (8c282a0), thanks to @JRowbottomGit

datasets.WikiCS: Fixed WikiCS to be undirected by default (#3796), thanks to @pmernyei

Resolved inconsistency between utils.contains_isolated_nodes and data.has_isolated_nodes (#4138)

graphgym: Fixed the loss function regarding multi-label classification (#4206), thanks to @RemyLau

Documentation: Fixed typos, grammar and bugs (#3840, #3874, #3875, #4149), thanks to @itamblyn, @chrisyeh96 and @finquick

2.0.3(Dec 22, 2021)
PyG 2.0.3 🎉

A new minor PyG version release, including a variety of new features and bugfixes:

Features

GLNN: Graph-less Neural Networks [Example] (#3572)

LINKX: Large Scale Learning on Non-Homophilous Graphs [Example] (#3654)

Added an example for heterogeneous link classification (#3350) - thanks to @anniekmyatt

HANConv: The Heterogeneous Graph Attention operator [Example] (#3444, #3577, #3581) - thanks to @rishubhkhurana and @wsad1

LGConv and LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation (#3685) - thanks to @LukasHaas and @KathyFeiyang

PyTorch Lightning DataModule wrappers for PyG+PL multi-GPU training/inference without replicating datasets across processes:

torch_geometric.data.LightningDataset for multi-GPU training via PL on graph-level tasks [Example] (#3596, #3634)

torch_geometric.data.LightningNodeData for multi-GPU training via PL on node-level tasks [Example] (#3613, #3634)

NeighborLoader: Added CUDA support leading to major runtime improvements [Example] (#3736)

MessagePassing: Added the edge_updater/edge_update interface for updating edge features (#3450) - thanks to @Padarn

GNNExplainer: Added an example that reproduces the official BA-Shapes experiment (#3386) - thanks to @RBendias

torch_geometric.graphgym: Support for heterogeneous graphs and lazy initialization (#3460) - thanks to @JiaxuanYou

MLP: Added a basic MLP implementation (#3553)

PointTransformer: Classification and segmentation examples (#3344) - thanks to @QuanticDisaster and @wsad1

ShaDowKHopSampler: Added an example (#3411) - thanks to @SubhajitDuttaChowdhury

Data.subgraph(...) implementation (#3521)

Datasets

HGBDataset benchmark suite (#3454)

MalNetTiny dataset (#3472) - thanks to @rampasek

OMDB: Organic Materials Database (#3506)

BAShapes: The BA-Shapes dataset (#3386) - thanks to @RBendias

PolBlogs and EmailEUCore datasets (#3534) - thanks to @AlexDuvalinho

StochasticBlockModel and RandomPartition graph datasets (#3586) - thanks to @dongkwan-kim

LINKXDataset: A subset of the non-homophilous benchmark datasets from LINKX

FakeDataset and FakeHeteroDataset for testing purposes (#3741) - thanks to @levulinh

Minor Changes

torch_geometric.nn.norm: Improved the runtimes of normalization layers - thanks to @johnpeterflynn

DataLoader and NeighborLoader: Output tensors are now written to shared memory to avoid an extra copy in case num_workers > 0 (#3401 and #3734) - thanks to @johnpeterflynn

GATv2Conv: Support for edge features (#3421) - thanks to @Kenneth-Schroeder

Batch.from_data_list: Runtime improvements

TransformerConv: Runtime and memory consumption improvements (#3392) - thanks to @wsad1

mean_iou: Added IoU computation via omitting NaNs (#3464) - thanks to @GericoVi

DataLoader: follow_batch and exclude_keys are now optional arguments

Improvements to the package metadata (#3445) - thanks to @cthoyt

Updated the quick start widget to support PyTorch 1.10 (#3474) - thanks to @kathyfan

NeighborLoader and HGTLoader: Removed the persistent_workers=True default

voxel_grid: The batch argument is now optional (#3533) - thanks to @QuanticDisaster

TransformerConv: JIT support (#3538) - thanks to @RobMcH

Lazy modules can now correctly be saved and loaded via state_dict() and load_state_dict() (#3651) - thanks to @shubham-gupta-iitr

from_networkx: Support for nx.MultiDiGraph (#3646) - thanks to @max-zipfl-fzi

GATv2Conv: Support for lazy initialization (#3678) - thanks to @richcmwang

torch_geometric.graphgym: register_* functions can now be used as decorators (#3684)

AddSelfLoops: Now supports the full argument set of torch_geometric.utils.add_self_loops (#3702) - thanks to @dongkwan-kim

Documentation: Added shape information to a variety of GNN operators, e.g., GATConv or ChebConv (#3697) - thanks to @saiden89

GATv2Conv and HEATConv: Removed unnecessary size argument in forward (#3744) - thanks to @saiden89
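
The mean_iou entry above refers to skipping NaN values when averaging per-class IoU scores. A minimal pure-Python sketch of that idea follows. It assumes that a NaN arises when a class appears in neither the predictions nor the targets (a 0/0 division); the function names and the toy labels are hypothetical, and the library function operates on tensors and may differ in details.

```python
import math


def per_class_iou(pred, target, num_classes):
    """IoU per class; NaN where a class occurs in neither pred nor target."""
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        ious.append(intersection / union if union > 0 else math.nan)
    return ious


def mean_iou_omit_nans(pred, target, num_classes):
    """Average the per-class IoU values, skipping undefined (NaN) classes."""
    valid = [v for v in per_class_iou(pred, target, num_classes) if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan


pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
ious = per_class_iou(pred, target, num_classes=3)  # class 2 never occurs
score = mean_iou_omit_nans(pred, target, num_classes=3)
```

Here class 2 contributes a NaN, so the mean is taken over classes 0 and 1 only.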

Bugfixes

GNNExplainer: Fixed a bug in the GCN example in which normalization coefficients were wrongly calculated (#3508) - thanks to @RBendias

HGTConv: Fixed a bug in the residual connection formulation - thanks to @zzhnobug

torch_geometric.graphgym: Fixed a bug in the creation of MLP (#3431) - thanks to @JiaxuanYou

torch_geometric.graphgym: Fixed a bug in the dimensionality of GeneralMultiLayer (#3456) - thanks to @JiaxuanYou

RandomLinkSplit: Fixed a bug in negative edge sampling for undirected graphs (#3440) - thanks to @panisson

add_self_loops: Fixed a bug in adding self-loops with scalar-valued weights

SchNet: Fixed a bug in which a bias vector was not correctly initialized as zero - thanks to @nec4

Batch.from_data_list: Replaced the torch.repeat_interleave call due to errors in forked processes (#3566) - thanks to @Enolerobotti

NeighborLoader: Fixed a bug in conjunction with PyTorch Lightning (#3602) - thanks to @pbielak

NeighborLoader and ToSparseTensor: Fixed a bug in case num_nodes == num_edges (#3683) - thanks to @WuliangHuang

ToUndirected: Fixed a bug in case num_nodes == 2 (#3627) - thanks to @aur3l14no

FiLMConv: Fixed a bug in the backward pass due to the usage of in-place operations - thanks to @Jokeren

GDC: Fixed a bug in case K > num_nodes - thanks to @Misterion777

LabelPropagation: Fixed a bug in the order of transformations (#3639) - thanks to @Riyer01

negative_sampling: Fixed execution for GPU input tensors - thanks to @Sticksword and @lmy86263

HeteroData: Fixed a bug in which node types were interpreted as edge types in case they were described by two characters (#3692)

FastRGCNConv: Fixed a bug in which weights were indexed on destination node index rather than source node index (#3690) - thanks to @Jokeren

WikipediaNetwork: Fixed a bug in downloading due to a change in URLs - thanks to @csbobby and @Kousaka-Honoka

2.0.2(Oct 26, 2021)
A new minor version release, including further bugfixes, official PyTorch 1.10 support, as well as additional features and operators:

Features

Added video tutorials and Colabs from the PyTorch Geometric Tutorial project (thanks to @AntonioLonga)

Added the GraphMultisetTransformer operator (thanks to @JinheonBaek)

Added the PointTransformerConv operator (thanks to @QuanticDisaster)

Added the HEATConv operator (thanks to @Xiaoyu006)

Added the PNA GNN model (thanks to @RBendias)

Added the AddMetaPaths transform, which will add additional edge types to a HeteroData object based on a list of metapaths (thanks to @wsad1)

Added the Data.to_heterogeneous method to allow for the conversion from Data to HeteroData objects

Added the AttributedGraphDataset, containing a variety of attributed graphs

Added the Airports datasets

Added the structured_negative_sampling_feasible method, which checks if structured_negative_sampling is feasible (thanks to @WuliangHuang)

GATConv can now make use of multi-dimensional edge features to compute attention scores (thanks to @dongkwan-kim)

RandomNodeSplit and RandomLinkSplit now support HeteroData as input

MessagePassing inference can now be sped up via the decomposed_layers argument (thanks to @ZhouAo-ZA)

negative_sampling and batched_negative_sampling now support negative sampling in bipartite graphs

HeteroConv now supports the inclusion of arbitrary node-level or edge-level information for the underlying MessagePassing operators

GNNExplainer now supports multiple node-level masks and explaining regression problems (thanks to @gregorkrz)

Minor Changes

Data.to_homogeneous will now add node_type information to the homogeneous Data object

GINEConv can now transform edge features automatically in case their dimensionalities do not match (thanks to @CaypoH)

OGB_MAG will now add node_year information to paper nodes

Entities datasets now allow the processing of HeteroData objects via the hetero=True option

Batch objects can now be batched together to form super batches

Added heterogeneous graph support for Center, Constant and LinearTransformation transformations

HeteroConv can now return "stacked" embeddings

The batch vector of a Batch object will now be initialized on the GPU in case other attributes are held in GPU memory

Bugfixes

Fixed the num_neighbors argument of NeighborLoader in order to specify an edge-type specific number of neighbors

Fixed the collate policy of lists of integers/strings to return nested lists

Fixed the Delaunay transformation in case the face attribute is not present in the data

Fixed the TGNMemory module to only read from the latest update (thanks to @cwh104504)

Fixed the pickle.PicklingError when Batch objects are used in a torch.multiprocessing.manager.Queue() (thanks to @RasmusOrsoe)

Fixed an issue with _parent state changing after pickling of Data objects (thanks to @zepx)

Fixed the ToUndirected transformation in case the number of edges and nodes are equal (thanks to @lmkmkrcc)

Fixed the from_networkx routine in case node-level and edge-level features share the same names

Removed the num_nodes warning when creating PairData objects

Fixed the initialization of the GeneralMultiLayer module in GraphGym (thanks to @fjulian)

Fixed custom model registration in GraphGym

Fixed a clash in the run_dir naming of GraphGym (thanks to @fjulian)

Fixed a GraphGym crash in case the ROC score is undefined (thanks to @fjulian)

Fixed the Batch.from_data_list routine on dataset slices (thanks to @dtortorella)

Fixed the MetaPath2Vec model in case there exist isolated nodes

Fixed torch_geometric.utils.coalesce with CUDA tensors

2.0.1(Sep 16, 2021)
PyG 2.0.1

This is a minor release, bringing some emergency fixes to PyG 2.0.

Bugfixes

Fixed a bug in loader.DataLoader that raised a PicklingError for num_workers > 0 (thanks to @r-echeveste, @arglog and @RishabhPandit-00)

Fixed a bug in the creation of data.Batch objects in case customized data.Data objects expect non-default arguments (thanks to @Emiyalzn)

Fixed a bug in which SparseTensor attributes could not be batched along single dimensions (thanks to @rubenwiersma)

2.0.0(Sep 13, 2021)
PyG 2.0 🎉 🎉 🎉

PyG (PyTorch Geometric) has been moved from my own personal account rusty1s to its own organization account pyg-team to emphasize the ongoing collaboration between TU Dortmund University, Stanford University and many great external contributors. With this, we are releasing PyG 2.0, a new major release that brings sophisticated heterogeneous graph support, GraphGym integration and many other exciting features to PyG.

If you encounter any bugs in this new release, please do not hesitate to create an issue.

Heterogeneous Graph Support

We finally provide full heterogeneous graph support in PyG 2.0. See here for the accompanying tutorial.

Highlights

Heterogeneous Graph Storage: Heterogeneous graphs can now be stored in their own dedicated data.HeteroData class (thanks to @yaoyaowd):

    import torch
    from torch_geometric.data import HeteroData

    data = HeteroData()

    # Create two node types "paper" and "author" holding a single feature matrix:
    data['paper'].x = torch.randn(num_papers, num_paper_features)
    data['author'].x = torch.randn(num_authors, num_authors_features)

    # Create an edge type ("paper", "written_by", "author") holding its graph connectivity:
    data['paper', 'written_by', 'author'].edge_index = ...  # [2, num_edges]

data.HeteroData behaves similarly to a regular homogeneous data.Data object:

    print(data['paper'].num_nodes)
    print(data['paper', 'written_by', 'author'].num_edges)
    data = data.to('cuda')

Heterogeneous Mini-Batch Loading: Heterogeneous graphs can be converted to mini-batches for many small and single giant graphs via the loader.DataLoader and loader.NeighborLoader loaders, respectively. These loaders can now handle both homogeneous and heterogeneous graphs:

    from torch_geometric.loader import DataLoader

    loader = DataLoader(heterogeneous_graph_dataset, batch_size=32, shuffle=True)

    from torch_geometric.loader import NeighborLoader

    loader = NeighborLoader(heterogeneous_graph, num_neighbors=[30, 30], batch_size=128,
                            input_nodes=('paper', data['paper'].train_mask), shuffle=True)

Heterogeneous Graph Neural Networks: Heterogeneous GNNs can now easily be created from homogeneous ones via nn.to_hetero and nn.to_hetero_with_bases. These functions take an existing GNN model and duplicate its message functions to account for different node and edge types:

    import torch
    from torch_geometric.nn import SAGEConv, to_hetero

    class GNN(torch.nn.Module):
        def __init__(self, hidden_channels, out_channels):
            super().__init__()
            self.conv1 = SAGEConv((-1, -1), hidden_channels)
            self.conv2 = SAGEConv((-1, -1), out_channels)

        def forward(self, x, edge_index):
            x = self.conv1(x, edge_index).relu()
            x = self.conv2(x, edge_index)
            return x

    model = GNN(hidden_channels=64, out_channels=dataset.num_classes)
    model = to_hetero(model, data.metadata(), aggr='sum')

Additional Features

A heterogeneous graph tutorial describing all newly released features (thanks to @mrjel)

A variety of heterogeneous GNN examples

Support for lazy initialization of GNN operators by passing -1 to the in_channels argument (implemented via nn.dense.Linear). This avoids having to calculate and keep track of input tensor sizes, simplifying the creation of heterogeneous graph models with varying feature dimensionalities across different node and edge types. Lazy initialization is supported for all existing PyG operators (thanks to @yaoyaowd):

    from torch_geometric.nn import GATConv

    conv = GATConv(-1, 64)

    # We can initialize the model’s parameters by calling it once:
    conv(x, edge_index)

nn.conv.HeteroConv: A generic wrapper for computing graph convolution on heterogeneous graphs (thanks to @RexYing)

nn.conv.HGTConv: The heterogeneous graph transformer operator from the "Heterogeneous Graph Transformer" paper

loader.HGTLoader: The heterogeneous graph sampler from the "Heterogeneous Graph Transformer" paper for learning on large-scale heterogeneous graphs (thanks to @chantat)

Support for heterogeneous graph transformations in transforms.AddSelfLoops, transforms.ToSparseTensor, transforms.NormalizeFeatures and transforms.ToUndirected

New heterogeneous graph datasets: datasets.OGB_MAG, datasets.IMDB, datasets.DBLP and datasets.LastFM

Support for converting heterogeneous graphs to "typed" homogeneous ones via data.HeteroData.to_homogeneous (thanks to @yzhao062)

A tutorial on creating a data.HeteroData object from raw *.csv files (thanks to @yaoyaowd and @mrjel)

An example to scale heterogeneous graph models via PyTorch Lightning

Managing Experiments with GraphGym

GraphGym is now officially supported in PyG 2.0 via torch_geometric.graphgym. See here for the accompanying tutorial. Overall, GraphGym is a platform for designing and evaluating Graph Neural Networks from configuration files via a highly modularized pipeline (thanks to @JiaxuanYou):

GraphGym is the perfect place to start learning about standardized GNN implementation and evaluation

GraphGym provides a simple interface to try out thousands of GNN architectures in parallel to find the best design for your specific task

GraphGym lets you easily do hyper-parameter search and visualize what design choices are better

Breaking Changes

The datasets.AMiner dataset now returns a data.HeteroData object. See here for our updated MetaPath2Vec example on AMiner.

transforms.AddTrainValTestMask has been replaced by transforms.RandomNodeSplit

Since the storage layout of data.Data significantly changed in order to support heterogeneous graphs, already processed datasets need to be re-processed by deleting the root/processed folder.

data.Data.__cat_dim__ and data.Data.__inc__ now expect additional input arguments:
    def __cat_dim__(self, key, value, *args, **kwargs):
        pass

    def __inc__(self, key, value, *args, **kwargs):
        pass

In case you modified __cat_dim__ or __inc__ functionality in a customized data.Data object, please make sure to apply the above changes.
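
To make the role of these two hooks concrete, here is a pure-Python sketch of how a batching routine might consult them. It uses nested lists instead of tensors, and TinyData, concatenate and collate are hypothetical stand-ins rather than PyG classes or functions. The sketch assumes that __cat_dim__ names the dimension to concatenate along and that __inc__ returns the offset added to the values of each subsequent graph.

```python
class TinyData:
    """Hypothetical stand-in for a graph object (plain lists, no tensors)."""

    def __init__(self, x, edge_index):
        self.x = x                    # one feature row per node
        self.edge_index = edge_index  # [source_ids, target_ids]

    def __cat_dim__(self, key, value, *args, **kwargs):
        # Edge indices grow along their last dimension, everything else along dim 0.
        return -1 if key == "edge_index" else 0

    def __inc__(self, key, value, *args, **kwargs):
        # Node ids in edge_index are shifted by the number of nodes seen so far.
        return len(self.x) if key == "edge_index" else 0


def concatenate(values, dim):
    """Concatenate 2D nested lists along dim 0 (rows) or dim -1 (columns)."""
    if dim == 0:
        return [row for value in values for row in value]
    return [[v for value in values for v in value[r]] for r in range(len(values[0]))]


def collate(graphs, keys=("x", "edge_index")):
    """Merge several TinyData objects into one dictionary of batched values."""
    batch = {}
    for key in keys:
        shifted, offset = [], 0
        for graph in graphs:
            value = getattr(graph, key)
            shifted.append([[v + offset for v in row] for row in value])
            offset += graph.__inc__(key, value)
        dim = graphs[0].__cat_dim__(key, getattr(graphs[0], key))
        batch[key] = concatenate(shifted, dim)
    return batch


g1 = TinyData(x=[[1.0], [2.0]], edge_index=[[0], [1]])
g2 = TinyData(x=[[3.0], [4.0], [5.0]], edge_index=[[0, 1], [1, 2]])
batch = collate([g1, g2])
```

In this toy batch the edges of the second graph are shifted by two node ids, giving batch["edge_index"] == [[0, 2, 3], [1, 3, 4]]. The extra *args and **kwargs in the signatures simply let the caller pass additional context without breaking custom overrides.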

Deprecations

nn.conv.PointConv is deprecated in favour of nn.conv.PointNetConv (thanks to @lelouedec and @QuanticDisaster)

utils.train_test_split_edges is deprecated in favour of the new transforms.RandomLinkSplit transform

All data loaders were moved from torch_geometric.data to torch_geometric.loader, e.g.:
from torch_geometric.loader import DataLoader

loader.NeighborSampler is deprecated in favour of loader.NeighborLoader in order to simplify the application of neighbor sampling and to support neighbor sampling in both homogeneous and heterogeneous graphs

Data.contains_isolated_nodes and Data.contains_self_loops are deprecated in favour of Data.has_isolated_nodes and Data.has_self_loops, respectively

Additional Features

torch-scatter and torch-sparse now support half-precision computation via torch.half, bringing half-precision support to PyG

Added a GNN cheatsheet to the documentation, which lets you more easily choose a GNN operator for your specific need

Added the transforms.RandomLinkSplit transform to easily perform a random edge-level split (thanks to @RexYing)

Added the torch_geometric.profile package which provides a variety of utility functions for benchmarking the runtime and memory consumption of GNN models (thanks to @yzhao062)

nn.conv.MessagePassing now supports hooks for propagate, message, aggregate and update functions, e.g. via nn.conv.MessagePassing.register_propagate_forward_hook

Added the nn.conv.GeneralConv operator that can handle most GNN use-cases (e.g., w/ or w/o edge features, ...) and has enough design options to be tuned (e.g., attention, skip-connections, ...) (thanks to @JiaxuanYou)

Added the nn.models.RECT_L model for learning with completely-imbalanced labels (thanks to @Fizyhsp)

Added the Pathfinder Discovery Network Convolutional operator nn.conv.PDNConv (thanks to @benedekrozemberczki)

Added basic GNN model support as part of the nn.models package, e.g., nn.models.GCN, nn.models.GraphSAGE, nn.models.GAT and nn.models.GIN. Pre-defined models support customizing hidden feature dimensionality, number of layers, activation, normalization and jumping knowledge (thanks to @PabloAMC)

Added the datasets.MD17 datasets (thanks to @M-R-Schaefer)

Added a link-prediction example of nn.conv.RGCNConv (thanks to @moritzblum)

Added an example of nn.pool.MemPooling (thanks to @wsad1)

Added a return_attention_weights argument for nn.conv.TransformerConv (thanks to @wsad1)

Batch support for utils.homophily (thanks to @wsad1)

Added a batch_size argument to utils.to_dense_batch (thanks to @jimmiebtlr)

Minor Changes

Heavily improved loading times of import torch_geometric

nn.Sequential is now fully jittable

nn.conv.LEConv is now fully jittable (thanks to @lucagrementieri)

nn.conv.GENConv can now make use of "add", "mean" or "max" aggregations (thanks to @riskiem)

Attributes of type torch.nn.utils.rnn.PackedSequence are now correctly handled by data.Data and data.HeteroData (thanks to @WuliangHuang)

Added support for data.record_stream() in order to allow for data prefetching (thanks to @FarzanT)

Added a max_num_neighbors attribute to nn.models.SchNet and nn.models.DimeNet (thanks to @nec4)

nn.conv.MessagePassing is now jittable in case message, aggregate and update return multiple arguments (thanks to @PhilippThoelke)

utils.from_networkx now supports grouping of node-level and edge-level features (thanks to @PabloAMC)

Transforms now inherit from transforms.BaseTransform to ease type checking (thanks to @CCInc)

Added support for the deletion of data attributes via del data[key] (thanks to @Linux-cpp-lisp)

Bugfixes

The transforms.LinearTransformation transform now correctly transposes the input matrix before applying the transformation (thanks to @beneisner)

Fixed a bug in benchmark/kernel that prevented the application of DiffPool on the IMDB-BINARY dataset (thanks to @dongZheX)

Feature dimensionalities of datasets.WikipediaNetwork now match the officially reported ones in case geom_gcn_preprocess=True (thanks to @ZhuYun97 and @GitEventhandler)

Fixed a bug in the datasets.DynamicFAUST dataset in which data.num_nodes was undefined (thanks to @koustav123)

Fixed a bug in which nn.models.GNNExplainer could not handle GNN operators that add self-loops to the graph in case self-loops were already present (thanks to @tw200464tw and @NithyaBhasker)

nn.norm.LayerNorm may no longer produce NaN gradients (thanks to @fbragman)

Fixed a bug in which it was not possible to customize networkx drawing arguments in nn.models.GNNExplainer.visualize_subgraph() (thanks to @jvansan)

transforms.RemoveIsolatedNodes now correctly removes isolated nodes in case data.num_nodes is explicitly set (thanks to @blakechi)

1.7.2(Jun 26, 2021)
Datasets

The GitHub Web and ML developer dataset (thanks to @benedekrozemberczki)

The FacebookPagePage dataset (thanks to @benedekrozemberczki)

The Twitch gamer datasets (thanks to @benedekrozemberczki)

The DeezerEurope dataset (thanks to @benedekrozemberczki)

The GemsecDeezer dataset (thanks to @benedekrozemberczki)

The LastFMAsia dataset (thanks to @benedekrozemberczki)

The WikipediaNetwork datasets now allow usage of the raw dataset as introduced in Multi-scale Attributed Node Embedding (thanks to @benedekrozemberczki)

Bugfixes

Fixed an error in DeepGCNLayer in case no normalization layer is provided (thanks to @lukasfolle)

Fixed a bug in GNNExplainer which mixed the loss computation for graph-level and node-level predictions (thanks to @panisson and @wsad1)

1.7.1(Jun 17, 2021)
A minor release that brings PyTorch 1.9.0 and Python 3.9 support to PyTorch Geometric. In case you are in the process of updating to PyTorch 1.9.0, please re-install the external dependencies for PyTorch 1.9.0 as well (torch-scatter and torch-sparse).

Features

EGConv (thanks to @shyam196)

GATv2Conv (thanks to @shakedbr)

GraphNorm normalization layer

GNNExplainer now supports explaining graph-level predictions (thanks to @wsad1)

bro and gini regularization (thanks to @rhsimplex)

train_test_split_edges() and to_undirected() can now handle edge features (thanks to @saiden89 and @SherylHYX)

Datasets can now be accessed with np.ndarray as well (thanks to @josephenguehard)

dense_to_sparse can now handle batched adjacency matrices

numba is now an optional dependency

Datasets

The tree-structured fake news propagation UPFD dataset (thanks to @YingtongDou)

The large-scale AmazonProducts graph from the GraphSAINT paper

Added support for two more datasets in the SNAPDataset benchmark suite (thanks to @SherylHYX)

Issues

Fixed an issue in which SuperGATConv used all positive edges for computing the auxiliary loss (thanks to @anniekmyatt)

Fixed a bug in which MemPooling produced NaN gradients (thanks to @wsad1)

Fixed an issue in which the schnetpack package was required for training SchNet (thanks to @mshuaibii)

Modified XConv to sample without replacement in case dilation > 1 (thanks to @mayur-ag)

GraphSAINTSampler can now be used in combination with PyTorch Lightning

Fixed a bug in HypergraphConv in case num_nodes > num_edges (thanks to @THinnerichs)

1.7.0(Apr 9, 2021)
Major Features

Temporal Graph Network and an example utilizing graph attention (thanks to @emalgorithm)

CorrectAndSmooth and an example on ogbn-products

PyTorch Lightning support, see here for the accompanying examples (thanks to @tchaton)

Sequential API, see here for the accompanying example

FiLMConv and an example on PPI (thanks to @ldv1)

SuperGAT and an example on Cora (thanks to @dongkwan-kim)

MemPooling (thanks to @wsad1)

PANConv (thanks to @YuGuangWang)

DiffGroupNorm (thanks to @wsad1)

ResGatedGraphConv (thanks to @ldv1)

FAConv (thanks to @wsad1)

AttentiveFP model for molecular graph learning and an example on ESOL (thanks to @thegodone)

Shadow k-hop Sampler (currently requires torch-sparse from master)

Additional Features

Inductive Deep Graph Infomax example (thanks to @harrygcoppock)

WLConv and an example of the Weisfeiler-Lehman subtree kernel (thanks to @chrsmrrs)

LabelPropagation

AddTrainValTestMask transform for creating various splitting strategies (thanks to @dongkwan-kim)

homophily measurement (thanks to @ldv1)

to_cugraph conversion

Minor Changes

More memory-efficient implementation of GCN2Conv

Improved TransformerConv with the beta argument being input and message dependent (thanks to @ldv1)

NeighborSampler now works with SparseTensor and supports an additional transform argument

Batch.from_data_list now supports batching along a new dimension via returning None in Data.__cat_dim__, see here for the accompanying tutorial (thanks to @Linux-cpp-lisp)

MetaLayer is now "jittable"

Lazy loading of torch_geometric.nn and torch_geometric.datasets, leading to faster imports (thanks to @Linux-cpp-lisp)

GNNExplainer now supports various output formats of the underlying GNN model (thanks to @wsad1)
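
The Batch.from_data_list entry above describes batching along a new dimension when Data.__cat_dim__ returns None. The difference from ordinary concatenation can be sketched in plain Python. The helper merge and the toy values are hypothetical, and nested lists stand in for tensors.

```python
def merge(values, cat_dim):
    """Combine per-graph values (nested lists) into one batched value."""
    if cat_dim is None:
        # New leading dimension: one entry per graph, nothing is flattened.
        return [value for value in values]
    if cat_dim == 0:
        # Ordinary concatenation along the first dimension.
        return [item for value in values for item in value]
    raise NotImplementedError("this sketch only handles cat_dim in {None, 0}")


# One target vector of length 2 per graph, for three graphs.
graph_level_targets = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

stacked = merge(graph_level_targets, cat_dim=None)  # shape [3, 2]
flat = merge(graph_level_targets, cat_dim=0)        # shape [6]
```

Stacking keeps one row per graph, which is usually what graph-level attributes need, whereas concatenation would merge the three vectors into a single one.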

Datasets

JODIE datasets for temporal graph learning

WordNet18RR (thanks to @minhtriet)

Reddit2

MixHopSyntheticDataset (thanks to @ldv1)

NELL

Bugfixes

Fixed SparseAdam usage in examples/metapath2vec.py (thanks to @declanmillar)

Fixed from_networkx to support empty edge lists (thanks to @shakedbr)

Fixed a numerical issue in softmax

Fixed an issue in DenseGraphConv with aggr="max" (thanks to @quqixun)

Fixed the norm computation in GraphSAINTSampler (thanks to @austintwang)

Cartesian and LocalCartesian now compute Cartesian coordinates from target to source nodes (thanks to @ldv1)

1.6.3(Dec 2, 2020)
Fixed a crucial bug in which using pre_transform with an InMemoryDataset led to an error

New datasets: WikipediaNetwork and Actor

Added homophily ratio utility function: torch_geometric.utils.homophily_ratio
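
As a rough illustration of what a homophily ratio measures, the sketch below computes the fraction of edges whose two endpoints share a label. It assumes this common edge-homophily definition; the function name edge_homophily_ratio and the toy graph are hypothetical, and torch_geometric.utils.homophily_ratio works on tensors and may differ in details.

```python
def edge_homophily_ratio(edge_index, labels):
    """Fraction of edges that connect two nodes with the same label."""
    sources, targets = edge_index
    if not sources:
        raise ValueError("the graph has no edges")
    same = sum(1 for s, t in zip(sources, targets) if labels[s] == labels[t])
    return same / len(sources)


# Four directed edges: 0->1, 1->0, 1->2, 2->3
edge_index = [[0, 1, 1, 2], [1, 0, 2, 3]]
labels = [0, 0, 1, 1]
ratio = edge_homophily_ratio(edge_index, labels)
```

Three of the four edges connect nodes with equal labels, so the ratio for this toy graph is 0.75.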

1.6.2(Nov 27, 2020)
Features

GCN2Conv [Cora example, PPI example]

TransformerConv

New Dataset: WebKB

New Google Colab: Explaining GNN Model Predictions using Captum (thanks to @m30m)

Distributed training examples for node classification and graph classification (thanks to @maqy1995)

Node2Vec can now handle p and q values other than 1 (torch-cluster update required)

GraphSAGE unsupervised training example (thanks to @yuanx749)

Linear GAE example (thanks to @GuillaumeSalha)

Minor improvements

The SIGN example now operates on mini-batches of nodes

Improved data loading runtime of InMemoryDatasets

NeighborSampler now works with SparseTensor as input

Added the ToUndirected transform to convert directed graphs to undirected ones

GNNExplainer now allows for customizable edge and node feature loss reduction

aggr can now be passed to any GNN based on the MessagePassing interface (thanks to @m30m)

Runtime improvements in SEAL (thanks to @muhanzhang)

Runtime improvements in torch_geometric.utils.softmax (thanks to @Book1996)

GAE.recon_loss now supports custom negative edge indices (thanks to @reshinthadithyan)

Faster spmm computation and random_walk sampling on CPU (torch-sparse and torch-cluster updates required)

DataParallel now supports the follow_batch argument

Parallel approximate PPR computation in the GDC transform (thanks to @klicperajo)

Improved documentation by providing an autosummary of all subpackages (thanks to @m30m)

Improved documentation on how edge weights are handled in various GNNs (thanks to @m30m)

Bugfixes

Fixed a bug in GATConv when computing attention coefficients in bipartite graphs

Fixed a bug in GraphSAINTSampler that led to wrong edge feature sampling

Fixed the DimeNet pretraining link

Fixed a bug in processing ego-twitter and ego-gplus of the SNAPDataset collection

Fixed a number of broken dataset URLs (ICEWS18, QM9, QM7b, MoleculeNet, Entities, PPI, Reddit, MNISTSuperpixels, ShapeNet)

Fixed a bug in which MessagePassing.jittable() tried to write to a file without permission (thanks to @twoertwein)

GCNConv does not require edge_weight in case normalize=False

Batch.num_graphs will now report the correct number of graphs in case of zero-sized graphs

1.6.1(Aug 5, 2020)
This is a minor release, mostly focusing on PyTorch 1.6.0 support. All external wheels are now also available for PyTorch 1.6.0.

New Features

WikiCS dataset

DeepGCN via GENConv and DeepGCNLayer (thanks to @lightaime)

PairNorm (thanks to @gupta-abhay)

LayerNorm (thanks to @aluo-x)

Bugfixes

Fixed a bug which prevented GNNExplainer from working with GATConv

Fixed the MessagePassing.jittable call when installing PyG via pip

Fixed a bug in torch-sparse where reduce functions with dim=0 did not yield the correct result

Fixed a bug in torch-sparse which suppressed all warnings

1.6.0(Jul 7, 2020)
A new major release, introducing TorchScript support, memory-efficient aggregations, bipartite GNN modules, static graphs and much more!

Major Features

TorchScript support, see here for the accompanying tutorial (thanks to @lgray and @liaopeiyuan)

Memory-efficient aggregations via torch_sparse.SparseTensor, see here for the accompanying tutorial

Most GNN modules can now operate on bipartite graphs (and some of them can also operate on different feature dimensionalities for source and target nodes), useful for neighbor sampling or heterogeneous graphs:

    conv = SAGEConv(in_channels=(32, 64), out_channels=64)
    out = conv((x_src, x_dst), edge_index)

Static graph support:

    conv = GCNConv(in_channels=32, out_channels=64)
    x = torch.randn(batch_size, num_nodes, in_channels)
    out = conv(x, edge_index)
    print(out.size())
    # torch.Size([batch_size, num_nodes, out_channels])

Additional Features

PNAConv (thanks to @lukecavabarrett and @gcorso)

Pre-Trained DimeNet on QM9

SEAL link prediction example (thanks to @muhanzhang)

ClusterGCNConv

Cluster-GCN PPI example (thanks to @CFF-Dream)

WeightedEdgeSampler for GraphSAINT (thanks to @KiddoZhu)

Better num_workers support for GraphSAINT

The automatic addition of self-loops can now be disabled via the add_self_loops argument, e.g., for GCNConv

Breaking Changes

Memory-efficient RGCNConv: The old RGCNConv implementation has been moved to FastRGCNConv

Complementary Frameworks

DeepSNAP: A PyTorch library that bridges between graph libraries such as NetworkX and PyTorch Geometric

PyTorch Geometric Temporal: A temporal GNN library built upon PyTorch Geometric

Datasets

GNNBenchmarkDataset suite from the Benchmarking Graph Neural Networks paper

WordNet18

Bugfixes

Fixed a bug in the VGAE KL-loss computation (thanks to @GuillaumeSalha)

1.5.0(May 25, 2020)
This release is a big one thanks to many wonderful contributors. You guys are awesome!

Breaking Changes and Highlights

NeighborSampler got completely revamped: it's now much faster, allows for parallel sampling, and makes it easy to apply skip-connections or self-loops. See examples/reddit.py or the newly introduced OGB examples (examples/ogbn_products_sage.py and examples/ogbn_products_gat.py). The latter also sets a new SOTA on the OGB leaderboards (reaching 0.7945 ± 0.0059 test accuracy)

SAGEConv now uses concat=True by default, and there is no option to disable it anymore

Node2Vec got enhanced by a parallel sampling mechanism, and as a result, its API slightly changed

MetaPath2Vec: The first model in PyG that is able to operate on heterogeneous graphs

GNNExplainer: Generating explanations for graph neural networks

GraphSAINT: A graph sampling based inductive learning method

SchNet model for learning on molecular graphs, comes with pre-trained weights for each target of the QM9 dataset (thanks to @Nyuten)
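The idea behind neighbor sampling, as mentioned for NeighborSampler in the list above, can be sketched in pure Python. This is only a conceptual illustration under assumed names (sample_neighbors, fanouts, adj are hypothetical); it is not the NeighborSampler implementation or its API.

```python
import random


def sample_neighbors(adj, seeds, fanouts, rng):
    """Sample up to fanouts[i] in-neighbors per node at hop i, starting from the seeds.

    Returns one list of sampled (src, dst) edges per hop, innermost hop first.
    """
    layers = []
    frontier = list(seeds)
    for fanout in fanouts:
        edges = []
        # Target nodes stay in the next frontier, so their own features
        # remain available at every hop.
        next_frontier = set(frontier)
        for dst in frontier:
            neighbors = adj.get(dst, [])
            if len(neighbors) <= fanout:
                chosen = neighbors
            else:
                chosen = rng.sample(neighbors, fanout)
            for src in chosen:
                edges.append((src, dst))
                next_frontier.add(src)
        layers.append(edges)
        frontier = sorted(next_frontier)
    return layers


# adj maps each node to the list of nodes that send messages to it.
adj = {0: [1, 2, 3], 1: [0, 4], 2: [0], 3: [0, 4], 4: [1, 3]}
layers = sample_neighbors(adj, seeds=[0], fanouts=[2, 2], rng=random.Random(0))
```

Each entry of layers bounds the work of one message passing layer, which is what keeps mini-batch training on large graphs tractable.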

Additional Features

ASAPooling: Adaptive structure aware pooling for learning hierarchical graph representations (thanks to @ekagra-ranjan)

ARGVA node clustering example, see examples/argva_node_clustering.py (thanks to @gsoosk)

MFConv: Molecular fingerprint graph convolution operator (thanks to @rhsimplex)

GIN-E-Conv that extends the GINConv to also account for edge features

DimeNet: Directional message passing for molecular graphs

SIGN: Scalable inception graph neural networks

GravNetConv (thanks to @jkiesele)

Datasets

Yelp

Flickr

AMiner (first real heterogeneous graph)

Minor changes

GATConv can now return attention weights via the return_attention_weights argument (thanks to @douglasrizzo)

InMemoryDataset now has a copy method that converts sliced datasets back into a contiguous memory layout

Planetoid got enhanced by the ability to let users choose between different splitting methods (thanks to @dongkwan-kim)

k_hop_subgraph: Computes the k-hop subgraph around a subset of nodes

geodesic_distance: Geodesic distances can now be computed in parallel (thanks to @jannessm)

tree_decomposition: The tree decomposition algorithm for generating junction trees from molecules

SortPool benchmark script now uses 1-D convolutions after pooling, leading to better performance (thanks to @muhanzhang)
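As a rough illustration of what a k-hop subgraph is (see the k_hop_subgraph entry in the list above), the following pure-Python sketch collects every node that can reach a set of target nodes within a given number of hops. The helpers k_hop_nodes and induced_edges are hypothetical and do not reproduce the signature or return values of the PyG utility.

```python
def k_hop_nodes(edges, targets, num_hops):
    """Collect all nodes that can reach a target node within num_hops directed hops."""
    # Map each node to the sources of its incoming edges.
    incoming = {}
    for src, dst in edges:
        incoming.setdefault(dst, set()).add(src)

    reached = set(targets)
    frontier = set(targets)
    for _ in range(num_hops):
        # Step one hop backwards along the edges, skipping nodes already seen.
        frontier = {src for dst in frontier for src in incoming.get(dst, ())} - reached
        reached |= frontier
    return reached


def induced_edges(edges, nodes):
    """Keep only the edges whose two endpoints are both inside `nodes`."""
    return [(s, d) for s, d in edges if s in nodes and d in nodes]


# A directed path 0 -> 1 -> 2 -> 3 -> 4.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
nodes = k_hop_nodes(edges, targets=[3], num_hops=2)
sub_edges = induced_edges(edges, nodes)
```

Here nodes 1 and 2 are within two hops of node 3, so the subgraph contains nodes 1, 2 and 3 and the two edges between them.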

Bugfixes

Fixed a bug in write_off

Fixed a bug in the processing of the GEDDataset dataset

to_networkx conversion can now also properly handle non-tensor attributes

Fixed a bug in read_obj (thanks to @mwussow)

1.4.3(Mar 17, 2020)
Features

Cluster-GCN via ClusterData and ClusterLoader for operating on large-scale graphs, see examples/cluster_gcn.py for a usage example

Added a tutorial about advanced mini-batching scenarios

Added a tensorboard logging example

Datasets

CitationFull: The full citation network dataset suite

SNAPDataset: A subset of graph datasets from the SNAP dataset collection

SuiteSparseMatrixCollection

TrackMLParticleTrackingDataset

Minor Changes

Added the concat argument to SAGEConv

Outsourced the train_test_split_edges method of the graph autoencoder GAE class to torch_geometric.utils
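For readers unfamiliar with edge splitting for link prediction, the sketch below shows the basic idea in pure Python. It is an assumption-laden illustration, not the train_test_split_edges utility: the function name split_edges, its parameters, and the ratio values are hypothetical, and negative edge sampling is left out entirely.

```python
import random


def split_edges(edges, val_ratio=0.05, test_ratio=0.1, seed=0):
    """Shuffle undirected edges and split them into train/val/test lists."""
    # Treat (u, v) and (v, u) as the same edge so no pair leaks across splits.
    unique = sorted({(min(u, v), max(u, v)) for u, v in edges if u != v})
    random.Random(seed).shuffle(unique)

    n_val = int(len(unique) * val_ratio)
    n_test = int(len(unique) * test_ratio)
    val = unique[:n_val]
    test = unique[n_val:n_val + n_test]
    train = unique[n_val + n_test:]
    return train, val, test


# A ring of 20 nodes, stored with both edge directions.
ring = [(i, (i + 1) % 20) for i in range(20)]
edges = ring + [(v, u) for u, v in ring]
train, val, test = split_edges(edges)
```

Deduplicating the two directions before splitting matters: otherwise an edge could land in the training set while its reverse lands in the test set.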

Bugfixes

Fixed SplineConv compatibility with latest torch-spline-conv package

trimesh conversion utilities no longer result in a permutation of the input data

1.4.2(Feb 18, 2020)
Minor Changes

There are now Python wheels available for torch-scatter and torch-sparse which should make the installation procedure much more user-friendly. Simply run

pip install torch-scatter==latest+${CUDA} torch-sparse==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-1.4.0.html
pip install torch-geometric

where ${CUDA} should be replaced by either cpu, cu92, cu100 or cu101 depending on your PyTorch installation.

torch-cluster is now an optional dependency. All methods that rely on torch-cluster will result in an error requesting you to install torch-cluster.

torch_geometric.data.Dataset can now also be indexed and shuffled:

dataset.shuffle()[:50]

Bugfixes

Fixed a bug that prevented the user from saving MessagePassing modules.

Fixed a bug in RGCNConv when using root_weight=False.

1.4.1(Feb 4, 2020)
This release mainly focuses on torch-scatter=2.0 support. As a result, PyTorch Geometric now requires PyTorch 1.4. If you are in the process of updating to PyTorch 1.4, please ensure that you also re-install all related external packages.

Features

Graph Diffusion Convolution

MinCUT Pooling

CGCNNConv

TUDataset cleaned versions, containing only non-isomorphic graphs

GridSampling transform

ShapeNet dataset now comes with normals and better split options

TriMesh conversion utilities

ToSLIC transform for superpixel generation from images

Re-writing of MessagePassing interface with custom aggregate methods (no API changes)

Bugfixes

Fixed some failure modes of from_networkx.

1.3.2(Oct 4, 2019)
This release focuses on PyTorch 1.2 support and removes all torch.bool deprecation warnings. As a result, this release now requires PyTorch 1.2. If you are in the process of updating to PyTorch 1.2, please ensure that you also re-install all related external packages.

Overall, this release brings the following new features/bugfixes:

Features

Prints out a warning in case the pre_transform and pre_filter arguments differ from an already processed version

Bugfixes

Removed all torch.bool deprecation warnings

Fixed ARGA initialization bug

Fixed a pre-processing bug in QM9

1.3.1(Aug 29, 2019)
This is a minor release which is mostly distributed for official PyTorch 1.2 support. In addition, it provides minor bugfixes and the following new features:

Modules

Non-normalized ChebConv in combination with a largest eigenvalue transform

TAGCN

Graph U-Net

Node2Vec

EdgePooling

Alternative GMMConv formulation with separate kernels

Alternative Top-K pooling formulation based on thresholds with examples on synthetic COLORS and TRIANGLES datasets

Datasets

Pascal VOC 2011 with Berkeley keypoint annotations (PascalVOCKeypoints)

DBP15K dataset

WILLOWObjectClass dataset

Please also update related external packages via, e.g.:

$ pip install --upgrade torch-cluster
1.3.0(Jun 29, 2019)
Support for giant graph handling using NeighborSampler and bipartite message passing operators

Debugging support using the new debug API

Fixed TUDataset download errors

Added FeasStConv module

Improved networkx conversion functionality

Improved Data and DataLoader handling with customizable number_of_nodes (e.g. for holding two graphs in a single Data object)

Added GeniePath example

Added SAGPool module

Added geodesic distance computation using gdist (optional)

Improved PointNet and DGCNN classification and segmentation examples

Added subgraph functionality

Fixed GMMConv

Added a bunch of new datasets

Added fast implementations for random graph generation

Improved loop API

Minor bugfixes

Thanks to all contributors!
1.2.1(May 22, 2019)
More convenient self-loop API (including addition of edge weights)

Small bugfixes, e.g., DiffPool NaNs and the treatment of empty edge indices

New datasets have been added:

GEDDataset

DynamicFAUST

TOSCA

SHREC2016

1.2.0(Apr 29, 2019)
New models and operators, e.g., RENet, Signed Graph Convolution, Deep Graph Infomax, PPFNet, ...

Minor bugfixes

New converts

1.1.2(Apr 5, 2019)
Bugfixes for the bipartite message passing API

1.1.1(Apr 2, 2019)

PointConv bugfix for bipartite graphs.
1.1.0(Apr 1, 2019)
This release includes:

All Variants of Graph Autoencoders

Gated Graph Conv

DataParallel bugfixes

New transforms (Line Graph Transformation, Local Degree Profile, Sample Points with Normals)

PointNet++ example

1.0.3(Mar 7, 2019)
SGC and APPNP layer

1.0.2(Jan 25, 2019)
Added remove_faces parameter for face transforms

1.0.1(Jan 15, 2019)
Finally completed documentation

Finally achieved 100% code coverage (every single line is tested)

Fixed a few minor bugs

Added the GlobalAttention layer from Li et al.

1.0.0(Dec 18, 2018)

We made a bunch of improvements to PyTorch Geometric and added various new convolution and pooling operators, e.g., top_k pooling, PointCNN, Iterative Farthest Point Sampling, PointNet++, ...
