PyG (PyTorch Geometric) - A library built upon PyTorch to easily write and train Graph Neural Networks (GNNs)

Overview


Documentation | Paper | Colab Notebooks and Video Tutorials | External Resources | OGB Examples

PyG (PyTorch Geometric) is a library built upon PyTorch to easily write and train Graph Neural Networks (GNNs) for a wide range of applications related to structured data.

It consists of various methods for deep learning on graphs and other irregular structures, also known as geometric deep learning, from a variety of published papers. In addition, it provides easy-to-use mini-batch loaders for operating on many small and single giant graphs, multi-GPU support, distributed graph learning via Quiver, a large number of common benchmark datasets (based on simple interfaces to create your own), the GraphGym experiment manager, and helpful transforms, both for learning on arbitrary graphs as well as on 3D meshes or point clouds. Click here to join our Slack community!


Library Highlights

Whether you are a machine learning researcher or first-time user of machine learning toolkits, here are some reasons to try out PyG for machine learning on graph-structured data.

  • Easy-to-use and unified API: All it takes is 10-20 lines of code to get started with training a GNN model (see the next section for a quick tour). PyG is PyTorch-on-the-rocks: It utilizes a tensor-centric API and keeps design principles close to vanilla PyTorch. If you are already familiar with PyTorch, utilizing PyG is straightforward.
  • Comprehensive and well-maintained GNN models: Most of the state-of-the-art Graph Neural Network architectures have been implemented by library developers or authors of research papers and are ready to be applied.
  • Great flexibility: Existing PyG models can easily be extended for conducting your own research with GNNs. Making modifications to existing models or creating new architectures is simple, thanks to its easy-to-use message passing API, and a variety of operators and utility functions.
  • Large-scale real-world GNN models: We focus on the need of GNN applications in challenging real-world scenarios, and support learning on diverse types of graphs, including but not limited to: scalable GNNs for graphs with millions of nodes; dynamic GNNs for node predictions over time; heterogeneous GNNs with multiple node types and edge types.
  • GraphGym integration: GraphGym lets users easily reproduce GNN experiments, is able to launch and analyze thousands of different GNN configurations, and is customizable by registering new modules to a GNN learning pipeline.

Quick Tour for New Users

In this quick tour, we highlight the ease of creating and training a GNN model with only a few lines of code.

Train your own GNN model

In the first glimpse of PyG, we implement the training of a GNN for classifying papers in a citation graph. For this, we load the Cora dataset, and create a simple 2-layer GCN model using the pre-defined GCNConv:

import torch
from torch import Tensor
from torch_geometric.nn import GCNConv
from torch_geometric.datasets import Planetoid

dataset = Planetoid(root='.', name='Cora')

class GCN(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels):
        super().__init__()
        self.conv1 = GCNConv(in_channels, hidden_channels)
        self.conv2 = GCNConv(hidden_channels, out_channels)

    def forward(self, x: Tensor, edge_index: Tensor) -> Tensor:
        # x: Node feature matrix of shape [num_nodes, in_channels]
        # edge_index: Graph connectivity matrix of shape [2, num_edges]
        x = self.conv1(x, edge_index).relu()
        x = self.conv2(x, edge_index)
        return x

model = GCN(dataset.num_features, 16, dataset.num_classes)
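
To make the shapes concrete, here is a minimal dense sketch of the propagation rule a single GCN layer applies: self-loops plus symmetric degree normalization. It is plain PyTorch for illustration only. `dense_gcn_layer` is a hypothetical helper, not part of PyG; GCNConv itself works on sparse connectivity and has an additional bias term. The sketch assumes `edge_index` contains no self-loops or duplicate edges.

```python
import torch
from torch import Tensor


def dense_gcn_layer(x: Tensor, edge_index: Tensor, weight: Tensor) -> Tensor:
    """Hypothetical dense reference: D^-1/2 (A + I) D^-1/2 X W."""
    num_nodes = x.size(0)
    src, dst = edge_index
    adj = torch.zeros(num_nodes, num_nodes, dtype=x.dtype)
    adj[dst, src] = 1.0                               # row = target node, column = source node
    adj = adj + torch.eye(num_nodes, dtype=x.dtype)   # add self-loops
    deg_inv_sqrt = adj.sum(dim=1).pow(-0.5)           # degrees are >= 1 thanks to self-loops
    adj_norm = deg_inv_sqrt.unsqueeze(1) * adj * deg_inv_sqrt.unsqueeze(0)
    return adj_norm @ x @ weight


# Two nodes joined by one undirected edge (stored as two directed edges).
edge_index = torch.tensor([[0, 1], [1, 0]])
x = torch.tensor([[1.0], [3.0]])
weight = torch.ones(1, 1)
out = dense_gcn_layer(x, edge_index, weight)  # both nodes receive the average of 1 and 3
```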

We can now optimize the model in a training loop, similar to the standard PyTorch training procedure.

import torch.nn.functional as F

data = dataset[0]
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for epoch in range(200):
    pred = model(data.x, data.edge_index)
    loss = F.cross_entropy(pred[data.train_mask], data.y[data.train_mask])

    # Backpropagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

More information about evaluating final model performance can be found in the corresponding example.
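
As a rough sketch of what such an evaluation could look like, the helper below computes classification accuracy over a boolean node mask. `masked_accuracy` is a hypothetical name introduced here, and the toy tensors stand in for the model output, `data.y` and a mask such as `data.test_mask`.

```python
import torch
from torch import Tensor


def masked_accuracy(logits: Tensor, y: Tensor, mask: Tensor) -> float:
    """Fraction of correctly classified nodes among those selected by `mask`."""
    pred = logits.argmax(dim=-1)
    correct = (pred[mask] == y[mask]).sum().item()
    return correct / int(mask.sum())


# Toy stand-ins for `model(data.x, data.edge_index)`, `data.y` and `data.test_mask`.
logits = torch.tensor([[2.0, 0.1], [0.2, 1.5], [0.9, 0.3], [0.1, 0.4]])
y = torch.tensor([0, 1, 1, 1])
test_mask = torch.tensor([False, True, True, True])
acc = masked_accuracy(logits, y, test_mask)
```

With the trained model, one would typically call `model.eval()` and wrap the forward pass in `torch.no_grad()` before computing the accuracy.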

Create your own GNN layer

In addition to the easy application of existing GNNs, PyG makes it simple to implement custom Graph Neural Networks (see here for the accompanying tutorial). For example, this is all it takes to implement the edge convolutional layer from Wang et al.:

import torch
from torch import Tensor
from torch.nn import Sequential, Linear, ReLU
from torch_geometric.nn import MessagePassing

class EdgeConv(MessagePassing):
    def __init__(self, in_channels, out_channels):
        super().__init__(aggr="max")  # "Max" aggregation.
        self.mlp = Sequential(
            Linear(2 * in_channels, out_channels),
            ReLU(),
            Linear(out_channels, out_channels),
        )

    def forward(self, x: Tensor, edge_index: Tensor) -> Tensor:
        # x: Node feature matrix of shape [num_nodes, in_channels]
        # edge_index: Graph connectivity matrix of shape [2, num_edges]
        return self.propagate(edge_index, x=x)  # shape [num_nodes, out_channels]

    def message(self, x_j: Tensor, x_i: Tensor) -> Tensor:
        # x_j: Source node features of shape [num_edges, in_channels]
        # x_i: Target node features of shape [num_edges, in_channels]
        edge_features = torch.cat([x_i, x_j - x_i], dim=-1)
        return self.mlp(edge_features)  # shape [num_edges, out_channels]
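
To illustrate what `propagate` does with `aggr="max"`, the following loop-based reference computes the same kind of result in plain PyTorch. It is an explanatory sketch, not how PyG implements aggregation. `edge_conv_reference` is a hypothetical helper, and it assumes the default convention that `edge_index[0]` holds source nodes and `edge_index[1]` holds target nodes.

```python
import torch
from torch import Tensor


def edge_conv_reference(x: Tensor, edge_index: Tensor, mlp: torch.nn.Module) -> Tensor:
    """Loop-based reference: max over incoming messages mlp([x_i, x_j - x_i])."""
    src, dst = edge_index                     # edge j -> i: src holds j, dst holds i
    x_j, x_i = x[src], x[dst]
    messages = mlp(torch.cat([x_i, x_j - x_i], dim=-1))   # [num_edges, out_channels]
    out = torch.zeros(x.size(0), messages.size(-1), dtype=messages.dtype)
    for i in range(x.size(0)):
        incoming = messages[dst == i]
        if incoming.numel() > 0:              # in this sketch, nodes without incoming edges stay zero
            out[i] = incoming.max(dim=0).values
    return out


x = torch.tensor([[0.0], [1.0], [3.0]])
edge_index = torch.tensor([[0, 1, 2], [2, 2, 0]])   # edges 0->2, 1->2, 2->0
# An identity "MLP" keeps the raw edge features, which makes the result easy to check by hand.
out = edge_conv_reference(x, edge_index, torch.nn.Identity())
```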

Manage experiments with GraphGym

GraphGym allows you to manage and launch GNN experiments, using a highly modularized pipeline (see here for the accompanying tutorial).

git clone https://github.com/pyg-team/pytorch_geometric.git
cd pytorch_geometric/graphgym
bash run_single.sh  # run a single GNN experiment (node/edge/graph-level)
bash run_batch.sh   # run a batch of GNN experiments, using different GNN designs/datasets/tasks

Users are highly encouraged to check out the documentation, which contains additional tutorials on the essential functionalities of PyG, including data handling, creation of datasets and a full list of implemented methods, transforms, and datasets. For a quick start, check out our examples in examples/.

Architecture Overview

PyG provides a multi-layer framework that enables users to build Graph Neural Network solutions on both low and high levels. It comprises the following components:

  • The PyG engine utilizes the powerful PyTorch deep learning framework, as well as additions of efficient CUDA libraries for operating on sparse data, e.g., torch-scatter, torch-sparse and torch-cluster.
  • The PyG storage handles data processing, transformation and loading pipelines. It is capable of handling and processing large-scale graph datasets, and provides effective solutions for heterogeneous graphs. It further provides a variety of sampling solutions, which enable training of GNNs on large-scale graphs.
  • The PyG operators bundle essential functionalities for implementing Graph Neural Networks. PyG supports important GNN building blocks that can be combined and applied to various parts of a GNN model, ensuring rich flexibility of GNN design.
  • Finally, PyG provides an abundant set of GNN models, and examples that showcase GNN models on standard graph benchmarks. Thanks to its flexibility, users can easily build and modify custom GNN models to fit their specific needs.

Implemented GNN Models

We list currently supported PyG models, layers and operators according to category:

GNN layers: All Graph Neural Network layers are implemented via the nn.MessagePassing interface. A GNN layer specifies how to perform message passing, i.e. by designing different message, aggregation and update functions as defined here. These GNN layers can be stacked together to create Graph Neural Network models.
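
In the notation commonly used for message passing networks (the symbols are chosen here for illustration), one layer updates the feature vector of node $i$ as follows:

```latex
\mathbf{x}_i' = \gamma\left( \mathbf{x}_i,\ \bigoplus_{j \in \mathcal{N}(i)} \phi\left( \mathbf{x}_i, \mathbf{x}_j, \mathbf{e}_{j,i} \right) \right)
```

Here $\phi$ is the message function, $\bigoplus$ is a permutation-invariant aggregation such as sum, mean or max, and $\gamma$ is the update function. The EdgeConv layer shown earlier fits this form with $\phi$ an MLP applied to $[\mathbf{x}_i, \mathbf{x}_j - \mathbf{x}_i]$, max aggregation, and an identity update.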

Pooling layers: Graph pooling layers combine the vectorial representations of a set of nodes in a graph (or a subgraph) into a single vector representation that summarizes the properties of those nodes. They are commonly applied to graph-level tasks, which require combining node features into a single graph representation.
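
As an illustration of the simplest case, the sketch below computes a per-graph mean and max readout in plain PyTorch from a `batch` vector that assigns each node to a graph. `global_mean_max` is a hypothetical helper written for this example, and it assumes every graph contains at least one node. PyG ships its own readouts such as `global_mean_pool` and `global_max_pool`.

```python
import torch
from torch import Tensor


def global_mean_max(x: Tensor, batch: Tensor, num_graphs: int):
    """Per-graph mean and max of node features; `batch[n]` is the graph id of node n."""
    sums = torch.zeros(num_graphs, x.size(1), dtype=x.dtype).index_add_(0, batch, x)
    counts = torch.bincount(batch, minlength=num_graphs).clamp(min=1).unsqueeze(1)
    mean = sums / counts
    maxes = torch.stack([x[batch == g].max(dim=0).values for g in range(num_graphs)])
    return mean, maxes


# Two graphs in one mini-batch: nodes 0-1 belong to graph 0, nodes 2-4 to graph 1.
x = torch.tensor([[1.0, 2.0], [3.0, 4.0], [0.0, 6.0], [3.0, 0.0], [6.0, 3.0]])
batch = torch.tensor([0, 0, 1, 1, 1])
mean, maxes = global_mean_max(x, batch, num_graphs=2)
```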

GNN models: Our supported GNN models incorporate multiple message passing layers, and users can directly use these pre-defined models to make predictions on graphs. Unlike simple stacking of GNN layers, these models could involve pre-processing, additional learnable parameters, skip connections, graph coarsening, etc.

GNN operators and utilities: PyG comes with a rich set of neural network operators that are commonly used in many GNN models. They follow an extensible design: It is easy to apply these operators and graph utilities to existing GNN layers and models to further enhance model performance.

Scalable GNNs: PyG supports the implementation of Graph Neural Networks that can scale to large graphs. This is challenging because the entire graph, its associated features and the GNN parameters may not fit into GPU memory. Many state-of-the-art scalability approaches tackle this challenge by sampling neighborhoods for mini-batch training, by graph clustering and partitioning, or by using simplified GNN models. These approaches have been implemented in PyG, and can benefit from the above GNN layers, operators and models.
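
The idea behind neighborhood sampling can be sketched in a few lines of standard-library Python: for each seed node, keep only a bounded number of its in-neighbors. `sample_neighbors` is a hypothetical one-hop helper for illustration. PyG's own loaders additionally handle multiple hops and build the sampled subgraph.

```python
import random
from typing import Dict, List, Tuple


def sample_neighbors(
    edges: List[Tuple[int, int]], seeds: List[int], num_neighbors: int, seed: int = 0
) -> Dict[int, List[int]]:
    """For each seed node, keep at most `num_neighbors` of its in-neighbors."""
    rng = random.Random(seed)
    incoming: Dict[int, List[int]] = {}
    for src, dst in edges:
        incoming.setdefault(dst, []).append(src)
    sampled: Dict[int, List[int]] = {}
    for node in seeds:
        candidates = incoming.get(node, [])
        k = min(num_neighbors, len(candidates))
        sampled[node] = rng.sample(candidates, k)   # sampling without replacement
    return sampled


# Node 0 has four in-neighbors, node 1 has two.
edges = [(1, 0), (2, 0), (3, 0), (4, 0), (0, 1), (2, 1)]
sampled = sample_neighbors(edges, seeds=[0, 1], num_neighbors=2)
```

Because each node contributes at most `num_neighbors` messages, the cost of one mini-batch is bounded regardless of how large the full graph is.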

Installation

Anaconda

Update: You can now install PyG via Anaconda for all major OS/PyTorch/CUDA combinations 🤗 Given that you have PyTorch >= 1.8.0 installed, simply run

conda install pyg -c pyg -c conda-forge

Pip Wheels

We alternatively provide pip wheels for all major OS/PyTorch/CUDA combinations, see here.

PyTorch 1.10.0

To install the binaries for PyTorch 1.10.0, simply run

pip install torch-scatter -f https://data.pyg.org/whl/torch-1.10.0+${CUDA}.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-1.10.0+${CUDA}.html
pip install torch-geometric

where ${CUDA} should be replaced by either cpu, cu102, or cu113 depending on your PyTorch installation (torch.version.cuda).
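
If it helps, the mapping from `torch.version.cuda` to the `${CUDA}` placeholder can be written down as a small helper. This is only a sketch; `cuda_tag` and `wheel_index_url` are hypothetical names introduced here, and whether a wheel exists for a given combination still has to be checked against the lists above.

```python
from typing import Optional


def cuda_tag(cuda_version: Optional[str]) -> str:
    """Map a `torch.version.cuda` value such as "11.3" (or None) to a wheel tag."""
    if cuda_version is None:          # CPU-only PyTorch builds report None
        return "cpu"
    major, minor = cuda_version.split(".")[:2]
    return f"cu{major}{minor}"


def wheel_index_url(torch_version: str, cuda_version: Optional[str]) -> str:
    """Build the `-f` URL used in the pip commands above."""
    return f"https://data.pyg.org/whl/torch-{torch_version}+{cuda_tag(cuda_version)}.html"


url = wheel_index_url("1.10.0", "11.3")
```

In an environment with PyTorch installed, `cuda_tag(torch.version.cuda)` gives the value to substitute for `${CUDA}`.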

cpu cu102 cu113
Linux
Windows
macOS

For additional but optional functionality, run

pip install torch-cluster -f https://data.pyg.org/whl/torch-1.10.0+${CUDA}.html
pip install torch-spline-conv -f https://data.pyg.org/whl/torch-1.10.0+${CUDA}.html

PyTorch 1.9.0/1.9.1

To install the binaries for PyTorch 1.9.0 and 1.9.1, simply run

pip install torch-scatter -f https://data.pyg.org/whl/torch-1.9.0+${CUDA}.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-1.9.0+${CUDA}.html
pip install torch-geometric

where ${CUDA} should be replaced by either cpu, cu102, or cu111 depending on your PyTorch installation (torch.version.cuda).

cpu cu102 cu111
Linux
Windows
macOS

For additional but optional functionality, run

pip install torch-cluster -f https://data.pyg.org/whl/torch-1.9.0+${CUDA}.html
pip install torch-spline-conv -f https://data.pyg.org/whl/torch-1.9.0+${CUDA}.html

Note: Binaries of older versions are also provided for PyTorch 1.4.0, PyTorch 1.5.0, PyTorch 1.6.0, PyTorch 1.7.0/1.7.1 and PyTorch 1.8.0/1.8.1 (following the same procedure).

From master

In case you want to experiment with the latest PyG features which are not fully released yet, ensure that torch-scatter and torch-sparse are installed by following the steps mentioned above, and install PyG from master via:

pip install git+https://github.com/pyg-team/pytorch_geometric.git

Cite

Please cite our paper (and the respective papers of the methods used) if you use this code in your own work:

@inproceedings{Fey/Lenssen/2019,
  title={Fast Graph Representation Learning with {PyTorch Geometric}},
  author={Fey, Matthias and Lenssen, Jan E.},
  booktitle={ICLR Workshop on Representation Learning on Graphs and Manifolds},
  year={2019},
}

Feel free to email us if you wish your work to be listed in the external resources. If you notice anything unexpected, please open an issue and let us know. If you have any questions or are missing a specific feature, feel free to discuss them with us. We are motivated to constantly make PyG even better.

Comments
  • Issue reproducing the results of the original ecc implementation. Pooling layer and conv layer are giving different results of the original implementation

    As I mentioned in #319, I am having problems reproducing the ecc implementation using pytorch_geometric. I found some differences between the results obtained. The first is that the two convolution operations produce different results when using the same weights. Moreover, the results of the pooling layers also differ.

    I created a test that checks these things. Basically, the script loads the same weights into both implementations. These weights are obtained by training a network using the ecc implementation. Below you can see the output of my test.

    ECC Weights and PyGeometric weights are equal: True #I am only doing a re-check in order to be sure that both weights are equal before to load to the models.
    Loading weights 
    Starting validation:
    ecc features conv1:  (997, 16) #Shape of the output of first conv in ecc implementation
    pygeometric features conv1:  (997, 16) #Shape of the output of first conv in pygeometric implementation
    Max difference between features of first conv 2.549824
    Output of ecc pooling:  (398, 32)
    Output of PyGeometric pooling:  (385, 32)
    Pygeomtric Acc:  41.51982378854625  Ecc accuracy:  63.65638766519823
    Pygeomtric Loss:  2.435516586519023  Ecc Loss:  0.9878960176974138
    

    As you can observe, this difference has an impact on the accuracy when using the same weights. You can find the source code here. One important thing: the data used for these tests is obtained from the original ecc code.

    opened by dhorka 93
  • spspmm cuda bugfix

    ❓ Questions & Help

    I used the two-layer GCN example from the introduction and changed the data to my own, which has an input feature matrix of shape (100020, 6), 2 labels and 3074376 edges. The GCN example works well and reaches a high accuracy on the processed Cora dataset you provide, but when I change the input to my own data, every node is assigned to a single class: acc: 0.55698860228, f1_score: tensor([0.0000, 0.7155]), recall: tensor([0., 1.]), precision: tensor([0.0000, 0.5570]). I have been working on this problem for 2 days with the same result. Could you help me?

    opened by zhangcaifu 78
  • enzymes_topk_pool model is not learning

    ❓ Questions & Help

    Hi, I am using the enzymes_topk_pool (ETP) algorithm for medical image classification. I have created features from images and converted them into the data format accepted by the PyTorch Geometric data loader. But when I feed these features to the ETP algorithm, the model is not able to learn anything. Training and test loss do not change from the first epoch until the end; everything remains constant. More info: it is a binary classification problem. Below I am attaching a small script so that you get an idea.

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # 41 = number of features
        self.conv1 = GraphConv(dataset.num_node_features, 64)
        self.pool1 = TopKPooling(64, ratio=0.8)
        self.conv2 = GraphConv(64, 64)
        self.pool2 = TopKPooling(64, ratio=0.8)
        self.conv3 = GraphConv(64, 64)
        self.pool3 = TopKPooling(64, ratio=0.8)

        self.lin1 = torch.nn.Linear(128, 128)
        self.lin2 = torch.nn.Linear(128, 64)
        self.lin3 = torch.nn.Linear(64, 1)
        self.bn1 = torch.nn.BatchNorm1d(128)
        self.bn2 = torch.nn.BatchNorm1d(64)
        #self.act1 = torch.nn.ReLU()
        #self.act2 = torch.nn.ReLU()  
    
    def forward(self, data):
    
        x, edge_index, batch = data.x, data.edge_index, data.batch
        #edge_index, _ = remove_self_loops(edge_index)
        #edge_index, _ = add_self_loops(edge_index, num_nodes=x.size(0))
    
        x = F.relu(self.conv1(x, edge_index))
        x, edge_index, _, batch, _= self.pool1(x, edge_index, None, batch)
        x1 = torch.cat([gmp(x, batch), gap(x, batch)], dim=1)
    
        x = F.relu(self.conv2(x, edge_index))
        x, edge_index, _, batch, _ = self.pool2(x, edge_index, None, batch)
        x2 = torch.cat([gmp(x, batch), gap(x, batch)], dim=1)
    
        x = F.relu(self.conv3(x, edge_index))
        x, edge_index, _, batch, _ = self.pool3(x, edge_index, None, batch)
        x3 = torch.cat([gmp(x, batch), gap(x, batch)], dim=1)
    
        x = x1 + x2 + x3
    
        x = F.relu(self.lin1(x))
        x = F.relu(self.lin2(x))
        #x = F.dropout(x, p=0.5, training=self.training)
        #x = torch.sigmoid(self.lin3(x)).squeeze(1)
        x = torch.sigmoid(self.lin3(x)).squeeze(1)
        #print('x', x.shape)
        #x = F.log_softmax(self.lin3(x), dim=-1)
        return x
    

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = Net().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, verbose=True)

crit = torch.nn.BCELoss()
import pdb

def train(epoch):
    model.train()

    loss_all = 0
    for data in train_loader:
        data = data.to(device)
        optimizer.zero_grad()
        output = model(data)
        #print('o/ps',output)
    
        #print(output)
        #print('len',output.shape)
        label = data.y.to(device).cuda()
        label = torch.tensor(label, dtype=torch.float).to(device)
    
        #print('lbls',label)
       # label = torch.tensor(label, dtype=torch.float)
        #print('lbl', label.shape)
        loss = crit(output, label)
        #print('loss',loss)
        #loss = crit(output, data.y)
        loss.backward(retain_graph=True)
        loss_all += data.num_graphs * loss.item()
        optimizer.step()
    scheduler.step(loss_all)
    return loss_all / len(train_data_list)
    

from sklearn.metrics import roc_auc_score

def evaluate(loader):
    model.eval()

    predictions = []
    labels = []
    
    with torch.no_grad():
        for data in loader:
    
            data = data.to(device)
            pred = model(data).detach().cpu().numpy()
            
            #print('pred ', pred)
    
            label = data.y.detach().cpu().numpy()
    
            #print('label ',label)
            predictions.append(pred)
            labels.append(label)
    
    predictions = np.hstack(predictions)
    #predictions = torch.cat(predictions)
    #predictions = torch.tensor(predictions)
    labels = np.hstack(labels)
    #labels = torch.tensor(labels)
    #labels = torch.cat(labels)
    
    return roc_auc_score(labels, predictions)
    

for epoch in range(1, 201):
    loss = train(epoch)
    train_auc = evaluate(train_loader)
    test_auc = evaluate(test_loader)
    #train_acc = test(train_loader)
    #test_acc = test(test_loader)
    print('Epoch: {:03d}, Loss: {:.5f}, Train Auc: {:.5f}, Test AUC: {:.5f}'.
          format(epoch, loss, train_auc, test_auc))

    Note: For feature extraction from images I have used your master's thesis code. I have only used the Form_feature_extration file and the adjacency.py file, but not the feature_selection and coarsening files. Are they also needed to create features? Currently, I have 41 features for every node in the image.

    Thanks in advance!

    opened by sachinsharma9780 55
  • Neighborhood Sampling

    Hi Matthias,

    I wrote my own dataset and dataloader, and I used an adjacency matrix instead of edge_index. When I tried to convert adj_matrix to edge_index, I got confused because I have multiple graphs (multiple samples, which may have different numbers of nodes) in one batch. I went over some of the examples and found that most of them use a batch size of 1. How should I prepare the edge_index in the mini-batch setting? I can easily use DenseSAGEConv, but I want to try other networks.

    Thanks, Ming
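
For the mini-batch case asked about above, one possible approach, sketched in plain PyTorch, is to treat the batch as one large disconnected graph: each adjacency matrix is converted to an edge_index, and node ids are shifted by the number of nodes in the preceding graphs. `batch_dense_adjacencies` is a hypothetical helper written for this illustration. PyG's DataLoader performs this kind of offsetting automatically when given a list of Data objects.

```python
import torch
from torch import Tensor
from typing import List, Tuple


def batch_dense_adjacencies(adjs: List[Tensor]) -> Tuple[Tensor, Tensor]:
    """Turn per-graph dense adjacency matrices into one edge_index and batch vector."""
    edge_indices, batch, offset = [], [], 0
    for graph_id, adj in enumerate(adjs):
        # A nonzero entry adj[r, c] becomes the directed edge r -> c.
        edge_index = adj.nonzero().t()                # shape [2, num_edges]
        edge_indices.append(edge_index + offset)      # shift node ids into the big graph
        batch.append(torch.full((adj.size(0),), graph_id, dtype=torch.long))
        offset += adj.size(0)
    return torch.cat(edge_indices, dim=1), torch.cat(batch)


adj_a = torch.tensor([[0, 1], [1, 0]])                    # graph with 2 nodes
adj_b = torch.tensor([[0, 1, 0], [0, 0, 1], [0, 0, 0]])   # graph with 3 nodes
edge_index, batch = batch_dense_adjacencies([adj_a, adj_b])
```

The `batch` vector records which graph each node came from, which is what graph-level pooling layers need later on.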

    feature 
    opened by tbright17 46
  • Please help me with OSError: libcusparse.so.10: cannot open shared object file: No such file or directory

    ❓ Questions & Help

    this is the traceback

    Traceback (most recent call last):
      File "/home/yrwang/.local/lib/python3.6/site-packages/torch_sparse/__init__.py", line 15, in <module>
        library, [osp.dirname(__file__)]).origin)
      File "/home/yrwang/.local/lib/python3.6/site-packages/torch/_ops.py", line 106, in load_library
        ctypes.CDLL(path)
      File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
        self._handle = _dlopen(self._name, mode)
    OSError: libcusparse.so.10: cannot open shared object file: No such file or directory

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "", line 1, in <module>
      File "/home/yrwang/.local/lib/python3.6/site-packages/torch_geometric/__init__.py", line 2, in <module>
        import torch_geometric.nn
      File "/home/yrwang/.local/lib/python3.6/site-packages/torch_geometric/nn/__init__.py", line 2, in <module>
        from .data_parallel import DataParallel
      File "/home/yrwang/.local/lib/python3.6/site-packages/torch_geometric/nn/data_parallel.py", line 5, in <module>
        from torch_geometric.data import Batch
      File "/home/yrwang/.local/lib/python3.6/site-packages/torch_geometric/data/__init__.py", line 1, in <module>
        from .data import Data
      File "/home/yrwang/.local/lib/python3.6/site-packages/torch_geometric/data/data.py", line 7, in <module>
        from torch_sparse import coalesce
      File "/home/yrwang/.local/lib/python3.6/site-packages/torch_sparse/__init__.py", line 23, in <module>
        raise OSError(e)
    OSError: libcusparse.so.10: cannot open shared object file: No such file or directory

    My CUDA and cuDNN are installed correctly:

    nvcc -V
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2019 NVIDIA Corporation
    Built on Sun_Jul_28_19:07:16_PDT_2019
    Cuda compilation tools, release 10.1, V10.1.243

    My torch version:

    >>> print(torch.__version__)
    1.4.0

    I used the following commands to install torch-geometric:

    pip3 install torch-scatter==2.0.4+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html
    pip3 install torch-sparse==0.6.1+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html
    pip3 install torch-cluster==1.5.4+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html
    pip3 install torch-spline-conv==1.2.0+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html
    pip3 install torch-geometric

    but the problem still occurs. Thanks for helping me.

    opened by yrwangxd 39
  • Not found error for torch_sparse::ptr2ind in torchscript

    ❓ Questions & Help

    I tried to use a PyTorch model with a MessagePassing layer in C++ code. As described in the pytorch_geometric documentation, I generated a torch model with my own MP layer and successfully converted the model.

    But when executing the C++ code, I get the error below:

    Unknown builtin op: torch_sparse::ptr2ind.
    Could not find any similar ops to torch_sparse::ptr2ind. This op may not exist or may not be currently supported in TorchScript.
    :
      File "/home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch_sparse/storage.py", line 166
            rowptr = self._rowptr
            if rowptr is not None:
                row = torch.ops.torch_sparse.ptr2ind(rowptr, self._col.numel())
                      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
                self._row = row
                return row
    Serialized   File "code/__torch__/torch_sparse/storage.py", line 825
          if torch.__isnot__(rowptr, None):
            rowptr13 = unchecked_cast(Tensor, rowptr)
            row15 = ops.torch_sparse.ptr2ind(rowptr13, torch.numel(self._col))
                    ~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            self._row = row15
            _150, _151 = True, row15
    'SparseStorage.row' is being compiled since it was called from 'SparseStorage.__init__'
      File "/home/sr6/kyuhyun9.lee/env_ML/lib/python3.6/site-packages/torch_sparse/storage.py", line 133
            if not is_sorted:
                idx = self._col.new_zeros(self._col.numel() + 1)
                idx[1:] = self._sparse_sizes[1] * self.row() + self._col
                                                  ~~~~~~~~ <--- HERE
                if (idx[1:] < idx[:-1]).any():
                    perm = idx[1:].argsort()
    Serialized   File "code/__torch__/torch_sparse/storage.py", line 267
          idx = torch.new_zeros(self._col, [_29], dtype=None, layout=None, device=None, pin_memory=None)
          _30 = (self._sparse_sizes)[1]
          _31 = torch.add(torch.mul((self).row(), _30), self._col, alpha=1)
                                     ~~~~~~~~~~ <--- HERE
          _32 = torch.slice(idx, 0, 1, 9223372036854775807, 1)
          _33 = torch.copy_(_32, _31, False)
    'SparseStorage.__init__' is being compiled since it was called from 'GINLayerJittable_d54f76.__check_input____1'
    Serialized   File "code/__torch__/GINLayerJittable_d54f76.py", line 40
          pass
        return the_size
      def __check_input____1(self: __torch__.GINLayerJittable_d54f76.GINLayerJittable_d54f76,
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~...  <--- HERE
        edge_index: __torch__.torch_sparse.tensor.SparseTensor,
        size: Optional[Tuple[int, int]]) -> List[Optional[int]]:
    
    Aborted (core dumped)
    
    

    Since I have no experience with the PyTorch JIT, I cannot find any clue to solve this. How can I handle this error?

    bug feature help wanted 
    opened by Nanco-L 32
  • Segmentation Fault in Forward Loop of Edge Conv

    Hey there!

    I'm testing out edge conv and am running into some issues. I'm getting a segmentation fault during the knn_graph generation:

    line 13: 14361 Segmentation fault      (core dumped) CUDA_VISIBLE_DEVICES="$gpuNum" python ...
    

    Here's output using pysnooper, I traced the error to this location:

    Starting var:.. batch = None
    Starting var:.. pos = tensor([[-3.1472e-01, -7.1309e-01, -1.5181e-01,  1.3493e+00,  1.0879e+00,          4.9691e-01, -1.54...
    Starting var:.. self = Net(  (conv1): EdgeConv(nn=Sequential(    (0): Linear(in_features=50, out_features=64, bias=True)   ...
    17:39:03.629809 call        41  def forward(self, pos, batch):
    17:39:03.680347 line        42          edge_index = knn_graph(pos, k=20, batch=batch)
    ~
    

    The last line is where the error happens.

    For more context: I'm generating my point cloud using a CNN; batch and positions are shown above. I cannot seem to make it through the generation of the edge index.

    Can you please help me out here?

    opened by jlevy44 31
  • `RandLA-Net` example

    The paper: RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds

    Context

    A good PyTorch implementation of RandLA-Net that leverages PyTorch Geometric standards and modules is lacking. In torch-points3d, the current modules are outdated, leading to some confusion among users.

    The implementation with the most stars on github is aRI0U/RandLA-Net-pytorch, which has nasty dependencies (torch_points or torch_points_kernels), makes slow back-and-forth between cpu and gpu when calling knns, and only accepts fixed size point clouds.

    Proposal

    I would like to implement RandLA-Net as part of pyg's examples. For now I would tackle the ModelNet classification task, and would follow the structure of other examples (pointnet2_classification in particular).

    The RandLA-Net paper focuses on segmentation, but for classification I would simply add an MLP + global max pooling after the first DilatedResidualBlocks.

    The RandLA-Net architecture is conceptually close to PointNet++, augmented with different tricks to speed things up (random sampling instead of FPS), use more context (with a sort of dilated KNN), and encode local information better (by explicitly calculating positions, distances, and Euclidean distance between points in a neighborhood, and by using self-attention on these features).
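
To make the local encoding step more concrete, here is a plain-PyTorch sketch of the per-neighbor features described above: for every point and each of its k nearest neighbors, it stacks both positions, their offset and their Euclidean distance. `relative_position_features` is a hypothetical helper and not the code from the example file. It assumes distinct points, and excluding the point itself from its neighborhood is a choice made here.

```python
import torch
from torch import Tensor


def relative_position_features(pos: Tensor, k: int) -> Tensor:
    """For each point, stack [p_i, p_j, p_i - p_j, ||p_i - p_j||] over its k nearest neighbors."""
    dist = torch.cdist(pos, pos)                          # [N, N] pairwise Euclidean distances
    knn_dist, knn_idx = dist.topk(k + 1, largest=False)   # +1: each point is its own nearest neighbor
    knn_dist, knn_idx = knn_dist[:, 1:], knn_idx[:, 1:]   # drop the point itself
    p_i = pos.unsqueeze(1).expand(-1, k, -1)              # [N, k, 3]
    p_j = pos[knn_idx]                                    # [N, k, 3]
    return torch.cat([p_i, p_j, p_i - p_j, knn_dist.unsqueeze(-1)], dim=-1)


torch.manual_seed(0)
pos = torch.rand(8, 3)                                    # 8 random points in 3D
features = relative_position_features(pos, k=4)           # shape [8, 4, 10]
```

In the architecture, features of this kind would then be passed through a shared MLP before the attention-based pooling.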

    If I have some success, I will take on the segmentation task as well (which is what interests me anyway for my own project)

    Where I am at

    I have a working implementation at examples/randlanet_classification.py. I still have to review it to make sure that I am following the paper as closely as possible, but I think I am on the right track.

    I would love some guidance on how to move forward. In particular:

    • Am I using MessagePassing modules correctly?
    • What should I aim for in terms of accuracy on ModelNet?
    • Should I stick strictly to the paper, or adapt the architecture to ModelNet?

    Indeed, the hyperparameters were chosen by the author not for small objects but for large-scale Lidar data, which could make convergence take much longer than needed.

    With 4 DilatedResidualBlocks (like in the paper), we reach ~57% accuracy at epoch 200.

    With 3 DilatedResidualBlocks, we reach up to 75% accuracy at the 20th epoch

    With only 2 DilatedResidualBlocks, we reach 90% accuracy at the 81st epoch, getting closer to the leaderboard for the ModelNet10 challenge.

    feature 1 - Priority P1 example 
    opened by CharlesGaydon 30
  • Link-level `NeighborLoader`

    🚀 The feature, motivation and pitch

    Currently, NeighborLoader is designed to be applied in node-level tasks and there exists no option for mini-batching in link-level tasks.

    To achieve this, users currently rely on a simple but hacky workaround, first utilized in ogbl-citation2 in this example.

    The idea is straightforward and simple: For input_nodes, we pass in both the source and destination nodes for every link we want to do link prediction on (both positive and negative):

    loader = NeighborLoader(data, input_nodes=edge_label_index.view(-1), ...)
    

    Nonetheless, PyG should provide a dedicated class to perform mini-batching on link-level tasks, re-using functionality from NeighborLoader under the hood. An API could look like:

    class LinkLevelNeighborLoader(
        data,
        input_edges=...,
        input_edge_labels=...,
        with_negative_sampling=True,
        **kwargs,
    )
    

    NOTE: This workaround currently only works for homogeneous graphs!

    @RexYing @JiaxuanYou

    feature 0 - Priority P0 
    opened by rusty1s 30
  • Data Batch problem in PyG

    🐛 Describe the bug

    Hi. I am a computational physics researcher and have been using PyG successfully. My PyG code was working well a few weeks ago, but now when I run the same code without any changes, it no longer works.

    The problem is as follows. I have many material structures, and in my "custom_dataset" class these are preprocessed and all graph information (node features, edge features, edge index, etc.) is inserted into a "Data" object in PyTorch Geometric. You can see that each preprocessed sample with index $i$ is printed as a normal "Data" object in PyG:

    [Screenshot: Capture 2]

    But when I pass my custom dataset class to the PyG DataLoader and do the following,

    sample = next(iter(train_loader)) # batch sample
    

    the batch sample is reported as "DataDataBatch". I have not seen this object name before, and I can't use "sample.x" or "sample.edge_index". Instead I need to do the following:

    [Screenshot: Capture 3]

    I want to use expressions like "sample.x", "sample.edge_index" or "sample.edge_attr" as before. I would appreciate an explanation. Thank you.

    Environment

    • PyG version: 2.0.5
    • PyTorch version: 1.11.0+cu113
    • OS: GoogleColab Pro Plus
    • Python version: Python 3.7.13 in colab
    • CUDA/cuDNN version:
    • How you installed PyTorch and PyG (conda, pip, source):
    # Install required packages.
    import os
    import torch
    os.environ['TORCH'] = torch.__version__
    print(torch.__version__)
    !pip install -q torch-scatter -f https://data.pyg.org/whl/torch-${TORCH}.html
    !pip install -q torch-sparse -f https://data.pyg.org/whl/torch-${TORCH}.html
    !pip install -q git+https://github.com/pyg-team/pytorch_geometric.git
    !pip install -q pymatgen==2020.11.11  
    
    • Any other relevant information (e.g., version of torch-scatter):
    bug 
    opened by Amadeus-System 29
  • AttributeError: 'NoneType' object has no attribute 'origin'

    📚 Installation

    Traceback (most recent call last):
      File "/home/shelly/bourne/reimp_paper/MTAG-main/test/t1.py", line 24, in <module>
        import torch_sparse
      File "/home/shelly/anaconda3/envs/pyt/lib/python3.6/site-packages/torch_sparse/__init__.py", line 15, in <module>
        f'{library}_{suffix}', [osp.dirname(__file__)]).origin)
    AttributeError: 'NoneType' object has no attribute 'origin'

    Environment

    • OS: Ubuntu 20.04

    • Python version: 3.6.3

    • PyTorch version: 1.8.0

    • CUDA/cuDNN version: 11.1

    • GCC version:

    • How did you try to install PyTorch Geometric and its extensions (wheel, source): followed https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html (Installation via Binaries)

    • Any other relevant information:

    Checklist

    • [x] I followed the installation guide.
    • [x] I cannot find my error message in the FAQ.
    • [x] I set up CUDA correctly and can compile CUDA code via nvcc.
    • [ ] I do have multiple CUDA versions on my machine.

    Additional context

    opened by bourne-3 29
  • [Explainability Evaluation] - GNN model (Un)Faithfulness v2

    opened by ZeynepP 0
  • `HeteroDataBatch.subgraph()` doesn't preserve `num_graphs`

    🐛 Describe the bug

    After calling the subgraph method of HeteroData on a batched version, the object loses its num_graphs property:

    import torch
    from torch_geometric.data import Batch, HeteroData

    d = Batch.from_data_list([HeteroData({"agent": {"x": torch.randn(3, 3)}})])
    sub_d = d.subgraph({"agent": [0]})

    print("Does d have num_graphs:")
    print(hasattr(d, "num_graphs"))

    print("Does sub_d have num_graphs:")
    print(hasattr(sub_d, "num_graphs"))


    Output:

    Does d have num_graphs:
    True
    Does sub_d have num_graphs:
    False
    

    Environment

    • PyG version: 2.1.0
    • PyTorch version: 1.11.0
    • OS:
    • Python version: 3.8
    • CUDA/cuDNN version:
    • How you installed PyTorch and PyG (conda, pip, source): conda
    • Any other relevant information (e.g., version of torch-scatter):
    bug 
    opened by ekosman 1
  • RuntimeError: pseudo.size(1) == kernel_size.numel() INTERNAL ASSERT FAILED. Input mismatch

    🐛 Describe the bug

    I tried to train a SplineCNN as provided in the example named mnist_nn_conv.py. I got the following error:

    RuntimeError: The following operation failed in the TorchScript interpreter.
    Traceback of TorchScript (most recent call last):
      File "...conda\envs\dl23\lib\site-packages\torch_spline_conv\basis.py", line 10, in spline_basis
        is_open_spline: torch.Tensor, degree: int) -> Tuple[torch.Tensor, torch.Tensor]:
        return torch.ops.torch_spline_conv.spline_basis(pseudo, kernel_size,
               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
                                                        is_open_spline, degree)
    RuntimeError: pseudo.size(1) == kernel_size.numel() INTERNAL ASSERT FAILED at "D:\a\pytorch_spline_conv\pytorch_spline_conv\csrc\cuda\basis_cuda.cu":104, please report a bug to PyTorch. Input mismatch

    My Code

    import os.path as osp
    
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    
    import torch_geometric.transforms as T
    from torch_geometric.datasets import MNISTSuperpixels
    from torch_geometric.loader import DataLoader
    from torch_geometric.nn import (
        SplineConv,
        global_mean_pool,
        graclus,
        max_pool,
        max_pool_x,
    )
    from torch_geometric.utils import normalized_cut
    
    
    # Datasets
    path = osp.join(osp.dirname(osp.realpath("/")), '..', 'data', 'MNIST')
    transform = T.Cartesian(cat=False)
    train_dataset = MNISTSuperpixels(path, True, transform=transform)
    test_dataset = MNISTSuperpixels(path, False, transform=transform)
    train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
    test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)
    d = train_dataset

    # Normalized Cut
    def normalized_cut_2d(edge_index, pos):
        row, col = edge_index
        edge_attr = torch.norm(pos[row] - pos[col], p=2, dim=1)
        return normalized_cut(edge_index, edge_attr, num_nodes=pos.size(0))
    
    # SplineCNN
    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv1 = SplineConv(in_channels=d.num_features, out_channels=32, dim=1, kernel_size=3)
            self.conv2 = SplineConv(in_channels=32, out_channels=64, dim=1, kernel_size=3)
            self.fc1 = torch.nn.Linear(64, 128)
            self.fc2 = torch.nn.Linear(128, d.num_classes)
    
        def forward(self, data):
            data.x = F.elu(self.conv1(data.x, data.edge_index, data.edge_attr))
            weight = normalized_cut_2d(data.edge_index, data.pos)
            cluster = graclus(data.edge_index, weight, data.x.size(0))
            data.edge_attr = None
            data = max_pool(cluster, data, transform=transform)
    
            data.x = F.elu(self.conv2(data.x, data.edge_index, data.edge_attr))
            weight = normalized_cut_2d(data.edge_index, data.facepos)
            cluster = graclus(data.edge_index, weight, data.x.size(0))
            x, batch = max_pool_x(cluster, data.x, data.batch)
    
            x = global_mean_pool(x, batch)
            x = F.elu(self.fc1(x))
            x = F.dropout(x, training=self.training)
            return F.log_softmax(self.fc2(x), dim=1)
    
    # Create Model
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = Net().to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
    
    # Train Function
    def train(epoch):
        model.train()
    
        if epoch == 16:
            for param_group in optimizer.param_groups:
                param_group['lr'] = 0.001
    
        if epoch == 26:
            for param_group in optimizer.param_groups:
                param_group['lr'] = 0.0001
    
        for data in train_loader:
            data = data.to(device)
            optimizer.zero_grad()
            F.nll_loss(model(data), data.y).backward()
            optimizer.step()
    
    # Test Function
    def test():
        model.eval()
        correct = 0
    
        for data in test_loader:
            data = data.to(device)
            pred = model(data).max(1)[1]
            correct += pred.eq(data.y).sum().item()
        return correct / len(test_dataset)
    
    # Run epochs
    for epoch in range(1, 31):
        train(epoch)
        test_acc = test()
        print(f'Epoch: {epoch:02d}, Test: {test_acc:.4f}')
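
For context, the assertion in the error message compares the number of pseudo-coordinate columns with the number of entries in `kernel_size`. The sketch below restates that check in plain Python. `check_spline_inputs` is a hypothetical helper, not the `torch_spline_conv` implementation. It assumes that `T.Cartesian` applied to 2D node positions yields two pseudo-coordinates per edge, so a layer configured with `dim=1` would likely receive one more column than it expects.

```python
from typing import List, Sequence


def check_spline_inputs(pseudo: Sequence[Sequence[float]],
                        kernel_size: List[int]) -> bool:
    """Return True if every edge has one pseudo-coordinate per kernel dimension."""
    return all(len(row) == len(kernel_size) for row in pseudo)


# Two edges with 2D pseudo-coordinates (assumed output of Cartesian on 2D positions).
pseudo = [[0.25, 0.75], [0.5, 0.1]]

print(check_spline_inputs(pseudo, kernel_size=[3]))     # dim=1: False (mismatch)
print(check_spline_inputs(pseudo, kernel_size=[3, 3]))  # dim=2: True
```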
    

    Environment

    • PyG version: 2.1.0
    • PyTorch version: 1.13.0
    • OS: Windows
    • Python version: 3.10.8
    • CUDA/cuDNN version: 11.7
    • How you installed PyTorch and PyG (conda, pip, source): pip
    • Any other relevant information (e.g., version of torch-scatter): pip install torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.13.0+cu117.html
    bug 
    opened by Amirtmgr 1
  • Questions about the weight sharing method in GNN/PyG

    🚀 The feature, motivation and pitch

    Hi, I intend to implement weight sharing in PyG. I noticed a related post: https://github.com/pyg-team/pytorch_geometric/issues/1503. However, for other modules such as GATConv and TransformerConv, I could not find conv.weight or conv.bias. Could you please help me figure out why? Thanks.

    Alternatives

    No response

    Additional context

    No response

    feature 
    opened by HelloWorldLTY 7
  • Explainability for GNNs

    This PR implements the computation of layer-wise weights for each edge in order to produce explanations for node-level, edge-level, and graph-level tasks. The implementation differs from the authors' original one and is faster and more memory-efficient. Tests and examples are included to make the overall approach easier to understand.

    feature 0 - Priority P0 explain 
    opened by fork123aniket 1
Releases(2.2.0)
  • 2.2.0(Dec 1, 2022)

    We are excited to announce the release of PyG 2.2 🎉🎉🎉

    PyG 2.2 is the culmination of work from 78 contributors who have worked on features and bug-fixes for a total of over 320 commits since torch-geometric==2.1.0.

    Highlights

    pyg-lib Integration

    We are proud to release and integrate pyg-lib==0.1.0 into PyG, the first stable version of our new low-level Graph Neural Network library to drive all CPU and GPU acceleration needs of PyG (#5330, #5347, #5384, #5388).

    You can install pyg-lib as described in our README.md:

    pip install pyg-lib -f https://data.pyg.org/whl/torch-${TORCH}+${CUDA}.html
    
    import pyg_lib
    

    Once pyg-lib is installed, it will get automatically picked up by PyG, e.g., to accelerate neighborhood sampling routines or to accelerate heterogeneous GNN execution:

    • pyg-lib provides fast and optimized CPU routines to iteratively sample neighbors in homogeneous and heterogeneous graphs, and heavily improves upon the previously used neighborhood sampling techniques utilized in PyG.

    • pyg-lib provides efficient GPU-based routines to parallelize workloads in heterogeneous graphs across different node types and edge types. We achieve this by leveraging type-dependent transformations via NVIDIA CUTLASS integration, which is flexible enough to implement most heterogeneous GNNs and remains efficient even for sparse edge types or a large number of different node types.

    GraphStore and FeatureStore Abstractions

    PyG 2.2 includes numerous primitives to easily integrate with simple paradigms for scalable graph machine learning, enabling users to train GNNs on graphs far larger than the size of their machine's available memory. It does so by introducing simple, easy-to-use, and extensible abstractions of a FeatureStore and a GraphStore that plug directly into existing familiar PyG interfaces (see here for the accompanying tutorial).

    feature_store = CustomFeatureStore()
    feature_store['paper', 'x', None] = ...  # Add paper features
    feature_store['author', 'x', None] = ...  # Add author features
    
    graph_store = CustomGraphStore()
    graph_store['edge', 'coo'] = ...  # Add edges in "COO" format
    
    # `CustomGraphSampler` knows how to sample on `CustomGraphStore`:
    graph_sampler = CustomGraphSampler(
        graph_store=graph_store,
        num_neighbors=[10, 20],
        # ... (further sampler-specific arguments)
    )
    
    from torch_geometric.loader import NodeLoader
    loader = NodeLoader(
        data=(feature_store, graph_store),
        node_sampler=graph_sampler,
        batch_size=20,
        input_nodes='paper',
    )
    
    for batch in loader:
        pass
    

    Data loading and sampling routines are refactored and decomposed into torch_geometric.loader and torch_geometric.sampler modules, respectively (#5563, #5820, #5456, #5457, #5312, #5365, #5402, #5404, #5418).
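
To make the idea concrete, here is a toy, dictionary-backed sketch of what a feature store does: features are registered under a `(group, attribute)` key, and a loader fetches only the rows for the node indices in the current mini-batch. `ToyFeatureStore` and its `put`/`get` methods are hypothetical and deliberately do not follow the real `FeatureStore` interface. A real backend would read from disk or a remote service rather than a dict.

```python
from typing import Dict, List, Tuple


class ToyFeatureStore:
    """Hypothetical in-memory stand-in for a remote or on-disk feature backend."""

    def __init__(self) -> None:
        self._tables: Dict[Tuple[str, str], List[List[float]]] = {}

    def put(self, group: str, attr: str, rows: List[List[float]]) -> None:
        self._tables[(group, attr)] = rows

    def get(self, group: str, attr: str, index: List[int]) -> List[List[float]]:
        # Only the requested rows are materialized for the mini-batch.
        table = self._tables[(group, attr)]
        return [table[i] for i in index]


store = ToyFeatureStore()
store.put('paper', 'x', [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
store.put('author', 'x', [[9.0], [8.0]])

batch_x = store.get('paper', 'x', index=[2, 0])
print(batch_x)  # [[4.0, 5.0], [0.0, 1.0]]
```

Because the loader only ever asks for indexed rows, the full feature matrix never has to reside in memory at once.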

    Optimized and Fused Aggregations

    PyG 2.2 further accelerates scatter aggregations based on CPU/GPU and with/without backward computation paths (requires torch>=1.12.0 and torch-scatter>=2.1.0) (#5232, #5241, #5353, #5386, #5399, #6051, #6052).

    We also optimized the usage of nn.aggr.MultiAggregation by fusing the computation of multiple aggregations together (see here for more details) (#6036, #6040).

    Here are some benchmarking results on PyTorch 1.12 (summed over 1000 runs):

    | Aggregators             | Vanilla | Fusion  |
    |-------------------------|---------|---------|
    | [sum, mean]             | 0.3325s | 0.1996s |
    | [sum, mean, min, max]   | 0.7139s | 0.5037s |
    | [sum, mean, var]        | 0.6849s | 0.3871s |
    | [sum, mean, var, std]   | 1.0955s | 0.3973s |
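
The gain from fusion comes from sharing intermediate reductions. As a rough, library-free illustration (not PyG's implementation), `sum`, `mean`, `var`, and `std` for each group can all be derived from a single pass that accumulates the count, the sum, and the sum of squares. `fused_aggregate` is a hypothetical helper, and the use of the biased (population) variance is an assumption of this sketch.

```python
import math
from typing import Dict, List


def fused_aggregate(values: List[float], index: List[int],
                    num_groups: int) -> Dict[str, List[float]]:
    """Compute sum, mean, var and std per group from one pass over the data."""
    count = [0] * num_groups
    total = [0.0] * num_groups
    total_sq = [0.0] * num_groups
    for v, i in zip(values, index):  # single scatter-style pass
        count[i] += 1
        total[i] += v
        total_sq[i] += v * v

    mean = [t / c if c else 0.0 for t, c in zip(total, count)]
    # Biased variance E[x^2] - E[x]^2, clamped at zero against rounding error.
    var = [max(sq / c - m * m, 0.0) if c else 0.0
           for sq, c, m in zip(total_sq, count, mean)]
    std = [math.sqrt(v) for v in var]
    return {'sum': total, 'mean': mean, 'var': var, 'std': std}


out = fused_aggregate(values=[1.0, 3.0, 2.0, 2.0], index=[0, 0, 1, 1], num_groups=2)
print(out['sum'])   # [4.0, 4.0]
print(out['mean'])  # [2.0, 2.0]
print(out['var'])   # [1.0, 0.0]
```

Computing the four statistics separately would traverse the input four times. Here the extra aggregations cost only a few element-wise operations on the per-group accumulators.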

    Lastly, we have incorporated "fused" GNN operators via the dgNN package, starting with a FusedGATConv implementation (#5140).

    Community Sprint: Type Hints and TorchScript Support

    We are running regular community sprints to get our community more involved in building PyG. Whether you are just beginning to use graph learning or have been leveraging GNNs in research or production, the community sprints welcome members of all levels with different types of projects.

    We held our first community sprint on 10/12 to fully incorporate type hints and TorchScript support across the entire code base. The goal was to improve the usability and cleanliness of our codebase. 20 contributors participated, contributing 120 type hints within 2 weeks and adding around 2400 lines of code (#5842, #5603, #5659, #5664, #5665, #5666, #5667, #5668, #5669, #5673, #5675, #5673, #5678, #5682, #5683, #5684, #5685, #5687, #5688, #5695, #5699, #5701, #5702, #5703, #5706, #5707, #5710, #5714, #5715, #5716, #5722, #5724, #5725, #5726, #5729, #5730, #5731, #5732, #5733, #5743, #5734, #5735, #5736, #5737, #5738, #5747, #5752, #5753, #5754, #5756, #5757, #5758, #5760, #5766, #5767, #5768, #5781, #5778, #5797, #5798, #5799, #5800, #5806, #5810, #5811, #5828, #5847, #5851, #5852).

    Explainability

    Our second community sprint began on 11/15 with the goal to improve the explainability capabilities of PyG. With this, we introduce the torch_geometric.explain module to provide a unified set of tools to explain the predictions of a PyG model or to explain the underlying phenomenon of a dataset.

    Some of the features developed in the sprint are incorporated into this release:

    data = HeteroData(...)
    model = HeteroGNN(...)
    
    # Explain predictions on heterogeneous graphs for output node 10:
    captum_model = to_captum_model(model, mask_type, output_idx, metadata)
    inputs, additional_forward_args = to_captum_input(data.x_dict, data.edge_index_dict, mask_type)
    
    ig = IntegratedGradients(captum_model)
    ig_attr = ig.attribute(
        inputs=inputs,
        target=int(y[output_idx]),
        additional_forward_args=additional_forward_args,
        internal_batch_size=1,
    )
    

    Breaking Changes

    • Renamed drop_unconnected_nodes to drop_unconnected_node_types and drop_orig_edges to drop_orig_edge_types in AddMetapaths (#5490)

    Full Changelog

    Added
    • Extended GNNExplainer to support edge level explanations (#6056)
    • Added CPU affinitization for NodeLoader (#6005)
    • Added triplet sampling in LinkNeighborLoader (#6004)
    • Added FusedAggregation of simple scatter reductions (#6036)
    • Added a to_smiles function (#6038)
    • Added option to make normalization coefficients trainable in PNAConv (#6039)
    • Added semi_grad option in VarAggregation and StdAggregation (#6042)
    • Allow for fused aggregations in MultiAggregation (#6036, #6040)
    • Added HeteroData support for to_captum_model and added to_captum_input (#5934)
    • Added HeteroData support in RandomNodeLoader (#6007)
    • Added bipartite GraphSAGE example (#5834)
    • Added LRGBDataset to include 5 datasets from the Long Range Graph Benchmark (#5935)
    • Added a warning for invalid node and edge type names in HeteroData (#5990)
    • Added PyTorch 1.13 support (#5975)
    • Added int32 support in NeighborLoader (#5948)
    • Added dgNN support and FusedGATConv implementation (#5140)
    • Added lr_scheduler_solver and customized lr_scheduler classes (#5942)
    • Added to_fixed_size graph transformer (#5939)
    • Added support for symbolic tracing of SchNet model (#5938)
    • Added support for customizable interaction graph in SchNet model (#5919)
    • Started adding torch.sparse support to PyG (#5906, #5944, #6003)
    • Added HydroNet water cluster dataset (#5537, #5902, #5903)
    • Added explainability support for heterogeneous GNNs (#5886)
    • Added SparseTensor support to SuperGATConv (#5888)
    • Added TorchScript support for AttentiveFP (#5868)
    • Added num_steps argument to training and inference benchmarks (#5898)
    • Added torch.onnx.export support (#5877, #5997)
    • Enable VTune ITT in inference and training benchmarks (#5830, #5878)
    • Added training benchmark (#5774)
    • Added a "Link Prediction on MovieLens" Colab notebook (#5823)
    • Added custom sampler support in LightningDataModule (#5820)
    • Added a return_semantic_attention_weights argument to HANConv (#5787)
    • Added disjoint argument to NeighborLoader and LinkNeighborLoader (#5775)
    • Added support for input_time in NeighborLoader (#5763)
    • Added disjoint mode for temporal LinkNeighborLoader (#5717)
    • Added HeteroData support for transforms.Constant (#5700)
    • Added np.memmap support in NeighborLoader (#5696)
    • Added assortativity that computes degree assortativity coefficient (#5587)
    • Added SSGConv layer (#5599)
    • Added shuffle_node, mask_feature and add_random_edge augmentation methods (#5548)
    • Added dropout_path augmentation that drops edges from a graph based on random walks (#5531)
    • Add support for filling labels with dummy values in HeteroData.to_homogeneous() (#5540)
    • Added temporal_strategy option to neighbor_sample (#5576)
    • Added torch_geometric.sampler package to docs (#5563)
    • Added the DGraphFin dynamic graph dataset (#5504)
    • Added dropout_edge augmentation that randomly drops edges from a graph - the usage of dropout_adj is now deprecated (#5495)
    • Added dropout_node augmentation that randomly drops nodes from a graph (#5481)
    • Added AddRandomMetaPaths that adds edges based on random walks along a metapath (#5397)
    • Added WLConvContinuous for performing WL refinement with continuous attributes (#5316)
    • Added print_summary method for the torch_geometric.data.Dataset interface (#5438)
    • Added sampler support to LightningDataModule (#5456, #5457)
    • Added official splits to MalNetTiny dataset (#5078)
    • Added IndexToMask and MaskToIndex transforms (#5375, #5455)
    • Added FeaturePropagation transform (#5387)
    • Added PositionalEncoding (#5381)
    • Consolidated sampler routines behind torch_geometric.sampler, enabling ease of extensibility in the future (#5312, #5365, #5402, #5404, #5418)
    • Added pyg-lib neighbor sampling (#5384, #5388)
    • Added pyg_lib.segment_matmul integration within HeteroLinear (#5330, #5347)
    • Enabled bf16 support in benchmark scripts (#5293, #5341)
    • Added Aggregation.set_validate_args option to skip validation of dim_size (#5290)
    • Added SparseTensor support to inference and training benchmark suite (#5242, #5258, #5881)
    • Added experimental mode in inference benchmarks (#5254)
    • Added node classification example instrumented with Weights and Biases (W&B) logging and W&B Sweeps (#5192)
    • Added experimental mode for utils.scatter (#5232, #5241, #5386)
    • Added missing test labels in HGBDataset (#5233)
    • Added BaseStorage.get() functionality (#5240)
    • Added a test to confirm that to_hetero works with SparseTensor (#5222)
    • Added torch_geometric.explain module with base functionality for explainability methods (#5804, #6054, #6089)
    Changed
    • Moved and adapted GNNExplainer from torch_geometric.nn to torch_geometric.explain.algorithm (#5967, #6065)
    • Optimized scatter implementations for CPU/GPU, both with and without backward computation (#6051, #6052)
    • Support temperature value in dense_mincut_pool (#5908)
    • Fixed a bug in which VirtualNode mistakenly treated node features as edge features (#5819)
    • Fixed setter and getter handling in BaseStorage (#5815)
    • Fixed path in hetero_conv_dblp.py example (#5686)
    • Fix auto_select_device routine in GraphGym for PyTorch Lightning>=1.7 (#5677)
    • Support in_channels with tuple in GENConv for bipartite message passing (#5627, #5641)
    • Handle cases of not having enough possible negative edges in RandomLinkSplit (#5642)
    • Fix RGCN+pyg-lib for LongTensor input (#5610)
    • Improved type hint support (#5842, #5603, #5659, #5664, #5665, #5666, #5667, #5668, #5669, #5673, #5675, #5673, #5678, #5682, #5683, #5684, #5685, #5687, #5688, #5695, #5699, #5701, #5702, #5703, #5706, #5707, #5710, #5714, #5715, #5716, #5722, #5724, #5725, #5726, #5729, #5730, #5731, #5732, #5733, #5743, #5734, #5735, #5736, #5737, #5738, #5747, #5752, #5753, #5754, #5756, #5757, #5758, #5760, #5766, #5767, #5768, #5781, #5778, #5797, #5798, #5799, #5800, #5806, #5810, #5811, #5828, #5847, #5851, #5852)
    • Avoid modifying mode_kwargs in MultiAggregation (#5601)
    • Changed BatchNorm to allow for batches of size one during training (#5530, #5614)
    • Integrated better temporal sampling support by requiring that local neighborhoods are sorted according to time (#5516, #5602)
    • Fixed a bug when applying several scalers with PNAConv (#5514)
    • Allow . in ParameterDict key names (#5494)
    • Renamed drop_unconnected_nodes to drop_unconnected_node_types and drop_orig_edges to drop_orig_edge_types in AddMetapaths (#5490)
    • Improved utils.scatter performance by explicitly choosing better implementation for add and mean reduction (#5399)
    • Fix to_dense_adj with empty edge_index (#5476)
    • The AttentionalAggregation module can now be applied to compute attention on a per-feature level (#5449)
    • Ensure equal lengths of num_neighbors across edge types in NeighborLoader (#5444)
    • Fixed a bug in TUDataset in which node features were wrongly constructed whenever node_attributes only hold a single feature (e.g., in PROTEINS) (#5441)
    • Breaking change: removed num_neighbors as an attribute of loader (#5404)
    • ASAPooling is now jittable (#5395)
    • Updated unsupervised GraphSAGE example to leverage LinkNeighborLoader (#5317)
    • Replace in-place operations with out-of-place ones to align with torch.scatter_reduce API (#5353)
    • Breaking bugfix: PointTransformerConv now correctly uses sum aggregation (#5332)
    • Improve out-of-bounds error message in MessagePassing (#5339)
    • Allow file names of a Dataset to be specified as either a property or a method (#5338)
    • Fixed separating a list of SparseTensor within InMemoryDataset (#5299)
    • Improved name resolving of normalization layers (#5277)
    • Fail gracefully on GLIBC errors within torch-spline-conv (#5276)
    • Fixed Dataset.num_classes in case a transform modifies data.y (#5274)
    • Allow customization of the activation function within PNAConv (#5262)
    • Do not fill InMemoryDataset cache on dataset.num_features (#5264)
    • Changed tests relying on dblp datasets to instead use synthetic data (#5250)
    • Fixed a bug for the initialization of activation function examples in custom_graphgym (#5243)
    • Allow any integer tensors when checking edge_index input to message passing (#5281)
    Removed
    • Removed scatter_reduce option from experimental mode (#5399)

    Full commit list: https://github.com/pyg-team/pytorch_geometric/compare/2.1.0...2.2.0

  • 2.1.0(Aug 17, 2022)

    We are excited to announce the release of PyG 2.1.0 🎉🎉🎉

    PyG 2.1.0 is the culmination of work from over 60 contributors who have worked on features and bug-fixes for a total of over 320 commits since torch-geometric==2.0.4.

    Highlights

    Principled Aggregations

    See here for the accompanying tutorial.

    Aggregation functions play an important role in the message passing framework and the readout functions of Graph Neural Networks. Specifically, many works in the literature (Hamilton et al. (2017), Xu et al. (2018), Corso et al. (2020), Li et al. (2020), Tailor et al. (2021), Bartunov et al. (2022)) demonstrate that the choice of aggregation functions contributes significantly to the representational power and performance of the model.

    To facilitate further experimentation and unify the concepts of aggregation within GNNs across both MessagePassing and global readouts, we have made the concept of Aggregation a first-class principle in PyG (#4379, #4522, #4687, #4721, #4731, #4762, #4749, #4779, #4863, #4864, #4865, #4866, #4872, #4927, #4934, #4935, #4957, #4973, #4973, #4986, #4995, #5000, #5021, #5034, #5036, #5039, #4522, #5033, #5085, #5097, #5099, #5104, #5113, #5130, #5098, #5191). As of now, PyG provides support for various aggregations — from simple ones (e.g., mean, max, sum), to advanced ones (e.g., median, var, std), learnable ones (e.g., SoftmaxAggregation, PowerMeanAggregation), and exotic ones (e.g., LSTMAggregation, SortAggregation, EquilibriumAggregation). Furthermore, multiple aggregations can be combined and stacked together:

    from torch_geometric.nn import MessagePassing, SoftmaxAggregation
    
    class MyConv(MessagePassing):
        def __init__(self, ...):
            # Combines a set of aggregations and concatenates their results.
            # The interface also supports automatic resolution.
            super().__init__(aggr=['mean', 'std', SoftmaxAggregation(learn=True)])
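
As a plain-Python sketch of what combining aggregations means: every aggregation is applied to the messages arriving at a node, and the results are concatenated along the feature dimension. `multi_aggregate` is a hypothetical helper, and the aggregators chosen here (mean, population standard deviation, max) are only illustrative stand-ins for the ones in the snippet above.

```python
from statistics import mean, pstdev
from typing import Callable, Dict, List


def multi_aggregate(messages: Dict[int, List[float]],
                    aggrs: List[Callable[[List[float]], float]]) -> Dict[int, List[float]]:
    """Apply every aggregation to each node's messages and concatenate the results."""
    return {node: [aggr(msgs) for aggr in aggrs] for node, msgs in messages.items()}


# Scalar messages arriving at two nodes.
messages = {0: [1.0, 3.0], 1: [2.0, 2.0, 5.0]}
out = multi_aggregate(messages, aggrs=[mean, pstdev, max])
print(out[0])  # [2.0, 1.0, 3.0]
```

With three aggregators, the output feature dimension is three times that of a single aggregation, which downstream layers need to account for.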
    

    Link-level Neighbor Loader

    We added a new LinkNeighborLoader class for training scalable GNNs that perform edge-level predictions on giant graphs (#4396, #4439, #4441, #4446, #4508, #4509, #4868). LinkNeighborLoader comes with automatic support for both homogeneous and heterogeneous data, and supports link prediction via automatic negative sampling as well as edge-level classification and regression models:

    from torch_geometric.loader import LinkNeighborLoader
    
    loader = LinkNeighborLoader(
        data,
        num_neighbors=[30] * 2,  # Sample 30 neighbors for each node for 2 iterations
        batch_size=128,  # Use a batch size of 128 for sampling training links
        edge_label_index=data.edge_index,  # Use the entire graph for supervision
        negative_sampling_ratio=1.0,  # Sample negative edges
    )
    
    sampled_data = next(iter(loader))
    print(sampled_data)
    >>> Data(x=[1368, 1433], edge_index=[2, 3103], edge_label_index=[2, 256], edge_label=[256])
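
With `negative_sampling_ratio=1.0`, each batch of 128 positive links is paired with as many sampled negatives, which matches the 256 entries of `edge_label` in the output above. A minimal, library-free sketch of that idea follows. `sample_negative_links` is a hypothetical helper that uses simple rejection sampling and is not how PyG implements it. It also allows self-loops as negatives, which a real sampler might exclude.

```python
import random
from typing import List, Set, Tuple

Edge = Tuple[int, int]


def sample_negative_links(pos_links: List[Edge], existing: Set[Edge],
                          num_nodes: int, ratio: float,
                          rng: random.Random) -> Tuple[List[Edge], List[int]]:
    """Return positives plus sampled negatives, together with 1/0 labels."""
    num_neg = int(len(pos_links) * ratio)
    negatives: List[Edge] = []
    while len(negatives) < num_neg:
        candidate = (rng.randrange(num_nodes), rng.randrange(num_nodes))
        if candidate not in existing:  # reject pairs that are real edges
            negatives.append(candidate)
    links = pos_links + negatives
    labels = [1] * len(pos_links) + [0] * len(negatives)
    return links, labels


edges = {(0, 1), (1, 2), (2, 3), (3, 0)}
batch = [(0, 1), (2, 3)]
links, labels = sample_negative_links(batch, edges, num_nodes=4, ratio=1.0,
                                      rng=random.Random(0))
print(len(links), labels)  # 4 [1, 1, 0, 0]
```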
    

    Neighborhood Sampling based on Temporal Constraints

    Both NeighborLoader and LinkNeighborLoader now support temporal sampling via the time_attr argument (#4025, #4877, #4908, #5137, #5173). If set, temporal sampling will be used such that neighbors are guaranteed to fulfill temporal constraints, i.e. neighbors have an earlier timestamp than the center node:

    import torch
    from torch_geometric.loader import NeighborLoader
    
    data['paper'].time = torch.arange(data['paper'].num_nodes)
    
    loader = NeighborLoader(
        data,
        input_nodes='paper',
        time_attr='time',  # Only sample papers that appeared before the seed paper
        num_neighbors=[30] * 2,
        batch_size=128,
    )
    

    Note that this feature requires torch-sparse>=0.6.14.
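
The temporal constraint can be illustrated without any library: among a seed node's neighbors, only those with an earlier timestamp are eligible, and at most `k` of them are drawn. `sample_temporal_neighbors` is a hypothetical helper. The strict `<` comparison follows the wording above and is an assumption about the exact boundary behavior.

```python
import random
from typing import Dict, List


def sample_temporal_neighbors(seed: int, neighbors: Dict[int, List[int]],
                              time: List[int], k: int,
                              rng: random.Random) -> List[int]:
    """Sample up to k neighbors whose timestamp is earlier than the seed's."""
    valid = [n for n in neighbors.get(seed, []) if time[n] < time[seed]]
    if len(valid) <= k:
        return valid
    return rng.sample(valid, k)


# Node i has timestamp i, mirroring `torch.arange(num_nodes)` in the example above.
time = [0, 1, 2, 3, 4]
neighbors = {3: [0, 1, 2, 4], 0: [1, 2]}

print(sample_temporal_neighbors(3, neighbors, time, k=10, rng=random.Random(0)))  # [0, 1, 2]
print(sample_temporal_neighbors(0, neighbors, time, k=10, rng=random.Random(0)))  # []
```

Node 4 is never sampled for seed 3 because it appears later, so no information from the future leaks into the seed's neighborhood.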

    Functional DataPipes

    See here for the accompanying example.

    PyG now fully supports data loading using the newly introduced concept of DataPipes in PyTorch for easily constructing flexible and performant data pipelines (#4302, #4345, #4349). PyG provides DataPipe support for batching multiple PyG data objects together and for applying any PyG transform:

    from torchdata.datapipes.iter import FileLister, FileOpener

    datapipe = FileOpener(['SMILES_HIV.csv'])
    datapipe = datapipe.parse_csv_as_dict()
    datapipe = datapipe.parse_smiles(target_key='HIV_active')
    datapipe = datapipe.in_memory_cache()  # Cache graph instances in-memory.
    datapipe = datapipe.shuffle()
    datapipe = datapipe.batch_graphs(batch_size=32)
    
    datapipe = FileLister([root_dir], masks='*.off', recursive=True)
    datapipe = datapipe.read_mesh()
    datapipe = datapipe.in_memory_cache()  # Cache graph instances in-memory.
    datapipe = datapipe.sample_points(1024)  # Use PyG transforms from here.
    datapipe = datapipe.knn_graph(k=8)
    datapipe = datapipe.shuffle()
    datapipe = datapipe.batch_graphs(batch_size=32)
    

    Bugfixes

    • Fixed a bug in RGATConv that produced device mismatches for "f-scaled" mode (#5187)
    • Fixed a bug in GINEConv for non-Sequential neural network layers (#5154)
    • Fixed a bug in HGTLoader which produced outputs with missing edge types, will require torch-sparse>=0.6.15 (#5067)
    • Fixed a bug in load_state_dict for Linear with strict=False mode (#5094)
    • Fixed data.num_node_features computation for sparse matrices (#5089)
    • Fixed a bug in which GraphGym did not create new non-linearity functions but re-used existing ones (#4978)
    • Fixed BasicGNN for num_layers=1, which now respects a desired number of out_channels (#4943)
    • Fixed a bug in data.subgraph for 0-dim tensors (#4932)
    • Fixed a bug in InMemoryDataset inferring wrong length for lists of tensors (#4837)
    • Fixed a bug in TUDataset where pre_filter was not applied whenever pre_transform was present (#4842)
    • Fixed access of edge types in HeteroData via two node types when multiple relations exist between them (#4782)
    • Fixed a bug in HANConv in which destination node features rather than source node features were propagated (#4753)
    • Fixed a ranking protocol bug in the RGCN link prediction example (#4688)
    • Fixed the interplay between TUDataset and pre_transform transformations that modify node features (#4669)
    • The bias argument in TAGConv is now correctly applied (#4597)
    • Fixed filtering of attributes in samplers in case __cat_dim__ != 0 (#4629)
    • Fixed SparseTensor support in NeighborLoader (#4320)
    • Fixed average degree handling in PNAConv (#4312)
    • Fixed a bug in from_networkx in case some attributes are PyTorch tensors (#4486)
    • Fixed a missing clamp in the DimeNet model (#4506, #4562)
    • Fixed the download link in DBP15K (#4428)
    • Fixed an autograd bug in DimeNet when resetting parameters (#4424)
    • Fixed bipartite message passing in case flow="target_to_source" (#4418)
    • Fixed a bug in which num_nodes was not properly updated in the FixedPoints transform (#4394)
    • Fixed a bug in which GATConv was not jittable (#4347)
    • Fixed a bug in which nn.models.GAT did not produce out_channels many output channels (#4299)
    • Fixed a bug in mini-batching with empty lists as attributes (#4293)
    • Fixed a bug in which GCNConv could not be combined with to_hetero on heterogeneous graphs with one node type (#4279)

    Full Changelog

    Added
    • Added edge_label_time argument to LinkNeighborLoader (#5137, #5173)
    • Let ImbalancedSampler accept torch.Tensor as input (#5138)
    • Added flow argument to gcn_norm to correctly normalize the adjacency matrix in GCNConv (#5149)
    • NeighborSampler supports graphs without edges (#5072)
    • Added the MeanSubtractionNorm layer (#5068)
    • Added pyg_lib.segment_matmul integration within RGCNConv (#5052, #5096)
    • Support SparseTensor as edge label in LightGCN (#5046)
    • Added support for BasicGNN models within to_hetero (#5091)
    • Added support for computing weighted metapaths in AddMetapaths (#5049)
    • Added inference benchmark suite (#4915)
    • Added a dynamically sized batch sampler for filling a mini-batch with a variable number of samples up to a maximum size (#4972)
    • Added fine grained options for setting bias and dropout per layer in the MLP model (#4981)
    • Added EdgeCNN model (#4991)
    • Added scalable inference mode in BasicGNN with layer-wise neighbor loading (#4977)
    • Added inference benchmarks (#4892, #5107)
    • Added PyTorch 1.12 support (#4975)
    • Added unbatch_edge_index functionality for splitting an edge_index tensor according to a batch vector (#4903)
    • Added node-wise normalization mode in LayerNorm (#4944)
    • Added support for normalization_resolver (#4926, #4951, #4958, #4959)
    • Added notebook tutorial for torch_geometric.nn.aggr package to documentation (#4927)
    • Added support for follow_batch for lists or dictionaries of tensors (#4837)
    • Added Data.validate() and HeteroData.validate() functionality (#4885)
    • Added LinkNeighborLoader support to LightningDataModule (#4868)
    • Added predict() support to the LightningNodeData module (#4884)
    • Added time_attr argument to LinkNeighborLoader (#4877, #4908)
    • Added a filter_per_worker argument to data loaders to allow filtering of data within sub-processes (#4873)
    • Added a NeighborLoader benchmark script (#4815, #4862)
    • Added support for FeatureStore and GraphStore in NeighborLoader (#4817, #4851, #4854, #4856, #4857, #4882, #4883, #4929, #4992, #4962, #4968, #5037, #5088)
    • Added a normalize parameter to dense_diff_pool (#4847)
    • Added size=None explanation to jittable MessagePassing modules in the documentation (#4850)
    • Added documentation to the DataLoaderIterator class (#4838)
    • Added GraphStore support to Data and HeteroData (#4816)
    • Added FeatureStore support to Data and HeteroData (#4807, #4853)
    • Added FeatureStore and GraphStore abstractions (#4534, #4568)
    • Added support for dense aggregations in global_*_pool (#4827)
    • Added Python version requirement (#4825)
    • Added TorchScript support to JumpingKnowledge module (#4805)
    • Added a max_sample argument to AddMetaPaths in order to tackle very dense metapath edges (#4750)
    • Test HANConv with empty tensors (#4756, #4841)
    • Added the bias vector to the GCN model definition in the "Create Message Passing Networks" tutorial (#4755)
    • Added transforms.RootedSubgraph interface with two implementations: RootedEgoNets and RootedRWSubgraph (#3926)
    • Added ptr vectors for follow_batch attributes within Batch.from_data_list (#4723)
    • Added torch_geometric.nn.aggr package (#4687, #4721, #4731, #4762, #4749, #4779, #4863, #4864, #4865, #4866, #4872, #4934, #4935, #4957, #4973, #4986, #4995, #5000, #5034, #5036, #5039, #4522, #5033, #5085, #5097, #5099, #5104, #5113, #5130, #5098, #5191)
    • Added the DimeNet++ model (#4432, #4699, #4700, #4800)
    • Added an example of using PyG with PyTorch Ignite (#4487)
    • Added GroupAddRev module with support for reducing training GPU memory (#4671, #4701, #4715, #4730)
    • Added benchmarks via wandb (#4656, #4672, #4676)
    • Added unbatch functionality (#4628)
    • Confirm that to_hetero() works with custom functions, e.g., dropout_adj (#4653)
    • Added the MLP.plain_last=False option (#4652)
    • Added a check in HeteroConv and to_hetero() to ensure that MessagePassing.add_self_loops is disabled (#4647)
    • Added HeteroData.subgraph() support (#4635)
    • Added the AQSOL dataset (#4626)
    • Added HeteroData.node_items() and HeteroData.edge_items() functionality (#4644)
    • Added PyTorch Lightning support in GraphGym (#4511, #4516, #4531, #4689, #4843)
    • Added support for returning embeddings in MLP models (#4625)
    • Added faster initialization of NeighborLoader in case edge indices are already sorted (via is_sorted=True) (#4620, #4702)
    • Added AddPositionalEncoding transform (#4521)
    • Added HeteroData.is_undirected() support (#4604)
    • Added the Genius and Wiki datasets to datasets.LINKXDataset (#4570, #4600)
    • Added nn.aggr.EquilibriumAggregation implicit global layer (#4522)
    • Added support for graph-level outputs in to_hetero (#4582)
    • Added CHANGELOG.md (#4581)
    • Added HeteroData support to the RemoveIsolatedNodes transform (#4479)
    • Added HeteroData.num_features functionality (#4504)
    • Added support for projecting features before propagation in SAGEConv (#4437)
    • Added Geom-GCN splits to the Planetoid datasets (#4442)
    • Added a LinkNeighborLoader for training scalable link prediction models (#4396, #4439, #4441, #4446, #4508, #4509)
    • Added an unsupervised GraphSAGE example on PPI (#4416)
    • Added support for LSTM aggregation in SAGEConv (#4379)
    • Added support for floating-point labels in RandomLinkSplit (#4311, #4383)
    • Added support for torch.data DataPipes (#4302, #4345, #4349)
    • Added support for the cosine argument in the KNNGraph/RadiusGraph transforms (#4344)
    • Added support for graph-level attributes in networkx conversion (#4343)
    • Added support for renaming node types via HeteroData.rename (#4329)
    • Added an example to load a trained PyG model in C++ (#4307)
    • Added a MessagePassing.explain_message method to customize making explanations on messages (#4278, #4448)
    • Added support for GATv2Conv in the nn.models.GAT model (#4357)
    • Added HeteroData.subgraph functionality (#4243)
    • Added the MaskLabel module and a corresponding masked label propagation example (#4197)
    • Added temporal sampling support to NeighborLoader (#4025)
    • Added an example for unsupervised heterogeneous graph learning based on "Deep Multiplex Graph Infomax" (#3189)
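
    The unbatch_edge_index entry above splits a batched edge_index according to a batch vector. As a rough illustration of what such a split involves, here is a plain-Python sketch using lists instead of tensors. The function name unbatch_edge_index_sketch is hypothetical, and the sketch assumes that the batch vector is sorted and that each edge belongs to the graph of its source node; it is not the library implementation.

```python
from collections import Counter


def unbatch_edge_index_sketch(edge_index, batch):
    """Split a batched edge list into per-graph edge lists.

    edge_index: pair of lists (sources, targets) holding batch-wide node indices.
    batch: batch[i] is the graph index of node i (assumed to be sorted).
    """
    num_graphs = max(batch) + 1 if batch else 0

    # Count nodes per graph, then compute the index of each graph's first node.
    counts = Counter(batch)
    offsets, total = [], 0
    for g in range(num_graphs):
        offsets.append(total)
        total += counts[g]

    # Assign every edge to the graph of its source node and shift both
    # endpoints so that node indices start at 0 within each graph.
    parts = [([], []) for _ in range(num_graphs)]
    for src, dst in zip(*edge_index):
        g = batch[src]
        parts[g][0].append(src - offsets[g])
        parts[g][1].append(dst - offsets[g])
    return parts


edge_index = ([0, 1, 1, 2, 2, 3, 4, 5, 5, 6],
              [1, 0, 2, 1, 3, 2, 5, 4, 6, 5])
batch = [0, 0, 0, 0, 1, 1, 1]
parts = unbatch_edge_index_sketch(edge_index, batch)
print(parts)
```

    In this example, nodes 0 to 3 form the first graph and nodes 4 to 6 the second, so the edges of the second graph are shifted down by four.
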
    Changed
    • Changed docstring for RandomLinkSplit (#5190)
    • Switched to PyTorch scatter_reduce implementation - experimental feature (#5120)
    • Fixed RGATConv device mismatches for f-scaled mode (#5187)
    • Allow for multi-dimensional edge_labels in LinkNeighborLoader (#5186)
    • Fixed GINEConv bug with non-sequential input (#5154)
    • Improved error message (#5095)
    • Fixed HGTLoader bug which produced outputs with missing edge types (#5067)
    • Fixed dynamic inheritance issue in data batching (#5051)
    • Fixed load_state_dict in Linear with strict=False mode (#5094)
    • Fixed typo in MaskLabel.ratio_mask (#5093)
    • Fixed data.num_node_features computation for sparse matrices (#5089)
    • Fixed torch.fx bug with torch.nn.aggr package (#5021)
    • Fixed GenConv test (#4993)
    • Fixed packaging tests for Python 3.10 (#4982)
    • Changed act_dict (part of graphgym) to create individual instances instead of reusing the same ones everywhere (#4978)
    • Fixed issue where one-hot tensors were passed to F.one_hot (#4970)
    • Fixed bool arguments in argparse in benchmark/ (#4967)
    • Fixed BasicGNN for num_layers=1, which now respects a desired number of out_channels (#4943)
    • len(batch) will now return the number of graphs inside the batch, not the number of attributes (#4931)
    • Fixed data.subgraph generation for 0-dim tensors (#4932)
    • Removed unnecessary inclusion of self-loops when sampling negative edges (#4880)
    • Fixed InMemoryDataset inferring wrong len for lists of tensors (#4837)
    • Fixed Batch.separate when using it for lists of tensors (#4837)
    • Correct docstring for SAGEConv (#4852)
    • Fixed a bug in TUDataset where pre_filter was not applied whenever pre_transform was present (#4842)
    • Renamed RandomTranslate to RandomJitter - the usage of RandomTranslate is now deprecated (#4828)
    • Do not allow accessing edge types in HeteroData with two node types when there exist multiple relations between these types (#4782)
    • Allow edge_type == rev_edge_type argument in RandomLinkSplit (#4757)
    • Fixed a numerical instability in the GeneralConv and neighbor_sample tests (#4754)
    • Fixed a bug in HANConv in which destination node features rather than source node features were propagated (#4753)
    • Fixed versions of checkout and setup-python in CI (#4751)
    • Fixed protobuf version (#4719)
    • Fixed the ranking protocol bug in the RGCN link prediction example (#4688)
    • Math support in Markdown (#4683)
    • Allow for setter properties in Data (#4682, #4686)
    • Allow for optional edge_weight in GCN2Conv (#4670)
    • Fixed the interplay between TUDataset and pre_transform transformations that modify node features (#4669)
    • Make use of the pyg_sphinx_theme documentation template (#4664, #4667)
    • Refactored reading molecular positions from sdf file for qm9 datasets (#4654)
    • Fixed MLP.jittable() bug in case return_emb=True (#4645, #4648)
    • The generated node features of StochasticBlockModelDataset are now ordered with respect to their labels (#4617)
    • Fixed typos in the documentation (#4616, #4824, #4895, #5161)
    • The bias argument in TAGConv is now actually applied (#4597)
    • Fixed subclass behaviour of process and download in Dataset (#4586)
    • Fixed filtering of attributes for loaders in case __cat_dim__ != 0 (#4629)
    • Fixed SparseTensor support in NeighborLoader (#4320)
    • Fixed average degree handling in PNAConv (#4312)
    • Fixed a bug in from_networkx in case some attributes are PyTorch tensors (#4486)
    • Added a missing clamp in DimeNet (#4506, #4562)
    • Fixed the download link in DBP15K (#4428)
    • Fixed an autograd bug in DimeNet when resetting parameters (#4424)
    • Fixed bipartite message passing in case flow="target_to_source" (#4418)
    • Fixed a bug in which num_nodes was not properly updated in the FixedPoints transform (#4394)
    • PyTorch Lightning >= 1.6 support (#4377)
    • Fixed a bug in which GATConv was not jittable (#4347)
    • Fixed a bug in which the GraphGym config was not stored in each specific experiment directory (#4338)
    • Fixed a bug in which nn.models.GAT did not produce out_channels-many output channels (#4299)
    • Fixed mini-batching with empty lists as attributes (#4293)
    • Fixed a bug in which GCNConv could not be combined with to_hetero on heterogeneous graphs with one node type (#4279)
    Removed
    • Remove internal metrics in favor of torchmetrics (#4287)

    Full commit list: https://github.com/pyg-team/pytorch_geometric/compare/2.0.4...2.1.0

  • 2.0.4(Mar 12, 2022)

    PyG 2.0.4 🎉

    A new minor PyG version release, bringing PyTorch 1.11 support to PyG. It further includes a variety of new features and bugfixes:

    Features

    • Added Quiver examples for multi-GPU training using GraphSAGE (#4103), thanks to @eedalong and @luomai
    • nn.models.to_captum: Full integration of explainability methods provided by the Captum library (#3990, #4076), thanks to @RBendias
    • nn.conv.RGATConv: The relational graph attentional operator (#4031, #4110), thanks to @fork123aniket
    • nn.pool.DMoNPooling: The spectral modularity pooling operator (#4166, #4242), thanks to @fork123aniket
    • nn.*: Support for shape information in the documentation (#3739, #3889, #3893, #3946, #3981, #4009, #4120, #4158), thanks to @saiden89 and @arunppsg and @konstantinosKokos
    • loader.TemporalDataLoader: A dataloader to load a TemporalData object in mini-batches (#3985, #3988), thanks to @otaviocx
    • loader.ImbalancedSampler: A weighted random sampler that randomly samples elements according to class distribution (#4198)
    • transforms.VirtualNode: A transform that adds a virtual node to a graph (#4163)
    • transforms.LargestConnectedComponents: Selects the subgraph that corresponds to the largest connected components in the graph (#3949), thanks to @abojchevski
    • utils.homophily: Support for class-insensitive edge homophily (#3977, #4152), thanks to @hash-ir and @jinjh0123
    • utils.get_mesh_laplacian: Mesh Laplacian computation (#4187), thanks to @daniel-unyi-42
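
    The loader.ImbalancedSampler entry above describes a weighted random sampler driven by the class distribution. A minimal plain-Python sketch of that idea follows, assuming each sample is weighted by the inverse frequency of its class and indices are drawn with replacement. The helper class_balanced_weights is hypothetical and only illustrates the weighting; it is not the PyG sampler.

```python
import random
from collections import Counter


def class_balanced_weights(labels):
    """Weight each sample by the inverse frequency of its class."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


# Six samples of class 0, two of class 1 and one of class 2.
labels = [0, 0, 0, 0, 0, 0, 1, 1, 2]
weights = class_balanced_weights(labels)

# Draw indices with replacement. Since the weights of each class sum to 1,
# every class has the same total probability of being picked.
rng = random.Random(0)
sampled = rng.choices(range(len(labels)), weights=weights, k=len(labels))
print(weights)
print([labels[i] for i in sampled])
```

    With these weights, a rare class is drawn as often in expectation as a frequent one, which is the usual motivation for this kind of sampler on imbalanced node labels.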

    Datasets

    • Added a dataset cheatsheet to the documentation that collects important graph statistics across a variety of datasets supported in PyG (#3807, #3817) (please consider helping us fill in its remaining content)
    • datasets.EllipticBitcoinDataset: A dataset of Bitcoin transactions (#3815), thanks to @shravankumar147

    Minor Changes

    • nn.models.MLP: MLPs can now either be initialized via a list of channels or by specifying hidden_channels and num_layers (#3957)
    • nn.models.BasicGNN: Final Linear transformations are now always applied (except for jk=None) (#4042)
    • nn.conv.MessagePassing: Message passing modules that make use of edge_updater are now jittable (#3765), thanks to @Padarn
    • nn.conv.MessagePassing: (Official) support for min and mul aggregations (#4219)
    • nn.LightGCN: Initialize embeddings via xavier_uniform for better model performance (#4083), thanks to @nishithshowri006
    • nn.conv.ChebConv: Automatic eigenvalue approximation (#4106), thanks to @daniel-unyi-42
    • nn.conv.APPNP: Added support for optional edge_weight (690a01d), thanks to @YueeXiang
    • nn.conv.GravNetConv: Support for torch.jit.script (#3885), thanks to @RobMcH
    • nn.pool.global_*_pool: The batch vector is now optional (#4161)
    • nn.to_hetero: Added a warning in case to_hetero is used on HeteroData metadata with unused destination node types (#3775)
    • nn.to_hetero: Support for nested modules (ea135bf)
    • nn.Sequential: Support for indexing (#3790)
    • nn.Sequential: Support for OrderedDict as input (#4075)
    • datasets.ZINC: Added an in-depth description of the task (#3832), thanks to @gasteigerjo
    • datasets.FakeDataset: Support for different feature distributions across different labels (#4065), thanks to @arunppsg
    • datasets.FakeDataset: Support for custom global attributes (#4074), thanks to @arunppsg
    • transforms.NormalizeFeatures: Features will no longer be transformed in-place (ada5b9a)
    • transforms.NormalizeFeatures: Support for negative feature values (6008e30)
    • utils.is_undirected: Improved efficiency (#3789)
    • utils.dropout_adj: Improved efficiency (#4059)
    • utils.contains_isolated_nodes: Improved efficiency (970de13)
    • utils.to_networkx: Support for to_undirected options (upper triangle vs. lower triangle) (#3901, #3948), thanks to @RemyLau
    • graphgym: Support for custom metrics and loggers (#3494), thanks to @RemyLau
    • graphgym.register: Register operations can now be used as class decorators (#3779, #3782)
    • Documentation: Added a few exercises at the end of documentation tutorials (#3780), thanks to @PabloAMC
    • Documentation: Added better installation instructions to CONTRIBUTING.md (#3803, #3991, #3995), thanks to @Cho-Geonwoo and @RBendias and @RodrigoVillatoro
    • Refactor: Clean-up dependencies (#3908, #4133, #4172), thanks to @adelizer
    • CI: Improved test runtimes (#4241)
    • CI: Additional linting check via yamllint (#3886)
    • CI: Additional linting check via isort (66b1780), thanks to @mananshah99
    • torch.package: Model packaging via torch.package (#3997)
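
    The nn.models.MLP entry above mentions two ways of describing the same network: an explicit list of channels, or hidden_channels together with num_layers. The sketch below shows one plausible way a layer count could expand into a channel list. The helper expand_channels and its in_channels and out_channels parameters are hypothetical; the exact expansion used by PyG should be checked against its documentation.

```python
def expand_channels(in_channels, hidden_channels, out_channels, num_layers):
    """Build the per-layer channel list implied by a layer count.

    A network with num_layers linear layers needs num_layers + 1 sizes:
    the input size, num_layers - 1 hidden sizes, and the output size.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    return [in_channels] + [hidden_channels] * (num_layers - 1) + [out_channels]


channel_list = expand_channels(16, 32, 8, num_layers=3)
print(channel_list)  # [16, 32, 32, 8]
```

    Under this assumption, specifying hidden_channels=32 and num_layers=3 is equivalent to passing the list [16, 32, 32, 8] directly.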

    Bugfixes

    • data.HeteroData: Fixed a bug in data.{attr_name}_dict in case data.{attr_name} does not exist (#3897)
    • data.Data: Fixed data.is_edge_attr in case data.num_edges == 1 (#3880)
    • data.Batch: Fixed a device mismatch bug in case a batch object was indexed that was created from GPU tensors (e6aa4c9, c549b3b)
    • data.InMemoryDataset: Fixed a bug in which copy did not respect the underlying slice (d478dcb, #4223)
    • nn.conv.MessagePassing: Fixed message passing with zero nodes/edges (#4222)
    • nn.conv.MessagePassing: Fixed bipartite message passing with flow="target_to_source" (#3907)
    • nn.conv.GeneralConv: Fixed an issue in case skip_linear=False and in_channels=out_channels (#3751), thanks to @danielegrattarola
    • nn.to_hetero: Fixed model transformation in case node type names or edge type names contain whitespaces or dashes (#3882, b63a660)
    • nn.dense.Linear: Fixed a bug in lazy initialization for PyTorch < 1.8.0 (973d17d, #4086)
    • nn.norm.LayerNorm: Fixed a bug in the shape of weights and biases (#4030), thanks to @marshka
    • nn.pool: Fixed torch.jit.script support for torch-cluster functions (#4047)
    • datasets.TOSCA: Fixed a bug in which indices of faces started at 1 rather than 0 (8c282a0), thanks to @JRowbottomGit
    • datasets.WikiCS: Fixed WikiCS to be undirected by default (#3796), thanks to @pmernyei
    • Resolved inconsistency between utils.contains_isolated_nodes and data.has_isolated_nodes (#4138)
    • graphgym: Fixed the loss function regarding multi-label classification (#4206), thanks to @RemyLau
    • Documentation: Fixed typos, grammar and bugs (#3840, #3874, #3875, #4149), thanks to @itamblyn and @chrisyeh96 and @finquick
  • 2.0.3(Dec 22, 2021)

    PyG 2.0.3 🎉

    A new minor PyG version release, including a variety of new features and bugfixes:

    Features

    Datasets

    Minor Changes

    • torch_geometric.nn.norm: Improved the runtimes of normalization layers - thanks to @johnpeterflynn
    • DataLoader and NeighborLoader: Output tensors are now written to shared memory to avoid an extra copy in case num_workers > 0 (#3401 and #3734) - thanks to @johnpeterflynn
    • GATv2Conv: Support for edge features (#3421) - thanks to @Kenneth-Schroeder
    • Batch.from_data_list: Runtime improvements
    • TransformerConv: Runtime and memory consumption improvements (#3392) - thanks to @wsad1
    • mean_iou: Added IoU computation via omitting NaNs (#3464) - thanks to @GericoVi
    • DataLoader: follow_batch and exclude_keys are now optional arguments
    • Improvements to the package metadata (#3445) - thanks to @cthoyt
    • Updated the quick start widget to support PyTorch 1.10 (#3474) - thanks to @kathyfan
    • NeighborLoader and HGTLoader: Removed the persistent_workers=True default
    • voxel_grid: The batch argument is now optional (#3533) - thanks to @QuanticDisaster
    • TransformerConv: JIT support (#3538) - thanks to @RobMcH
    • Lazy modules can now correctly be saved and loaded via state_dict() and load_state_dict() (#3651) - thanks to @shubham-gupta-iitr
    • from_networkx: Support for nx.MultiDiGraph (#3646) - thanks to @max-zipfl-fzi
    • GATv2Conv: Support for lazy initialization (#3678) - thanks to @richcmwang
    • torch_geometric.graphgym: register_* functions can now be used as decorators (#3684)
    • AddSelfLoops: Now supports the full argument set of torch_geometric.utils.add_self_loops (#3702) - thanks to @dongkwan-kim
    • Documentation: Added shape information to a variety of GNN operators, e.g., GATConv or ChebConv (#3697) - thanks to @saiden89
    • GATv2Conv and HEATConv: Removed unnecessary size argument in forward (#3744) - thanks to @saiden89
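
    The mean_iou entry above refers to omitting NaNs. A NaN arises when a class appears in neither the predictions nor the targets, so its intersection-over-union is 0/0. The plain-Python sketch below illustrates the difference between skipping such classes and counting them. The function mean_iou_sketch and its omit_nans flag are hypothetical, and treating an undefined IoU as 1 in the non-omitting branch is an assumption of this sketch.

```python
import math


def mean_iou_sketch(pred, target, num_classes, omit_nans=False):
    """Average the per-class intersection-over-union of two label lists."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        # A class absent from both lists has an undefined IoU (0 / 0).
        ious.append(inter / union if union > 0 else math.nan)

    if omit_nans:
        # Average only over classes that actually occur.
        valid = [v for v in ious if not math.isnan(v)]
        return sum(valid) / len(valid) if valid else math.nan

    # Assumption: an undefined IoU counts as a perfect score of 1.
    return sum(1.0 if math.isnan(v) else v for v in ious) / num_classes


pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
with_nans = mean_iou_sketch(pred, target, num_classes=3)
without_nans = mean_iou_sketch(pred, target, num_classes=3, omit_nans=True)
print(with_nans, without_nans)
```

    Here class 2 never occurs, so omitting it averages only the IoUs of classes 0 and 1 (1/2 and 2/3), while counting it as 1 inflates the mean.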

    Bugfixes

    • GNNExplainer: Fixed a bug in the GCN example in which normalization coefficients were wrongly calculated (#3508) - thanks to @RBendias
    • HGTConv: Fixed a bug in the residual connection formulation - thanks to @zzhnobug
    • torch_geometric.graphgym: Fixed a bug in the creation of MLP (#3431) - thanks to @JiaxuanYou
    • torch_geometric.graphgym: Fixed a bug in the dimensionality of GeneralMultiLayer (#3456) - thanks to @JiaxuanYou
    • RandomLinkSplit: Fixed a bug in negative edge sampling for undirected graphs (#3440) - thanks to @panisson
    • add_self_loops: Fixed a bug in adding self-loops with scalar-valued weights
    • SchNet: Fixed a bug in which a bias vector was not correctly initialized as zero - thanks to @nec4
    • Batch.from_data_list: Replaced the torch.repeat_interleave call due to errors in forked processes (#3566) - thanks to @Enolerobotti
    • NeighborLoader: Fixed a bug in conjunction with PyTorch Lightning (#3602) - thanks to @pbielak
    • NeighborLoader and ToSparseTensor: Fixed a bug in case num_nodes == num_edges (#3683) - thanks to @WuliangHuang
    • ToUndirected: Fixed a bug in case num_nodes == 2 (#3627) - thanks to @aur3l14no
    • FiLMConv: Fixed a bug in the backward pass due to the usage of in-place operations - thanks to @Jokeren
    • GDC: Fixed a bug in case K > num_nodes - thanks to @Misterion777
    • LabelPropagation: Fixed a bug in the order of transformations (#3639) - thanks to @Riyer01
    • negative_sampling: Fixed execution for GPU input tensors - thanks to @Sticksword and @lmy86263
    • HeteroData: Fixed a bug in which node types were interpreted as edge types in case they were described by two characters (#3692)
    • FastRGCNConv: Fixed a bug in which weights were indexed on destination node index rather than source node index (#3690) - thanks to @Jokeren
    • WikipediaNetwork: Fixed a bug in downloading due to a change in URLs - thanks to @csbobby and @Kousaka-Honoka
  • 2.0.2(Oct 26, 2021)

    A new minor version release, including further bugfixes, official PyTorch 1.10 support, as well as additional features and operators:

    Features

    Minor Changes

    • Data.to_homogeneous will now add node_type information to the homogeneous Data object
    • GINEConv can now transform edge features automatically in case their dimensionalities do not match (thanks to @CaypoH)
    • OGB_MAG will now add node_year information to paper nodes
    • Entities datasets now allow the processing of HeteroData objects via the hetero=True option
    • Batch objects can now be batched together to form super batches
    • Added heterogeneous graph support for Center, Constant and LinearTransformation transformations
    • HeteroConv can now return "stacked" embeddings
    • The batch vector of a Batch object will now be initialized on the GPU in case other attributes are held in GPU memory

    Bugfixes

    • Fixed the num_neighbors argument of NeighborLoader in order to specify an edge-type specific number of neighbors
    • Fixed the collate policy of lists of integers/strings to return nested lists
    • Fixed the Delaunay transformation in case the face attribute is not present in the data
    • Fixed the TGNMemory module to only read from the latest update (thanks to @cwh104504)
    • Fixed the pickle.PicklingError when Batch objects are used in a torch.multiprocessing.manager.Queue() (thanks to @RasmusOrsoe)
    • Fixed an issue with _parent state changing after pickling of Data objects (thanks to @zepx)
    • Fixed the ToUndirected transformation in case the number of edges and nodes are equal (thanks to @lmkmkrcc)
    • Fixed the from_networkx routine in case node-level and edge-level features share the same names
    • Removed the num_nodes warning when creating PairData objects
    • Fixed the initialization of the GeneralMultiLayer module in GraphGym (thanks to @fjulian)
    • Fixed custom model registration in GraphGym
    • Fixed a clash in the run_dir naming of GraphGym (thanks to @fjulian)
    • Fixed a GraphGym crash in case the ROC score is undefined (thanks to @fjulian)
    • Fixed the Batch.from_data_list routine on dataset slices (thanks to @dtortorella)
    • Fixed the MetaPath2Vec model in case there exist isolated nodes
    • Fixed torch_geometric.utils.coalesce with CUDA tensors
  • 2.0.1(Sep 16, 2021)

    PyG 2.0.1

    This is a minor release, bringing some emergency fixes to PyG 2.0.

    Bugfixes

    • Fixed a bug in loader.DataLoader that raised a PicklingError for num_workers > 0 (thanks to @r-echeveste, @arglog and @RishabhPandit-00)
    • Fixed a bug in the creation of data.Batch objects in case customized data.Data objects expect non-default arguments (thanks to @Emiyalzn)
    • Fixed a bug in which SparseTensor attributes could not be batched along single dimensions (thanks to @rubenwiersma)
  • 2.0.0(Sep 13, 2021)

    PyG 2.0 🎉 🎉 🎉

    PyG (PyTorch Geometric) has been moved from my own personal account rusty1s to its own organization account pyg-team to emphasize the ongoing collaboration between TU Dortmund University, Stanford University and many great external contributors. With this, we are releasing PyG 2.0, a new major release that brings sophisticated heterogeneous graph support, GraphGym integration and many other exciting features to PyG.

    If you encounter any bugs in this new release, please do not hesitate to create an issue.

    Heterogeneous Graph Support

    We finally provide full heterogeneous graph support in PyG 2.0. See here for the accompanying tutorial.

    Highlights

    • Heterogeneous Graph Storage: Heterogeneous graphs can now be stored in their own dedicated data.HeteroData class (thanks to @yaoyaowd):

      import torch
      from torch_geometric.data import HeteroData
      
      data = HeteroData()
      
      # Create two node types "paper" and "author" holding a single feature matrix:
      data['paper'].x = torch.randn(num_papers, num_paper_features)
      data['author'].x = torch.randn(num_authors, num_authors_features)
      
      # Create an edge type ("paper", "written_by", "author") holding its graph connectivity:
      data['paper', 'written_by', 'author'].edge_index = ...  # [2, num_edges]
      

      data.HeteroData behaves similarly to a regular homogeneous data.Data object:

      print(data['paper'].num_nodes)
      print(data['paper', 'written_by', 'author'].num_edges)
      data = data.to('cuda')
      
    • Heterogeneous Mini-Batch Loading: Heterogeneous graphs can be converted to mini-batches for many small and single giant graphs via the loader.DataLoader and loader.NeighborLoader loaders, respectively. These loaders can now handle both homogeneous and heterogeneous graphs:

      from torch_geometric.loader import DataLoader
      
      loader = DataLoader(heterogeneous_graph_dataset, batch_size=32, shuffle=True)
      
      from torch_geometric.loader import NeighborLoader
      
      loader = NeighborLoader(heterogeneous_graph, num_neighbors=[30, 30], batch_size=128,
                              input_nodes=('paper', heterogeneous_graph['paper'].train_mask),
                              shuffle=True)
      
    • Heterogeneous Graph Neural Networks: Heterogeneous GNNs can now easily be created from homogeneous ones via nn.to_hetero and nn.to_hetero_with_bases. These functions take an existing GNN model and duplicate its message functions to account for different node and edge types:

      import torch
      from torch_geometric.nn import SAGEConv, to_hetero
      
      class GNN(torch.nn.Module):
          def __init__(self, hidden_channels, out_channels):
              super().__init__()
              self.conv1 = SAGEConv((-1, -1), hidden_channels)
              self.conv2 = SAGEConv((-1, -1), out_channels)
      
          def forward(self, x, edge_index):
              x = self.conv1(x, edge_index).relu()
              x = self.conv2(x, edge_index)
              return x
      
      model = GNN(hidden_channels=64, out_channels=dataset.num_classes)
      model = to_hetero(model, data.metadata(), aggr='sum')
      

    Additional Features

    Managing Experiments with GraphGym

    GraphGym is now officially supported in PyG 2.0 via torch_geometric.graphgym. See here for the accompanying tutorial. Overall, GraphGym is a platform for designing and evaluating Graph Neural Networks from configuration files via a highly modularized pipeline (thanks to @JiaxuanYou):

    1. GraphGym is the perfect place to start learning about standardized GNN implementation and evaluation
    2. GraphGym provides a simple interface to try out thousands of GNN architectures in parallel to find the best design for your specific task
    3. GraphGym lets you easily do hyper-parameter search and visualize what design choices are better

    Breaking Changes

    • The datasets.AMiner dataset now returns a data.HeteroData object. See here for our updated MetaPath2Vec example on AMiner.
    • transforms.AddTrainValTestMask has been replaced by transforms.RandomNodeSplit
    • Since the storage layout of data.Data significantly changed in order to support heterogeneous graphs, already processed datasets need to be re-processed by deleting the root/processed folder.
    • data.Data.__cat_dim__ and data.Data.__inc__ now expect additional input arguments:
      def __cat_dim__(self, key, value, *args, **kwargs):
          pass
      
      def __inc__(self, key, value, *args, **kwargs):
          pass
      

      In case you modified __cat_dim__ or __inc__ functionality in a customized data.Data object, please make sure to apply the above changes.

    Deprecations

    Additional Features

    Minor Changes

    • Heavily improved loading times of import torch_geometric
    • nn.Sequential is now fully jittable
    • nn.conv.LEConv is now fully jittable (thanks to @lucagrementieri)
    • nn.conv.GENConv can now make use of "add", "mean" or "max" aggregations (thanks to @riskiem)
    • Attributes of type torch.nn.utils.rnn.PackedSequence are now correctly handled by data.Data and data.HeteroData (thanks to @WuliangHuang)
    • Added support for data.record_stream() in order to allow for data prefetching (thanks to @FarzanT)
    • Added a max_num_neighbors attribute to nn.models.SchNet and nn.models.DimeNet (thanks to @nec4)
    • nn.conv.MessagePassing is now jittable in case message, aggregate and update return multiple arguments (thanks to @PhilippThoelke)
    • utils.from_networkx now supports grouping of node-level and edge-level features (thanks to @PabloAMC)
    • Transforms now inherit from transforms.BaseTransform to ease type checking (thanks to @CCInc)
    • Added support for the deletion of data attributes via del data[key] (thanks to @Linux-cpp-lisp)

    Bugfixes

    • The transforms.LinearTransformation transform now correctly transposes the input matrix before applying the transformation (thanks to @beneisner)
    • Fixed a bug in benchmark/kernel that prevented the application of DiffPool on the IMDB-BINARY dataset (thanks to @dongZheX)
    • Feature dimensionalities of datasets.WikipediaNetwork now match the officially reported ones in case geom_gcn_preprocess=True (thanks to @ZhuYun97 and @GitEventhandler)
    • Fixed a bug in the datasets.DynamicFAUST dataset in which data.num_nodes was undefined (thanks to @koustav123)
    • Fixed a bug in which nn.models.GNNExplainer could not handle GNN operators that add self-loops to the graph in case self-loops were already present (thanks to @tw200464tw and @NithyaBhasker)
    • nn.norm.LayerNorm may no longer produce NaN gradients (thanks to @fbragman)
    • Fixed a bug in which it was not possible to customize networkx drawing arguments in nn.models.GNNExplainer.visualize_subgraph() (thanks to @jvansan)
    • transforms.RemoveIsolatedNodes now correctly removes isolated nodes in case data.num_nodes is explicitly set (thanks to @blakechi)
  • 1.7.2(Jun 26, 2021)

    Datasets

    Bugfixes

    • Fixed an error in DeepGCNLayer in case no normalization layer is provided (thanks to @lukasfolle)
    • Fixed a bug in GNNExplainer which mixed the loss computation for graph-level and node-level predictions (thanks to @panisson and @wsad1)
  • 1.7.1(Jun 17, 2021)

    A minor release that brings PyTorch 1.9.0 and Python 3.9 support to PyTorch Geometric. In case you are in the process of updating to PyTorch 1.9.0, please re-install the external dependencies for PyTorch 1.9.0 as well (torch-scatter and torch-sparse).

    Features

    • EGConv (thanks to @shyam196)
    • GATv2Conv (thanks to @shakedbr)
    • GraphNorm normalization layer
    • GNNExplainer now supports explaining graph-level predictions (thanks to @wsad1)
    • bro and gini regularization (thanks to @rhsimplex)
    • train_test_split_edges() and to_undirected() can now handle edge features (thanks to @saiden89 and @SherylHYX)
    • Datasets can now be accessed with np.ndarray as well (thanks to @josephenguehard)
    • dense_to_sparse can now handle batched adjacency matrices
    • numba is now an optional dependency
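As a rough illustration of what handling batched adjacency matrices involves, the sketch below converts a batch of dense matrices into a single edge list. It uses nested Python lists instead of tensors, and the function name and the offset scheme (graph `b` is shifted by `b * N`) are assumptions of this sketch rather than a statement about the library's code.

```python
def dense_to_sparse_batched(adj):
    """Convert a batch of dense adjacency matrices into one edge list.

    adj: list of B square matrices (nested lists), each of size N x N.
    Nodes of graph b are offset by b * N, so the batch forms one large graph.
    Returns ((rows, cols), values).
    """
    rows, cols, values = [], [], []
    for b, matrix in enumerate(adj):
        offset = b * len(matrix)
        for i, row in enumerate(matrix):
            for j, value in enumerate(row):
                if value != 0:  # keep only existing edges
                    rows.append(offset + i)
                    cols.append(offset + j)
                    values.append(value)
    return (rows, cols), values


batch = [
    [[0, 1], [1, 0]],  # graph 0: edges 0 -> 1 and 1 -> 0
    [[0, 2], [0, 0]],  # graph 1: edge 0 -> 1 with weight 2
]
print(dense_to_sparse_batched(batch))  # (([0, 1, 2], [1, 0, 3]), [1, 1, 2])
```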

    Datasets

    • The tree-structured fake news propagation UPFD dataset (thanks to @YingtongDou)
    • The large-scale AmazonProducts graph from the GraphSAINT paper
    • Added support for two more datasets in the SNAPDataset benchmark suite (thanks to @SherylHYX)

    Issues

    • Fixed an issue in which SuperGATConv used all positive edges for computing the auxiliary loss (thanks to @anniekmyatt)
    • Fixed a bug in which MemPooling produced NaN gradients (thanks to @wsad1)
    • Fixed an issue in which the schnetpack package was required for training SchNet (thanks to @mshuaibii)
    • Modified XConv to sample without replacement in case dilation > 1 (thanks to @mayur-ag)
    • GraphSAINTSampler can now be used in combination with PyTorch Lightning
    • Fixed a bug in HypergraphConv in case num_nodes > num_edges (thanks to @THinnerichs)
  • 1.7.0(Apr 9, 2021)

    Major Features

    Additional Features

    Minor Changes

    • More memory-efficient implementation of GCN2Conv
    • Improved TransformerConv with the beta argument being input and message dependent (thanks to @ldv1)
    • NeighborSampler now works with SparseTensor and supports an additional transform argument
    • Batch.from_data_list now supports batching along a new dimension via returning None in Data.__cat_dim__, see here for the accompanying tutorial (thanks to @Linux-cpp-lisp)
    • MetaLayer is now "jittable"
    • Lazy loading of torch_geometric.nn and torch_geometric.datasets, leading to faster imports (thanks to @Linux-cpp-lisp)
    • GNNExplainer now supports various output formats of the underlying GNN model (thanks to @wsad1)
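The difference between concatenating an attribute and batching it along a new dimension (the `None` case of `Data.__cat_dim__` mentioned above) can be sketched with plain lists. `collate_attribute` is a hypothetical helper written for this illustration; it handles only dimension 0 and the `None` case.

```python
def collate_attribute(values, cat_dim):
    """Combine one attribute taken from several graphs.

    cat_dim = 0    -> concatenate along the existing first dimension.
    cat_dim = None -> stack along a new leading (batch) dimension.
    """
    if cat_dim is None:
        # One entry per graph: a list of shape [num_graphs, ...].
        return [list(v) for v in values]
    if cat_dim != 0:
        raise NotImplementedError("this sketch only handles dim 0 or None")
    merged = []
    for v in values:
        merged.extend(v)
    return merged


per_graph = [[1.0, 2.0], [3.0, 4.0]]       # the same attribute in two graphs
print(collate_attribute(per_graph, 0))     # [1.0, 2.0, 3.0, 4.0]
print(collate_attribute(per_graph, None))  # [[1.0, 2.0], [3.0, 4.0]]
```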

    Datasets

    Bugfixes

  • 1.6.3(Dec 2, 2020)

  • 1.6.2(Nov 27, 2020)

    Features

    Minor improvements

    • The SIGN example now operates on mini-batches of nodes
    • Improved data loading runtime of InMemoryDatasets
    • NeighborSampler now works with SparseTensor as input
    • ToUndirected transform in order to convert directed graphs to undirected ones
    • GNNExplainer now allows for customizable edge and node feature loss reduction
    • aggr can now be passed to any GNN based on the MessagePassing interface (thanks to @m30m)
    • Runtime improvements in SEAL (thanks to @muhanzhang)
    • Runtime improvements in torch_geometric.utils.softmax (thanks to @Book1996)
    • GAE.recon_loss now supports custom negative edge indices (thanks to @reshinthadithyan)
    • Faster spmm computation and random_walk sampling on CPU (torch-sparse and torch-cluster updates required)
    • DataParallel now supports the follow_batch argument
    • Parallel approximate PPR computation in the GDC transform (thanks to @klicperajo)
    • Improved documentation by providing an autosummary of all subpackages (thanks to @m30m)
    • Improved documentation on how edge weights are handled in various GNNs (thanks to @m30m)
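What converting a directed graph into an undirected one (the ToUndirected item above) amounts to can be sketched on a plain edge list. The function below is an illustration under simple assumptions (list-based edges, no edge features, sorted output), not the transform's actual code.

```python
def to_undirected(edge_index):
    """Add the reverse of every edge and drop duplicates.

    edge_index: pair of lists (sources, targets).
    Returns a pair of lists, sorted by (source, target).
    """
    src, dst = edge_index
    edges = set(zip(src, dst)) | set(zip(dst, src))
    ordered = sorted(edges)
    return [s for s, _ in ordered], [d for _, d in ordered]


print(to_undirected(([0, 1], [1, 2])))  # ([0, 1, 1, 2], [1, 0, 2, 1])
```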

    Bugfixes

    • Fixed a bug in GATConv when computing attention coefficients in bipartite graphs
    • Fixed a bug in GraphSAINTSampler that led to wrong edge feature sampling
    • Fixed the DimeNet pretraining link
    • Fixed a bug in processing ego-twitter and ego-gplus of the SNAPDataset collection
    • Fixed a number of broken dataset URLs (ICEWS18, QM9, QM7b, MoleculeNet, Entities, PPI, Reddit, MNISTSuperpixels, ShapeNet)
    • Fixed a bug in which MessagePassing.jittable() tried to write to a file without permission (thanks to @twoertwein)
    • GCNConv does not require edge_weight in case normalize=False
    • Batch.num_graphs will now report the correct number of graphs in case of zero-sized graphs
  • 1.6.1(Aug 5, 2020)

    This is a minor release, mostly focusing on PyTorch 1.6.0 support. All external wheels are now also available for PyTorch 1.6.0.

    New Features

    Bugfixes

    • Fixed a bug which prevented GNNExplainer from working with GATConv
    • Fixed the MessagePassing.jittable call when installing PyG via pip
    • Fixed a bug in torch-sparse where reduce functions with dim=0 did not yield the correct result
    • Fixed a bug in torch-sparse which suppressed all warnings
  • 1.6.0(Jul 7, 2020)

    A new major release, introducing TorchScript support, memory-efficient aggregations, bipartite GNN modules, static graphs and much more!

    Major Features

    • TorchScript support, see here for the accompanying tutorial (thanks to @lgray and @liaopeiyuan)
    • Memory-efficient aggregations via torch_sparse.SparseTensor, see here for the accompanying tutorial
    • Most GNN modules can now operate on bipartite graphs (and some of them can also operate on different feature dimensionalities for source and target nodes), useful for neighbor sampling or heterogeneous graphs:
    conv = SAGEConv(in_channels=(32, 64), out_channels=64)
    out = conv((x_src, x_dst), edge_index)
    
    • Static graph support:
    conv = GCNConv(in_channels=32, out_channels=64)
    
    x = torch.randn(batch_size, num_nodes, in_channels)
    out = conv(x, edge_index)
    print(out.size())
    >>> torch.Size([batch_size, num_nodes, out_channels])
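The index convention behind bipartite message passing can be sketched without any library: source and destination nodes live in separate index spaces, and each edge carries a source feature to a destination node. The helper below is a hypothetical pure-Python illustration using sum aggregation; an operator such as SAGEConv additionally applies learned weights and its own aggregation, which this sketch leaves out.

```python
def bipartite_sum_aggregate(x_src, edge_index, num_dst):
    """Sum source-node features into destination nodes.

    x_src:      feature vectors of the source nodes.
    edge_index: (src, dst); src indexes x_src, dst indexes [0, num_dst).
    Returns one aggregated feature vector per destination node.
    """
    dim = len(x_src[0])
    out = [[0.0] * dim for _ in range(num_dst)]
    for s, d in zip(*edge_index):
        for k in range(dim):
            out[d][k] += x_src[s][k]
    return out


x_src = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]  # three source nodes
edge_index = ([0, 1, 2], [0, 0, 1])           # two destination nodes
print(bipartite_sum_aggregate(x_src, edge_index, num_dst=2))
# [[1.0, 2.0], [3.0, 3.0]]
```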
    

    Additional Features

    Breaking Changes

    Complementary Frameworks

    • DeepSNAP: A PyTorch library that bridges between graph libraries such as NetworkX and PyTorch Geometric
    • PyTorch Geometric Temporal: A temporal GNN library built upon PyTorch Geometric

    Datasets

    Bugfixes

    • Fixed a bug in the VGAE KL-loss computation (thanks to @GuillaumeSalha)
  • 1.5.0(May 25, 2020)

    This release is a big one thanks to many wonderful contributors. You guys are awesome!

    Breaking Changes and Highlights

    • NeighborSampler got completely revamped: it's now much faster, allows for parallel sampling, and makes it easy to apply skip-connections or self-loops. See examples/reddit.py or the newly introduced OGB examples (examples/ogbn_products_sage.py and examples/ogbn_products_gat.py). The latter also sets a new SOTA on the OGB leaderboards (reaching 0.7945 ± 0.0059 test accuracy)
    • SAGEConv now uses concat=True by default, and there is no option to disable it anymore
    • Node2Vec got enhanced by a parallel sampling mechanism, and as a result, its API slightly changed
    • MetaPath2Vec: The first model in PyG that is able to operate on heterogeneous graphs
    • GNNExplainer: Generating explanations for graph neural networks
    • GraphSAINT: A graph sampling based inductive learning method
    • SchNet model for learning on molecular graphs, comes with pre-trained weights for each target of the QM9 dataset (thanks to @Nyuten)

    Additional Features

    • ASAPooling: Adaptive structure aware pooling for learning hierarchical graph representations (thanks to @ekagra-ranjan)
    • ARGVA node clustering example, see examples/argva_node_clustering.py (thanks to @gsoosk)
    • MFConv: Molecular fingerprint graph convolution operator (thanks to @rhsimplex)
    • GIN-E-Conv that extends the GINConv to also account for edge features
    • DimeNet: Directional message passing for molecular graphs
    • SIGN: Scalable inception graph neural networks
    • GravNetConv (thanks to @jkiesele)

    Datasets

    Minor changes

    • GATConv can now return attention weights via the return_attention_weights argument (thanks to @douglasrizzo)
    • InMemoryDataset now has a copy method that converts sliced datasets back into a contiguous memory layout
    • Planetoid got enhanced by the ability to let users choose between different splitting methods (thanks to @dongkwan-kim)
    • k_hop_subgraph: Computes the k-hop subgraph around a subset of nodes
    • geodesic_distance: Geodesic distances can now be computed in parallel (thanks to @jannessm)
    • tree_decomposition: The tree decomposition algorithm for generating junction trees from molecules
    • SortPool benchmark script now uses 1-D convolutions after pooling, leading to better performance (thanks to @muhanzhang)
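For the k-hop subgraph utility listed above, the underlying idea can be sketched in a few lines of pure Python. This is an illustration, not the library function: the name, the return format, and the assumption that messages flow from source to target (so edges are followed backwards from the seed nodes) are choices made for this sketch.

```python
def k_hop_subgraph(seed_nodes, num_hops, edge_index):
    """Collect all nodes that reach a seed node within num_hops edges.

    edge_index: pair of lists (sources, targets).
    Returns (sorted node list, list of edges among those nodes).
    """
    src, dst = edge_index
    subset = set(seed_nodes)
    frontier = set(seed_nodes)
    for _ in range(num_hops):
        # Follow edges backwards: sources that point into the frontier.
        frontier = {s for s, d in zip(src, dst) if d in frontier} - subset
        subset |= frontier
    edges = [(s, d) for s, d in zip(src, dst) if s in subset and d in subset]
    return sorted(subset), edges


# Chain graph 0 -> 1 -> 2 -> 3, two hops around node 3.
print(k_hop_subgraph([3], 2, ([0, 1, 2], [1, 2, 3])))
# ([1, 2, 3], [(1, 2), (2, 3)])
```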

    Bugfixes

    • Fixed a bug in write_off
    • Fixed a bug in the processing of the GEDDataset dataset
    • to_networkx conversion can now also properly handle non-tensor attributes
    • Fixed a bug in read_obj (thanks to @mwussow)
  • 1.4.3(Mar 17, 2020)

    Features

    Datasets

    Minor Changes

    Bugfixes

    • Fixed SplineConv compatibility with latest torch-spline-conv package
    • trimesh conversion utilities no longer result in a permutation of the input data
  • 1.4.2(Feb 18, 2020)

    Minor Changes

    • There are now Python wheels available for torch-scatter and torch-sparse which should make the installation procedure much more user-friendly. Simply run
    pip install torch-scatter==latest+${CUDA} torch-sparse==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-1.4.0.html
    pip install torch-geometric
    

    where ${CUDA} should be replaced by either cpu, cu92, cu100 or cu101 depending on your PyTorch installation.

    • torch-cluster is now an optional dependency. All methods that rely on torch-cluster will result in an error requesting you to install torch-cluster.
    • torch_geometric.data.Dataset can now also be indexed and shuffled:
    dataset.shuffle()[:50]
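How an expression such as `dataset.shuffle()[:50]` can work is sketched below with a hypothetical `ListDataset` class (not `torch_geometric.data.Dataset`): `shuffle()` returns a new dataset in random order, and slicing returns another dataset, so the two calls can be chained. The `seed` argument is an addition of this sketch for reproducibility.

```python
import random


class ListDataset:
    """Hypothetical minimal dataset that supports indexing and shuffling."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        if isinstance(idx, slice):
            # Slicing yields a new dataset, so calls can be chained.
            return ListDataset(self.items[idx])
        return self.items[idx]

    def shuffle(self, seed=None):
        # Return a shuffled copy; the original order is left untouched.
        items = self.items[:]
        random.Random(seed).shuffle(items)
        return ListDataset(items)


dataset = ListDataset(range(100))
subset = dataset.shuffle(seed=0)[:50]
print(len(subset))  # 50
```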
    

    Bugfixes

    • Fixed a bug that prevented the user from saving MessagePassing modules.
    • Fixed a bug in RGCNConv when using root_weight=False.
  • 1.4.1(Feb 4, 2020)

    This release mainly focuses on torch-scatter=2.0 support. As a result, PyTorch Geometric now requires PyTorch 1.4. If you are in the process of updating to PyTorch 1.4, please ensure that you also re-install all related external packages.

    Features

    • Graph Diffusion Convolution
    • MinCUT Pooling
    • CGCNNConv
    • TUDataset cleaned versions, containing only non-isomorphic graphs
    • GridSampling transform
    • ShapeNet dataset now comes with normals and better split options
    • TriMesh conversion utilities
    • ToSLIC transform for superpixel generation from images
    • Re-writing of MessagePassing interface with custom aggregate methods (no API changes)

    Bugfixes

    • Fixed some failure modes of from_networkx.
  • 1.3.2(Oct 4, 2019)

    This release focuses on PyTorch 1.2 support and removes all torch.bool deprecation warnings. As a result, this release now requires PyTorch 1.2. If you are in the process of updating to PyTorch 1.2, please ensure that you also re-install all related external packages.

    Overall, this release brings the following new features/bugfixes:

    Features

    • Prints out a warning in case the pre_transform and pre_filter arguments differ from an already processed version

    Bugfixes

    • Removed all torch.bool deprecation warnings
    • Fixed ARGA initialization bug
    • Fixed a pre-processing bug in QM9
  • 1.3.1(Aug 29, 2019)

    This is a minor release which is mostly distributed for official PyTorch 1.2 support. In addition, it provides minor bugfixes and the following new features:

    Modules

    • Non-normalized ChebConv in combination with a largest eigenvalue transform
    • TAGCN
    • Graph U-Net
    • Node2Vec
    • EdgePooling
    • Alternative GMMConv formulation with separate kernels
    • Alternative Top-K pooling formulation based on thresholds with examples on synthetic COLORS and TRIANGLES datasets
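The two node-selection rules mentioned for Top-K pooling, keeping a fixed ratio of nodes versus keeping every node above a score threshold, can be contrasted in a short sketch. Both helpers are hypothetical; in particular, rounding the node count up with `ceil` is an assumption of this sketch.

```python
import math


def select_top_k(scores, ratio):
    """Keep the ceil(ratio * N) highest-scoring nodes (at least one)."""
    k = max(1, math.ceil(ratio * len(scores)))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k])


def select_by_threshold(scores, min_score):
    """Keep every node whose score exceeds the threshold."""
    return [i for i, s in enumerate(scores) if s > min_score]


scores = [0.1, 0.7, 0.4, 0.9]
print(select_top_k(scores, ratio=0.5))             # [1, 3]
print(select_by_threshold(scores, min_score=0.3))  # [1, 2, 3]
```

With a ratio, the number of kept nodes depends only on the graph size; with a threshold, it depends on the scores themselves.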

    Datasets

    • Pascal VOC 2011 with Berkeley keypoint annotations (PascalVOCKeypoints)
    • DBP15K dataset
    • WILLOWObjectClass dataset

    Please also update related external packages via, e.g.:

    $ pip install --upgrade torch-cluster
    
  • 1.3.0(Jun 29, 2019)

    • Support for giant graph handling using NeighborSampler and bipartite message passing operators
    • Debugging support using the new debug API
    • Fixed TUDataset download errors
    • Added FeaStConv module
    • Improved networkx conversion functionality
    • Improved Data and DataLoader handling with customizable number_of_nodes (e.g. for holding two graphs in a single Data object)
    • Added GeniePath example
    • Added SAGPool module
    • Added geodesic distance computation using gdist (optional)
    • Improved PointNet and DGCNN classification and segmentation examples
    • Added subgraph functionality
    • Fixed GMMConv
    • Added a bunch of new datasets
    • Added fast implementations for random graph generation
    • Improved loop API
    • Minor bugfixes
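The neighbor sampling idea behind handling giant graphs can be sketched as follows: starting from a few seed nodes, each hop keeps at most a fixed number of neighbors per node, which yields one small bipartite edge set per hop. The function is a hypothetical pure-Python illustration; the adjacency-dict format, the `fanouts` parameter and the `seed` argument are assumptions of this sketch, not the NeighborSampler API.

```python
import random


def sample_neighbors(adj, seeds, fanouts, seed=None):
    """Sample a multi-hop neighborhood around the seed nodes.

    adj:     dict mapping a node to the list of its neighbors.
    fanouts: maximum number of neighbors to keep per node, one per hop.
    Returns one list of (neighbor, node) edges per hop.
    """
    rng = random.Random(seed)
    layers = []
    frontier = list(seeds)
    for fanout in fanouts:
        edges = []
        for node in frontier:
            neighbors = adj.get(node, [])
            if len(neighbors) > fanout:
                neighbors = rng.sample(neighbors, fanout)
            edges.extend((nbr, node) for nbr in neighbors)
        layers.append(edges)
        # The next hop expands the current nodes plus the sampled neighbors.
        frontier = sorted(set(frontier) | {s for s, _ in edges})
    return layers


adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
layers = sample_neighbors(adj, seeds=[0], fanouts=[2, 2], seed=0)
print(len(layers[0]))  # 2
```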

    Thanks to all contributors!

  • 1.2.1(May 22, 2019)

    • More convenient self-loop API (including addition of edge weights)
    • Small bugfixes, e.g., DiffPool NaNs and empty edge indices treatment
    • New datasets have been added:
      • GEDDataset
      • DynamicFAUST
      • TOSCA
      • SHREC2016
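What adding self-loops together with edge weights involves can be sketched on plain lists. The function below is an illustration only: its name mirrors the concept, but the signature, the `fill_value` default and the choice to append a loop for every node (even if one already exists) are assumptions of this sketch.

```python
def add_self_loops(edge_index, edge_weight=None, fill_value=1.0, num_nodes=None):
    """Append one self-loop per node; new loops get weight fill_value.

    edge_index: pair of lists (sources, targets).
    Returns ((sources, targets), edge_weight).
    """
    src, dst = list(edge_index[0]), list(edge_index[1])
    if num_nodes is None:
        # Infer the node count from the largest index that occurs.
        num_nodes = max(src + dst) + 1 if src else 0
    loops = list(range(num_nodes))
    src += loops
    dst += loops
    if edge_weight is not None:
        edge_weight = list(edge_weight) + [fill_value] * num_nodes
    return (src, dst), edge_weight


print(add_self_loops(([0, 1], [1, 0]), edge_weight=[0.5, 0.5]))
# (([0, 1, 0, 1], [1, 0, 0, 1]), [0.5, 0.5, 1.0, 1.0])
```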
  • 1.2.0(Apr 29, 2019)

  • 1.1.2(Apr 5, 2019)

  • 1.1.1(Apr 2, 2019)

  • 1.1.0(Apr 1, 2019)

    This release includes:

    • All Variants of Graph Autoencoders
    • Gated Graph Conv
    • DataParallel bugfixes
    • New transforms (Line Graph Transformation, Local Degree Profile, Sample Points with Normals)
    • PointNet++ example
  • 1.0.3(Mar 7, 2019)

  • 1.0.2(Jan 25, 2019)

  • 1.0.1(Jan 15, 2019)

    • Finally completed documentation
    • Finally achieved 100% code coverage (every single line is tested)
    • Fixed a few minor bugs
    • Added the GlobalAttention layer from Li et al.
  • 1.0.0(Dec 18, 2018)

    We made a bunch of improvements to PyTorch Geometric and added various new convolution and pooling operators, e.g., top_k pooling, PointCNN, Iterative Farthest Point Sampling, PointNet++, ...

Owner
PyG
Graph Neural Network Library for PyTorch
A PyTorch implementation of "Signed Graph Convolutional Network" (ICDM 2018).

SGCN ⠀ A PyTorch implementation of Signed Graph Convolutional Network (ICDM 2018). Abstract Due to the fact much of today's data can be represented as

Benedek Rozemberczki 251 Nov 30, 2022
LegoDNN: a block-grained scaling tool for mobile vision systems

Table of contents 1 Introduction 1.1 Major features 1.2 Architecture 2 Code and Installation 2.1 Code 2.2 Installation 3 Repository of DNNs in vision

41 Dec 24, 2022
A PyTorch Lightning solution to training OpenAI's CLIP from scratch.

train-CLIP 📎 A PyTorch Lightning solution to training CLIP from scratch. Goal ⚽ Our aim is to create an easy to use Lightning implementation of OpenA

Cade Gordon 396 Dec 30, 2022
NAACL2021 - COIL Contextualized Lexical Retriever

COIL Repo for our NAACL paper, COIL: Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List. The code covers learning

Luyu Gao 108 Dec 31, 2022
LBK 26 Dec 28, 2022
NEO: Non Equilibrium Sampling on the orbit of a deterministic transform

NEO: Non Equilibrium Sampling on the orbit of a deterministic transform Description of the code This repo describes the NEO estimator described in the

0 Dec 01, 2021
PURE: End-to-End Relation Extraction

PURE: End-to-End Relation Extraction This repository contains (PyTorch) code and pre-trained models for PURE (the Princeton University Relation Extrac

Princeton Natural Language Processing 657 Jan 09, 2023
Final Project for the CS238: Decision Making Under Uncertainty course at Stanford University in Autumn '21.

Final Project for the CS238: Decision Making Under Uncertainty course at Stanford University in Autumn '21. We optimized wind turbine placement in a wind farm, subject to wake effects, using Q-learni

Manasi Sharma 2 Sep 27, 2022
TDmatch is a Python library developed to perform matching tasks in three categories:

TDmatch TDmatch is a Python library developed to perform matching tasks in three categories: Text to Data which matches tuples of a table to text docu

Naser Ahmadi 5 Aug 11, 2022
A graph neural network (GNN) model to predict protein-protein interactions (PPI) with no sample features

A graph neural network (GNN) model to predict protein-protein interactions (PPI) with no sample features

2 Jul 25, 2022
Framework for estimating the structures and parameters of Bayesian networks (DAGs) at per-sample resolution

Sample-specific Bayesian Networks A framework for estimating the structures and parameters of Bayesian networks (DAGs) at per-sample or per-patient re

Caleb Ellington 1 Sep 23, 2022
Energy consumption estimation utilities for Jetson-based platforms

This repository contains a utility for measuring energy consumption when running various programs in NVIDIA Jetson-based platforms. Currently TX-2, NX, and AGX are supported.

OpenDR 10 Jun 17, 2022
TransVTSpotter: End-to-end Video Text Spotter with Transformer

TransVTSpotter: End-to-end Video Text Spotter with Transformer Introduction A Multilingual, Open World Video Text Dataset and End-to-end Video Text Sp

weijiawu 66 Dec 26, 2022
Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth [Paper]

Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth [Paper] Downloads [Downloads] Trained ckpt files for NYU Depth V2 and

98 Jan 01, 2023
Utility code for use with PyXLL

pyxll-utils There is no need to use this package as of PyXLL 5. All features from this package are now provided by PyXLL. If you were using this packa

PyXLL 10 Dec 18, 2021
BackgroundRemover lets you Remove Background from images and video with a simple command line interface

BackgroundRemover BackgroundRemover is a command line tool to remove background from video and image, made by nadermx to power https://BackgroundRemov

Johnathan Nader 1.7k Dec 30, 2022
Code for "The Box Size Confidence Bias Harms Your Object Detector"

The Box Size Confidence Bias Harms Your Object Detector - Code Disclaimer: This repository is for research purposes only. It is designed to maintain r

Johannes G. 24 Dec 07, 2022
Awesome Human Pose Estimation

Human Pose Estimation Related Publication

Zhe Wang 1.2k Dec 26, 2022
PyTorch implementation of the paper:A Convolutional Approach to Melody Line Identification in Symbolic Scores.

Symbolic Melody Identification This repository is an unofficial PyTorch implementation of the paper:A Convolutional Approach to Melody Line Identifica

Sophia Y. Chou 3 Feb 21, 2022
Real-ESRGAN aims at developing Practical Algorithms for General Image Restoration.

Real-ESRGAN Colab Demo for Real-ESRGAN . Portable Windows executable file. You can find more information here. Real-ESRGAN aims at developing Practica

Xintao 17.2k Jan 02, 2023