Graph-based community clustering approach to extract protein domains from a predicted aligned error matrix

Overview

pae_to_domains

Graph-based community clustering approach to extract protein domains from a predicted aligned error matrix

Overview

Using a predicted aligned error matrix corresponding to an AlphaFold2 model (e.g. as downloaded from https://alphafold.ebi.ac.uk/), returns a series of lists of residue indices, where each list corresponds to a set of residues clustering together into a pseudo-rigid domain.

Requirements

  • Python >=3.7
  • NetworkX >= 2.6.2

Known Issues

Due to an internal implementation issue in NetworkX (Issue #4992) some combinations of PAE matrix and resolution can lead to a KeyError. Solutions to this are being explored, and it will hopefully be fixed in the next NetworkX release.

Usage

While primarily intended as a code snippet to be incorporated into larger projects, this can also be called from the command line. At its simplest:

python pae_to_domains.py pae_file.json

... will yield a .csv file with each line providing the indices for one residue cluster. Full help for the command-line version:

positional arguments:
  pae_file              Name of the PAE JSON file.

optional arguments:
  -h, --help            show this help message and exit
  --output_file OUTPUT_FILE
                        Name of output file (comma-delimited text format.
                        Default: clusters.csv
  --pae_power PAE_POWER
                        Graph edges will be weighted as 1/pae**pae_power.
                        Default: 1.0
  --pae_cutoff PAE_CUTOFF
                        Graph edges will only be created for residue pairs
                        with pae

Example

Using https://alphafold.ebi.ac.uk/entry/Q9HBA0 as an example case...

resolution=0.5: Resolution 0.5, cartoon coloured by domain assignment

resolution=1.0: Resolution 1.0, cartoon coloured by domain assignment

resolution=2.0: Resolution 2.0, cartoon coloured by domain assignment

Owner
Tristan Croll
One-time chemical engineer, now full-time structural biology methods developer.
Tristan Croll
A machine learning benchmark of in-the-wild distribution shifts, with data loaders, evaluators, and default models.

WILDS is a benchmark of in-the-wild distribution shifts spanning diverse data modalities and applications, from tumor identification to wildlife monitoring to poverty mapping.

P-Lambda 437 Dec 30, 2022
Voice Gender Recognition

In this project it was used some different Machine Learning models to identify the gender of a voice (Female or Male) based on some specific speech and voice attributes.

Anne Livia 1 Jan 27, 2022
FastReID is a research platform that implements state-of-the-art re-identification algorithms.

FastReID is a research platform that implements state-of-the-art re-identification algorithms.

JDAI-CV 2.8k Jan 07, 2023
Modeling Category-Selective Cortical Regions with Topographic Variational Autoencoders

Modeling Category-Selective Cortical Regions with Topographic Variational Autoencoders

1 Oct 11, 2021
NHS AI Lab Skunkworks project: Long Stayer Risk Stratification

NHS AI Lab Skunkworks project: Long Stayer Risk Stratification A pilot project for the NHS AI Lab Skunkworks team, Long Stayer Risk Stratification use

NHSX 21 Nov 14, 2022
PyTorch implementation of "A Simple Baseline for Low-Budget Active Learning".

A Simple Baseline for Low-Budget Active Learning This repository is the implementation of A Simple Baseline for Low-Budget Active Learning. In this pa

10 Nov 14, 2022
Keras Image Embeddings using Contrastive Loss

Image to Embedding projection in vector space. Implementation in keras and tensorflow of batch all triplet loss for one-shot/few-shot learning.

Shravan Anand K 5 Mar 21, 2022
Black box hyperparameter optimization made easy.

BBopt BBopt aims to provide the easiest hyperparameter optimization you'll ever do. Think of BBopt like Keras (back when Theano was still a thing) for

Evan Hubinger 70 Nov 03, 2022
Official PyTorch implementation for FastDPM, a fast sampling algorithm for diffusion probabilistic models

Official PyTorch implementation for "On Fast Sampling of Diffusion Probabilistic Models". FastDPM generation on CIFAR-10, CelebA, and LSUN datasets. S

Zhifeng Kong 68 Dec 26, 2022
PyTorch EO aims to make Deep Learning for Earth Observation data easy and accessible to real-world cases and research alike.

Pytorch EO Deep Learning for Earth Observation applications and research. šŸš§ This project is in early development, so bugs and breaking changes are ex

earthpulse 28 Aug 25, 2022
The implementation our EMNLP 2021 paper "Enhanced Language Representation with Label Knowledge for Span Extraction".

LEAR The implementation our EMNLP 2021 paper "Enhanced Language Representation with Label Knowledge for Span Extraction". See below for an overview of

ęØꔀ 93 Jan 07, 2023
MAterial del programa MisiĆ³n TIC 2022

Mision TIC 2022 Esta iniciativa, aparece como respuesta frente a los retos de la Cuarta RevoluciĆ³n Industrial, y tiene como objetivo la formaciĆ³n de 1

6 May 25, 2022
Rotation Robust Descriptors

RoRD Rotation-Robust Descriptors and Orthographic Views for Local Feature Matching Project Page | Paper link Evaluation and Datasets MMA : Training on

Udit Singh Parihar 25 Nov 15, 2022
REGTR: End-to-end Point Cloud Correspondences with Transformers

REGTR: End-to-end Point Cloud Correspondences with Transformers This repository contains the source code for REGTR. REGTR utilizes multiple transforme

Zi Jian Yew 108 Dec 17, 2022
Implementation of U-Net and SegNet for building segmentation

Specialized project Created by Katrine Nguyen and Martin Wangen-Eriksen as a part of our specialized project at Norwegian University of Science and Te

Martin.w-e 3 Dec 07, 2022
Hierarchical Memory Matching Network for Video Object Segmentation (ICCV 2021)

Hierarchical Memory Matching Network for Video Object Segmentation Hongje Seong, Seoung Wug Oh, Joon-Young Lee, Seongwon Lee, Suhyeon Lee, Euntai Kim

Hongje Seong 72 Dec 14, 2022
SegNet model implemented using keras framework

keras-segnet Implementation of SegNet-like architecture using keras. Current version doesn't support index transferring proposed in SegNet article, so

185 Aug 30, 2022
BasicRL: easy and fundamental codes for deep reinforcement learning怂It is an improvement on rainbow-is-all-you-need and OpenAI Spinning Up.

BasicRL: easy and fundamental codes for deep reinforcement learning BasicRL is an improvement on rainbow-is-all-you-need and OpenAI Spinning Up. It is

RayYoh 12 Apr 28, 2022
Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research

Welcome to AirSim AirSim is a simulator for drones, cars and more, built on Unreal Engine (we now also have an experimental Unity release). It is open

Microsoft 13.8k Jan 03, 2023
An implementation of IMLE-Net: An Interpretable Multi-level Multi-channel Model for ECG Classification

IMLE-Net: An Interpretable Multi-level Multi-channel Model for ECG Classification The repostiory consists of the code, results and data set links for

12 Dec 26, 2022