Single-cell paper notes (part 13) -- SpaGCN: Integrating gene expression, spatial location and histology to

2022-06-22 06:05:00 GoatGui

Learning notes, for reference only. If you spot a mistake, please point it out.
Authors: Jian Hu, Xiangjie Li, Mingyao Li
Journal: Nature Methods
Year: 2021



SpaGCN: Integrating gene expression, spatial location and histology to identify spatial domains and spatially variable genes by graph convolutional network

Abstract

Recent advances in spatially resolved transcriptomics (SRT) technologies have enabled comprehensive characterization of gene expression patterns in the context of tissue microenvironment. To elucidate the spatial variation of gene expression, we propose SpaGCN, a graph convolutional network method that integrates gene expression, spatial location and histology in SRT data analysis. Through graph convolution, SpaGCN aggregates gene expression of each spot from its neighboring spots, which enables the identification of spatial domains with coherent expression and histology. The subsequent domain-guided differential expression (DE) analysis then detects genes with enriched expression patterns in the identified domains. Analyzing seven SRT datasets using SpaGCN, we show it can detect genes with much more enriched spatial expression patterns than competing methods. Furthermore, genes detected by SpaGCN are transferable and can be utilized to study spatial variation of gene expression in other datasets. SpaGCN is computationally fast and platform independent, making it a desirable tool for diverse SRT studies.

Overview of SpaGCN and evaluation.

We explain the workflow of SpaGCN using in situ capturing-based SRT data as an example, but the method can be easily modified to analyze other types of SRT data. As shown in Fig. 1a, SpaGCN first builds a graph to represent the relationship of all spots considering both spatial location and histology information. Next, SpaGCN utilizes a graph convolutional layer to aggregate gene expression information from neighboring spots. Then, SpaGCN uses the aggregated expression matrix to cluster spots using an unsupervised iterative clustering algorithm. Each cluster is considered as a spatial domain from which SpaGCN then detects SVGs that are enriched in a domain by DE analysis (Fig. 1b). When a single gene cannot mark the expression pattern of a domain, SpaGCN will construct a meta gene, formed by the combination of multiple genes, to represent the expression pattern of the domain.

To showcase the strength of SpaGCN, we applied it to seven publicly available datasets (Supplementary Table 1). The spatial domains identified by SpaGCN agree better with known tissue structures than those identified by Louvain, stLearn, and BayesSpace. We also compared SVGs detected by SpaGCN with those detected by SpatialDE and SPARK, and found that the SpaGCN-detected SVGs have more coherent expression patterns and better biological interpretability than those detected by the other two methods. The specificity of spatial expression patterns revealed by SpaGCN-detected SVGs was further confirmed by Moran’s I and Geary’s C statistics, two commonly used metrics for quantifying spatial autocorrelation of gene expression.
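
To make the two autocorrelation metrics concrete, here is a minimal sketch of their textbook definitions in plain Python. The function names and the binary chain-adjacency weight matrix in the usage note are assumptions chosen for illustration; the notes do not say which spatial weights the paper used.

```python
def morans_i(values, weights):
    """Moran's I for one gene: values[i] is the expression at spot i,
    weights[i][j] is a symmetric spatial weight with a zero diagonal."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


def gearys_c(values, weights):
    """Geary's C for one gene, using the same inputs as morans_i."""
    n = len(values)
    mean = sum(values) / n
    w_sum = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * (values[i] - values[j]) ** 2
              for i in range(n) for j in range(n))
    den = sum((v - mean) ** 2 for v in values)
    return ((n - 1) / (2 * w_sum)) * (num / den)


# Hypothetical toy data: four spots on a line, adjacent spots are neighbors.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [0.0, 0.0, 1.0, 1.0]    # similar values sit next to each other
alternating = [0.0, 1.0, 0.0, 1.0]  # neighbors always differ
```

On this toy chain, `morans_i(clustered, chain)` is positive and `morans_i(alternating, chain)` is negative, while Geary's C moves in the opposite direction (below 1 for the clustered pattern, above 1 for the alternating one). A spatially coherent SVG should therefore show a high Moran's I and a low Geary's C.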

Fig. 1 | Workflow of SpaGCN. a, SpaGCN first uses a graph convolutional network (GCN) to integrate gene expression, spatial location and histology information, then uses unsupervised iterative clustering to separate spots into different spatial domains. The GCN is based on an undirected weighted graph in which the edge weight between every two spots is determined by the Euclidean distance between them; each spot is defined by its spatial coordinates (x, y) and a third dimension z, which is obtained from the RGB values of the histology image. b, For each detected spatial domain, SpaGCN identifies SVGs or meta genes through domain-guided DE analysis.
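
The graph described in panel a can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the Gaussian kernel that turns distance into an edge weight, the `length_scale` parameter, and the row-normalized averaging in `aggregate` (which omits the trainable GCN weights) are all choices made here for clarity, and the z values are taken as given rather than computed from an image.

```python
import math


def edge_weights(coords, length_scale):
    """Dense weight matrix for spots given as (x, y, z) tuples.

    Assumption: weights decay with squared Euclidean distance through a
    Gaussian kernel; length_scale is a hypothetical tuning parameter.
    """
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            weights[i][j] = math.exp(-d2 / (2 * length_scale ** 2))
    return weights


def aggregate(expression, weights):
    """Replace each spot's expression with a weighted average over all spots.

    This is a simplified, parameter-free stand-in for one graph
    convolution pass; expression[i][g] is gene g at spot i.
    """
    n_genes = len(expression[0])
    aggregated = []
    for row in weights:
        total = sum(row)
        aggregated.append([
            sum(w * expression[j][g] for j, w in enumerate(row)) / total
            for g in range(n_genes)
        ])
    return aggregated


# Hypothetical toy input: two nearby spots and one distant spot, one gene.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
expression = [[1.0], [3.0], [100.0]]
weights = edge_weights(coords, length_scale=1.0)
smoothed = aggregate(expression, weights)
```

In this toy input the first two spots pull each other's expression together, while the distant third spot is left almost unchanged because its edge weights to the others are close to zero.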

Methods

The network parameters and cluster centroids are simultaneously optimized by minimizing L using stochastic gradient descent with momentum. This unsupervised iterative clustering algorithm has previously been utilized for scRNA-seq analysis and showed superior performance over Louvain’s method.
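
The definition of L is not reproduced in these notes. As a rough sketch, assuming the objective follows the deep embedded clustering (DEC) style used in earlier scRNA-seq work, the loss is a KL divergence between a sharpened target distribution P and soft cluster assignments Q. All function names below are hypothetical, and the code only evaluates the loss; it does not perform the gradient updates.

```python
import math


def soft_assign(embeddings, centroids):
    """Soft assignment q[i][j] of spot i to cluster j.

    Assumption: a Student's t kernel with one degree of freedom on the
    squared distance between the embedding and the centroid.
    """
    q = []
    for h in embeddings:
        kernel = [1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(h, mu)))
                  for mu in centroids]
        total = sum(kernel)
        q.append([k / total for k in kernel])
    return q


def target_distribution(q):
    """Sharpened targets p[i][j]: square q, divide by the soft cluster
    size, then renormalize each row."""
    n_clusters = len(q[0])
    cluster_size = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        sharpened = [row[j] ** 2 / cluster_size[j] for j in range(n_clusters)]
        total = sum(sharpened)
        p.append([s / total for s in sharpened])
    return p


def kl_loss(p, q):
    """KL(P || Q) summed over all spots and clusters."""
    return sum(pij * math.log(pij / qij)
               for p_row, q_row in zip(p, q)
               for pij, qij in zip(p_row, q_row)
               if pij > 0)


# Hypothetical toy embeddings forming two tight groups.
embeddings = [[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.2, 5.0]]
centroids = [[0.0, 0.0], [5.0, 5.0]]
q = soft_assign(embeddings, centroids)
p = target_distribution(q)
loss = kl_loss(p, q)
```

Minimizing this loss pushes each spot's soft assignment toward its most confident cluster, which is why the embeddings and the centroids can be updated together.
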
After clustering, SpaGCN also provides an optional refinement step for the clustering result. In this step, SpaGCN examines the domain assignment of each spot and its surrounding spots. For a given spot, if more than half of its surrounding spots are assigned to a different domain, this spot will be relabeled to the same domain as the major label of its surrounding spots. As this refinement step only relabels a few spots, it has little impact on the downstream SVG detection. We performed cluster refinement only for the human dorsolateral prefrontal cortex 10x Visium data and the STARmap data when comparing to their manual annotations with clear domain boundaries.
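
The refinement rule above amounts to a majority vote over each spot's neighborhood. A minimal sketch follows, assuming the surrounding spots of each spot are already available as a list of indices; how the neighborhood is defined is not stated in these notes, and `refine_labels` is a hypothetical name.

```python
from collections import Counter


def refine_labels(labels, neighbors):
    """Relabel a spot when more than half of its surrounding spots
    belong to a different domain.

    labels[i] is the domain of spot i; neighbors[i] lists the indices of
    the spots surrounding spot i (assumed to be precomputed).
    """
    refined = list(labels)
    for i, nbrs in enumerate(neighbors):
        if not nbrs:
            continue  # an isolated spot keeps its label
        different = sum(1 for j in nbrs if labels[j] != labels[i])
        if different > len(nbrs) / 2:
            # Adopt the most common label among the surrounding spots.
            refined[i] = Counter(labels[j] for j in nbrs).most_common(1)[0][0]
    return refined


# Hypothetical toy input: spot 2 is a lone "B" surrounded by "A" spots.
labels = ["A", "A", "B", "A", "A"]
neighbors = [[1, 2], [0, 2], [0, 1, 3, 4], [2, 4], [2, 3]]
refined = refine_labels(labels, neighbors)
```

The votes are read from the original `labels` rather than from the partially updated list, so the result does not depend on the order in which spots are visited. That is a design choice of this sketch.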

Copyright notice
This article was written by [GoatGui]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/173/202206220544497087.html