Single-cell sequencing workflow (single-cell RNA sequencing)
2022-07-31 15:48:00 【Full stack programmer webmaster】
Hello everyone, nice to see you again. I'm your friend 全栈君.
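Series contents
- Single-cell sequencing workflow (1): Introduction and data download
- Single-cell sequencing workflow (2): Data preparation
- Single-cell sequencing workflow (3): Quality control and data filtering with the Seurat package, violin plots and the gene dispersion scatter plot
- Single-cell sequencing workflow (4): Principal component analysis (PCA)
- Single-cell sequencing workflow (5): t-SNE cluster analysis and finding marker genes
- Single-cell sequencing workflow (6): Cell type annotation of single cells
- Single-cell sequencing workflow (7): Cell type trajectory analysis of single cells
- Single-cell sequencing workflow (8): Marker gene ID conversion and GO enrichment analysis
- Single-cell sequencing workflow (9): GO circle plots for single cells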
This installment's topic: KEGG enrichment analysis and circle plots for single cells
In the last lesson we drew GO circle plots. Enrichment analysis is not limited to GO, however: KEGG pathway enrichment analysis shows which pathways the genes act in, and therefore their role and importance in the organism.
Note: the body of the article follows; the examples below are for reference.
1. Preparation
The data used previously (id.txt, produced by the run in the previous lesson)
An IDE for R
2. Steps
Put the prepared data and the script in the same directory, then run the R script directly:
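library(clusterProfiler) # provides enrichKEGG
library(enrichplot) # plotting methods (barplot, dotplot) for enrichment results
setwd("directory containing id.txt") # set the working directory
rt=read.table("id.txt",sep="\t",header=T,check.names=F) # read id.txt
rt=rt[!is.na(rt[,"entrezID"]),] # drop genes whose Entrez ID is NA
gene=as.character(rt$entrezID) # pass the Entrez IDs as character strings
# KEGG enrichment analysis
kk <- enrichKEGG(gene = gene, organism = "hsa", pvalueCutoff =0.05, qvalueCutoff =0.05) # enrichment analysis
# save the enrichment results; the Perl script in the next step reads keggId.txt
write.table(as.data.frame(kk),file="keggId.txt",sep="\t",quote=F,row.names = F)
# bar chart
pdf(file="barplot.pdf",width = 10,height = 7)
barplot(kk, drop = TRUE, showCategory = 30)
dev.off()
# bubble plot
pdf(file="bubble.pdf",width = 10,height = 7)
dotplot(kk, showCategory = 30)
dev.off()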
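After running, there will be two PDF files and one text file with the enrichment results (the keggId.txt that the Perl script below reads). Opening the text file shows that the KEGG pathways list only gene IDs, not gene names, so the gene IDs need to be converted. The conversion is done in Perl (Perl usage was covered in an earlier lesson). The code is given below; you only need to create a text file, paste the code in, and save it with a .pl extension.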
use strict;
use warnings;

# Build a lookup table from id.txt: Entrez ID (third column) => gene symbol (first column).
my %hash=();
open(RF,"<","id.txt") or die $!;
while(my $line=<RF>){
	chomp($line);
	my @arr=split(/\t/,$line);
	$hash{$arr[2]}=$arr[0];
}
close(RF);

# Rewrite the geneID column (second to last) of keggId.txt with gene symbols.
open(KEGG,"<","keggId.txt") or die $!;
open(WF,">","kegg.txt") or die $!;
while(my $line=<KEGG>){
	if($.==1){
		# copy the header line unchanged
		print WF $line;
		next;
	}
	chomp($line);
	my @arr=split(/\t/,$line);
	my @idArr=split(/\//,$arr[$#arr-1]);
	my @symbols=();
	foreach my $id(@idArr){
		# keep only the IDs that have a known gene symbol
		if(exists $hash{$id}){
			push(@symbols,$hash{$id});
		}
	}
	$arr[$#arr-1]=join("/",@symbols);
	print WF join("\t",@arr) . "\n";
}
close(WF);
close(KEGG);
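For readers who prefer to stay in R, the same ID-to-symbol lookup can be sketched with a named vector. This is only an illustration on small in-memory data, not a replacement for the Perl script: the data frame `id_table`, the vector `gene_id_column`, and the helper `convert_ids` are hypothetical names, and the values are toy stand-ins for the contents of id.txt and the geneID column of keggId.txt.

```r
# Hypothetical stand-in for id.txt: gene symbol, logFC, Entrez ID (toy values).
id_table <- data.frame(
  gene = c("TP53", "EGFR", "MYC"),
  avg_logFC = c(1.2, -0.8, 2.1),
  entrezID = c("7157", "1956", "4609"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the geneID column of the enrichment table.
gene_id_column <- c("7157/1956", "4609/9999/7157")

# Named vector used as a lookup table: names are Entrez IDs, values are symbols.
id_to_symbol <- setNames(id_table$gene, id_table$entrezID)

# Convert one "/"-separated string of Entrez IDs into gene symbols.
convert_ids <- function(id_string, lookup) {
  ids <- strsplit(id_string, "/", fixed = TRUE)[[1]]
  # Keep only IDs present in the lookup, like the `exists` check in the Perl script.
  known <- ids[ids %in% names(lookup)]
  paste(unname(lookup[known]), collapse = "/")
}

symbol_column <- vapply(gene_id_column, convert_ids, character(1),
                        lookup = id_to_symbol, USE.NAMES = FALSE)
print(symbol_column)
```

As in the Perl version, an ID with no entry in the lookup table (9999 in the toy data) is silently dropped from the output.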
After running the Perl script, there will be a kegg.txt file. Open it to check that all gene IDs have been converted to gene names. Next, use R to process the newly created kegg.txt together with id.txt. The plotting code is as follows:
#install.packages("digest")
#install.packages("GOplot")
library(GOplot)
setwd("directory containing kegg.txt and id.txt") # set the working directory
ego=read.table("kegg.txt", header = T,sep="\t",check.names=F) # read the KEGG enrichment results
go=data.frame(Category = "All",ID = ego$ID,Term = ego$Description, Genes = gsub("/", ", ", ego$geneID), adj_pval = ego$p.adjust)
# read the file with the genes' logFC values
id.fc <- read.table("id.txt", header = T,sep="\t",check.names=F)
genelist <- data.frame(ID = id.fc$gene, logFC = id.fc$avg_logFC)
row.names(genelist)=genelist[,1]
circ <- circle_dat(go, genelist)
termNum = 3 # number of terms to show
geneNum = nrow(genelist) # number of genes to show
chord <- chord_dat(circ, genelist[1:geneNum,], go$Term[1:termNum])
pdf(file="circ.pdf",width = 11,height = 10)
GOChord(chord,
        space = 0.001, # spacing between genes
        gene.order = 'logFC', # order genes by logFC
        gene.space = 0.25, # distance between gene names and the circle
        gene.size = 5, # gene name font size
        border.size = 0.1, # line thickness
        process.label = 8) # term font size
dev.off()
termCol <- c("#223D6C","#D20A13","#FFD121","#088247","#58CDD9","#7A142C","#5D90BA","#431A3D","#91612D","#6E568C","#E0367A","#D8D155","#64495D","#7CC767")
pdf(file="cluster.pdf",width = 11,height = 10)
GOCluster(circ, # the object returned by circle_dat above
          go$Term[1:termNum],
          lfc.space = 0.2, # gap between the dendrogram and the logFC ring
          lfc.width = 1, # width of the logFC ring
          term.col = termCol[1:termNum], # custom term colors
          term.space = 0.2, # gap between the logFC ring and the term ring
          term.width = 1) # width of the term ring
dev.off()
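3. Results
In the bubble plot (bubble.pdf), the horizontal axis is the gene ratio. The legend on the right shows what the dot size represents: the larger the dot, the more genes are enriched in that pathway. The redder the color, the more significant the enrichment.
In the bar chart (barplot.pdf), the horizontal axis is the number of genes enriched in each KEGG pathway and the labels on the left are the pathway names. The legend shows what the colors represent: the redder the bar, the more significant the enrichment.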
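To make "gene ratio" and "significance" concrete: over-representation analysis of the kind enrichKEGG performs scores each pathway with a hypergeometric test and then adjusts the p-values for multiple testing (Benjamini-Hochberg by default). The sketch below uses base R only; every count and the three extra p-values are made up for illustration and do not come from this dataset.

```r
# Made-up counts for one pathway (all values are hypothetical).
k <- 12     # input genes annotated to the pathway
n <- 200    # input genes with any pathway annotation
M <- 150    # background genes annotated to the pathway
N <- 8000   # background genes with any pathway annotation

gene_ratio <- k / n   # the quantity on the x-axis of the bubble plot
bg_ratio <- M / N     # the same ratio in the background

# P(X >= k) when drawing n genes from N, of which M belong to the pathway.
p_value <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

# Adjust across several pathways (the other three p-values are invented).
raw_p <- c(p_value, 0.03, 0.20, 0.65)
p_adjusted <- p.adjust(raw_p, method = "BH")

cat("GeneRatio:", gene_ratio, " BgRatio:", bg_ratio, "\n")
cat("raw p-value:", signif(p_value, 3), "\n")
cat("adjusted p-values:", signif(p_adjusted, 3), "\n")
```

The color scale in both plots follows the adjusted p-value, so a pathway whose gene ratio is far above its background ratio shows up as redder.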
The chord plot (circ.pdf) shows the relationship between the genes and each KEGG pathway. The colored ribbons attached to a gene indicate the KEGG pathways in which that gene is enriched, and the legend below the figure gives the color of each pathway. The logFC value is the log fold change in the gene's expression: the darker the color, the larger the change.
In the cluster plot (cluster.pdf), the inner ring shows the genes and the outer rings show the KEGG pathways. A colored segment in a pathway's ring at a gene's position means that the pathway contains that gene. For example, a gene sitting under three colored rings belongs to three pathways. The logFC value again represents the expression change: the darker the color, the more pronounced the change.
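Both circle plots rest on a gene-by-pathway membership table: one row per gene, one column per pathway, and a 1 wherever the gene belongs to the pathway. The following base-R sketch builds such a table from toy data to show the idea. The pathway names, gene names, and the `terms` data frame are hypothetical, and this is not GOplot's own code.

```r
# Toy enrichment result: pathway name plus "/"-separated gene symbols (hypothetical).
terms <- data.frame(
  Term = c("Pathway A", "Pathway B", "Pathway C"),
  Genes = c("G1/G2/G3", "G2/G3", "G3/G4"),
  stringsAsFactors = FALSE
)

# Split each gene string into a character vector, one list element per pathway.
gene_sets <- strsplit(terms$Genes, "/", fixed = TRUE)
all_genes <- sort(unique(unlist(gene_sets)))

# Rows are genes, columns are pathways; 1 means the gene belongs to the pathway.
membership <- sapply(gene_sets, function(set) as.integer(all_genes %in% set))
dimnames(membership) <- list(all_genes, terms$Term)

print(membership)

# Number of pathways each gene belongs to.
pathways_per_gene <- rowSums(membership)
print(pathways_per_gene)
```

In this toy table G3 has a 1 in all three columns, which corresponds to a gene drawn under three colored rings in the cluster plot.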
4. Conclusion
Because this lesson's results depend heavily on the earlier data, the output of the previous lessons is required here, so make sure those earlier results are correct. This concludes the lessons on the single-cell sequencing workflow. In future posts I will write about the more popular topic of TCGA data mining.
Publisher: 全栈程序员栈长. Please credit the source when reposting: https://javaforall.cn/127991.html Original link: https://javaforall.cn