Bedtools tutorial
2022-07-02 11:36:00 【qq_ twenty-seven million three hundred and ninety thousand and 】
bedtools: a powerful toolset for genome arithmetic
bedtools is a powerful toolset for a wide range of genomic analysis tasks. Its most widely used tools enable genome arithmetic, that is, set theory on the genome. For example, bedtools lets you intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely used genomic file formats such as BAM, BED, GFF/GTF, and VCF. Although each individual tool is designed to do a relatively simple task (for example, intersecting two interval files), quite sophisticated analyses can be carried out by combining multiple bedtools operations on the UNIX command line.
Where to download genome annotation files
BED files can be downloaded from the UCSC Table Browser: https://genome.ucsc.edu/cgi-bin/hgTables
bedtools --version # Version number
bedtools --contact # Contact information (use bedtools --help for help)
# Download the test files
curl -O https://s3.amazonaws.com/bedtools-tutorials/web/cpg.bed
curl -O https://s3.amazonaws.com/bedtools-tutorials/web/exons.bed
curl -O https://s3.amazonaws.com/bedtools-tutorials/web/gwas.bed
curl -O https://s3.amazonaws.com/bedtools-tutorials/web/genome.txt
###1 bedtools intersect
## Compute overlapping intervals
#Tool: bedtools intersect (aka intersectBed)
#Version: v2.30.0
#Summary: Report overlaps between two feature files.
#Usage: bedtools intersect [OPTIONS] -a <bed/gff/vcf/bam> -b <bed/gff/vcf/bam>
# Note: -b accepts multiple files
# Show the parts of the cpg.bed intervals that overlap intervals in exons.bed
bedtools intersect -a cpg.bed -b exons.bed
# Show the parts of the exons.bed intervals that overlap intervals in cpg.bed
bedtools intersect -a exons.bed -b cpg.bed
# -wa -wb: for each overlap, also report the original records from files A and B
bedtools intersect -a exons.bed -b cpg.bed -wa -wb
# -wo: report the original A and B records plus the number of overlapping base pairs
bedtools intersect -a cpg.bed -b exons.bed -wo
# -c: for each record in cpg.bed, count the overlapping records in exons.bed
bedtools intersect -a cpg.bed -b exons.bed -c
# -v: report the cpg.bed records that do not overlap any interval in exons.bed
bedtools intersect -a cpg.bed -b exons.bed -v
# -f 0.50: set a threshold so that only overlaps covering at least 50% of the cpg.bed interval are reported
bedtools intersect -a cpg.bed -b exons.bed -wo -f 0.50
# Intersect one file against multiple files
bedtools intersect -a cpg.bed -b gwas.bed exons.bed
bedtools intersect -a cpg.bed -b gwas.bed exons.bed -wa -wb -names gwas exon # Label each B file
# For pre-sorted input, add the -sorted option to run faster
time bedtools intersect -a exons.bed -b cpg.bed gwas.bed -sorted > /dev/null
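The overlap size reported by -wo and the fraction tested by -f can be worked out by hand for a single pair of intervals. The following is a minimal sketch, assuming 0-based, half-open BED coordinates and two intervals on the same chromosome; `overlap_bp` and `overlap_frac_a` are hypothetical helper names and are not part of bedtools.

```shell
# Hypothetical helpers (not part of bedtools).
# Overlap in bp between two 0-based, half-open intervals on one chromosome.
# Arguments: startA endA startB endB
overlap_bp() {
  awk -v s1="$1" -v e1="$2" -v s2="$3" -v e2="$4" 'BEGIN {
    s = (s1 > s2) ? s1 : s2      # larger of the two starts
    e = (e1 < e2) ? e1 : e2      # smaller of the two ends
    print (e > s) ? e - s : 0    # 0 when the intervals do not overlap
  }'
}

# Fraction of interval A covered by B, the quantity compared with the -f threshold.
overlap_frac_a() {
  awk -v o="$(overlap_bp "$1" "$2" "$3" "$4")" -v s="$1" -v e="$2" \
    'BEGIN { printf "%.2f\n", o / (e - s) }'
}

overlap_bp 100 200 150 300       # the shared region is [150, 200)
overlap_frac_a 100 200 150 300   # half of A is covered by B
```

Here A is 100 bp long and shares 50 bp with B, so the two calls print 50 and 0.50.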
###2 bedtools merge
#Tool: bedtools merge (aka mergeBed)
#Version: v2.30.0
#Summary: Merges overlapping BED/GFF/VCF entries into a single interval.
#Usage: bedtools merge [OPTIONS] -i <bed/gff/vcf>
# Note: bedtools merge requires the input file to be sorted first
# Sort the input by chromosome, then by start position (test.bed stands for your own file)
sort -k1,1 -k2,2n test.bed > test.sorted.bed
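Before merging, it can be useful to confirm that a file really is in this order. The following is a minimal sketch using `sort -c` with the same keys; the file names `demo.bed` and `demo.sorted.bed` and the three records are made up for illustration, and `LC_ALL=C` is an added assumption to make the chromosome ordering reproducible across locales.

```shell
# Assumption: a throwaway directory and demo files, created only for this sketch.
tmp=$(mktemp -d)
printf 'chr2\t50\t80\nchr1\t300\t400\nchr1\t100\t200\n' > "$tmp/demo.bed"

# Same sort keys as above: chromosome first, then numeric start position.
LC_ALL=C sort -k1,1 -k2,2n "$tmp/demo.bed" > "$tmp/demo.sorted.bed"

# sort -c only checks the order; it exits non-zero if the file is out of order.
if LC_ALL=C sort -c -k1,1 -k2,2n "$tmp/demo.sorted.bed" 2>/dev/null; then
  echo "sorted"
else
  echo "not sorted"
fi
```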
# Show the resulting merged intervals
bedtools merge -i exons.bed | head -n 20
# Count how many input intervals went into each merged interval, by applying the "count" operation to column 1
bedtools merge -i exons.bed -c 1 -o count | head -n 20
# List the column 2 values (start positions) of all intervals that were combined into each merged interval
bedtools merge -i exons.bed -c 2 -o collapse | head -n 20
# Merge intervals that are no more than 1000 bp apart
bedtools merge -i exons.bed -d 1000 -c 1 -o count | head -20
# Merge intervals that are no more than 90 bp apart, applying a different operation to column 1 and to column 4
bedtools merge -i exons.bed -d 90 -c 1,4 -o count,collapse | head -20
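The idea behind merging with a distance threshold and a count can be sketched as a single pass over sorted input. This is only an illustration, not the bedtools implementation; `merge_sketch` is a hypothetical helper, its argument plays the role of -d, and its fourth output column mimics `-c 1 -o count`.

```shell
# Hypothetical helper (not bedtools itself): merge sorted BED intervals from stdin.
# $1 = maximum gap in bp between intervals that should still be merged (default 0).
merge_sketch() {
  awk -v d="${1:-0}" 'BEGIN { OFS = "\t" }
    # Start a new merged interval on a new chromosome or when the gap exceeds d.
    NR == 1 || $1 != chrom || $2 > end + d {
      if (NR > 1) print chrom, start, end, n
      chrom = $1; start = $2; end = $3; n = 1
      next
    }
    # Otherwise extend the current merged interval and count the record.
    { if ($3 > end) end = $3; n++ }
    END { if (NR > 0) print chrom, start, end, n }'
}

# Three made-up intervals: the first two overlap, the third starts 100 bp later.
printf 'chr1\t100\t200\nchr1\t150\t300\nchr1\t400\t500\n' | merge_sketch
printf 'chr1\t100\t200\nchr1\t150\t300\nchr1\t400\t500\n' | merge_sketch 100
```

With a gap of 0 the sketch prints two merged intervals (counts 2 and 1); with a gap of 100 all three records collapse into one interval with a count of 3.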
###3 bedtools complement
#Tool: bedtools complement (aka complementBed)
#Version: v2.30.0
#Summary: Returns the base pair complement of a feature file.
#Usage: bedtools complement [OPTIONS] -i <bed/gff/vcf> -g <genome>
# Note: the genome file should be tab-delimited and structured as follows:
# <chromName><TAB><chromSize>
# Report the intervals in genome.txt that are not covered by any interval in exons.bed
bedtools complement -i exons.bed -g genome.txt
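To make the role of the genome file concrete, the complement can be sketched by walking along each chromosome and printing the gaps between covered regions. This is a simplified illustration under stated assumptions: `complement_sketch` is a hypothetical helper, the BED input must be sorted, the genome file must be non-empty, and chromosomes that have no intervals at all are skipped by the sketch.

```shell
# Hypothetical helper (not bedtools itself).
# $1 = genome file (<chromName><TAB><chromSize>); sorted BED intervals on stdin.
complement_sketch() {
  awk 'BEGIN { OFS = "\t" }
    NR == FNR { size[$1] = $2; next }          # first file: chromosome sizes
    $1 != chrom {                              # new chromosome in the BED input
      if (chrom != "" && pos < size[chrom]) print chrom, pos, size[chrom]
      chrom = $1; pos = 0
    }
    { if ($2 > pos) print chrom, pos, $2       # uncovered gap before this interval
      if ($3 > pos) pos = $3 }                 # furthest covered position so far
    END { if (chrom != "" && pos < size[chrom]) print chrom, pos, size[chrom] }
  ' "$1" -
}

# Made-up data: one 1000 bp chromosome with three intervals, two of them overlapping.
genome_demo=$(mktemp)
printf 'chr1\t1000\n' > "$genome_demo"
printf 'chr1\t100\t200\nchr1\t150\t300\nchr1\t600\t700\n' | complement_sketch "$genome_demo"
```

For this input the sketch prints three uncovered intervals: 0 to 100, 300 to 600, and 700 to 1000.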
###4 bedtools genomecov
#Tool: bedtools genomecov (aka genomeCoverageBed)
#Version: v2.30.0
#Summary: Compute the coverage of a feature file among a genome.
#Usage: bedtools genomecov [OPTIONS] -i <bed/gff/vcf> -g <genome>
# Note: the input must be sorted (grouped by chromosome)
bedtools genomecov -i exons.bed -g genome.txt
# -bg: output in BEDGRAPH format, reporting the depth of each covered interval
bedtools genomecov -i exons.bed -g genome.txt -bg | head -20
###5 bedtools jaccard
#Tool: bedtools jaccard (aka jaccard)
#Version: v2.30.0
#Summary: Calculate Jaccard statistic b/w two feature files.
# Jaccard is the length of the intersection over the union.
# Values range from 0 (no intersection) to 1 (self intersection).
#Usage: bedtools jaccard [OPTIONS] -a <bed/gff/vcf> -b <bed/gff/vcf>
# Calculate similarity
bedtools jaccard -a cpg.bed -b exons.bed
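As a worked example of "intersection over union", the statistic can be computed from three base-pair totals. This is a minimal sketch with made-up numbers; `jaccard_sketch` is a hypothetical helper, and it assumes the intervals within each file have already been merged so that no base is counted twice.

```shell
# Hypothetical helper (not bedtools itself).
# $1 = bp covered by A, $2 = bp covered by B, $3 = bp shared by A and B.
jaccard_sketch() {
  # The union is |A| + |B| minus the shared bases, which would otherwise count twice.
  awk -v a="$1" -v b="$2" -v i="$3" 'BEGIN { printf "%.4f\n", i / (a + b - i) }'
}

jaccard_sketch 1000 500 250   # 250 / (1000 + 500 - 250)
jaccard_sketch 100 100 100    # identical sets
```

The first call prints 0.2000 and the second prints 1.0000, matching the stated range from 0 (no intersection) to 1 (self intersection).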
###6 bedtools coverage
#Tool: bedtools coverage (aka coverageBed)
#Version: v2.30.0
#Summary: Returns the depth and breadth of coverage of features from B
# on the intervals in A.
#Usage: bedtools coverage [OPTIONS] -a <bed/gff/vcf> -b <bed/gff/vcf>
bedtools coverage -a cpg.bed -b exons.bed
Reference:
http://quinlanlab.org/tutorials/bedtools/bedtools.html