Bedtools tutorial
2022-07-02 11:36:00 【qq_ twenty-seven million three hundred and ninety thousand and 】
bedtools: a powerful toolset for genome arithmetic
bedtools is a versatile toolset for a wide range of genomic analysis tasks. Its most widely used tools enable genome arithmetic, that is, set theory on the genome. For example, bedtools lets you intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely used genomic file formats such as BAM, BED, GFF/GTF, and VCF. Although each individual tool is designed to do a relatively simple task (for example, intersecting two interval files), quite sophisticated analyses can be carried out by combining multiple bedtools operations on the UNIX command line.
Genome annotation files can be downloaded in BED format from the UCSC Table Browser:
https://genome.ucsc.edu/cgi-bin/hgTables
bedtools --version # Print the version number
bedtools --help # Print help information
bedtools --contact # Print contact information (bug reports, mailing list)
# Download the test files
curl -O https://s3.amazonaws.com/bedtools-tutorials/web/cpg.bed
curl -O https://s3.amazonaws.com/bedtools-tutorials/web/exons.bed
curl -O https://s3.amazonaws.com/bedtools-tutorials/web/gwas.bed
curl -O https://s3.amazonaws.com/bedtools-tutorials/web/genome.txt
###1 bedtools intersect
## Compute overlapping intervals
#Tool: bedtools intersect (aka intersectBed)
#Version: v2.30.0
#Summary: Report overlaps between two feature files.
#Usage: bedtools intersect [OPTIONS] -a <bed/gff/vcf/bam> -b <bed/gff/vcf/bam>
# Note: -b accepts multiple files
# Report the portions of cpg.bed intervals that overlap intervals in exons.bed
bedtools intersect -a cpg.bed -b exons.bed
# Report the portions of exons.bed intervals that overlap intervals in cpg.bed
bedtools intersect -a exons.bed -b cpg.bed
# For each overlap, also report the original records from both A and B
bedtools intersect -a exons.bed -b cpg.bed -wa -wb
# Report the original A and B records plus the number of overlapping bases
bedtools intersect -a cpg.bed -b exons.bed -wo
# For each record in cpg.bed, count the overlapping records in exons.bed
bedtools intersect -a cpg.bed -b exons.bed -c
# Report the cpg.bed records that do not overlap any interval in exons.bed
bedtools intersect -a cpg.bed -b exons.bed -v
# Set a threshold: report only overlaps that cover at least 50% of the cpg.bed (A) interval
bedtools intersect -a cpg.bed -b exons.bed -wo -f 0.50
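To make the -f threshold concrete, here is a minimal sketch that works out the overlap fraction by hand for a single pair of intervals. The coordinates are made up for illustration and do not come from cpg.bed or exons.bed; the arithmetic mirrors the idea behind -f (overlap length divided by the length of the A interval) and is not bedtools' own code.

```shell
# Hypothetical intervals (0-based, half-open, as in BED).
a_start=100; a_end=300   # interval from file A, length 200
b_start=250; b_end=400   # interval from file B

# The overlap runs from the larger start to the smaller end.
ov_start=$(( a_start > b_start ? a_start : b_start ))
ov_end=$(( a_end < b_end ? a_end : b_end ))
overlap=$(( ov_end > ov_start ? ov_end - ov_start : 0 ))

# Fraction of A covered by the overlap; -f 0.50 would require this to be >= 0.50.
fraction=$(awk -v o="$overlap" -v len="$(( a_end - a_start ))" 'BEGIN { printf "%.2f", o / len }')
echo "overlap=${overlap} fraction_of_A=${fraction}"
```

With these numbers the overlap is 50 bases out of 200, so this pair would be filtered out by -f 0.50.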
# Intersect against multiple files
bedtools intersect -a cpg.bed -b gwas.bed exons.bed
bedtools intersect -a cpg.bed -b gwas.bed exons.bed -wa -wb -names gwas exon # Label each B file in the output
# If the input is already sorted, add -sorted to run faster
time bedtools intersect -a exons.bed -b cpg.bed gwas.bed -sorted > /dev/null
###2 bedtools merge
#Tool: bedtools merge (aka mergeBed)
#Version: v2.30.0
#Summary: Merges overlapping BED/GFF/VCF entries into a single interval.
#Usage: bedtools merge [OPTIONS] -i <bed/gff/vcf>
# Note: bedtools merge requires the input file to be sorted first
# Sort the input by chromosome, then by start position.
sort -k1,1 -k2,2n test.bed >test.sorted.bed
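The sort step can be tried on a throwaway file before running it on real data. This is a small sketch using a made-up three-line BED file in a temporary directory (the file contents and the `workdir` and `sorted_ok` names are assumptions for illustration); it also uses `sort -c` to check that a file is already in the expected order.

```shell
# Build a small, deliberately unsorted BED file (made-up coordinates).
workdir=$(mktemp -d)
printf 'chr2\t500\t600\nchr1\t900\t950\nchr1\t100\t200\n' > "$workdir/test.bed"

# Sort by chromosome (text order), then by start position (numeric order).
sort -k1,1 -k2,2n "$workdir/test.bed" > "$workdir/test.sorted.bed"
cat "$workdir/test.sorted.bed"

# sort -c exits non-zero if the file is not already in the requested order.
if sort -c -k1,1 -k2,2n "$workdir/test.sorted.bed"; then
    sorted_ok=yes
else
    sorted_ok=no
fi
echo "sorted: $sorted_ok"
```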
# Show the merged intervals
bedtools merge -i exons.bed | head -n 20
# Count how many input intervals were merged into each new interval by applying the count operation to column 1.
bedtools merge -i exons.bed -c 1 -o count | head -n 20
# List the values of column 2 (start positions) for all intervals merged into each new interval
bedtools merge -i exons.bed -c 2 -o collapse | head -n 20
# Merge intervals that are no more than 1000 bp apart
bedtools merge -i exons.bed -d 1000 -c 1 -o count | head -n 20
# Merge intervals that are no more than 90 bp apart, applying a different operation to column 1 and column 4
bedtools merge -i exons.bed -d 90 -c 1,4 -o count,collapse | head -n 20
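The effect of merging with a distance threshold and a count column can be imitated on toy data with awk. This is only an illustrative sketch under stated assumptions: the intervals are made up, the input is assumed to be sorted already, and the awk program is a simplified stand-in for the behaviour of `-d 90 -c 1 -o count`, not the algorithm bedtools uses.

```shell
# Made-up, already sorted intervals; columns: chrom, start, end.
workdir=$(mktemp -d)
printf 'chr1\t100\t200\nchr1\t150\t300\nchr1\t350\t400\nchr2\t10\t20\n' > "$workdir/demo.sorted.bed"

# d plays the role of -d: intervals whose gap is at most d bases are merged.
merged=$(awk -v d=90 'BEGIN { FS = OFS = "\t" }
    # Same chromosome and close enough: extend the current block.
    NR > 1 && $1 == chrom && $2 + 0 <= end + d {
        if ($3 + 0 > end) end = $3 + 0
        n++
        next
    }
    # Otherwise flush the finished block, then start a new one.
    NR > 1 { print chrom, start, end, n }
    { chrom = $1; start = $2 + 0; end = $3 + 0; n = 1 }
    END { if (NR > 0) print chrom, start, end, n }
' "$workdir/demo.sorted.bed")
echo "$merged"
```

The three chr1 intervals collapse into a single block from 100 to 400 with a count of 3, while the lone chr2 interval is reported unchanged with a count of 1.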
###3 bedtools complement
#Tool: bedtools complement (aka complementBed)
#Version: v2.30.0
#Summary: Returns the base pair complement of a feature file.
#Usage: bedtools complement [OPTIONS] -i <bed/gff/vcf> -g <genome>
# Note: the genome file should be tab-delimited and structured as follows:
# <chromName><TAB><chromSize>
# Report the regions of the genome (as defined in genome.txt) that are not covered by any interval in exons.bed
bedtools complement -i exons.bed -g genome.txt
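To see how the genome file and the interval file combine, the following sketch builds a one-chromosome genome file and prints the gaps between intervals with awk. Everything here is assumed for illustration (file names, chromosome size, coordinates), the intervals are assumed sorted and non-overlapping, and the awk logic is a simplification: unlike a full complement, it does not report chromosomes that have no intervals at all.

```shell
workdir=$(mktemp -d)
# A made-up genome file: <chromName><TAB><chromSize>.
printf 'chr1\t1000\n' > "$workdir/demo.genome.txt"
# Made-up, sorted, non-overlapping intervals on that chromosome.
printf 'chr1\t100\t200\nchr1\t500\t700\n' > "$workdir/demo.bed"

# First file: remember chromosome sizes. Second file: print the uncovered gaps.
gaps=$(awk 'BEGIN { FS = OFS = "\t" }
    NR == FNR { size[$1] = $2 + 0; next }
    $1 != chrom {
        # New chromosome: close the tail of the previous one, then reset.
        if (chrom != "" && pos < size[chrom]) print chrom, pos, size[chrom]
        chrom = $1; pos = 0
    }
    $2 + 0 > pos { print $1, pos, $2 }
    { if ($3 + 0 > pos) pos = $3 + 0 }
    END { if (chrom != "" && pos < size[chrom]) print chrom, pos, size[chrom] }
' "$workdir/demo.genome.txt" "$workdir/demo.bed")
echo "$gaps"
```

For a 1000-base chromosome with intervals at 100-200 and 500-700, the uncovered regions are 0-100, 200-500, and 700-1000.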
###4 bedtools genomecov
#Tool: bedtools genomecov (aka genomeCoverageBed)
#Version: v2.30.0
#Summary: Compute the coverage of a feature file among a genome.
#Usage: bedtools genomecov [OPTIONS] -i <bed/gff/vcf> -g <genome>
# Note: the input file must be sorted (at minimum, grouped by chromosome)
bedtools genomecov -i exons.bed -g genome.txt
# Output in BEDGRAPH format, reporting the coverage depth of each interval
bedtools genomecov -i exons.bed -g genome.txt -bg | head -20
###5 bedtools jaccard
#Tool: bedtools jaccard (aka jaccard)
#Version: v2.30.0
#Summary: Calculate Jaccard statistic b/w two feature files.
# Jaccard is the length of the intersection over the union.
# Values range from 0 (no intersection) to 1 (self intersection).
#Usage: bedtools jaccard [OPTIONS] -a <bed/gff/vcf> -b <bed/gff/vcf>
# Compute the similarity between two sets of intervals
bedtools jaccard -a cpg.bed -b exons.bed
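As a worked example of "the length of the intersection over the union", this sketch computes the statistic by hand for two single-interval sets. The coordinates are invented for illustration, and the calculation only shows the arithmetic of the definition quoted in the tool summary; it makes no claim about how bedtools computes or labels its output columns.

```shell
# Two made-up single-interval sets on one chromosome (0-based, half-open).
a_start=0;  a_end=100    # set A covers 100 bases
b_start=50; b_end=150    # set B covers 100 bases

ov_start=$(( a_start > b_start ? a_start : b_start ))
ov_end=$(( a_end < b_end ? a_end : b_end ))
intersection=$(( ov_end > ov_start ? ov_end - ov_start : 0 ))

# Bases covered by A or B: add both lengths, then subtract the doubly counted overlap.
union=$(( (a_end - a_start) + (b_end - b_start) - intersection ))

jaccard=$(awk -v i="$intersection" -v u="$union" 'BEGIN { printf "%.4f", i / u }')
echo "intersection=${intersection} union=${union} jaccard=${jaccard}"
```

Here 50 shared bases out of 150 covered bases give a value of about 0.33; identical sets would give 1 and disjoint sets 0.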
###6 bedtools coverage
#Tool: bedtools coverage (aka coverageBed)
#Version: v2.30.0
#Summary: Returns the depth and breadth of coverage of features from B
# on the intervals in A.
#Usage: bedtools coverage [OPTIONS] -a <bed/gff/vcf> -b <bed/gff/vcf>
bedtools coverage -a cpg.bed -b exons.bed
Reference:
http://quinlanlab.org/tutorials/bedtools/bedtools.html