Bedtools tutorial
2022-07-02 11:36:00 【qq_ twenty-seven million three hundred and ninety thousand and 】
bedtools: a powerful toolset for genome arithmetic
bedtools is a powerful toolset for a wide range of genomic analysis tasks. Its most widely used tools perform genome arithmetic, that is, set theory on the genome. For example, bedtools lets you intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely used genomic file formats such as BAM, BED, GFF/GTF, and VCF. Although each individual tool is designed to do a relatively simple task (for example, intersecting two interval files), quite sophisticated analyses can be carried out by combining multiple bedtools operations on the UNIX command line.
Genome annotation files can be downloaded in BED format from the UCSC Table Browser:
https://genome.ucsc.edu/cgi-bin/hgTables
bedtools --version # Print the version number
bedtools --contact # Print contact information for feature requests and bug reports
# Download the test files
curl -O https://s3.amazonaws.com/bedtools-tutorials/web/cpg.bed
curl -O https://s3.amazonaws.com/bedtools-tutorials/web/exons.bed
curl -O https://s3.amazonaws.com/bedtools-tutorials/web/gwas.bed
curl -O https://s3.amazonaws.com/bedtools-tutorials/web/genome.txt
###1 bedtools intersect
## Compute overlapping intervals
#Tool: bedtools intersect (aka intersectBed)
#Version: v2.30.0
#Summary: Report overlaps between two feature files.
#Usage: bedtools intersect [OPTIONS] -a <bed/gff/vcf/bam> -b <bed/gff/vcf/bam>
# Note: -b accepts multiple files
# Report the portions of cpg.bed intervals that overlap intervals in exons.bed
bedtools intersect -a cpg.bed -b exons.bed
# Report the portions of exons.bed intervals that overlap intervals in cpg.bed
bedtools intersect -a exons.bed -b cpg.bed
# Report the original A and B records for each overlap
bedtools intersect -a exons.bed -b cpg.bed -wa -wb
# Also report the number of overlapping base pairs
bedtools intersect -a cpg.bed -b exons.bed -wo
# For each record in cpg.bed, count the overlapping records in exons.bed
bedtools intersect -a cpg.bed -b exons.bed -c
# Report the cpg.bed records that do not overlap any interval in exons.bed
bedtools intersect -a cpg.bed -b exons.bed -v
bedtools intersect -a cpg.bed -b exons.bed -wo
# Set a threshold: report only overlaps that cover at least 50% of the cpg.bed interval
bedtools intersect -a cpg.bed -b exons.bed -wo -f 0.50
# Intersect against multiple files
bedtools intersect -a cpg.bed -b gwas.bed exons.bed
bedtools intersect -a cpg.bed -b gwas.bed exons.bed -wa -wb -names gwas exon # Label each -b file
# For sorted input, add the -sorted option to run faster
time bedtools intersect -a exons.bed -b cpg.bed gwas.bed -sorted > /dev/null
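To make the -wo and -f columns above concrete, here is a minimal sketch in awk of the arithmetic for a single pair of intervals. It is only an illustration, not how bedtools is implemented. The coordinates are made up rather than taken from cpg.bed or exons.bed, and BED intervals are treated as 0-based and half-open.

```shell
# Hypothetical intervals on the same chromosome (illustration only)
a_start=100; a_end=200   # interval from file A
b_start=150; b_end=400   # interval from file B

# Overlap in bp = min(ends) - max(starts); anything <= 0 means no overlap
overlap=$(awk -v a1="$a_start" -v a2="$a_end" -v b1="$b_start" -v b2="$b_end" 'BEGIN {
    s = (a1 > b1) ? a1 : b1
    e = (a2 < b2) ? a2 : b2
    print (e > s) ? e - s : 0
}')

# Fraction of the A interval covered by the overlap; with -f 0.50 the pair
# would be kept only when this fraction is at least 0.50
fraction=$(awk -v o="$overlap" -v a1="$a_start" -v a2="$a_end" 'BEGIN {
    printf "%.2f", o / (a2 - a1)
}')

echo "overlap=${overlap} fraction_of_A=${fraction}"
```

The fraction is taken relative to the A interval only, so swapping -a and -b can change which pairs pass the threshold.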
###2 bedtools merge
#Tool: bedtools merge (aka mergeBed)
#Version: v2.30.0
#Summary: Merges overlapping BED/GFF/VCF entries into a single interval.
#Usage: bedtools merge [OPTIONS] -i <bed/gff/vcf>
# Note: bedtools merge requires sorted input
# Sort the input file by chromosome first, then by start position
sort -k1,1 -k2,2n test.bed > test.sorted.bed
# Show the merged intervals
bedtools merge -i exons.bed | head -n 20
# Count the intervals that went into each merged interval by applying "count" to column 1
bedtools merge -i exons.bed -c 1 -o count | head -n 20
# List the start positions (column 2) of all intervals that went into each merged interval
bedtools merge -i exons.bed -c 2 -o collapse | head -n 20
# Also merge intervals that are no more than 1000 bp apart
bedtools merge -i exons.bed -d 1000 -c 1 -o count | head -n 20
# Merge intervals no more than 90 bp apart, applying a different operation to columns 1 and 4
bedtools merge -i exons.bed -d 90 -c 1,4 -o count,collapse | head -n 20
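The sort-then-merge logic above can be sketched without bedtools on a toy file. This is an assumption-laden illustration of the idea, not the bedtools implementation. The file names (toy.bed and so on) and coordinates are hypothetical, and the awk variable d only mimics the role of the -d option.

```shell
# Toy BED file (hypothetical coordinates), deliberately unsorted
printf 'chr1\t300\t400\nchr1\t100\t200\nchr1\t150\t250\nchr2\t10\t20\n' > toy.bed

# Sort by chromosome, then numerically by start position
sort -k1,1 -k2,2n toy.bed > toy.sorted.bed

# Walk the sorted intervals; extend the current merged interval while the next
# one starts within d bp of its end, and count how many intervals were merged
awk -v d=0 'BEGIN { OFS = "\t" }
    $1 == chrom && $2 + 0 <= e + d { if ($3 + 0 > e) e = $3 + 0; n++; next }
    chrom != "" { print chrom, s, e, n }
    { chrom = $1; s = $2 + 0; e = $3 + 0; n = 1 }
    END { if (chrom != "") print chrom, s, e, n }' toy.sorted.bed > toy.merged.bed

cat toy.merged.bed
```

The single pass works only because the input is sorted: every interval that could join the current merged interval arrives immediately after it.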
###3 bedtools complement
#Tool: bedtools complement (aka complementBed)
#Version: v2.30.0
#Summary: Returns the base pair complement of a feature file.
#Usage: bedtools complement [OPTIONS] -i <bed/gff/vcf> -g <genome>
# Note: the genome file should be tab-delimited and structured as follows:
# <chromName><TAB><chromSize>
# Report the intervals of genome.txt that are not covered by exons.bed
bedtools complement -i exons.bed -g genome.txt
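As a rough sketch of what the complement means, the following awk script prints the gaps between intervals on toy data. The file names and coordinates are hypothetical, the input is assumed to be sorted, and unlike bedtools complement this simplified version ignores chromosomes that appear in the genome file but have no intervals.

```shell
# Hypothetical genome file: <chromName><TAB><chromSize>
printf 'chr1\t1000\nchr2\t500\n' > toy.genome.txt

# Hypothetical intervals, already sorted by chromosome and start
printf 'chr1\t100\t200\nchr1\t300\t400\nchr2\t0\t50\n' > toy.features.bed

# First file: remember chromosome sizes. Second file: print each gap between
# the running position and the next interval, plus the tail of each chromosome
awk 'BEGIN { OFS = "\t" }
    NR == FNR { size[$1] = $2 + 0; next }
    $1 != chrom {
        if (chrom != "" && pos < size[chrom]) print chrom, pos, size[chrom]
        chrom = $1; pos = 0
    }
    { if ($2 + 0 > pos) print $1, pos, $2; if ($3 + 0 > pos) pos = $3 + 0 }
    END { if (chrom != "" && pos < size[chrom]) print chrom, pos, size[chrom] }
' toy.genome.txt toy.features.bed > toy.complement.bed

cat toy.complement.bed
```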
###4 bedtools genomecov
#Tool: bedtools genomecov (aka genomeCoverageBed)
#Version: v2.30.0
#Summary: Compute the coverage of a feature file among a genome.
#Usage: bedtools genomecov [OPTIONS] -i <bed/gff/vcf> -g <genome>
# Note: requires sorted input
bedtools genomecov -i exons.bed -g genome.txt
# Output in BEDGRAPH format: report the depth of each covered interval
bedtools genomecov -i exons.bed -g genome.txt -bg | head -20
###5 bedtools jaccard
#Tool: bedtools jaccard (aka jaccard)
#Version: v2.30.0
#Summary: Calculate Jaccard statistic b/w two feature files.
# Jaccard is the length of the intersection over the union.
# Values range from 0 (no intersection) to 1 (self intersection).
#Usage: bedtools jaccard [OPTIONS] -a <bed/gff/vcf> -b <bed/gff/vcf>
# Compute the similarity of the two interval sets
bedtools jaccard -a cpg.bed -b exons.bed
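The statistic itself is simple arithmetic, shown below as a small worked example. The base-pair totals are made-up numbers, not results from cpg.bed and exons.bed, and the sketch assumes each total is the number of bases covered by a set after its overlapping intervals are merged.

```shell
# Hypothetical totals in base pairs (illustration only)
bp_a=1000        # bases covered by set A
bp_b=500         # bases covered by set B
bp_shared=250    # bases covered by both A and B

# Jaccard = intersection / union, where union = A + B - intersection
jaccard=$(awk -v a="$bp_a" -v b="$bp_b" -v i="$bp_shared" 'BEGIN {
    printf "%.4f", i / (a + b - i)
}')

echo "jaccard=${jaccard}"
```

Subtracting the shared bases once keeps them from being counted twice in the union, which is why the value tops out at 1 when the two sets are identical.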
###6 bedtools coverage
#Tool: bedtools coverage (aka coverageBed)
#Version: v2.30.0
#Summary: Returns the depth and breadth of coverage of features from B
# on the intervals in A.
#Usage: bedtools coverage [OPTIONS] -a <bed/gff/vcf> -b <bed/gff/vcf>
bedtools coverage -a cpg.bed -b exons.bed
Reference:
http://quinlanlab.org/tutorials/bedtools/bedtools.html