November 21, 2021 [reading notes] - Bioinformatics and Functional Genomics (Chapter 5: Advanced Database Searching)
2022-06-30 07:37:00 【Muyiqing】

- 5.3 Finding distantly related proteins: Position-Specific Iterated BLAST (PSI-BLAST) and DELTA-BLAST
- The PAM250 matrix provides a better scoring system for detecting distantly related proteins, and the scoring matrix can be changed to detect more distant proteins, but limitations remain:
- BLASTP may detect a matching protein, yet it is unclear whether the match is truly homologous.
- PSI-BLAST
- Searches the database more deeply to find matches that are distantly related to the protein of interest.
- The search proceeds in 5 steps:
- Search a target database with the query sequence using a scoring matrix (an ordinary BLASTP search).
- Construct a multiple sequence alignment from the results of the initial search, using composition-based statistics; from this alignment, build a specialized, customized search matrix (a profile).
- Use the resulting position-specific scoring matrix (PSSM) for another search.
- PSI-BLAST evaluates the statistical significance of the database matches.
- Iterate the search; generally five iterations are run.
- Figure: the resulting position-specific scoring matrix (PSSM)
- Figure: alignment results after 3 iterations
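The profile-building step above can be sketched in a few lines of Python. This is a minimal illustration, not the PSI-BLAST implementation: it assumes a uniform background frequency of 1/20 and a fixed pseudocount, whereas PSI-BLAST uses sequence weighting and data-dependent pseudocounts. The names `build_pssm`, `score_sequence`, and the toy alignment are made up for the example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(alignment, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per alignment column."""
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    pssm = []
    for column in zip(*alignment):
        # Count residues in this column; gap characters are ignored.
        counts = Counter(res for res in column if res in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        scores = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            scores[aa] = math.log2(freq / background)
        pssm.append(scores)
    return pssm

def score_sequence(pssm, seq):
    """Sum the position-specific scores of seq (ungapped, same length)."""
    return sum(col[res] for col, res in zip(pssm, seq))

# Hypothetical four-sequence alignment, for illustration only.
toy_alignment = ["GDWF", "GEWY", "GDWF", "GNWF"]
pssm = build_pssm(toy_alignment)
print(score_sequence(pssm, "GDWF"), score_sequence(pssm, "AAAA"))
```

Unlike a fixed matrix such as PAM250, the score for a residue here depends on the column: G is rewarded at position 1 only because every aligned sequence has G there.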
- PSI-BLAST errors: the corruption problem
- The main source of error is the spurious amplification of unrelated sequences. Three measures help prevent corruption:
- 1. Apply a filtering algorithm to remove compositionally biased amino acid regions.
- 2. Lower the inclusion threshold from its default to a more stringent value.
- 3. Visually inspect the results of each PSI-BLAST iteration.
- Figure: the globins and how PSI-BLAST improves search sensitivity
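For a local search, the first two measures correspond to options of the BLAST+ `psiblast` program. The sketch below only assembles a command line and does not run it. The file name `query.fasta`, the database name `nr`, and the threshold 0.001 are placeholder assumptions, and the option names should be checked against `psiblast -help` for the installed version.

```python
def psiblast_command(query, db, iterations=5, inclusion_ethresh=0.001, seg=True):
    """Build (but do not run) a psiblast command line as a list of arguments."""
    cmd = [
        "psiblast",
        "-query", query,
        "-db", db,
        "-num_iterations", str(iterations),
        # Measure 2: a lower inclusion threshold admits fewer borderline hits
        # into the PSSM for the next iteration.
        "-inclusion_ethresh", str(inclusion_ethresh),
    ]
    if seg:
        # Measure 1: mask compositionally biased (low-complexity) regions.
        cmd += ["-seg", "yes"]
    return cmd

# Placeholder inputs; pass the list to subprocess.run(cmd) to execute it.
cmd = psiblast_command("query.fasta", "nr")
print(" ".join(cmd))
```

The third measure, inspecting each iteration by eye, has no command-line equivalent; running one iteration at a time and reviewing the hits before continuing is the closest analogue.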
- Reverse Position-Specific BLAST (RPS-BLAST)
- RPS-BLAST compares a query protein sequence against a database of predefined PSSMs, identifying the conserved protein domains in the query sequence.
- Figure: an RPS-BLAST query
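The idea of searching one query against many stored PSSMs can be illustrated as follows. This is a simplified sketch and not the RPS-BLAST algorithm: it slides each PSSM along the query without gaps and uses a raw score cutoff instead of E values. The two tiny PSSMs, the default score of -2.0, and the cutoff of 5.0 are invented for the example.

```python
def best_window(pssm, query, default=-2.0):
    """Return (best_score, start) for the best ungapped placement of pssm."""
    width = len(pssm)
    best = (float("-inf"), -1)
    for start in range(len(query) - width + 1):
        window = query[start:start + width]
        # Residues not listed in a column receive the default penalty.
        score = sum(col.get(res, default) for col, res in zip(pssm, window))
        best = max(best, (score, start))
    return best

def scan_library(library, query, min_score=5.0):
    """Score the query against every PSSM and keep hits above the cutoff."""
    hits = []
    for name, pssm in library.items():
        score, start = best_window(pssm, query)
        if score >= min_score:
            hits.append((name, start, score))
    return sorted(hits, key=lambda hit: -hit[2])

# Hypothetical domain library: each PSSM is a list of {residue: score} columns.
library = {
    "toy_domain_A": [{"G": 3.0}, {"D": 2.0, "E": 1.0}, {"W": 4.0}],
    "toy_domain_B": [{"C": 3.0}, {"C": 3.0}, {"H": 3.0}],
}
hits = scan_library(library, "MKTGDWLLA")
print(hits)
```

Here the roles are reversed relative to PSI-BLAST: the PSSMs are the database and the single sequence is the query, which is why the result is a list of domains rather than a list of proteins.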
- DELTA-BLAST
- The most sensitive and accurate protein search tool available at NCBI
- Advantages
- Built on the high-quality, manually curated CDD database, it can produce larger and more complete PSSMs than PSI-BLAST.
- More sensitive than BLASTP and PSI-BLAST, including when searching for distantly related proteins.
- Fast.
- Produces better-quality sequence alignments than BLASTP.
- Performance evaluation of PSI-BLAST and DELTA-BLAST
- DELTA-BLAST is more sensitive than PSI-BLAST, BLASTP, and other programs.
- At a given number of false positives, DELTA-BLAST can find three times as many homologous proteins as BLASTP.
- Pattern-Hit Initiated BLAST (PHI-BLAST)
- A protein may contain a pattern, or "signature", of amino acid residues that helps determine whether it belongs to a particular family. PHI-BLAST finds database matches that both contain the pattern and are similar to the query sequence. DELTA-BLAST is highly sensitive, but it does not report information about a user-selected pattern.
- Defining signatures and patterns
- Patterns are user-defined and can include some ambiguity, for example NDFX(5)GXW[YF]:
- X(5) means that these five positions can be any amino acid residue;
- [YF] means that the residue at the last position must be either tyrosine or phenylalanine;
- The chosen pattern must not occur too often: the algorithm only accepts patterns with an occurrence frequency below 1/5000.
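The pattern syntax above maps directly onto a regular expression, which makes it easy to check a pattern before searching with it. The sketch below handles only the constructs used in these notes (fixed residues, X, fixed repeats such as X(5), and bracketed choices). The helper names and the test sequence are invented, and the probability estimate assumes all 20 amino acids are equally likely, so it is only a rough guide to the 1/5000 limit, not PHI-BLAST's own calculation.

```python
import re

def tokenize(pattern):
    """Split a pattern into (allowed_residues, repeat) pairs; 'X' = any residue."""
    tokens = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "-":  # optional PROSITE-style separator
            i += 1
            continue
        if ch == "[":
            end = pattern.index("]", i)
            allowed = pattern[i + 1:end]
            i = end + 1
        else:
            allowed = ch.upper()
            i += 1
        repeat = 1
        if i < len(pattern) and pattern[i] == "(":
            end = pattern.index(")", i)
            repeat = int(pattern[i + 1:end])  # fixed repeat counts only
            i = end + 1
        tokens.append((allowed, repeat))
    return tokens

def to_regex(pattern):
    """Translate the pattern into Python regular expression syntax."""
    parts = []
    for allowed, repeat in tokenize(pattern):
        if allowed == "X":
            unit = "."
        elif len(allowed) == 1:
            unit = allowed
        else:
            unit = "[" + allowed + "]"
        parts.append(unit + ("{%d}" % repeat if repeat > 1 else ""))
    return "".join(parts)

def match_probability(pattern):
    """Chance of a match at one start position, assuming uniform residues."""
    p = 1.0
    for allowed, repeat in tokenize(pattern):
        per_position = 1.0 if allowed == "X" else len(allowed) / 20
        p *= per_position ** repeat
    return p

regex = to_regex("NDFX(5)GXW[YF]")
match = re.search(regex, "MKNDFAAAAAGQWYL")  # hypothetical sequence
print(regex, match.start(), match_probability("NDFX(5)GXW[YF]"))
```

Under the uniform assumption the example pattern matches with probability 2/20^6, about 3.1e-8 per start position, which is far below 1/5000. A very short or highly ambiguous pattern such as X(3)[YF] would not pass.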
- Figure: selecting a pattern for a PHI-BLAST search
- The PHI-BLAST algorithm analyzes a pairwise alignment A0 that spans the input pattern, together with the regions A1 and A2 immediately upstream and downstream of it; the alignments are scored using gapped extension.