当前位置:网站首页>Deep embedding and alignment of Google | protein sequences
Deep embedding and alignment of Google | protein sequences
2022-07-03 05:14:00 【Zhiyuan community】
Once trained ,DEDAL A gap and replacement scoring matrix calculated specifically for each new pair of sequences will be generated . Besides , Gaps and replacement scores are background : For each pair of positions , They depend on the complete sequence to be aligned . Then use a standard SW The algorithm uses these parameters to calculate the best arrangement . This article shows that ,DEDAL It can be effectively trained on modern hardware with accelerators . Once the training is done ,DEDAL With the standard SW comparison , The alignment quality predicted for remote homologues is improved 2-3 times , And it produces an alignment score that can detect remote homologues more accurately .
The figure above shows the data from Pfam-A Examples of sequence alignment of two protein domains of seeds .
a. Respectively from the Pfam-A Seed database ( The second line )、DEDAL forecast ( The third line ) Harmony PFASUM70 Alternative matrix prediction ( In the fourth row ) The comparison . This article shows Pfam-A Seed and DEDAL All residues in the two sequences aligned , But it didn't show PFASUM Aligned upstream and downstream misaligned residues in the sequence . The residues highlighted in green correspond to conservative residues that are correctly aligned , The residues shown in red correspond to predicted alignment and Pfam-A Differences between seed alignments .
b. come from PFASUM Substitution scores between all residue pairs of the substitution matrix .
c. from DEDAL Predicted SW Parameters .

In terms of Technology , This article explores two ways to create a distinguishable SW Align modules , Need to be in " Learn alignment " Training in the task DEDAL Parameters of , Use smoothing or perturbation techniques ; This paper finds that there is no obvious difference in performance between the two , And in the final DEDAL The disturbance based method is implemented in the model . About training DEDAL The permutation of , This paper finds that , When this article hopes DEDAL When accurate local arrangement can be predicted , Use Pfam Expand the domain instead of Pfam Domains are beneficial . Pre training in masking language modeling tasks DEDAL when , The sequences related to families outside the distribution are separated from " Protein universe " Exclude from , This leads to a slight decrease in the performance of remote homologues , Although the performance gap relative to the baseline is not obvious .
About the strategy of end-to-end joint training converter and parameter , This paper finds that this is indeed significantly better than the more classic two-step strategy , That is, first train the converter encoder in the shielded language modeling task , Then fix the converter at " Learn alignment " Training parameter device in task . This shows that , A general language model , Such as ESM, It's not enough. , At least fine tune , To achieve the best performance of alignment .
The above figure shows the application of learning embedding in downstream tasks . This paper evaluates the benefits of context sensitive embedding by simply training a model , In this model , The replacement cost is limited to depend only on the amino acids to be aligned ; It's not hard to see. , This paper observes that the performance of this model has a great decline , Achieved and " aim " The performance of the best replacement matrix in .
边栏推荐
- appium1.22. Appium inspector after X version needs to be installed separately
- Wechat applet distance and map
- XML配置文件
- Force GCC to compile 32-bit programs on 64 bit platform
- Ueditor, FCKeditor, kindeditor editor vulnerability
- [basic grammar] C language uses for loop to print Pentagram
- Redis Introduction et explication des types de données
- "250000 a year is just the price of cabbage" has become a thing of the past. The annual salary of AI posts has decreased by 8.9%, and the latest salary report has been released
- JQ style, element operation, effect, filtering method and transformation, event object
- Notes | numpy-11 Array operation
猜你喜欢

2022-02-11 daily clock in: problem fine brush

Basic knowledge of reflection (detailed explanation)

Review the configuration of vscode to develop golang

Use posture of sudo right raising vulnerability in actual combat (cve-2021-3156)

Esp32-c3 learning and testing WiFi (II. Wi Fi distribution - smart_config mode and BlueIf mode)
![[set theory] relation properties (transitivity | transitivity examples | transitivity related theorems)](/img/c2/87358af6b2b2892a6eceb751b3b60c.jpg)
[set theory] relation properties (transitivity | transitivity examples | transitivity related theorems)
![[set theory] relationship properties (common relationship properties | relationship properties examples | relationship operation properties)](/img/af/8dfa783c87363a9d75c52e7680d508.jpg)
[set theory] relationship properties (common relationship properties | relationship properties examples | relationship operation properties)

"250000 a year is just the price of cabbage" has become a thing of the past. The annual salary of AI posts has decreased by 8.9%, and the latest salary report has been released
![[set theory] relationship properties (symmetry | symmetry examples | symmetry related theorems | antisymmetry | antisymmetry examples | antisymmetry theorems)](/img/34/d195752992f8955bc2a41b4ce751db.jpg)
[set theory] relationship properties (symmetry | symmetry examples | symmetry related theorems | antisymmetry | antisymmetry examples | antisymmetry theorems)

JQ style, element operation, effect, filtering method and transformation, event object
随机推荐
How to connect the network: Chapter 1 CSDN creation punch in
Handler understands the record
Notes | numpy-09 Broadcast
"250000 a year is just the price of cabbage" has become a thing of the past. The annual salary of AI posts has decreased by 8.9%, and the latest salary report has been released
1094 the largest generation (25 points)
Objects. Requirenonnull method description
在PyCharm中配置使用Anaconda环境
[develop wechat applet local storage with uni app]
Realize file download through the tag of < a > and customize the file name
[set theory] relation properties (transitivity | transitivity examples | transitivity related theorems)
[backtrader source code analysis 5] rewrite several time number conversion functions in utils with Python
Ueditor, FCKeditor, kindeditor editor vulnerability
[research materials] annual report of China's pension market in 2021 - Download attached
Chapter II program design of circular structure
SSM framework integration
Go practice -- factory mode of design patterns in golang (simple factory, factory method, abstract factory)
Redis 过期淘汰机制
Detailed explanation of yolov5 training own data set
Redis breakdown penetration avalanche
Go practice -- gorilla/rpc (gorilla/rpc/json) used by gorilla web Toolkit

