Google | Deep embedding and alignment of protein sequences
2022-07-03 05:10:00 [智源社区 (BAAI Community)]
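Once trained, DEDAL produces gap and substitution scoring matrices computed specifically for each new pair of sequences. Moreover, the gap and substitution scores are contextual: for each pair of positions, they depend on the full sequences being aligned. A standard Smith-Waterman (SW) algorithm then uses these parameters to compute the optimal alignment. The paper shows that DEDAL can be trained efficiently on modern hardware with accelerators. Once trained, DEDAL improves the quality of predicted alignments for remote homologs by a factor of 2-3 compared with standard SW, and produces an alignment score that detects remote homologs more accurately.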
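The figure above shows an example alignment of two protein domain sequences from Pfam-A seed.
a. Alignments taken from the Pfam-A seed database (second row), predicted by DEDAL (third row), and predicted with the PFASUM70 substitution matrix (fourth row). All residues of the two sequences are shown for the Pfam-A seed and DEDAL alignments, whereas unaligned residues upstream and downstream of the aligned region are not shown for the PFASUM alignment. Residues highlighted in green are correctly aligned conserved residues, while residues shown in red mark differences between the predicted alignment and the Pfam-A seed alignment.
b. Substitution scores between all pairs of residues, taken from the PFASUM substitution matrix.
c. SW parameters predicted by DEDAL.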
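
To make the idea of per-position SW parameters concrete, here is a minimal sketch of a local alignment score computed from three matrices (substitution, gap open, gap extend) indexed by position pairs. This is an illustration, not DEDAL's code: the function names, the indexing convention for the gap penalties, and the toy scores (+2 match, -1 mismatch, 3 open, 1 extend) are all assumptions. The helper `context_free_params` fills the matrices from the residues alone, standing in for the contextual values a trained model would predict.

```python
NEG_INF = float("-inf")


def smith_waterman(sub, gap_open, gap_extend):
    """Local alignment score from per-position parameters.

    sub[i][j], gap_open[i][j] and gap_extend[i][j] are the substitution score
    and gap penalties for position i of the first sequence and position j of
    the second (an assumed indexing convention).
    """
    n, m = len(sub), len(sub[0])
    # match[i][j]: best local alignment ending with x_i aligned to y_j.
    # gap_x[i][j]: best alignment ending with x_i aligned to a gap.
    # gap_y[i][j]: best alignment ending with y_j aligned to a gap.
    match = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    gap_x = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    gap_y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            o = gap_open[i - 1][j - 1]
            e = gap_extend[i - 1][j - 1]
            # The 0.0 lets a local alignment start at any position pair.
            prev = max(0.0, match[i - 1][j - 1],
                       gap_x[i - 1][j - 1], gap_y[i - 1][j - 1])
            match[i][j] = sub[i - 1][j - 1] + prev
            gap_x[i][j] = max(match[i - 1][j] - o, gap_x[i - 1][j] - e)
            gap_y[i][j] = max(match[i][j - 1] - o, gap_y[i][j - 1] - e)
            best = max(best, match[i][j])
    return best


def context_free_params(x, y, match_score=2.0, mismatch=-1.0,
                        open_pen=3.0, ext_pen=1.0):
    """Toy parameters that depend only on the residues, not their context."""
    sub = [[match_score if a == b else mismatch for b in y] for a in x]
    gap_open = [[open_pen] * len(y) for _ in x]
    gap_extend = [[ext_pen] * len(y) for _ in x]
    return sub, gap_open, gap_extend


sub, gap_open, gap_extend = context_free_params("ACGT", "ACGT")
baseline = smith_waterman(sub, gap_open, gap_extend)

# A contextual model could score the same residue pair differently at
# different positions; here one entry is overridden by hand to mimic that.
sub[0][0] = 10.0
contextual = smith_waterman(sub, gap_open, gap_extend)
```

Because the dynamic program only reads the three matrices, the same routine works unchanged whether the entries come from a fixed substitution matrix or from a model that conditions on both full sequences.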

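On the technical side, the paper explores two approaches to building a differentiable SW alignment module, which is needed to train DEDAL's parameters on the "learning to align" task: smoothing techniques and perturbation techniques. The paper finds no notable difference in performance between the two and implements the perturbation-based approach in the final DEDAL model. Regarding the alignments used to train DEDAL, the paper finds that using Pfam extended domains rather than Pfam domains is beneficial when DEDAL is expected to predict accurate local alignments. When pretraining DEDAL on the masked language modeling task, excluding sequences related to out-of-distribution families from the "protein universe" leads to a slight drop in performance on remote homologs, although the drop is small relative to the performance gap with the baselines.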
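
The smoothing and perturbation ideas mentioned above can be illustrated on a plain `max` operator, which is the non-differentiable step inside the SW recursion. The sketch below is a toy under stated assumptions, not DEDAL's implementation: the function names, the temperature and noise parameters, and the use of Gaussian noise are all illustrative choices. Smoothing replaces `max` with a temperature-scaled log-sum-exp whose gradient is a softmax; perturbation averages the hard argmax over randomly perturbed inputs.

```python
import math
import random


def smooth_max(values, temperature=1.0):
    """Temperature-scaled log-sum-exp, a smooth upper bound on max(values)."""
    top = max(values)  # subtract the maximum for numerical stability
    total = sum(math.exp((v - top) / temperature) for v in values)
    return top + temperature * math.log(total)


def smooth_argmax(values, temperature=1.0):
    """Gradient of smooth_max with respect to its inputs: a softmax."""
    top = max(values)
    weights = [math.exp((v - top) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def perturbed_argmax(values, sigma=1.0, num_samples=1000, seed=0):
    """Monte Carlo average of the one-hot argmax under Gaussian input noise."""
    rng = random.Random(seed)
    counts = [0] * len(values)
    for _ in range(num_samples):
        noisy = [v + sigma * rng.gauss(0.0, 1.0) for v in values]
        counts[noisy.index(max(noisy))] += 1
    return [c / num_samples for c in counts]


scores = [1.0, 2.0, 3.0]
soft_value = smooth_max(scores, temperature=0.5)
soft_choice = smooth_argmax(scores, temperature=0.5)
noisy_choice = perturbed_argmax(scores, sigma=1.0)
```

In both cases the hard choice of the best predecessor becomes a distribution over predecessors, so small changes in the scores produce small changes in the output. Lowering `temperature` or `sigma` moves either variant back toward the hard maximum.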
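Regarding the strategy of jointly training the transformer and the parameterizer end to end, the paper finds that it clearly outperforms the more classical two-step strategy, in which the transformer encoder is first trained on the masked language modeling task and the parameterizer is then trained on the "learning to align" task with the transformer kept fixed. This suggests that a general-purpose language model such as ESM is not sufficient on its own and should at least be fine-tuned to reach the best alignment performance.
The figure above shows how the learned embeddings are applied to downstream tasks. The paper assesses the benefit of context-dependent embeddings by training a model in which the substitution cost is constrained to depend only on the amino acids being aligned. Unsurprisingly, this model's performance drops sharply, to a level comparable to that of the best-performing substitution matrix on the alignment task.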

