Google | Deep Embedding and Alignment of Protein Sequences
2022-07-03 05:13:00 [智源社區 (BAAI Community)]
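Once trained, DEDAL produces gap and substitution scoring matrices computed specifically for each new pair of sequences. These gap and substitution scores are also contextual: for each pair of positions, they depend on the full sequences being aligned. A standard Smith-Waterman (SW) algorithm then uses these parameters to compute the optimal alignment. The paper shows that DEDAL can be trained efficiently on modern hardware with accelerators. Once trained, DEDAL improves the quality of predicted alignments for remote homologs by a factor of 2 to 3 compared with standard SW, and it produces an alignment score that detects remote homologs more accurately.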
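Figure: Example alignment of two protein domain sequences from Pfam-A seed.
a. Alignments taken from the Pfam-A seed database (second row), predicted by DEDAL (third row), and predicted with the PFASUM70 substitution matrix (fourth row). All residues of both sequences are shown for the Pfam-A seed and DEDAL alignments; for PFASUM, unaligned residues upstream and downstream of the aligned region are not shown. Residues highlighted in green are correctly aligned conserved residues, and residues in red mark differences between the predicted alignment and the Pfam-A seed alignment.
b. Substitution scores between all pairs of residues, taken from the PFASUM substitution matrix.
c. SW parameters predicted by DEDAL.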
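The role of position-specific SW parameters can be illustrated with a minimal sketch. It assumes an affine gap model (a cost to open a gap and a separate cost to extend it) and assumes each parameter is stored as a matrix indexed by position pair. The function names `local_alignment_score` and `uniform_params` are hypothetical, and the uniform scores below only stand in for the values a trained model would predict; this is not DEDAL's implementation.

```python
import numpy as np


def local_alignment_score(sub, gap_open, gap_extend):
    """Best local alignment score under position-specific parameters.

    sub, gap_open, gap_extend are arrays of shape (len_x, len_y): entry
    [i, j] applies when position i of x meets position j of y. Indexing the
    gap costs by position pair is an assumption made for this sketch.
    """
    n, m = sub.shape
    match = np.zeros((n + 1, m + 1))          # alignment ends with x_i paired to y_j
    gap_x = np.full((n + 1, m + 1), -np.inf)  # alignment ends with x_i facing a gap
    gap_y = np.full((n + 1, m + 1), -np.inf)  # alignment ends with y_j facing a gap
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = sub[i - 1, j - 1]
            o = gap_open[i - 1, j - 1]
            e = gap_extend[i - 1, j - 1]
            # The 0.0 lets a local alignment start fresh at this cell.
            prev = max(0.0, match[i - 1, j - 1], gap_x[i - 1, j - 1], gap_y[i - 1, j - 1])
            match[i, j] = prev + s
            gap_x[i, j] = max(match[i - 1, j] - o, gap_x[i - 1, j] - e)
            gap_y[i, j] = max(match[i, j - 1] - o, gap_y[i, j - 1] - e)
            best = max(best, match[i, j])
    return float(best)


def uniform_params(x, y, match=2.0, mismatch=-1.0, gap_open=3.0, gap_extend=1.0):
    """Hypothetical stand-in for a learned model: context-free, uniform scores."""
    sub = np.array([[match if a == b else mismatch for b in y] for a in x])
    return sub, np.full(sub.shape, gap_open), np.full(sub.shape, gap_extend)


if __name__ == "__main__":
    print(local_alignment_score(*uniform_params("ACGT", "ACT")))
    print(local_alignment_score(*uniform_params("ACGT", "ACT", gap_open=1.0)))
```

Lowering the gap-opening cost in the second call makes it worthwhile to bridge the unmatched residue, which is the kind of per-pair trade-off that predicted parameters can tune position by position.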

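On the technical side, the paper explores two ways to build the differentiable SW alignment module needed to train DEDAL's parameters on the "learning to align" task: smoothing and perturbation. The authors found no clear difference in performance between the two and implemented the perturbation-based approach in the final DEDAL model. Regarding the alignments used to train DEDAL, they found that using Pfam extended domains rather than Pfam domains is beneficial when DEDAL is expected to predict accurate local alignments. When DEDAL is pretrained on the masked language modeling task, excluding sequences related to out-of-distribution families from the "protein universe" leads to a slight drop in performance on remote homologs, although the drop is small relative to the performance gap with the baselines.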
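The difference between smoothing and perturbation can be sketched on the simplest building block of SW, a maximum over candidate scores. This is a toy illustration under stated assumptions, not the paper's module: smoothing is shown as a temperature-scaled log-sum-exp, perturbation as a Monte Carlo average of argmax under Gaussian noise, and the names `smooth_max` and `perturbed_argmax` are hypothetical.

```python
import numpy as np


def smooth_max(scores, temperature=1.0):
    """Smoothing: replace max with a temperature-scaled log-sum-exp.

    The result is differentiable in the scores and upper-bounds the hard
    max by at most temperature * log(len(scores)).
    """
    s = np.asarray(scores, dtype=float) / temperature
    m = s.max()  # subtract the max for numerical stability
    return float(temperature * (m + np.log(np.exp(s - m).sum())))


def perturbed_argmax(scores, sigma=1.0, n_samples=1000, seed=0):
    """Perturbation: average the one-hot argmax over noisy copies of the scores.

    Gaussian noise and the sample count are assumptions of this sketch. The
    returned vector is a smooth surrogate for the hard one-hot argmax.
    """
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    noise = rng.standard_normal((n_samples, scores.size))
    winners = np.argmax(scores + sigma * noise, axis=1)
    return np.bincount(winners, minlength=scores.size) / n_samples


if __name__ == "__main__":
    candidates = [1.0, 5.0, 2.0]
    print(smooth_max(candidates, temperature=0.1))
    print(perturbed_argmax(candidates, sigma=1.0))
```

In both cases the hard, piecewise-constant choice is replaced by a quantity that varies smoothly with the scores, which is what allows gradients to flow back to the parameters that produced them.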
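As for training the transformer and the parameterizer jointly end to end, the authors found that this clearly outperforms the more classical two-step strategy, in which the transformer encoder is first trained on the masked language modeling task and the parameterizer is then trained on the "learning to align" task with the transformer kept fixed. This suggests that a generic language model such as ESM is not sufficient on its own and should at least be fine-tuned to reach the best alignment performance.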
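Figure: Application of the learned embeddings to downstream tasks. The authors assess the benefit of context-dependent embeddings by training a model in which the substitution cost is constrained to depend only on the two amino acids being aligned. Unsurprisingly, this model performs much worse, reaching roughly the same performance as the best-performing substitution matrix for alignment.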
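The contrast between the two kinds of substitution cost can be sketched as follows. The context-free variant looks up each residue pair in a fixed 20 x 20 table, so the same pair of amino acids always gets the same score. The contextual variant scores each position pair from per-position embeddings; the bilinear form, the random embeddings standing in for a transformer's output, and the function names are all assumptions made for illustration.

```python
import numpy as np

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
INDEX = {aa: k for k, aa in enumerate(ALPHABET)}


def context_free_scores(x, y, table):
    """Substitution scores that depend only on the two amino acids."""
    xi = np.array([INDEX[a] for a in x])
    yi = np.array([INDEX[a] for a in y])
    return table[np.ix_(xi, yi)]  # shape (len(x), len(y))


def contextual_scores(emb_x, emb_y, weight):
    """Substitution scores computed from per-position embeddings.

    emb_x has shape (len_x, d) and emb_y has shape (len_y, d). The bilinear
    form emb_x @ weight @ emb_y.T is one possible choice, assumed here.
    """
    return emb_x @ weight @ emb_y.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    table = rng.normal(size=(20, 20))
    table = (table + table.T) / 2  # symmetric, like a classical substitution matrix
    x, y = "ACA", "CA"

    fixed = context_free_scores(x, y, table)
    # Both occurrences of A in x receive identical scores against y.
    print(np.allclose(fixed[0], fixed[2]))

    d = 4  # hypothetical embedding size
    emb_x = rng.normal(size=(len(x), d))  # placeholder for encoder output
    emb_y = rng.normal(size=(len(y), d))
    weight = rng.normal(size=(d, d))
    contextual = contextual_scores(emb_x, emb_y, weight)
    # Different embeddings at the two A positions give different scores.
    print(np.allclose(contextual[0], contextual[2]))
```

Because an embedding can encode a residue's surroundings, the contextual variant can score the same amino acid differently at different positions, which the table lookup cannot do.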