Paper notes: Highly accurate protein structure prediction with AlphaFold (AlphaFold 2 & appendix)
2022-07-25 16:15:00 【UQI-LIUWJ】
Note: this is a Nature paper. The main text alone is nearly impossible to follow; it has to be read together with its supplementary materials.
The paper is also very hard to read (QAQ), so corrections are welcome if anything here is wrong.
0 Preface
- Protein structure prediction: given the amino acid sequence of a protein, predict what its 3D structure looks like
- Determining the structure of a protein experimentally can currently take a long time
- One approach: let the protein move and observe its structure under a microscope from different angles
- This paper proposes AlphaFold 2
- The earlier AlphaFold 1 was not accurate enough
- AlphaFold 2 reaches atomic-level accuracy
- The difference between the experimentally measured (true) position and the predicted position is within about the size of an atom
- The model builds knowledge from biology and physics into deep learning
1 The model
1.1 Overall model

The Transformer part (the "encoder" here) only integrates information across the different elements (amino acids); the real distillation of that information is carried out in the decoder part.
My thoughts on the recycling mechanism:
The mechanism is somewhat like an RNN passing its hidden state on to the next step. By repeatedly feeding in the output learned in the previous round, the model gets a better output (each round's output may have limited accuracy, and iterating improves the result).
The difference is that only the network structure is reused here and no gradient is propagated back. In other words, the outputs that are passed back in are detached.
→ Compared with an RNN reusing its structure, the computation time is the same but the memory use differs: in an RNN, the gradient of the hidden state that is passed back must also be kept in memory, whereas the outputs passed back here need no gradient record.
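The recycling loop described above can be sketched as a toy example. This is only an illustration under stated assumptions: `toy_model` and `run_with_recycling` are hypothetical names, the "network" is a trivial update rule rather than AlphaFold 2, and since plain NumPy has no autodiff, the detach step appears only as a comment.

```python
import numpy as np

def toy_model(features, prev):
    """Hypothetical stand-in for one pass of the network: moves the
    previous output halfway toward a target derived from the features."""
    target = 2.0 * features
    return prev + 0.5 * (target - prev)

def run_with_recycling(features, n_recycle=3):
    prev = np.zeros_like(features)  # the first pass starts from zeros
    for _ in range(n_recycle):
        # In an autodiff framework, the recycled output would be detached
        # here (stop-gradient), so earlier passes store no gradient state.
        prev = toy_model(features, prev)
    # Final pass: the only one whose gradient would be kept in training.
    return toy_model(features, prev)

features = np.array([1.0, -2.0, 0.5])
target = 2.0 * features
# Largest error for 0, 1, 2 and 3 recycling rounds.
errors = [float(np.abs(run_with_recycling(features, n) - target).max())
          for n in range(4)]
print(errors)
```

Each extra round reuses the same function on its own previous output, which is why the error shrinks while the parameter count stays fixed.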
1.2 The “encoder”
1.2.1 Overall structure

1.2.2 row-wise gated self-attention with pair bias


1.2.3 column-wise gated self-attention
The overall procedure is similar to 1.2.2. The difference is that self-attention here runs column by column (it weights the amino acids at the same position across the different proteins).
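To make the row-wise versus column-wise distinction concrete, here is a minimal single-head NumPy sketch. It is a simplification, not the paper's module: layer normalisation, multiple heads and the output projection are left out, the pair bias is a random matrix standing in for a projection of the pair representation, and all names (`gated_attention`, `Wq`, `pair_bias`, ...) are my own.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_attention(m, Wq, Wk, Wv, Wg, bias=None):
    """Single-head gated self-attention along axis 1 of m.
    m: (batch, length, c); bias: optional (length, length) logit bias."""
    q, k, v = m @ Wq, m @ Wk, m @ Wv
    logits = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    if bias is not None:
        logits = logits + bias           # pair bias added before softmax
    weights = softmax(logits, axis=-1)
    gate = sigmoid(m @ Wg)               # elementwise gate from the input
    return gate * (weights @ v)

rng = np.random.default_rng(0)
n_seq, n_res, c = 4, 6, 8
msa = rng.normal(size=(n_seq, n_res, c))
pair_bias = rng.normal(size=(n_res, n_res))
Wq, Wk, Wv, Wg = (rng.normal(size=(c, c)) * 0.1 for _ in range(4))

# Row-wise: residues within one sequence attend to each other.
row_out = gated_attention(msa, Wq, Wk, Wv, Wg, bias=pair_bias)

# Column-wise: swap axes so the sequences at one residue position attend
# to each other; no pair bias is used here.
col_out = gated_attention(msa.transpose(1, 0, 2), Wq, Wk, Wv, Wg)
col_out = col_out.transpose(1, 0, 2)
print(row_out.shape, col_out.shape)
```

The same function serves both cases; only the axis that plays the role of "length" changes.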


1.2.4 MSA transition
(The two transition modules are the same)
This is an MLP
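A minimal sketch of such a transition MLP, applied independently at every position of the MSA representation. The widening factor of 4, the ReLU, the residual connection and the parameter names are assumptions based on my reading of the supplementary materials and should be checked against them; the layer norm omits its learned scale and offset.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalise the last (channel) axis; learned scale/offset omitted."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def transition(m, W1, b1, W2, b2):
    """Two-layer MLP on the channel axis: widen, ReLU, project back."""
    h = layer_norm(m)
    h = np.maximum(h @ W1 + b1, 0.0)
    return h @ W2 + b2

rng = np.random.default_rng(0)
n_seq, n_res, c, widen = 4, 6, 8, 4      # widen=4 is an assumption
m = rng.normal(size=(n_seq, n_res, c))
W1 = rng.normal(size=(c, widen * c)) * 0.1
b1 = np.zeros(widen * c)
W2 = rng.normal(size=(widen * c, c)) * 0.1
b2 = np.zeros(c)

m_new = m + transition(m, W1, b1, W2, b2)  # residual update
print(m_new.shape)
```

Unlike the attention modules, nothing here mixes information between positions; each (sequence, residue) entry is transformed on its own.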


1.2.5 Outer product mean
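A minimal sketch of what this module computes, as I understand it from the supplementary materials: two projections of the MSA representation are combined by an outer product for every residue pair (i, j), averaged over the sequences, flattened, and projected into the pair representation. Layer normalisation and biases are omitted, and the names and sizes are my own assumptions.

```python
import numpy as np

def outer_product_mean(msa, Wa, Wb, Wo):
    """msa: (n_seq, n_res, c_m) -> pair update: (n_res, n_res, c_z)."""
    a = msa @ Wa                          # (n_seq, n_res, c)
    b = msa @ Wb                          # (n_seq, n_res, c)
    # Outer product over channels for each residue pair, mean over sequences.
    outer = np.einsum('sic,sjd->ijcd', a, b) / msa.shape[0]
    flat = outer.reshape(outer.shape[0], outer.shape[1], -1)
    return flat @ Wo                      # project to pair channels

rng = np.random.default_rng(0)
n_seq, n_res, c_m, c, c_z = 5, 6, 8, 4, 3
msa = rng.normal(size=(n_seq, n_res, c_m))
Wa = rng.normal(size=(c_m, c))
Wb = rng.normal(size=(c_m, c))
Wo = rng.normal(size=(c * c, c_z))

z_update = outer_product_mean(msa, Wa, Wb, Wo)
print(z_update.shape)
```

This is the step that carries information from the MSA representation into the pair representation.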
1.2.6 Triangular multiplicative update

The second variant is analogous; only the edge direction changes

Note: because the outgoing-edge and incoming-edge modules are applied one after the other, the resulting matrix is not necessarily symmetric.
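The core of the two variants can be sketched with `einsum`. This keeps only the multiplicative step: the gating, layer normalisation and output projection are left out, and `triangle_multiply`, `Wa` and `Wb` are hypothetical names. The index patterns follow my reading of the supplementary materials: the outgoing variant sums a_ik * b_jk over k, the incoming one sums a_ki * b_kj.

```python
import numpy as np

def triangle_multiply(z, Wa, Wb, outgoing=True):
    """z: (n_res, n_res, c). Update edge (i, j) from the two other edges
    of every triangle (i, j, k), summed over the third node k."""
    a = z @ Wa
    b = z @ Wb
    if outgoing:
        return np.einsum('ikc,jkc->ijc', a, b)   # edges i->k and j->k
    return np.einsum('kic,kjc->ijc', a, b)       # edges k->i and k->j

rng = np.random.default_rng(0)
n_res, c = 5, 4
z = rng.normal(size=(n_res, n_res, c))
Wa = rng.normal(size=(c, c))
Wb = rng.normal(size=(c, c))

out_update = triangle_multiply(z, Wa, Wb, outgoing=True)
in_update = triangle_multiply(z, Wa, Wb, outgoing=False)

# The update for (i, j) generally differs from the one for (j, i).
print(np.allclose(out_update, out_update.transpose(1, 0, 2)))
```

The incoming variant is the outgoing one applied to the transposed pair matrix, which is one way to see why the two are described as analogous.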
1.2.7 Triangular self-attention
The upper figure shows the earlier row-wise attention and the lower figure shows the attention used here; the two are clearly very similar

The pseudocode is essentially the same, but according to the paper the fifth line (the attention-weight computation) makes use of triangle properties

1.3 The “decoder”
1.3.0 How is the structure predicted?
- The simplest way to represent a protein's 3D structure is to record the 3D coordinates of every element.
- Rotating or translating a protein does not change its structure, but its absolute 3D coordinates do change
- → So relative positions are used here
- A protein can be viewed as a backbone plus side chains
- Denote a backbone point by x; then any point on a side chain, or the next point along the backbone, can be written as y=Rt+x, where t is that point's position relative to x before rotation
- The 3×3 matrix R applies the rotation
- x applies the translation
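The relation y=Rt+x and the reason for using relative positions can be checked numerically. In this sketch the rotation about the z axis, the sample points and the helper names are arbitrary choices of mine, not values from the paper.

```python
import numpy as np

def rotation_z(theta):
    """3x3 rotation matrix about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def to_global(R, x, t):
    """y = R t + x for every row t of local coordinates."""
    return t @ R.T + x

local_points = np.array([[1.0, 0.0, 0.0],
                         [0.0, 1.0, 0.0],
                         [0.0, 0.0, 1.5]])
R = rotation_z(np.pi / 3)
x = np.array([2.0, -1.0, 0.5])
y = to_global(R, x, local_points)

# Rotate and translate the whole protein: absolute coordinates change.
G = rotation_z(1.0)
shift = np.array([10.0, 0.0, -3.0])
y_moved = y @ G.T + shift
R_moved = G @ R
x_moved = G @ x + shift

# Local coordinates t = R^T (y - x) are unchanged by the global motion.
recovered = (y_moved - x_moved) @ R_moved
print(np.allclose(recovered, local_points))
```

The absolute coordinates `y` and `y_moved` differ, while the local description recovered from the moved frame is identical to the original one.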
1.3.1 Overall structure

1.3.2 IPA (Invariant point attention)

1.3.3 Backbone update
Updates the key quantities s and T
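A sketch of how a frame T=(R, x) could be updated, based on my recollection of the backbone update in the supplementary materials (to be checked against them): the network predicts three numbers (b, c, d), which are turned into the unit quaternion (1, b, c, d)/norm and then into a rotation, plus a translation, and the update is composed with the current frame. The function names and sample values are hypothetical.

```python
import numpy as np

def quaternion_to_rotation(b, c, d):
    """Rotation matrix from the quaternion (1, b, c, d) after normalising."""
    a, b, c, d = np.array([1.0, b, c, d]) / np.sqrt(1.0 + b * b + c * c + d * d)
    return np.array([
        [a*a + b*b - c*c - d*d, 2*(b*c - a*d),         2*(b*d + a*c)],
        [2*(b*c + a*d),         a*a - b*b + c*c - d*d, 2*(c*d - a*b)],
        [2*(b*d - a*c),         2*(c*d + a*b),         a*a - b*b - c*c + d*d],
    ])

def compose(R, x, R_upd, x_upd):
    """Apply the update inside the current frame: T <- T o T_upd."""
    return R @ R_upd, R @ x_upd + x

# Current frame and a small predicted update (arbitrary sample numbers).
R = quaternion_to_rotation(0.0, 0.0, 1.0)      # 90 degrees about z
x = np.array([1.0, 2.0, 3.0])
R_upd = quaternion_to_rotation(0.3, -0.2, 0.5)
x_upd = np.array([0.1, 0.0, -0.1])

R_new, x_new = compose(R, x, R_upd, x_upd)
print(np.allclose(R_new.T @ R_new, np.eye(3)))
```

Fixing the first quaternion component to 1 means a prediction of (0, 0, 0) leaves the frame unchanged, so a freshly initialised layer starts close to the identity update.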

2 Experimental results
2.1 Comparison with other projects

- Each column is a model (team)
- Each bar shows, for the corresponding model, the average difference between the predicted positions and the true positions (in Å, i.e. 10^-10 m, roughly the size of an atom)
- AlphaFold 2 reaches atomic accuracy, which is a milestone
2.2 The accuracy of AlphaFold's predictions

- Blue is the structure predicted by AlphaFold
- Green is the structure determined experimentally
- Their error is indeed within the size of an atom (black sphere)