2020 Bioinformatics | GraphDTA: predicting drug target binding affinity with graph neural networks
2022-07-06 22:02:00 【Stunned flounder (】

Paper: https://academic.oup.com/bioinformatics/article/37/8/1140/5942970?login=false
Code: https://github.com/thinng/GraphDTA
Abstract
Developing new drugs is costly, time-consuming, and often accompanied by safety issues. Drug repurposing can avoid the expensive and lengthy drug development process by finding new uses for already approved drugs. To repurpose drugs effectively, it is useful to know which proteins are targeted by which drugs. Computational models that estimate the interaction strength of new drug-target pairs could speed up drug repurposing. Several models have been proposed for this task. However, these models represent drugs as strings, which is not a natural way to represent molecules. We propose a new model called GraphDTA that represents drugs as graphs and uses graph neural networks to predict drug-target affinity. We show that graph neural networks predict drug-target affinity not only better than non-deep-learning models but also better than competing deep learning methods. Our results confirm that deep learning models are appropriate for drug-target binding affinity prediction, and that representing drugs as graphs can lead to further improvements.
Introduction
Several computational approaches exist for predicting drug-target affinity (DTA):
- Molecular docking, which predicts the stable 3D structure of the drug-target complex by means of a scoring function.
- Collaborative filtering. For example, the SimBoost model uses affinity similarities among drugs and among targets to construct new features.
- Neural networks trained on one-dimensional representations of the drug and protein sequences. For example, the DeepDTA model uses 1D representations and 1D convolutions (with pooling) to capture predictive patterns in the data.
Drug representation
A SMILES string can be converted into a molecular graph with the open-source RDKit software, and a graph convolutional network then learns a feature vector for the drug from that graph. Each node is a multi-dimensional binary (0/1) feature vector expressing five pieces of information: the atom symbol, the number of adjacent atoms, the number of adjacent hydrogens, the implicit valence of the atom, and whether the atom is part of an aromatic structure.
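As a rough illustration of the five-part node feature vector described above, the sketch below builds the 0/1 vector for each atom from hand-written descriptors. It is a minimal sketch rather than the authors' code: the shortened vocabularies (`SYMBOLS`, `COUNTS`) and the helper names `one_hot` and `atom_features` are assumptions made for brevity, and in practice the descriptor values would be read from the RDKit molecule instead of being typed in.

```python
# Hypothetical, shortened vocabularies; a real model would use larger ones.
SYMBOLS = ["C", "N", "O", "S", "F", "Cl", "Br", "other"]
COUNTS = [0, 1, 2, 3, 4]  # allowed values for degree, hydrogens, implicit valence


def one_hot(value, choices):
    """Return a 0/1 list with a single 1 at the position of `value`.

    Values outside `choices` fall back to the last entry.
    """
    if value not in choices:
        value = choices[-1]
    return [1 if value == c else 0 for c in choices]


def atom_features(symbol, degree, num_hs, implicit_valence, is_aromatic):
    """Concatenate the five pieces of atom information into one binary vector."""
    return (
        one_hot(symbol, SYMBOLS)
        + one_hot(degree, COUNTS)
        + one_hot(num_hs, COUNTS)
        + one_hot(implicit_valence, COUNTS)
        + [1 if is_aromatic else 0]
    )


# Ethanol (SMILES "CCO"), heavy atoms only, descriptors written by hand:
# (symbol, degree, attached hydrogens, implicit valence, aromatic?)
atoms = [("C", 1, 3, 3, False), ("C", 2, 2, 2, False), ("O", 1, 1, 1, False)]
bonds = [(0, 1), (1, 2)]  # undirected edges between atom indices

node_features = [atom_features(*a) for a in atoms]
# Store each bond in both directions, as graph libraries usually expect.
edge_index = [(i, j) for i, j in bonds] + [(j, i) for i, j in bonds]
```

With these assumed vocabularies every node vector has 24 entries (8 + 5 + 5 + 5 + 1), and the molecule is fully described by `node_features` together with `edge_index`.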
Protein representation
Because it is difficult to represent a protein's structure as a graph, the protein is represented with one-hot encoding. For each target, the protein sequence is obtained from the UniProt database using the target's gene name. The sequence is a string of ASCII characters representing amino acids. Each amino acid type is encoded as an integer according to its letter symbol [for example, alanine (A) is 1, cysteine (C) is 3, aspartic acid (D) is 4, and so on], so that the protein can be represented as a sequence of integers.
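The integer encoding of the protein sequence can be sketched as follows. This is a minimal sketch under stated assumptions: the vocabulary is taken to be the uppercase alphabet indexed from 1 (which reproduces A = 1, C = 3, D = 4 from the text), 0 is reserved for padding and unknown characters, and the fixed length `max_len` and the function name `encode_protein` are illustrative choices, not values confirmed by the article.

```python
# Assumed vocabulary: uppercase letters in alphabetical order, indexed from 1,
# so that A -> 1, C -> 3, D -> 4 as in the text. 0 is reserved for padding.
VOCAB = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CHAR_TO_INT = {ch: i + 1 for i, ch in enumerate(VOCAB)}


def encode_protein(sequence, max_len=1000):
    """Map an amino-acid string to a fixed-length list of integers.

    Sequences longer than `max_len` are truncated; shorter ones are
    right-padded with 0. Characters outside the vocabulary also map to 0.
    """
    codes = [CHAR_TO_INT.get(ch, 0) for ch in sequence.upper()[:max_len]]
    return codes + [0] * (max_len - len(codes))


print(encode_protein("ACD", max_len=5))  # prints [1, 3, 4, 0, 0]
```

Padding to a fixed length is what lets a batch of proteins of different sizes be fed to an embedding layer and 1D convolutions.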
Molecular graph model structure
The authors propose a new DTA prediction model that combines a graph neural network with a conventional CNN. For the protein, the sequence is first categorically encoded, and an embedding layer is then applied so that each encoded character is represented by a 128-dimensional vector. Next, three 1D convolutional layers learn features at different levels of abstraction from the input. Finally, a max pooling layer produces the representation vector of the input protein sequence. This approach is similar to existing baseline models. For drugs, the authors use molecular graphs and test four graph neural network variants: GCN (Kipf and Welling, 2017), GAT (Veličković et al., 2018), GIN (Xu et al., 2019), and a combined GAT-GCN architecture.
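To make the graph branch more concrete, here is a minimal sketch of a single graph convolution step in the style of Kipf and Welling (2017), followed by a pooling step that turns per-node vectors into one vector for the whole molecule. It is written in plain Python for readability and is not the GraphDTA implementation: the function names `gcn_layer` and `global_max_pool`, the toy graph, the weight matrix, and the choice of max pooling over nodes are all assumptions for illustration.

```python
import math


def gcn_layer(features, edges, weight):
    """One GCN step: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W).

    features: list of node feature lists (n x f_in)
    edges: list of undirected (i, j) pairs
    weight: f_in x f_out matrix as nested lists
    """
    n = len(features)
    # Neighbor sets including a self-loop for every node (A + I).
    neighbors = [{i} for i in range(n)]
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    degree = [len(nb) for nb in neighbors]

    f_in, f_out = len(weight), len(weight[0])
    out = []
    for i in range(n):
        # Symmetrically normalized sum over the node itself and its neighbors.
        agg = [0.0] * f_in
        for j in neighbors[i]:
            norm = 1.0 / math.sqrt(degree[i] * degree[j])
            for k in range(f_in):
                agg[k] += norm * features[j][k]
        # Linear transform followed by ReLU.
        row = [max(0.0, sum(agg[k] * weight[k][c] for k in range(f_in)))
               for c in range(f_out)]
        out.append(row)
    return out


def global_max_pool(node_vectors):
    """Collapse per-node vectors into one graph-level vector."""
    return [max(col) for col in zip(*node_vectors)]


# Toy path graph 0 - 1 - 2 with one feature per node and a 1x1 weight.
hidden = gcn_layer([[1.0], [2.0], [3.0]], [(0, 1), (1, 2)], [[1.0]])
drug_vector = global_max_pool(hidden)
```

In a full model the resulting drug vector would be concatenated with the protein vector from the CNN branch and passed to fully connected layers that output the affinity.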
Experiments and results
The researchers compare GraphDTA mainly against non-deep-learning models and popular deep learning models. Model quality is measured with two metrics: the concordance index (CI), which indicates how consistently the predicted values are ordered relative to the actual values, and the mean squared error (MSE). To make the results comparable, the models are evaluated on the Davis and KIBA datasets.
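The two metrics can be sketched in a few lines. This is a minimal, unoptimized sketch using the usual pairwise definition of the concordance index (a pair counts when the true affinities differ, and a tie in the predictions counts as half-correct); the function names and the sample values are made up for illustration and are not results from the paper.

```python
def mse(y_true, y_pred):
    """Mean squared error between true and predicted affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predictions are in the right order.

    A pair is comparable when the true values differ; a tie in the
    predictions counts as half-correct.
    """
    score, pairs = 0.0, 0
    for i in range(len(y_true)):
        for j in range(len(y_true)):
            if y_true[i] > y_true[j]:
                pairs += 1
                if y_pred[i] > y_pred[j]:
                    score += 1.0
                elif y_pred[i] == y_pred[j]:
                    score += 0.5
    return score / pairs if pairs else 0.0


# Made-up affinities: the third prediction is ranked below the second.
y_true = [5.0, 6.0, 7.0, 8.0]
y_pred = [5.1, 6.2, 6.1, 8.3]
ci = concordance_index(y_true, y_pred)
error = mse(y_true, y_pred)
```

In this toy example one of the six comparable pairs is misordered, so the CI is 5/6. A higher CI and a lower MSE both indicate a better model.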
[Table: model results on the Davis dataset]
On both datasets, the reported results indicate that the combined GAT-GCN graph representation model has the best predictive performance.
Conclusion
In this work, the researchers propose a new method, called GraphDTA, for computationally predicting drug-target binding affinity. The aim is to make drug development less difficult, reduce the time and cost of finding new drug-target interactions, and shorten the drug development cycle. The model uses two-dimensional graph-structured data reconstructed from SMILES strings, which can express more complete information about the drug, and this is why the method achieves better predictive performance.