Chapter 5: Deep Learning and Computational Chemistry
2022-07-25 12:22:00 【UniversalNature】
Reading notes on 《Artificial Intelligence in Drug Design》
1.Introduction
In the classic article by Corwin Hansch, it was illustrated that, in general, the biological activity of a group of “congeneric” chemicals can be described by a comprehensive model: $\log 1/C_{50} = a\pi + b\varepsilon + cS + d$
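As a minimal sketch of how such a model is used, the snippet below evaluates the equation for a single compound. It assumes the usual reading of the terms (π as a hydrophobic contribution, ε as an electronic one, S as a steric one, and a base-10 logarithm of the inverse concentration giving a 50% effect). The function names and every numeric value are hypothetical, chosen for illustration and not fitted to any data.

```python
def hansch_log_inv_c50(pi, eps, steric, a, b, c, d):
    """Return log10(1/C50) from the linear Hansch-type model."""
    return a * pi + b * eps + c * steric + d


def c50_from_log_inv(log_inv_c50):
    """Invert log10(1/C50) to recover C50 (assumed to be a molar concentration)."""
    return 10.0 ** (-log_inv_c50)


# Hypothetical coefficients and descriptor values (illustration only).
coeffs = {"a": 0.5, "b": -1.0, "c": 0.25, "d": 3.0}
log_inv = hansch_log_inv_c50(pi=1.0, eps=0.5, steric=2.0, **coeffs)
c50 = c50_from_log_inv(log_inv)
print(f"log(1/C50) = {log_inv:.2f}, C50 = {c50:.2e} M")
```

In a real QSAR study the coefficients a, b, c and d would be obtained by regression against measured activities for the congeneric series.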
Deep learning is a particular kind of machine learning that overcomes these difficulties by representing the world as a nested hierarchy of concepts, with each concept defined in relation to simpler concepts, and more abstract representations computed in terms of less abstract ones.


1.1.A Brief History of AI
The origins of the field can be traced as far back as the 1940s with the advent of the McCulloch-Pitts neuron as an early model of brain function.
In the 1950s, the perceptron became the first model that could learn its weights from the input.
The inherent limitations of the early methods eventually became apparent: expectations had been raised far beyond reality, and progress was much slower than anticipated. This led to the first “AI winter”, a period in which general interest in the field decreased dramatically, as did the funding.
An approach termed connectionism gathered speed in the 1980s. The central idea in connectionism is that a large number of simple computational units can achieve intelligent behavior when networked together.
The second wave of neural network research lasted until the mid-1990s. Expectations were, once again, raised too high and unrealistically ambitious claims were made while seeking investment. When AI research did not fulfil these expectations, investors and the public were disappointed. At the same time, other fields of machine learning made advances. Kernel machines and graphical models both achieved good results on many important tasks. These two factors led to a decline in the popularity of neural networks that lasted until the first decade of the twenty-first century.
The third wave of neural network research began with a breakthrough in 2006.
That deep learning models can achieve good performance has been known for some time and such models have been successfully used in commercial applications since the 1990s.
As more and more of our activities take place on computers, greater volumes of data are recorded every day, leading to the current age of “big data.”

- Since the introduction of hidden layers, artificial neural networks have doubled in size roughly every 2.4 years.
- Chellapilla et al. proposed three novel methods to speed up deep convolutional neural networks: unrolling convolutions, using basic linear algebra subroutines (BLAS), and using graphics processing units (GPUs).
- The recent rise of deep learning has also been greatly facilitated by various algorithmic developments.
- Last but not least, the increased availability and usability of software and documentation for training neural networks is another reason for the rapid adoption of deep learning in recent years.
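The "unrolling convolutions" idea mentioned in the list above can be illustrated with a small sketch: every image patch is copied into one row of a matrix, so that the convolution becomes a single matrix product of the kind BLAS routines and GPUs handle efficiently. This is a plain-Python illustration under simplifying assumptions (one channel, stride 1, no padding, cross-correlation as is conventional in deep learning), not the implementation by Chellapilla et al.; all function names are hypothetical.

```python
def im2col(image, kh, kw):
    """Unroll every kh x kw patch of a 2-D image into one row."""
    rows, cols = len(image), len(image[0])
    patches = []
    for i in range(rows - kh + 1):
        for j in range(cols - kw + 1):
            patches.append([image[i + di][j + dj]
                            for di in range(kh) for dj in range(kw)])
    return patches


def conv2d_unrolled(image, kernel):
    """Valid cross-correlation computed as (patch matrix) x (flattened kernel)."""
    kh, kw = len(kernel), len(kernel[0])
    flat_kernel = [w for row in kernel for w in row]
    out_w = len(image[0]) - kw + 1
    flat_out = [sum(p * w for p, w in zip(patch, flat_kernel))
                for patch in im2col(image, kh, kw)]
    # Reshape the flat result back into a 2-D output map.
    return [flat_out[r:r + out_w] for r in range(0, len(flat_out), out_w)]


def conv2d_direct(image, kernel):
    """Reference implementation with explicit nested loops."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(len(image[0]) - kw + 1)]
            for i in range(len(image) - kh + 1)]


image = [[1, 2, 3, 0],
         [4, 5, 6, 1],
         [7, 8, 9, 2]]
kernel = [[1, 0],
          [0, -1]]
unrolled = conv2d_unrolled(image, kernel)
direct = conv2d_direct(image, kernel)
print(unrolled == direct)
```

The unrolled patch matrix duplicates overlapping pixels, so the approach trades extra memory for the ability to reuse a highly optimized matrix multiplication.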
2.Deep Learning Applications in Computational Chemistry
2.1.QSAR
- Perhaps the first application of deep learning to QSAR was during the Merck challenge in 2012.
- Random forest models were preferred in this case due to lower computational cost of training and the increased model interpretability.
- When trained on the same datasets and descriptors, DNN predictions are frequently similar to those of other methods in terms of their practical utility. This observation is not surprising: while algorithmic improvements may yield slightly better statistics, the overall quality of any model is still bound to the existence of an actual relationship between the modeled property and the features used to describe the molecules.
- It has further been suggested that the improvement seen when several activities are modeled together relies on the training sets for those activities sharing similar compounds and features, and on there being significant correlations between the activities.
- The (2-dimensional) structure of a molecule naturally forms a graph, which makes a class of deep learning techniques known as graph convolutional neural networks (GCNNs) a logical method choice in chemistry.
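To make the last point concrete, here is a minimal sketch of a molecule treated as a graph and passed through one graph-convolution step. It assumes a simplified layer (mean aggregation over each atom and its bonded neighbours, a weight matrix, then ReLU), which is a reduced variant of what published GCNNs use. The element vocabulary, the weight matrix and the function names are hypothetical.

```python
# Heavy-atom graph of ethanol (SMILES: CCO): atoms 0=C, 1=C, 2=O.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]

# One-hot atom features over a hypothetical element vocabulary [C, O].
vocab = ["C", "O"]
features = [[1.0 if a == v else 0.0 for v in vocab] for a in atoms]


def neighbours(n_atoms, bonds):
    """Adjacency lists with a self-loop so each atom keeps its own features."""
    adj = {i: [i] for i in range(n_atoms)}
    for i, j in bonds:
        adj[i].append(j)
        adj[j].append(i)
    return adj


def graph_conv(features, adj, weight):
    """One layer: mean-aggregate neighbour features, apply weights, then ReLU."""
    out = []
    for i in range(len(features)):
        nbrs = adj[i]
        agg = [sum(features[j][k] for j in nbrs) / len(nbrs)
               for k in range(len(features[0]))]
        hidden = [sum(agg[k] * weight[k][m] for k in range(len(agg)))
                  for m in range(len(weight[0]))]
        out.append([max(0.0, h) for h in hidden])
    return out


# Hypothetical 2x2 weight matrix; in a real model this is learned.
weight = [[1.0, 0.0],
          [0.0, 1.0]]
adj = neighbours(len(atoms), bonds)
hidden = graph_conv(features, adj, weight)
# Sum-pool the per-atom vectors into one molecule-level vector.
molecule_vector = [sum(col) for col in zip(*hidden)]
print(hidden, molecule_vector)
```

After one layer, each atom's vector already reflects its bonded environment: the central carbon mixes carbon and oxygen features, while the terminal carbon sees only carbon. Stacking layers widens that neighbourhood by one bond per layer.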
2.2.Generative Modeling
- Genetic algorithms are a popular choice for global optimization and have been applied in the chemistry domain.
- The three main deep learning approaches revolve around variational autoencoders (VAEs), reinforcement learning (RL) and generative adversarial networks (GANs). Recently, graph convolutional networks have also been applied to this problem.
- One of the seminal demonstrations of this method is the work of Gómez-Bombarelli et al., which used an autoencoder with a latent space that was optimized by an additional network to reflect a particular property.
- Convergence for GANs is not straightforward and can suffer from several issues, including mode collapse and overwhelming of the generator by the discriminator during training.
- Recently, DeepSMILES and SELFIES representations have also been developed in order to overcome some of the limitations of the SMILES syntax in the context of deep learning.
- RL considers the generator as an agent that must learn how to take actions (add characters) within an environment or task (SMILES generation) to maximize some notion of reward (properties).
- Generally speaking, VAE and RNN approaches require large volumes of data to train, as they model distributions. Usually ChEMBL or ZINC, both containing more than a million small molecules, are used to derive those models.
- To complicate the matter further, because experimental data is expensive and time-consuming to gather, the fitness of the generated molecules is usually assessed via QSAR models. A recent study used as many as 11 QSAR models to score and optimize compounds. Building good-quality QSAR models is not trivial and requires the availability of high-quality experimental data. To circumvent that, most published work uses easily calculable properties like AlogP, molecular weight, etc. While demonstrating that the models work in principle, those experiments have little practical application.
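The reinforcement learning framing in the list above (the generator as an agent, adding a character as an action, a property as the reward) can be sketched with a toy REINFORCE loop. Everything here is an assumption made for illustration: the three-token vocabulary, the fixed string length, the state-independent policy, and above all the reward, which is a trivially calculable stand-in (fraction of `N` characters) in place of a real property or QSAR score. The generated strings are not guaranteed to be valid SMILES.

```python
import math
import random

TOKENS = ["C", "N", "O"]   # toy action space; real SMILES vocabularies are larger
LENGTH = 5                 # fixed string length, an assumption for simplicity


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reward(string):
    """Hypothetical stand-in for a property score: fraction of 'N' characters."""
    return string.count("N") / len(string)


def train(episodes=300, lr=0.5, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * len(TOKENS)   # state-independent policy parameters
    baseline = 0.0                 # running mean reward, reduces variance
    for _ in range(episodes):
        probs = softmax(logits)
        # The agent acts: one character is appended per step.
        actions = rng.choices(range(len(TOKENS)), weights=probs, k=LENGTH)
        string = "".join(TOKENS[a] for a in actions)
        advantage = reward(string) - baseline
        # REINFORCE: d log p(a) / d logit_k = 1[a == k] - p_k, summed over steps.
        for k in range(len(TOKENS)):
            grad = sum((1.0 if a == k else 0.0) - probs[k] for a in actions)
            logits[k] += lr * advantage * grad
        baseline += 0.1 * (reward(string) - baseline)
    return softmax(logits)


final_probs = train()
print(dict(zip(TOKENS, (round(p, 3) for p in final_probs))))
```

The policy drifts toward whatever the reward favours, which is also why the choice of scoring function matters so much: an agent optimized against an easily calculable proxy learns the proxy, not the property of practical interest.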