Chapter 5: Deep Learning and Computational Chemistry
2022-07-25 12:51:00 【UniversalNature】
Reading notes on 《Artificial Intelligence in Drug Design》
1.Introduction
In the classic Corwin Hansch article it was illustrated that, in general, biological activity for a group of “congeneric” chemicals can be described by a comprehensive model:

$$\log 1/C_{50} = a\pi + b\varepsilon + cS + d$$
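As a rough illustration of how such a model is fitted, the sketch below estimates the coefficients a, b, c, and d by ordinary least squares; the descriptor values and activities are made up for the example and are not taken from the chapter.

```python
import numpy as np

# Hypothetical descriptor values for a small congeneric series:
# pi (hydrophobicity), eps (electronic term), S (steric term),
# and measured log 1/C50 activities. Values are illustrative only.
pi  = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
eps = np.array([0.1, -0.2, 0.3, 0.0, -0.1, 0.2])
S   = np.array([1.2, 1.0, 0.8, 0.9, 1.1, 0.7])
y   = np.array([2.1, 2.6, 3.4, 3.8, 4.2, 4.9])   # log 1/C50

# Design matrix with a column of ones for the intercept d.
X = np.column_stack([pi, eps, S, np.ones_like(pi)])

# Ordinary least-squares fit of log 1/C50 = a*pi + b*eps + c*S + d
(a, b, c, d), *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"a={a:.3f}, b={b:.3f}, c={c:.3f}, d={d:.3f}")
```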
Deep learning is a particular kind of machine learning that overcomes these difficulties by representing the world as a nested hierarchy of concepts, with each concept defined in relation to simpler concepts, and more abstract representations computed in terms of less abstract ones.


1.1.A Brief History of AI
The origins of the field can be traced as far back as the 1940s with the advent of the McCulloch-Pitts neuron as an early model of brain function.
In the 1950s, the perceptron became the first model that could learn its weights from the input.
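For illustration only, here is a minimal sketch of the classic perceptron learning rule on a toy, linearly separable dataset; the data and learning rate are arbitrary choices, not taken from the chapter.

```python
import numpy as np

# Toy linearly separable data: two features, labels in {-1, +1}.
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1, 1, -1, -1])

w = np.zeros(2)   # weights, learned from the input
b = 0.0           # bias
lr = 0.1          # learning rate

# Perceptron rule: update weights only on misclassified examples.
for _ in range(20):
    for xi, yi in zip(X, y):
        if yi * (xi @ w + b) <= 0:
            w += lr * yi * xi
            b += lr * yi

print("weights:", w, "bias:", b)
```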
The inherent limitations of the early methods eventually became apparent: expectations had been raised far beyond the reality, and progress was much slower than anticipated. This led to the first “AI winter”, a period in which general interest in the field decreased dramatically, as did the funding.
An approach termed connectionism gathered speed in the 1980s. The central idea in connectionism is that a large number of simple computational units can achieve intelligent behavior when networked together.
The second wave of neural network research lasted until the mid-1990s. Expectations were, once again, raised too high and unrealistically ambitious claims were made while seeking investment. When AI research did not fulfil these expectations, investors and the public were disappointed. At the same time, other fields of machine learning made advances. Kernel machines and graphical models both achieved good results on many important tasks. These two factors led to a decline in the popularity of neural networks that lasted until the first decade of the twenty-first century.
The third wave of neural network research began with a breakthrough in 2006.
That deep learning models can achieve good performance has been known for some time and such models have been successfully used in commercial applications since the 1990s.
As more and more of our activities take place on computers, greater volumes of data are recorded every day leading to the current age of “big data.”

- Since the introduction of hidden layers, artificial neural networks have doubled in size roughly every 2.4 years.
- Chellapilla et al. proposed three novel methods to speed up deep convolutional neural networks: unrolling convolutions, using basic linear algebra subroutines (BLAS), and using graphics processing units (GPUs); see the sketch after this list.
- The recent rise of deep learning has also been greatly facilitated by various algorithmic developments.
- Last but not least, the increased availability and usability of software and documentation for training neural networks is another reason for the rapid adoption of deep learning in recent years.
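A minimal numpy sketch of the "unrolling convolutions" idea mentioned above (often called im2col): every image patch is unrolled into the row of a matrix, so the convolution becomes a single matrix multiplication, which is exactly the kind of operation that BLAS routines and GPUs accelerate. The image and kernel values below are arbitrary.

```python
import numpy as np

def im2col(img, k):
    """Unroll all k x k patches of a 2-D image into the rows of a matrix."""
    h, w = img.shape
    rows = []
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            rows.append(img[i:i + k, j:j + k].ravel())
    return np.array(rows)                      # (num_patches, k*k)

# A 2-D convolution (correlation) expressed as one matrix multiplication.
img    = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.array([[1., 0., -1.], [1., 0., -1.], [1., 0., -1.]])

patches = im2col(img, 3)                       # (16, 9)
out = (patches @ kernel.ravel()).reshape(4, 4) # same result as sliding the kernel

# Check against a direct sliding-window implementation.
direct = np.array([[np.sum(img[i:i+3, j:j+3] * kernel) for j in range(4)]
                   for i in range(4)])
assert np.allclose(out, direct)
```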
2.Deep Learning Applications in Computational Chemistry
2.1.QSAR
- Perhaps the first application of deep learning to QSAR was during the Merck challenge in 2012.
- Random forest models were preferred in this case due to the lower computational cost of training and increased model interpretability.
- When trained on the same datasets and descriptors, DNN predictions are frequently similar to those of other methods in terms of their practical utility. This observation is not surprising: while algorithmic improvements may result in slightly better statistics, the overall quality of any model is still bound by the existence of an actual relationship between the modeled property and the features used to describe the molecules.
- It has further been suggested that the improvement relies on the training sets for the activities sharing similar compounds and features, and there being significant correlations between those activities.
- The (2-dimensional) structure of a molecule naturally forms a graph, which makes a class of deep learning techniques known as graph convolutional neural networks (GCNNs) a logical method choice in chemistry.
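To make the graph view concrete, here is a minimal numpy sketch of a single graph-convolution (neighborhood-aggregation) step on a toy four-atom molecule; the adjacency matrix, atom features, and random weights are illustrative and do not correspond to any specific published GCNN architecture.

```python
import numpy as np

# Toy "molecule": a 4-atom chain (atoms 0-1-2-3), encoded as an adjacency
# matrix, plus a small per-atom feature vector (illustrative values).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.array([[1.0, 0.0],     # atom features, e.g., a one-hot element type
              [0.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])

# One graph-convolution layer: each atom aggregates its neighbors (plus
# itself), applies a learned linear map W, then a nonlinearity.
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 4))            # (in_features, out_features)

A_hat = A + np.eye(4)                  # add self-loops
deg = A_hat.sum(axis=1, keepdims=True)
H = np.maximum((A_hat / deg) @ X @ W, 0.0)   # mean aggregation + ReLU

# A molecule-level representation for property prediction: sum over atoms.
mol_vector = H.sum(axis=0)
print(mol_vector)
```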
2.2.Generative Modeling
- Genetic algorithms are a popular choice for global optimization and have been applied in the chemistry domain.
- The three main deep learning approaches revolve around variational autoencoders (VAEs), reinforcement learning (RL) and generative adversarial networks (GANs). Recently, graph convolutional networks have also been applied to this problem.
- One of the seminal demonstrations of this method is the work of Gómez-Bombarelli et al., which used an autoencoder with a latent space that was optimized by an additional network to reflect a particular property.
- Convergence for GANs is not straightforward and can suffer from several issues, including mode collapse and overwhelming of the generator by the discriminator during training.
- Recently, DeepSMILES and SELFIES representations have also been developed in order to overcome some of the limitations of the SMILES syntax in the context of deep learning.
- RL considers the generator as an agent that must learn how to take actions (add characters) within an environment or task (SMILES generation) to maximize some notion of reward (properties); see the sketch after this list.
- Generally speaking, VAE and RNN approaches require large volumes of data to train, as they model distributions. Usually ChEMBL or ZINC, both containing more than a million small molecules, are used to derive these models.
- To complicate the matter further, as experimental data is expensive and time-consuming to gather, the fitness of the generated molecules is usually assessed via QSAR models. A recent study used as many as 11 QSAR models to score and optimize compounds. Building good-quality QSAR models is not trivial and requires high-quality experimental data. To circumvent that, most published work uses easily calculable properties like AlogP, molecular weight, etc. While demonstrating that the models work in principle, such experiments have little practical application.
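As a toy illustration of the RL framing described in the list above (the generator is an agent, actions append characters, and the reward is some property of the finished string), the snippet below runs a REINFORCE-style update on a single table of character logits instead of a real recurrent generator, with a deliberately simple reward (the count of aromatic-carbon characters 'c'). Everything here is illustrative rather than a published method.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = list("Cc=O(N)1")        # tiny, SMILES-like character set (illustrative)
MAX_LEN = 8

# "Policy": one logit per character (a stand-in for a recurrent generator).
logits = np.zeros(len(VOCAB))

def sample_string():
    """Sample a string and remember which actions (characters) were taken."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    idx = rng.choice(len(VOCAB), size=MAX_LEN, p=probs)
    return "".join(VOCAB[i] for i in idx), idx

def reward(s):
    """Toy reward: number of aromatic carbons 'c' in the string."""
    return s.count("c")

lr = 0.1
for step in range(200):
    s, actions = sample_string()
    r = reward(s)
    probs = np.exp(logits - logits.max()); probs /= probs.sum()
    # REINFORCE update: push up the log-probability of the taken actions,
    # weighted by the reward of the finished string.
    for a in actions:
        grad = -probs
        grad[a] += 1.0
        logits += lr * r * grad / MAX_LEN

print(sample_string()[0])       # after training, samples should be rich in 'c'
```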