2022 ICLR | CONTRASTIVE LEARNING OF IMAGE- AND STRUCTURE BASED REPRESENTATIONS IN DRUG DISCOVERY
2022-06-13 04:30:00 【Dazed flounder】

CLOOME: a molecular representation tool based on multimodal contrastive learning
This paper, by Ana Sanchez-Fernandez and colleagues at Johannes Kepler University Linz, was recently published at ICLR 2022. Its main points are as follows. The contrastive learning methods CLIP and CLOOB have shown that representations learned from multimodal data transfer well to a wide range of tasks. In drug discovery, microscopy images of molecule-treated cells and the molecules' chemical structures form a comparable multimodal dataset, but contrastive learning had not yet been applied to this pairing, even though it would be valuable in a field where labels are expensive to obtain. Starting from readily available microscopy images and chemical structures, the paper proposes CLOOME (Contrastive Leave One Out Boost for Molecule Encoders), a new contrastive learning method based on CLOOB (Contrastive Leave One Out Boost). Linear probing on molecular activity prediction tasks shows that the learned representations are transferable; the representations can also be used for bioisosteric replacement tasks.
Method
This work learns molecular representations by contrastive learning on microscopy images and chemical structures of molecules, yielding a highly transferable molecular encoder (see Figure 1). Compared with conventional molecular encoders or hand-crafted molecular features, CLOOME's main innovation is that it optimizes the molecular representation without requiring activity data or human prior knowledge.
The training data consist of $N$ pairs of microscopy images of molecule-perturbed cells and the corresponding chemical structures: $\{(x_1,z_1),\ldots,(x_N,z_N)\}$. An adaptive image encoder $h^x(\cdot)$ and an adaptive structure encoder $h^z(\cdot)$ map images and chemical structures to the embeddings $x_n=h^x(x_n)$ and $z_n=h^z(z_n)$; from here on, $x_n$ and $z_n$ denote these embeddings. As shown in Figure 1(a), the stacked microscopy image embeddings (the features produced by the image encoder) are written $X=(x_1,\ldots,x_N)$, and the stacked structure embeddings are written $Z=(z_1,\ldots,z_N)$. The goal of contrastive learning is to increase the similarity of matched pairs and decrease the similarity of unmatched pairs. This is usually achieved by minimizing the InfoNCE loss, which maximizes a lower bound on the mutual information between the embeddings:
$$L_{\mathrm{InfoNCE}} = -\frac{1}{N}\sum_{i=1}^{N}\ln\frac{\exp(\tau^{-1}x_i^Tz_i)}{\sum_{j=1}^{N}\exp(\tau^{-1}x_i^Tz_j)} - \frac{1}{N}\sum_{i=1}^{N}\ln\frac{\exp(\tau^{-1}x_i^Tz_i)}{\sum_{j=1}^{N}\exp(\tau^{-1}x_j^Tz_i)}$$
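To make the two terms of the InfoNCE loss concrete, here is a minimal pure-Python sketch. The function names (`dot`, `log_sum_exp`, `info_nce`) and the default temperature `tau=0.1` are illustrative assumptions, not taken from the paper, and embeddings are plain lists of floats.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def log_sum_exp(values):
    """Numerically stable ln(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def info_nce(X, Z, tau=0.1):
    """Symmetric InfoNCE loss over N paired embeddings X[i] <-> Z[i]."""
    n = len(X)
    # sim[i][j] = x_i^T z_j / tau
    sim = [[dot(x, z) / tau for z in Z] for x in X]
    loss = 0.0
    for i in range(n):
        row = sim[i]                         # x_i against every z_j (first term)
        col = [sim[j][i] for j in range(n)]  # every x_j against z_i (second term)
        loss -= (sim[i][i] - log_sum_exp(row)) / n
        loss -= (sim[i][i] - log_sum_exp(col)) / n
    return loss
```

When all embeddings are identical, every pair looks equally plausible and the loss equals $2\ln N$; well-separated matched pairs drive it toward zero.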
However, the InfoNCE loss tends to over-represent some features while ignoring others. This work therefore builds on CLOOB to improve the contrastive learning.
The CLOOB-based method. First, embeddings are retrieved from the stored image embeddings $U$ and the stored structure embeddings $V$. For example, $U_{x_i}$ and $U_{z_i}$ are the embeddings retrieved from $U$ when an image embedding $x_i$ and a structure embedding $z_i$, respectively, are used as the query. As in CLOOB, the retrieval is performed with modern Hopfield networks:
$$\begin{aligned} U_{x_i} &= U\,\mathrm{softmax}(\beta U^T x_i) \\ V_{x_i} &= V\,\mathrm{softmax}(\beta V^T x_i) \\ U_{z_i} &= U\,\mathrm{softmax}(\beta U^T z_i) \\ V_{z_i} &= V\,\mathrm{softmax}(\beta V^T z_i) \end{aligned}$$
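Each retrieval above is a softmax-weighted average of the stored patterns. A minimal pure-Python sketch follows. The names `softmax` and `hopfield_retrieve` and the default `beta=8.0` are assumptions made for illustration. The stored patterns are passed as a list of $N$ vectors (the columns of $U$ or $V$).

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hopfield_retrieve(stored, query, beta=8.0):
    """Compute stored @ softmax(beta * stored^T query).

    `stored` is a list of N pattern vectors; the result is their
    weighted average, with weights given by similarity to `query`.
    """
    scores = [beta * sum(s * q for s, q in zip(pattern, query)) for pattern in stored]
    weights = softmax(scores)
    dim = len(query)
    return [sum(w * pattern[k] for w, pattern in zip(weights, stored)) for k in range(dim)]
```

With `beta=0` the result is the plain mean of the stored patterns; as `beta` grows, the retrieval approaches the single stored pattern most similar to the query. For instance, `hopfield_retrieve(U, z_i)` would correspond to $U_{z_i}$.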
The InfoLOOB loss is then used as the objective function.
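The objective itself is not reproduced in this post. As a hedged reconstruction, the InfoLOOB loss given in the CLOOB paper, rewritten in the notation used here, has the form

$$L_{\mathrm{InfoLOOB}} = -\frac{1}{N}\sum_{i=1}^{N}\ln\frac{\exp(\tau^{-1}U_{x_i}^TU_{z_i})}{\sum_{j\neq i}\exp(\tau^{-1}U_{x_i}^TU_{z_j})} - \frac{1}{N}\sum_{i=1}^{N}\ln\frac{\exp(\tau^{-1}V_{x_i}^TV_{z_i})}{\sum_{j\neq i}\exp(\tau^{-1}V_{x_j}^TV_{z_i})}$$

The difference from InfoNCE is that the matched pair is left out of the denominator ($j \neq i$). The sketch below assumes this form. The function names and `tau=0.1` are illustrative, the four arguments are lists of already retrieved embeddings, and any normalization of the embeddings is left out.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def log_sum_exp(values):
    """Numerically stable ln(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def info_loob(Ux, Uz, Vx, Vz, tau=0.1):
    """Leave-one-out contrastive loss over retrieved embeddings (needs N >= 2)."""
    n = len(Ux)
    loss = 0.0
    for i in range(n):
        # First term: anchor U_{x_i}; negatives are U_{z_j} with j != i.
        pos_u = dot(Ux[i], Uz[i]) / tau
        neg_u = [dot(Ux[i], Uz[j]) / tau for j in range(n) if j != i]
        # Second term: anchor V_{z_i}; negatives are V_{x_j} with j != i.
        pos_v = dot(Vx[i], Vz[i]) / tau
        neg_v = [dot(Vx[j], Vz[i]) / tau for j in range(n) if j != i]
        loss -= (pos_u - log_sum_exp(neg_u)) / n
        loss -= (pos_v - log_sum_exp(neg_v)) / n
    return loss
```

Because the positive pair no longer appears in the denominator, the loss is not bounded below by zero: it keeps decreasing as matched pairs become more similar relative to the unmatched ones.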
Microscopy images differ from natural images in several ways; for example, the staining determines the number of image channels. All experiments in the paper use a ResNet-50 with five input channels as the image encoder, and the microscopy images are reduced to 320×320 pixels.
For the molecular structure encoder, CLOOME uses a descriptor-based fully connected network. Alternatively, a graph neural network with a suitable pooling operation, a message-passing neural network, or a sequence-based neural network could serve as the structure encoder.
Results

Figure 2. Examples of retrieval task results. Given a microscopy image, CLOOME can retrieve the molecular structure corresponding to that image from a set of candidate structures (the blue box in the figure marks the matched structure). CLOOME can also be used to find molecules that produce similar biological effects on treated cells, i.e. bioisosteres.
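The retrieval task can be sketched as a nearest-neighbour search in the shared embedding space. The snippet below is a minimal illustration, not the paper's evaluation code: the use of cosine similarity and the names `cosine` and `retrieve_structures` are assumptions, and the embeddings are plain lists of floats.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return num / den

def retrieve_structures(image_embedding, structure_embeddings, top_k=1):
    """Indices of the top_k candidate structures most similar to the image embedding."""
    ranked = sorted(
        range(len(structure_embeddings)),
        key=lambda j: cosine(image_embedding, structure_embeddings[j]),
        reverse=True,
    )
    return ranked[:top_k]
```

The same ranking, applied with a molecule's embedding as the query instead of an image embedding, would give candidates with similar predicted biological effects.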