2022 ICLR | CONTRASTIVE LEARNING OF IMAGE- AND STRUCTURE-BASED REPRESENTATIONS IN DRUG DISCOVERY
2022-06-13 04:30:00 【Dazed flounder】

CLOOME: A molecular representation tool based on multimodal contrastive learning
This paper, by Ana Sanchez-Fernandez and colleagues at Johannes Kepler University Linz, was recently published at ICLR 2022. Its main points are as follows. The contrastive learning methods CLIP and CLOOB have shown that representations learned from multimodal data transfer well to a large number of different tasks. In drug discovery, microscopy images of molecule-treated cells and the chemical structures of those molecules form a similar multimodal dataset, but contrastive learning across the two had not yet been studied. Such an approach is valuable in drug discovery, where labels are expensive. This work therefore starts from readily available microscopy images and molecular structures and proposes CLOOME (Contrastive Leave One Out Boost for Molecule Encoders), a new contrastive learning method based on CLOOB (Contrastive Leave One Out Boost). Linear probing on molecular activity prediction tasks shows that the learned representations are transferable. In addition, the representations can be used for bioisosteric replacement tasks.
Method
This work uses contrastive learning to learn molecular representations from microscopy images and chemical structure data, yielding a highly transferable molecular encoder (see Figure 1). Compared with traditional molecular encoders or hand-crafted molecular features, CLOOME's main innovation is that it can optimize the molecular representation without requiring activity data or human prior knowledge.
The training data consist of $N$ pairs, each made up of a microscopy image of cells perturbed by a molecule and the chemical structure of that molecule: $\{(x_1,z_1),\ldots,(x_N,z_N)\}$. An adaptive image encoder $h^x(\cdot)$ and an adaptive structure encoder $h^z(\cdot)$ map images and chemical structures to the embeddings $x_n=h^x(x_n)$ and $z_n=h^z(z_n)$ (the same symbols are reused for the inputs and their embeddings). As shown in Figure 1(a), the stacked microscopy image embeddings (the features produced by the image encoder) are written as $X=(x_1,\ldots,x_N)$, and the stacked structure embeddings as $Z=(z_1,\ldots,z_N)$. The goal of contrastive learning is to increase the similarity of matched pairs and decrease the similarity of unmatched pairs. This is usually done by minimizing the InfoNCE loss, which corresponds to maximizing the mutual information between the embeddings:
$$
L_{\mathrm{InfoNCE}} = -\frac{1}{N}\sum_{i=1}^{N} \ln \frac{\exp(\tau^{-1} x_i^T z_i)}{\sum_{j=1}^{N} \exp(\tau^{-1} x_i^T z_j)} - \frac{1}{N}\sum_{i=1}^{N} \ln \frac{\exp(\tau^{-1} x_i^T z_i)}{\sum_{j=1}^{N} \exp(\tau^{-1} x_j^T z_i)}
$$
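The InfoNCE loss above can be sketched in a few lines of plain Python. This is only an illustration of the formula, not the authors' implementation: the function names `dot`, `log_sum_exp` and `info_nce`, the list-of-lists data layout, and the default temperature `tau=0.1` are all assumptions made for this example.

```python
import math

def dot(a, b):
    # Inner product x^T z of two equal-length vectors.
    return sum(p * q for p, q in zip(a, b))

def log_sum_exp(values):
    # Numerically stable ln(sum(exp(v))) for a list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def info_nce(X, Z, tau=0.1):
    """Symmetric InfoNCE loss for paired embeddings X[i] <-> Z[i].

    X and Z are lists of N vectors (hypothetical layout for this sketch).
    """
    n = len(X)
    # sim[i][j] = tau^{-1} * x_i^T z_j
    sim = [[dot(x, z) / tau for z in Z] for x in X]
    loss = 0.0
    for i in range(n):
        # First term: fixed image x_i, denominator runs over all structures z_j.
        row = log_sum_exp([sim[i][j] for j in range(n)])
        # Second term: fixed structure z_i, denominator runs over all images x_j.
        col = log_sum_exp([sim[j][i] for j in range(n)])
        loss += (row - sim[i][i]) + (col - sim[i][i])
    return loss / n

# Usage: correctly paired embeddings give a lower loss than shuffled pairs.
X = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(X, X)
shuffled = info_nce(X, [X[1], X[0]])
print(aligned < shuffled)
```

Each term is computed as a log-sum-exp minus the matched-pair similarity, which is the negative logarithm of the corresponding fraction in the formula.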
However, the InfoNCE loss tends to over-represent some features while ignoring others. This work therefore builds on CLOOB to improve the contrastive learning.
CLOOB method. First, embeddings are retrieved from the stored image embeddings $U$ and the stored structure embeddings $V$. $U_{x_i}$ and $U_{z_i}$ denote the embeddings retrieved from $U$ with the image embedding $x_i$ and the structure embedding $z_i$ as queries, and $V_{x_i}$ and $V_{z_i}$ are defined in the same way for $V$. As in CLOOB, the retrieval is performed with modern Hopfield networks:
$$
\begin{aligned}
U_{x_i} &= U\,\mathrm{softmax}(\beta U^T x_i) \\
V_{x_i} &= V\,\mathrm{softmax}(\beta V^T x_i) \\
U_{z_i} &= U\,\mathrm{softmax}(\beta U^T z_i) \\
V_{z_i} &= V\,\mathrm{softmax}(\beta V^T z_i)
\end{aligned}
$$
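The retrieval step in these four equations is a softmax-weighted average of the stored embeddings. The following minimal sketch shows this for one query. The names `softmax` and `hopfield_retrieve`, the representation of $U$ or $V$ as a list of stored vectors, and the value of `beta` are assumptions for illustration, and any further processing that CLOOB or CLOOME applies to the retrieved vectors is not shown.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hopfield_retrieve(stored, query, beta=8.0):
    """Compute stored * softmax(beta * stored^T query).

    `stored` is a list of N vectors (the columns of U or V);
    `query` is one embedding (x_i or z_i). Hypothetical helper.
    """
    # One attention weight per stored embedding.
    weights = softmax([beta * sum(s * q for s, q in zip(vec, query))
                       for vec in stored])
    dim = len(query)
    # Weighted average of the stored embeddings.
    return [sum(w * vec[k] for w, vec in zip(weights, stored))
            for k in range(dim)]

# Usage: the same function covers all four cases by changing the arguments.
U = [[1.0, 0.0], [0.0, 1.0]]   # stored image embeddings (toy values)
x_i = [1.0, 0.0]               # an image embedding used as the query
U_x_i = hopfield_retrieve(U, x_i, beta=8.0)
print(U_x_i)
```

A large `beta` makes the retrieval concentrate on the stored embedding most similar to the query, while `beta = 0` returns the plain average of all stored embeddings.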
The InfoLOOB loss is then used as the objective function.
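For reference, the InfoLOOB objective as it is defined in CLOOB can be written with the retrieved embeddings as follows. This is a reconstruction under the assumption that CLOOME adopts the CLOOB objective unchanged, with $\tau$ playing the same role as in the InfoNCE loss:

$$
L_{\mathrm{InfoLOOB}} = -\frac{1}{N}\sum_{i=1}^{N} \ln \frac{\exp(\tau^{-1} U_{x_i}^T U_{z_i})}{\sum_{j \neq i}^{N} \exp(\tau^{-1} U_{x_i}^T U_{z_j})} - \frac{1}{N}\sum_{i=1}^{N} \ln \frac{\exp(\tau^{-1} V_{x_i}^T V_{z_i})}{\sum_{j \neq i}^{N} \exp(\tau^{-1} V_{x_j}^T V_{z_i})}
$$

The difference from InfoNCE is the "leave one out" denominator: the sums run over $j \neq i$, so the matched pair appears only in the numerator.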
Microscopy images differ from natural images in several ways. For example, the staining determines the number of image channels. All experiments in the paper use a ResNet-50 with 5 input channels as the image encoder, and the microscopy images are downscaled to 320×320.
For the molecular structure encoder, CLOOME uses a descriptor-based fully connected network. Alternatively, graph neural networks with a suitable pooling operation, message-passing neural networks, or sequence-based neural networks could also serve as the structure encoder.
Results

Figure 2. Examples of retrieval task results. Given a microscopy image, CLOOME can retrieve the molecular structure that corresponds to it from a set of candidate structures (the blue box in the figure marks the matched structure). CLOOME can also be used to retrieve molecules that produce similar biological effects on treated cells, that is, bioisosteres.
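The retrieval task described in the caption amounts to ranking candidate structure embeddings by their similarity to an image embedding. A minimal sketch, assuming cosine similarity as the ranking score and non-zero embedding vectors; the names `cosine` and `retrieve_top_k` and the toy vectors are hypothetical and not taken from the paper:

```python
import math

def cosine(a, b):
    # Cosine similarity of two non-zero vectors of equal length.
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return num / den

def retrieve_top_k(image_embedding, structure_embeddings, k=1):
    """Return the indices of the k candidate structures most similar to the image."""
    ranked = sorted(
        range(len(structure_embeddings)),
        key=lambda j: cosine(image_embedding, structure_embeddings[j]),
        reverse=True,  # highest similarity first
    )
    return ranked[:k]

# Usage: three toy candidate structure embeddings and one image embedding.
candidates = [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]
query = [1.0, 0.0]
print(retrieve_top_k(query, candidates, k=2))
```

The same ranking can be run with a structure embedding as the query against other structure embeddings, which is one way to look for molecules with similar effects on cells.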