RDKit | Fragment decomposition of drug molecules
2022-06-21 07:33:00 【Dazed flounder】
In some drug discovery scenarios, cheminformatics works not only with whole drug molecules but also with so-called drug-like fragments. These fragments are extracted separately, and the features they share are collected for database construction or AI training.
For example, aspirin can be broken down into four common drug-like fragments: a benzene ring, a carboxyl group, an acetyl group, and a single oxygen atom.
The following code uses RDKit's BRICS algorithm. BRICS chooses the bonds to break based on common chemical reactions, so the resulting fragments are plausible from a synthetic chemistry standpoint.
Scheme 1
RDKit now offers a more concise approach, Scheme 1, added here as an update. It is shorter than Scheme 2:
from rdkit import Chem
from rdkit.Chem import BRICS

aspirin = Chem.MolFromSmiles('CC(=O)OC1=CC=CC=C1C(O)=O')
# Every keyword argument is written out explicitly so it can be discussed below
fragments = BRICS.BRICSDecompose(aspirin, allNodes=None, minFragmentSize=1,
                                 onlyUseReactions=None, silent=True,
                                 keepNonLeafNodes=False, singlePass=False,
                                 returnMols=False)
print(sorted(fragments))
# output: ['[1*]C(C)=O', '[16*]c1ccccc1[16*]', '[3*]O[3*]', '[6*]C(=O)O']
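To see where these cuts come from, it can help to list the bonds BRICS would cleave before any decomposition happens. The sketch below relies on `BRICS.FindBRICSBonds`, which yields each candidate bond as a pair of atom indices together with a pair of environment labels. The variable names and the print format are illustrative only.

```python
from rdkit import Chem
from rdkit.Chem import BRICS

aspirin = Chem.MolFromSmiles('CC(=O)OC1=CC=CC=C1C(O)=O')

# Each item has the form ((atom_idx_1, atom_idx_2), (label_1, label_2)).
brics_bonds = list(BRICS.FindBRICSBonds(aspirin))

for (a1, a2), (lab1, lab2) in brics_bonds:
    bond = aspirin.GetBondBetweenAtoms(a1, a2)
    sym1 = aspirin.GetAtomWithIdx(a1).GetSymbol()
    sym2 = aspirin.GetAtomWithIdx(a2).GetSymbol()
    # BRICS only proposes acyclic bonds, so "in ring" is expected to be False.
    print(f"bond {bond.GetIdx()}: {sym1}{a1}-{sym2}{a2}, "
          f"labels {lab1}/{lab2}, in ring: {bond.IsInRing()}")
```

For aspirin this should list three bonds, which is consistent with the four fragments in the output above. The labels are the same numbers that appear on the dummy atoms of the fragments, such as the 16 in '[16*]'.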

Explanation of the arguments
- allNodes: the node molecules to include have to be specified explicitly; this is relatively complex and generally not used;
- minFragmentSize: the minimum number of heavy atoms a fragment must contain. If it is set to 2 in this example, the ether fragment '[3*]O[3*]' is no longer split off; it stays joined to the benzene ring '[16*]c1ccccc1[16*]', giving '[3*]Oc1ccccc1[16*]';
- onlyUseReactions: BRICS determines the cleavage sites from reactions; this argument restricts which reactions are used for splitting, and is rarely needed;
- silent: if set to False, information about which reaction is used for each split is printed;
- keepNonLeafNodes: when set to True, the larger intermediate fragments that have not been completely split are returned as well;
- singlePass: when set to True, each returned fragment contains at most one cleavage site; for example, instead of '[16*]c1ccccc1[16*]' the results are '[16*]c1ccccc1C(=O)O' and '[3*]OC=O', which avoids the same fragment being cut by several reactions;
- returnMols: when set to True, the fragments are returned as RDKit Mol objects instead of SMILES strings.
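The effect of some of these arguments is easiest to see by calling the function a few times and comparing the results. The following is a minimal sketch rather than a full tour of the options: it varies only keepNonLeafNodes and returnMols on the aspirin example, and the variable names are illustrative.

```python
from rdkit import Chem
from rdkit.Chem import BRICS

aspirin = Chem.MolFromSmiles('CC(=O)OC1=CC=CC=C1C(O)=O')

# Default settings: only the fully split (leaf) fragments, as SMILES strings.
leaf_smiles = set(BRICS.BRICSDecompose(aspirin))

# keepNonLeafNodes=True: intermediate, partially split nodes are kept too.
all_nodes = set(BRICS.BRICSDecompose(aspirin, keepNonLeafNodes=True))

# returnMols=True: the same leaf fragments, but as Mol objects.
frag_mols = list(BRICS.BRICSDecompose(aspirin, returnMols=True))

print(len(leaf_smiles), "leaf fragments")
print(len(all_nodes), "nodes when intermediates are kept")
for frag in frag_mols:
    # Dummy atoms (atomic number 0) are the attachment points, not heavy atoms.
    heavy = sum(1 for atom in frag.GetAtoms() if atom.GetAtomicNum() > 0)
    print(Chem.MolToSmiles(frag), "heavy atoms:", heavy)
```

Working with Mol objects is convenient when the fragments are going to be processed further, for instance to count atoms or compute descriptors, because it avoids parsing the SMILES strings again.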
Scheme 2
This is more complicated than Scheme 1, but it lets you study and control more of the details:
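import numpy as np
from rdkit import Chem
from rdkit.Chem import BRICS


def mol_to_smiles(mol):
    # Helper not defined in the original post; assumed to wrap Chem.MolToSmiles
    return Chem.MolToSmiles(mol)


def fragment_recursive(mol, frags):
    try:
        # Each entry is ((atom1, atom2), (label1, label2))
        bonds = list(BRICS.FindBRICSBonds(mol))
        if len(bonds) == 0:
            # Nothing left to cut: the remainder is itself a fragment
            frags.append(mol_to_smiles(mol))
            return frags

        idxs, labs = list(zip(*bonds))
        bond_idxs = []
        for a1, a2 in idxs:
            bond = mol.GetBondBetweenAtoms(a1, a2)
            bond_idxs.append(bond.GetIdx())

        # Sort so that the bond with the lowest index is cut first
        order = np.argsort(bond_idxs).tolist()
        bond_idxs = [bond_idxs[i] for i in order]

        # Cut a single bond; the (0, 0) labels cap both ends with a plain '*'
        broken = Chem.FragmentOnBonds(mol,
                                      bondIndices=[bond_idxs[0]],
                                      dummyLabels=[(0, 0)])
        head, tail = Chem.GetMolFrags(broken, asMols=True)
        # print(mol_to_smiles(head), mol_to_smiles(tail))
        frags.append(mol_to_smiles(head))
        # Keep splitting the remaining part
        return fragment_recursive(tail, frags)
    except Exception as e:
        print(e)
        # Return what has been collected so far instead of None
        return frags


aspirin = Chem.MolFromSmiles('CC(=O)OC1=CC=CC=C1C(O)=O')
fragments = fragment_recursive(aspirin, [])
print(fragments)
# > output: ['*C(C)=O', '*O*', '*c1ccccc1*', '*C(=O)O']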
As you can see, the output fragments retain the sites where aspirin was cut, marked with the wildcard atom symbol *.
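The two schemes return the same four fragments and differ only in how the cut sites are written: Scheme 1 keeps a numeric BRICS label on each dummy atom, while Scheme 2 uses a bare *. The sketch below checks this at the text level with a regular expression. The helper names attachment_labels and strip_labels are hypothetical, and the substitution is plain string editing, not SMILES canonicalization.

```python
import re

# Matches a labelled dummy atom such as [16*] and captures its numeric label.
LABELLED_DUMMY = re.compile(r"\[(\d+)\*\]")


def attachment_labels(fragment_smiles):
    """Return the numeric labels found in a fragment SMILES, in order."""
    return [int(label) for label in LABELLED_DUMMY.findall(fragment_smiles)]


def strip_labels(fragment_smiles):
    """Replace every labelled dummy atom with a bare '*' (text-level only)."""
    return LABELLED_DUMMY.sub("*", fragment_smiles)


# Outputs copied from Scheme 1 and Scheme 2 above.
scheme1 = ['[1*]C(C)=O', '[16*]c1ccccc1[16*]', '[3*]O[3*]', '[6*]C(=O)O']
scheme2 = ['*C(C)=O', '*O*', '*c1ccccc1*', '*C(=O)O']

for smi in scheme1:
    print(smi, attachment_labels(smi), strip_labels(smi))

# True when both schemes describe the same fragments once labels are removed.
same_fragments = sorted(strip_labels(s) for s in scheme1) == sorted(scheme2)
print(same_fragments)
```

Keeping the labels is useful when the fragments are meant to be recombined later, since the label records what kind of bond was cut. Dropping them gives a simpler vocabulary when only the fragment structures matter.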