CADD Course Notes (6): Obtaining Existing Virtual Compound Libraries (DrugBank, ZINC)

2022-07-05 07:24:00 Stunned flounder (

Introduction to the DrugBank Database

DrugBank is a bioinformatics and cheminformatics database provided by the University of Alberta. It is a unique resource that combines detailed drug data with comprehensive drug target information.

The most recent DrugBank release (version 5.1.9, released 2022-01-03) contains 13,577 drug entries. These include 2,634 approved small-molecule drugs, 1,377 approved biotech (protein/peptide) drugs, 131 nutraceuticals, and 6,375 experimental drugs. In addition, 5,241 non-redundant protein (drug target/enzyme/transporter/carrier) sequences are linked to these drug entries. Each DrugCard entry contains more than 200 data fields: half are devoted to drug/chemical data, and the other half to drug target or protein data.

DrugBank's biggest strength is its support for comprehensive and complex searches. Combined with DrugBank's accompanying software tools, these searches let scientists easily retrieve entries, compare drug structures, match new drugs to targets, study drug mechanisms, and explore new drugs.

Introduction to the ZINC Database

ZINC is a free database of commercially available compounds for virtual screening. It contains more than 13 million purchasable compounds in 3D formats. ZINC is provided by the Shoichet Laboratory in the Department of Pharmaceutical Chemistry at the University of California, San Francisco (UCSF).

ZINC is a database of small-molecule structures. It contains a large number of small-molecule compounds that are available on the market, which makes testing drug properties very convenient in drug research and development: there is no need to design a synthetic route to obtain a compound before testing its activity. With advances in computing, more and more computer-aided drug design approaches have accelerated drug screening. After screening a large number of molecules in ZINC, the potentially active hits can be purchased directly from suppliers through the links ZINC provides, so that their in vitro activity can be determined quickly and conveniently.

The free ZINC database includes compound data from ChemBridge, Enamine, PubChem, and many other sources. You can download all of it for free, or download the data of a single supplier.

ZINC includes a fragment library, a generic drug library, a drug library, a natural products library, and others. The compound records carry information such as the supplier, the number of rotatable bonds, and the numbers of hydrogen-bond acceptors and donors. Users can download a specified subset according to their needs and run virtual screening on it.
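As a rough illustration of selecting a subset by the properties mentioned above, here is a minimal Python sketch. The `Compound` record, the identifiers, and the threshold values are hypothetical and chosen only for illustration; they are not ZINC's own schema or cutoffs.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Hypothetical record holding a few per-compound properties."""
    compound_id: str
    mol_weight: float
    logp: float
    rotatable_bonds: int
    hbond_donors: int
    hbond_acceptors: int


def passes_filter(c, max_mw=500.0, max_logp=5.0, max_rot=10,
                  max_hbd=5, max_hba=10):
    """Return True if every property is within the (illustrative) limits."""
    return (c.mol_weight <= max_mw
            and c.logp <= max_logp
            and c.rotatable_bonds <= max_rot
            and c.hbond_donors <= max_hbd
            and c.hbond_acceptors <= max_hba)


# Made-up example records, not real database entries.
compounds = [
    Compound("EXAMPLE-1", 320.4, 2.1, 4, 2, 5),
    Compound("EXAMPLE-2", 612.7, 6.3, 12, 4, 9),
    Compound("EXAMPLE-3", 450.0, 4.9, 10, 5, 10),
]

selected = [c.compound_id for c in compounds if passes_filter(c)]
print(selected)
```

The limits are passed as keyword arguments so that a different property window can be tried without changing the function.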


ZINC keeps growing. ZINC20 now includes 1.4 billion compounds, 1.3 billion of which come from 310 product catalogs of 150 companies. These compounds satisfy the 90/90/90 rule: more than 90% of the catalogs are updated every 90 days, and more than 90% of the compounds can be purchased. The new datasets include 10^10 molecules that have not been added to ZINC.

To study molecular diversity in the on-demand library and in physical screening collections, the authors ran analyses from two angles: scaffold diversity and molecular shape. They computed Bemis-Murcko scaffolds for the ZINC on-demand library (most of which comes from Enamine REAL) and for several other public physical screening libraries (the NIH small-molecule library MLSMR, the UCSF small-molecule library SMDC, and the Ro4 in-stock compounds in ZINC), and counted the number of compounds in each scaffold.

The results show that more than 97% of the compounds cannot be found in the ZINC in-stock set, and that the number of new scaffolds grows almost linearly with the number of molecules. While the number of scaffolds increases 16-fold, the number of molecules in the on-demand library is 88 times that of the ZINC in-stock set. After classifying the molecular shapes in each library with the NPMI method, the molecules in the on-demand library also turn out to be more structurally diverse than those in the physical screening libraries, with marked increases in the numbers of disc-like (such as a benzene ring) and sphere-like (such as adamantane) molecules.
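The "count the number of compounds in each scaffold" step can be sketched in a few lines of Python. This sketch assumes the scaffolds have already been computed by a cheminformatics toolkit and are available as strings; the compound names and scaffold strings below are made-up toy data, not results from the study.

```python
from collections import Counter


def compounds_per_scaffold(scaffold_of):
    """Count compounds per scaffold.

    scaffold_of maps a compound id to its (precomputed) scaffold string.
    """
    return Counter(scaffold_of.values())


# Hypothetical toy data: compound id -> precomputed scaffold string.
scaffold_of = {
    "cpd1": "c1ccccc1",
    "cpd2": "c1ccccc1",
    "cpd3": "c1ccncc1",
    "cpd4": "C1CCCCC1",
}

counts = compounds_per_scaffold(scaffold_of)
# Scaffolds represented by exactly one compound.
singletons = sum(1 for n in counts.values() if n == 1)
print(counts.most_common(), singletons)

# Arithmetic on the figures quoted in the text: 88 times the molecules
# spread over 16 times the scaffolds means the average number of
# compounds per scaffold is 88 / 16 times higher.
ratio = 88 / 16
print(ratio)
```

Read this way, the two quoted factors imply that each scaffold in the on-demand library is, on average, populated by about 5.5 times as many compounds as in the in-stock set.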

Search


Select the range to download
Download method:
1. In ZINC, select the data for a given range of molecular weight and logP, choose the smi format, and download to get the file ZINC-downloader-2D-smi.wget;
2. Download wgetwin-1531-binary and extract it, then locate the wget.exe file;
3. Add the wgetwin-1531-binary directory to the PATH system environment variable (try not to include Chinese characters in the directory path);
4. Put ZINC-downloader-2D-smi.wget and wget.exe in the same directory;
5. Open a cmd window and run: wget.exe -i ZINC-downloader-2D-smi.wget
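If installing wget on Windows is inconvenient, the same download can be approximated with the Python standard library. The sketch below assumes that each line of ZINC-downloader-2D-smi.wget contains at least one http(s) URL (either bare or embedded in a wget command); check the actual file before relying on this. The function names and the example.org URLs are hypothetical.

```python
import re
import urllib.request
from pathlib import Path

# Matches an http(s) URL up to the next whitespace character.
URL_PATTERN = re.compile(r"https?://\S+")


def extract_urls(text):
    """Collect every URL found in the text, in order of appearance."""
    urls = []
    for line in text.splitlines():
        urls.extend(URL_PATTERN.findall(line))
    return urls


def download_all(list_file, out_dir):
    """Download every URL listed in list_file into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for url in extract_urls(Path(list_file).read_text()):
        # Name the local file after the last path component of the URL.
        target = out / url.rsplit("/", 1)[-1]
        urllib.request.urlretrieve(url, str(target))


# Made-up sample lines covering both assumed layouts.
sample = (
    "http://example.org/2D/AA/AAAA.smi\n"
    "mkdir -pv AB && wget http://example.org/2D/AB/ABAA.smi -O AB/ABAA.smi\n"
)
urls = extract_urls(sample)
print(urls)
```

With the real list file in the current directory, a call such as `download_all("ZINC-downloader-2D-smi.wget", "zinc_smi")` would fetch the files into a `zinc_smi` folder.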

Introduction to the ChEMBL Database

ChEMBL is a free online database developed by the European Bioinformatics Institute (EBI). It collects bioactivity data for a wide range of targets and compounds from a large body of literature, giving medicinal chemists a very convenient platform for looking up the bioactivity data of targets or compounds. As of October 29, 2019, the database contained 12,482 targets, 1.879 million compounds, and 15.5 million bioactivity records.

With this database, users can quickly look up the compounds reported so far for a given target together with their activity data, or look up which targets a compound has been tested against and the resulting data. The data come from published literature, so they are relatively reliable and traceable back to their source. This saves users a great deal of time otherwise spent searching the literature and collecting compound data, gives them quick access to accurate compounds and their biological data, and further speeds up drug design and drug development.
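The "look up the reported compounds and activities for a target" query can also be run programmatically. The following is a minimal sketch that assumes the public ChEMBL web services expose an `activity.json` endpoint filterable by `target_chembl_id` and returning an `activities` list with the field names used below; verify these against the current ChEMBL web services documentation. The target ID and the sample payload are placeholders, not real query results.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the ChEMBL web services.
BASE_URL = "https://www.ebi.ac.uk/chembl/api/data/activity.json"


def build_activity_url(target_chembl_id, limit=20):
    """Build the query URL for activities recorded against one target."""
    query = urlencode({"target_chembl_id": target_chembl_id, "limit": limit})
    return BASE_URL + "?" + query


def summarize_activities(payload):
    """Reduce a decoded JSON response to (compound, type, value, units)."""
    rows = []
    for act in payload.get("activities", []):
        rows.append((
            act.get("molecule_chembl_id"),
            act.get("standard_type"),
            act.get("standard_value"),
            act.get("standard_units"),
        ))
    return rows


def fetch_activities(target_chembl_id, limit=20):
    """Query the service over the network and summarize the response."""
    with urlopen(build_activity_url(target_chembl_id, limit)) as resp:
        return summarize_activities(json.load(resp))


# Made-up payload in the assumed response shape (no network access needed).
example_payload = {
    "activities": [
        {"molecule_chembl_id": "CHEMBL_EXAMPLE_1", "standard_type": "IC50",
         "standard_value": "12.0", "standard_units": "nM"},
    ]
}
rows = summarize_activities(example_payload)
url = build_activity_url("CHEMBL203", limit=5)
print(url)
print(rows)
```

Keeping URL construction and response parsing separate from the network call makes it possible to check the logic offline before sending any requests.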

Natural Product and Traditional Chinese Medicine Ingredient Databases

Marine natural products database: http://mc3d.qnlm.ac/
TCMSP (Traditional Chinese Medicine Systems Pharmacology Database and Analysis Platform): https://old.tcmsp-e.com/tcmsp.php
Natural products database: http://pharmdata.ncmi.cn/virtualcompound/index.asp

