QSAR model building script based on PyTorch and RDKit
2022-07-03 03:51:00 【LRJ-jonas】
QSAR
Quantitative structure-activity relationship (QSAR) modeling is among the most widely used methods in drug design. It applies mathematical and statistical methods to establish a quantitative relationship between the physiological activity (or some other property) of a series of compounds and their physicochemical properties. These relationships can then be used to predict the physiological activity or properties of other compounds and to guide the design of compounds with higher activity.
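As a toy illustration of such a quantitative relationship, the sketch below fits a one-descriptor linear model, activity ≈ slope × descriptor + intercept, by ordinary least squares. The descriptor and activity values are made-up numbers rather than measurements, and `fit_line` is a hypothetical helper written for this example. Real QSAR models use many descriptors and often nonlinear learners.

```python
# Toy one-descriptor QSAR: fit activity = slope * descriptor + intercept.
# All numbers below are invented for illustration only.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical descriptor values (e.g. a lipophilicity score) and activities.
descriptor = [1.0, 2.0, 3.0, 4.0]
activity = [2.1, 3.9, 6.1, 7.9]

slope, intercept = fit_line(descriptor, activity)

# Predict the activity of a new compound from its descriptor value.
predicted = slope * 5.0 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.2f}")
```

The fitted line plays the role of the "quantitative relationship": once the slope and intercept are known, the activity of an unmeasured compound can be estimated from its descriptor alone.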
Installation environment
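# pprint and argparse ship with the Python standard library, so they need no pip install
# install rdkit (the conda-forge channel also provides it: conda install -c conda-forge rdkit)
conda install -c rdkit rdkit
# install pytorch: follow a PyTorch installation tutorial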
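Before running the script, it can help to confirm that the third-party packages it imports are visible to the interpreter. The following is a minimal sketch using only the standard library. `missing_packages` is a hypothetical helper name, and the list of required packages is read off the script's import statements.

```python
import importlib.util

def missing_packages(names):
    """Return the top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Third-party packages imported by the script.
required = ["numpy", "torch", "rdkit"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Run it in the same conda environment that will run the QSAR script, since packages installed in another environment will be reported as missing.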
QSAR model based on PyTorch and RDKit
#!/usr/bin/python3
import pprint
import argparse
import torch
import torch.optim as optim
from torch import nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import DataStructs
import numpy as np
#from sklearn import preprocessing
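def base_parser():
    # Command-line interface: two SDF files and an optional epoch count
    parser = argparse.ArgumentParser(description="This is a simple test of pytorch")
    parser.add_argument("trainset", help="sdf for train")
    parser.add_argument("testset", help="sdf for test")
    # type=int so that a value given on the command line can be passed to range()
    parser.add_argument("--epochs", type=int, default=150, help="number of training epochs")
    return parser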
parser = base_parser()
args = parser.parse_args()
traindata = [mol for mol in Chem.SDMolSupplier(args.trainset) if mol is not None]
testdata = [mol for mol in Chem.SDMolSupplier(args.testset) if mol is not None]
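def molsfeaturizer(mols):
    # Convert each molecule to a 2048-bit Morgan fingerprint (radius 2)
    fps = []
    for mol in mols:
        arr = np.zeros((0,))
        fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)
        DataStructs.ConvertToNumpyArray(fp, arr)
        fps.append(arr)
    # float32 matches the dtype of the model's weights
    fps = np.array(fps, dtype=np.float32)
    return fps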
classes = {"(A) low":0, "(B) medium":1, "(C) high":2}
#classes = {"(A) low":0, "(B) medium":1, "(C) high":1}
trainx = molsfeaturizer(traindata)
testx = molsfeaturizer(testdata)
# for pytorch, y must be long type!!
trainy = np.array([classes[mol.GetProp("SOL_classification")] for mol in traindata], dtype=np.int64)
testy = np.array([classes[mol.GetProp("SOL_classification")] for mol in testdata], dtype=np.int64)
# Convert the NumPy arrays to PyTorch tensors
X_train = torch.from_numpy(trainx)
X_test = torch.from_numpy(testx)
Y_train = torch.from_numpy(trainy)
Y_test = torch.from_numpy(testy)
print(X_train.size(), Y_train.size())
print(X_test.size(), Y_test.size())
# Define each layer and the overall structure of the model
class QSAR_mlp(nn.Module):
    def __init__(self):
        super(QSAR_mlp, self).__init__()
        self.fc1 = nn.Linear(2048, 524)
        self.fc2 = nn.Linear(524, 10)
        self.fc3 = nn.Linear(10, 10)
        self.fc4 = nn.Linear(10, 3)

    def forward(self, x):
        x = x.view(-1, 2048)
        h1 = F.relu(self.fc1(x))
        h2 = F.relu(self.fc2(h1))
        h3 = F.relu(self.fc3(h2))
        # Return raw scores (logits); F.cross_entropy applies log-softmax itself
        output = self.fc4(h3)
        return output
# Train the model, then predict on the test set
model = QSAR_mlp()
print(model)
losses = []
optimizer = optim.Adam(model.parameters(), lr=0.005)
# Tensors track gradients directly, so no Variable wrapper is needed
data, target = X_train.float(), Y_train.long()
for epoch in range(int(args.epochs)):
    optimizer.zero_grad()
    y_pred = model(data)
    loss = F.cross_entropy(y_pred, target)
    # .item() extracts the Python float from the 0-dim loss tensor
    losses.append(loss.item())
    print("Loss: {}".format(loss.item()))
    loss.backward()
    optimizer.step()
# Evaluate on the test set without tracking gradients
model.eval()
with torch.no_grad():
    pred_y = model(X_test.float())
# Index of the highest score in each row is the predicted class
predicted = torch.max(pred_y, 1)[1]
for i in range(len(predicted)):
    print("pred:{}, target:{}".format(predicted[i].item(), Y_test[i].item()))
accuracy = (predicted == Y_test).float().mean().item()
print("Accuracy: {}".format(accuracy))
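For reference, `F.cross_entropy` applies a log-softmax to the network's output scores and then takes the negative log-probability of the target class, so it is meant to receive raw, unnormalized scores. The pure-Python sketch below reproduces that computation for a single sample. `cross_entropy_single` is a hypothetical helper written for illustration, not part of PyTorch, and the scores are made-up values.

```python
import math

def cross_entropy_single(scores, target):
    """Negative log-softmax probability of the target class for one sample."""
    # Subtract the maximum score before exponentiating for numerical stability.
    m = max(scores)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum_exp - scores[target]

# Made-up scores for the three classes (low, medium, high).
scores = [2.0, 0.5, -1.0]
loss_correct = cross_entropy_single(scores, 0)  # target is the top-scoring class
loss_wrong = cross_entropy_single(scores, 2)    # target is the lowest-scoring class
print(loss_correct, loss_wrong)
```

The loss is small when the target class has the highest score and large otherwise. Scores confined to the interval (0, 1), for example by a sigmoid, cap how small this loss can get: with three classes it stays above ln(1 + 2/e) ≈ 0.55 even for a correct prediction.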
Test the model
python qsar_pytorch.py solubility.train.sdf solubility.test.sdf