1147-2-pko3w-LigandScoringProtocol_ML_QSAR.txt

Name

QSAR_machine_learning

Software

Python/scikit-learn/RDKit/mordred/pandas/NumPy/Matplotlib

Parameters

Fingerprint Tanimoto similarity threshold: 0.4

Method

We used molecular descriptors and fingerprints as input features to build a QSAR model with machine learning methods.
Compound data with pChEMBL values for the specific target (CatS or BACE) were downloaded from the ChEMBL database.
Compounds whose fingerprint Tanimoto similarity to the target compounds exceeded the 0.4 threshold were selected as the training dataset.
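
The similarity filter can be sketched as follows. This is a minimal illustration, not the code used for the submission. Fingerprints are represented here as plain Python sets of on-bit indices, standing in for the ECFP4 bit vectors that RDKit would produce. The function names, the toy data, and the rule that a compound is kept when its best similarity to any target compound exceeds the threshold are all assumptions; the protocol does not say how similarity to multiple target compounds was aggregated.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def select_training(candidates, targets, threshold=0.4):
    """Return names of candidates whose best similarity to any target exceeds threshold.

    Using the maximum over targets is an assumption made for this sketch.
    """
    return [
        name
        for name, fp in candidates.items()
        if max(tanimoto(fp, target_fp) for target_fp in targets.values()) > threshold
    ]


# Hypothetical toy fingerprints (sets of on-bit indices), for illustration only.
targets = {"target_1": {1, 2, 3, 4}}
candidates = {
    "chembl_a": {1, 2, 3, 5},  # 3 shared bits out of 5 distinct bits
    "chembl_b": {1, 7, 8, 9},  # 1 shared bit out of 7 distinct bits
}

selected = select_training(candidates, targets, threshold=0.4)
print(selected)
```
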
Molecular descriptors were calculated with the mordred package, and ECFP4 fingerprints were calculated with RDKit.
The two were combined into the input feature vectors used to build the QSAR model.
Several machine learning methods were tested, including SVM, RandomForest, MLP, NearestNeighbors, and GaussianProcess.
The model with the best performance was selected to predict the IC50 activity of the target compounds.
The prediction results were submitted directly without further modification.
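
The model comparison step might look like the sketch below, using scikit-learn. Several things here are assumptions, since the protocol does not state them: the regression variants of each method (chosen because pChEMBL values are continuous), default hyperparameters, feature scaling, 5-fold cross-validation, and mean R^2 as the performance measure. The random matrix X is a placeholder for the combined descriptor and fingerprint features, and y is a placeholder for pChEMBL values.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Placeholder data: 60 "compounds" with 8 features and a synthetic activity value.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
y = 6.0 + 1.5 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=60)

# Candidate regressors; scaling is added where the method is scale-sensitive.
models = {
    "SVM": make_pipeline(StandardScaler(), SVR()),
    "RandomForest": RandomForestRegressor(n_estimators=50, random_state=0),
    "MLP": make_pipeline(StandardScaler(), MLPRegressor(max_iter=2000, random_state=0)),
    "NearestNeighbors": make_pipeline(StandardScaler(), KNeighborsRegressor()),
    "GaussianProcess": GaussianProcessRegressor(normalize_y=True),
}

# Mean cross-validated R^2 for each model (assumed selection criterion).
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in models.items()
}

# Refit the best-scoring model on all training data and predict new compounds.
best_name = max(scores, key=scores.get)
best_model = models[best_name].fit(X, y)
predicted = best_model.predict(X[:3])  # stands in for the target compounds
print(best_name, scores[best_name])
```
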

Answer 1

No