ScaffOpt_FESet5
LigPrep v33013
ScaffOpt
None
None
None
Assumed pH 5 for ligand preparation.
-dt 'RDK5' -dt 'ErgFP' -dt 'ECFP' -dt 'FCFP' -dt '2Dpp' -binsize 0 -xmean 0.00001 -repbt 100 -hpnum 100 -of 'multi' for all 6 ScaffOpt runs.
Briefly, 3D ligand conformations and tautomerization/ionization states were generated with LigPrep at target pH=5. For compounds with alternative tautomers/ionization states, only the state with the lowest LigPrep state penalty was used. ChEMBL was queried for compounds similar to the 3 CatS datasets ('score', 'FESet' and 'pose'), and 6 assays were selected as training sets for the ScaffOpt algorithm. These were (assay IDs): CHEMBL1048481, CHEMBL1103405, CHEMBL1103448, CHEMBL1173849, CHEMBL2318072, CHEMBL899800. ScaffOpt was executed 6 times, once for each assay. ScaffOpt is a fully automatic machine learning algorithm that takes as input a few molecules with measured binding affinity (the training set) and scores a given screening database of compounds according to their predicted binding affinity to the receptor. The details of the ScaffOpt algorithm will be described in a forthcoming publication. For this submission, no receptor information was used, only 2D structural information of the compounds. The predicted binding scores in this submission were obtained by combining, with the Borda count method, the predictions from CHEMBL1048481 at simcut 0.5, CHEMBL2318072 at simcut 0.4, and CHEMBL899800 at simcut 0.3 ('simcut' is a ScaffOpt parameter that quantifies the similarity between the training set and the screening set). ATTENTION: the numbers are scores (the lower the score, the stronger the binder), not free energies in kcal/mol; therefore RMSEc results may be poor compared with Kendall's tau and Pearson's R, which are unaffected.
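The Borda count combination mentioned above can be illustrated with a minimal sketch. This is the textbook variant, not the ScaffOpt implementation: the protocol does not state the point scheme or how ties are handled, so both are assumptions here, and the function names and compound IDs are hypothetical. Each run ranks the compounds by score (lowest score first); a compound at position p among n compounds receives n - 1 - p points, and the points are summed over the runs.

```python
from typing import Dict, List


def borda_count(predictions: List[Dict[str, float]]) -> Dict[str, int]:
    """Sum Borda points over several runs (lower score = stronger binder).

    Assumption: every run scores exactly the same set of compounds.
    """
    compounds = sorted(predictions[0])
    for pred in predictions:
        if sorted(pred) != compounds:
            raise ValueError("all runs must score the same compounds")
    n = len(compounds)
    points = {c: 0 for c in compounds}
    for pred in predictions:
        # Best (lowest) score first; ties are broken by compound ID,
        # which is an arbitrary choice made only for determinism.
        ranked = sorted(compounds, key=lambda c: (pred[c], c))
        for position, compound in enumerate(ranked):
            points[compound] += n - 1 - position
    return points


def consensus_ranking(predictions: List[Dict[str, float]]) -> List[str]:
    """Order compounds from predicted strongest to weakest binder."""
    points = borda_count(predictions)
    return sorted(points, key=lambda c: (-points[c], c))


# Hypothetical scores from three runs (made-up numbers, not real predictions).
run_a = {"cpd_A": -9.1, "cpd_B": -7.0, "cpd_C": -8.2}
run_b = {"cpd_A": -7.0, "cpd_B": -6.9, "cpd_C": -7.4}
run_c = {"cpd_A": -8.8, "cpd_B": -5.1, "cpd_C": -8.0}

print(borda_count([run_a, run_b, run_c]))
print(consensus_ranking([run_a, run_b, run_c]))
```

Because only the rank positions enter the sum, runs whose raw scores are on different scales can be combined directly. The combined values are rank-based scores and not energies.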
No
ScaffOpt_FESet5
LigPrep v33013
ScaffOpt
None
3D ligand conformations and tautomerization/ionization states were generated with LigPrep at target pH=5. No 3D coordinates (of receptor or compounds) were used for this submission, only 2D structural information of the compounds. I am submitting for both 'Ligand-Based-Score' and 'Free-Energy' because the latter dataset contains fewer compounds (33), and hence I can use predictions at a higher simcut level (see FreeEnergyProtocol.txt for details), which tend to be more accurate. It will therefore be interesting to compare the performance of a screening tool like ScaffOpt with the more rigorous Free Energy methods.
None
I copied the example mol and pdb files in order to upload my predictions without errors. However, none of these files was used for the predictions!
No
No