1146-2-fem8y-BACE_score_protocol.txt

Name

ChEMBL/Knime-RF

Software

Knime

Parameters

RDKit Morgan Fingerprint (radius=2, bits=1024)
Knime Random Forest node - Tree Ensemble Learner (Regression)
Knime Random Forest number of models = 400
Knime Random Forest data selection - random with replacement (bootstrapping)
Knime Random Forest attribute selection - square root

Method

Around 11000 available IC50 data points for BACE inhibition were obtained from the ChEMBL
website as SMILES with associated pIC50 values. In Knime the RDKit nodes were used to derive
Morgan fingerprints for all data points. An 80:20 training-test set split was used to
select parameters for a Random Forest model built directly on the fingerprint bit vectors.
The final model was trained on all available data and used to predict the pIC50 values
of the score compounds. The compounds were sorted by these predicted values to obtain
the final ranking.
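
The workflow above was built from Knime nodes. For readers without Knime, the following
is a rough Python sketch of the same steps, not the submitted workflow. It swaps in the
RDKit Python API and scikit-learn's RandomForestRegressor for the Knime nodes, so the
results will not match exactly. All function names are hypothetical, the seed is
arbitrary, and ranking from highest to lowest predicted pIC50 is an assumption (the
protocol does not state the sort direction). RDKit, NumPy and scikit-learn are imported
inside the functions that need them, so the split and ranking helpers run on their own.

```python
import random


def morgan_bits(smiles, radius=2, n_bits=1024):
    """Morgan fingerprint (radius 2, 1024 bits) as a NumPy 0/1 array."""
    from rdkit import Chem
    from rdkit.Chem import rdFingerprintGenerator

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    gen = rdFingerprintGenerator.GetMorganGenerator(radius=radius, fpSize=n_bits)
    return gen.GetFingerprintAsNumPy(mol)


def split_80_20(n_items, seed=0):
    """Shuffle indices 0..n_items-1 and cut them 80:20 into (train, test)."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    cut = (n_items * 4) // 5
    return idx[:cut], idx[cut:]


def fit_forest(X, y, seed=0):
    """400 bootstrapped regression trees, sqrt(n_features) candidates per split.

    Approximates the Knime Tree Ensemble Learner settings listed under Parameters.
    """
    from sklearn.ensemble import RandomForestRegressor

    model = RandomForestRegressor(
        n_estimators=400, bootstrap=True, max_features="sqrt", random_state=seed
    )
    return model.fit(X, y)


def rank_compounds(names, predicted_pic50):
    """Sort (name, prediction) pairs, highest predicted pIC50 first (assumed)."""
    return sorted(zip(names, predicted_pic50), key=lambda p: p[1], reverse=True)


def run_workflow(train_smiles, train_pic50, score_names, score_smiles):
    """Hold-out check on an 80:20 split, then refit on all data and rank."""
    import numpy as np

    X = np.array([morgan_bits(s) for s in train_smiles])
    y = np.array(train_pic50, dtype=float)

    # Hold-out R^2, the kind of figure one could use to compare parameter choices.
    train_idx, test_idx = split_80_20(len(X))
    trial = fit_forest(X[train_idx], y[train_idx])
    holdout_r2 = trial.score(X[test_idx], y[test_idx])

    # Final model on all available data, applied to the score compounds.
    final = fit_forest(X, y)
    X_score = np.array([morgan_bits(s) for s in score_smiles])
    return holdout_r2, rank_compounds(score_names, final.predict(X_score))
```

Calling run_workflow with the ChEMBL SMILES and pIC50 lists plus the score compound names
and SMILES would return the hold-out R^2 and the ranked (name, predicted pIC50) pairs.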

Answer 1

No