1146-1-hwo3b-PosePredictionProtocol34.txt

Name

SkeleDock

Software

HTMD1.13.8/ACEMD3/Rdkit2018.03.4/mmEnergy

System Preparation Parameters

Assumed pH 7.0
Tautomers considered
Gasteiger charges

System Preparation Method

The system was prepared using the HTMD function ProteinPrepare at pH 5.1,
which automatically takes care of the protonation of the protein residues. The ligands
were prepared manually, setting the charge of each atom with RDKit. Parameterize,
a tool inside HTMD, then generated the ligand parameters. A total of 2 ns of equilibration
was run with heavy constraints on the backbone and the sidechains, while the ligand was left free to move.

Pose Prediction Parameters

Equilibration time 2 ns

Pose Prediction Method

As templates for docking the 20 SMILES, we used the results of pose prediction.
We used an in-house algorithm (SkeleDock) to search for dihedrals common to the
template ligand and each SMILES. The common dihedrals were mirrored, so that the conformation
of the molecule to dock matches that of the template. We iterated until the output
matched our expectations. Once all the poses were ready, we ran mmEnergy on the system,
which minimizes it, relaxing the nearby sidechains and the ligand.
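
The dihedral-mirroring step can be illustrated with a minimal sketch. This is not the SkeleDock code: it only shows how a dihedral angle can be measured from four atom positions and how far a matching dihedral in the molecule to dock would have to be rotated to reproduce the template value. The function names (dihedral_deg, rotation_needed) and the plain-tuple coordinate format are assumptions made for this example, and the four points are assumed to be distinct.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four 3D points.

    Hypothetical helper: p1-p2 is the central bond, p0 and p3 are the
    outer atoms. Assumes p1 and p2 do not coincide.
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def rotation_needed(template_deg, query_deg):
    """Rotation (degrees, in [-180, 180)) that maps the query dihedral
    onto the template dihedral."""
    return (template_deg - query_deg + 180.0) % 360.0 - 180.0
```

Given the four atoms of a common dihedral in both molecules, rotation_needed(dihedral_deg(*template_atoms), dihedral_deg(*query_atoms)) gives the torsion change to apply to the molecule to dock; applying that rotation to the coordinates is left out of the sketch.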

Answer 1

Yes

Answer 2

Yes

1146-4-yv467-FreeEnergyProtocol.txt

Name

DeltaDelta SkeleDock

Software

HTMD1.13.8/ACEMD3/Rdkit2018.03.4/mmEnergy/DeltaDelta

Parameter

Assumed pH 7.0

Method

Once pose predictions were done for the 34 ligands, we ran our tool DeltaDelta
to compute the predictions. DeltaDelta is a deep-learning-based tool which takes as input
a congeneric series, in this case 317 protein-ligand complexes we found to be close
to the provided FASTA sequence. These 317 systems have their IC50 values annotated in PDBBind, so we used them
to train the DeltaDelta network. This network takes all the possible pairs between
the molecules in the training set (317 in this case) and computes the difference (delta) in free energy (deltaG)
between the two, hence the name. These pair values are then used for training, where, given the two molecules,
the network tries to predict the difference in free energy, which is known during training.
Finally, for each of the 34 molecules that make up the test set, we predict the difference (delta)
in free energy with each of the 317 molecules in the training set. Then, since we know the absolute free energy
of the 317, we apply the predicted delta and obtain 317 estimates of free energy for each of the 34 problem ligands.
We then discarded predictions coming from molecules whose Tanimoto similarity with the problem ligand was
smaller than 0.6 (cutoffs higher than 0.6 deleted too many data points). The median free energy was taken among the remaining predictions.
We did this twice: once for a set of predictions coming from DeltaDelta where the protein was included,
and once where the protein could not be seen by the network, but where the ligands were still in the predicted pose.
We did this to reduce the influence of the protein on the prediction. For each of the 34 molecules,
the mean of the two runs was chosen as the result.
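
The aggregation described above can be sketched in a few lines of plain Python. This is an illustration, not the DeltaDelta code: the network itself is not shown, and the function names, the dictionary-based inputs, and the set-of-bits fingerprints are hypothetical. The sketch also assumes ordered pairs and the sign convention delta = deltaG(first) - deltaG(second), so that an estimate for a problem ligand is the known deltaG of a training molecule plus the predicted delta; the protocol does not state either choice.

```python
from itertools import permutations
from statistics import mean, median


def training_pairs(train_dg):
    """Pair labels for training: ((a, b), dG(a) - dG(b)).

    Assumption: ordered pairs are used; the text only says 'all possible pairs'.
    """
    return [((a, b), train_dg[a] - train_dg[b]) for a, b in permutations(train_dg, 2)]


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = fp_a | fp_b
    return len(fp_a & fp_b) / len(union) if union else 0.0


def estimate_dg(pred_delta, train_dg, similarity, cutoff=0.6):
    """Median of the absolute estimates from sufficiently similar training molecules.

    pred_delta[name]: predicted dG(problem ligand) - dG(name)  (assumed sign)
    train_dg[name]:   known dG of the training molecule
    similarity[name]: Tanimoto similarity between the problem ligand and name
    """
    estimates = [
        train_dg[name] + pred_delta[name]
        for name in train_dg
        if similarity[name] >= cutoff  # discard molecules with similarity below the cutoff
    ]
    if not estimates:
        raise ValueError("no training molecule passes the similarity cutoff")
    return median(estimates)


def final_prediction(dg_with_protein, dg_ligand_only):
    """Mean of the two runs (protein visible to the network, protein hidden)."""
    return mean([dg_with_protein, dg_ligand_only])
```

With training values {"A": -9.0, "B": -8.0, "C": -10.0}, predicted deltas {"A": 0.5, "B": -1.0, "C": 3.0}, and similarities {"A": 0.7, "B": 0.65, "C": 0.3}, molecule C is discarded and the estimate is the median of -8.5 and -9.0.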

Answer 1

Yes

Answer 2

Yes