x7ib3-FreeEnergyProtocol.txt

Name

IC50 conversion

Software

SeeSAR

Parameter

Default

Method

Each pose was evaluated with the HYDE scoring function from SeeSAR. The upper boundary (in nM) of the HYDE IC50 range was retained as the pose score and converted to free energy (in kcal/mol) using the formula Energy = R*T*ln(IC50), where T = 310.15 K.
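
The conversion above can be sketched in a few lines of Python. This is a minimal illustration, not the script used for the submission: it assumes the IC50 is converted from nM to mol/L before taking the logarithm (so that tighter binders give more negative energies) and uses R = 1.987204e-3 kcal/(mol*K). The function name ic50_nm_to_energy is hypothetical.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol*K)
T_KELVIN = 310.15      # temperature stated in the protocol


def ic50_nm_to_energy(ic50_nm, temperature=T_KELVIN):
    """Convert an IC50 given in nM to a free energy in kcal/mol.

    Assumption: the IC50 is expressed in mol/L inside the logarithm.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    ic50_molar = ic50_nm * 1e-9  # nM -> mol/L
    return R_KCAL * temperature * math.log(ic50_molar)


# Example: an upper HYDE boundary of 1000 nM (1 uM) gives about -8.5 kcal/mol.
energy = ic50_nm_to_energy(1000.0)
```

Under this unit assumption, each tenfold decrease in IC50 lowers the energy by roughly 1.4 kcal/mol at 310.15 K.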

x7ib3-PosePredictionProtocol.txt

Name

ProPKA/AutoDock/SeeSAR/MMGBSA

Software

Open Babel 2.3.2/Chimera/ProPKA/MGLTools/AutoDock 4/SeeSAR/antechamber/NAMD 2.12

System Preparation Parameters

Assumed pH 7.4
Gasteiger charges
Water molecules and other heteroatoms were removed
Protein residue protonation considering local environment

System Preparation Method

We used two ways to prepare the poses.
1) As the ligands to be evaluated were structurally close to the co-crystallized ligands made available after the first stage of the D3R Grand Challenge 2, we drew ligands 37-102 with SeeSAR from their respective closest known ligand (ligands 1-36).
Each generated complex was then minimized and evaluated using SeeSAR and scored with the HYDE scoring function.
2) We docked each ligand the same way as in stage 1 of the D3R Grand Challenge 2.
Chimera was used to add hydrogen atoms to FXR structures and the protonation state was checked using ProPKA.
The binding site of each structure (PDB entries plus those released at the end of D3R stage 1) was defined as all residues having an atom within 5 Å of the native ligand. Each structure was aligned to 1OSV (backbone alignment), and pairwise comparison of binding sites was performed based on their RMSD. The resulting distance matrix was used to perform a hierarchical clustering. We distinguished 7 main groups of binding site structures. Seven structures, one from each group, were then used as receptors for docking: 3OLF, 1sjpr, 1kjyp, 1hqmf, 1hvih, 1yfjn and 1kmrz. Only chain A was retained for multi-chain structures.
To generate refined poses, we used AutoDock 4 with a grid spacing of 0.26 Å. AutoDock's genetic algorithm (GA) was run 250 times per ligand against each receptor, generating 250 poses per ligand and receptor, for a total of 102 * 250 * 7 = 178,500 poses.
The poses were then clustered based on their RMSD (2 Å) and the lowest-energy pose of each cluster was imported into SeeSAR, minimized and scored with the HYDE function. For each ligand, the best HYDE-scored pose was then retained.
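
The geometric steps of protocol 2 (binding-site definition, RMSD comparison, and RMSD-based pose clustering) can be sketched as follows. This is an illustrative sketch only: the actual work was done with Chimera, AutoDock 4 and SeeSAR. It assumes coordinates are already superposed and atoms are listed in matching order, and it uses a simple greedy clustering in which each pose, taken in order of increasing energy, joins the first cluster whose lowest-energy member lies within the tolerance. All function names are hypothetical.

```python
import math


def binding_site_residues(protein_atoms, ligand_coords, cutoff=5.0):
    """Return IDs of residues with at least one atom within `cutoff` of the ligand.

    protein_atoms: iterable of (residue_id, (x, y, z)) tuples, one per atom.
    ligand_coords: list of (x, y, z) tuples for the native ligand.
    """
    return {
        res_id
        for res_id, xyz in protein_atoms
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_coords)
    }


def rmsd(coords_a, coords_b):
    """RMSD between two equally long, already superposed coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def cluster_poses(poses, tolerance=2.0):
    """Greedily cluster (energy, coords) poses, lowest energy first.

    The first member of each returned cluster is its lowest-energy pose.
    """
    clusters = []
    for pose in sorted(poses, key=lambda p: p[0]):
        for cluster in clusters:
            if rmsd(pose[1], cluster[0][1]) <= tolerance:
                cluster.append(pose)
                break
        else:
            clusters.append([pose])  # no cluster close enough: start a new one
    return clusters


def cluster_representatives(poses, tolerance=2.0):
    """Return the lowest-energy pose of each cluster."""
    return [cluster[0] for cluster in cluster_poses(poses, tolerance)]


# Toy example with made-up two-atom poses (energies in kcal/mol).
poses = [
    (-7.2, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    (-8.1, [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)]),
    (-6.5, [(10.0, 0.0, 0.0), (11.0, 0.0, 0.0)]),
]
representatives = cluster_representatives(poses)
```

In the toy example the first two poses differ by 0.5 Å RMSD and fall into one cluster, so two representatives are returned, with energies -8.1 and -6.5.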

For each ligand, we compared the scores produced by the two protocols and retained the best pose.
For ligands 39, 40, 41, 43-45, 50, 53, 55, 56, 60-64, 66, 68, 72, 73, 80, 90, 93 and 98, we retained the pose produced by protocol 2 (AutoDock/SeeSAR); for all other ligands, the pose from protocol 1 (manual drawing/SeeSAR) was used.
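
The per-ligand choice between the two protocols amounts to a simple score comparison, sketched below. This assumes the scores are expressed as energies in kcal/mol, so that a lower value is better, and that ties go to protocol 1; both are assumptions, and the scores in the example are made up. The function name pick_best_protocol is hypothetical.

```python
def pick_best_protocol(scores_protocol1, scores_protocol2):
    """Map each ligand ID to the protocol with the better (lower) score.

    Both arguments are dicts of ligand ID -> score in kcal/mol.
    A ligand scored by only one protocol is assigned to that protocol.
    """
    chosen = {}
    for ligand in sorted(scores_protocol1.keys() | scores_protocol2.keys()):
        candidates = {}
        if ligand in scores_protocol1:
            candidates["protocol 1"] = scores_protocol1[ligand]
        if ligand in scores_protocol2:
            candidates["protocol 2"] = scores_protocol2[ligand]
        # min() keeps the first key on ties, i.e. protocol 1
        chosen[ligand] = min(candidates, key=candidates.get)
    return chosen


# Made-up scores for three ligands.
manual = {37: -9.4, 38: -8.0, 39: -7.1}
docked = {37: -8.9, 38: -8.0, 39: -8.3}
selection = pick_best_protocol(manual, docked)
```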

Pose Prediction Parameters

Genetic algorithm
Exhaustiveness=250
SeeSAR minimization
HYDE scoring function (HYDE estimates binding free energy based on two terms for dehydration and hydrogen bonding only)

Pose Prediction Method

Docking runs were executed with the parameters specified above; default values were used for all other variables. Each pose was checked manually; aberrant poses were discarded and lower-scoring poses were selected instead.