# Results for distribution coefficients
#
# This file will be automatically parsed. It must contain the following four elements: predictions, name of method, software listing, and method description. These elements must be provided in the order shown, with their respective headers.
#
# The data in each prediction line should be structured as follows, with all four fields provided.
# Compound ID, log D, log D SEM, log D model uncertainty
# The list of predictions must begin with the "Predictions:" keyword, as illustrated here, and (except in the case of standardization runs) predictions for all of Batch 0, Batches 0-1, or Batches 0-2 must be provided. Compound order is unimportant.
Predictions:
SAMPL5_017, 3.2, 0.05, 0.7
SAMPL5_059, 4.6, 0.04, 0.9
SAMPL5_045, 1.2, 0.05, 0.4
SAMPL5_015, -0.57, 0.1, 1.0
# etc.
#
# Please provide an informal but informative name of the method used. The "Name:" keyword is required, as shown here.
Name:
Normal/solvation/GAFF/TIP3P
#
# List all major software packages used and their versions
# The "Software:" keyword is required.
Software:
GROMACS 5.0.6
AmberTools 14
OpenMolTools 0.6.9
SolvationToolkit first release
packmol 15.287
ParmEd
# Methodology and computational details.
# Level of details should be roughly equivalent to that used in a publication.
# Please include the values of key parameters, with units, and explain how statistical uncertainties were estimated.
# Use as many lines of text as you need.
# All text following the "Method:" keyword will be regarded as part of your free text methods description.
Method:
Distribution coefficients were estimated from the difference between the solvation free energies of the solute (in its provided protonation and tautomer state) in water and in cyclohexane at infinite dilution. For every solute, the provided .mol2 file was used to build a solvated box containing a single solute molecule in each solvent.
The number of solvent molecules was chosen to give a cubic box at least 3.0 nm on each edge, which required at least 150 cyclohexane molecules in each case, and more molecules in the case of water. Water was treated with TIP3P, and the solute and cyclohexane were treated with GAFF parameters assigned by antechamber and parmchk2. SolvationToolkit (utilizing Packmol and openmoltools) was used to prepare input files using AmberTools and other external toolkits, culminating in conversion to GROMACS format using ParmEd. Generally, protocols were taken from previous work in our group on solvation free energy and relative solubility calculations, with minor updates for GROMACS 5.0.6.
Solvation free energy calculations were broken into 20 lambda states, with five lambda values used for discharging the solute molecule and another 15 for decoupling its Lennard-Jones interactions with the remainder of the system. Each solvated system was simulated separately at each lambda value: it was first minimized in GROMACS with the steepest descents algorithm, then equilibrated for a total of 150 ps in three steps: (1) 50 ps at constant volume; (2) 50 ps at constant pressure with the Berendsen barostat; and (3) 50 ps at constant pressure with the Parrinello-Rahman barostat. These were followed by a 5 ns production phase at each lambda value, of which we typically discarded an additional 100 ps as equilibration.
Analysis was done with Alchemical-Analysis, utilizing pymbar, and the MBAR estimates were used to obtain the solvation free energies (and hence the predicted distribution coefficients) reported. No attempt has been made in these calculations to predict likely protonation states of the compounds, or to assess how their pKa or possible changes in protonation state might affect the distribution coefficients.
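For illustration, the conversion from the two solvation free energies to a cyclohexane/water log D can be sketched as below. This is a minimal sketch and not the analysis script used for this submission. It uses the standard relation log D = (dG_water - dG_cyclohexane) / (RT ln 10). The function names, the units (kcal/mol), the temperature of 298.15 K, and the input values are assumptions made for the example, since none of them is stated above. The uncertainty propagation further assumes that the errors of the two free energies are independent.

```python
import math

# Gas constant in kcal/(mol K); units of kcal/mol are an assumption of this sketch.
R_KCAL_PER_MOL_K = 1.98720425864083e-3


def log_d(dg_water, dg_cyclohexane, temperature=298.15):
    """Cyclohexane/water log D from solvation free energies (kcal/mol).

    A more favorable (more negative) solvation free energy in cyclohexane
    than in water gives a positive log D.
    """
    rt_ln10 = R_KCAL_PER_MOL_K * temperature * math.log(10)
    return (dg_water - dg_cyclohexane) / rt_ln10


def log_d_sem(sem_water, sem_cyclohexane, temperature=298.15):
    """Standard error of log D, assuming independent errors in the two legs."""
    rt_ln10 = R_KCAL_PER_MOL_K * temperature * math.log(10)
    return math.hypot(sem_water, sem_cyclohexane) / rt_ln10


# Hypothetical inputs, not results from this submission.
example_log_d = log_d(dg_water=-5.0, dg_cyclohexane=-8.0)
example_sem = log_d_sem(sem_water=0.05, sem_cyclohexane=0.05)
print(f"log D = {example_log_d:.2f} +/- {example_sem:.2f}")
```

At 298.15 K the factor RT ln 10 is about 1.36 kcal/mol, so each 1.36 kcal/mol of difference between the two solvation free energies shifts log D by one unit.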