The distribution coefficient (DC) component of SAMPL5 comprises 53 compounds, broken into three batches: Batch 0 (13 compounds); Batch 1 (20 compounds); and Batch 2 (20 compounds). You should upload a separate prediction file for each prediction methodology, or protocol, that you have used. The experimental data for the distribution coefficients are being reported as log D, where the logarithm is (as is standard) in base 10 and the distribution coefficient measures C_cyclohexane/C_water.
A prediction file is a plain text file which contains both your predictions, with uncertainty estimates as described below, and information about your computational protocol. The format requirements for this file are detailed below, and a sample file is available for download from the D3R website: DC-anytexthere-1.txt.
Note that you are free to submit multiple predictions, generated by different computational methods, for each dataset. Each prediction should be submitted separately, using a separate file.
If you are registered as “anonymous”, please note that any file names and files you submit are subject to public release, so you may want to avoid including identifying information in them.
The name of a prediction file must begin with DC and must end with an integer indicating which of your predictions it contains. For example, your first submission (even if you are only submitting one) might be DC-myname-1.txt, where myname is arbitrary text of your choice. If you use two prediction files (in two separate submissions) you might name them DC-myname-1.txt and DC-myname-2.txt.
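The naming rule above can be checked before upload. The following is a minimal sketch, not the organizers' parser: the function name `submission_index` is hypothetical, and the `.txt` extension is assumed from the example names rather than stated as a rule.

```python
import re

# Assumed pattern: "DC", arbitrary text, a trailing integer, then ".txt".
# The ".txt" extension is inferred from the examples, not from a stated rule.
PREDICTION_NAME = re.compile(r"^DC.*?(\d+)\.txt$")


def submission_index(filename):
    """Return the trailing integer of a prediction file name, or None if invalid."""
    match = PREDICTION_NAME.match(filename)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    for name in ("DC-myname-1.txt", "DC-myname-2.txt", "myname-1.txt"):
        print(name, submission_index(name))
```

Names without a trailing integer, such as DC-myname.txt, return None under this sketch.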
This file will be machine parsed, so correct formatting is essential and incorrectly formatted submissions will likely be rejected.
Lines beginning with a hash sign (#) may be included as comments. These, and blank lines, will be ignored when parsing these files.
The file must contain the following four components in the following order: your predictions, a name for your computational protocol, a list of the major software packages used, and a long-form methods description. Each of these components must begin with a line containing only the corresponding keyword: Predictions:, Name:, Software:, and Method:, as illustrated in the provided example file. Each of the four components is described in turn below; please also refer to the sample file.
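To see how the keyword layout, comment lines, and blank lines interact, here is a minimal sketch of a pre-submission check. It is an illustration only and not the parser used by the organizers; the function name `split_sections` is hypothetical, and treating indented `#` lines as comments is an assumption.

```python
KEYWORDS = ["Predictions:", "Name:", "Software:", "Method:"]


def split_sections(text):
    """Split prediction-file text into its four keyword sections.

    Returns a dict mapping each keyword to its list of content lines.
    Raises ValueError if a keyword is missing, repeated, or out of order.
    """
    sections = {}
    current = None
    expected = 0  # index into KEYWORDS of the next keyword we should see
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # comments and blank lines are ignored
        if line in KEYWORDS:
            if expected >= len(KEYWORDS) or line != KEYWORDS[expected]:
                raise ValueError(f"unexpected keyword {line!r}")
            current = line
            sections[current] = []
            expected += 1
        elif current is None:
            raise ValueError("content found before 'Predictions:'")
        else:
            sections[current].append(line)
    if expected != len(KEYWORDS):
        raise ValueError(f"missing section {KEYWORDS[expected]!r}")
    return sections
```

Running this on the sample file from the D3R website before adapting it to your own data is a quick way to confirm that your edits have not disturbed the required order.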
Each non-commented, nonblank line in the Predictions: component must contain the following four items, separated by commas:
The Predictions: section must contain predictions for Batch 0, Batches 0-1, or Batches 0-2 (the complete set). Missing compounds in a submission will result in an error (except in the case of standardization runs, discussed below).
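The batch rule can be checked mechanically once the prediction lines are in hand. The sketch below is illustrative only: the function name `check_predictions` is hypothetical, the batch membership must be supplied by you from the challenge data, and it ASSUMES that the first comma-separated item on each line is the compound identifier.

```python
def check_predictions(lines, batches):
    """Validate prediction lines against the batch rule.

    lines   -- non-comment, nonblank lines from the Predictions: section
    batches -- list of sets of compound IDs, ordered [Batch 0, Batch 1, Batch 2]
    Returns the number of batches covered (1, 2, or 3).
    Raises ValueError on a malformed line or an incomplete compound set.
    """
    submitted = set()
    for line in lines:
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 4:
            raise ValueError(f"expected 4 comma-separated items: {line!r}")
        submitted.add(fields[0])  # ASSUMPTION: first item is the compound ID

    allowed = set()
    for count, batch in enumerate(batches, start=1):
        allowed |= batch  # Batch 0, then Batches 0-1, then Batches 0-2
        if submitted == allowed:
            return count
    raise ValueError("compounds do not match Batch 0, Batches 0-1, or Batches 0-2")
```

A submission that covers Batch 1 but omits any Batch 0 compound raises an error here, mirroring the rule that partial sets are not accepted.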
The name of the protocol should be brief but informative, as illustrated in the example files. Ideally, it will say something about the nature of the method and the key parameters or settings, such as force field chosen or quantum chemistry level.
List the name and version number of each major software package used in your protocol, one package per line.
Please use the Method: section to provide a long-form description of the computational method used to make the predictions. The level of detail should be at least as complete as that of a typical “Computational Details” section of a computational paper. Thus, for a simulation-based method, it should describe the sampling methodology and extent, the force field(s) used, the method used to extract thermodynamic results from the predictions (e.g., solvation free energies of the neutral species at infinite dilution in water and cyclohexane were used to estimate the distribution coefficient), how statistical uncertainties were evaluated (e.g., statistical inefficiency or blocking analysis), and so on.
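As an illustration of the kind of thermodynamic relation a methods description might state, the sketch below converts two solvation free energies of the neutral species into a log D estimate using log D ≈ (ΔG_water − ΔG_cyclohexane) / (RT ln 10). It is a sketch under stated assumptions, not a prescribed protocol: it ignores ionization and other species in water, assumes a temperature of 298.15 K, assumes energies in kcal/mol, and propagates uncertainty as if the two free energies were independent. The function name `log_d_neutral` is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def log_d_neutral(dg_water, dg_cyclohexane, sem_water=0.0,
                  sem_cyclohexane=0.0, temperature=298.15):
    """Estimate log D (cyclohexane/water) from solvation free energies in kcal/mol.

    Returns (log_d, uncertainty). The uncertainty assumes the two free
    energies have independent errors. The default temperature is an assumption.
    """
    rt_ln10 = R_KCAL * temperature * math.log(10)
    # A more favorable (more negative) free energy in cyclohexane raises log D.
    log_d = (dg_water - dg_cyclohexane) / rt_ln10
    uncertainty = math.sqrt(sem_water ** 2 + sem_cyclohexane ** 2) / rt_ln10
    return log_d, uncertainty


if __name__ == "__main__":
    # Hypothetical free energies, for illustration only.
    print(log_d_neutral(-5.0, -7.0, sem_water=0.2, sem_cyclohexane=0.1))
```

If your protocol accounts for ionization, tautomers, or mutual solvent saturation, the description should say how, since this simple relation does not.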
If you are using MD with explicit solvent to compute distribution coefficients, we hope you have, at minimum, applied your method to the standard setups we provided. To submit your results for these special cases, please create a prediction file whose name begins with DCStandard, such as DCStandard-myname.txt. This file should be structured as detailed above, but need not include the full set of standard cases. It should include prediction lines only for those standard setups you actually ran.