File Formats for Submitting Grand Challenge 2015 Predictions - Stage 2

January 25, 2015

For Stage 2 of Grand Challenge 2015, you may submit ligand affinity scores or rankings for the full set of HSP90 ligands, and for the 18 MAP4K4 ligands with measured affinities. In addition, you may submit binding free energy calculations for the designated free energy subsets of the HSP90 ligands. Note that pose predictions are not part of the Stage 2 challenge. All predictions must be submitted in the form of gzipped tar (.tgz) files of two possible types:

  • A score file contains ligand scores or ranks, without any pose predictions, for the full set of ligands.
  • A free energy file contains ligand binding affinities computed for the small “FEP” compound sets in the challenge.

These files are summarized in Figure 1, and the subsequent text details their contents and format. Additionally, you can download completed example Score.tgz and Free_Energy.tgz files, as well as blank template files for ligand scores, ligand scoring protocols, free energy protocols, and free energy predictions. The examples and templates are all based on the HSP90 challenge.


Figure 1. Diagrammed contents of the two types of tgz files for Stage 2 of Grand Challenge 2015. See text for details.

Preferred structure of a submission file

This is the output from "tar -tzf myScore.tar.gz".
Score/
LigandScores-1.csv
LigandScoringProtocol-1.txt
Things to note:
1. Single directory of files. Name of the directory does not have to be "Score".
2. The directory of files is only one level deep. Please do not include a directory tree like "/home/username/hsp90/Score". The following is an example tar command that removes the extra directories:
     tar -cvzf myScore.tar.gz --directory=/home/username/hsp90 Score
3. No extra files, such as .sh scripts, Excel files, other tar files, etc.
4. No extra Mac-specific files. These show up as "._LigandScores-1.csv" or ".DS_Store". The following is an example tar command that leaves out those extra Mac files:
     tar --disable-copyfile --exclude=.DS_Store -cvzf myScore.tar.gz Score
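
The same packaging rules can also be applied from a script. The following is a minimal sketch using Python's standard tarfile module; the function names make_submission and _skip_mac_files are illustrative and not part of the challenge tooling. It stores the directory under its own name only, and leaves out ".DS_Store" and "._*" files.

```python
import os
import tarfile


def _skip_mac_files(member):
    """Return None for macOS metadata files so that tarfile leaves them out."""
    name = os.path.basename(member.name)
    if name == ".DS_Store" or name.startswith("._"):
        return None
    return member


def make_submission(directory, archive_path):
    """Pack `directory` into a gzipped tar file with a single top-level folder."""
    # Store only the last path component, e.g. "Score" instead of
    # "/home/username/hsp90/Score".
    top = os.path.basename(os.path.normpath(directory))
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(directory, arcname=top, filter=_skip_mac_files)
    return archive_path
```

For example, make_submission("/home/username/hsp90/Score", "myScore.tgz") would produce an archive whose entries all start with "Score/". Other extra files, such as scripts or spreadsheets, still have to be removed by hand.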

Score tgz file

A score tgz file is used to submit one to ten sets of ligand scores or rankings, generated by any method. For example, protocols for generating these scores or rankings may be based on a QSAR method, a neural network model, or a docking protocol. The name of a score tgz file must include the string “score”, so that the file name has the form *score*.tgz; the strings “free_energy” and “dock” must not appear in the file name. The file contains the descriptions of, and results from, one to ten ligand scoring or ranking methods. If more than one scoring protocol and results set is included in the file, the files describing them must have distinct names.

Each ligand scoring or ranking is described by two files: a ligand scoring protocol file, named LigandScoringProtocol-n.txt, where n is an integer from 1 to a maximum of 10; and a ligand scoring results file, named LigandScores-n.csv, which contains the scores and/or rankings generated by the corresponding protocol. Please create these according to the instructions given below, in “Ligand scoring protocol and result files”.
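
Before packing a score directory, it can help to confirm that every protocol file has a matching results file. The following is a rough sketch of such a check; the function check_score_files and its messages are hypothetical, and the check reflects only the naming rules stated above, not the behavior of the actual submission system.

```python
import re

# File name patterns taken from the naming rules above; n has no leading zero.
PROTOCOL_RE = re.compile(r"^LigandScoringProtocol-([1-9]\d*)\.txt$")
SCORES_RE = re.compile(r"^LigandScores-([1-9]\d*)\.csv$")


def check_score_files(filenames):
    """Return a list of problems found in the file names of a score directory."""
    protocols, scores, problems = set(), set(), []
    for name in filenames:
        p, s = PROTOCOL_RE.match(name), SCORES_RE.match(name)
        if p:
            protocols.add(int(p.group(1)))
        elif s:
            scores.add(int(s.group(1)))
        else:
            problems.append(f"unexpected file: {name}")
    for n in sorted(protocols | scores):
        if not 1 <= n <= 10:
            problems.append(f"index {n} is outside 1-10")
        if n not in protocols:
            problems.append(f"LigandScores-{n}.csv has no protocol file")
        if n not in scores:
            problems.append(f"LigandScoringProtocol-{n}.txt has no results file")
    if not protocols and not scores:
        problems.append("no protocol/results pair found")
    return problems
```

An empty list means that the names are consistent; for instance, check_score_files(os.listdir("Score")) could be run (after importing os) just before creating the archive.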

Ligand scoring protocol and result files

Each ligand scoring or ranking in a Score tgz file is described by two files: a ligand scoring protocol file, named LigandScoringProtocol-n.txt, where n is an integer from 1 to a maximum of 10; and a ligand scoring results file, named LigandScores-n.csv, which contains the scores and/or rankings generated by the corresponding protocol. Both files are described below.

The ligand scoring protocol file must contain a brief, structured summary, in the form of a plain-text document, of how you scored the ligands according to predicted affinity for the target protein. A template file is provided for your convenience, as is a sample filled-out file. Lines beginning with a hash sign (#) may be included as comments. The file must contain the following components, as illustrated in the template and example:

  • Your informal brief name for the protocol
  • A list of the major software packages and their versions used in the protocol
  • A listing of the key parameters used in the calculations
  • A brief narrative of the procedure.

The ligand scoring results file lists your rankings and scores or energies of the binding strengths of the ligands. Please refer to the template and example files. Again, lines beginning with # will be treated as comments.

Since some scoring methods provide results interpretable as binding energies or free energies, while others provide scores without well-defined units, the first non-comment line of your file must state whether you are providing energies or scores. This line must take one of the following forms:

Type: energy

or

Type: score

If your results are given as energies, they must be in units of kcal/mol.

Each subsequent non-comment line of the file comprises the identifier of one ligand for the protein target in question; your ranking of the ligand within the set, where 1 corresponds to maximal affinity; and your computed binding energy, free energy, or score. These three items should be separated by commas.

The template file is prefilled with the list of ligand identifiers, for your convenience.
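
As an illustration of the layout described above, the following sketch builds the text of a results file from a table of computed values. The function name format_ligand_scores, the lower_is_better option, and the ligand identifiers in the usage example are placeholders; the real identifiers should be taken from the template file.

```python
import csv
import io


def format_ligand_scores(values, kind="energy", lower_is_better=True):
    """Build the text of a LigandScores-n.csv file.

    values: dict mapping ligand identifier -> energy (kcal/mol) or score.
    Assumption: rank 1 (maximal affinity) goes to the lowest value unless
    lower_is_better is set to False.
    """
    if kind not in ("energy", "score"):
        raise ValueError("kind must be 'energy' or 'score'")
    ordered = sorted(values, key=values.get, reverse=not lower_is_better)
    rank = {lig: i for i, lig in enumerate(ordered, start=1)}
    out = io.StringIO()
    # First non-comment line states the type of the reported values.
    out.write(f"Type: {kind}\n")
    writer = csv.writer(out, lineterminator="\n")
    # One line per ligand: identifier, rank, value.
    for lig in values:
        writer.writerow([lig, rank[lig], values[lig]])
    return out.getvalue()
```

With the placeholder input {"lig_a": -7.2, "lig_b": -9.1, "lig_c": -5.0}, the result is a "Type: energy" line followed by "lig_a,2,-7.2", "lig_b,1,-9.1", and "lig_c,3,-5.0".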

Separate files will be used for the smaller “FEP” prediction sets (see below). However, if you used a free energy method for all of the compounds, you can use the present file format to document these calculations.

Free_energy tgz file

If you used a free energy method, such as FEP or TI, to compute the absolute or relative binding free energies of the compounds in the small HSP90 “FEP” compound sets, please submit these predictions in a free energy tgz file. The name of a free energy tgz file must include the string “free_energy”, so that the file name has the form *free_energy*.tgz; the strings “score” and “dock” must not appear in the file name. Note that you are free to submit both ligand scores for the full set of compounds (above) and free energy calculations for any or all of these smaller sets. Each free_energy tgz file should include the results from a single free energy protocol for all of the compound sets to which you applied this protocol.
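
The naming rules for the two file types can be summarized in a short check. This is only a sketch: the function classify_submission is hypothetical, and the case-insensitive comparison is an assumption, suggested by example names such as Score.tgz and Free_Energy.tgz.

```python
def classify_submission(filename):
    """Return 'score', 'free_energy', or None for a submission file name."""
    # Assumption: matching ignores case, as in "Score.tgz".
    name = filename.lower()
    if not name.endswith(".tgz"):
        return None
    has = {key: key in name for key in ("score", "free_energy", "dock")}
    if has["dock"]:
        return None  # "dock" may not appear in either type of file name
    if has["score"] and not has["free_energy"]:
        return "score"
    if has["free_energy"] and not has["score"]:
        return "free_energy"
    return None  # neither string present, or both at once
```

A return value of None indicates that the name fits neither pattern and should be changed before submission.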

The free energy calculation protocol file must be named FreeEnergyProtocol.txt, and it must contain a brief, structured summary, in the form of a plain-text document, of how you computed the binding free energies of the ligands for the target protein. A template file is provided for your convenience, as is a sample filled-out file. Lines beginning with a hash sign (#) may be included as comments. The required components are the same as those for the ligand scoring protocol file, though the methodology used may well be different. As illustrated in the template and example, the file must contain the following components:

  • Your informal brief name for the protocol
  • A list of the major software packages and their versions used in the protocol
  • A listing of the key parameters used in the calculations
  • A brief narrative of the procedure.

As shown in the free energy template and example files, the free energies for each set of compounds should be submitted in a format similar to that used for ligand scoring, and named according to set; e.g., FreeEnergiesSet1.csv, FreeEnergiesSet2.csv, FreeEnergiesSet3.csv. Thus, each line should list one ligand, followed by the predicted binding free energy (kcal/mol), and then by your estimate of the numerical uncertainty (standard error of the mean) in the prediction due to limitations in sampling. If you computed relative binding free energies, the results should all be referenced to the first ligand listed (see template), which hence should be assigned a free energy of 0.0 with zero uncertainty.
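
The following sketch shows one way to lay out such a file for relative free energies. The function format_free_energies and the ligand identifiers in the usage example are placeholders, and the exact header lines expected in the file should be checked against the template; only the ligand, free energy, uncertainty ordering and the zero-valued reference line come from the description above.

```python
import csv
import io


def format_free_energies(rows, reference=None):
    """Build the text of a FreeEnergiesSetN.csv file.

    rows: list of (ligand_id, free_energy, std_error), all in kcal/mol.
    reference: for relative free energies, the identifier of the reference
    ligand; the values in `rows` are then assumed to be relative to it.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    if reference is not None:
        # The reference ligand is listed first with 0.0 and zero uncertainty.
        writer.writerow([reference, 0.0, 0.0])
    for ligand, free_energy, std_error in rows:
        writer.writerow([ligand, free_energy, std_error])
    return out.getvalue()
```

For instance, format_free_energies([("lig_b", -1.3, 0.2), ("lig_c", 0.8, 0.3)], reference="lig_a") yields the lines "lig_a,0.0,0.0", "lig_b,-1.3,0.2", and "lig_c,0.8,0.3". For absolute free energies, the reference argument is simply omitted.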
