
We have provided several useful validation tools for structures determined by X-ray crystallography. These checks should be performed before you submit your files to pdb_extract for annotation, but should not serve as a replacement for complete validation.
Three popular programs (sfcheck, refmac, and Phenix.model_vs_data) are used to check your structure factors and coordinates.
Sfcheck (v7.04.2) is a tool included as a part of the CCP4 refinement package, and assesses the refined atomic coordinates (protein, nucleic acid, and ligand) against your structure factors without the need for ligand chemical information libraries. If your structure factors are twinned, then use the twinned structure validation tool found at the bottom of the SF-Tool page instead. Detailed information about this program can be found at the Official sfcheck Documentation site.
TO RUN SFCHECK,
This tool will then convert your files to mmCIF format for processing.
The sfcheck tool will return the following report:
Sf-convert will automatically convert your structure factor file to any of the following formats: mmCIF, MTZ, CNS/CNX, XPLOR, SHELX, TNT, HKL2000, SCALEPACK, D*Trek, SAINT, or generic OTHER format.
You can download the desktop version of sf-convert here.
To convert a structure factor file using the online version, input your Structure Factor File into the SF-Tool form. You will also need to provide your coordinate file if you wish to convert to the MTZ format, since its format also contains unit cell parameters (a, b, c, alpha, beta, gamma, symmetry) and resolution information.
TO CONVERT YOUR STRUCTURE FACTOR FILES:
If you wish to convert TNT, SHELX, or OTHER SF files to another format, you must indicate whether the data were calculated as amplitudes (F) or intensities (I).
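The distinction matters because amplitudes and intensities are related by I = F^2, so a column read as the wrong type would be badly mis-scaled. The following is a minimal sketch of that relationship using first-order error propagation for the sigmas. The function names are hypothetical and this is not sf-convert's own code; real programs treat weak or negative intensities with more careful statistics.

```python
import math


def amplitude_to_intensity(f, sig_f):
    """Convert an amplitude F and its sigma to an intensity (I = F^2)."""
    intensity = f * f
    # First-order error propagation: sigma(I) is approximately 2 * |F| * sigma(F).
    sig_i = 2.0 * abs(f) * sig_f
    return intensity, sig_i


def intensity_to_amplitude(i, sig_i):
    """Convert a positive intensity I and its sigma to an amplitude (F = sqrt(I))."""
    if i <= 0:
        # Assumption for this sketch: non-positive intensities are rejected
        # rather than estimated statistically.
        raise ValueError("intensity must be positive for a simple square root")
    f = math.sqrt(i)
    # Inverse of the propagation above: sigma(F) is approximately sigma(I) / (2 * F).
    sig_f = sig_i / (2.0 * f)
    return f, sig_f
```

For example, an amplitude of 10.0 with a sigma of 0.5 corresponds to an intensity of 100.0 with a sigma of 10.0.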
Most reflection/structure factor data are organized into (at minimum) 5 columns (see examples). The default column assignments used by most programs are:
H K L F (or I) SigF (or SigI or status)
...and a floating Free_R flag column, with at least one space between columns. These column labels might not be displayed, depending on the program's particular format, but would be displayed in an mtz_dump output.
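As an illustration of this layout, here is a rough sketch of reading one whitespace-separated reflection line. The Reflection type and parse_reflection_line function are hypothetical names, and the sketch assumes that the Free_R flag, when present, sits in the sixth column and that the fifth column is a numeric sigma. Actual formats vary between programs.

```python
from typing import NamedTuple, Optional


class Reflection(NamedTuple):
    h: int
    k: int
    l: int
    value: float               # F or I, depending on the file
    sigma: float               # SigF or SigI
    free_flag: Optional[int]   # None when the line has no Free_R column


def parse_reflection_line(line: str) -> Reflection:
    """Parse one reflection line of the form: H K L F SigF [Free_R]."""
    fields = line.split()  # splits on any run of spaces or tabs
    if len(fields) < 5:
        raise ValueError("expected at least 5 columns, got %d" % len(fields))
    h, k, l = (int(x) for x in fields[:3])
    value = float(fields[3])
    sigma = float(fields[4])
    # Assumption: the optional sixth column is the Free_R flag; some
    # programs write it as a float such as "1.0".
    free_flag = int(float(fields[5])) if len(fields) > 5 else None
    return Reflection(h, k, l, value, sigma, free_flag)
```

A line such as "1 0 2 345.6 12.3 1" would be read as indices (1, 0, 2), a value of 345.6, a sigma of 12.3, and a flag of 1.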
An automatic conversion converts one standardized SF format to another. If you are using one of the accepted formats (column labels, etc.), and you have made no manual changes to the structure factor file format yourself, then this method will work for you. Our tool is programmed to recognize which columns correspond to each parameter (depending on the program's format) and "automatically" converts the data to the new format accordingly.
However, if you used a novel program whose SF file format does not match the defined criteria, or you have manually changed the column labels, then your file format will not be recognized by sf-convert. In this case, a semi-automatic conversion is necessary, which requires some additional input from the user. A table will be displayed where you must match the appropriate SF parameters with your column labels. This tool will then perform a complete DATA TYPE conversion from CNS or MTZ format to the mmCIF format, rather than only converting the essential data as in the automatic conversion.
During conversion, you can also set aside 5, 8, or 10% of the reflection data for cross-validation (Free_R factor). You can either make a new Free_R selection or repopulate the reflection list. To do this, write the percentage as a whole number in the provided box. If you want to leave the current Free_R flags untouched, leave this box empty.
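Conceptually, setting aside a cross-validation set means flagging a random fraction of the reflections. The sketch below shows one way this could be done. The assign_free_flags function is hypothetical, the convention of 1 for free and 0 for working reflections is an assumption, and the tool's actual selection procedure may differ.

```python
import random


def assign_free_flags(n_reflections, percent, seed=0):
    """Return a list of flags, 1 for the free (test) set and 0 otherwise."""
    if percent not in (5, 8, 10):
        raise ValueError("percent must be 5, 8, or 10")
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    n_free = round(n_reflections * percent / 100)
    # Choose n_free distinct reflection positions at random.
    free_positions = set(rng.sample(range(n_reflections), n_free))
    return [1 if i in free_positions else 0 for i in range(n_reflections)]


flags = assign_free_flags(1000, 10)
print(sum(flags), "of", len(flags), "reflections flagged as free")
```

Using a fixed seed keeps the same reflections in the free set each time the selection is regenerated from the same list.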
If there are improperly labeled alternative conformers in your PDB file, this option will fix them for you.
To check for twinned structure factors, input your coordinate file in PDB format and your SF file in mmCIF format. The tool will then generate a CNS input file and pass it through the CNS twinning script. The tool will then display a statistical report indicating whether your structure factors are twinned.
To either detwin your structure factors or validate a model derived from a twinned dataset, input your coordinate file in PDB format and your SF file in mmCIF format. Also select the appropriate twinning operator from the drop-down menu, and indicate the twin fraction. These tools will then evaluate the query using a CNS script.
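To show why both the twinning operator and the twin fraction are needed, here is a sketch of the standard algebraic detwinning of a hemihedrally twinned pair of intensities. It assumes the two reflections are related by the twin operator and that the twin fraction alpha is below 0.5. The detwin_pair function is a hypothetical name, and this is not necessarily the procedure the CNS script follows.

```python
def detwin_pair(i_obs_1, i_obs_2, alpha):
    """Recover true intensities from a twin-related pair of observations.

    Model assumed here:
        i_obs_1 = (1 - alpha) * i_1 + alpha * i_2
        i_obs_2 = alpha * i_1 + (1 - alpha) * i_2
    """
    if not 0.0 <= alpha < 0.5:
        # At alpha = 0.5 the two equations are identical and cannot be solved.
        raise ValueError("twin fraction must satisfy 0 <= alpha < 0.5")
    scale = 1.0 - 2.0 * alpha
    i_1 = ((1.0 - alpha) * i_obs_1 - alpha * i_obs_2) / scale
    i_2 = ((1.0 - alpha) * i_obs_2 - alpha * i_obs_1) / scale
    return i_1, i_2
```

The division by (1 - 2 * alpha) amplifies measurement errors as the twin fraction approaches 0.5, which is one reason validating against the twinned data directly can be preferable to detwinning.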
Questions, comments, and suggestions should be sent to deposit@deposit.rcsb.org.