The analysis of macromolecular structures often requires a comprehensive definition of atomic neighborhoods. Such a definition can be based on the Voronoi diagram of balls, where each ball represents an atom of some van der Waals radius. Voronota is a software tool for finding all the vertices of the Voronoi diagram of balls. Such vertices correspond to the centers of the empty tangent spheres defined by quadruples of balls. Voronota is especially suitable for processing three-dimensional structures of biological macromolecules such as proteins and RNA.
Since version 1.2, Voronota also uses the Voronoi vertices to construct inter-atom contact surfaces and solvent-accessible surfaces. Voronota provides tools to query contacts, generate contact graphics, compare contacts, and evaluate the quality of protein structural models using contacts.
Voronota is developed by Kliment Olechnovic (kliment@ibt.lt).
Download the latest archive from the official downloads page: https://bitbucket.org/kliment/voronota/downloads.
The archive contains a ready-to-use, statically compiled 'voronota' program for 64-bit Linux systems. This executable can be rebuilt from the provided source code to work on any modern Linux, Mac OS X or Windows operating system.
Packages in .deb or .rpm formats are currently not available. However, installing Voronota on Linux or Mac OS X is easy: just copy the Voronota executable files (the 'voronota' program and, if needed, the wrapper scripts) to one of the directories listed in the $PATH variable.
Voronota has no required external dependencies; only a standard-compliant C++ compiler is needed to build it.
For example, the "voronota" executable can be built from the sources in the "src" directory using the GNU C++ compiler:
g++ -O3 -o voronota src/*.cpp
You can also build using CMake for makefile generation. Starting in the directory containing the "CMakeLists.txt" file, run the following sequence of commands:
mkdir build ; cd build ; cmake ../ ; make ; cd ../ ; mv build/voronota voronota
To enable the usage of OpenMP for parallel processing when building with the C++ compiler directly, add the "-fopenmp" option:
g++ -O3 -fopenmp -o voronota src/*.cpp
When using CMake, OpenMP usage is enabled automatically if it is possible.
To enable the usage of MPI for parallel processing, you can use the mpic++ compiler wrapper. You also need to define the "ENABLE_MPI" macro when building:
mpic++ -O3 -DENABLE_MPI -o voronota ./src/*.cpp
Here is a basic example of computing Voronoi vertices for a structure in a PDB file:
./voronota get-balls-from-atoms-file < input.pdb > balls.txt
./voronota calculate-vertices < balls.txt > vertices.txt
The first command reads a PDB file "input.pdb" and outputs a file "balls.txt" that contains balls corresponding to the atoms in "input.pdb". By default, Voronota ignores all heteroatoms and all hydrogen atoms when reading PDB files; this behavior can be altered using command-line options. The second command reads "balls.txt" and outputs a file "vertices.txt" that contains the quadruples and empty tangent spheres that correspond to the vertices of the Voronoi diagram of the input balls. The formats of "balls.txt" and "vertices.txt" are described below.
In "balls.txt" the line format is "x y z r # comments". The first four values (x, y, z, r) are the coordinates and radius of an atomic ball. Comments are not needed for further calculations; they only assist human readers. For example, below is a part of a possible "balls.txt":
28.888 9.409 52.301 1.7 # 1 A 2 SER N
27.638 10.125 52.516 1.9 # 2 A 2 SER CA
26.499 9.639 51.644 1.75 # 3 A 2 SER C
26.606 8.656 50.915 1.49 # 4 A 2 SER O
27.783 11.635 52.378 1.91 # 5 A 2 SER CB
27.69 12.033 51.012 1.54 # 6 A 2 SER OG
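To illustrate the format, a "balls.txt" line can be split at the '#' sign and its four leading values converted to numbers. The Python sketch below is not part of Voronota; the function name parse_ball_line is hypothetical, and the sample line is taken from the listing above.

```python
def parse_ball_line(line):
    """Parse one 'x y z r # comments' line into ((x, y, z), r, comment)."""
    # Everything after the first '#' is a free-form comment for human readers.
    data, _, comment = line.partition("#")
    x, y, z, r = (float(value) for value in data.split()[:4])
    return (x, y, z), r, comment.strip()


sample = "28.888 9.409 52.301 1.7 # 1 A 2 SER N"
center, radius, comment = parse_ball_line(sample)
print(center, radius, comment)
```

A line with fewer than four numeric values raises ValueError, which makes malformed input easy to notice.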
In "vertices.txt" the line format is "q1 q2 q3 q4 x y z r". The first four numbers (q1, q2, q3, q4) are numbers of atomic records in "balls.txt", starting from 0. The remaining four values (x, y, z, r) are the coordinates and the radius of an empty tangent sphere of the quadruple of atoms. For example, below is a part of some possible "vertices.txt":
0 1 2 3 27.761 8.691 51.553 -0.169
0 1 2 23 28.275 9.804 50.131 0.588
0 1 3 1438 24.793 -3.225 60.761 14.047
0 1 4 5 28.785 10.604 50.721 0.283
0 1 4 1453 30.018 10.901 55.386 1.908
0 1 5 23 28.544 10.254 50.194 0.595
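As a sketch of how such lines might be consumed, the Python snippet below parses "vertices.txt" records and keeps the quadruples whose empty tangent sphere is at least as large as a chosen probe. It is not part of Voronota: the function name parse_vertex_line and the probe radius of 1.4 are assumptions made for illustration, and the input lines are the ones listed above.

```python
def parse_vertex_line(line):
    """Parse 'q1 q2 q3 q4 x y z r' into (quadruple, center, radius)."""
    fields = line.split()
    quadruple = tuple(int(value) for value in fields[:4])
    x, y, z, r = (float(value) for value in fields[4:8])
    return quadruple, (x, y, z), r


lines = [
    "0 1 2 3 27.761 8.691 51.553 -0.169",
    "0 1 2 23 28.275 9.804 50.131 0.588",
    "0 1 3 1438 24.793 -3.225 60.761 14.047",
    "0 1 4 5 28.785 10.604 50.721 0.283",
    "0 1 4 1453 30.018 10.901 55.386 1.908",
    "0 1 5 23 28.544 10.254 50.194 0.595",
]
vertices = [parse_vertex_line(line) for line in lines]

# Hypothetical probe radius; an empty tangent sphere of at least this
# radius leaves room for a probe of that size between the four balls.
probe_radius = 1.4
roomy = [quadruple for quadruple, _, r in vertices if r >= probe_radius]
print(roomy)
```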
Taking the "balls.txt" file described in the previous section, here is a basic example of computing inter-atom contacts:
./voronota calculate-contacts < balls.txt > contacts.txt
In "contacts.txt" file the line format is "b1 b2 area". The first two numbers (b1 and b2) are numbers of atomic records in "balls.txt", starting from 0. If b1 does not equal b2, then the 'area' value is the area of contact between atoms b1 and b2. If b1 equals b2, then the 'area' value is the solvent-accessible area of atom b1. For example, below is a part of some possible "contacts.txt":
0 0 35.440
0 1 15.908
0 2 0.167
0 3 7.025
0 4 7.021
0 5 0.624
0 23 2.849
0 25 0.008
0 26 11.323
0 1454 0.021
1 1 16.448
1 2 11.608
1 3 0.327
1 4 14.170
1 5 0.820
1 6 3.902
1 23 0.081
2 2 3.591
2 3 11.714
2 4 0.305
2 5 2.019
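The rule that distinguishes the two kinds of records (b1 equal to b2 or not) can be sketched in Python as follows. This is an illustrative reader, not Voronota code; the function name split_contacts is hypothetical, and the sketch assumes the default output where each pair of atoms is listed once.

```python
def split_contacts(lines):
    """Separate 'b1 b2 area' records into solvent-accessible and inter-atom areas."""
    solvent_areas = {}  # atom index -> solvent-accessible area
    contact_areas = {}  # (b1, b2) -> contact area
    for line in lines:
        b1, b2, area = line.split()
        b1, b2, area = int(b1), int(b2), float(area)
        if b1 == b2:
            solvent_areas[b1] = area
        else:
            contact_areas[(b1, b2)] = area
    return solvent_areas, contact_areas


lines = ["0 0 35.440", "0 1 15.908", "0 2 0.167", "1 1 16.448", "1 2 11.608"]
solvent_areas, contact_areas = split_contacts(lines)
print(solvent_areas)
print(contact_areas)
```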
Here is a basic example of computing annotated inter-atom contacts:
./voronota get-balls-from-atoms-file --annotated < input.pdb > annotated_balls.txt
./voronota calculate-contacts --annotated < annotated_balls.txt > annotated_contacts.txt
In "annotated_contacts.txt" the line format is "annotation1 annotation2 area distance tags adjuncts [graphics]". The strings 'annotation1' and 'annotation2' describe contacting atoms, the 'area' value is the area of contact between the two atoms, the 'distance' value is the distance between the centers of the contacting atoms. If 'annotation2' contains string "solvent", then the 'area' value is the solvent-accessible area of the atom described by 'annotation1'. The remaining part of the line is used by Voronota querying and drawing commands that are not covered in this section. Below is a part of some possible "annotated_contacts.txt":
c<A>r<2>a<1>R<SER>A<N> c<A>r<2>a<2>R<SER>A<CA> 15.908 1.456 . .
c<A>r<2>a<1>R<SER>A<N> c<A>r<2>a<3>R<SER>A<C> 0.167 2.488 . .
c<A>r<2>a<1>R<SER>A<N> c<A>r<2>a<4>R<SER>A<O> 7.025 2.774 . .
c<A>r<2>a<1>R<SER>A<N> c<A>r<2>a<5>R<SER>A<CB> 7.021 2.486 . .
c<A>r<2>a<1>R<SER>A<N> c<A>r<2>a<6>R<SER>A<OG> 0.624 3.159 . .
c<A>r<2>a<1>R<SER>A<N> c<A>r<5>a<24>R<GLU>A<CB> 2.849 4.628 . .
c<A>r<2>a<1>R<SER>A<N> c<A>r<5>a<26>R<GLU>A<CD> 0.008 4.792 . .
c<A>r<2>a<1>R<SER>A<N> c<A>r<5>a<27>R<GLU>A<OE1> 11.323 3.932 . .
c<A>r<2>a<1>R<SER>A<N> c<A>r<194>a<1501>R<LEU>A<CD2> 0.021 5.465 . .
c<A>r<2>a<1>R<SER>A<N> c<solvent> 35.440 5.9 . .
c<A>r<2>a<2>R<SER>A<CA> c<A>r<2>a<3>R<SER>A<C> 11.608 1.514 . .
c<A>r<2>a<2>R<SER>A<CA> c<A>r<2>a<4>R<SER>A<O> 0.327 2.405 . .
c<A>r<2>a<2>R<SER>A<CA> c<A>r<2>a<5>R<SER>A<CB> 14.170 1.523 . .
c<A>r<2>a<2>R<SER>A<CA> c<A>r<2>a<6>R<SER>A<OG> 0.820 2.430 . .
c<A>r<2>a<2>R<SER>A<CA> c<A>r<3>a<7>R<LYS>A<N> 3.902 2.371 . .
c<A>r<2>a<2>R<SER>A<CA> c<A>r<5>a<24>R<GLU>A<CB> 0.081 4.954 . .
c<A>r<2>a<2>R<SER>A<CA> c<solvent> 16.448 6.1 . .
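The annotation strings in the listing above appear to be sequences of "letter<value>" fields. Comparing them with the comments in "balls.txt" suggests that c is the chain, r the residue number, a the atom serial, R the residue name and A the atom name, but that reading is an inference from the samples. The Python sketch below splits annotations on that assumption; the names FIELD, parse_annotation and parse_annotated_contact are hypothetical.

```python
import re

# One field looks like 'c<A>' or 'R<SER>': a single letter, then a value in angle brackets.
FIELD = re.compile(r"([A-Za-z])<([^>]*)>")


def parse_annotation(annotation):
    """Turn 'c<A>r<2>a<1>R<SER>A<N>' into {'c': 'A', 'r': '2', ...}."""
    return dict(FIELD.findall(annotation))


def parse_annotated_contact(line):
    """Return (first, second, area, distance); remaining fields are ignored."""
    fields = line.split()
    first = parse_annotation(fields[0])
    second = parse_annotation(fields[1])
    return first, second, float(fields[2]), float(fields[3])


line = "c<A>r<2>a<1>R<SER>A<N> c<solvent> 35.440 5.9 . ."
first, second, area, distance = parse_annotated_contact(line)
is_solvent = second.get("c") == "solvent"
print(first, second, area, distance, is_solvent)
```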
The list of all available Voronota commands is displayed when executing Voronota without any parameters.
Command help is shown when "--help" command line option is present, for example:
./voronota calculate-vertices --help
Using "--help" option without specific command results in printing help for all commands:
./voronota --help
Name | Type | | Description
---|---|---|---
--annotated | | | flag to enable annotated mode
--include-heteroatoms | | | flag to include heteroatoms
--include-hydrogens | | | flag to include hydrogen atoms
--multimodel-chains | | | flag to read multiple models in PDB format and rename chains accordingly
--mmcif | | | flag to input in mmCIF format
--radii-file | string | | path to radii configuration file
--default-radius | number | | default atomic radius
--only-default-radius | | | flag to make all radii equal to the default radius
--hull-offset | number | | positive offset distance enables adding artificial hull balls
--help | | | flag to print usage help to stdout and exit
file in PDB or mmCIF format
list of balls
default mode line format: 'x y z r # atomSerial chainID resSeq resName atomName altLoc iCode'
annotated mode line format: 'annotation x y z r tags adjuncts'
Name | Type | | Description
---|---|---|---
--print-log | | | flag to print log of calculations
--exclude-hidden-balls | | | flag to exclude hidden input balls
--include-surplus-quadruples | | | flag to include surplus quadruples
--link | | | flag to output links between vertices
--init-radius-for-BSH | number | | initial radius for bounding sphere hierarchy
--check | | | flag to slowly check the resulting vertices (used only for testing)
--help | | | flag to print usage help to stdout and exit
list of balls (line format: 'x y z r')
list of Voronoi vertices, i.e. quadruples with tangent spheres (line format: 'q1 q2 q3 q4 x y z r')
Name | Type | | Description
---|---|---|---
--method | string | * | parallelization method name, variants are: 'simulated'
--parts | number | * | number of parts for splitting, must be power of 2
--print-log | | | flag to print log of calculations
--include-surplus-quadruples | | | flag to include surplus quadruples
--link | | | flag to output links between vertices
--init-radius-for-BSH | number | | initial radius for bounding sphere hierarchy
--help | | | flag to print usage help to stdout and exit
list of balls (line format: 'x y z r')
list of Voronoi vertices, i.e. quadruples with tangent spheres (line format: 'q1 q2 q3 q4 x y z r')
Name | Type | | Description
---|---|---|---
--annotated | | | flag to enable annotated mode
--probe | number | | probe radius
--exclude-hidden-balls | | | flag to exclude hidden input balls
--step | number | | curve step length
--projections | number | | curve optimization depth
--sih-depth | number | | spherical surface optimization depth
--add-mirrored | | | flag to add mirrored contacts to non-annotated output
--draw | | | flag to output graphics for annotated contacts
--tag-centrality | | | flag to tag contacts centrality
--volumes-output | string | | file path to output constrained cells volumes
--help | | | flag to print usage help to stdout and exit
list of balls
default mode line format: 'x y z r'
annotated mode line format: 'annotation x y z r tags adjuncts'
list of contacts
default mode line format: 'b1 b2 area'
annotated mode line format: 'annotation1 annotation2 area distance tags adjuncts [graphics]'
Name | Type | | Description
---|---|---|---
--match | string | | selection
--match-not | string | | negative selection
--match-tags | string | | tags to match
--match-tags-not | string | | tags to not match
--match-adjuncts | string | | adjuncts intervals to match
--match-adjuncts-not | string | | adjuncts intervals to not match
--match-external-annotations | string | | file path to input matchable annotations
--invert | | | flag to invert selection
--whole-residues | | | flag to select whole residues
--drop-atom-serials | | | flag to drop atom serial numbers from input
--drop-altloc-indicators | | | flag to drop alternate location indicators from input
--drop-tags | | | flag to drop all tags from input
--drop-adjuncts | | | flag to drop all adjuncts from input
--set-tags | string | | set tags instead of filtering
--set-dssp-info | string | | file path to input DSSP file
--set-adjuncts | string | | set adjuncts instead of filtering
--set-external-adjuncts | string | | file path to input external adjuncts
--set-external-adjuncts-name | string | | name for external adjuncts
--rename-chains | | | flag to rename input chains to be in interval from 'A' to 'Z'
--renumber-from-adjunct | string | | adjunct name to use for input residue renumbering
--renumber-positively | | | flag to increment residue numbers to make them positive
--reset-serials | | | flag to reset atom serial numbers
--set-seq-pos-adjunct | | | flag to set normalized sequence position adjunct
--set-ref-seq-num-adjunct | string | | file path to input reference sequence
--ref-seq-alignment | string | | file path to output alignment with reference
--seq-output | string | | file path to output query result sequence string
--chains-summary-output | string | | file path to output chains summary
--chains-seq-identity | number | | sequence identity threshold for chains summary
--help | | | flag to print usage help to stdout and exit
list of balls (line format: 'annotation x y z r tags adjuncts')
list of balls (line format: 'annotation x y z r tags adjuncts')
Name | Type | | Description
---|---|---|---
--match-first | string | | selection for first contacting group
--match-first-not | string | | negative selection for first contacting group
--match-second | string | | selection for second contacting group
--match-second-not | string | | negative selection for second contacting group
--match-min-seq-sep | number | | minimum residue sequence separation
--match-max-seq-sep | number | | maximum residue sequence separation
--match-min-area | number | | minimum contact area
--match-max-area | number | | maximum contact area
--match-min-dist | number | | minimum distance
--match-max-dist | number | | maximum distance
--match-tags | string | | tags to match
--match-tags-not | string | | tags to not match
--match-adjuncts | string | | adjuncts intervals to match
--match-adjuncts-not | string | | adjuncts intervals to not match
--match-external-first | string | | file path to input matchable annotations
--match-external-second | string | | file path to input matchable annotations
--match-external-pairs | string | | file path to input matchable annotation pairs
--no-solvent | | | flag to not include solvent accessible areas
--no-same-chain | | | flag to not include same chain contacts
--invert | | | flag to invert selection
--drop-tags | | | flag to drop all tags from input
--drop-adjuncts | | | flag to drop all adjuncts from input
--set-tags | string | | set tags instead of filtering
--set-hbplus-tags | string | | file path to input HBPLUS file
--set-distance-bins-tags | string | | list of distance thresholds
--inter-residue-hbplus-tags | | | flag to set inter-residue H-bond tags
--set-adjuncts | string | | set adjuncts instead of filtering
--set-external-adjuncts | string | | file path to input external adjuncts
--set-external-adjuncts-name | string | | name for external adjuncts
--renaming-map | string | | file path to input atoms renaming map
--inter-residue | | | flag to convert input to inter-residue contacts
--inter-residue-after | | | flag to convert output to inter-residue contacts
--summing-exceptions | string | | file path to input inter-residue summing exceptions annotations
--summarize | | | flag to output only summary of contacts
--preserve-graphics | | | flag to preserve graphics in output
--help | | | flag to print usage help to stdout and exit
list of contacts (line format: 'annotation1 annotation2 area distance tags adjuncts [graphics]')
list of contacts (line format: 'annotation1 annotation2 area distance tags adjuncts [graphics]')
Name | Type | | Description
---|---|---|---
--drawing-for-pymol | string | | file path to output drawing as pymol script
--drawing-for-jmol | string | | file path to output drawing as jmol script
--drawing-for-scenejs | string | | file path to output drawing as scenejs script
--drawing-name | string | | graphics object name for drawing output
--default-color | string | | default color for drawing output, in hex format, white is 0xFFFFFF
--adjunct-gradient | string | | adjunct name to use for gradient-based coloring
--adjunct-gradient-blue | number | | blue adjunct gradient value
--adjunct-gradient-red | number | | red adjunct gradient value
--adjuncts-rgb | | | flag to use RGB color values from adjuncts
--random-colors | | | flag to use random color for each drawn contact
--alpha | number | | alpha opacity value for drawing output
--use-labels | | | flag to use labels in drawing if possible
--help | | | flag to print usage help to stdout and exit
list of contacts (line format: 'annotation1 annotation2 area distance tags adjuncts graphics')
list of contacts (line format: 'annotation1 annotation2 area distance tags adjuncts graphics')
Name | Type | | Description
---|---|---|---
--potential-file | string | * | file path to input potential values
--ignorable-max-seq-sep | number | | maximum residue sequence separation for ignorable contacts
--inter-atom-scores-file | string | | file path to output inter-atom scores
--atom-scores-file | string | | file path to output atom scores
--depth | number | | neighborhood normalization depth
--help | | | flag to print usage help to stdout and exit
list of contacts (line format: 'annotation1 annotation2 conditions area')
global scores
Name | Type | | Description
---|---|---|---
--default-mean | number | | default mean parameter
--default-sd | number | | default standard deviation parameter
--means-and-sds-file | string | | file path to input atomic mean and sd parameters
--mean-shift | number | | mean shift in standard deviations
--external-weights-file | string | | file path to input external weights for global scoring
--smoothing-window | number | | window to smooth residue quality scores along sequence
--atom-scores-file | string | | file path to output atom scores
--residue-scores-file | string | | file path to output residue scores
--help | | | flag to print usage help to stdout and exit
list of atom energy descriptors
weighted average local score
Name | Type | | Description
---|---|---|---
--target-contacts-file | string | * | file path to input target contacts
--inter-atom-scores-file | string | | file path to output inter-atom scores
--inter-residue-scores-file | string | | file path to output inter-residue scores
--atom-scores-file | string | | file path to output atom scores
--residue-scores-file | string | | file path to output residue scores
--depth | number | | local neighborhood depth
--smoothing-window | number | | window to smooth residue scores along sequence
--smoothed-scores-file | string | | file path to output smoothed residue scores
--detailed-output | | | flag to enable detailed output
--help | | | flag to print usage help to stdout and exit
list of model contacts (line format: 'annotation1 annotation2 area')
global scores (atom-level and residue-level)
Name | Type | | Description
---|---|---|---
--reference-threshold | number | | reference scores classification threshold
--testable-step | number | | testable scores threshold step
--outcomes-file | string | | file path to output lines of 'threshold TP TN FP FN'
--ROC-curve-file | string | | file path to output ROC curve
--PR-curve-file | string | | file path to output PR curve
--help | | | flag to print usage help to stdout and exit
pairs of reference and testable scores files
global results
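The 'threshold TP TN FP FN' lines written via the "--outcomes-file" option carry enough information to derive one point of a ROC curve per threshold. The Python sketch below uses the textbook definitions of the true and false positive rates; whether the file written via "--ROC-curve-file" follows the same layout is not stated here. The function name roc_point and the sample line are made up for illustration.

```python
def roc_point(outcome_line):
    """Convert 'threshold TP TN FP FN' into (threshold, FPR, TPR)."""
    threshold, tp, tn, fp, fn = (float(value) for value in outcome_line.split())
    # True positive rate: fraction of actual positives classified as positive.
    tpr = tp / (tp + fn) if tp + fn else 0.0
    # False positive rate: fraction of actual negatives classified as positive.
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return threshold, fpr, tpr


# Hypothetical outcome line, not produced by Voronota.
threshold, fpr, tpr = roc_point("0.5 30 45 15 10")
print(threshold, fpr, tpr)
```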
The 'voronota-cadscore' script is an implementation of CAD-score (Contact Area Difference score) method using Voronota. The script command line arguments are:
-t input_target_file.pdb
-m input_model_file.pdb
[-a atoms_query_parameters_string]
[-c contacts_query_parameters_string]
[-r output_residue_scores_file]
[-s residue_scores_smoothing_window_size]
[-C cache_directory]
The 'voronota-voromqa' script is an implementation of VoroMQA (Voronoi diagram-based Model Quality Assessment) method using Voronota. The script command line arguments are:
-i input_file.pdb
[-a output_atom_scores_file]
[-r output_residue_scores_file]
[-s residue_scores_smoothing_window_size]
The 'voronota-bfactor' script is a utility for writing atom and residue scores (produced by CAD-score and VoroMQA scripts) as B-factor values in a PDB file. The script command line arguments are:
-p input_structure_file.pdb
-s input_scores_file
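For readers unfamiliar with the idea, the following Python sketch shows what writing a score as a B-factor value can look like for a single PDB record. It is not the 'voronota-bfactor' implementation, and it does not read the scores file, whose format is not described here. It relies only on the PDB convention that the temperature factor occupies columns 61-66 of ATOM and HETATM records; the function name set_bfactor is hypothetical.

```python
def set_bfactor(pdb_line, value):
    """Return the PDB record with its B-factor field (columns 61-66) replaced."""
    if not pdb_line.startswith(("ATOM", "HETATM")):
        return pdb_line  # only coordinate records carry a B-factor
    # Drop a trailing newline and pad short records up to column 66.
    padded = pdb_line.rstrip("\n").ljust(66)
    return padded[:60] + f"{value:6.2f}" + padded[66:]


# Build a correctly aligned ATOM record instead of typing the columns by hand.
record = (
    "ATOM  " + f"{1:5d}" + " " + " N  " + " " + "SER" + " " + "A" + f"{2:4d}" + "    "
    + f"{28.888:8.3f}{9.409:8.3f}{52.301:8.3f}" + f"{1.0:6.2f}{20.0:6.2f}"
)
updated = set_bfactor(record, 0.5)
print(updated)
```

Values that need more than six characters in this format would break the column layout, so real scores may need scaling or clamping first.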
The 'voronota-contacts' script provides a way for calculating and querying interatomic contacts with just one command (without the need to construct a pipeline from 'voronota' calls). The script command line arguments are:
-i input_file.pdb
[-a atoms_query_parameters_string]
[-c contacts_query_parameters_string]
[-C cache_directory]