User Interface

Development resources have been focused on energy functions, on algorithms for searching conformation space, and on a handful of supporting utilities, at the expense of a graphical interface. The ereg package interfaces with programs for graphical display (such as PyMOL) through PDB-format files. The user interface consists of 12 commands typed at the macOS (or Linux) command prompt. The following command-line arguments, each replaced by the user with a meaningful string of characters, specify the associated objects.

FAM  A project name.
MOL  A molecule or set of molecules that compose a system.
CNF  A structure or conformation of the system.
SUB  A subset of the set of rigid-geometry degrees of freedom of the system.
GRP  A collection of templates to be used in homology model building.

The user interface is designed to be an easily usable tool set, providing functionality in factored units that can be combined to accomplish a range of modeling studies.
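As a rough sketch of how the commands might be combined from a script, the helper below assembles an argument list from the command synopses given in the Central Commands section. The helper name `build_argv` and the project, molecule, conformation, and subset names are hypothetical, and edoc is left out because it takes a variable number of molecules. Nothing here is part of the ereg distribution.

```python
# Argument signatures copied from the command synopses on this page.
# edoc is omitted because it takes a variable number of molecules.
SIGNATURES = {
    "greg": ("FAM", "MOL"),
    "ereg": ("FAM", "MOL"),
    "estp": ("FAM", "MOL", "CNF", "SUB"),
    "prof": ("FAM", "MOL", "CNF"),
    "rcyc": ("FAM", "MOL", "CNF"),
    "hlog": ("FAM", "MOL", "GRP"),
    "igor": ("FAM", "MOL"),
    "ptra": ("FAM", "MOL", "CNF", "SUB"),
    "ionstate": ("FAM", "MOL"),
}


def build_argv(command, **values):
    """Return the argument list for one command, checking that every
    placeholder in its signature was given a non-empty string."""
    names = SIGNATURES[command]
    missing = [n for n in names if not values.get(n)]
    extra = [n for n in values if n not in names]
    if missing or extra:
        raise ValueError(f"{command}: missing {missing}, unexpected {extra}")
    return [command] + [values[n] for n in names]


# Hypothetical names, for illustration only.
argv = build_argv("estp", FAM="demo", MOL="lysozyme", CNF="c0", SUB="loop1")
print(" ".join(argv))
```

The resulting list could be passed to Python's `subprocess.run` once the package is installed and on the search path. How each command names its output conformations is described in the User Manual, not here.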

For computers running macOS, ereg includes a simple viewing app (written in Apple's Swift and Metal languages) for graphical display of program-generated structures.

Central Commands

The functionality of the package is accessed through the following 10 commands.

geometry regularization

greg FAM MOL

The energy surface is defined for a rigid-geometry model. To access the energy surface for structure prediction, a generalized structure of a molecule or system of molecules must be moved into the subspace of structures consistent with regularized geometry.

The primary use of this command is geometry regularization of the experimental structures in a collection of templates, in preparation for homology model building. A second, less common use is geometry regularization of large structures, as an alternative to the ereg command, in preparation for structure prediction of localized regions.

local energy minimization

ereg FAM MOL

The ereg command is a better alternative to the greg command for bringing an experimental structure into regularized geometry in preparation for sequence design, prediction of pKa values of ionizable groups, or structure prediction of localized regions. Another common use is evaluation of the full energy of a structure, to support comparison with other structures.

segment structure prediction and sequence design

estp FAM MOL CNF SUB

Two of the most useful functions of the ereg program are accessed using the estp command. Structure prediction for segments of proteins is achieved by a search through conformation space. Sequence design for thermodynamic stability is achieved by an additional search through sequence space.

For a rigid-geometry mechanical system MOL, using conformation CNF as the starting structure, the estp command searches for the conformation that minimizes the full energy function within a subspace SUB of the full space of motion. The subspace of conformations to be searched consists of 1 to 8 segments, each 7 to 13 residues in length, plus a collection of side chains.
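The segment limits stated above can be checked before a search is set up. The sketch below is only an illustration: it assumes segments are given as inclusive pairs of residue numbers, which is a convenient representation for the example and not the SUB format used by ereg, and the function name `check_segments` is hypothetical.

```python
MIN_SEGMENTS, MAX_SEGMENTS = 1, 8      # segments per search, from the text
MIN_RESIDUES, MAX_RESIDUES = 7, 13     # residues per segment, from the text


def check_segments(segments):
    """Return a list of problems for a proposed set of segments.

    segments: list of (first, last) residue numbers, inclusive.
    An empty result means the set is within the stated limits.
    """
    problems = []
    if not MIN_SEGMENTS <= len(segments) <= MAX_SEGMENTS:
        problems.append(
            f"{len(segments)} segments "
            f"(allowed {MIN_SEGMENTS} to {MAX_SEGMENTS})"
        )
    for first, last in segments:
        length = last - first + 1
        if not MIN_RESIDUES <= length <= MAX_RESIDUES:
            problems.append(
                f"segment {first}-{last} has {length} residues "
                f"(allowed {MIN_RESIDUES} to {MAX_RESIDUES})"
            )
    return problems


# Hypothetical residue ranges: the second segment is too short.
for problem in check_segments([(24, 33), (71, 75)]):
    print(problem)
```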

For each sequence in the specified space of sequences, the program searches through the specified space of conformations. The lowest-energy conformation found is used to evaluate dG, an estimate of the free energy of folding. Because different models are used to represent the folded state of the protein and the reference unfolded state, dG is not, for a single sequence, a physically meaningful measure of stability. However, ddG = dG(sequence1) - dG(sequence0), the change in dG with sequence, does provide a meaningful measure of relative stability.
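A small worked example of the ddG comparison follows. The dG values are invented for illustration and are not program output. The ranking assumes the usual convention that a more negative free energy of folding corresponds to a more stable fold, which this page does not state explicitly.

```python
def ddg(dg_variant, dg_reference):
    """Change in the folding free energy estimate relative to a reference."""
    return dg_variant - dg_reference


# Hypothetical dG estimates (arbitrary energy units), for illustration only.
dg = {"sequence0": -41.0, "sequence1": -43.5, "sequence2": -39.5}

# Express every sequence relative to the reference, sequence0.
relative = {name: ddg(value, dg["sequence0"]) for name, value in dg.items()}

# List the sequences from most negative ddG upward.
for name, value in sorted(relative.items(), key=lambda item: item[1]):
    print(f"{name}: ddG = {value:+.1f}")
```

Only the differences are compared. The individual dG values are never interpreted on their own, in line with the caveat above.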

structure quality assessment

prof FAM MOL CNF

Structure quality assessment identifies chain segments likely to be improved by application of molecular mechanics-based structure prediction.

energy refinement of a homology model

rcyc FAM MOL CNF

Common uses of energy refinement are to improve structures created by the hlog or igor commands, or to explore conformations in the region of an experimental structure.

homology model building

hlog FAM MOL GRP

Homology model building leads to one of the major applications of molecular mechanics-based structure prediction: prediction of the structure of surface loops, for which knowledge-based structure prediction may not be reliable.

ab initio fold prediction

igor FAM MOL

A description of the igor model and definitions of element composition and fold are given on the technology page of this website. A common use of predicted folds is as starting points for global energy minimization using the rcyc, estp, or ptra commands.

guided trajectory search

ptra FAM MOL CNF SUB

As a tool for searching conformation space, the ptra command complements the estp command. A search directed by the estp command focuses computation on a spatially localized region of a larger structure, and is more efficient when productive motions are concentrated within one or more segments. A search directed by the ptra command enables full chain flexibility, and so facilitates unconstrained motions such as changes to the packing configuration of helices and sheets.

A description of the guided trajectory search algorithm is given on the technology page of this website. The primary use of this command is structure prediction for small proteins, or, more generally, search through large, unconstrained subspaces of generalized coordinates. A second, less common, use is energy refinement as an alternative to the rcyc command.

ionizable group pKa prediction

ionstate FAM MOL

A common use of the ionstate command is to calculate the most probable ionization state for a specified pH, and to generate a most probable conformation consistent with this ionization state. This usage enables subsequent modeling of the most probable sequence, including protonation state, for a given pH.
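To illustrate what "most probable ionization state for a specified pH" means for a single group, the sketch below applies the standard Henderson-Hasselbalch relation to isolated groups. This is not the ionstate algorithm: it treats each group independently, whereas groups in a protein interact. The group labels and pKa values are hypothetical.

```python
def protonated_fraction(pka, ph):
    """Fraction of an isolated titratable group that is protonated,
    from the Henderson-Hasselbalch relation."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def most_probable_state(pka, ph):
    """The state occupied at least half of the time at the given pH."""
    return "protonated" if protonated_fraction(pka, ph) >= 0.5 else "deprotonated"


# Hypothetical predicted pKa values, for illustration only.
predicted_pka = {"ASP 52": 3.5, "HIS 15": 6.5, "LYS 33": 10.5}
ph = 7.0

for group, pka in predicted_pka.items():
    fraction = protonated_fraction(pka, ph)
    print(f"{group}: {most_probable_state(pka, ph)} "
          f"(protonated fraction {fraction:.3f})")
```

A group is predominantly protonated when the pH is below its pKa and predominantly deprotonated when the pH is above it.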

docking prediction

edoc FAM MOL1, MOL2, ... MOLn

The input structures are packed as rigid bodies. For docking of two rigid bodies, the space of conformations is defined by 6 degrees of freedom: translation and rotation of the second body with respect to the first. This continuous space is replaced with a grid consisting of 104,857,600 discrete conformations. The search algorithm optimizes a packing score over the discretized space.
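The idea of replacing a continuous six-dimensional space with a grid and optimizing a score over it can be sketched as follows. Everything specific in this example is an assumption: the per-axis ranges and counts, the toy score, and the choice to maximize. The page does not say how the grid points are distributed across the six axes or how the packing score is defined, and an exhaustive scan is used here only because the toy grid is tiny.

```python
import itertools
import math


def grid_axis(lo, hi, n):
    """Centers of n equal-width cells covering the interval [lo, hi]."""
    step = (hi - lo) / n
    return [lo + (i + 0.5) * step for i in range(n)]


def best_pose(axes, score):
    """Exhaustively scan the grid and return the highest-scoring pose."""
    return max(itertools.product(*axes), key=score)


# Hypothetical grid: 3 translations (angstroms) and 3 rotations (degrees),
# 4 points per axis, giving 4**6 = 4096 poses.
axes = [grid_axis(-8.0, 8.0, 4)] * 3 + [grid_axis(0.0, 360.0, 4)] * 3
n_poses = math.prod(len(axis) for axis in axes)

# Toy score standing in for a packing score: closeness to a made-up pose.
target = (2.5, -5.0, 6.4, 50.0, 130.0, 300.0)


def toy_score(pose):
    return -sum((p - t) ** 2 for p, t in zip(pose, target))


print(n_poses, best_pose(axes, toy_score))
```

A uniform grid over three rotation angles does not sample orientations uniformly, so a production grid would likely be constructed differently.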

Computational Requirements

The program, which consists of roughly 124,000 lines of C++ code, was developed for a macOS (or Linux) workstation with at least 8 gigabytes of memory, and preferably more. The source code is compiled using gcc. The calculations are computationally intensive.

Download

The ereg package is distributed as open source under the AGPLv3 license.

The User Manual, included with the distribution, describes the installation.

A short description of the macOS-specific viewing app is also included with the distribution.

Download ereg_jun2024.tar.gz [72 MB]

This version (jun2024), distributed beginning in Q2 of 2024, constitutes the second public release of the software.

Support

We offer support and responsiveness to user input.