Journal Articles

Moe Forcefield Facilities



P. Labute
Chemical Computing Group Inc.
1010 Sherbrooke Street W, Suite 910; Montreal, Quebec; Canada H3A 2R7


August 3, 1997


Abstract. The forcefield system that is used in MOE is described. The available functional forms are presented as well as some of the techniques used to assign atom types.

INTRODUCTION

A potential energy model, equivalently, a forcefield attempts to describe the preferred conformations of a molecular system by assigning a potential energy value to each configuration. The most accurate models are quantum mechanical models. Unfortunately, the use of such models requires enormous computing resources even for moderately sized molecules. The potential energy services of MOE compromise quantum accuracy for speed so that large molecules can be studied effectively.

The MOE forcefield facilities implement Cartesian coordinate, empirical energy models that with support for 4D position coordinates. The term empirical refers to the fact that the parameters of a model are obtained from empirical data to which the model has been fitted. The position coordinates can be thought of as being specified in a 4D Cartesian space, where the fourth dimension, w, is orthogonal to the xyz dimensions. Four dimensional position coordinates are useful for allowing a system with poor geometry to relax using energy "tunneling" algorithms.

In this article, we present the functional forms supported as well as an overview of the parameter file format and the atom typing system.

FUNCTIONAL FORMS

The energy modeling services have been designed so that each of which models a particular interaction. In this section, the nature of the terms will be described. In some cases, there are alternatives for the functional forms that a particular set of parameters may employ. Each of the terms requires parameters that depend on the particular atoms involved in the interaction. Such parameter constants are located in the forcefield parameter file.

The potential energy is a sum of individual interaction energies

E = wb Eb + wa Ea + wn En + wt Et + wo Eo + we Ee + wv Ev + Ec

where the w values are weights (usually 1) that can be individually controlled. Each term in the sum will be described below. In the following description of the energy model, ri denotes the position of the ith atom, and rij = rj - ri.

Bonded Forces. Bond stretching energy, Eb, is modeled with a quartic form:

kij ( |rij| - Rij )2 + k'ij ( |rij| - Rij )3 + k''ij ( |rij| - Rij )4

where the sum extends over all bonds i-j; kij, k'ij and k''ij are the force constants and Rij is the equilibrium bond length. Bond angle energy, Ea (the energy parameterized by the angle between two bonds sharing an atom) is modeled as a quartic:

kijk A2ijk + k'ijk A3ijk + k''ijk A4ijk

where the sum extends over all bond angles i-j-k, kijk, k'ijk and k''ijk are the force constants, Tijk is the equilibrium bond angle (in radians), Aijk is either tijk - Tijk or cos tijk - cos Tijk depending on the functional form chosen, and tijk = rTji rjk / |rji| |rjk|. For linear molecules (e.g., sp hybridization), the form kijk (1 + cos tijk) is used in place of the quartic expression.

Stretch-bend cross terms, En, are modeled with

[ kijk (|rij| - Rij) + kkji (|rjk| - Rjk) ] Aijk

where the sum extends over all bond angles i-j-k, kijk and kkji are force constants and the other terms are as above. Stretch-bend terms are not used for linear or near-linear angles.

The torsion energy, Et, or energy due to the rotation about a bond, is modeled with a Fourier expansion:

kn;ijkl [1 +/- cos n Dijkl ]

where the sum extends over all values of n in {1,2,3,4,5} and all dihedrals i-j-k-l, kn are force constants, and Dijkl is the dihedral angle (radians) about the bond j-k.

Out of plane forces, Eo, can be modeled with one of two forms. The first uses the torsion functional form except that the sum extends over all atoms i with three neighbors j, k, and l. The second form uses ki;jkl X2i;jkl where ki;jkl is the force constant and Xi;jkl is the Wilson angle (the angle between the bond il and the plane ijk.

Nonbonded Forces. Nonbonded forces are defined to be those interactions which apply to all pairs of atoms that are not related by a bond (1-2 interaction) or an angle (1-3 interaction). Atoms related by a proper torsion (1-4 interaction) are handled in a special way.

Electrostatics, Ee, are modeled with a constant dielectric form

c qi qj s(|rij|) (rnij + d)-1

where c is a constant, n is either 1 or 2, qi is the partial charge on atom i and d is a buffering constant used to prevent division by 0. Both forms can be scaled by a programmable factor when the two atoms involved are related by a 1-4 interaction (i.e., a torsion). Electrostatic forces are ignored for 1-3 (angle) and 1-2 (bond) interactions. The function s is a smoothing function and will be described below.

Van der Waals forces, Ev, are modeled with

where the eij, Rij, mij and nij are empirically determined constants. The a and b constants prevent division by 0 and are called buffering constants. Although complicated, this form can reproduce the usual 12-6, 12-10 and 9-6 potentials by careful selection of the parameters. Like electrostatic forces, van der Waals forces are scaled when the two atoms are related by a 1-4 interaction (but ignored for 1-3 and 1-2 interactions).

The function s is a smoothing, or cutoff function that dampens the long-range non-bonded forces to 0 over a prescribed range. The smoothing function is parameterized by two values: the cutoff on distance ra, and the cutoff off distance rb. The function s(r) has the value 1 for r less than ra, 0 for r greater than rb and, for r in [ra,rb],

s(r) = p((r-rr) / (rb-ra)),     p(x) = 1 + x2 ( -6 x2 + 15 x - 10 )

To simulate larger systems of atoms, the periodic boundary condition can be imposed. Under periodic boundaries, the molecular system is assumed to be part of an infinite "liquid" created by replicating copies of the system in all directions. This replication is done virtually, so that no new atoms are actually created. The original atoms can interact with their images in the other cells, thus modeling the behavior of the atoms in a periodically structured system, such as a crystal, or in homogeneous mixtures, such as liquids.

Restraints. Restraints are extra terms added to the base potential function to impose geometric conditions. Whereas a constraint is a hard condition, a restraint is softer in that a weight is used to create high energy values when the constraint is violated. In MOE, the restraint energy is a sum of individual terms

Ec = Ecd + Eca + Ect + Ecx

where Ecd is the distance constraint violation energy, Eca is the angle constraint violation energy, Ect is the torsion constraint violation energy, and Ecx is the chiral constraint violation energy.

Distance restraints, Ecd, are modeled with

Cij ( |rij|2 - Rij2 )2

where the sum extends over all distance restraints, Cij is the weight, or force constant, and Rij is the desired value for the distance. Angle restraints, Eca, are modeled with

Cijk ( cos tijk - cos Tijk)2

where the sum extends over all angle restraints, Cijk is the weight, or force constant, and Tijk is the desired value for the angle. Torsion restraints are modeled with a single term from the Fourier expansion used to model proper torsion energy. Chiral constraints are imposed on those chiral atoms that have four neighbors and a specified R or S designation. MOE will assign a very large energy value to violations of chirality.

MOE can prevent the modification of the positions of fixed (position constrained) atoms. The potential energy model will return a gradient of zero for fixed atoms, and furthermore, the positions of fixed atoms will be treated as constants for the purposes of the energy calculation.

PARAMETERS

The parameters for the potential energy model are defined in a single ASCII parameter file which contains definitions for atom types, assignment rules, and individual parameters for the supported functional forms. By altering the parameters, different energy models can be used. The parameter file consists of a number of definition blocks beginning with a block that defines the atom types and basic properties. The remaining blocks contain forcefield term-specific parameters and definitions.

Central to any empirical potential energy model is the notion of atom type. An atom type is an equivalence class of atoms, for instance, those of a given geometry, or those bonded to certain elements. Each atom type has its own associated model parameters, which are obtained empirically. The first step in building a model is the definition of atom types.

Atom types are listed at the top of the parameter file. Each atom type declaration must be unique, and refers to a context-specific instantiation of an atom of a given element. Two or more atom types may refer to the same element. Each atom type definition consists of a single line containing the type symbol, the underlying element and a textual description of the type. Other lines contain information about the forcefield itself (e.g., the title) with possible references.

MOE uses an automatic method of assigning atom types to atoms during calculations. This automatic method is based on substructure searching and pattern matching to detect the context in which an atom type is to be assigned to a particular atom. This substructure search is controlled by the rules section of the parameter file.

The rules section is signalled by a line with [rules] in the first field. The content of the rules section is a small substructure searching language whose syntax will now be described. The rules are given as a sequence of lines each of which is a white-space separated sequence of fields. Each line describes a pattern and a type to assign if the pattern is matched:

type-name expression

The expression is a boolean substructure search expression (see below). The rules are ordered and later rules override earlier ones; therefore, generic rules should be placed first and special cases further down. An expression defines a boolean expression and has the following syntax:

expression : expression and expression
expression or expression
not expression
( expression )
property
match smiles-pattern

The and and or indicate logical conjunction and disjunction respectively, while the not character indicates logical negation. Parentheses are used to override the default precedence. The property expressions test candidate atoms for particular properties such as ionization, connectivity, ring membership and names.

The following is an example rules section for the types defined above:

[rules] # patterns that the atom type context

C    match '[CX4]'
Csp2 match '[CX3]=*' or match 'c'
HC   match '[#1][#6]'

SUMMARY

The MOE forcefield system is capable of describing a large family of potential energy models (e.g., Kollman's AMBER '89, '94 and the Merck MMFF94). The automatic atom typing system and flexible nonbonded parameter features allow efficient specification of novel or special atom types. Although the built-in functional forms are suitable for Class I and some Class II forcefields additional energy terms can be added through the use of the SVL programming language.