
An Efficient Algorithm For The Determination Of Topological RS Chirality



P. Labute
Chemical Computing Group Inc.
1010 Sherbrooke Street W, Suite 910; Montreal, Quebec; Canada H3A 2R7.


November 20, 1996


Abstract. In this work, we describe an efficient method to detect and label chiral centers as per the RS system.

INTRODUCTION



Determination of chiral centers and labeling by the RS system as proposed by Cahn, Ingold and Prelog (CIP) is important in computational chemistry for a number of reasons. Aside from the usual uses of nomenclature, specification of chiral constraints during energy minimization and detection of symmetries, proper labeling of chirality is extremely useful for 2D renderings of conformations. A number of algorithms have been proposed. Many are incorrect or do not correspond to the CIP system. Others are computationally expensive.

We present an efficient algorithm for the assignment of CIP priorities to every atom in a molecule. This information can then be used to assign R, S, cis or trans labels.

METHOD



The objective of the algorithm is to assign a priority, p(i), to every atom i in the molecule so that if p(i)<p(j) then atom i has strictly lower CIP priority than atom j, with equality occurring only when the atoms are (topologically) indistinguishable. In essence, the algorithm maintains an ordered list of equivalence classes of atoms. Each atom in an equivalence class is assigned a priority equal to the index of its class in the sorted list. The algorithm repeatedly splits classes (maintaining the ordering) until no changes occur. The final priority assignments are then output as the CIP priorities.

The algorithm proceeds as follows:

  1. For each atom i, set p(i) equal to the atomic number of atom i. Set the initial partition to be an ordered list of classes (C1,…,Ck) such that for each atom i in class Cr and each atom j in class Cs we have (i) p(i)<p(j) iff r<s; and (ii) p(i)=p(j) iff r=s.
  2. For each atom i set s(i)=abc…z to be an ordered list of neighboring p(j) numbers in decreasing order accounting for bond multiplicities with repeated values (i.e., if atom i is double bonded to an atom with priority p, then put p in the list twice).
  3. For each class Cr, partition the atoms in the class into ordered subclasses (S1,…,Sm) such that for each atom i in subclass Su and each atom j in subclass Sv we have (i) s(i)<s(j) iff u<v; and (ii) s(i)=s(j) iff u=v. (Note, the s(i) strings are compared lexicographically.)
  4. If every class was partitioned into only one subclass then terminate with p(i) as the priority of atom i.
  5. Form a new partition of all the atoms by concatenating all of the computed subclasses of all of the classes (in the same sequence as the original classes).
  6. For each class Cr and for each atom i in Cr set p(i) to r and go to Step 2 (the s(i) lists must be recomputed from the updated priorities).
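
The steps above can be sketched in Python as follows. This is a minimal illustration and not the author's implementation: the function name refine_priorities, the bond tuple layout (i, j, order), and the use of explicit hydrogens in the input graph are all assumptions made for the example. The sketch always returns class indices (1-based) rather than raw atomic numbers, which preserves the ordering described in Step 4.

```python
from typing import List, Tuple

# Hypothetical input format: (atom index, atom index, integer bond order)
Bond = Tuple[int, int, int]


def refine_priorities(atomic_numbers: List[int], bonds: List[Bond]) -> List[int]:
    """Return a priority per atom by iterative class splitting (sketch)."""
    n = len(atomic_numbers)
    neighbors: List[List[Tuple[int, int]]] = [[] for _ in range(n)]
    for i, j, order in bonds:
        neighbors[i].append((j, order))
        neighbors[j].append((i, order))

    # Step 1: the initial priority is the atomic number.
    p = list(atomic_numbers)

    while True:
        # Step 2: neighbor priorities in decreasing order, repeated
        # once per unit of bond order.
        s = []
        for i in range(n):
            values = []
            for j, order in neighbors[i]:
                values.extend([p[j]] * order)
            values.sort(reverse=True)
            s.append(tuple(values))

        # Steps 3 and 5: sorting the distinct (p, s) pairs keeps the old
        # class order and orders subclasses lexicographically by s.
        keys = sorted(set(zip(p, s)))
        rank = {key: r + 1 for r, key in enumerate(keys)}
        new_p = [rank[(p[i], s[i])] for i in range(n)]

        # Step 4: stop when no class was split.
        if len(keys) == len(set(p)):
            return new_p

        # Step 6: priorities become class indices; recompute s.
        p = new_p


# Ethanol with explicit hydrogens: C0(H3)-C1(H2)-O2-H8
ethanol_atoms = [6, 6, 8, 1, 1, 1, 1, 1, 1]
ethanol_bonds = [(0, 1, 1), (1, 2, 1), (0, 3, 1), (0, 4, 1), (0, 5, 1),
                 (1, 6, 1), (1, 7, 1), (2, 8, 1)]
ethanol_priorities = refine_priorities(ethanol_atoms, ethanol_bonds)
```

In this example the three methyl hydrogens end up in one class, the two methylene hydrogens in another, and the hydroxyl hydrogen in a third, while the two carbons are distinguished by their neighbor lists.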

The partitioning steps can be carried out by sorting. Since all other steps require linear time, each iteration of the algorithm requires O(n log n) time (assuming bounded degree for all n atoms). At most n iterations are required, giving a total running time of O(n^2 log n).

Once CIP priorities have been assigned to every atom in a molecule it is a simple matter to order the neighbors of each atom and compute the appropriate signed volume tests on the atomic coordinates to make topological R, S, cis, or trans assignments.
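
A signed volume test for a four-coordinate center might look like the sketch below. It covers only the R/S case (not cis/trans), and the function names, the argument layout, and the tolerance value are assumptions made for illustration. The sign convention used here is that a negative volume corresponds to a clockwise arrangement of the three highest-priority neighbors when viewed with the lowest-priority neighbor pointing away, which is labeled R.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def signed_volume(a: Vec3, b: Vec3, c: Vec3, d: Vec3) -> float:
    """Triple product (a-d) . ((b-d) x (c-d))."""
    ax, ay, az = (a[k] - d[k] for k in range(3))
    bx, by, bz = (b[k] - d[k] for k in range(3))
    cx, cy, cz = (c[k] - d[k] for k in range(3))
    return (ax * (by * cz - bz * cy)
            + ay * (bz * cx - bx * cz)
            + az * (bx * cy - by * cx))


def rs_label(priorities: Sequence[int], coords: Sequence[Vec3],
             tol: float = 1e-6) -> str:
    """Label a center from its four neighbors; '' if not a stereocenter."""
    if len(priorities) != 4 or len(set(priorities)) != 4:
        return ""  # two neighbors are indistinguishable
    # Order neighbors from highest to lowest priority.
    order = sorted(range(4), key=lambda k: priorities[k], reverse=True)
    a, b, c, d = (coords[k] for k in order)
    volume = signed_volume(a, b, c, d)
    if abs(volume) < tol:
        return ""  # (nearly) planar coordinates carry no handedness
    return "R" if volume < 0 else "S"


# Lowest-priority neighbor points along -z; the others run clockwise
# when viewed from +z.
example_coords = [(0.0, 1.0, 0.0), (0.866, -0.5, 0.0),
                  (-0.866, -0.5, 0.0), (0.0, 0.0, -1.0)]
example_label = rs_label([4, 3, 2, 1], example_coords)
```

Swapping the coordinates of any two neighbors flips the sign of the volume and therefore the label.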

Once the basic chiral assignments have been made, the classes can be further partitioned taking the initial assignment into account. This process will create new chiral centers based on the chirality of the branches.

DISCUSSION



The RS system uses bond orders, which is unfortunate from a computational standpoint since (i) a particular resonance structure must be chosen; and (ii) the "phantom" atoms, although easy to deal with, are an unnecessary complication.

As an alternative we propose the following system. The algorithm to compute assignments is the same as the one presented except for the initial assignment of priorities. With CIP priorities, the initial string is made up of the neighboring atomic numbers taking multiplicities into account. The new system need only replace the initial priority assignment of atomic number with a code taking into account further properties.

Initially each atom is assigned a code of the following form:

(a × 128 + s) × 8 + b

where a is the atomic number of the atom, s is the isotope number, and b is the hybridization of the atom, coded as sp=6, sp2=5, d3sp3=4, d2sp3=3, dsp3=2, sp3=1, with the ground state coded as 0.
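
As a small worked example, the code can be computed as in the sketch below, reading the expression as (a × 128 + s) × 8 + b. The function name, the dictionary keys, and the range checks are assumptions; in particular, the text does not define the isotope number precisely, so the sketch simply assumes 0 ≤ s < 128 so that the three fields do not overlap.

```python
# Hybridization codes as listed in the text; "ground" is a hypothetical
# key name for the ground state.
HYBRIDIZATION_CODE = {
    "sp": 6, "sp2": 5, "d3sp3": 4, "d2sp3": 3,
    "dsp3": 2, "sp3": 1, "ground": 0,
}


def initial_code(atomic_number: int, isotope: int, hybridization: str) -> int:
    """Pack atomic number, isotope number and hybridization into one integer."""
    if not 0 <= isotope < 128:
        raise ValueError("isotope number assumed to lie in [0, 128)")
    b = HYBRIDIZATION_CODE[hybridization]
    return (atomic_number * 128 + isotope) * 8 + b


carbon_sp3 = initial_code(6, 0, "sp3")   # (6*128 + 0)*8 + 1 = 6145
carbon_sp2 = initial_code(6, 0, "sp2")   # (6*128 + 0)*8 + 5 = 6149
```

Because the atomic number occupies the most significant field, any nitrogen code exceeds any carbon code, so the ordering stays close to the atomic-number ordering of the original CIP rules.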

This system has the advantage that the hybridization state is invariant under resonance, is easy to detect, assignable in the absence of hydrogens, and will still be close to the original CIP priorities.