Flexible Alignment Of Small Molecules
Abstract. A method is presented for flexibly aligning small molecules. The method accepts, as input, a collection of small molecules with 3D coordinates and computes a collection of alignments. Each alignment is given a score which quantifies the quality of the alignment both in terms of internal strain and overlap of molecular features. The results of several computational experiments on collections of compounds are reported suggesting the method's utility for the elucidation of pharmacophores, comparative field analysis and template forcing.
Given n molecules known to bind with a particular receptor it is often of interest to determine in what ways they are similar. The notion of "molecular similarity" is difficult to quantify. In many methods, a set of features of interest of a conformation in a molecule is compared to the feature set of another molecule in some quantitative manner. For example, matrices of inter-feature distances can be compared in some matrix metric. Another possibility is to combine comparison of feature type and feature distance into some ad hoc similarity measure. Such methods typically require empirical fine-tuning of various constants in order to assign relative importance to particular feature comparisons.
In this paper, we shall consider a general framework for feature comparison based on the notion of a feature probability density of a given conformation of a molecule. The advantage of probability densities is that they must integrate to unity and are, therefore, inherently balanced with respect to each other (i.e., probability is an absolute measure). As an example of the general method, we defined a quantitative measure of the quality of a 3D alignment (or configuration) of molecules inspired by the following qualitative notion of a good alignment:
An alignment is good if a) the strain energy of each molecule is small; b) they have a similar shape; c) their aromatic atoms overlap; d) their donors overlap and e) their acceptors overlap.
The derived quantitative assessment was used in a simultaneous conformational search procedure. Details of the methods used are presented in the section entitled Methods and the results of several computational experiments are presented in the Results and Discussions section.
MATERIALS AND METHODS
Similarity Measure. An isotropic (spherically symmetric) Gaussian, or Normal, probability density has the functional form
where s2 is the inverse variance along each axis and x0 is the center (and mean) of the density. An atom with nucleus located at x0 with van der Waals radius r is represented with
where a determines how broad the density will be. For example, if a = 2 then approximately 90% of the density will be contained within the van der Waals radius. In general, the value of a can be used to approximate molecular volumes; i.e., to fill in the gaps between atoms.1
Let x1,...,xn denote the 3D positions of n atoms of a molecule in a given conformation. Let the non-negative wi denote the degree to which atom i has some property P (e.g., "is aromatic" or "is donor"). We can then assign, to each point in space, x, the likelihood that x has the property with sum-of-Gaussians density
This density is called the P-density of the conformation. For example, if P is the property "is aromatic" so that wi is 1 when atom i is aromatic and 0 otherwise then the function f is the aromatic density of the conformation and represents the likelihood that a given point in space is (in some sense) aromatic.
Given two P-densities for two given molecules or conformations, the overlap of the two densities is, itself, a sum-of-Gaussians density in the interatomic distances:
Using these notions, we can define a feature similarity measure, F, to be
where each term is a sum of overlaps of P-densities between all pairs of molecules under consideration for the properties "aromatic", "donor", "acceptor" and "volume".
Feature Assignment. The P-density similarity measure requires the assignment of atomic "features". In this study, the following features are used:
A pharmacophore-based scheme was used to assign atom types to each atom. The atom type set consisted of "Donor", "Acceptor", "Polar", "Acidic", "Basic", "Hydrophobe", and "Other" and a molecular pattern matching algorithm was used to make the assignments and the rule set was based on the PATTY rules.2
Search Procedure. With a quantitative measure of goodness for an alignment, we can "sample" an alignment using a RIPS3 style procedure:
The termination criteria can be interpreted as follows. Upon termination there have been k consecutive attempts to generate a new configuration and each has failed. This is similar to tossing a coin k times and observing "heads" on each toss. The Bayes estimator for the bias on the coin is (k+1)/(k+2); thus the probability of observing a "tail" (or a new configuration) is 1/(k+2). At k = 20 this probability is less than 5%.
Upon termination of the search the collection of configurations is pruned by removing all configurations in which the average potential energy is greater than the minimum observed average potential energy plus some predefined threshold (in this study, 7 kcal/mol).
Materials and Software. The foregoing techniques were implemented in the SVL programming language of Chemical Computing Group Inc.'s Molecular Operating Environment4 (MOE) version 1998.10. All calculations were performed on a Silicon Graphics Octane with a 250 MHz MIPS chip running IRIX 6.2. Nonlinear optimization was carried out with the MOE Truncated Newton optimizer preceded by two steps of Steepest Descent5 and terminated when the RMS gradient fell below 0.001. The MOE implementation of MMFF946 was used to measure the internal strain of each molecule. Chirality was preserved using signed volume restraints on all chiral centers. Pharmacophore atom type assignment was performed using the MOE implementation of the Daylight SMARTS pattern matching language. The determination of the RMSD between structures used the MOE Superpose functionality which calculates an optimal global superposition by minimizing a weighted least squares error function.
RESULTS AND DISCUSSION
In this section, we report the results of several computation experiments. In each of the experiments the following algorithm parameters were used:
Many of the alignment experiments are the same as those presented in Jones et al.7 for purposes of comparison.
DHF and MTX. Kearsley et al. use the alignment of DHF and an MTX analog as an example of the SEAL8 method. These structures are shown in Figure 1. These structures are very similar; however, the obvious alignment is not the correct alignment from the point of view of pharmacophore overlap. Furthermore, the tautomers of DHF place a certain burden on the pharmacophore typing system. We applied the present method to these structures with an iteration limit of 30 and 30 distinct configurations were output. The 6 top scoring configurations were very similar both in terms of relative geometry, potential energy and similarity scores (less than 1% relative difference). These minor differences in scoring were well within the differences expected due to the potential energy model. The obvious (pure steric) alignment was the seventh top scoring alignment. The top scoring alignment, presented in Figure 1, also had the smallest potential energy and each structure was within 1 kcal/mol of the nearest local minimum. The lower fused ring system is correctly aligned with respect to pharmacophore functionality. The alignment also suggests a common nitrogen acceptor in the atoms between the rings as well as a common NHCO group.
Three benzodiazepine receptor binders. Codding and Muir studied10 three compounds that bind to the benzodiazepine receptor. These compounds were also analyzed with the Jones et al. method and are thus included in the present. The structures are shown in Figure 2. Two of the structures promote convulsions on binding while Ro15-1788 is an antagonist that does not promote convulsions. Codding and Muir's work predicted a binding site that recognized an aromatic ring, a donor oxygen, a hydrophobic side chain and a nitrogen donor group. It was hypothesized that the absence of the donor nitrogen is required for antagonism (e.g., in Ro15-1788). This case is interesting in that one of the structures is significantly different and doesn't seem to admit a clear-cut superposition of features. We applied our method to these structures with an iteration limit of 100 and produced 95 distinct alignments. The top scoring configurations differed in average potential energy by about 1.3% and in similarity score by 0.5% while the worst alignment had an average 5 kcal/mol more strain energy and a 30% worse similarity score.
Figure 2. Three benzodiazepine receptor binders (Ro15-1788 does not promote convulsions while the others do).
Figure 3 shows two representatives of the top scoring alignments; each has several common features as well as some unique features.
Figure 3. The top scoring alignment (top) and the next least strained alignment (bottom). Ro15-1788 is colored blue while the other molecules are colored yellow.
Referring to specific points of interest (indicated by letters in the figure), we note the following:
On the whole the lower alignment seems best in spite of the fact that it scored the third best in similarity score (but second best in potential energy). The chosen two alignments were within 0.06 kcal/mol of each other and were within 0.1% of each other in similarity score. These differences are well within what is expected from conformational potential differences.
Leu-enkephalin and a hybrid morphine. The second alignment problem presented in Jones et al. concerned the hybrid morphine molecule EH-NAL and Leu-enkephalin described by Kolb11 and shown in Figure 4. The bound conformations of these molecules are not known. The two structures are very different with the added problem of the flexibility of Leu-enkephalin. We applied our method with an iteration limit of 200 and calculated 189 distinct configurations. The five best scoring configurations exhibit features similar to those described by Kolb in that the rings marked (a) and (b) are brought together and the protonated nitrogens are superposed in all but one of the alignments. The five best scoring alignments differed by less than 0.6% in similarity score; however, the third best is, perhaps, the most interesting. It was the most strained of the configurations (~7 kcal/mol average) and were it not for this strain it would have scored the best, by far, in terms of similarity score alone.
Figure 4 shows this third best configuration with points of interest marked with the letters A through F:
Figure 4. Top: Hybrid morphine EH-NAL and Leu-enkephalin described by Kolb (1991). Rings marked (a) and (b) form the basis of a good alignment. Bottom: Superpositions of EH-NAL (yellow) and Leu-enkephalin (blue).
Four dopamine receptor binders. Hoberg and Norinder12 described four molecules shown in Figure 5: dopamine, the dopamine receptor agonist, and three anti-psychotic dopamine D2 receptor antagonists. Jones et al. report a pharmacophore of an aromatic ring, a protonated nitrogen and an oxygen donor. We applied our method to the four molecules using an iteration limit of 200. Chirality was preserved around carbon centers but not the nitrogen centers on the grounds that such centers are easily inverted due to proton exchange with the solvent. A total of 161 distinct alignments were produced.
Figure 5. Dopamine and three dopamine receptor antagonists.
The top scoring alignment is shown in Figure 6. Points of interest are marked with the letters A through D:
Figure 6. Top scoring alignment of four dopamine receptor binders: haloperidol (red), amisulpride (yellow), raclopride (green) and dopamine (purple).
This best scoring configuration had 1.2 kcal/mol of average strain per molecule. Jones et al. reported that their best scoring alignment occurs when the protonated nitrogens overlap. We did not find a configuration in which all four protonated nitrogens overlapped; however, there were a few in which three of the four overlapped. Jones et al. reported difficulty in locating this alignment and perhaps our limit of 200 attempts was too strict.
Two FKB12 ligands. The Jones et al. paper describes a particularly interesting alignment example involving two very different structures with known experimental binding conformations. The two structures, shown in Figure 7 are FK506, a microbial product that blocks T-cell activation and proliferation, and "Compound 9" from a study by Holt et al.13
Figure 7. Structures of two binders with experimentally determined bound conformations to the human immunophilin FKBP12.
Both of the structures bind well to the human immunophilin FKBP12. The coordinates of the bound complex of FKBP12 and FK506 have been deposited in the Brookhaven Protein Databank14 (PDB) with the code 2FKE. The ligand of the complex was used to set chiral constraints on all carbon chiral centers. Our algorithm was applied to the structures with an iteration limit of 200 and 200 distinct alignments were produced as output. Inspection of the configurations revealed that conformations of the large ring in FK506 were being well sampled (one of the more interesting aspects of this example since torsional procedures require special logic to sample the conformational space of rings).
Figure 8. Best scoring alignment of FK506 (blue) and Compound 9 (yellow).
The top scoring alignment had both the least strain energy and the best alignment score and is shown in Figure 8 with points of interest marked with the letters A through F:
These results are in substantial agreement with the results of Jones et al. A comparison of the conformation of FK506 in the best scoring alignment and the conformation of the bound complex in the PDB is shown in Figure 9. Aside from the carbon six ring (not directly involved in binding) there is excellent agreement between the conformation calculated with our method and the experimentally determined coordinates. This result is particularly encouraging since it appears that the alignment of two binding ligands, even if very different, can be enough to direct a conformational search towards the bound conformation without explicit receptor information.
Figure 9. Comparison of the calculated bound conformation of FK506 (blue) with the experimentally determined bound complex coordinates (yellow) in the PDB (code 2FKE). The carbon six rings at position A are outside the binding pocket.
A method for the flexible alignment of molecules was presented along with the results of several experiments which suggest the method's efficacy. The method has several important applications:
The method has several advantages: a) there are no limits on the number of ligands that can be aligned; b) both conformational and alignment searches are conducted simultaneously; c) the competing goals of feature superposition and potential energy strain are balanced in a natural way; and d) a collection of alignments is output for comparison.
The alignment methodology may be licensed from Chemical Computing Group Inc. for use in the Molecular Operating Environment software.
1Schaefer,M., Karplus,M., A Comprehensive Analytical Treatment of Continuum Electrostatics, J. Phys. Chem., 100, p1578-1599 (1996).
2Bush, B.L., Sheridan, R.P., PATTY: A Programmable Atom Typer and Language for Automatic Classification of Atoms in Molecular Databases, J. Chem. Info. Comp. Sci., 33, pp756-762 (1993).
3Ferguson, D.M., Raber, D.J., A New Approach to Probing Conformational Space with Molecular Mechanics: Random Incremental Pulse Search, J. Am. Chem. Soc., 111, p4371-4378 (1989).
4MOE software available from Chemical Computing Group Inc., Montreal, Canada. Consult http:www.chemcomp.com for more information on MOE (1997, 1998).
5Gill, P., Murray, W., Wright, M.H., Practical Optimization, Academic Press, New York, (1981).
6Halgren, T.A., The Merck Molecular Force Field, J. Comp. Chem., Vol. 17, Nos. 5&6, p490 (1996).
7Jones, G., Willet, P., Glen, R.C., A Genetic Algorithm for Flexible Molecular Overlay and Pharmacophore Elucidation, J. Computer Aided Molecular Design, 9, Vol 32 (1995).
8Kearsley, S., An Alternative Method for the Alignment of Molecular Structures: Maximizing Electrostatic and Steric Overlap, Tetrahedron Computer Methodology, Vol. 3, No. 6C, pp615-633 (1990).
9Bradbury, R.H., Allott, C.P., Dennis, M., Fisher, F., Major, J.S., Masek, B.B., Oldham, A.A., Pearce, R.J., Rankine, N., Revill, J.M., Roberse, D.A., Russell, S.T., J. Med. Chem., 35, p4227 (1992).
10Codding, P.W., Muir, A.K.S., Mol. Pharmacol., 28, 178 (1985).
11Kolb, V.M., Prog. Drug Res., 36, p49 (1991).
12Hoberg, F., Norinder, U. in Krogsgaard-Larsen, P., Bundgaard, H. (Eds.), A Textbook of Drug Design and Development, Harwood Academic Publishers, Reading, p55-91 (1993).
13Holt, D.A., Luengo, J.L., Yamashita, D.S., Oh, H., Konialian, A.L., Yen, H., Rozamus, L.W., Brandt, M., Bossard, M.J., Levy, M.A., Eggleston, D.S., Liang, J., Schultz, L.W., Stout, T.J. Clardy, J., J. Am. Chem. Soc., 115, p9925 (1993).
14Bernstein, F.C., Koetzle, T.F., Williams, G.J.B., Meyer, F., Bryce, M.D., Rodgers, J.R., Kennard, O., Shimanouchi, T., Tasumi, M., J. Mol. Biol., 112, p535 (1977).