Journal Articles

Molecular Orbitals in MOE

Dr. Alex M. Clark
Chemical Computing Group Inc.

Introduction  |  Methodology  |  Results  |  Summary  |  References


This article describes the implementation of and uses for new functionality added to MOE, derived from [MOPAC] calculation output. The ability to compute and view molecular orbitals has been available in MOE since the 2004.03 release. A summary of the theoretical and methodology background is provided, followed by a number of case examples.


A Molecular Orbital (MO) is a composite of weighted Atomic Orbitals (AO's) which collectively define the shape and spatial density of the electrons in a molecular species [Szabo 1989]. In most quantum chemistry calculations, the starting point is a collection of atoms which each possess one or more atomic orbital functions:

Atomic Orbital Functions

Figure 1: Atomic orbitals

In the simplistic model, hydrogen is generally described as having just one atomic orbital of interest, 1s. Second row atoms include the spherical inner 1s shell, and four atomic orbitals for the valence shell: 2s and 2px, 2py, 2pz. Third row atoms add d-orbitals into the mix, and so on.

The actual atomic orbitals which are used depends on the basis set and varies somewhat; for instance semi-empirical methods such as those used by MOPAC, which is the main topic of this article, only include parameterized atomic orbitals for the valence shell (e.g. carbon has just 2s, 2px, 2py, 2pz). High level basis sets used with ab initio methods often add atomic orbitals which are not highly populated (e.g. 2s and 2p orbitals on hydrogen, d-orbitals on carbon, etc.)

The Linear Combination of Atomic Orbitals (LCAO) method treats the overall wavefunction as a series of weightings of atomic orbitals. The methods used to compute the wavefunction are beyond the scope of this article, but the immediate output from all of the quantum chemistry calculations based on the Hartree-Fock/Self Consistent Field method consist of two matrices: eigenvectors and eigenvalues.

The eigenvectors are expressed as a square matrix where the dimensionality is equal to the total number of atomic orbitals which were fed into the calculation at the beginning. The rows correspond to atomic orbitals and the columns to molecular orbitals:

1 2 3 .. N
1 c11 c12 c13 .. c1N
2 c21 c22 c23 .. c2N
A.O. 3 c31 c32 c33 .. c3N
.. .. .. .. .. ..
N cN1 cN2 cN3 .. cNN

The eigenvectors are sometimes referred to as the wavefunction. If one were to take column #1, it would be appropriate to say that the first molecular orbital is:

Equation 1

where psi is the spatial form of the molecular orbital, and phi is a function that describes the particular atomic orbital. Hence the total density of the orbital at a point in space is the sum of those of the constituent atomic orbitals at that point, multiplied by the weighting coefficient taken from the eigenvector matrix.

The eigenvalues are represented as a diagonal matrix, where each element on the diagonal line (eii) is the energy of the corresponding orbital (column) i in the eigenvector matrix.

1 2 3 .. N
1 e11
2 e22
3 e33
.. ..

The columns of the eigenvectors table are always ordered by the corresponding eigenvalues, so that MO#1 is the lowest in energy, increasing therefrom. Chemistry texts often plot the energy levels and electron occupancies of the molecular orbitals; the former is the value of eii.

The ordering by energy makes it straightforward to populate the electron occupancy of the molecular orbitals: if there are M electrons in a closed-shell system, then the first M/2 molecular orbitals are occupied, with two electrons each. The last of these to be filled is referred to as the Highest Occupied Molecular Orbital (HOMO). The molecular orbitals which are not filled are sometimes referred to as "virtual orbitals". The first of these is referred to as the Lowest Unoccupied Molecular Orbital (LUMO), and is the same as HOMO+1.

Basis Functions

Each of the atomic orbitals is represented by a mathematical expression which describes its intensity at all points in 3-dimensions. For purposes of graphically plotting the shapes of the molecular orbitals, these functions must be known, in addition to the wavefunction. There are several related ways to represent atomic orbitals, all of which are approximations to the orbitals shown in Figure 1.

The most conceptually simple manner in which the individual atomic orbitals are reasonably described is in terms of Slater functions, which have the general formula:

Equation 2

where a, b, c and zeta are parameters for the orbital, and N is a derived constant. The Cartesian terms arise from a, b and c when they are nonzero, which are the angular momentum numbers for the particular axis/axes. For the spherically symmetric s-orbitals these are 0, which makes the equation somewhat simpler, Equation 5. The Cartesian variables (x, y, z) are the displacements from the center of the atomic orbital (the center being the position of the atom) and r is the magnitude of distance from the center. p-orbitals are described where just one of a, b or c is equal to 1, depending on whether it is px, py or pz.

Ab initio methods generally cannot use Slater functions directly, because it is necessary to evaluate a vast number of 3D spatial integrals during the calculation. In order to make integral calculation feasible, a series of contracted Gaussian-type functions (where the exponential term is e-kr2) are used instead. Semi-empirical methods exist mainly for the purpose of circumventing the need to evaluate these integrals and rely instead of parameterization. For determining the molecular orbitals of results from ab initio packages, it is more appropriate to use the Gaussian approximations, which may be obtained from the chosen basis set. Because this article is concerned with the results from the semi-empirical program MOPAC, it is appropriate to consider the atomic orbitals as Slater functions. For each orbital only one scaling constant is required, zeta.


There are many properties of molecular species that can be computed if the wavefunction is known (e.g. electron density, atomic partial charges, dipole moment, oscillator strengths, etc.). This article is concerned specifically with the graphical evaluation of the frontier molecular orbitals - the HOMO and LUMO - which provide qualitative information to the trained chemist regarding the highest energy electron orbital, and the lowest energy available un-filled orbital.

A number of reactions can be rationalized by examining the distribution of the frontier molecular orbitals, both with regard to mechanism (such as Diels-Alder reactions and sigmatropic rearrangements), and preferential reaction sites when examination of steric constraints does not provide the answer. Analysis of the molecular orbitals also reveals the extent of conjugation, and elucidates the nature of extended pi-systems in particular. Electron-rich and electron-poor species tend to reveal the localization or delocalization of the partial or full charge by the shape of the HOMO or LUMO.


In the MOE 2004.03 release, all of the wavefunction information is derived from results extracted from MOPAC. This popular semi-empirical computational package can be readily coaxed into yielding the following information, upon completing a geometry optimization:

  • Eigenvectors (molecular orbitals in terms of atomic orbital coefficients)
  • Eigenvalues (molecular orbital energy levels)
  • Number of filled molecular orbitals (½ # electrons)
  • Atomic orbital information (corresponding atom center, symmetry symbol, zeta value)
  • Atom coordinates

The above information is sufficient to compute the intensity of any orbital at any region in space. For the purpose of creating a visual representation, a grid-based sampling is taken. The intervals and extents are arbitrary; a granularity of 1 Å in all directions leads to a visual representation which is hardly ideal, but still is sufficient for discerning general shape. Granularities of ½ Å and ¼ Å produce incrementally better views, but at a cost: 8 and 64 times as many sampling points, respectively. The extents can comfortably be selected at 3 Å further than the farthest atoms (+ and -) on each axis.

Granularity vs Quality

Figure 2: Granularity vs Quality

Note that the orbital is numerically represented as positive (blue) and negative (red) intensity values. This distinction has only relative meaning; the two signs are out of phase within the context of the molecular orbital, but in comparing two different molecular orbitals, there is no absolute significance - red and blue can be switched, as long as this is done within the entire context. The same applies to the atomic orbitals from which they are assembled.

For each of the sampling points, the following parameters are known: position in space (x, y, z) and molecular orbital (j). The intensity of the orbital at the position is therefore:

Equation 3

where N is the number of atomic orbitals, i is the iterator over the atomic orbitals, cij is the coefficient obtained from the eigenvectors matrix (row = atomic orbital, column = molecular orbital), and phi(i) is the basis function for atomic orbital i.

As previously mentioned, the basis functions for the atomic orbitals used here are the Slater functions, which are generalized for most of the orbitals as:

Equation 2

Where N is given as Equation 8, n is the principal quantum number (1, 2, ...), k is a constant depending on the general symmetry type: 1 for s, sqrt(3) for p, sqrt(15) for d.

The values used for x, y and z are normalized to refer to position relative to the atom upon which the basis function is centered. r is calculated as the magnitude of the (x,y,z) vector.

For MOPAC, the value of zeta is the same for any atom type/symmetry block (s,p,d); zeta is parameterized because MOPAC is a semi-empirical method, and the parameterization is furthermore specific to the choice of Hamiltonian (e.g. AM1, PM3, MNDO, etc.), so care must be taken to ensure that the values of zeta used correspond to the method.

s-orbitals are spherically symmetric, as their cartesian components are zero (a = b = c = 0), therefore the equation becomes:

Equation 5

p-orbitals have just one of the Cartesian modifiers defined (maximum angular momentum is 1, which is assigned to a, b or c):

Equation 6

d-orbitals have a total angular momentum of 2, which is expressed in the following ways:

Equation 7


Once the grid points have been sampled for each of the molecular orbitals of interest, the last calculation to be done is the formation of a viewable surface. The grid is logically divided into two parts, the positive and negative values of the orbital intensity being treated separately before eventually being recombined.

Grid Points

Figure 3: Grid points and surface cutoff

The two surfaces are formed by taking a cutoff point, which is some fraction of the maximum extent; interpolated grid points with a value equal to this cutoff are considered to be the surface (points with a greater value are within the surface, lesser are without). The choice of cutoff is generally arbitrary; if it were too small the orbitals would engulf the molecule and little information could be discerned; if it were too high, then important features of the electron distribution (such as continuity over multiple atoms) might not be seen.



The pi-orbitals of benzene which make up the aromatic core of the molecule is a popular textbook demonstration, as shown in the diagram below. The 6 orbitals which make up the aromatic system are shown ranked in order of their respective energies. The lower three orbitals are occupied, the higher three are vacant.

Aromatic Orbitals of Benzene

The lowest energy orbital attributed entirely to the aromatic ring current has just one node, and has the same overall symmetry as a single atomic p-orbital.

The two HOMO orbitals are degenerate, and are complementary. It can be seen that they have the same symmetry, and operate over the same region of space; the distributions are respectively out of phase by 90° about the z-axis. These orbitals have two node-planes, which is a qualitative rationalization for their higher energy relative to the lower orbital of p-type symmetry.

The two LUMO orbitals are also degenerate and complementary, and feature a larger number of nodes than the HOMO orbitals. The highest energy orbital making up the aromatic system has more nodes still.

Conjugated Oligomers

Molecular orbital representations are particularly good for showing the nature of conjugated systems. For instance in this oligomer of acetylene (dec-1,3,5,7,9-pentaene):

Acetylene Oligomer

It can clearly be seen that the HOMO molecular orbital is an even, alternating-phase contribution from each of the pi double bonds. While this is suggestive of strong conjugation, examination of the LUMO supports this idea further. The LUMO is suggestive of a diradical state - that is, the species which would be formed if one were to alternate all of the C=C double bonds to their neighboring C-C single bonds, and place an unpaired electron at either end to make up the difference.

It might be expected that if the polyacetylene chain were extended further, the energy gap between the HOMO and LUMO would converge sufficiently that the "diradical" state would have significant thermal population at room temperature - and so the material would become a conductor.

Consider the more functionally rich 2,4-thiophene oligomer:

Thiophene Oligomer

As with the alkene example, the HOMO distribution suggests that the double bonds are conjugated. The distribution of the LUMO has a similar implication, that the excited diradical state, if it were readily accessible, would provide a clear path for electrons to flow. [MacDiarmid 2001]


Buckminsterfullerene molecular orbitals are interesting because of their very high symmetry. [Newton 1986] The following image shows the 5 degenerate HOMO's of C60:


Although it is not immediately obvious from the diagram, all five of these orbital distributions are complementary and have the same symmetry. Visual analysis of the molecular orbitals allows concepts of structural symmetry to be extended to frontier electron symmetry.


Perhaps one of the best illustrations of a reaction in organic chemistry which is controlled by the interaction of frontier molecular orbitals is the Diels-Alder [Diels 1928]. The reaction can be written in the classical manner using 3 arrows to show the movement of bonds from the starting materials to the product, but this figurative interpretation does not explain the stereospecificity of the reaction, nor does it provide much insight as to why the reaction proceeds in a concerted manner.

The interacting orbitals are from the electron-rich HOMO of the diene, which is spread out over the conjugated double bonds, and the electron-vacant LUMO of the dienophile, as shown below:


The stereospecificity of the reaction results from the orientation of the reactants, one of which approaches either above or below the other (but not twisted). The above/below and below/above approaches are equivalent in this simple example, but with less symmetrical molecules, there are two distinct possibilities.

From a molecular orbital point of view, either of these two reaction orientations is favorable: the shape and symmetry of the diene HOMO and the dienophile LUMO are ideally matched for maximum in-phase overlap.


In some cases it is actually possible to predict reactivity by examining the molecular orbitals, rather than just rationalizing existing results. In this case, the reaction of 1,4-hexadiene with bromine is considered:


The example has been chosen because there are two alkene functional groups, one of them terminal, the other not. The alkenes are separated by a methylene group and are hence not in a pi-conjugated system.

The reaction of this substrate with exactly one equivalent of molecular bromine, or some other small electrophile, presents a choice of two double bonds to attack. Steric factors are unlikely to be significant for this case, and the predominant choice of reacting bond can be expected to be electronic.

The diagram above shows both the HOMO and HOMO-1; two occupied frontier orbitals are necessary in this case because the alkene functional groups are not conjugated, and there is to a large extent a localization of the pi-type wavefunction on individual double bonds (which is the opposite of what was observed for the model conjugated conductors described earlier).

As it happens, the HOMO-1, which is lower in energy than the HOMO, features a large contribution from the terminal alkene pi-system; the HOMO itself features the pi-system of the non-terminal alkene functional group.

A strongly electrophilic reagent such as bromine reacts readily with unsaturated electron rich pi-bonds, and given a choice, it should react preferentially with the highest energy such bond, which is the non-terminal alkene. This prediction is consistent with general observations regarding kinetics of alkene halogenation. [Nelson 2000]


Molecular orbitals, when viewed in a qualitative graphical representation, can provide insight into the nature of reactivity, and some of the structural and physical properties of molecules. Well known concepts such as conjugation, aromaticity and lone pairs are well illustrated by molecular orbitals.


[MOPAC] The Molecular Orbital PACkage version 7 is generally available as public domain source, and compiled binaries are provided with the Molecular Operating Environment, for all supported platforms. Principle author J.J.P. Stewart.
[Szabo 1989] Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory, A. Szabo, N.S. Ostlund, McGraw Hill (1989).
[MacDiarmid  2001] A.G. MacDiarmid, Rev. Mod. Phys. 73, 701 (2001).
[Newton 1986] M.D. Newton, R.E. Stanton, J. Am. Chem. Soc. 108, 2469 (1986); H.P. Lothi, F.J. Alml, J. Chem. Phys. Lett. 135, 357 (1987); M. Ozaki, Takahashi, A. Chem. Phys. Lett. 127, 242 (1986).
[Diels 1928] O. Diels and K. Alder, Ann. 460, 98 (1928); 470, 62 (1929) Ber. 62, 2081 (1929).
[Nelson 2000] D. Nelson and T. Perng, The Use of Relative Magnitudes of Steric Effects to Explore Reactions of Molecular Halogens With Representative Acyclic Alkenes, Proceedings of the Oklahoma Academy of Science, 80, 141 (2000).