Solving the Schrödinger Equation
Has Everything Been Tried?
Editor
Paul Popelier
Distributed by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
For photocopying of material in this volume, please pay a copying fee through the Copyright
Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to
photocopy is not required from the publisher.
ISBN-13 978-1-84816-724-7
ISBN-10 1-84816-724-5
Printed in Singapore.
To D.P.B.
July 20, 2011 9:6 9in x 6in b1189-fm Solving the Schrodinger Equation
Contents
Preface xv
4.4.2 Applications 83
4.5 Looking Ahead 85
Bibliography 87
Index 343
Preface
initio programs, the best we can achieve? Or is there a new and powerful idea
lurking at the surface of Knowledge Space, which leads to a better method,
more accurate and faster, and independent of (chemical) experiment? Is
this new idea based on the combination of two or more existing ideas? This
book puts these difficult and ambitious questions to its contributing authors
and to the reader.
This book invited its authors to elucidate the non-standard method that
they specialise in, explain its strengths and weaknesses, and then speculate
about what is needed to widen the application radius of the method. Actually
achieving this may take years and involve several people. This book hopes to
inspire readers and researchers by putting non-standard approaches together
in one place. I believe that this has never been done. The format and style
in which the chapters are written should make it possible to read the whole
book through. It should be emphasised that this text was not designed as a
review. Instead, it is meant to be a collection of personal accounts capturing
the aspiration, and perhaps the frustration, of experts in non-standard methods.
So, what can we learn from Schrödinger's aforementioned quote, other
than that it is good to catch up with a sufficient amount of mathematics (or
spend Christmas holidays frolicking with an old flame in a mountain resort,
where he discovered his equation)? One lesson is to trust the potential of an
idea, often based on an analogy or a vivid picture. In fact, in Schrödinger's
case this was the symbolic proportion:
Ordinary mechanics : Wave mechanics
= Geometrical optics : Undulatory optics.
His derivation of Eq. (13), developed in the pages leading up to the
excerpt above, is based on this analogy. Schrödinger could describe what
his new wave mechanics would look like based on this intuitive analogy.
As he worked out the maths behind this intuitive development he panicked
for a moment, due to his lack of mathematical knowledge.1 Fortunately, he
ended up with an equation that worked. Moreover, Schrödinger presented
quantum mechanics with a completely new formalism, dual to the older
matrix mechanics, which Heisenberg had proposed. The latter, and other
members of the Copenhagen clan, did not like wave mechanics much:
it was too intuitive and not as elegant and deep as matrix mechanics.
Matrices were of course wonderfully abstract mathematical entities to the
theoretical physicists of that generation. However, this abstraction did not
endow matrix mechanics with any authority over wave mechanics unless
1 Quantum Mechanics textbooks typically gloss over this concern. Is there something deeper in the
reassurance of V(x, y, z) acting as a boundary condition?
by the Arabs since ancient Greek times. The Ptolemaic theory needed
77 circles to describe the motion of the sun, moon, and the five planets
then known. Kepler broke with the tradition of 2,000 years, that circles
must be used to describe heavenly motions. He showed that a single ellipse
would do. An ellipse is not as symmetrical as a circle, and therefore not as
heavenly. A circle can be seen as an ellipse in which the two foci have
collapsed to one (and hence the two radii as well). Actually, in a deeper
way, an ellipse is more heavenly than a circle because it captures Nature in
a minimal model. In such a model there is no need for corrections within
corrections. All falls in place by letting go of the constraint that a planet
must move in a circular orbit. I now wonder where our circles are in
quantum chemistry. Which constraints are we holding on to?
This book invites its authors and its readers to abandon the usual lines
of thought and the presumptions that we perhaps do not realise we are making.
The most powerful theories are minimal, not simple. Simple means that
we impose an unwarranted constraint onto what we are trying to explain.
Minimal means that we discovered the most essential, but unconstrained
concept that governs the observed data of interest. This economy of prin-
ciple or assumption always pays off, but obtaining a minimal theory requires
much imagination and audacity.
Returning to quantum chemistry, one may have the impression that
the only truly predictive computational schemes are built on brute force
foundations. The core idea behind configuration interaction is brute force
in nature. The explosion in computational work it leads to warrants clever
but inevitably approximate computational schemes. This is only vaguely
reminiscent of the combinatorial explosion encountered in the calculation
of the determinant of a large matrix from its definition as an alternating sum of
permuted terms. That idea leads to intractable calculations for a matrix as
small as 30 × 30, for example. Yet calculating such a determinant
is perfectly feasible, in very reasonable time, with LU decomposition. This
is where the power of the idea makes apparently impossible tasks possible
after all. Then we can ask again: why can we not think of a method to solve
the Schrödinger equation of a protein beyond Coupled-Cluster Singles and
Doubles (Triples)/Complete Basis Set (CCSD(T)/CBS) quality in a few
seconds?
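The determinant contrast can be made concrete. The sketch below (an illustration, not from the text) computes one determinant twice: from the alternating-sum definition, which touches n! terms, and by Gaussian elimination (the elimination phase of an LU factorization), which needs only O(n³) operations:

```python
from itertools import permutations

def det_cofactor(a):
    """Determinant from the alternating-sum definition: n! permuted terms."""
    n = len(a)
    total = 0.0
    for perm in permutations(range(n)):
        # Sign of the permutation from its inversion count.
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        term = 1.0
        for row, col in enumerate(perm):
            term *= a[row][col]
        total += (-1) ** inversions * term
    return total

def det_lu(a):
    """Determinant by Gaussian elimination with partial pivoting: O(n^3)."""
    m = [row[:] for row in a]  # work on a copy
    n, sign = len(m), 1
    det = 1.0
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(m[r][k]))
        if abs(m[pivot][k]) < 1e-14:
            return 0.0
        if pivot != k:
            m[k], m[pivot] = m[pivot], m[k]
            sign = -sign
        det *= m[k][k]
        for r in range(k + 1, n):
            f = m[r][k] / m[k][k]
            for c in range(k, n):
                m[r][c] -= f * m[k][c]
    return sign * det

a = [[2.0, 1.0, 0.0, 1.0],
     [1.0, 3.0, 1.0, 0.0],
     [0.0, 1.0, 4.0, 1.0],
     [1.0, 0.0, 1.0, 5.0]]
print(det_lu(a), det_cofactor(a))  # identical values; only det_lu scales
```

For n = 30 the alternating sum has about 2.7 × 10³² terms, while elimination needs only a few tens of thousands of floating-point operations.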
Regarding the content of this book, one can see that there are eleven
chapters, covering ten ideas (or methods) not prevalent in current main-
stream quantum chemistry. Unfortunately, some methods are not included,
due to a lack of available authors. Otherwise, there would have been
extra chapters on Bohmian mechanics, the series solution method, the
Paul Popelier
Manchester, 11 September 2010
Chapter 1
Density functional theory (DFT) has become the most popular by far
of the panoply of methods in quantum chemistry and the reason for
this is simple. Where other schemes had become bogged down in mind-
numbingly expensive and detailed treatments of the electron correlation
problem, DFT simply shrugged, pointed at the Hohenberg-Kohn theorem,
and asserted that the correlation energy can be written as an integral of a
certain function of the one-electron density. The only thing that irritated
the wavefunction people more than the cavalier arrogance of that assertion
was the astonishing accuracy of the energies that it yields.
Well, most of the time. Occasionally, DFT fails miserably and,
although the reasons for its lapses are now understood rather well, it
remains a major challenge to correct these fundamental deficiencies, while
retaining the winsome one-electron foundation upon which DFT rests.
Does this mean that, for truly foolproof results, we have no option but
to return to the bog of many-body theory? One might think so, at least
from a cursory inspection of the current textbooks. But we feel differently,
and in this chapter we present an overview of an attractive alternative that
lies neither in the one-electron world of DFT, nor in the many-electron
world of coupled-cluster theory. Our approach nestles in the two-electron
Fertile Crescent that bridges these extremes, a largely unexplored land
that would undoubtedly have been Goldilocks' choice.
We present results that demonstrate that the new approach, Intracule
Functional Theory, is capable of predicting the correlation energies of
small molecules with an accuracy that rivals that of much more expensive
post-Hartree-Fock schemes. We also show that it easily and naturally
models van der Waals dispersion energies. However, we also show that
its current versions struggle to capture static correlation energies and that
this is an important area for future development.
Finally, we peer into the probable future of the field, speculating on the
directions in which we and others are likely to take it. We conclude that,
although the approach is conceptually attractive and has shown considerable
promise, the investigations hitherto have scarcely scratched the surface
and there are ample opportunities for fresh ideas from creative minds.
1.1. Introduction
In the late 1920s, Hartree [1] was among the first to realize that the newly
derived Schrödinger equation [2] describing quantum electronic motion
could be solved for multi-particle systems if the wavefunction, a complicated
multidimensional object that explicitly couples the motion of all
particles in the system, is approximated by a product

Ψ(r_1, r_2, …, r_n) = ψ_1(r_1) ψ_2(r_2) ⋯ ψ_n(r_n)   (1.1)
of single-particle functions (spin-orbitals). Physically, the Hartree wavefunction
implies that each electron moves independently in the electrostatic
field created by all of the others. Shortly thereafter, both Slater [3]
and Fock [4] pointed out that Hartree's wavefunction lacks the antisymmetry
required by the Pauli Principle [5], but that this can be rectified by
adopting the determinant form
                        | ψ_1(r_1)  ψ_2(r_1)  ⋯  ψ_n(r_1) |
Ψ(r_1, r_2, …, r_n) =   | ψ_1(r_2)  ψ_2(r_2)  ⋯  ψ_n(r_2) |   (1.2)
                        |    ⋮          ⋮       ⋱      ⋮   |
                        | ψ_1(r_n)  ψ_2(r_n)  ⋯  ψ_n(r_n) |
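A small numerical experiment shows what the determinant form buys. The sketch below (illustrative only; the three one-dimensional "spin-orbitals" are arbitrary Gaussian-damped polynomials, not functions from the chapter) checks that swapping two electron coordinates flips the sign of the wavefunction, and that two electrons at the same point give zero, as the Pauli Principle requires:

```python
from itertools import permutations
from math import exp, factorial, sqrt

# Three toy one-dimensional "spin-orbitals" (arbitrary illustrative choices):
orbitals = [
    lambda x: exp(-x * x / 2),
    lambda x: x * exp(-x * x / 2),
    lambda x: (x * x - 1.0) * exp(-x * x / 2),
]

def slater(coords):
    """Evaluate the determinant wavefunction of Eq. (1.2), times 1/sqrt(n!)."""
    n = len(coords)
    value = 0.0
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        term = 1.0
        for electron, orbital in enumerate(perm):
            term *= orbitals[orbital](coords[electron])
        value += (-1) ** inversions * term
    return value / sqrt(factorial(n))

r = [0.3, -0.7, 1.1]
swapped = [r[1], r[0], r[2]]  # exchange electrons 1 and 2
print(abs(slater(r) + slater(swapped)) < 1e-9)  # sign flips under exchange
print(abs(slater([0.5, 0.5, 1.0])) < 1e-9)      # vanishes at coalescence
```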
as the electron correlation problem and has been the focus of ongoing
research efforts for almost a century. Currently, methods for recovering Ec
fall into two broad classes.
Wavefunction-based methods are based upon the mathematical obser-
vation that an improved wavefunction can be constructed from the occupied
and unoccupied orbitals that arise from solving the HF equations. These
methods are guaranteed eventually to converge to the exact result, but
their convergence is hampered because they are effectively approximating
cusps in the true wavefunction by sums of smooth functions. In practice,
wavefunction-based post-HF methods are typically limited in applicability
to systems containing a few dozen non-hydrogen atoms.
Density-based methods are a popular low-cost alternative. They are
based upon the Hohenberg-Kohn theorem [6], which states that the energy
of the ground state of a system is a universal functional of its electron density
ρ(r). Unfortunately, the theorem gives little insight into the construction
of the functional and, despite the efforts of many researchers over many
years, its form remains unknown. Many approximate functionals have been
devised, each with its own strengths and weaknesses, but none yet has
proven accurate for all types of chemical problems. The major systematic
weaknesses [7] of density functional theory (DFT) stem from its inability
to deal with intrinsically two-electron phenomena such as bond cleavage
and static correlation.
Comparing these two alternatives, wavefunction-based and density-based
models, reveals a vast and largely unexplored intermediate ground
between the complexity of wavefunction schemes (which depend explicitly
on the coordinates of every electron) and the simplicity of density schemes
(which depend only on the one-electron density). The most obvious entry
point, and this is our present strategy, is to develop approaches that
incorporate two-electron information but retain the computational advantages
enjoyed by DFT. We will use atomic units throughout.
1.2. Intracules
which gives the joint probability of finding one electron at r1 and another at
r2 . How might one extract the correlation energy from this six-dimensional
object? Intuitively, one may expect the statistical correlation between the
motions of two electrons to depend strongly on their separation and this
leads naturally to the position intracule [8]
P(u) = ∫∫ ρ_2(r_1, r_2) δ(r_12 − u) dr_1 dr_2,   (1.4)
[Plot of the position intracule P(u) against u]
and one finds from Eq. (1.7) that the momentum intracule

M(v) = ∫∫ π_2(p_1, p_2) δ(p_12 − v) dp_1 dp_2
     = (2π)^{−3} ∫∫ exp[−(p_1² + p_2²)/2] δ(p_12 − v) dp_1 dp_2
     = (2π)^{−3} v² ∫∫ exp[−(p_1² + |p_1 − v|²)/2] dp_1 dΩ_v,   by writing p_2 = p_1 − v
     = (4π)^{−3/2} v² ∫ exp(−v²/4) dΩ_v,   by integrating over p_1
     = (4π)^{−3/2} 4πv² exp(−v²/4),
[Plot of the momentum intracule M(v) against v]
electrons and therefore reduce the extent of their correlation. This information
is captured by the momentum intracule [15, 16]

M(v) = ∫∫ π_2(p_1, p_2) δ(p_12 − v) dp_1 dp_2,   (1.7)
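The closed form M(v) = (4π)^{−3/2} 4πv² exp(−v²/4) is easy to check by simulation. In the sketch below (an illustration, assuming the same Gaussian momentum density used in the derivation), p_1 and p_2 are sampled from independent standard normal distributions and the histogram of v = |p_1 − p_2| is compared with the formula:

```python
import random
from math import exp, pi, sqrt

random.seed(1)

def m_exact(v):
    """M(v) = (4*pi)**(-3/2) * 4*pi * v**2 * exp(-v**2/4)."""
    return (4 * pi) ** -1.5 * 4 * pi * v * v * exp(-v * v / 4)

# Sample v = |p1 - p2| with p1, p2 drawn from the Gaussian momentum density.
samples = []
for _ in range(200_000):
    d = [random.gauss(0, 1) - random.gauss(0, 1) for _ in range(3)]
    samples.append(sqrt(d[0] ** 2 + d[1] ** 2 + d[2] ** 2))

# Compare the histogram density with M(v) in a few bins of width 0.5.
width = 0.5
for lo in (1.0, 2.0, 3.0):
    frac = sum(lo <= v < lo + width for v in samples) / len(samples)
    mid = lo + width / 2
    print(f"v ~ {mid}: sampled {frac / width:.3f}, M(v) = {m_exact(mid):.3f}")
```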
[Diagrams of the relative geometries θ_12 = 0, θ_12 = π/2 and θ_12 = π]
where the Γ_abcd are two-particle density matrix (2PDM) elements. Thus,
from Eq. (1.12), the Omega intracule is

Ω(u, v, ω) = Σ_abcd Γ_abcd [abcd]_Ω,   (1.14)
Here K_0 and K_1 are modified Bessel functions of the second kind [25].
Each of the three one-dimensional intracules, A(s), D(x) and Θ(θ), whose
graphs are shown below, is independent of the exponent ζ; that is, they
are invariant with respect to dilation. As such, they apply not only to the
helium atom but, equally, to any helium-like ion. This will be important in
Section 1.3.
The attentive reader may wonder why, if u and v are statistically independent
in this system, the angle intracule Θ(θ) is not constant. After all, if the relative
positions and momenta of the two electrons are independent, one might have
expected the angle between r_12 and p_12 to be equally likely to take any value
between 0 and π. The fact that this is not the case is a purely geometrical
(Jacobian) effect: as r_12 and p_12 range independently over their respective
domains, dynamical angles θ_12 close to π/2 arise far more often than angles
close to 0 or π. The fact that there are many more points on the Earth's surface
with latitudes near 0° (equatorial regions) than with latitudes near 90° (polar
regions) arises from the same geometrical effect.
[Plots of A(s), D(x) and Θ(θ)]
Fig. 1.5. Action, dot and angle intracules for a He-like ion.
Γ_abcd = C_a1 C_b1 C_c1 C_d1 + 4C_a1 C_b1 C_c2 C_d2 + C_a2 C_b2 C_c2 C_d2 − 2C_a1 C_b2 C_c2 C_d1,
where C_ak denotes the ath element of the C_k array. The Wigner intracule is
then assembled through

W(u, v) = Σ_{a=1}^{6} Σ_{b=1}^{6} Σ_{c=1}^{6} Σ_{d=1}^{6} Γ_abcd [abcd]_W
[Contour plot of the Wigner intracule W(u, v) over the (u, v) plane]
Ec from Ω(u, v, ω) than from ρ(r). We call this idea Intracule
Functional Theory (IFT).
Although one can imagine many ways to extract Ec from Ω(u, v, ω),
one of the simplest is to contract the intracule with an appropriate kernel,
writing

Ec = ∫_0^∞ ∫_0^∞ ∫_0^π Ω(u, v, ω) G(u, v, ω) dω dv du.   (1.24)
In such a formulation, the correlation kernel G(u, v, ω) acts as a weighting
function, assigning high priority to regions of intracule space where the
electrons are strongly correlated, and low priority to regions where corre-
lation is weak. The thought experiment summarized in the diagram below
helps to guide our thinking about this. In situations where both u and v
are small, the electrons are close together and moving relatively slowly
and so we anticipate a large correlation contribution. Conversely, corre-
lation effects should be small when the electrons are far apart and moving
quickly. In intermediate cases, where one of u and v is large and the other is
small, we expect moderate correlation effects. This picture fits nicely with
the conclusion in the preceding section that correlation in the He-like ions
depends in some way on the product r12 p12 .
If the wavefunction is expanded in a Gaussian basis, then combining
Eqs. (1.14) and (1.24) yields

Ec = Σ_abcd Γ_abcd [abcd]_G   (1.25)
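Computationally, Eq. (1.25) is a single four-index contraction of the 2PDM with a set of kernel integrals. The sketch below illustrates only the structure, with small random placeholder arrays (the values of Γ and [abcd]_G here are hypothetical; in a real calculation both would be computed, not sampled):

```python
import random

random.seed(0)
n = 4  # number of basis functions (illustrative)

# Hypothetical 2PDM elements Gamma_abcd and kernel integrals [abcd]_G.
gamma = {(a, b, c, d): random.uniform(-1.0, 1.0)
         for a in range(n) for b in range(n)
         for c in range(n) for d in range(n)}
kernel = {key: random.uniform(-1.0, 1.0) for key in gamma}

# Eq. (1.25): contract the two-particle density matrix with the kernel integrals.
e_c = sum(gamma[key] * kernel[key] for key in gamma)
print(e_c)
```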
[Diagram: expected correlation strength in the four quadrants of the (u, v) plane, from small u and small v to large u and large v]
Fig. 1.8. Comparison of correlation energies from the G2 kernel (left) and G3
kernel (right) with exact correlation energies.
(dashed grey) match Edyn with near-mEh accuracy. Continuing in this vein,
we can abandon the HF/6-311G two-particle density matrix in favour of
the CASSCF(val)/6-311G one and, by re-fitting the G3 kernel again, we
obtain the parameters c = 0.102, 0 = 1.02, 0 = 0.43. The resulting
energies match Edyn with sub-mEh accuracy at all bond lengths. This suggests
that combining an IFT-based treatment of dynamic correlation with a
full-valence multireference method will produce a method that is capable
of estimating Ec very accurately.
[Plot of correlation energies e (mEh) against bond length R (Å)]
At large bond lengths (R > 5 Å), the UHF energy of H2 rapidly
approaches the energy of two non-interacting H atoms and fails to capture
the long-range dynamic correlation that is responsible for the weak van der
Waals attraction. This long-range correlation energy can be rationalized by
considering a multipole expansion of the Coulomb operator, as pioneered
by London in the early 1930s [35, 36].
Can we use IFT to model dispersion? To answer this, we begin by considering
the simple system, two Coulomb-coupled harmonic oscillators,
that London used to model dispersion effects. He showed that its dispersion
energy is asymptotically Ec ≈ −3/(32ω³R⁶), and his derivation is outlined
in the box above. Therefore, we must devise kernels that recover this asymptotic
dispersion energy from this system's intracules. Because we favour
kernels that depend on x = r_12 · p_12, we confine our attention to the D(x)
intracule and seek kernels that satisfy

Ec ≈ ∫ D(x) G(x) dx.   (1.34)
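Once a kernel G(x) is chosen, Eq. (1.34) is a one-dimensional quadrature. The sketch below illustrates this with toy Gaussian stand-ins for D(x) and G(x) (hypothetical choices, for which the integral has the closed form 1/√2, used only to check the quadrature):

```python
from math import exp, pi, sqrt

# Toy stand-ins: D(x) a normalized Gaussian, G(x) another Gaussian, so the
# exact value of the integral is 1/sqrt(2).
def d_toy(x):
    return exp(-x * x) / sqrt(pi)

def g_toy(x):
    return exp(-x * x)

# Trapezoidal estimate of the integral of D(x) G(x) over the real line.
a, b, n = -8.0, 8.0, 4000
h = (b - a) / n
e_c = 0.5 * (d_toy(a) * g_toy(a) + d_toy(b) * g_toy(b))
e_c += sum(d_toy(a + i * h) * g_toy(a + i * h) for i in range(1, n))
e_c *= h

print(abs(e_c - 1.0 / sqrt(2.0)) < 1e-10)
```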
         Exact C6 coefficients            IFT estimates
      H      He     Li     Be          H      He     Li     Be
H     6.5                              12.2
He    2.8    1.5                       5.2    3.3
Li    66.5   22.5   1395               74.2   17.2   1534
Be    34.8   13.2   478    213         76.6   23.0   847    657
C6 coefficients range over three orders of magnitude, the discovery that the
IFT estimates are usually accurate to within a factor of two is a promising
start. Once again, this demonstrates the fundamental suitability of IFT for
capturing intrinsically two-electron correlation effects.
In the teething stages of the development of DFT, much progress was made
through a primarily empirical approach. Indeed, between Slater's introduction
of Xα theory [38] in 1951 and the publication of the Hohenberg-Kohn
theorem [6] 13 years later, it was not even realized that DFT was a
theoretically justifiable theory: rather, it was embraced simply because it
was a model that worked, surprisingly often.
In some ways, contemporary IFT has evolved similarly, and now stands
at a similar point. It is clearly capable of yielding chemically useful quantitative
predictions but, for the moment, it lacks the solid foundation of a
Hohenberg-Kohn analogue. This deficiency may deter the purist, but the
pragmatist finds it difficult to resist the allure of a model that seeks to
rationalize the correlation phenomenon through a simple, quasi-classical
two-electron picture.
So, what are the likely directions for the development of IFT in the near
future?
As functional manufacture has become an industry within DFT, we
foresee the construction of new and improved kernels as one of the most
obvious threads of future IFT research. To ensure that this progress is
rational, we expect that properties of the exact kernel will also be derived
and that these will be used as guides.
However, we also foresee the real possibility that the kernel ansatz (1.24)
may be rendered obsolete by the discovery of alternative methods for extracting
Ec from the Omega intracule. Perhaps such methods will be found as by-products
of the construction of a rigorous proof of the central IFT conjecture (1.23).
Of course, it is also possible that the Omega intracule family tree does not
contain the ultimate intracule and that, in the future, it will be replaced
by a different, and quantum-mechanically rigorous, family. We are optimistic
about this because it has been shown recently that the Dot intracule
D(x) is actually a first-order (in ℏ) approximation to the true density of the
x variable. Furthermore, the exact density X(x) has also been discovered
[39–41] and it is no more difficult to extract from the wavefunction than
is D(x).
Finally, we conclude with a statement that is surpassingly obvious and
yet often overlooked. If we are to refine and enrich our understanding of the
electron correlation phenomenon, we must continue to unearth and analyze
simple systems where the phenomenon is most clearly exposed and most
readily comprehended. The helium atom, the hydrogen molecule and the
uniform electron gas have all proven to be rich veins in the past but our
quest for deeper understanding must be an ongoing one and there is no
doubt whatever that there is much to be learned from other prototypical
systems [42].
Bibliography
Chapter 2
Frederick R. Manby
2.1. Introduction
1 It should be pointed out that for crystalline solids, or indeed for any periodic system, plane-waves can
be used, and then the treatment of two-electron terms is even more straightforward. Moreover, for pure
density functional theory, it is practical to use Slater-type orbitals, since in a density-fitted approach to
the Coulomb problem the need for multicentred Coulomb integrals can be avoided [4].
Here, the operators â_i and â†_a are the annihilation operator for a particle in
spin-orbital i and the creation operator for spin-orbital a, respectively. Thus
2.2.1. MP2-F12
In MP2-F12 theory the basic idea is to supplement the product-of-virtuals
expansion in Eq. (2.3) with explicitly correlated terms formed as a product
of occupied orbitals and a correlation factor, f_12, which depends explicitly
on the distance between two electrons, r_12:

|u_ij⟩ = (1/2) Σ_ab T^ij_ab |ab⟩ + (1/2) Σ_kl T^ij_kl Q_12 f_12 |kl⟩   (2.4)

       = (1/2) Σ_ab T^ij_ab |ab⟩ + (1/2) Σ_kl Σ_αβ T^ij_kl F^kl_αβ |αβ⟩,   (2.7)
This form is convenient for deriving methods, but hardly physically transparent,
so it is worth noting that the operator produces excitations of two
electrons into states formed from the product of two occupied orbitals (k
and l) multiplied by the correlation factor f_12 and projected to give a configuration
strongly orthogonal to the occupied space.
In the final working equations it is of course essential to remove any
explicit reference to the infinite basis α, β, and this can be done by strategic
replacements of the kind Σ_α |α⟩⟨α| = 1 − Σ_i |i⟩⟨i|. This process introduces
many-electron integrals which are very expensive to evaluate. Kutzelnigg
suggested removing all such many-electron integrals by a strategy equivalent
to replacing these exact resolutions of the identity by approximate
ones, initially in the MO basis set:

Σ_α |α⟩⟨α| ≈ Σ_a |a⟩⟨a|.
A very important refinement was made by Klopper and Samson [29], who
introduced the idea of performing the approximate resolution of the identity
in a separate, auxiliary basis set. Among many other technical developments
(reviewed elsewhere [16, 17, 30]) which followed, one stands out as particularly
significant: Ten-no discovered that a short-range correlation factor,
in particular the exponential f_12 = exp(−γr_12), led to a huge improvement
in accuracy compared to f_12 = r_12 [31], and now this form is used by all
leading groups. The general theory in its modern form is clearly presented
in [30].
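The appeal of the short-range factor can be seen numerically: near coalescence a Slater-type factor is linear in r_12 with unit slope, just like f_12 = r_12, but it saturates at long range instead of growing without bound. A small check, written here in a shifted but equivalent convention f(r) = (1 − e^{−γr})/γ (an assumption for illustration; conventions differ between groups):

```python
from math import exp

def f_slater(r, gamma):
    """Slater-type correlation factor, in a shifted convention with f(0) = 0."""
    return (1.0 - exp(-gamma * r)) / gamma

h = 1e-6
for gamma in (0.5, 1.0, 2.0):
    slope_at_zero = f_slater(h, gamma) / h  # ~1 for every gamma, like f = r12
    long_range = f_slater(50.0, gamma)      # saturates at 1/gamma
    print(f"gamma = {gamma}: slope {slope_at_zero:.4f}, long-range {long_range:.4f}")
```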
The first-order energy vanishes, but at second order there are terms like

⟨1|H_1|0⟩ = ⟨1s_A(1) 1s_B(2) f(r_12)|H_1|1s_A(1) 1s_B(2)⟩
          = ∫∫ |1s_A(1)|² f(r_12) H_1 |1s_B(2)|² dr_1 dr_2.   (2.12)

Given that R is so large, we can approximate r_12 ≈ R and we obtain

⟨1|H_1|0⟩ ≈ f(R) μ_A · T(R) · μ_B,

which is clearly zero since the dipole moments μ_A = μ_B = 0. Even in the
case of interaction of non-spherical fragments, for which these dipoles need
not vanish, there is clearly no transferable way to ensure that the correlation
factor f reproduces the correct long-range behaviour; one would need to
know the answer in advance. Similarly, the other second-order term in the
energy, ⟨1|H_0|1⟩, is proportional to f(R)², so again there appears no way
that a reasonable description of long-range dispersion could emerge.
In conclusion, it seems that orbital relaxation effects are essential for
an effective description of dynamic correlation, but including this effect is
more expensive than retaining the conventional virtual products. And long-range
dispersion seems impossible in any ansatz where correlation only
arises from products of occupied orbitals and a function of the interelectronic
distance. Perhaps progress can be made by considering more flexible
correlation factors or pair-function ansätze, such as those considered in the
next section.
expanded in terms of the average and relative coordinates of the two electrons,
R̄ = (r_1 + r_2)/2 and r̄ = r_2 − r_1 [53]. At the time we were reluctant
to use the RI approximation for the many-electron integrals but, in the
absence of a practical alternative, the method remained applicable only to
two-electron systems.
Now we realise that the main problem with MP2-R12 theory was the
choice f_12 = r_12, not the decision to approximate the integrals using
RIs [54], so it seems a good time to revisit this type of theory. A flexible
ansatz can be made in the (R̄, r̄) coordinates, in the form

|u_ij⟩ = Σ_klPp t^ij_klPp Q_12 Φ_P(R̄) φ_p(r̄) |kl⟩.   (2.13)
Here the Φ_P(R̄) are atom-centred basis functions, whose number would
therefore scale linearly with system size. The functions in r̄, on the other
hand, only have to span a length-scale characteristic of electron correlation,
and the number of φ_p(r̄) needed should not scale with system size. The
derivation of the theory (for example MP2 or coupled-cluster theory) would
be the same as for any model with more than one correlation factor [55]; the
challenges are, first, scaling and, second, integrals. The number of parameters
is of the order o⁴Mm, if there are M functions in R̄ and m functions
in r̄. Since M scales linearly with system size, it can be seen that there are
O(N⁵) parameters as a function of system size.
The resolution of the identity ensures that only two-electron integrals
will appear in the working equations; but there will be several new types
of integrals, such as ⟨χ(1)χ(2)|Φ_P(R̄) φ_p(r̄)|χ(1)χ(2)⟩. Some effort would be
involved in making these available, but in the end we know that if all of the
functions are Gaussians, efficient computation is possible.
The scaling problem above could be circumvented if, instead of
expanding the correlation factor in (R̄, r̄), one directly expanded the first-order
wavefunction:

|u_ij⟩ = Σ_Pp t^ij_Pp Q_12 |Φ_P(R̄) φ_p(r̄)⟩.   (2.14)
the maximum angular momentum of the occupied orbitals, 3l_occ. For light
elements in the s- and p-block there is no problem, because this implies only
up to f-functions in the fitting set; but for transition metals, the occupied
d-orbitals lead to the need for i-functions in the RI basis; and, worse still,
for lanthanides and actinides, the auxiliary basis requires functions with
l = 9.
An alternative strategy is to avoid the RI altogether and directly apply
density fitting to the three-electron integrals. In conventional density fitting,
orbital products φ_i(r) φ_j(r) ≡ |ij) are approximately expanded in a basis
set of auxiliary functions |A) (typically taken as atom-centered Gaussians).
The approximation to the orbital product then has the form

|ij) ≈ |ĩj) = Σ_A D^ij_A |A),   (2.20)

where the fitting coefficients D^ij_A are found by minimizing the Coulomb
energy of the fitting residual (ij − ĩj | ij − ĩj)/2.
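A one-dimensional sketch makes the fitting step of Eq. (2.20) concrete. For simplicity it minimizes the plain least-squares norm of the residual on a grid rather than the Coulomb energy of the residual; that substitution, and all of the Gaussian parameters, are illustrative assumptions:

```python
from math import exp

grid = [i * 0.01 - 5.0 for i in range(1001)]

def gauss(x, alpha, center):
    return exp(-alpha * (x - center) ** 2)

# "Orbital product" |ij): the product of two Gaussians on different centres.
target = [gauss(x, 1.0, 0.0) * gauss(x, 1.0, 0.6) for x in grid]

# Auxiliary functions |A): Gaussians at the two centres and the midpoint.
aux = [[gauss(x, 2.0, c) for x in grid] for c in (0.0, 0.3, 0.6)]

# Normal equations S D = t for the fitting coefficients D_A.
m = len(aux)
S = [[sum(p[k] * q[k] for k in range(len(grid))) for q in aux] for p in aux]
t = [sum(p[k] * target[k] for k in range(len(grid))) for p in aux]

# Solve the small symmetric positive-definite system by Gaussian elimination.
for col in range(m):
    for r in range(col + 1, m):
        f = S[r][col] / S[col][col]
        t[r] -= f * t[col]
        for c in range(col, m):
            S[r][c] -= f * S[col][c]
D = [0.0] * m
for r in range(m - 1, -1, -1):
    D[r] = (t[r] - sum(S[r][c] * D[c] for c in range(r + 1, m))) / S[r][r]

fit = [sum(D[A] * aux[A][k] for A in range(m)) for k in range(len(grid))]
residual = sum((target[k] - fit[k]) ** 2 for k in range(len(grid)))
norm = sum(v * v for v in target)
print(residual / norm < 1e-6)  # the three-function expansion reproduces |ij)
```

In this toy case the product of the two target Gaussians happens to lie exactly in the span of the auxiliary set, so the residual is essentially zero; in general it is small but finite.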
Writing our typical three-electron integral in a Mulliken-like notation,

⟨ijm| f_12 r_23^{−1} |mlk⟩ ≡ (im|f_12|jl|r_23^{−1}|mk),   (2.21)

the density-fitted approximation to it is

(im|f_12|jl|r_23^{−1}|mk) ≈ (ĩm|f_12|jl|r_23^{−1}|mk) + (im|f_12|j̃l|r_23^{−1}|mk)
    + (im|f_12|jl|r_23^{−1}|m̃k) − 2(ĩm|f_12|j̃l|r_23^{−1}|m̃k),   (2.23)
⋯ + (A|f_12|jl|r_23^{−1}|C) D^im_A D^mk_C
  + (A|f_12|B|r_23^{−1}|mk) D^im_A D^jl_B
  − 2(A|f_12|B|r_23^{−1}|C) D^im_A D^jl_B D^mk_C.   (2.24)
This expression involves only three- and four-index integrals and intermediates,
and the various contractions between integrals and coefficients scale
only as O(N⁵). The conventional RI approach has an O(N⁶) cost. Using
RI, as noted above, functions with up to 3l_occ are needed in the auxiliary
basis set, but using DF, only functions up to 2l_occ are required. Therefore,
direct density fitting of three-electron integrals would appear to be a viable
alternative to the resolution of the identity.
where we have defined overlap integrals S^αβγ_3 = ⟨αβγ|3⟩. Thus, for
purposes of formal derivation, a triples cluster operator can be written
in the form

T_3 = Σ_ijk3 T^ijk_3 Σ_αβγ S^αβγ_3 â†_α â†_β â†_γ â_k â_j â_i
    = Σ_ijk Σ_αβγ T^ijk_αβγ â†_α â†_β â†_γ â_k â_j â_i.   (2.26)
Because of this formal use of an infinite virtual space, the basic equa-
tions are exactly as in conventional coupled cluster theory with a complete
treatment of triples [60]; but here the challenge is to take these equations
and convert them into a computable form by eliminating all references to
the infinite virtual space, resorting to resolution-of-the-identity approxima-
tions as necessary. This is definitely a technical challenge, which cannot be
addressed without a very considerable investment of effort.
One can speculate about useful possible forms for |3⟩. The scattering
picture of triple excitations (see for example [61]) suggests a formalism
of the structure f_12 f_23 |lmn⟩ or f_12 |lma⟩ for the triples function. The fully
connected version f_12 f_13 f_23 |lmn⟩ would presumably lead to very difficult
integrals; and the most general type of expression f_123 |lmn⟩ would lead to
many-electron integrals of a kind that could not be resolved by conventional
RIs. But given that the Hamiltonian contains only two-particle interactions,
and based on the success of Kohn's work, there are grounds to be optimistic
that a form such as f_12 f_23 |lmn⟩ might be accurate enough. If this could be
used in place of, rather than in addition to, the conventional expansion,
very significant savings could be made, and explicitly correlated triples
models with of the order of o⁶ amplitudes, instead of o³v³, at least offer an
interesting prospect.
2.4. Conclusions
Acknowledgments
Bibliography
[1] B. Klahn and W.A. Bingel, Theor. Chim. Acta 44, 9 (1977).
[2] B. Klahn and W.A. Bingel, Theor. Chim. Acta 44, 27 (1977).
[3] P.-O. Löwdin, Phys. Rev. 97, 1474 (1955).
[4] E.J. Baerends, D.E. Ellis, and P. Ros, Chem. Phys. 2, 41 (1973).
[5] T.H. Dunning, Jr., J. Chem. Phys. 90, 1007 (1989).
[6] T. Helgaker, W. Klopper, H. Koch, and J. Noga, J. Chem. Phys. 106, 9639 (1997).
[7] R.G. Parr and W. Yang, Density-functional theory of atoms and molecules (Oxford
University Press, New York, 1994).
[8] W. Koch and M.C. Holthausen, A Chemist's Guide to Density Functional Theory
(Wiley-VCH, New York, 2000).
[9] F. Furche, J. Chem. Phys. 129, 114105 (2008).
[10] J. Harl and G. Kresse, Phys. Rev. Lett. 103, 1 (2009).
[11] B.G. Janesko, T.M. Henderson, and G.E. Scuseria, J. Chem. Phys. 130, 081105
(2009).
[12] R.J. Bartlett, Chem. Phys. Lett. 484, 1 (2009).
[13] E.A. Hylleraas, Z. Phys. 54, 347 (1929).
[14] H. Hettema, Quantum Chemistry: Classic Scientific Papers, volume 8 of 20th Century
Chemistry (World Scientific, Singapore, 2000), for an English translation of [13].
[15] T. Helgaker and W. Klopper, Theor. Chim. Acta 103, 180 (1999), for a modern
perspective on [13].
[16] W. Klopper, F.R. Manby, S. Ten-no, and E.F. Valeev, Int. Rev. Phys. Chem. 25, 427
(2006).
[17] T. Helgaker, W. Klopper, and D.P. Tew, Mol. Phys. 106, 2107 (2008).
[18] D. Tew, C. Hättig, R. Bachorz, and W. Klopper, in Recent Progress in Coupled Cluster
Methods, edited by P. Čársky, J. Paldus, and J. Pittner (Springer, Dordrecht, 2010),
pp. 535–572.
[19] H.-J. Werner, T.B. Adler, G. Knizia, and F.R. Manby, in Recent Progress in Coupled
Cluster Methods, edited by P. Čársky, J. Paldus, and J. Pittner (Springer, Dordrecht,
2010), pp. 573–620.
[20] F. Jensen, Introduction to Computational Chemistry (John Wiley & Sons, Chichester,
second edition, 2007).
[21] A. Szabo and N.S. Ostlund, Modern Quantum Chemistry (McGraw-Hill, New York,
1982).
[22] T. Helgaker, P. Jørgensen, and J. Olsen, Molecular Electronic Structure Theory (John
Wiley & Sons, Chichester, 2000).
[23] W. Klopper and W. Kutzelnigg, Chem. Phys. Lett. 134, 17 (1987).
[24] W. Klopper and W. Kutzelnigg, J. Phys. Chem. 94, 5625 (1990).
[25] W. Kutzelnigg and W. Klopper, J. Chem. Phys. 94, 1985 (1991).
[26] V. Termath, W. Klopper, and W. Kutzelnigg, J. Chem. Phys. 94, 2002 (1991).
[27] W. Klopper and W. Kutzelnigg, J. Chem. Phys. 94, 2020 (1991).
[28] W. Klopper, Chem. Phys. Lett. 186, 583 (1991).
[29] W. Klopper and C.C.M. Samson, J. Chem. Phys. 116, 6397 (2002).
[30] H.-J. Werner, T.B. Adler, and F.R. Manby, J. Chem. Phys. 126, 164102 (2007).
[31] S. Ten-no, J. Chem. Phys. 126, 014108 (2007).
[32] J. Noga, W. Kutzelnigg, and W. Klopper, Chem. Phys. Lett. 199, 497 (1992).
[33] J. Noga and W. Kutzelnigg, J. Chem. Phys. 101, 7738 (1994).
[34] T. Shiozaki, M. Kamiya, S. Hirata, and E.F. Valeev, Phys. Chem. Chem. Phys. 10,
3358 (2008).
[35] T. Shiozaki, M. Kamiya, S. Hirata, and E.F. Valeev, J. Chem. Phys. 129, 071101
(2008).
[36] A. Köhn, G.W. Richings, and D.P. Tew, J. Chem. Phys. 129, 201103 (2008).
[37] H. Fliegl, W. Klopper, and C. Hättig, J. Chem. Phys. 122, 084107 (2005).
[38] D.P. Tew, W. Klopper, C. Neiss, and C. Hättig, Phys. Chem. Chem. Phys. 9, 1921
(2007).
[39] D.P. Tew, W. Klopper, and C. Hättig, Chem. Phys. Lett. 452, 326 (2008).
[40] E.F. Valeev, Phys. Chem. Chem. Phys. 10, 106 (2008).
[41] T.B. Adler, G. Knizia, and H.-J. Werner, J. Chem. Phys. 127, 221106 (2007).
[42] G. Knizia, T.B. Adler, and H.-J. Werner, J. Chem. Phys. 130, 054104 (2009).
[43] C. Hättig, D.P. Tew, and A. Köhn, J. Chem. Phys. 132, 231102 (2010).
[44] T. Shiozaki, M. Kamiya, S. Hirata, and E.F. Valeev, J. Chem. Phys. 130, 054101
(2009).
[45] T. Shiozaki, E.F. Valeev, and S. Hirata, J. Chem. Phys. 131, 044118 (2009).
[46] A. Köhn, J. Chem. Phys. 130, 131101 (2009).
[47] H.-J. Werner, P.J. Knowles, R. Lindh, F.R. Manby, M. Schütz, et al., Molpro, version
2009.1, a package of ab initio programs, 2009, see http://www.molpro.net.
[48] TURBOMOLE V6.2 2010, a development of University of Karlsruhe and
Forschungszentrum Karlsruhe GmbH, 1989–2007, TURBOMOLE GmbH, since
2007; available from http://www.turbomole.com.
[49] S. Höfener, D.P. Tew, W. Klopper, and T. Helgaker, Chem. Phys. 356, 25 (2009).
[50] J.C. Slater and J.G. Kirkwood, Phys. Rev. 37, 682 (1931).
[51] S.F. Boys and N.C. Handy, Proc. Roy. Soc. A309, 209 (1969).
[52] S.F. Boys and N.C. Handy, Proc. Roy. Soc. A310, 43 (1969).
[53] F.R. Manby and P.J. Knowles, Chem. Phys. Lett. 310, 561 (1999).
[54] A.J. May, E. Valeev, R. Polly, and F.R. Manby, Phys. Chem. Chem. Phys. 7, 2710
(2005).
[55] E.F. Valeev, J. Chem. Phys. 125, 244106 (2006).
[56] F.R. Manby, J. Chem. Phys. 119, 4607 (2003).
[57] B.I. Dunlap, Phys. Chem. Chem. Phys. 2, 2113 (2000).
[58] A.J. May and F.R. Manby, J. Chem. Phys. 121, 4479 (2004).
[59] M. Schütz and F.R. Manby, Phys. Chem. Chem. Phys. 5, 3349 (2003).
[60] J. Noga and R.J. Bartlett, J. Chem. Phys. 86, 7041 (1987).
[61] P.E. Maslen, A.D. Dutoi, M.S. Lee, Y. Shao, and M. Head-Gordon, Mol. Phys. 103,
425 (2005).
Chapter 3
(or greater). For the purposes of this chapter, strong correlation may be
taken to be synonymous with multireference.
Strongly correlated states most commonly arise from near-degeneracy
in the underlying orbitals. As a familiar example, the hydrogen molecule
at equilibrium (in a minimal basis) possesses energetically well-separated
bonding σg and antibonding σu orbitals, and |Ψ⟩ is well approximated by
the single configuration σg². However, as the bond is stretched, σg and σu
become near-degenerate and |Ψ⟩ evolves into a strongly correlated superposition
of configurations with different occupancies across the two orbitals.
While the correlation in stretched H₂ is strong, it can be exactly described
by many methods, such as doubles configuration interaction and coupled
cluster doubles theory. This is because |Ψcorr⟩ contains only one determinant
with large weight, σu². In larger problems, however, the number of
significant determinants in |Ψcorr⟩ rises very rapidly. For example, consider
a set of hydrogen atoms arranged in a square lattice (Fig. 3.1). As we expand
the lattice constant (i.e. stretch the bonds) we recover a large degeneracy in
the underlying orbitals and |Ψcorr⟩ consists of a superposition of many
configurations distributed across all the orbitals. The strongly correlated
superposition of such a large set of configurations can now no longer be described
by simple theories. Of course, such a hydrogen lattice problem is artificial
from a chemistry standpoint, but one can readily find realistic examples
of large-scale strongly correlated electronic structure. For example,
molecules with multiple transition metals contain many near-degenerate 3d
orbitals which experience only limited overlap with neighbouring orbitals,
and are thus much like the atomic orbitals in the expanded hydrogen
lattice.
The new indices i, i′ are auxiliary in the sense that they do not appear in
the final coefficient tensor and must be contracted over in some fashion.
The simplest arrangement is to contract the indices sequentially from one
ψ^n tensor to the next. We then have
\Psi^{n_1 n_2 n_3 \ldots n_k} = \sum_{i_1 i_2 i_3 \ldots i_{k-1}} \psi^{n_1}_{i_1}\, \psi^{n_2}_{i_1 i_2}\, \psi^{n_3}_{i_2 i_3} \cdots \psi^{n_k}_{i_{k-1}}. \qquad (3.8)
More compactly, we can use matrix notation,
\Psi^{n_1 n_2 n_3 \ldots n_k} = \psi^{n_1} \psi^{n_2} \psi^{n_3} \cdots \psi^{n_k}, \qquad (3.9)
where we understand e.g. ψ^{n₂}ψ^{n₃} to denote the matrix product between the
two involving the auxiliary indices. For simplicity, we will assume that the
dimensions of all auxiliary indices are the same, and we call this dimension
M. Then the tensors ψ^n are of dimension 4 × M × M (except for the first
and the last) and the total number of parameters in the wavefunction ansatz
is O(4M²k).
This approximation (3.9) is, in fact, the DMRG wavefunction. It is com-
monly referred to as the DMRG wavefunction with M states. In calculations
it is typically used in a variational fashion, where the components ψ^n_{ii′} are
the coefficients to be varied. Note that by increasing the dimension M,
we make the ansatz arbitrarily flexible, and eventually exact. Because the
wavefunction coefficients are obtained as a series of matrix products, the
ansatz is also referred to in the literature as the matrix product state [47].
Combining the above ansatz for the coefficient tensor explicitly with the
Slater determinants yields the full DMRG wavefunction,
|\Psi_{\mathrm{DMRG}}\rangle = \sum_{\{n\}} \psi^{n_1} \psi^{n_2} \psi^{n_3} \cdots \psi^{n_k}\, |n_1 n_2 n_3 \ldots n_k\rangle. \qquad (3.10)
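As a small numerical illustration of Eqs. (3.8)–(3.10), the sketch below (plain NumPy; the variable names are illustrative, not from the chapter) builds random ψ tensors with auxiliary dimension M and contracts them into the full coefficient tensor:

```python
import numpy as np

def mps_to_full(tensors):
    """Contract MPS tensors into the full coefficient tensor.

    tensors[0] has shape (4, M), interior tensors (4, M, M), and the
    last (4, M), mirroring the matrix product of Eq. (3.9).
    """
    full = tensors[0]                                # indices (n1, i1)
    for t in tensors[1:-1]:
        full = np.einsum('...a,nab->...nb', full, t)
    return np.einsum('...a,na->...n', full, tensors[-1])

rng = np.random.default_rng(0)
k, M = 4, 3
tensors = ([rng.normal(size=(4, M))]
           + [rng.normal(size=(4, M, M)) for _ in range(k - 2)]
           + [rng.normal(size=(4, M))])

full = mps_to_full(tensors)
assert full.shape == (4,) * k    # 4^k coefficients from O(4 M^2 k) parameters
# Each coefficient is the matrix product of Eq. (3.9) for that occupancy string:
manual = tensors[0][1] @ tensors[1][0] @ tensors[2][3] @ tensors[3][2]
assert np.isclose(full[1, 0, 3, 2], manual)
```

The 4^k coefficients are generated from only O(4M²k) parameters; making M large enough recovers an arbitrary tensor, which is the sense in which the ansatz becomes exact.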
Fig. 3.2. This figure illustrates how the matrix product state of the DMRG wave-
function encodes locality of the problem by sequentially contracting auxiliary
indices that connect adjacent orbitals.
The DMRG has many formal properties which are beneficial for quantum
chemical applications. Here we briefly discuss a few:
correlation one would benefit from the knowledge of which orbitals are
in the occupied and virtual spaces. We return to this in Section 3.6.
Size-consistency: The DMRG ansatz is size-consistent when using a
localised basis. To see this in an informal way, let us assume that we
have two DMRG wavefunctions |A and |B for subsystems A and B
separately. Both A and B have a matrix product structure, i.e.
|A\rangle = \sum_{\{n^a\}} \psi^{n^a_1} \cdots \psi^{n^a_k}\, |n^a_1 \ldots n^a_k\rangle \qquad (3.19)
|B\rangle = \sum_{\{n^b\}} \psi^{n^b_1} \cdots \psi^{n^b_k}\, |n^b_1 \ldots n^b_k\rangle. \qquad (3.20)
Their product is also a DMRG wavefunction with a matrix product
structure. This then describes the combined system AB in a size-consistent way, i.e.
|AB\rangle = |A\rangle|B\rangle = \sum_{\{n^a\}\{n^b\}} \psi^{n^a_1} \cdots \psi^{n^a_k}\, \psi^{n^b_1} \cdots \psi^{n^b_k}\, |n^a_1 \ldots n^a_k n^b_1 \ldots n^b_k\rangle. \qquad (3.21)
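The size-consistency argument can be verified directly: joining the tensor lists of |A⟩ and |B⟩ through a trivial dimension-1 bond yields a single matrix product state whose coefficient tensor is exactly the product of the two, as in Eq. (3.21). A NumPy sketch with small random tensors (illustrative names, not the chapter's code):

```python
import numpy as np

def mps_to_full(tensors):
    # tensors[0]: (d, M); interior: (d, M, M); tensors[-1]: (d, M)
    full = tensors[0]
    for t in tensors[1:-1]:
        full = np.einsum('...a,nab->...nb', full, t)
    return np.einsum('...a,na->...n', full, tensors[-1])

rng = np.random.default_rng(1)
d, M, k = 4, 2, 3

def random_mps():
    return ([rng.normal(size=(d, M))]
            + [rng.normal(size=(d, M, M)) for _ in range(k - 2)]
            + [rng.normal(size=(d, M))])

A, B = random_mps(), random_mps()
psi_A, psi_B = mps_to_full(A), mps_to_full(B)

# Joining the chains with a dimension-1 bond between A and B reproduces
# the product state of Eq. (3.21): Psi_AB = Psi_A (outer) Psi_B.
joined = (A[:-1] + [A[-1][:, :, None]]      # (d, M) -> (d, M, 1)
          + [B[0][:, None, :]]              # (d, M) -> (d, 1, M)
          + B[1:])
psi_AB = mps_to_full(joined)
assert np.allclose(psi_AB, np.multiply.outer(psi_A, psi_B))
```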
In Section 3.2 we motivated the construction of the DMRG from the decom-
position of a high-dimensional tensor. In the original formulation, however,
the DMRG was derived from the numerical renormalization group ideas of
Wilson [1,2,13]. This alternative viewpoint is quite helpful and we describe
it briefly.
Consider again the model problem of a linear chain of k hydrogen atoms
in a minimal orthonormal basis. In a renormalization group approach, we
build up the electronic structure of the hydrogen chain one atom at a time.
For the first atom, any state |i₁⟩ in the Fock space F₁ of the first basis
function, with occupancies {n₁}, can be written as
|i_1\rangle = \sum_{n_1} \psi^{n_1}_{i_1} |n_1\rangle. \qquad (3.22)
For states |i₂⟩, |i₃⟩ in the Fock spaces F₂, F₃ of two and three hydrogen
atoms, respectively,
|i_2\rangle = \sum_{n_1 n_2} \psi^{n_1 n_2}_{i_2} |n_1 n_2\rangle \qquad (3.23)
|i_3\rangle = \sum_{n_1 n_2 n_3} \psi^{n_1 n_2 n_3}_{i_3} |n_1 n_2 n_3\rangle. \qquad (3.24)
Finally, for a state in the Fock space of the full chain Fk , we recover the
full configuration interaction representation of Eq. (3.3).
In the above, we expanded states in the product basis of occupancies
of the individual atoms. However, imagine that we solve the Schrodinger
equation of successively longer hydrogen chains, first with one atom, then
two, and so on. We would like to reuse information from the eigenstates
of the k − 1 atom subchain to construct eigenstates of the k atom chain.
Since the k − 1 chain eigenstates |i_{k−1}⟩ form a complete basis for F_{k−1},
rather than expanding in the occupancy basis of F_k, we can instead use the
basis {|i_{k−1}⟩} ⊗ {|n_k⟩}. For example, for |i₃⟩ ∈ F₃, instead of (3.24), we
can write
|i_3\rangle = \sum_{i_2 n_3} \psi^{n_3}_{i_2 i_3} |i_2 n_3\rangle, \qquad (3.25)
where the coefficients ψ^{n₃}_{i₂i₃} in this intermediate expansion and ψ^{n₁n₂n₃}_{i₃} in the
occupancy basis are related via
\psi^{n_1 n_2 n_3}_{i_3} = \sum_{i_2} \psi^{n_1 n_2}_{i_2}\, \psi^{n_3}_{i_2 i_3}. \qquad (3.26)
Extrapolating, a state of k hydrogen atoms |i_k⟩ can be written in terms of
the states of the intermediate k − 1 atom chain |i_{k−1}⟩, which themselves can
be written in terms of the intermediate states |i_{k−2}⟩, and this is repeated all
the way to |i₁⟩. This leads to a nested relationship between the coefficients
in the intermediate expansion and occupancy basis of the k atom chain,
\psi^{n_1 n_2 n_3 \ldots n_k}_{i_k} = \sum_{i_1 i_2 i_3 \ldots i_{k-1}} \psi^{n_1}_{i_1}\, \psi^{n_2}_{i_1 i_2}\, \psi^{n_3}_{i_2 i_3} \cdots \psi^{n_k}_{i_{k-1} i_k}. \qquad (3.27)
For the ground state i_k = 1 and Eq. (3.27) takes the same form as the
DMRG wavefunction. However, there are two details which we have yet
to discuss. Firstly, Eq. (3.27) is exact, since we used the complete basis
for each (intermediate) Fock space of the subchains. As derived above, the
dimension of the auxiliary i_p index, associated with the p atom subchain,
is not M, but grows exponentially as p = 1 … k − 1. Secondly, the
components are here constructed from the Hamiltonian eigenstates of
intermediate subchains, rather than being variational coefficients as in the DMRG
wavefunction.
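The first of these points can be seen in a small numerical experiment: decomposing a random state by successive SVDs shows the exact auxiliary dimensions growing as min(d^p, d^(k−p)), while capping each bond at M states gives the DMRG compression. (NumPy sketch with local dimension d = 2 rather than 4 to keep the tensors small; names are illustrative.)

```python
import numpy as np

def bond_dimensions(psi, d, k, M=None):
    """Split psi bond by bond with SVDs; optionally truncate each bond to M."""
    dims = []
    mat = psi.reshape(d, -1)
    for _ in range(k - 1):
        _, s, vt = np.linalg.svd(mat, full_matrices=False)
        m = len(s) if M is None else min(len(s), M)
        dims.append(m)
        # keep m states and carry the remainder of the chain to the next bond
        mat = (np.diag(s[:m]) @ vt[:m]).reshape(m * d, -1)
    return dims

rng = np.random.default_rng(2)
d, k, M = 2, 6, 3
psi = rng.normal(size=(d,) * k)
psi /= np.linalg.norm(psi)

# Exact decomposition: the auxiliary dimension grows as min(d^p, d^(k-p)) ...
assert bond_dimensions(psi, d, k) == [2, 4, 8, 4, 2]
# ... while the DMRG ansatz caps every bond at M states.
assert bond_dimensions(psi, d, k, M) == [2, 3, 3, 3, 2]
```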
Regarding the first point, Wilson noted that the eigenstates |ip of an
intermediate p subchain span an increasing set of energies as the subchain
length p increases. If we are interested in only a few low energy eigen-
states of the full k chain problem, it would be unnecessary to use a com-
plete basis for the intermediate Fock spaces. Instead, for each intermediate
After its introduction in 1992 by White [1], the density matrix renor-
malization group was soon applied to many problems involving model
Hamiltonians in condensed matter. Early applications in conjunction with
semi-empirical Hamiltonians focused on the Hubbard and Pariser–Parr–
Pople (PPP) models for conjugated systems, see e.g. Refs. [15–21]. As
a representative example, we consider the work by Fano et al. [18] who
performed DMRG studies on cyclic polyenes (CₘHₘ, m = 4n + 2,
the same accuracy across the entire potential energy curve. This and other
studies demonstrated the ability of the DMRG wavefunction to capture
multireference correlation in a balanced way, as we described in Section 3.4.
Conversely, when moving from a small basis to a larger basis DMRG calcu-
lation for the same molecule (e.g. from a double-zeta to a triple-zeta basis
for the water molecule, as in [29]), the number of states that needed to be
kept in the DMRG ansatz to achieve a given accuracy had to be increased
significantly, demonstrating that dynamical correlation is not efficiently
captured by the DMRG wavefunction. Thus, as emphasized several times
in this chapter, the most promising domain of application of the DMRG
method must be to solve active-space strong correlation (multireference)
problems. With current DMRG technology, a nearly-exact treatment of the
complete active space correlation for arbitrary molecules with up to roughly
30 active orbitals and electrons can be achieved.
Given the strength of the DMRG method for large-scale multireference
electronic structure, a clear domain of application must be to complicated
transition metal problems. Although such applications are still at a rel-
atively early stage, the Reiher group has performed some preliminary
studies [30, 38–41]. For example, they used the DMRG method [30] to
calculate the spin-gap of the Cu₂O₂ core of tyrosinase, a problem which
had evaded conventional complete active space methods due to the need for
a large active space. More recently, Kurashige and Yanai not only obtained
correctly converged DMRG energies for the same Cu₂O₂ system that had
been studied earlier but not fully converged by the Reiher group, but also
carried out a near exact solution of the complete active space problem for the
Cr2 molecule correlating an active space of 24 electrons in 30 orbitals [33].
Finally, we recently reported a description of the Cu₂O₂ electronic structure
problem that included dynamic correlation via canonical transformation
theory on top of the density matrix renormalization group, along the lines
of Section 3.6.
One of the directions of our own group in recent years has been to use
the DMRG as an efficient local multireference method for long molecules.
In these ideal settings, the DMRG method obtains near-exact active space
solutions of the Schrodinger equation for problem sizes inconceivable using
other techniques, e.g. for 100 orbital, 100 electron active spaces. In our
first demonstration, we showed how the DMRG could exactly describe the
simultaneous bond-breaking of 49 bonds in a hydrogen chain, a problem
nominally requiring a 50 electron, 50 orbital active space. In more recent
works, we have used our local DMRG method to study excited states in con-
jugated systems which have significant multireference character, ranging
Fig. 3.3. A keplerate magnet contains more than 30 iron spin-centres. Because
of the non-linearity of the correlation, the DMRG does not give a good description
of the electronic structure of this molecule, but it might be a candidate for solution
by a more general tensor network.
where x, y range from 0 to k and denote the coordinates of the orbitals on the
lattice. Promisingly, the general arguments that demonstrate the optimality
of the DMRG in one-dimensional topologies appear to apply to the PEPS
wavefunction in two- and three-dimensional topologies.
There remain, however, many challenges before efficient calculations
using tensor network wavefunctions are practical. Most of the difficulties
arise from the proliferation of auxiliary indices. For example, unlike in
the case of the DMRG wavefunction, the exact variational evaluation of
the energy in PEPS formally requires exponential time! However, as is
well known from coupled cluster theory, it is not necessary for the energy
(and other observables) to be evaluated as variational expectation values
and indeed approximate polynomial time algorithms to evaluate the PEPS
energy have been introduced [45]. Still, the most pressing questions before
these higher dimensional analogues of the DMRG become widely used
are (i) what is the best approximate algorithm for evaluating
expectation values, and (ii) are there modifications to the tensor network
form which facilitate more efficient manipulation? Should these questions
be satisfactorily solved, this would open the way to the application of
tensor network states such as PEPS and MERA as a general way to solve
strongly correlated electronic structure problems, of arbitrary complexity,
in quantum chemistry.
Bibliography
[18] G. Fano, F. Ortolani, and L. Ziosi, J. Chem. Phys. 108, 9246 (1998).
[19] G.L. Bendazzoli, S. Evangelisti, G. Fano, F. Ortolani, and L. Ziosi, J. Chem. Phys.
110, 1277 (1999).
[20] C. Raghu, Y. Anusooya Pati, and S. Ramasesha, Phys. Rev. B 65, 155204 (2002).
[21] C. Raghu, Y. Anusooya Pati, and S. Ramasesha, Phys. Rev. B 66, 035116 (2002).
[22] R. Pariser and R. Parr, J. Chem. Phys. 21, 466 (1953).
[23] R. Pariser and R. Parr, J. Chem. Phys. 21, 767 (1953).
[24] J.A. Pople, Trans. Faraday Soc. 49, 1375 (1953).
[25] J. Paldus, M. Takahashi, and R.W.H. Cho, Phys. Rev. B 30, 4267 (1984).
[26] J. Paldus, J. Čížek, and M. Takahashi, Phys. Rev. B 30, 2193 (1984).
[27] S.R. White and R.L. Martin, J. Chem. Phys. 110, 4127 (1999).
[28] A.O. Mitrushenkov, G. Fano, F. Ortolani, R. Linguerri, and P. Palmieri, J. Chem.
Phys. 115, 6815 (2001).
[29] G.K.-L. Chan and M. Head-Gordon, J. Chem. Phys. 116, 4462 (2002).
[30] K.H. Marti, I.M. Ondík, G. Moritz, and M. Reiher, J. Chem. Phys. 128, 014104
(2008).
[31] Ö. Legeza, J. Röder, and B.A. Hess, Phys. Rev. B 67, 125114 (2003).
[32] D. Zgid and M. Nooijen, J. Chem. Phys. 128, 014107 (2008).
[33] Y. Kurashige and T. Yanai, J. Chem. Phys. 130, 234114 (2009).
[34] S. Daul, I. Ciofini, C. Daul, and S.R. White, Int. J. Quantum Chem. 79, 331 (2000).
[35] A.O. Mitrushenkov, R. Linguerri, P. Palmieri, and G. Fano, J. Chem. Phys. 119, 4148
(2003).
[36] Ö. Legeza, J. Röder, and B.A. Hess, Mol. Phys. 101, 2019 (2003).
[37] G.K.-L. Chan and M. Head-Gordon, J. Chem. Phys. 118, 8551 (2003).
[38] G. Moritz, B.A. Hess, and M. Reiher, J. Chem. Phys. 122, 024107 (2005).
[39] G. Moritz and M. Reiher, J. Chem. Phys. 124, 034103 (2006).
[40] G. Moritz, A. Wolf, and M. Reiher, J. Chem. Phys. 123, 184105 (2005).
[41] G. Moritz and M. Reiher, J. Chem. Phys. 126, 244109 (2007).
[42] J. Hachmann, J.J. Dorando, M. Aviles, and G.K.-L. Chan, J. Chem. Phys. 127, 134309
(2007).
[43] T. Yanai, Y. Kurashige, D. Ghosh, and G.K.-L. Chan, Int. J. Quantum Chem. (2009).
In press.
[44] D. Ghosh, J. Hachmann, T. Yanai, and G.K.-L. Chan, J. Chem. Phys. 128, 144117
(2008).
[45] F. Verstraete, V. Murg, and J.I. Cirac, Adv. Phys. 57, 143 (2008).
[46] G. Vidal, in Understanding Quantum Phase Transitions, Series in Condensed Matter
Physics, edited by L.D. Carr (Taylor & Francis, Boca Raton, 2010), pp. 115–138.
Chapter 4
Reduced-Density-Matrix Theory
for Many-electron Correlation
David A. Mazziotti
62 D.A. Mazziotti
4.1. Introduction
where the a† and the a are the second-quantized creation and annihilation
operators, the indices refer to members of a spin-orbital basis set, and the
two-electron reduced Hamiltonian matrix 2 K is the matrix representation
of the operator
{}^2K = \frac{1}{N-1}\left(-\frac{1}{2}\nabla_1^2 - \sum_j \frac{Z_j}{r_{1j}}\right) + \frac{1}{2}\,\frac{1}{r_{12}}. \qquad (4.2)
E = \sum_{i,j,k,l} {}^2K^{i,j}_{k,l}\, {}^2D^{i,j}_{k,l} \qquad (4.3)
E = \mathrm{Tr}({}^2K\, {}^2D). \qquad (4.4)
Both the energy and the one- and two-electron properties of an atom
or molecule can be computed from a knowledge of the 2-RDM. To perform
a variational optimization of the ground-state energy, we must constrain
the 2-RDM to derive from the integration of an N-electron density matrix. These
necessary and sufficient constraints are known as N-representability conditions [1, 2].
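Eqs. (4.3) and (4.4) are the same contraction written in index and matrix form, which a few lines of NumPy confirm (the random tensors merely stand in for ²K and ²D; no N-representability is implied):

```python
import numpy as np

r = 4                               # rank of the spin-orbital basis (toy size)
rng = np.random.default_rng(3)
K2 = rng.normal(size=(r, r, r, r))  # stand-in for the reduced Hamiltonian 2K
D2 = rng.normal(size=(r, r, r, r))  # stand-in for a 2-RDM

# Eq. (4.3): explicit sum over all upper and lower index pairs
E_sum = np.einsum('ijkl,ijkl->', K2, D2)

# Eq. (4.4): the same number as a matrix trace after grouping (i,j) and (k,l);
# the transpose compensates for the random (non-Hermitian) stand-in matrices.
E_trace = np.trace(K2.reshape(r * r, r * r) @ D2.reshape(r * r, r * r).T)
assert np.isclose(E_sum, E_trace)
```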
4.2.2.1. Two-positivity
When p = 2, we may choose the C_{i,j} in three distinct ways: (i) to create
one particle in the jth orbital and one particle in the ith orbital, that is
C_{i,j} = a†_i a†_j, (ii) to annihilate one particle in the jth orbital and one particle
in the ith orbital (or create holes in each of these orbitals), C_{i,j} = a_i a_j, and
(iii) to annihilate one particle in the jth orbital and create one particle in
the ith orbital, that is C_{i,j} = a†_i a_j. These three choices for the C_{i,j} produce
the following three different metric matrices for the 2-RDM:
{}^2D^{i,j}_{k,l} = \langle\Psi| a^\dagger_i a^\dagger_j a_l a_k |\Psi\rangle, \qquad (4.9)
{}^2Q^{i,j}_{k,l} = \langle\Psi| a_i a_j a^\dagger_l a^\dagger_k |\Psi\rangle, \qquad (4.10)
{}^2G^{i,j}_{k,l} = \langle\Psi| a^\dagger_i a_j a^\dagger_l a_k |\Psi\rangle, \qquad (4.11)
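Each metric matrix has the Gram form ⟨Ψ|C†C|Ψ⟩ and must therefore be positive semidefinite when matricized; numerically that is a simple eigenvalue test. A hedged NumPy sketch (toy matrices, not actual RDMs):

```python
import numpy as np

def is_positive_semidefinite(mat, tol=1e-10):
    """A metric matrix is admissible only if all its eigenvalues are >= 0."""
    sym = (mat + mat.T.conj()) / 2     # enforce Hermiticity before the test
    return np.min(np.linalg.eigvalsh(sym)) > -tol

rng = np.random.default_rng(4)
B = rng.normal(size=(6, 6))
# Gram matrices B B^T share the <C† C> structure and always pass...
assert is_positive_semidefinite(B @ B.T)
# ...whereas a matrix with a negative eigenvalue fails.
assert not is_positive_semidefinite(np.diag([1.0, -0.5]))
```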
4.2.2.2. Three-positivity
The conditions that a 3-RDM be three-positive follow from writing the
operators in Eq. (4.6) as products of three second-quantized operators [17,
19,34,76]. The resulting basis functions lie in four vector spaces according
to the number of creation operators in the product. Basis functions between
these vector spaces are orthogonal because they are contained in Hilbert
spaces with different numbers of particles. The four metric matrices that
must be constrained to be positive semidefinite for three-positivity [17] are
given by
{}^3D^{i,j,k}_{p,q,r} = \langle\Psi| a^\dagger_i a^\dagger_j a^\dagger_k a_r a_q a_p |\Psi\rangle \qquad (4.16)
{}^3E^{i,j,k}_{p,q,r} = \langle\Psi| a^\dagger_i a^\dagger_j a_k a^\dagger_r a_q a_p |\Psi\rangle \qquad (4.17)
{}^3F^{i,j,k}_{p,q,r} = \langle\Psi| a^\dagger_i a_j a_k a^\dagger_r a^\dagger_q a_p |\Psi\rangle \qquad (4.18)
{}^3Q^{i,j,k}_{p,q,r} = \langle\Psi| a_i a_j a_k a^\dagger_r a^\dagger_q a^\dagger_p |\Psi\rangle. \qquad (4.19)
probability distributions for one particle and two holes, and for three holes,
to be nonnegative.
As in Eqs. (4.12) and (4.13) for the two-positive metric matrices, the
three-positive metric matrices are connected by linear mappings which can
be derived by rearranging the second-quantized operators. For example,
the mapping from ³D to ³Q may be written with the Grassmann wedge
product [14, 77] as
where ¹I, ²I, and ³I are the one-, two-, and three-particle identity matrices.
Similar mappings can be derived to express ³E and ³F as functionals of ³D.
Contraction of the three-positivity matrices in Eqs. (4.16)–(4.19) generates the one-
and two-positivity metric matrices, and hence, the three-positivity
conditions imply the one- and two-positivity conditions. A 2-RDM is defined to
be three-positive if it arises from the contraction of a three-positive 3-RDM:
{}^2D^{i,j}_{p,q} = \frac{1}{N-2} \sum_k {}^3D^{i,j,k}_{p,q,k}. \qquad (4.21)
The three-positivity conditions have been examined in variational 2-RDM
calculations on spin [17, 76, 78] and molecular [34, 35] systems where they
give highly accurate energies and 2-RDMs.
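The contraction in Eq. (4.21) is a partial trace over the last upper and lower index; in NumPy it is a single einsum call (the random tensor below is only a stand-in for a genuine 3-RDM):

```python
import numpy as np

N, r = 4, 5                          # electron number and basis rank (toy sizes)
rng = np.random.default_rng(5)
D3 = rng.normal(size=(r,) * 6)       # stand-in 3-RDM, indices (i, j, k, p, q, r)

# Eq. (4.21): trace out the third particle and divide by N - 2
D2 = np.einsum('ijkpqk->ijpq', D3) / (N - 2)
assert D2.shape == (r, r, r, r)

# spot-check one element against the explicit sum over k
element = sum(D3[0, 1, k, 2, 3, k] for k in range(r)) / (N - 2)
assert np.isclose(D2[0, 1, 2, 3], element)
```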
the context of 2-RDM theory by Rosina [82], Harriman [83], and the
author [14], and it was recently employed for solving large-scale semidef-
inite programs in combinatorial optimization [84]. The applications in
Mazziotti [25, 26] and Burer and Choi [85] are the first to apply the matrix
factorization to semidefinite programs with multiple diagonal blocks in
the solution matrix M. The linear constraints, including the trace, the
contraction, and the interrelations between the metric matrices, become
quadratic in the new independent variables R. Therefore, the factorization
in Eq. (4.22) converts the semidefinite program into a nonlinear program
where the energy must be minimized with respect to R, while nonlinear
constraint equalities are enforced.
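The resulting structure — an energy minimized over R subject to quadratic equality constraints — can be mimicked on a toy problem. The sketch below uses one common augmented Lagrangian convention, L = E − λc + c²/(2μ); the objective, constraint, and parameter values are invented for illustration:

```python
import numpy as np

# Toy analogue: minimize E(x) = x.x subject to c(x) = x[0] + x[1] - 1 = 0.
def grad_L(x, lam, mu):
    c = x[0] + x[1] - 1.0
    # gradient of L = E - lam*c + c^2/(2*mu)
    return 2 * x + (c / mu - lam) * np.array([1.0, 1.0])

x, lam, mu = np.zeros(2), 0.0, 0.1
for _ in range(50):                      # outer multiplier updates
    for _ in range(500):                 # inner unconstrained minimization
        x = x - 0.05 * grad_L(x, lam, mu)
    lam -= (x[0] + x[1] - 1.0) / mu      # first-order multiplier update

# The constrained minimum of ||x||^2 on the line x0 + x1 = 1 is (0.5, 0.5).
assert np.allclose(x, [0.5, 0.5], atol=1e-6)
```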
We solve the nonlinear formulation of the semidefinite program by
the augmented Lagrange multiplier method for constrained nonlinear opti-
mization [25, 26, 37, 84, 86]. Consider the augmented Lagrangian function
L(R) = E(R) - \sum_i \lambda_i c_i(R) + \frac{1}{2\mu} \sum_i c_i(R)^2, \qquad (4.23)
where R is the matrix factor for the solution matrix M, E(R) is the ground-
state energy as a function of R, {c_i(R)} is the set of equality constraints, {λ_i}
is the set of Lagrange multipliers, and μ is the penalty parameter. For an
4.2.4. Applications
Because the N-representability conditions are independent of a reference
wavefunction, the variational 2-RDM method can capture strong electron
correlation effects in molecules. To illustrate this ability, we discuss pre-
vious applications of the variational 2-RDM method to (i) the dissociation
of the N₂ molecule [34], (ii) the prediction of the metal-to-insulator
transition in the H₆₄ lattice [41], and (iii) the emergence of polyradical character
in acene chains [38].
Fig. 4.1. This figure compares the 3POS and CCSDT potential energy curves
denoted by dashed lines with the FCI curve denoted by a solid line. The variational
lower-bound 3POS curve is essentially indistinguishable from the FCI curve. Bond
distance is reported in ångströms (Å).
Fig. 4.3. Potential energy curve for the symmetric dissociation of the 4×4×4
hydrogen cube, reported per atom, as a function of the distance between the closest
atoms.
Fig. 4.4. Metal-to-insulator transition in the 4×4×4 hydrogen cube under the
change of the distance R between closest atoms.
Fig. 4.6. Natural orbital occupation numbers for the n-acene series (n = 2–8).
The basis set is double-ζ and calculations are performed with an active space that
includes the 4n + 2 lowest lying molecular orbitals.
\langle\Psi| a^\dagger_i a^\dagger_j a_l a_k \hat{H} |\Psi\rangle = E\, {}^2D^{i,j}_{k,l}. \qquad (4.24)
physicists' notation [92], the ACSE depends only upon the 2- and 3-RDMs.
To eliminate the 3-RDM from the ACSE approximately, we can reconstruct
the 3-RDM from the 2-RDM according to its cumulant expansion [15, 16,
45, 47, 48, 93]
{}^3D^{i,j,k}_{s,t,u} \approx 6\, {}^1D^i_s \wedge {}^1D^j_t \wedge {}^1D^k_u + 9 \left({}^2D^{i,j}_{s,t} - 2\, {}^1D^i_s \wedge {}^1D^j_t\right) \wedge {}^1D^k_u, \qquad (4.27)
where ∧ denotes the anti-symmetric Grassmann (or wedge) product [14,
94]. The missing term in the reconstruction, known as the connected (or
cumulant) part ³Δ of the 3-RDM, contains information not expressible as
wedge products of the 1- and 2-RDMs [15, 16, 45, 47, 48, 93]. Although
the connected 3-RDM can be approximated in terms of the 2-RDM, it is
neglected in the multi-reference formulation of the ACSE in [58].
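The wedge product itself is easy to exercise numerically. For a single Slater determinant, whose 1-RDM is an orthogonal projector, Wick's theorem gives ²D = 2 ¹D ∧ ¹D exactly; the NumPy sketch below (illustrative names; the antisymmetrized-product convention for ∧ is assumed) checks this limiting case of the cumulant structure:

```python
import numpy as np

def wedge(A, B):
    """Grassmann wedge of two one-particle matrices."""
    prod = np.einsum('is,jt->ijst', A, B)
    prod = (prod - prod.transpose(0, 1, 3, 2)) / 2   # antisymmetrize s,t
    prod = (prod - prod.transpose(1, 0, 2, 3)) / 2   # antisymmetrize i,j
    return prod

rng = np.random.default_rng(6)
r, n_occ = 6, 3
# 1-RDM of a single Slater determinant: a projector onto n_occ orbitals
C = np.linalg.qr(rng.normal(size=(r, n_occ)))[0]
D1 = C @ C.T

# Wick's theorem: <a†_i a†_j a_t a_s> = D1[i,s] D1[j,t] - D1[i,t] D1[j,s]
D2 = np.einsum('is,jt->ijst', D1, D1) - np.einsum('it,js->ijst', D1, D1)
assert np.allclose(D2, 2 * wedge(D1, D1))
```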
The cumulant reconstruction [15, 16, 45, 47] is also an essential part of
the canonical transformation (CT) method [95, 96], which has been shown
to be a solution of the ACSE in the Heisenberg representation [54]. Despite
their theoretical connections, the ACSE and CT methods are practically
very different with distinct fundamental variables (the 2-RDM (ACSE)
versus an effective Hamiltonian (CT)), convergence behaviors, results, and
capabilities [54, 61]. In general, reconstruction is an important component
of any method within contracted Schrodinger theory, that is a theory using
the CSE, or a part of the CSE such as the ACSE, as a stationary-state
condition [40].
and
\frac{d\, {}^2D^{i,j}_{k,l}}{d\lambda} = \langle\Psi(\lambda)| [{}^2\hat{\Gamma}^{i,j}_{k,l}, \hat{S}(\lambda)] |\Psi(\lambda)\rangle. \qquad (4.32)
To minimize the energy along λ, we select the following elements of the
two-particle matrix ²S^{p,q}_{s,t}(λ), which minimize dE/dλ along its gradient
with respect to these elements [34]:
{}^2S^{p,q}_{s,t}(\lambda) = \langle\Psi(\lambda)| [{}^2\hat{\Gamma}^{p,q}_{s,t}, \hat{H}] |\Psi(\lambda)\rangle. \qquad (4.33)
Importantly, the left side of Eq. (4.33) is simply the residual of the ACSE.
If the residual in the ACSE vanishes, the unitary transformations become
the identity operator, and the energy and 2-RDM cease to change with λ.
Using the cumulant reconstruction of the 3-RDM in Eq. (4.27) permits
us to express these equations approximately in terms of the 2-RDM.
Hence, Eqs. (4.31)(4.33) collectively provide a system of differential
equations [5355, 58] for evolving an initial 2-RDM to a final 2-RDM
that solves the ACSE for stationary states. In practice, the equations are
evolved in λ until either (i) the energy or (ii) the least-squares norm of
the ACSE increases. The ACSE can be seeded with an initial 2-RDM from
either (i) a Hartree–Fock calculation or (ii) any correlated calculation (e.g. a
multi-configuration self-consistent-field (MCSCF) calculation [58]). Convergence
to the ACSE's solution is efficient in both cases [53, 54, 58].
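The logic of this evolution — build a generator from the residual, take a small step, renormalize, and stop once the energy ceases to decrease — can be mimicked for a state vector under a toy matrix Hamiltonian. This is a generic gradient flow standing in for the 2-RDM equations, not the ACSE itself:

```python
import numpy as np

rng = np.random.default_rng(7)
H = rng.normal(size=(5, 5))
H = (H + H.T) / 2                        # toy Hamiltonian
psi = rng.normal(size=5)
psi /= np.linalg.norm(psi)

E = psi @ H @ psi
for _ in range(20000):
    residual = H @ psi - E * psi         # analogue of the ACSE residual
    trial = psi - 0.1 * residual         # first-order rotation down the gradient
    trial /= np.linalg.norm(trial)
    E_trial = trial @ H @ trial
    if E_trial >= E:                     # stop once the energy stops decreasing
        break
    psi, E = trial, E_trial

# The flow terminates at the lowest eigenvalue, a stationary state.
assert np.isclose(E, np.linalg.eigvalsh(H).min(), atol=1e-6)
```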
As demonstrated in the recent extension of the ACSE to excited
states [63], even though the unitary rotations are selected in Eq. (4.33)
to minimize the energy, the system of differential equations in Eqs. (4.31)
(4.33) is capable of producing energy and 2-RDM solutions of the ACSE
for both ground and excited states. Because excited states correspond to
local energy minima of the ACSE and the gradient in Eq. (4.33) leads to a
local rather than global energy minimum, an excited-state solution can be
readily obtained from a good guess for the initial 2-RDM. A guess will be
good when it is closer to the minimum of the desired solution of the ACSE
than to any other minimum. Such 2-RDM guesses can be generated from
multi-configuration self-consistent-field (MCSCF) calculations. The initial
MCSCF 2-RDM directs the optimization of the ACSE to a desired excited
state because it contains important multi-reference correlation effects that
identify the state.
Seeding the ACSE with an MCSCF 2-RDM yields a balanced treatment
of both single- and multi-reference correlation [40, 58, 61, 64, 65]. Because
the ACSE with reconstruction incorporates many high orders of a renor-
malized perturbation theory, its energies are significantly more accurate
than those from second or third orders of a multi-reference many-body
perturbation theory [40, 58, 61, 64]. Furthermore, in the absence of strong
correlation the ACSE can be compared to coupled cluster methods where it
yields energies that are between the accuracies of coupled cluster with
single and double excitations (CCSD) and coupled cluster with single,
double, and triple excitations (CCSDT) [55]. In addition to its balance of
moderate and strong correlation effects, the ACSE has advantages in com-
putational scaling. It scales like r⁶, where r is the rank of the one-electron
basis set, but its accuracy is between that of CCSD and CCSDT where the
latter scales as r⁷. Moreover, while multi-reference wavefunction methods
scale exponentially with the number r_a of active orbitals of the active space,
the ACSE only scales quadratically as r_a² [58, 61]. This significant reduction
in computational cost allows the ACSE to treat larger active spaces than
traditional methods.
4.3.3. Applications
Applications of the ACSE to the ground state have been made to a number
of systems and reactions including: (i) the electrocyclic ring-opening of
bicyclobutane to gauche-1,3-butadiene [60], (ii) the relative energies of the
cis-trans isomers of HO₃ [57], (iii) the sigmatropic shift of hydrogen in
propene and acetone enolate [61], and (iv) the study of vinylidene carbene
reactions [65]. These calculations demonstrate that the ACSE yields a
balanced description of single- and multi-reference (strong) correlation
effects in both the presence and absence of strong electron correlation. In
contrast, traditional wavefunction methods tend to be optimal in either
the presence (multi-reference perturbation methods) or absence (coupled-
cluster methods) of strong correlation. An equally accurate description of
correlation in both limits is extremely important in practical applications
where energy differences must be computed between molecular species or
states with significantly different degrees of electron correlation.
78 D.A. Mazziotti
41.2 and 55.7 kcal/mol reaction barriers for the conrotatory and disrotatory
pathways, respectively (Fig. 4.7). The ACSE energy barrier of 55 kcal/mol
appears to resolve a 10 kcal/mol energy discrepancy between coupled
cluster and multi-reference perturbation methods in the literature [60].
Fig. 4.8. The (a) MCSCF (top), (b) MRMP2 (middle), and (c) ACSE (bottom) potential energy curves for the 1 ³B₁, 1 ³A₂, and 2 ³B₁ states of methylene, as functions of R (Å), plotted against those from full CI, given by data points. Reproduced
from [66].
Both the variational 2-RDM method in Section 4.2 and the contracted
Schrodinger methods in Section 4.3 have the ability to capture strong
|\Psi\rangle = c_0 |\Psi_0\rangle + \frac{1}{4} \sum_{ijab} c_{ij}^{ab} \, \hat{a}_a^\dagger \hat{a}_b^\dagger \hat{a}_j \hat{a}_i |\Psi_0\rangle, \qquad (4.34)
{}^2 D_{ij}^{ab} = c_{ij}^{ab} \left( 1 - \frac{1}{4} \sum_{klcd} f_{abcd}^{ijkl} \, |c_{kl}^{cd}|^2 \right). \qquad (4.36)
Because of its role, this tensor has been called the topological factor [67–69, 98]. If all elements of f are set to one, we obtain CID as in Eq. (4.35),
but if all elements of f are set to zero, we obtain a coupled electron-
pair approximation (CEPA). While the CID energy is not size extensive,
its 2-RDM is N-representable; in contrast, the energy from CEPA is size
extensive, but its 2-RDM is not N-representable. Selection of an optimal f
requires us to consider the N-representability of the 2-RDM.
The most important N-representability conditions are the two-positivity conditions. The two-positivity conditions imply N-representability conditions known as the Cauchy-Schwarz inequalities. From the nonnegativity of ²D and ²Q we have

({}^2 D_{ab}^{ij})^2 \le {}^2 D_{ij}^{ij} \, {}^2 D_{ab}^{ab} \qquad (4.37)

({}^2 Q_{ab}^{ij})^2 \le {}^2 Q_{ij}^{ij} \, {}^2 Q_{ab}^{ab}. \qquad (4.38)
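As a quick numerical sketch of these inequalities: any positive semidefinite matrix obeys the Cauchy-Schwarz bound on its off-diagonal elements. The toy Gram matrix below merely stands in for the metric matrices ²D and ²Q; it is not an actual 2-RDM.

```python
import numpy as np

# Any positive semidefinite matrix M satisfies |M[x,y]|^2 <= M[x,x]*M[y,y],
# which is the content of Eqs. (4.37)-(4.38) for 2D and 2Q. The matrix below
# is a random toy PSD matrix, not an actual two-electron density matrix.

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
M = A @ A.T  # M = A A^T is positive semidefinite by construction

for x in range(6):
    for y in range(6):
        # small tolerance guards against floating-point round-off
        assert M[x, y] ** 2 <= M[x, x] * M[y, y] + 1e-12
print("Cauchy-Schwarz holds for all index pairs")
```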
possible values can be divided into nine classes, labeled by n_o/n_v, where n_o is the number of occupied orbitals shared by {ij} and {kl} and n_v is the number of virtual orbitals shared by {ab} and {cd}.

Table 4.1. Topological factors f_{abcd}^{ijkl} (or f_{n_o/n_v}) for the 2-RDM methods.

2-RDM method   0/0   1/0   2/0   0/1   0/2   1/1   2/1   1/2   2/2
CID             1     1     1     1     1     1     1     1     1
CEPA            0     0     0     0     0     0     0     0     0
D               0     1     1     0     0     1     1     1     1
Q               0     0     0     1     1     1     1     1     1
K               0    1/2    1    1/2    1    3/4    1     1     1
M               0     0     1     0     1     1     1     1     1

For the energy
functional to be size extensive, the topological factor must vanish for the
class n_o/n_v = 0/0. Unlike the other factors, the D and Q factors do not
maintain particle-hole symmetry, that is, in general f_{n_o/n_v} ≠ f_{n_v/n_o}. To
restore particle-hole symmetry with exact results for two particles or two
holes (when single excitations are included), we must set the other classes
to one, except for 0/1, 1/0, and 1/1. Because either D or Q has a factor of
zero for 0/1 or 1/0 while both D and Q have factors of one for 1/1, we set
the factor to one for 1/1 and zero for 0/1 and 1/0. These choices generate a
new topological factor (M) in Table 4.1 [67, 75].
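Table 4.1 can be encoded as a small lookup, which also makes the size-extensivity and particle-hole symmetry statements above directly checkable. This is a transcription of the table with class key (n_o, n_v), nothing more:

```python
from fractions import Fraction

# Topological factors f_{no/nv} from Table 4.1, keyed by the class
# (no, nv) of shared occupied and virtual orbitals.
CLASSES = [(0, 0), (1, 0), (2, 0), (0, 1), (0, 2), (1, 1), (2, 1), (1, 2), (2, 2)]

F = {
    "CID":  dict(zip(CLASSES, (1, 1, 1, 1, 1, 1, 1, 1, 1))),
    "CEPA": dict(zip(CLASSES, (0, 0, 0, 0, 0, 0, 0, 0, 0))),
    "D":    dict(zip(CLASSES, (0, 1, 1, 0, 0, 1, 1, 1, 1))),
    "Q":    dict(zip(CLASSES, (0, 0, 0, 1, 1, 1, 1, 1, 1))),
    "K":    dict(zip(CLASSES, (0, Fraction(1, 2), 1, Fraction(1, 2), 1,
                               Fraction(3, 4), 1, 1, 1))),
    "M":    dict(zip(CLASSES, (0, 0, 1, 0, 1, 1, 1, 1, 1))),
}

# Size extensivity: the factor must vanish for the 0/0 class.
assert all(F[m][(0, 0)] == 0 for m in ("CEPA", "D", "Q", "K", "M"))
# The M factor restores particle-hole symmetry: f_{no/nv} == f_{nv/no} ...
assert all(F["M"][(a, b)] == F["M"][(b, a)] for a, b in CLASSES)
# ... while the D factor, for example, breaks it.
assert F["D"][(1, 0)] != F["D"][(0, 1)]
```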
4.4.2. Applications
4.4.2.1. Correlation energies at equilibrium geometries
Correlation energies from parametric 2-RDM methods as well as traditional
wavefunction methods are reported in Table 4.2 for several molecules in
the polarized quadruple-zeta (cc-pVQZ) basis set [75, 99]. Molecules NH3
and HCN are given in the correlation-consistent polarized triple-zeta (cc-
pVTZ) basis set [99]. The K and M methods recover much more correlation
energy than CISD, which is not size extensive. Furthermore, the M method
improves significantly upon CCSD with energies that are closer to those
from CCSD(T). The K method improves slightly upon CCSD. The 2-RDMs
from the parametric methods are nearly N-representable; for example, with the M method for N₂ the lowest eigenvalues of ²D, ²Q, and ²G (−5.0 × 10⁻⁴, −3.0 × 10⁻⁴, and −4.1 × 10⁻⁴) are three to four orders of magnitude smaller than the largest eigenvalues.
Table 4.2. Correlation energies from parametric 2-RDM methods as well as tra-
ditional wavefunction methods are reported for molecules in the cc-pVQZ basis
set except for NH3 and HCN in the cc-pVTZ basis set. The M 2-RDM method
improves significantly upon CCSD. Energies are given in Hartrees (H).
Fig. 4.9. The potential energy curves for hydrogen fluoride in the cc-pVQZ basis
set from the K, M, CCSD, CCSD(T), and CR-CC(2,3) methods. The energy results
of the M functional are nearly indistinguishable from those of the computationally
more expensive CR-CC(2,3). The length of the H-F bond is given in Angstroms
(Å).
The collection of 2-RDM methods offers a new paradigm for the com-
putation of electron correlation in quantum systems [1]. While the wave-
function scales exponentially in the number N of electrons, the 2-RDM
scales polynomially in N. Consequently, for many-electron quantum
systems the 2-RDM theory offers a significant reduction in computational
cost even in the presence of strong electron correlation. The 2-RDM has
long been employed as a tool for analysis of quantum information, but, as
discussed in the Introduction, efforts to compute the 2-RDM directly were
hindered by the N-representability problem, that is the 2-RDM must be
constrained to correspond to an N-electron system [7].
Fig. 4.10. Critical points on the potential energy surface for the isomerization of
nitrosomethane to trans-formaldoxime as computed by the 2-RDM method in the
aug-cc-pVTZ basis set. The dashed line represents a 1,3-hydrogen shift; the solid
line represents successive 1,2-shifts. The figure shows that 1,2-shift is energetically
more favorable than the 1,3-shift by about 10 kcal/mol. All relative energies are
reported in kcal/mol.
Recent advances [1] have enabled the direct computation of the 2-RDM
without the many-electron wavefunction by the methods discussed in this
chapter: (i) the constrained and parametric variational 2-RDM methods
and (ii) the solution of the contracted Schrodinger equation or its anti-
Hermitian part. Importantly, as seen with the acene chains [38] and the
hydrogen lattices [41], these 2-RDM approaches permit the treatment
of strong electron correlation in systems that are too large to treat with
traditional electronic structure methods. Although recent wavefunction
methods for strong correlation such as density-matrix renormalization
group are often limited to systems with a well-defined, one-dimensional
ordering of the electronic orbitals (i.e. linear systems) [101], the variational
2-RDM method is applicable to a broader range of molecules including
systems with arbitrary orbital orderings and geometries. The 2-RDM-based
methods have been applied to study: (i) chemical reactions and materials
[38, 40, 41, 60–66, 73, 74], (ii) quantum phase transitions [104, 105], (iii)
motions of electrons and nuclei [39, 106, 107], (iv) molecular conduc-
tivity [102, 103], and (v) high-temperature superconductivity [108].
While significant progress has been made, there remain many important
opportunities for further advancements in theory and applications. A sam-
pling of future extensions of recent work might include: (i) improvements
in the computational efficiency of the first-order semidefinite-programming
algorithms [25, 26, 37], (ii) enhancements of existing linear-scaling para-
metric 2-RDM methods [73] for the better treatment of medium-to-large
molecular systems, and (iii) generalizations of existing non-equilibrium
steady-state ACSE methods [102, 103] to treat electron correlation in
molecular conductors explicitly. It is hoped that the present chapter may
serve as a starting point for these and other new developments in 2-RDM
mechanics that will further enhance our ability to study and understand
quantum molecular systems and processes.
Acknowledgments
Bibliography
[11] F. Colmenero and C. Valdemoro, Int. J. Quantum Chem. 51, 369 (1994).
[12] H. Nakatsuji and K. Yasuda, Phys. Rev. Lett. 76, 1039 (1996).
[13] K. Yasuda and H. Nakatsuji, Phys. Rev. A 56, 2648 (1997).
[14] D.A. Mazziotti, Phys. Rev. A 57, 4219 (1998).
[15] D.A. Mazziotti, Chem. Phys. Lett. 289, 419 (1998); Int. J. Quantum Chem. 70, 557
(1998).
[16] D.A. Mazziotti, Phys. Rev. A 60, 3618 (1999); 4396 (1999).
[17] D.A. Mazziotti and R. M. Erdahl, Phys. Rev. A 63, 042113 (2001).
[18] M. Nakata, H. Nakatsuji, M. Ehara, M. Fukuda, K. Nakata, and K. Fujisawa, J.
Chem. Phys. 114, 8282 (2001).
[19] D.A. Mazziotti, Phys. Rev. A 65, 062511 (2002).
[20] M. Nakata, M. Ehara, and H. Nakatsuji, J. Chem. Phys. 116, 5432 (2002).
[21] D.A. Mazziotti, Phys. Rev. A 66, 062503 (2002).
[22] G. Gidofalvi and D.A. Mazziotti, Phys. Rev. A 69, 042511 (2004).
[23] T. Juhasz and D.A. Mazziotti, J. Chem. Phys. 121, 1201 (2004).
[24] Z. Zhao, B.J. Braams, H. Fukuda, M.L. Overton, and J.K. Percus, J. Chem. Phys.
120, 2095 (2004).
[25] D.A. Mazziotti, Phys. Rev. Lett. 93, 213001 (2004).
[26] D.A. Mazziotti, J. Chem. Phys. 121, 10957 (2004).
[27] G. Gidofalvi and D.A. Mazziotti, J. Chem. Phys. 122, 094107 (2005).
[28] G. Gidofalvi and D.A. Mazziotti, J. Chem. Phys. 122, 194104 (2005).
[29] G. Gidofalvi and D.A. Mazziotti, Phys. Rev. A 72, 052505 (2005).
[30] D.A. Mazziotti, Phys. Rev. A 72, 032510 (2005).
[31] J.R. Hammond and D.A. Mazziotti, Phys. Rev. A 73, 012509 (2006).
[32] D.A. Mazziotti, Acc. Chem. Res. 39, 207 (2006).
[33] G. Gidofalvi and D.A. Mazziotti, J. Phys. Chem. A 110, 5481 (2006); J. Chem.
Phys. 125, 144102 (2006).
[34] D.A. Mazziotti, Phys. Rev. A 74, 032501 (2006).
[35] G. Gidofalvi and D.A. Mazziotti, J. Chem. Phys. 126, 024105 (2007).
[36] M. Nakata, B.J. Braams, K. Fujisawa, M. Fukuda, J.K. Percus, M. Yamashita, and
Z. Zhao, J. Chem. Phys. 128, 164113 (2008).
[37] D.A. Mazziotti, Math. Model. Num. Anal. 41, 249 (2007).
[38] G. Gidofalvi and D.A. Mazziotti, J. Chem. Phys. 129, 134108 (2008).
[39] E. Kamarchik and D.A. Mazziotti, Phys. Rev. A 79, 012502 (2009).
[40] L. Greenman and D.A. Mazziotti, J. Chem. Phys. 130, 184101 (2009).
[41] A.V. Sinitskiy, L. Greenman, and D.A. Mazziotti, J. Chem. Phys. 133, 014104
(2010).
[42] M.V. Mihailovic and M. Rosina, Nucl. Phys. A237, 221 (1975).
[43] C. Garrod, V. Mihailovic, and M. Rosina, J. Math. Phys. 10, 1855 (1975).
[44] R.M. Erdahl, Reports Math. Phys. 15, 147 (1979).
[45] W. Kutzelnigg and D. Mukherjee, J. Chem. Phys. 110, 2800 (1999).
[46] K. Yasuda, Phys. Rev. A 59, 4133 (1999).
[47] D.A. Mazziotti, Chem. Phys. Lett. 326, 212 (2000).
[48] D.A. Mazziotti in Many-electron Densities and Density Matrices, edited by
J. Cioslowski (Kluwer, Boston, 2000) pp. 139–163.
[49] W. Kutzelnigg and D. Mukherjee, J. Chem. Phys. 114, 2047 (2001).
[50] D.A. Mazziotti, J. Chem. Phys. 116, 1239 (2002); Phys. Rev. E 65, 026704 (2002).
[51] M.D. Benayoun, A.Y. Lu, and D.A. Mazziotti, Chem. Phys. Lett. 387, 485 (2004).
[52] D.R. Alcoba, F.J. Casquero, L.M. Tel, E. Perez-Romero, and C. Valdemoro, Int. J.
Quantum Chem. 102, 620 (2005).
[53] D.A. Mazziotti, Phys. Rev. Lett. 97, 143002 (2006).
[54] D.A. Mazziotti, Phys. Rev. A 75, 022505 (2007).
[55] D.A. Mazziotti, J. Chem. Phys. 126, 184101 (2007).
[56] C. Valdemoro, L.M. Tel, D.R. Alcoba, and E. Perez-Romero, Theor. Chem. Acc.
118, 503–509 (2007).
[57] D.A. Mazziotti, J. Phys. Chem. A 111, 12635 (2007).
[58] D.A. Mazziotti, Phys. Rev. A 76, 052502 (2007).
[59] C. Valdemoro, L.M. Tel, E. Perez-Romero, and D.R. Alcoba, Int. J. Quantum Chem.
108, 1090 (2008).
[60] D.A. Mazziotti, J. Phys. Chem. A 112, 13684 (2008).
[61] J.J. Foley IV, A.E. Rothman, and D.A. Mazziotti, J. Chem. Phys. 130, 184112
(2009).
[62] C. Valdemoro, D.R. Alcoba, L.M. Tel, and E. Perez-Romero, Int. J. Quantum Chem.
109, 2622 (2009).
[63] G. Gidofalvi and D.A. Mazziotti, Phys. Rev. A 80, 022507 (2009).
[64] A.E. Rothman, J.J. Foley IV, and D.A. Mazziotti, Phys. Rev. A 80, 052508 (2009).
[65] L. Greenman and D.A. Mazziotti, J. Phys. Chem. A 114, 583 (2010).
[66] J.W. Snyder Jr., A.E. Rothman, J.J. Foley IV, and D.A. Mazziotti, J. Chem. Phys.
132, 154109 (2010).
[67] D.A. Mazziotti, Phys. Rev. Lett. 101, 253002 (2008).
[68] C. Kollmar, J. Chem. Phys. 125, 084108 (2006).
[69] A.E. DePrince III and D.A. Mazziotti, Phys. Rev. A. 76, 042501 (2007).
[70] A.E. DePrince III, E. Kamarchik, and D.A. Mazziotti, J. Chem. Phys. 128, 234103
(2008).
[71] A.E. DePrince III and D.A. Mazziotti, J. Phys. Chem. B 112, 16158 (2008).
[72] A.E. DePrince III and D.A. Mazziotti, J. Chem. Phys. 130, 164109 (2009).
[73] A.E. DePrince III and D.A. Mazziotti, J. Chem. Phys. 132, 034110 (2010).
[74] A.E. DePrince III and D.A. Mazziotti, J. Chem. Phys. 133, 034112 (2010).
[75] D.A. Mazziotti, Phys. Rev. A 81, 062515 (2010).
[76] R.M. Erdahl and B. Jin in Many-electron Densities and Density Matrices, edited
by J. Cioslowski (Kluwer, Boston, 2000) pp. 57–84.
[77] D.A. Mazziotti, Phys. Rev. E 65, 026704 (2002).
[78] J.R. Hammond and D.A. Mazziotti, Phys. Rev. A 71, 062503 (2005).
[79] L. Vandenberghe and S. Boyd, SIAM Review 38, 49 (1996).
[80] S. Wright, Primal-Dual Interior-Point Methods (SIAM, Philadelphia, 1997).
[81] Y. Nesterov and A.S. Nemirovskii, Interior Point Polynomial Method in Convex
Programming: Theory and Applications (SIAM, Philadelphia, 1993).
[82] M.V. Mihailovic and M. Rosina, Nucl. Phys. A 130, 386 (1969).
[83] J.E. Harriman, Phys. Rev. A 17, 1257 (1978).
[84] S. Burer and R.D.C. Monteiro, Math. Programm. Ser. B 95, 329 (2003).
[85] S. Burer and C. Choi, Optim. Methods Soft. 21, 493 (2006).
[86] R. Fletcher, Practical Methods of Optimization (John Wiley & Sons, New York,
1987).
[87] L. Cohen and C. Frishberg, Phys. Rev. A 13, 927 (1976).
Chapter 5
Sabre Kais
July 19, 2011 11:29 9in x 6in b1189-ch05 Solving the Schrodinger Equation
5.1. Introduction
where r₁ and r₂ are the electron-nucleus radii and Z is the nuclear charge. The ground-state energy in the large-D limit is then given by E_∞(Z, ℰ) = min_{r₁,r₂} H_∞.
In the absence of an external electric field, E = 0; Herschbach and
coworkers [49] have found that these equations have a symmetric solution
with the two electrons equidistant from the nucleus, with r1 = r2 = r.
This symmetric solution represents a minimum in the region where all the eigenvalues of the Hessian matrix are positive, Z ≥ Z_c = 2. For values of Z smaller than Z_c, the solutions become unsymmetrical with one electron much closer to the nucleus than the other (r₁ ≠ r₂). In order to describe this symmetry breaking, it is convenient to introduce new variables (r, Γ) of the form r₁ = r and r₂ = (1 − Γ)r, where Γ = (r₁ − r₂)/r₁ ≠ 0 measures the deviation from the symmetric solution.
By studying the eigenvalues of the Hessian matrix, one finds that the
solution is a minimum of the effective potential for the range 1 ≤ Z ≤ Z_c.
We now turn to the question of how to describe the system near the critical
point. To answer this question, a complete mapping between this problem
and critical phenomena in statistical mechanics is readily feasible with the
following analogies:
nuclear charge (Z) ↔ temperature (T)
external electric field (ℰ) ↔ ordering field (h)
ground-state energy (E_∞(Z, ℰ)) ↔ free energy (f(T, h))
asymmetry parameter (Γ) ↔ order parameter (m)
stability limit point (Z_c, ℰ = 0) ↔ critical point (T_c, h = 0).
Using the above scheme, we can define the critical exponents (β, α, δ, and γ) for the electronic structure of the two-electron atom in the following way:

\Gamma(Z, \mathcal{E} = 0) \sim (-\Delta Z)^{\beta}, \quad \Delta Z \to 0^-
E_\infty(Z, \mathcal{E} = 0) \sim |\Delta Z|^{\alpha}, \quad \Delta Z \to 0 \qquad (5.2)
\Gamma(Z_c, \mathcal{E}) \sim |\mathcal{E}|^{1/\delta} \, \mathrm{sgn}(\mathcal{E}), \quad \mathcal{E} \to 0
\partial \Gamma / \partial \mathcal{E} \, |_{\mathcal{E}=0} \sim |\Delta Z|^{-\gamma}, \quad \Delta Z \to 0

where ΔZ ≡ Z − Z_c. These critical exponents describe the nature of the singularities in the above quantities at the critical charge Z_c. The values obtained for these critical exponents are known as classical or mean-field critical exponents: β = 1/2; α = 2; δ = 3; γ = 1.
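As a hedged numerical check of the mean-field values: the order-parameter exponents can be recovered by minimizing a generic Landau-type energy f(m) = (ΔZ/2)m² + m⁴/4 − hm, with m playing the role of the asymmetry parameter and h of the ordering field. This is a standard mean-field toy model, not the actual large-D effective Hamiltonian.

```python
import math

def order_parameter(dZ, h):
    """Minimize f(m) = (dZ/2) m^2 + (1/4) m^4 - h m on a fine grid."""
    m_grid = [i * 1e-4 for i in range(-30000, 30001)]
    f = lambda m: 0.5 * dZ * m * m + 0.25 * m ** 4 - h * m
    return min(m_grid, key=f)

# beta = 1/2:  m ~ (-dZ)^(1/2) below the critical point (h = 0)
m1, m2 = order_parameter(-0.04, 0.0), order_parameter(-0.16, 0.0)
beta = math.log(abs(m2) / abs(m1)) / math.log(0.16 / 0.04)
assert abs(beta - 0.5) < 0.05

# delta = 3:  m ~ h^(1/3) at the critical point (dZ = 0)
m1, m2 = order_parameter(0.0, 1e-3), order_parameter(0.0, 8e-3)
delta = math.log(8e-3 / 1e-3) / math.log(m2 / m1)
assert abs(delta - 3.0) < 0.2
```

The same toy model gives γ = 1 from the zero-field susceptibility, consistent with the set quoted above.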
This analogy between symmetry breaking and phase transitions was
also generalized to include the large-dimensional model of the N-electron
atoms [40], simple diatomic molecules [41,43], both linear and planar one-
electron systems [42] as well as three-body Coulomb systems of the general
form ABA [44].
The above simple large-D picture helps to establish a connection to
phase transitions. However, the next question to be addressed is: how to
carry out such an analogy to D = 3? This question will be examined in the
subsequent sections using the finite size scaling approach.
Iced tea, boiling water and other aspects of two-phase coexistence are familiar features of daily life. Yet phase transitions do not exist at all in finite systems! They appear in the thermodynamic limit: the volume V → ∞ and the particle number N → ∞ in such a way that their ratio, the density ρ = N/V, approaches a finite quantity. In statistical mechanics, the exis-
tence of phase transitions is associated with singularities of the free energy
per particle in some region of the thermodynamic space. These singular-
ities occur only in the thermodynamic limit [11, 12]. This fact could be
understood by examining the partition function Z,

Z = \sum_{\sigma} e^{-E(\sigma)/k_B T}, \qquad (5.3)

where the sum runs over the microstates σ, E(σ) is the energy of microstate σ, k_B is the Boltzmann constant and
T is the temperature. For a finite system, the partition function is a finite
sum of analytical terms, and therefore it is itself an analytical function.
The Boltzmann factor is an analytical function of T except at T = 0. For
T > 0, it is necessary to take an infinite number of terms in order to obtain
a singularity in the thermodynamic limit [11, 12].
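A minimal illustration of this point: for a finite system the partition function is a finite sum of smooth Boltzmann factors, so observables such as the heat capacity stay analytic for T > 0. The two-level system below (energies 0 and 1, k_B = 1) is a toy example showing a rounded Schottky peak rather than any divergence.

```python
import math

def heat_capacity(T, gap=1.0):
    """C(T) = d<E>/dT for a two-level system; smooth for all T > 0."""
    x = gap / T
    return (x ** 2) * math.exp(x) / (math.exp(x) + 1.0) ** 2

temps = [0.1 * k for k in range(1, 101)]
C = [heat_capacity(T) for T in temps]

# The heat capacity is bounded (no singularity) and peaks at finite T,
# the classic rounded Schottky anomaly of a finite system.
assert max(C) < 0.5
assert 0.3 < temps[C.index(max(C))] < 0.6
```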
In practice, real systems have a large but finite volume and particle
numbers (N ∼ 10²³), and phase transitions are observed. Even more dramatic is the case of numerical simulations, where sometimes systems with only a small number (hundreds, or even tens) of particles are studied, and
critical phenomena are still present. Finite size scaling theory, which
was pioneered by Fisher [13], addresses the question of why finite systems
apparently describe phase transitions and what is the relation of this phe-
nomena with the true phase transitions in corresponding infinite systems.
Moreover, finite size scaling is not only a formal way to understand the
asymptotic behavior of a system when the size tends to infinity. In fact, the
theory gives us numerical methods capable of obtaining accurate results for
and has a fixed point at T (L,L ). It is expected that the succession of points
The finite size scaling method is a systematic way to extract the critical
behavior of an infinite system from analysis on finite systems [30]. It
is efficient and accurate for the calculation of critical parameters of the
Schrodinger equation. Let us assume we have the following Hamiltonian:

H = H_0 + V_\lambda, \qquad (5.11)

where H_0 is λ-independent and V_\lambda is the λ-dependent term. We are interested in studying how the different properties of the system change when the value of λ varies. A critical point, λ_c, will be defined as a point for which a bound state becomes absorbed or degenerate with a continuum.
Without loss of generality, we will assume that the Hamiltonian, Eq. (5.11), has a bound state, E_λ, for λ > λ_c, which becomes equal to zero at λ = λ_c. As in statistical mechanics, we can define some critical
with i indicating the nodal index of the element; i = 1 for the left and i = 2
for the right border of the element. The functions i (r), i (r), and i (r) are
\Delta_{\mathcal{O}}(\lambda; N, N') = \frac{\ln\left(\langle \mathcal{O} \rangle_\lambda^N / \langle \mathcal{O} \rangle_\lambda^{N'}\right)}{\ln(N'/N)}. \qquad (5.18)
At the critical point, the expectation value is related to N as a power law, \langle \mathcal{O} \rangle \sim N^{-\mu_{\mathcal{O}}/\nu}, and Eq. (5.18) becomes independent of N. For the energy operator \mathcal{O} = H and using the critical exponent \alpha for the corresponding exponent \mu_{\mathcal{O}} we have:

\Delta_H(\lambda_c; N, N') = \frac{\alpha}{\nu}. \qquad (5.19)
In order to obtain the critical exponent \alpha from numerical calculations, it is convenient to define a new function [30]:

\Gamma_\alpha(\lambda, N, N') = \frac{\Delta_H(\lambda; N, N')}{\Delta_H(\lambda; N, N') - \Delta_{\partial V/\partial \lambda}(\lambda; N, N')}, \qquad (5.20)
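Equations (5.18)–(5.20) translate directly into a short routine. In the sketch below, synthetic power-law data stand in for the truncated expectation values; the exponents are arbitrary choices for illustration, not results for any real Hamiltonian.

```python
import math

def delta(O_N, O_Np, N, Np):
    """Eq. (5.18): Delta_O(lambda; N, N') from values at two basis sizes."""
    return math.log(O_N / O_Np) / math.log(Np / N)

def gamma_alpha(dH, dV):
    """Eq. (5.20): combine the energy and dV/d-lambda channels."""
    return dH / (dH - dV)

# Synthetic data at the "critical point": <H> ~ N^(-2) and the derivative
# term ~ N^(-1); the exponents 2 and 1 are chosen only for illustration.
N, Np = 20, 40
dH = delta(N ** -2.0, Np ** -2.0, N, Np)
dV = delta(N ** -1.0, Np ** -1.0, N, Np)

assert abs(dH - 2.0) < 1e-9   # recovers the assumed exponent ratio
assert abs(dV - 1.0) < 1e-9
assert abs(gamma_alpha(dH, dV) - 2.0) < 1e-9
```

In an actual calculation the expectation values would come from the variational diagonalizations at successive truncation orders N.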
\frac{1}{2} \int_0^\infty r \, \varphi_i'(r) \, \varphi_j'(r) \, dr. \qquad (5.29)
For the potential energy:
\int_0^\infty \frac{e^{-r}}{1 - e^{-r}} \, r^2 \, \varphi_i(r) \, \varphi_j(r) \, dr. \qquad (5.30)
We calculated the local matrix elements of the potential energy by using
a four point Gaussian quadrature to evaluate the integral. We set the cutoff
for the integration to rc . To include the integration to infinity, we added an
infinite element approximation. To do so, we approximate the solution of
the wave function in the region [r_c, ∞) to be an exponentially decaying function with the form ψ(r) = ψ(r_c) e^{-r}.
The local matrices are then assembled to form the complete solution and
by invoking the variational principle on the nodal values i we obtain a gen-
eralized eigenvalue problem representing the initial Schrodinger equation:
H_{ij} |\Psi_j\rangle = E \, U_{ij} |\Psi_j\rangle. \qquad (5.31)
The solution of Eq. (5.31) is achieved using standard numerical methods
(see Chapter 10 for details [63]).
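A minimal sketch of this last step, assuming dense symmetric matrices: a generalized eigenvalue problem of the form in Eq. (5.31) can be reduced to an ordinary symmetric eigenproblem via a Cholesky factorization of the overlap matrix. The matrices below are random stand-ins, not actual Hulthen-potential FEM matrices.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
A = rng.standard_normal((n, n))
H = 0.5 * (A + A.T)              # symmetric "Hamiltonian" stand-in
B = rng.standard_normal((n, n))
U = B @ B.T + n * np.eye(n)      # symmetric positive definite "overlap"

# Reduce H c = E U c to a standard problem with U = L L^T:
L = np.linalg.cholesky(U)
Linv = np.linalg.inv(L)
E, y = np.linalg.eigh(Linv @ H @ Linv.T)  # ordinary symmetric eigenproblem
C = Linv.T @ y                            # back-transform eigenvectors

# Each eigenpair satisfies the generalized equation H c = E U c.
for k in range(n):
    assert np.allclose(H @ C[:, k], E[k] * (U @ C[:, k]), atol=1e-8)
```

Library routines such as LAPACK's symmetric-definite solvers perform this same reduction internally.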
Fig. 5.1. Plot of Γ_α, obtained by the FSS method, as a function of λ, using numbers of basis functions N from 8 to 48 in steps of two. For the FEM, the number of elements used ranged from 100 to 380 in steps of 20.
Fig. 5.2. Extrapolated values for the critical exponents and the critical parameter λ_c. The solid red dots at 1/N = 0 are the extrapolated critical values. The left side is the basis set method, while the right is the FEM with Hermite interpolation polynomials.
F_{\mathcal{O}}(x) \sim x^{\mu_{\mathcal{O}}/\nu}. \qquad (5.32)

\langle \mathcal{O} \rangle^{(N)}(\lambda_c) \sim N^{-\mu_{\mathcal{O}}/\nu}. \qquad (5.33)

Because the same argument of regularity holds for the derivatives of the truncated expectation values, we have:

\left. \frac{\partial^m \langle \mathcal{O} \rangle^{(N)}}{\partial \lambda^m} \right|_{\lambda = \lambda_c} \sim N^{-(\mu_{\mathcal{O}} - m)/\nu}, \qquad (5.34)
Fig. 5.3. Data collapse study of the basis set method and the FEM. The left is the basis set method and the right is the FEM.
solution even for the very simplistic linear interpolation used for the FEM
calculations. However, the ability of the FEM to describe the wavefunction
locally in terms of elements affords a very natural way to extend its use for
FSS purposes.
where r_{ij} are the interelectron distances and λ = 1/Z is the inverse of the nuclear charge. For this Hamiltonian, a critical point means the value of the parameter, λ_c, for which a bound-state energy becomes absorbed or degenerate with the continuum.
To carry out the FSS procedure, one has to choose a convenient basis
set to obtain the two lowest eigenvalues and eigenvectors of the finite
Hamiltonian matrix. For M = 2, one can choose the following basis set
functions:
\Phi_{ijk,\ell}(x_1, x_2) = \frac{1}{\sqrt{2}} \left( r_1^i r_2^j e^{-(\alpha r_1 + \beta r_2)} + r_1^j r_2^i e^{-(\beta r_1 + \alpha r_2)} \right) r_{12}^k \, F_\ell(\theta_{12}, \Omega), \qquad (5.37)

where α and β are fixed parameters; we have found numerically that α = 2 and β = 0.15 is a good choice for the ground state [21]; r_{12} is the interelectronic distance; and F_\ell(\theta_{12}, \Omega) is a suitable function of the angle between the positions of the two electrons, θ_{12}, and the Euler angles Ω = (Θ, Φ, Ψ). This function F is different for each orbital-block of the Hamiltonian. For the ground state F_0(\theta_{12}, \Omega) = 1, and F_1(\theta_{12}, \Omega) = \sin(\theta_{12}) \cos(\Theta) for the 2p² ³P state. These basis sets are complete for each ℓ-subspace. The complete wave function is then a linear combination of these terms multiplied by variational coefficients determined by matrix diagonalization [21]. In the basis set truncated at order N, all terms are included such that i + j + k ≤ N.
Using FSS calculations with N = 6, 7, 8, . . . , 13 gives the extrapolated value λ_c = 1.0976 ± 0.0004, which is in excellent agreement with the best estimate of λ_c = 1.09766079 from large-order perturbation calculations [65]. Since the critical charge Z_c = 1/λ_c ≈ 0.91 and Z = 1 > Z_c, the hydrogen anion H⁻ is stable.
For three-electron atoms, M = 3, one can repeat the FSS procedure
with the following Hylleraas-type basis set [22]:
\Phi_{ijklmn}(x_1, x_2, x_3) = C \mathcal{A} \left[ r_1^i r_2^j r_3^k \, r_{12}^l r_{23}^m r_{31}^n \, e^{-\alpha(r_1 + r_2)} e^{-\beta r_3} \chi_1 \right], \qquad (5.38)

where the variational parameters, α = 0.9 and β = 0.1, were chosen to obtain accurate results near the critical charge Z ≈ 2, and χ₁ is the spin function with spin angular moment 1/2:

\chi_1 = \alpha(1)\beta(2)\alpha(3) - \beta(1)\alpha(2)\alpha(3), \qquad (5.39)

C is a normalization constant and \mathcal{A} is the usual three-particle antisymmetrizer operator [22]. The FSS calculations give λ_c = 0.48 ± 0.03. Since Z_c = 1/λ_c ≈ 2.08, the anions He⁻ and H²⁻ are unstable.
One can extend this analysis and calculate the critical charges for
M-electron atoms in order to perform a systematic check of the stability
of atomic dianions. In order to have a stable doubly negatively charged
atomic ion one should require the surcharge, S_e(M) ≡ M − Z_c(M), to satisfy S_e(M) ≥ 2. We have found that the surcharge never exceeds two. The maximal surcharge, S_e(86) = 1.48, is found for the closed-shell configuration of element Rn and can be related to the peak in the electron affinity of the element with N = 85. The FSS numerical results for M-electron atoms show that, at most, only one electron can be added to a free atom in the gas phase. The second extra electron is not bound by a singly charged negative ion because of the combined action of the repulsive potential surrounding the isolated negative ion and the Pauli exclusion principle. However, doubly charged atomic negative ions might exist in a strong magnetic field of the order of a few atomic units, where 1 a.u. = 2.3505 × 10⁹ G, and in superintense laser fields.
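The arithmetic behind these stability statements can be summarized in a few lines, using only the critical charges quoted in this section; this is bookkeeping on the quoted numbers, not a new calculation.

```python
# Surcharge S_e(M) = M - Z_c(M); a dianion is stable only if S_e(M) >= 2.
def surcharge(M, Zc):
    return M - Zc

# Critical charges quoted in this section:
#   M = 2: Z_c = 1/lambda_c with lambda_c = 1.0976
#   M = 3: Z_c = 1/lambda_c with lambda_c = 0.48
#   M = 86: defined here from the quoted maximal surcharge S_e(86) = 1.48
cases = {2: 1 / 1.0976, 3: 1 / 0.48, 86: 86 - 1.48}

assert abs(surcharge(2, cases[2]) - 1.09) < 0.01   # H^- stable, no dianion
assert abs(surcharge(3, cases[3]) - 0.92) < 0.01   # He^- and H^2- unstable
# Even the maximal surcharge falls short of the dianion threshold of two:
assert all(surcharge(M, Zc) < 2 for M, Zc in cases.items())
```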
5.7. Conclusions
In this chapter, we have shown how the finite size scaling ansatz can be
combined with the variational method to extract information about critical
behavior of quantum Hamiltonians. This approach is based on taking the
number of elements in a complete basis set or the finite element method as
the size of the system. As in statistical mechanics, finite size scaling can then
be applied directly to the Schrodinger equation. This approach is general and
gives very accurate results for the critical parameters, for which the bound
state energy becomes absorbed or degenerate with a continuum. To illus-
trate the applications in quantum calculations, we have presented detailed
calculations for the simple case of the Hulthen potential and few-electron atoms.
For atomic systems we have shown that finite size scaling can be used to
explain and predict the stability of atomic anions: at most, only one electron
can be added to a free atom in the gas phase.
Recently, there has been an ongoing experimental and theoretical search for doubly charged molecular anions (dianions) [1]. In contrast to atoms,
large molecular systems can hold many extra electrons because the extra
electrons can stay well separated. However, such systems are challenging
from both theoretical and experimental points of view. The present finite
size scaling approach might be useful in predicting the general stability of
molecular dianions.
The approach can be generalized to complex systems by calculating
the matrix elements needed for FSS analysis by ab initio, density func-
tional methods, orbital free density functional (OF-DFT) [66,67] approach,
density matrices [68, 69] and other electronic structure methods [70]. The
implementation should be straightforward. We need to obtain the matrix
elements to calculate Γ_α as a function of the number of elements used in solving for the system. In the finite element method using mean-field equations (like the Hartree-Fock or Kohn-Sham methods), the solution region will be discretized into elements composed of tetrahedrons.
Acknowledgments
I would like to thank Pablo Serra, Juan Pablo Neirotti, Marcelo Carignano,
Winton Moy and Qi Wei for their valuable contributions to this ongoing
research of developing and applying finite size scaling to quantum problems
and Ross Hoehn for critical reading of the chapter. I would also like to thank
the Army Research Office (ARO) for financial support of this project.
Bibliography
[1] M.K. Scheller, R.N. Compton, and L.S. Cederbaum, Science 270, 1160 (1995).
[2] V.G. Bezchastov, P. Schmelcher, and L.S. Cederbaum, Phys. Chem. Chem. Phys. 5,
4981 (2003).
[3] M. Gavrila, in Atoms in Super Intense Laser Fields, edited by M. Gavrila (Academic,
New York, 1992), p. 435.
[4] Q. Wei, S. Kais and N. Moiseyev J. Chem. Phys. 124, 201108 (2006).
[5] E. van Duijn and H.G. Muller, Phys. Rev. A 56, 2182 (1997).
[6] E. van Duijn and H.G. Muller, Phys. Rev. A 56, 2192 (1997).
[7] Q. Wei, S. Kais, and D. Herschbach, J. Chem. Phys. 127, 094301 (2007).
[8] F.H. Stillinger and D.K. Stillinger, Phys. Rev. A 10, 1109 (1974).
[9] J. Katriel and E. Domany, Int. J. Quantum Chem. 8, 559 (1974).
[10] D.R. Herschbach, J. Avery, and O. Goscinski, Dimensional Scaling in Chemical
Physics (Kluwer, Dordrecht, 1993).
[11] C.N. Yang and T.D. Lee, Phys. Rev. 87, 404 (1952).
[12] T.D. Lee and C.N. Yang, Phys. Rev. 87, 410 (1952).
[13] M.E. Fisher, in Critical Phenomena, Proceedings of the 51st Enrico Fermi Summer
School, Varenna, Italy, edited by M.S. Green (Academic, New York, 1971); M.E.
Fisher and M.N. Barber, Phys. Rev. Lett. 28, 1516 (1972).
[14] B. Widom, in Critical Phenomena in Fundamental Problems in Statistical
Mechanics, edited by E.G.D. Cohen (Elsevier, New York, 1975).
[15] M.N. Barber, in Phase Transitions and Critical Phenomena Vol. 8, edited by C. Domb
and J.L. Lebowits (Academic, London, 1983).
[16] V. Privman, Finite Size Scaling and Numerical Simulations of Statistical Systems
(World Scientific, Singapore, 1990).
[17] J.L. Cardy, Finite-Size Scaling (Elsevier Science Publishers, New York, 1988).
[18] M.P. Nightingale, Physica 83A, 561 (1976).
[19] P.J. Reynolds, H.E. Stanley, and W. Klein, J. Phys. A 11, L199 (1978).
[20] P.J. Reynolds, H.E. Stanley, and W. Klein, Phys. Rev. B 21, 1223 (1980).
[21] J.P. Neirotti, P. Serra, and S. Kais, Phys. Rev. Lett. 79, 3142 (1997).
[22] P. Serra, J.P. Neirotti, and S. Kais, Phys. Rev. Lett. 80, 5293 (1998).
[23] S. Kais, J.P. Neirotti, and P. Serra, Int. J. Mass Spectrometry 182/183, 23 (1999).
[24] P. Serra, J.P. Neirotti, and S. Kais, Phys. Rev. A 57, R1481 (1998).
[25] P. Serra, J.P. Neirotti, and S. Kais, J. Chem. Phys. 102, 9518 (1998).
[26] J.P. Neirotti, P. Serra, and S. Kais, J. Chem. Phys. 108, 2765 (1998).
[27] Q. Shi and S. Kais, Mol. Phys. 98, 1485 (2000).
[28] S. Kais and Q. Shi, Phys. Rev. A 62, 060502 (2000).
[29] S. Kais and P. Serra, Int. Rev. Phys. Chem. 19, 97 (2000).
[30] S. Kais and P. Serra, Adv. Chem. Phys. 125, 1 (2003).
[31] P. Serra and S. Kais, Chem. Phys. Lett. 372, 205 (2003).
[32] A. Ferron, P. Serra, and S. Kais, J. Chem. Phys. 120, 8412 (2004).
[33] W. Moy, P. Serra, and S. Kais, Mol. Phys. 106, 203 (2008).
[34] W. Moy, M. Carignano, and S. Kais, J. Phys. Chem. A 112, 5448 (2008).
[35] For reviews see A. Chatterjee, Phys. Reports 186, 249 (1990).
[36] E. Witten, Phys. Today 33 (7), 38 (1980).
[37] D.R. Herschbach, J. Chem. Phys. 84, 838 (1986).
[38] C.A. Tsipis, V.S. Popov, D.R. Herschbach, and J.S. Avery, New Methods in Quantum
Theory (Kluwer, Dordrecht, 1996).
[39] P. Serra and S. Kais, Phys. Rev. Lett. 77, 466 (1996).
[40] P. Serra and S. Kais, Phys. Rev. A 55, 238 (1997).
[41] P. Serra and S. Kais, Chem. Phys. Lett. 260, 302 (1996).
Chapter 6
The generalized Sturmian method makes use of basis sets that are solutions
to an approximate wave equation with a weighted potential. The weighting
factors are chosen in such a way as to make all the members of the basis
set isoenergetic. In this chapter we will show that when the approximate
potential is taken to be that due to the attraction of the bare nucleus, the
generalized Sturmian method is especially well suited for the calculation
of large numbers of excited states of few-electron atoms and ions. Using
the method we shall derive simple closed-form expressions that approx-
imate the excited state energies of ions. The approximation improves with
increasing nuclear charge. The method also allows automatic generation
of near-optimal symmetry-adapted basis sets, and it avoids the Hartree-Fock
SCF approximation. Programs implementing the method may be
freely downloaded from our website, sturmian.kvante.org [1].
In Eq. (6.1) and throughout the chapter, atomic units are used. The energies
and wavefunctions are given respectively by

    E_n = -Z^2/(2n^2),   n = 1, 2, 3, . . .                          (6.2)

and

    ψ_{n,l,m}(x) = R_{n,l}(r) Y_{l,m}(θ,φ).                          (6.3)

Here Y_{l,m}(θ,φ) is a spherical harmonic, and

    R_{1,0}(r) = 2(Z/1)^{3/2} e^{-Zr/1}
    R_{2,0}(r) = 2(Z/2)^{3/2} (1 - Zr/2) e^{-Zr/2}
    R_{2,1}(r) = (2/√3)(Z/2)^{3/2} (Zr/2) e^{-Zr/2}
    . . .                                                            (6.4)
It was natural to try to use hydrogen-like orbitals as building blocks to
represent the wave functions of more complicated atoms. However, to the
great disappointment of the early workers in atomic theory, it was soon
realized that unless the continuum was included, the hydrogen-like orbitals
did not form a complete set; and the continuum proved to be prohibitively
difficult to use in practical calculations. This dilemma led Holøien, Shull
and Löwdin [2] to introduce basis functions that have exactly the same
form as hydrogen-like orbitals except that Z/n is replaced by a constant,
k, which is the same for all the members of the basis set. This type of basis
set came to be called Coulomb Sturmians, the name being given to them by
A. Rotenberg [3] to emphasize their connection with the SturmLiouville
theory of orthonormal sets of functions. Coulomb Sturmian basis sets are
complete without the inclusion of the continuum: any square-integrable
solution to a one-electron Schrodinger equation can be represented as a
linear superposition of them. If the potential in the one-electron Schrodinger
equation has some similarity to a Coulomb potential (for example, if it is
a screened Coulomb potential), the convergence of such a series is rapid.
The members of a Coulomb Sturmian basis set are solutions to a one-
electron equation of the form

    [-(1/2)∇^2 - nk/r + k^2/2] χ_{n,l,m}(x) = 0.                     (6.5)

If we compare Eq. (6.5) with (6.1), we can see that with the substitutions
Z/n → k and E_n → -k^2/2, Eq. (6.1) is converted into Eq. (6.5).
    n   l   R_{n,l}(r)
    1   0   2k^{3/2} e^{-kr}
    2   0   2k^{3/2} (1 - kr) e^{-kr}
    2   1   (2k^{3/2}/√3) kr e^{-kr}
    3   0   2k^{3/2} (1 - 2kr + (2/3)(kr)^2) e^{-kr}
    3   1   2k^{3/2} (2√2/3) kr (1 - kr/2) e^{-kr}
    3   2   2k^{3/2} (√2/(3√5)) (kr)^2 e^{-kr}
quantum theory. Equation (6.1) is the usual type of eigenvalue problem with
which everyone in the physical sciences is familiar. By contrast, Eq. (6.5)
is an entirely different problem, sometimes called a conjugate eigenvalue
problem: each member of a set of solutions corresponds to the same energy
-k^2/2, k being a constant that is the same for all the members of the set. The
quantity that plays the role of the usual eigenvalue is now a weighting factor
attached to the potential, which is chosen in such a way as to make all the
members of the basis set isoenergetic. Because of their useful properties,
Coulomb Sturmian basis sets are widely used in atomic theory, and there
exists a large literature discussing their properties and applications [2-13].
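The potential-weighted orthonormality that makes Coulomb Sturmians useful can be checked numerically in a few lines. The sketch below (Python with SciPy; the exponent value is arbitrary) integrates the first two s-type radial functions from the table above against the weight 1/r and recovers the standard Coulomb Sturmian relation ∫ R_{n'0}(r) (1/r) R_{n0}(r) r² dr = (k/n) δ_{n'n}:

```python
import numpy as np
from scipy.integrate import quad

k = 1.4  # the common exponent shared by every member of the basis set (arbitrary here)

# First two s-type Coulomb Sturmian radial functions from the table above
def R10(r): return 2.0 * k**1.5 * np.exp(-k * r)
def R20(r): return 2.0 * k**1.5 * (1.0 - k * r) * np.exp(-k * r)

def weighted_overlap(Ra, Rb):
    # potential-weighted integral: int R_a(r) (1/r) R_b(r) r^2 dr = int R_a R_b r dr
    return quad(lambda r: Ra(r) * Rb(r) * r, 0.0, np.inf)[0]

print(weighted_overlap(R10, R10))  # k/1
print(weighted_overlap(R20, R20))  # k/2
print(weighted_overlap(R10, R20))  # 0 (orthogonal under the 1/r weight)
```

Note that the two functions are orthogonal under the 1/r weight even though they are not orthogonal under the ordinary r² dr measure.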
relations:

    ∫ dx Φ*_{ν'}(x) V_0(x) Φ_ν(x) = δ_{ν'ν} (2E/p_ν) = -δ_{ν'ν} p_ν,   p_ν ≡ √(-2E),    (6.10)
where we let ν denote a particular state and where we have introduced
the abbreviated notation x ≡ (x1, x2, . . . , xN). To obtain the generalized
Sturmian secular equations, we begin by substituting the superposition
We now split the potential V(x) into two parts, V(x) = V_0(x) + V'(x), and
introduce the definitions
    T⁰_{ν'ν} ≡ -(1/p_ν) ∫ dx Φ*_{ν'}(x) V_0(x) Φ_ν(x),
                                                                     (6.13)
    T'_{ν'ν} ≡ -(1/p_ν) ∫ dx Φ*_{ν'}(x) V'(x) Φ_ν(x).

From the potential-weighted orthonormality relations (6.10) it follows
that T⁰ is diagonal:

    T⁰_{ν'ν} = δ_{ν'ν}.                                              (6.14)
Next, we notice that since all of the isoenergetic configurations in the basis
set obey (6.9), Eq. (6.12) can be rewritten as

    Σ_ν [T⁰_{ν'ν} + T'_{ν'ν} - p_κ δ_{ν'ν}] B_{νκ} = 0.              (6.19)
Generalized Sturmian basis sets can come in many species and vari-
eties: every choice of the approximate potential V_0(x) (which should be
chosen to resemble V(x) as closely as possible) leads to a particular set of
shapes for the N-particle basis functions Φ_ν(x). Solving Eq. (6.9), which
is done once and for all for a particular V_0, specifies the functions up to
an undetermined scaling parameter p_κ. Solving the generalized Sturmian
eigenproblem (6.19) then yields as eigenvalues the scaling parameters p_κ
and as eigenfunctions

    ψ_κ(x) = Σ_ν B_{νκ} Φ_ν(x),                                      (6.20)

where each p_κ scales the entire basis to give all the N-particle basis
functions the same energy E_κ. If the generalized Sturmian basis {Φ_ν} is
complete, then Eq. (6.19) has exactly the same eigenfunctions as the
Schrodinger equation, and the energies are

    E_κ = -p_κ²/2.                                                   (6.21)
In practice, one of course always uses a finite basis, so solutions are approx-
imate. However, we shall see that the automatic scaling allows us to obtain
good accuracy with few basis functions, as well as to obtain many excited
states at once.
It is remarkable to see how completely Eq. (6.19) differs from the con-
ventional secular equations used in quantum theory:
(1) The kinetic energy term has vanished.
(2) The matrix representing the approximate potential V0 (x) is diagonal.
(3) The roots are not energies but values of the scaling parameter, p_κ,
which is proportional to the square root of the binding energy
(Eq. (6.10)).
(4) Before the secular equation is solved, only the shapes of the basis
functions are known, but not the values of the scaling parameters p .
(5) Solution of the secular equations yields a near-optimum basis set
appropriate for each state, as well as the states themselves and their
corresponding energies.
(6) The Hamiltonian formalism is nowhere to be seen!
    V(x) = -Σ_{j=1}^{N} Z/r_j + Σ_{j>i} 1/r_{ij},                    (6.23)

and

    -(1/2)Δ ≡ -(1/2) Σ_{j=1}^{N} ∇_j²,                               (6.24)

    V_0(x) = -Σ_{j=1}^{N} Z/r_j   and   V'(x) = Σ_{j>i} 1/r_{ij}.    (6.26)
Now we claim that with this choice of V0 (x), the weighting factors are
determined automatically, and Eq. (6.25) is satisfied by Slater determinants
of the form:
               | χ_1(1)  χ_2(1)  · · ·  χ_N(1) |
    Φ_ν(x) =   | χ_1(2)  χ_2(2)  · · ·  χ_N(2) |  (1/√N!)
               |   ...     ...    ...     ...  |
               | χ_1(N)  χ_2(N)  · · ·  χ_N(N) |

           ≡ |χ_1 χ_2 · · · χ_N|,                                    (6.27)
but with the weighted charges Q_ν (Ref. [16], Chapter 3) chosen according
to the rules in the following box, where n_1, n_2, . . . , n_N are the prin-
cipal quantum numbers of the hydrogen-like spin-orbitals in the configu-
ration Φ_ν. The Goscinskian configurations will be exact solutions to (6.25)
provided that:

    Q_ν = β_ν Z = p_κ / R_ν,

    p_κ ≡ √(-2E_κ),                                                  (6.29)

    R_ν ≡ √( 1/n_1² + 1/n_2² + · · · + 1/n_N² ).
At this point the reader may be muttering "I don't believe it". Well, if
you don't believe it, think about this: the energy E_κ will then be related to
the weighted nuclear charges Q_ν by

    E_κ = -p_κ²/2 = -(1/2) Q_ν² R_ν² = -( Q_ν²/(2n_1²) + Q_ν²/(2n_2²) + · · · + Q_ν²/(2n_N²) ).   (6.30)
Now compare Eq. (6.32) with (6.25): they are the same! Thus Eq. (6.25)
will indeed be satisfied by the configurations shown in Eq. (6.27), pro-
vided that the effective nuclear charges Q are chosen according to the rule
given in Eq. (6.29). We shall call such a set of isoenergetic solutions to
(6.25), with V_0(x) chosen to be the nuclear attraction potential, a set
of "Goscinskian configurations", to honor Professor Osvaldo Goscinski's
important early contributions to the generalized Sturmian method [14].
We note that the only thing that requires any effort to calculate in
Eq. (6.35) is the interelectron repulsion matrix T'; the rest is trivial.
We have just seen the remarkable ways in which the generalized Sturmian
secular equations differ from the usual secular equations that result from
diagonalizing the matrix representation of the Hamiltonian of a system: We
should especially notice that the eigenvalues are not energies, but values
of a parameter p_κ, which is related to the energies by E_κ = -p_κ²/2. In the
case of Goscinskians, the configurations become pure functions Φ_ν(p_κ x)
of p_κ x, i.e. p_κ acts as a scaling parameter of the space. Thus, in the solution
of the secular equations, an automatic scaling of the basis functions occurs:
For tightly-bound states, the atomic orbitals correspond to large values of
the effective charge, Q_ν = p_κ/R_ν, and are contracted in space, whereas for
loosely-bound states the orbitals are spatially diffuse. It turns out, in fact,
that the Slater exponents that are automatically obtained by solution of
the generalized Sturmian secular equations are very nearly optimal. Thus,
when the generalized Sturmian method is applied to atoms and atomic
Table 6.2. ¹S excited state energies (in Hartrees) for the two-electron isoelectronic
series. The basis set used consisted of 40 generalized Sturmians of the Goscinski
type, and the whole table was computed in a few milliseconds. Experimental values
are taken from the NIST tables [19] (http://physics.nist.gov/asd), and the exact
nonrelativistic results of Nakatsuji and coworkers [20] are also given for comparison.
Table 6.3. ³S excited state energies calculated with 36 Goscinskians. The cal-
culation of similar tables for ¹P, ³P, ¹D, ³D, doubly excited autoionizing states,
etc., is equally easy, rapid, and of comparable accuracy. Tables are given in
Chapters 3 and 4 in [16], but may easily be reproduced using our programs, as
shown in Tutorial 1 on [1].
Fig. 6.1. Energies for the lowest ³S state of the helium-like isoelectronic series,
divided by Z² to make the details easier to see for large Z (E/Z² plotted against
Z from 10 to 50). The values are calculated in the large-Z approximation, which
here limits the basis to a single configuration. The lower (solid) line is corrected
for relativistic effects as discussed in the text; the dots indicate experimental values
from the NIST tables. It is easy to verify visually that, for Z > 10, the relativistic
correction is much larger than calculational errors due to the large-Z approximation.
little effort that the calculation can literally be carried out on the back of an
envelope! We call this approximation the Large-Z Approximation.
If interelectron repulsion is entirely neglected, i.e. if we disregard
the second term in Eq. (6.19), the calculated energies E_κ of Φ_ν become
those of a set of N completely independent electrons moving in the field of
the bare nucleus:

    E_κ = -p_κ²/2 = -(1/2) Z² R_ν² = -( Z²/(2n_1²) + Z²/(2n_2²) + · · · + Z²/(2n_N²) ).    (6.36)
Fig. 6.2. The ground state of the carbon-like isoelectronic series (E/Z² plotted
against Z from 10 to 50). As Z grows, the approximation approaches the exact
solution to the nonrelativistic Schrodinger equation. Due to the increased role of
interelectron repulsion in the carbon-like series, this takes longer than for the
helium-like series. However, at around Z = 18, the inaccuracy of the large-Z
approximation becomes smaller than the relativistic correction.
The roots are shifted by an amount equal to the constant by which the
identity matrix is multiplied:

    p_κ = Z R_ν + λ_κ = Z R_ν - |λ_κ|,                               (6.38)

and the energies become

    E_κ = -(1/2) p_κ² = -(1/2)(Z R_ν - |λ_κ|)².                      (6.39)
    1.54037   ²D        1.98389   ³S
    1.55726   ²P        1.98524   ¹D
                        1.99742   ¹P
                        2.04342   ³P
    He-like             2.05560   ¹D
    0.441942  ¹S        2.07900   ¹S
Since the roots λ_κ are always negative, we may use the form -|λ_κ| in
place of λ_κ to make explicit the fact that interelectron repulsion reduces
the binding energies, as of course it must. The roots λ_κ are pure numbers
that can be calculated once and for all and stored. Values of these roots for
N = 2, 3, . . . , 10 are shown in Tables 6.4 and 6.5, together with their cor-
responding spectroscopic terms. From the roots, a great deal of information
about atomic states can be found with almost no effort: given the values
of the principal quantum numbers n_1, n_2, . . . , n_N, and given the value of
|λ_κ|, which can be looked up in a table, the calculation of the energies for
the entire isoelectronic series is completely effortless!
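The back-of-the-envelope character of this procedure is easy to demonstrate. Using the He-like ¹S ground-configuration root |λ_κ| = 0.441942 quoted above, the expression E_κ = -(1/2)(Z R_ν - |λ_κ|)² with R_ν = √2 gives the whole isoelectronic series at once (a sketch only; the comparison value -2.9037 Hartree is the well-known nonrelativistic helium ground-state energy):

```python
import math

lam = 0.441942                          # |lambda| root for the He-like 1S ground configuration
R_nu = math.sqrt(1.0/1**2 + 1.0/1**2)   # R_nu = sqrt(2) for n1 = n2 = 1

def E_largeZ(Z):
    # large-Z approximation: E = -(1/2)(Z R_nu - |lambda|)^2
    p = Z * R_nu - lam
    return -0.5 * p * p

for Z in (2, 3, 4, 10):
    print(Z, E_largeZ(Z))
# Z = 2 gives about -2.848 Hartree, underestimating the binding
# energy (about -2.9037 Hartree) by roughly 2%, as stated in Section 6.4.
```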
The eigenfunctions corresponding to the spectroscopic terms in
Tables 6.4 and 6.5 are symmetry-adapted Russell-Saunders states and can
be used as basis functions for more exact calculations. The classification is
done automatically by the method discussed in [16], Sections 3.4 and 3.5.
Tutorial 2 on our website sturmian.kvante.org [1] shows in detail
how to do this.
    2.44111   ²P        3.05065   ¹S
    2.49314   ⁴P        3.11850   ³P
    2.52109   ²D        3.14982   ¹P
    2.53864   ²S        3.24065   ¹S
    2.54189   ²P
    2.61775   ²P
    ⟨Φ_ν|H_0|Φ_ν⟩_nonrel = -(1/2) Z² R_ν².
In the relativistic case, the exact solution to the Dirac equation for
hydrogen-like atoms can be found in [21], or in [16], Eqs. (7.35) through
(7.40). The ratio of the relativistic energy E_rel and the nonrelativistic energy
E_nonrel for a multiconfigurational state

    |ψ_κ⟩ = Σ_ν B_{νκ} |Φ_ν⟩                                         (6.41)

is

    f(Z) = E_rel/E_nonrel
         = Σ_ν |B_{νκ}|² ⟨Φ_ν|H_0|Φ_ν⟩_rel / Σ_ν |B_{νκ}|² ⟨Φ_ν|H_0|Φ_ν⟩_nonrel
         = Σ_ν |B_{νκ}|² ⟨Φ_ν|H_0|Φ_ν⟩_rel / ( -(1/2) Z² Σ_ν |B_{νκ}|² R_ν² ).    (6.42)
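The relativistic ingredient entering a correction of this kind can be illustrated in the one case where it is exact and closed-form: the ground level of a one-electron atom, for which the Dirac equation gives E = c²(√(1-(Zα)²) - 1) in atomic units. The sketch below is the one-electron analogue of f(Z), not the chapter's many-electron expression; it simply shows how quickly the ratio to -Z²/2 grows with Z:

```python
import math

alpha = 7.2973525693e-3   # fine-structure constant
c = 1.0 / alpha           # speed of light in atomic units

def E_dirac_1s(Z):
    # exact Dirac ground-level energy of a one-electron atom, rest energy subtracted
    return c * c * (math.sqrt(1.0 - (Z * alpha)**2) - 1.0)

def E_nonrel_1s(Z):
    return -0.5 * Z * Z

for Z in (1, 20, 50):
    print(Z, E_dirac_1s(Z) / E_nonrel_1s(Z))   # ratio rises above 1 as Z grows
```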
Fig. 6.3. Ground state relative errors (E_calc - E_exp)/E_exp compared to experiment
for the helium-like isoelectronic series, for Z from 5 to 30 (curves: Goscinskian and
large-Z results, each in relativistic and nonrelativistic form). The large-Z approxi-
mation energies -(1/2)(Z - 0.441942)² are compared to results using a fuller
Goscinskian basis. The two dotted lines are the nonrelativistic values, while the
solid lines are corrected for relativistic effects using Eq. (6.42). For very large values
of Z, errors due to quantum electrodynamic effects cause a systematic overestimation
of binding energy.
Fig. 6.4. For isoelectronic series, Eq. (6.45) indicates that within the large-Z
approximation the quantity E_κ + Z²R_ν²/2 is exactly linear in Z, as is illustrated
above (curves for N = 2, 10, and 18, plotted against Z up to 40).
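The linearity follows directly from Eq. (6.39): expanding -(1/2)(ZR_ν - |λ_κ|)² and adding Z²R_ν²/2 leaves ZR_ν|λ_κ| - λ_κ²/2, which is linear in Z. A short numerical check for the He-like ground configuration (first differences of the shifted energy are constant):

```python
import math

lam, R = 0.441942, math.sqrt(2.0)  # He-like ground configuration: |lambda| and R_nu

def shifted(Z):
    E = -0.5 * (Z * R - lam)**2    # large-Z energy, Eq. (6.39)
    return E + 0.5 * (Z * R)**2    # equals Z*R*lam - lam^2/2: linear in Z

vals = [shifted(Z) for Z in range(2, 7)]
diffs = [b - a for a, b in zip(vals, vals[1:])]
print(diffs)  # constant first differences confirm linearity in Z
```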
How can we correct this defect? One way to extend the range of the
method is to use a V0 in Eq. (6.25) that in some form includes interelectron
repulsion effects. This will make it less straightforward to obtain the gen-
eralized Sturmian configurations Φ_ν: depending on the complexity of the
chosen V_0, a self-consistent field iteration will in general be required. However,
the useful properties of the generalized Sturmian basis are retained, and the
extra initial work would lead to improved convergence.
Another possibility is to extend the method by using a basis set consisting
of isoenergetic configurations

    Φ_ν(x) = |φ_1 φ_2 · · · φ_N|,                                    (6.46)

constructed from orbitals satisfying

    [-(1/2)∇_j² + k²/2 + β_ζ v(r_j)] φ_ζ(x_j) = 0,                   (6.47)

where β_ζ is a weighting factor chosen to make the orbitals isoenergetic and
v(r_j) is the nuclear attraction potential, corrected by a repulsive
potential due to the core electrons:

    v(r_j) = -Z/r_j + v_c(r_j).                                      (6.48)
This introduces interelectron repulsion effects even earlier in the calcula-
tions. The potential v_c(r_j) can be found by performing a fast preliminary
calculation using Goscinskian configurations. From this preliminary cal-
culation, a spherically-averaged core density ρ(r_j) can be obtained, and
from this v_c(r_j) may be calculated by means of the relationship

    v_c(r_j) = ∫_0^∞ dr'_j r'_j² ρ(r'_j) (1/r_>),   r_> ≡ Max[r_j, r'_j].    (6.49)
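Equation (6.49) is ordinary spherical electrostatics and can be checked against a closed-form potential. The sketch below assumes a hypothetical two-electron 1s core with an illustrative exponent ζ, takes ρ normalized as a three-dimensional density (hence the explicit 4π factor, a convention assumption), and compares the numerical integral with the analytic result N_c[1/r - e^{-2ζr}(1/r + ζ)]:

```python
import math
from scipy.integrate import quad

Nc, zeta = 2.0, 1.6875   # hypothetical core: two 1s electrons, illustrative exponent

def rho(r):
    # spherically averaged 1s core density; 4*pi*int r^2 rho(r) dr = Nc
    return Nc * zeta**3 / math.pi * math.exp(-2.0 * zeta * r)

def vc(r):
    # Eq. (6.49): vc(r) = int dr' r'^2 rho(r') / r_>,  r_> = Max[r, r']
    inner = quad(lambda rp: rp**2 * rho(rp) / r, 0.0, r)[0]   # r' < r  ->  r_> = r
    outer = quad(lambda rp: rp * rho(rp), r, math.inf)[0]     # r' > r  ->  r_> = r'
    return 4.0 * math.pi * (inner + outer)

def vc_exact(r):
    # closed-form potential of an Nc-electron 1s shell
    return Nc * (1.0 / r - math.exp(-2.0 * zeta * r) * (1.0 / r + zeta))

print(vc(1.0), vc_exact(1.0))
```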
The orbitals φ_ζ(x_j) can be built up from Coulomb Sturmians χ_μ, so that (6.47)
becomes:

    Σ_μ [-(1/2)∇_j² + k²/2 + β_ζ v(r_j)] χ_μ(x_j) C_{μζ} = 0.        (6.50)

Multiplying from the left by a conjugate Coulomb Sturmian, we obtain:

    0 = Σ_μ ∫ d³x_j χ*_{μ'}(x_j) [-(1/2)∇_j² + k²/2 + β_ζ v(r_j)] χ_μ(x_j) C_{μζ}
      = Σ_μ [ k² δ_{μ'μ} + β_ζ ∫ d³x_j χ*_{μ'}(x_j) v(r_j) χ_μ(x_j) ] C_{μζ}
      = Σ_μ [ k² δ_{μ'μ} - β_ζ k t_{μ'μ} ] C_{μζ},                   (6.51)
or

    Σ_μ [ t_{μ'μ} - (k/β_ζ) δ_{μ'μ} ] C_{μζ} = 0,                    (6.52)

where

    t_{μ'μ} ≡ -(1/k) ∫ d³x_j χ*_{μ'}(x_j) v(r_j) χ_μ(x_j).           (6.53)
After solving Eq. (6.52) to obtain the coefficients C_{μζ}, we can next use
the isoenergetic configurations Φ_ν(x) = |φ_1 φ_2 · · · φ_N| as basis functions
for solving the Schrodinger equation for an atom or atomic ion. This can
be written in the form

    [ Σ_{j=1}^{N} ( -(1/2)∇_j² + k²/2 ) + V(x) ] ψ(x) = 0,           (6.54)

with

    V(x) = -Σ_{j=1}^{N} Z/r_j + Σ_{i>j} 1/r_{ij},                    (6.55)

and with

    E = -Σ_{j=1}^{N} k²/2 = -Nk²/2.                                  (6.56)
Thus we write

    Σ_ν [ T_{ν'ν} - k_κ S_{ν'ν} ] B_{νκ} = 0.                        (6.61)

This gives us a spectrum of k-values from which the energies of the various
states, E_κ = -N k_κ²/2, can be obtained. It seems quite likely that this
procedure would allow the generalized Sturmian method for atoms and
atomic ions to be extended to larger values of N. Some steps in this direction
have already been taken by us and by Professor Gustavo Gasaneo and his
students at Universidad Nacional del Sur in Argentina.
What developments are necessary in order to apply the generalized
Sturmian method to complex chemical problems? Once we have found
a generalized Sturmian basis that converges well, most of the standard
techniques in quantum chemistry can be employed in the same way that
they are currently used with bases obtained from initial Hartree-Fock cal-
culations. Two obvious steps are to use the frozen-core approximation to
factor out correlation of core electrons, and use standard perturbation theory
based techniques to reduce the computational efforts necessary for con-
figuration interaction. Using the generalized Sturmian method with, for
example, coupled cluster methods requires some work, but may be well
worth the effort due to improved convergence properties compared to using
Hartree-Fock based configurations.
    Σ_μ [ W_{μ'μ} - (k/β_ζ) S_{μ'μ} ] C_{μζ} = 0.                    (6.71)

    V(x) = -Σ_{j=1}^{N} Σ_a Z_a/|x_j - X_a| + Σ_{i>j} 1/r_{ij}.      (6.75)

    Σ_ν [ T_{ν'ν} - k_κ S_{ν'ν} ] B_{νκ} = 0,                        (6.76)
Fig. 6.5. The electronic energy E_e and the total energy E_tot of the HeH⁺ ion as a
function of the internuclear separation R = s/k (energies in Hartrees, R in Bohrs;
the united-atom limit corresponds to Li⁺ and the separated-atom limit to He + H⁺).
The calculation was performed with a single configuration using a one-electron basis
set consisting of three Coulomb Sturmians on each center. For R → 0, the electronic
energy approaches the energy calculated for the Li⁺ ion using the generalized
Sturmian method with a single configuration [16]. In the separated-atom region,
the total energy approaches that of He when calculated in the same way. Our
calculation exhibits a shallow minimum at R = 1.35 Bohrs, which can be compared
to the equilibrium bond length of 1.3782 Bohrs resulting from a HF/STO-3G
calculation quoted by Szabo and Ostlund [25], and with the value of 1.46 Bohrs
obtained in a benchmark calculation by Wolniewicz [26]. Since our pilot calculation
uses only one configuration, it makes sense that we obtain a result comparable to
the Hartree-Fock calculation.
of the Coulomb Sturmian basis set. Neither R nor k is known at this point,
but only their product s. For the diatomic case, all of the integrals involved
in Eqs. (6.71) and (6.61) are pure functions of s. Having chosen s, we can
thus solve the one-electron secular equations and obtain the coefficients
C_{μζ} and the spectrum of ratios k/β_ζ. We are then able to solve Eq. (6.76),
which gives us the eigenvectors B_{νκ} as well as a spectrum of k-values, and
thus energies -Nk²/2. From a k-value, we also get the unscaled distance
R = s/k. We repeat the procedure for a range of s-values and interpolate to
find the solutions as functions of R. Figure 6.5 shows our pilot calculation
on the HeH+ two-electron molecular ion using the method described above.
This is an extremely simple calculation, using only one configuration, but
we are actively working to explore the method further. We chose HeH+ for
the pilot calculation rather than H2 because, as is well known, the correct
dissociation curve for H2 needs at least two configurations.
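The scan-in-s bookkeeping described above can be sketched generically. Here `k_of_s` is a toy stand-in for the k-root actually obtained from the secular problem of Eq. (6.76); only the mapping R = s/k, E = -Nk²/2, and the interpolation in R follow the procedure in the text:

```python
import numpy as np

N = 2  # number of electrons

def k_of_s(s):
    # Toy stand-in for the k-root of Eq. (6.76) at scaled separation s;
    # in a real calculation this comes from solving the secular equations.
    return 1.0 + 0.5 * np.exp(-s)

s_grid = np.linspace(0.5, 8.0, 200)   # scan over the pure scale parameter s
k_vals = k_of_s(s_grid)
R_vals = s_grid / k_vals              # unscaled internuclear distance R = s/k
E_vals = -N * k_vals**2 / 2.0         # total energy E = -N k^2 / 2

# Interpolate to obtain E as a function of R on a uniform grid
R_plot = np.linspace(R_vals.min(), R_vals.max(), 50)
E_plot = np.interp(R_plot, R_vals, E_vals)
```

For this toy k(s) the mapping s → R is monotonic, so the interpolation in R is well defined; in a real calculation each root of the k-spectrum is scanned and interpolated in the same way.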
6.4. Discussion
helium atom, where Nakatsuji's results [20] are available, our results agree
well with his, as can be seen in Tables 6.2 and 6.3. Had Nakatsuji and
coworkers made calculations on the whole isoelectronic series, agreement
would be progressively better for the heavier ions in the series.
We find that in order to obtain good agreement with experiment, it is
necessary to include relativistic effects. For the few-electron systems treated
here, the crude relativistic correction of Eq. (6.42) gives very good results.
For the two-electron isoelectronic series, the ground state was obtained
with relative error compared to experiment of 3.5 × 10⁻³ for Z = 2 (the
worst case) to roughly 10⁻⁶ for Z ≥ 8, and excited states were obtained
with relative errors between 10⁻⁴ and 10⁻⁶. The complete calculation of
all the states (found in [18], Chapter 4) required only 77 ms of computation.
It should be noted that for very large values of Z, quantum electrodynamic
effects become important, and neglecting them will cause an overestimation
of the binding energies. If more precision is required, we can treat the
system by means of the DiracCoulomb equation. Calculations using a
fully relativistic analogue to the Goscinskian configurations can be found
in Chapter 7 of our book [16].
As the number of electrons grows, there is a decrease in accuracy for
similar computation time. Already for five-electron systems, the cal-
culations in Chapter 4 of [18] yield ground states for the Z = N case with
less accuracy than the Hartree-Fock limit. This suggests to us that in order
to solve systems with many electrons accurately, while retaining efficiency,
we need Sturmian basis functions that incorporate interelectron repulsion
in V0 (x), as is discussed in Section 6.3.
The generalized Sturmian method using Goscinskians leads to an
extremely simple and convenient approximation, the large-Z approxi-
mation, which is described in Section 6.2. The approximation leads to a
remarkably simple closed-form expression, E_κ = -(1/2)(Z R_ν - |λ_κ|)², for the
energies of states in terms of the appropriate roots of the energy-independent
interelectron repulsion matrix. As the name suggests, the large-Z approxi-
mation is not very accurate when Z = N, especially in the case of ground
states. It underestimates the binding energy of the ground state of neutral
helium by 2% and of neutral argon by 5% (Fig. 4.5 in [18]), but it improves
rapidly with increasing Z - N (Fig. 4.3 in [18]). For excited states of few-
electron atoms, the large-Z approximation gives surprisingly good results
even for modest values of Z - N. Given the interelectron repulsion roots
λ_κ, which are dimensionless quantities that depend only on the number of
electrons and can be precalculated, we can calculate electronic states for
entire isoelectronic series with a pencil and a scrap of paper.
It is our hope that in the future the method may be extended to give
accurate calculations for atoms where interelectron repulsion effects are
comparable in importance to nuclear attraction. We are also in the process
of extending the method to molecules.
Bibliography
Chapter 7
Philip E. Hoggan
It is easy to prove that atomic and molecular orbitals must decay expo-
nentially at long range. They should also possess cusps when an electron
approaches another particle (a peak at which the ratio of the orbital's
gradient to the function itself gives the particle's charge).
Therefore, hydrogen-like or Slater-type orbitals are the natural basis
functions in quantum molecular calculations. Over the past four decades,
however, the difficulty of the required integrals led computational chemists
to seek alternatives.
Consequently, Slater-type orbitals were replaced by Gaussian expansions
in molecular calculations (although they decay more rapidly and have
no cusps). From the 1990s on, considerable effort on the Slater integral
problem by several groups has led to efficient algorithms which have
served as the tools of new computer programs for polyatomic molecules.
The key ideas for integration (one-center expansion, the Gauss transform,
the Fourier transform, the use of Sturmians, and elliptic-coordinate methods)
are presented here, together with their advantages and disadvantages, and the
latest developments within the field.
Recent advances using symbolic algebra and pre-calculated, stored
factors, together with the state of the art in parallel calculations, are
reported.
7.1. Introduction
1 De Rerum Natura: a work based on that of Democritus and the Greek atomists of the fourth century
BC, who referred to molecules by the term "atom" and used them to interpret odour, etc.
2 A triplet state's symmetric spin factor and its orbital factor, anti-symmetric in electron-pair exchange
(Pauli principle), imply vanishing density for small r12. This is stabilising and indicates that electrons
tend to avoid each other. Conversely, a singlet orbital is symmetric, with non-zero density even if the
electrons collide.
It has also been shown that electron correlation is better accounted for
by exponential-type orbitals. Two cases are considered below: the fact
that configuration interaction (CI) requires many fewer exponential than
Gaussian functions, and explicitly correlated (or geminal) exponential func-
tions (Section 7.9.2).
A further domain of successful application, which is now developing
fast, is the use of Slater basis trial wave-functions for the correlated ground
state obtained in quantum Monte Carlo simulations (Section 7.9.3).
Nowadays, there is great interest in weakly bound systems, often treated
with difficulty using DFT. It is possible to tailor functionals on the bench-
marks available from accurate Slater-type orbital basis calculations, to
ensure the analytical asymptote which is exponential decay. Nevertheless,
in view of the low energies involved and the fact that they are expressed
as a difference between two large total energies, the strategy so far favored
has been the use of quantum Monte Carlo methods in order to account for
the majority of essential correlation energy.
QMC, however, is hellishly slow, especially when three-body (e-e-N) terms
are required. These highly accurate benchmarks often show that the three-body
terms are absolutely essential.
This is not the case yet, and may never be, since although Slater-type
orbitals are now viable as an alternative, it sometimes remains faster and
more convenient to use Gaussians. Perhaps a new method will eventually
take over: orbital free/density based or another?
Using GTO expansions of STOs instead of analytic STOs was a prag-
matic solution and originally intended to facilitate numerical integration
in the calculation of the first molecules on early mainframe computers.
The GTO expansion, together with the popular distribution of computer
programs like Gaussian (g09), has contributed to the use of GTOs for
accurate calculations of large systems. The size limits in systems studied
have receded, e.g. HF calculations of clusters of hundreds of atoms, CI cal-
culations including hundreds of thousands of Slater determinants. In spite
of the rapid development of the computer technology and the availability
of supercomputers, computational times are unreasonably long, so that the
computational chemist is often restricted to test and model calculations.
This motivates the search for basis functions, where fewer would give a
3 It is assumed the reader is familiar with this SCF method (see Atkins' Molecular Quantum Mechanics).
Note, in passing, that it derives from the work of Douglas Hartree and his father William, and of
Bertha Swirles (drawing on the Fock operator).
Chemical intuition can be related directly to these units, in the sense that
the difference between bound and unbound atom pairs is obviously related
to the value of the integrals involved, as is the gap between the bonding and
anti-bonding orbitals when they are well localized on atom pairs. The key role
is played by the value of exchange integrals, which decrease exponentially
at long range.
The diatomic unit is also useful to implement two approximations, where
accuracy is chosen by the user. The first is the use of the Schwarz inequality,
which gives a product of two-center integrals as an upper limit for the
value of the corresponding three- and four-center terms.
resolution of the Coulomb operator, which may be applied whatever the
basis function used and which is due to P. Gill [70].
In the current state of the art, both of these leave some integrals for
more accurate treatment. The inequality is simply used for screening negligible
integrals, and the Coulomb resolution may be limited to micro-Hartree
accuracy for atomic orbitals with high angular momentum (l ≥ 2). This lim-
itation may, however, be overcome by techniques that reduce the angular
momentum of orbital products (see Section 7.8.7).
At present, it is necessary to evaluate a few of the ugly infinite sums
but there is reason to believe this will shortly be a thing of the past. This
text will relegate the analysis involved (which is a specialist subject) to
Appendix A (see Section 7.12).
4 The two-center two-electron integrals are classified according to the centers a, b. Writing them
according to the charge distributions [ρ(1)|ρ(2)], the Coulomb integrals are [aa|bb], the hybrids
[aa|ab], and the exchange integrals [ab|ab]. The exchange integrals are the most difficult because the
charge distribution of each electron is spread over two centers.
used for Coulomb integrals in the group (K. Ruedenberg, during the 50th
Sanibel Symposium, 2010).
Among the many authors who were working around the world on the
solution of the necessary integrals was M. Kotani in Japan [9], who wrote
the famous integral tables which bear his name and were widely used.
Coulson in Oxford proposed a method to evaluate the three- and four-center
integrals [10]; Löwdin in Uppsala [11] and a young American scientist
called Harris [12] were involved. Work in the early 1950s mostly focused
on integrals over STO.
The interest was to make the first theoretical calculations of some
molecules, starting with the diatomic systems H2 and N2. For three-center
molecules the problem of integration (orbital translation) was encountered.
Mulliken and Roothaan called this "the bottleneck of quantum chemistry"
[14], and Mulliken mentioned it in his 1966 Nobel Lecture on the molecular
orbital method.
Boys in Cambridge published his landmark paper [15] containing the
evaluation of three- and four-center integrals using Gaussian function
expansions of the STO. This bold step led to great simplifications, based
on the so-called product theorem: the product of two Gaussian functions
located on different centers is a new Gaussian function located on a new
center. Thus, four-center electron distributions could be reduced to single-
center distributions and analytical evaluation was greatly facilitated. Boys
regarded this as an existence theorem for a closed GTO product rule. It
was to change the course of molecular computations. Note that the product
theorem for Slater orbitals leads to complicated infinite sums, making evaluation awkward compared with the simple closed forms for Gaussians.
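The product theorem is easy to verify numerically. The following sketch (a one-dimensional case with illustrative exponents and centers, not values from the text) checks that the product of two Gaussians on centers A and B is a single Gaussian on the intermediate center P = (aA + bB)/(a + b):

```python
import numpy as np

# Gaussians exp(-a(x-A)^2) on center A and exp(-b(x-B)^2) on center B
a, A = 0.8, -0.5
b, B = 1.3, 1.2
x = np.linspace(-6.0, 6.0, 1001)
prod = np.exp(-a * (x - A)**2) * np.exp(-b * (x - B)**2)

# Product theorem: one Gaussian with exponent p = a + b on the new center P,
# times a constant prefactor K depending only on the center separation
p = a + b
P = (a * A + b * B) / p
K = np.exp(-(a * b / p) * (A - B)**2)
assert np.allclose(prod, K * np.exp(-p * (x - P)**2))
```

The same algebra carries over term-by-term to three-dimensional Cartesian Gaussians, which is what collapses four-center charge distributions to one center.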
In 1954 Boys, Shavitt et al. [16] expanded Slater orbitals into Gaussians
to perform quantum mechanical calculations. In 1963 Clementi presented
the so-called basis sets of Slater exponents/orbitals [17]. Later, Pople
would base his programs on Boys's pragmatism (see [23]).
5 i.e. the ratio of the radial derivative to the value of the function as r tends to zero gives the exponent:
the nuclear cusp condition.
6 This expression has been borrowed in English from Don Quixote by Cervantes.
July 20, 2011 9:7 9in x 6in b1189-ch07 Solving the Schrodinger Equation
the turn of the century, better bases and CI work had made presentations of
such work far more convincing.
7 Electron repulsion energy (e.g. two-center): the Coulomb terms decay as $1/R$ for large interatomic
distance R and the exchange terms as $e^{-kR}$. This implies exchange is a purely quantum (spin-related)
phenomenon. For large R the exchange vanishes and the classical Coulomb law remains.
Fig. 7.1. Comparison of the shape of a STO with a GTO 1s function. Radius (r)
in atomic units.
The hydrogen-like orbitals have nodes: the 2s orbital is of the form
$(1 - br)e^{-\zeta r}$, and higher quantum number orbitals are similar, but STOs are
node-less. A related problem appears for Gaussians.
In 1928 Slater [1] noted that the radial polynomial factors make calculations messy and proposed the use of single powers of r instead of linear
combinations of hydrogen-like terms.
A picture which helps to visualize the differences between Slater and
Gaussian orbitals is the representation of the 1s orbital function of both
types (with suitable exponents), see Fig. 7.1.
STOs represent the electron density well near the nucleus (cusp)
and at long range (correct asymptotic decay). STOs thus resemble the
physical atomic orbitals provided a suitable exponent has been obtained by
optimization.
Conversely, the GTOs have no cusp (zero radial derivative at the nucleus)
and decay too fast.
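The contrast can be made quantitative with the cusp condition of footnote 5. A small symbolic check, with generic exponents ζ and α chosen only for illustration:

```python
import sympy as sp

r, zeta, alpha = sp.symbols('r zeta alpha', positive=True)
sto = sp.exp(-zeta * r)       # 1s STO radial factor
gto = sp.exp(-alpha * r**2)   # 1s GTO radial factor

# Ratio of radial derivative to function value as r -> 0 (nuclear cusp condition):
# the STO yields the (negative) exponent; the GTO yields zero, i.e. no cusp.
assert sp.limit(sp.diff(sto, r) / sto, r, 0) == -zeta
assert sp.limit(sp.diff(gto, r) / gto, r, 0) == 0
```

The zero logarithmic derivative of the Gaussian at the origin is exactly the "zero radial derivative at the nucleus" referred to above.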
Reproducing a 1s STO with three GTOs (the so-called minimal GTO
basis) yields an orbital with the bell shape of a Gauss curve and no cusp,
see Fig. 7.2.
Many GTOs are necessary to reproduce a single STO, and even then the electron
cusp at the nucleus is missing. This is one of the reasons for the slow convergence of the wave function solutions to the exact (HF or CI) result.
In general, if the basis functions are not built up from eigenfunctions of the
Schrödinger equation, convergence is slower. With GTOs more Slater determinants are needed for a given accuracy.
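The slow convergence is easy to see in a least-squares sketch (exponents and starting guesses below are illustrative, not the published STO-3G values): three Gaussians reproduce the overall shape of $e^{-r}$ quite well, yet the slope at the origin stays near zero instead of the STO value −1.

```python
import numpy as np
from scipy.optimize import curve_fit

r = np.linspace(0.0, 4.0, 400)
sto = np.exp(-r)                                  # 1s STO with zeta = 1

def gto3(r, c1, c2, c3, a1, a2, a3):
    """Sum of three 1s Gaussians, a 'minimal GTO' expansion."""
    return c1*np.exp(-a1*r**2) + c2*np.exp(-a2*r**2) + c3*np.exp(-a3*r**2)

p0 = [0.4, 0.4, 0.2, 0.15, 0.9, 4.5]              # rough starting guesses
p, _ = curve_fit(gto3, r, sto, p0=p0, maxfev=20000)
fit = gto3(r, *p)

max_err = np.max(np.abs(fit - sto))               # global fit is decent...
slope0 = (fit[1] - fit[0]) / (r[1] - r[0])        # ...but d/dr at r = 0 is ~0,
assert max_err < 0.2 and abs(slope0) < 0.3        # not the STO cusp value -1
```

However many Gaussians are added, the fitted curve remains flat at the nucleus, which is the geometric content of the missing cusp.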
Another advantage of Slater orbitals is the size of the basis: one orbital
per electron is of reasonable quality, and multiple-zeta basis sets converge
Fig. 7.2. Construction of a 1s STO with three GTOs. Radius (r) in atomic units.
fast to the Hartree–Fock limit. This is the lowest energy solution for independent electrons. A basis approaching this limit is said to saturate, i.e.
adding higher angular momentum functions leads to little improvement.
A Slater-type orbital basis saturates much faster than a GTO basis. Therefore, the
number of integrals to be evaluated in an STO basis is dramatically smaller.
CI is also spectacularly more efficient (see Section 7.9.2). Finally, the Slater orbitals give a conceptually more intuitive description of the atomic
orbitals and of the molecular orbitals (MO) built from them.
The disadvantages of Slater orbitals have already been mentioned: the
three- and four-center two-electron integrals are the bottleneck. There is
no general analytical solution for them, which would be the most effective
and fastest route to their calculation. Instead, there are a number of approximate
methods of calculation, involving infinite series or truncated approximations to the Coulomb operator itself. They will be treated in Section 7.7.8.
Radial Slater functions do not represent the bonding region adequately,
and higher angular momentum functions should be added.
It is nevertheless possible to use linear combinations restoring radial
nodes. This approach is advocated particularly for ADF, where the
hydrogen-like basis is obtained by fixing the coefficients for combining
Slater functions.
Another disadvantage is that, since the times of Roothaan and
Ruedenberg, some of the two-center integrals have been solved only for a co-axial
where $P_l^m(\cos\theta)$ are the associated Legendre functions. The spherical harmonics are eigenfunctions of the angular momentum operator $L^2$ and its
z-projection $L_z$.
The complex spherical harmonics are used mainly for atoms and in developing theories, because it is easier to work out general formulae and derivations with them. The real spherical harmonics are linear combinations of
the complex ones. Orbitals are chosen to be real in atoms and molecules.
Note that they are written using polar coordinates (suitable for atoms).
Cartesian Slater-type orbitals are very seldom used, whereas
Cartesian Gaussians are an almost systematic choice.
When the principal quantum number n in Eq. (7.1) is non-integer we
have the NISTOs (non-integer Slater-type orbitals). The main difficulty when
working with these orbitals is that the derivations require a binomial with a
non-integer power, which leads to an infinite expansion. These
orbitals are widely investigated at present [32]. The additional flexibility
of non-integer quantum numbers results in a lowering of the energy
and a better density. It is also possible to transform from polar
to elliptical coordinates.
Elliptical Slater orbitals have been used extensively as basis functions
for two-center molecules [33–35]. These orbitals are known to lead to lower
energy results [36]. Using $\xi = (r_a + r_b)/R$ ($\xi \ge 1$) and $\eta = (r_a - r_b)/R$ ($-1 \le \eta \le 1$),

$$\chi_{nlm}(\mathbf{r}) = \xi^{n}\,\eta^{l}\,(\xi^2 - 1)^{m/2}(1 - \eta^2)^{m/2}\,e^{-\alpha\xi}\,e^{im\varphi}, \qquad (7.3)$$

where $\xi$, $\eta$, $\varphi$ are the elliptical coordinates.
therefore, hydrogen-like orbitals do not form a complete set (for finite n);
they need orbitals of the continuum to be complete. This is important for the
convergence of the solutions. Shull and Löwdin [39] realized that this was
due to the dependence of the exponent Z/n on n, which dilates the orbitals, and they proposed the following orbitals, in which these factors are substituted by adjustable
parameters, i.e. the usual orbital exponents:

$$\psi_{nlm}(\mathbf{r}) = N_{nl}\, L_{n-l-1}^{2l+2}(2\zeta r)\, r^{l} e^{-\zeta r}\, Y_{l}^{m}(\theta,\varphi). \qquad (7.7)$$
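A quick numerical sketch (with ζ = 1, normalization omitted, and helper names of my own choosing) confirms that the functions of Eq. (7.7) restore the n − l − 1 radial nodes that simple STOs lack:

```python
import numpy as np
from scipy.special import genlaguerre

def radial(n, l, zeta, r):
    """Radial factor of Eq. (7.7), up to normalization:
    L_{n-l-1}^{2l+2}(2*zeta*r) * r^l * exp(-zeta*r)."""
    return genlaguerre(n - l - 1, 2 * l + 2)(2 * zeta * r) * r**l * np.exp(-zeta * r)

r = np.linspace(1e-6, 20.0, 4000)
R = radial(3, 0, 1.0, r)                        # a 3s-like function
nodes = np.count_nonzero(np.diff(np.sign(R)))
assert nodes == 2                               # n - l - 1 = 2 radial nodes
```

The node structure, absent from a bare $r^{n-1}e^{-\zeta r}$ Slater function, is supplied entirely by the associated Laguerre polynomial factor.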
Due to the form of the Hamiltonian and of its expectation value we find
the following kinds of integral. First, the integrals which appear when using
Hartree–Fock and CI wave functions, i.e. in general ab initio methods. The
integrals are classified according to the number of electrons and atomic centers
involved. In order of increasing difficulty, there are:
8 $1/r_{12}$: two independent electrons, possibly two distinct orbitals in each of the bra and ket, which may
all be on different atoms, i.e. up to four atomic centers.
This formula is due to Smeyers [42]. In the brackets are the requisite coefficients. The various methods of single-center expansion differ in the technique used to calculate these coefficients.
This approach was first proposed by Barnett and Coulson [10] in 1956
using radial orbitals (s-orbitals) and was called the zeta-function method
because of expansions in terms of successive derivatives with respect to
exponents. The terms have alternating signs.
The method is similar to Löwdin's alpha-function method [11]. Harris
and Michels [43] extended the method to general angular orbitals in 1965.
This method has been used by many, and Appendix A (Section 7.12) details
it. The alternating signs of the terms give oscillating sums and poor convergence.
The idea is the translation of an orbital from one point to another. Trans-
lation of a spherical harmonic is a finite expansion; on the other hand, trans-
lation of the radial part leads to an infinite series. This situation can be best
explained following Guseinov [44]:
$$\chi_{n,l,m}(\zeta, \mathbf{r}_A) = \sum_{n'=1}^{\infty}\sum_{l'=0}^{n'-1}\sum_{m'=-l'}^{l'} V_{nlm,\,n'l'm'}(\zeta, \mathbf{R}_{AB})\; \chi_{n',l',m'}(\zeta, \mathbf{r}_B), \qquad (7.9)$$
where V are the coefficients of the expansion. The method is very stable, but
it requires the computation of many terms to obtain sufficiently many correct decimal
digits; this method therefore gives lengthy computation times.
Steinborn has developed this method [25]. The evaluation of integrals using
B-functions leads to some integrals including a Bessel function of the first
kind, which is oscillatory (i.e. like the sum it replaces in Section 7.7.1):

$$\int_0^{\infty} r^{n}\, e^{-\zeta r}\, J_{l+1/2}(rx)\,dr. \qquad (7.12)$$
To evaluate these accurately, extrapolation methods are used [47, 48], due
to Sidi [49, 50], i.e. the integral is replaced by a sine integral which
has the same behavior. This evaluation requires numerical integration (e.g.
Gauss–Legendre quadrature, summing the integrand at polynomial roots).
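As a sanity check on such oscillatory integrals, the n = 0 member of (7.12) has a known closed form (the Laplace transform of $J_\nu$, valid for $\nu > -1$); a direct-quadrature sketch with illustrative parameter values:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import jv

zeta, x, nu = 1.0, 2.5, 1.5   # exponent, oscillation frequency, order l + 1/2

# Direct quadrature of the oscillatory integrand of (7.12) with n = 0
val, _ = quad(lambda r: np.exp(-zeta * r) * jv(nu, x * r), 0.0, np.inf, limit=200)

# Closed form: Laplace transform of J_nu
s = np.hypot(zeta, x)
exact = (s - zeta)**nu / (x**nu * s)
assert abs(val - exact) < 1e-8
```

For n > 0 and large x the integrand oscillates more severely and adaptive quadrature becomes expensive, which is where the extrapolation methods cited above earn their keep.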
[Figure: two-center coordinate system, centers a and b separated by R, with local axes (x, y, z) for electron 1 at each center.]
The method is used by many authors. It was recently used for three-electron
integrals [52].
7.8.1. Introduction
The Coulomb resolution will now be presented. This is a readily controlled
approximation that separates the variables in $1/r_{12}$ and which, in recent work
by Gill and by Hoggan [70, 74], is shown to spell the end of exponential
orbital translations and the ensuing integral bottlenecks.
This section advocates the use of atomic orbitals which have direct
physical interpretation, i.e. hydrogen-like orbitals. They are exponential
type orbitals (ETOs).
Until 2008, such orbital products on different atoms were difficult to
manipulate for the evaluation of two-electron integrals. The difficulty was
mostly due to cumbersome orbital translations involving slowly convergent
infinite sums. These are completely eliminated using Coulomb resolutions.
They provide an excellent approximation that reduces these integrals to a
sum of one-electron overlap-like integral products that each involve orbitals
on at most two centers. Such two-center integrals are separable in prolate
spheroidal coordinates. They are thus readily evaluated. Only these integrals
need to be re-evaluated to change basis functions.
The above is still valid for three-center integrals. In four-center integrals,
the resolutions require translating one potential term per product. This is
outlined here.
Numerical results are reported for the H2 dimer and the CH3F molecule.
The choice between Gaussian and exponential basis sets for molecules
is usually made for reasons of convenience at present. In fact, it appears
to be constructive to regard them as being complementary, depending on
the specific physical property required from molecular electronic structure
calculations.
As regards exponential type orbitals (ETOs) such as Slater functions,
much analysis suggests it is difficult to evaluate two-electron integrals
because the general three- and four-center integrals evaluated by the usual
methods require orbital translations. Some workers avoid the problem using
large GTO expansions, e.g. SMILES [53, 54].
It would be helpful to devise a separation of variables for integration.
This would eliminate orbital translations and therefore present major advan-
tages, although some other translations remain involving a simple analytic
potential.
The present section describes a breakthrough in two-electron integral
calculations, as a result of Coulomb operator resolutions. This separates
the independent variables of the operator and gives rise to simple analytic
potentials. The two-center integrals are replaced by sums of overlap-like
one-electron integral products. One potential term in these products requires
translation in four-center terms, which is significantly simpler to carry
out than that of the orbitals. This implies a speed-up for all basis sets,
including Gaussians. The improvement is most spectacular for exponential
type orbitals. A change of basis set is also facilitated as only these one-
electron integrals need to be changed. The Gaussian and exponential type
orbital basis sets are, therefore, interchangeable in a given program. The
timings of exponential type orbital calculations are no longer significantly
longer than for a Gaussian basis, when a given accuracy is sought for
molecular electronic properties.
For STOs, nano-Hartree accuracy of Coulomb resolutions is accessible for AO angular momentum up to l = 2; beyond this the accuracy
falls to at worst milli-Hartree. They should thus be used systematically until the
last stages of SCF, when the high accuracy required implies that the precautions
described in Section 7.8.7 should be taken for high angular momentum, or
that the coupling relations be treated.
$$\left\langle f_i \left| \frac{1}{r_{12}} \right| f_j \right\rangle = \delta_{ij}. \qquad (7.21)$$
The completeness relation for the associated potentials can also be written
in the form of Eq. (7.20). The functional expression of the above gives:

$$\frac{1}{r_{12}} = \sum_i v_i(\mathbf{r}_1)\, v_i(\mathbf{r}_2). \qquad (7.22)$$
The potential functions $v_i$ are solutions of Poisson's equation. The
functions chosen may also be based on Coulomb Sturmians (see the work
by Avery, e.g. Chapter 6 of the present book and references therein).
Completeness of the functions $f_i$ allows us to expand a density in terms
of them (using Eqs. (7.19) and (7.21)):

$$\langle \rho(\mathbf{r})| = \sum_i \left\langle \rho(\mathbf{r})\,\frac{1}{r_{12}}\, f_i(\mathbf{r})\right\rangle \langle f_i(\mathbf{r})|. \qquad (7.23)$$
J is re-written, summing over i and j:

$$J_{12} = \left\langle \rho(\mathbf{r}_1)\,\frac{1}{r_{12}}\,\rho(\mathbf{r}_2)\right\rangle
= \sum_{i,j}\left\langle \rho(\mathbf{r}_1)\,\frac{1}{r_{12}}\, f_i(\mathbf{r}_1)\right\rangle
\left\langle f_i(\mathbf{r}_1)\,\frac{1}{r_{12}}\, f_j(\mathbf{r}_2)\right\rangle
\left\langle f_j(\mathbf{r}_2)\,\frac{1}{r_{12}}\,\rho(\mathbf{r}_2)\right\rangle. \qquad (7.24)$$
And recalling the defining relation for the potentials (i.e. one-electron functions
of a single radial variable):

$$\frac{1}{r_{12}}\,|f_i(\mathbf{r})\rangle = |v_i(\mathbf{r})\rangle, \qquad (7.26)$$

$$J_{12} = \langle \rho(\mathbf{r}_1)\, v_i(\mathbf{r}_1)\rangle\,\langle v_i(\mathbf{r}_2)\,\rho(\mathbf{r}_2)\rangle \quad \text{(with implied sum over } i\text{)}. \qquad (7.27)$$
In addition, the potentials must ensure rapid convergence of the
implied sum in the resulting expression for Coulomb integrals $J_{12}$ as
products of auxiliaries, i.e. overlap integrals, as detailed in [69].
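The structure of (7.27) can be illustrated with a discretized toy analogue (my own construction, not the operators of [69, 70]): resolve a symmetric model kernel into a sum of separable products by eigendecomposition, and recover the "Coulomb energy" as a sum of squared one-body overlaps.

```python
import numpy as np

x = np.linspace(-5.0, 5.0, 201)
h = x[1] - x[0]
# Softened stand-in for the 1/r12 kernel on a 1-D grid (symmetric, positive definite)
K = 1.0 / (np.abs(x[:, None] - x[None, :]) + 1.0)

w, U = np.linalg.eigh(K)                  # K = U diag(w) U^T
order = np.argsort(w)[::-1]
w, U = w[order], U[:, order]

rho = np.exp(-x**2)                       # a model one-electron density
J_direct = h * h * (rho @ K @ rho)        # "Coulomb energy" rho^T K rho

errs = []
for m in (5, 20, 201):                    # truncated resolutions K ~ sum_i v_i v_i^T
    V = U[:, :m] * np.sqrt(np.clip(w[:m], 0.0, None))
    J_res = h * h * np.sum((V.T @ rho)**2)  # sum of squared one-body overlaps
    errs.append(J_direct - J_res)
# the truncation error shrinks as terms are added; the full set is exact
```

Each column of `V` plays the role of a potential $v_i$: the two-body quantity is computed entirely from one-body "overlap" products, which is the point of the resolution.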
This technique can be readily generalized to exchange and multi-center
two-electron integrals [74]. For two-center terms it is helpful to define
structure harmonics by Fourier transforms, limiting evaluation to non-zero
terms [70].
The requisite potentials and auxiliaries are given in Appendix D
(Section 7.15).
This assumes tacitly that the potential obtained from the Coulomb
operator resolution is centered on one of the atoms. Whilst this choice
can be made for one pair in a four-center product, it cannot for the second.
There remains a single translation for this potential in one auxiliary of the
two in a product representing a four-center integral, and none otherwise.
This method obviates the need to evaluate the infinite series that arise
from orbital translations. They have been eliminated in the
Coulomb operator resolution approach, since only orbitals on two centers
remain in the one-electron overlap-like auxiliaries. These can be evaluated
with no orbital translation, in prolate spheroidal coordinates, or by Fourier
transformation [70, 74].
AO No.   n   l   m   zeta
01       1   0   0   5.6727
02       2   0   0   1.6083
03-05    2   1   m   1.5679
06       1   0   0   8.5600
07       2   0   0   2.5600
08-10    2   1   m   2.5200
H        1   0   0   1.2400
P.E. Hoggan

Table 7.3b. Selected examples of three-center exchange integrals.

2sC 2sF |2sC 1sHa    0.497048510 x 10^-1     2sF 1sHa |1sF 2sC     0.101405594 x 10^-2
2sC 2sF |2sC 1sHa    0.842056635 x 10^-2     2sF 1sHa |2sF 2sC     0.934135949 x 10^-2
2sC 1sF |1sC 1sHa    0.573790540 x 10^-3     2sF 1sHa |2pzF 2sC    0.844295091 x 10^-2
2sC 1sF |2sC 1sHa    0.378918525 x 10^-2     2sF 1sHa |1sF 2pzC    0.181323479 x 10^-2
1sC 2pzF |2pzC 1sHa  0.158758344 x 10^-2     2sF 1sHa |2sF 2pzC    0.137964387 x 10^-1
2sC 2pzF |2pzC 1sHa  0.525834208 x 10^-2     2sF 1sHa |2pzF 2pzC   0.113501125 x 10^-1
2pzC 1sF |1sC 1sHa   0.102532536 x 10^-2     1sHa 2sF |1sHa 2sC    0.1252319411 x 10^-1
2pzC 1sF |2sC 1sHa   0.677276818 x 10^-2     1sHa 2sF |1sHa 2pzC   0.159149899 x 10^-2
1sC 1sF |1sC 1sHa    0.109900118 x 10^-6     1sHa 2pzF |1sHa 2pzC  0.177290873 x 10^-2
1sC 1sF |2sC 1sHa    0.679454131 x 10^-6     1sF 1sHb |2sF 1sC     0.228777210 x 10^-4
1sC 2sF |1sC 1sHa    0.144631297 x 10^-2     1sHb 2sF |1sHb 1sC    0.193963837 x 10^-2
resolution can be used to give fast and accurate results for basis sets of s
and p Slater-type orbitals. Generalization is in progress.
Numerical values for the H2 dimer geometry and interaction energy agree
well with complete ab initio potential energy surfaces obtained using very
large Gaussian basis sets and with data from vibrational spectroscopy [77].
evaluated at $\vec{\mu}_N = 0$, $\vec{B}_0 = 0$,
with $\vec{\mu}_N$ the nuclear dipole moment of nucleus N and $\vec{B}_0$ the external field.
$|0\rangle$ is a closed-shell ground-state Slater determinant; $\alpha$ and $\beta$ stand for
Cartesian coordinates.
A coupled Hartree–Fock treatment of the above equation leads to [121,
122, 136]:

$$\sigma_N^{\alpha\beta} = \mathrm{Tr}\left[ P^{(0,1)}\, h^{(1,0)} + P^{(0)}\, h^{(1,1)} \right], \qquad (7.29)$$

where $P^{(0)}$ and $P^{(0,1)}$ are the density matrices of zeroth order and of first order
with respect to the external magnetic field, $h^{(1,0)}$ is the core Hamiltonian
of first order with respect to the nuclear dipole moment, and $h^{(1,1)}$ is the
second-order one-electron Hamiltonian with respect to the nuclear moment
and the external field $\vec{B}_0$.
The non-zero orders in (7.29) involve integrals which are absent from
ab initio Hartree–Fock calculations. In this work, we focus our attention
on integrals involving $1/r_N^3$ in their operator. These integrals appearing in
The integral which we have chosen to investigate in detail within the Fourier
transform approach is the three-center one-electron integral:

$$I = \left\langle\, \frac{(\vec{r}\cdot\vec{r}_N)\,\delta_{\alpha\beta} - r_{N,\alpha}\, r_{\beta}}{r_N^{3}} \,\right\rangle. \qquad (7.31)$$

Here $\vec{r}_N$ is the instantaneous position of the electron with respect to the
nucleus N.
Analytical treatment

Application

[Figure: benzothiazole ring systems XBT and their protonated forms PXBT, with substituents X and Y; the accompanying tables list molecules and substituents.]
The above results prompted use of the protonated structure to obtain the
zero-order wave-function in all cases apart from benzothiazole (BT) and
ABT. Below, the same cases are treated in the DFT work.
Note that the basis sets including hydrogen-like orbitals perform better
than the STO basis sets, which in turn improve upon dense-core Gaussian basis
sets [6-311++G(2d,p)].
Basis sets augmented with hydrogen-like orbitals are within 5 ppm
of the experimental values (measured within 2 ppm) for the discrete
The content of this table is original and based on the previous work
of the author [125] i.e. geometries are re-optimized from the coordinates
of [125].
Conclusion
Another step on the way to ab initio ETO-basis nuclear screening tensor
calculations has been accomplished.
It is essential to use a basis set which comprises orbitals with the correct
nuclear cusp behavior. This implies a non-zero value of the function at the
origin for spherically symmetric cases, satisfying Kato's conditions.
Hydrogen-like atomic orbital basis sets therefore perform better than Slater-type orbitals, which are themselves an improvement upon even large Gaussian basis
sets.
The NDDO-PM3 molecular site approach has the advantage of rapidity:
calculations take about a minute instead of 50–75 hours on the IBM 44P-270.
They cannot be systematically improved, however, once the site Slater
exponents have been fitted. Note that the 2s Slater exponent fluctuates wildly
in fits, providing further evidence that the shielding must be of the form (2 − r)
for the 2s ETO.
Fundamental work on orbital translation is also in progress to speed up
these calculations within the test-bed of the STOP programs [50, 80, 131].
The interplay of these discrete molecule solvent models and accurate
in vivo NMR measurements is satisfactory, in that the structures postulated give calculated chemical shifts of similar accuracy to that obtained for the
experimental values (of the order of 2 ppm). It should be stressed that energy
minimization in this case does evidence directional hydrogen bonds but can
lead to several possible solvent geometries. Further study using molecular
dynamics techniques would be useful in the modeling of solvent shells and
in terms of two-electron integrals. Due to the use of a single r12 value, the
accuracy achieved for atomic calculations is less than that of Hylleraas (Hy) and Hy-CI
calculations. Recent improvements of the method [95–97] can achieve
micro-Hartree accuracy in the energy for chemically interesting systems.
Short wave function expansions lead to very good results. When a large
number of configurations is used (up to 10,000), the energy results go
beyond pico-Hartree accuracy, while the CI wave function would need of
the order of millions of configurations.
Caffarel's basis three (BS3) gives a total energy about 0.01 a.u. lower at this
preliminary stage.
Examining the total DMC energies with the corresponding nodes (within
the fixed-node approximation for the ground state), the results below were
subsequently obtained:

E-DMC(HF, VB1) = -191.8481(5)        E-DMC(HF, BS3) = -191.729(4)
E-DMC(CAS(6,5), VB1) = -191.8479(5)  E-DMC(CAS(6,5), BS3) = -191.744(5)

The improvement in results is remarkable! (They are about 0.12 a.u.
lower in energy for the VB1 basis.)
These results do not depend on whether the Jastrow factors are well
optimized or not. They are simply dictated by the initial nodal structure of
the wave-function, which remains unchanged.
It is therefore possible to compare the basis sets directly.
These results show that the Slater-type orbital basis set is much more
appropriate for a trial wave-function than one on a Gaussian basis. It is possible to infer categorically that the nodal structure is substantially improved
towards the exact result when an STO basis set is used, compared to a cusp-corrected Gaussian basis set.
The following results show that the singlet–triplet gap calculation is far less
basis-set sensitive in the case of a Slater-type orbital basis, regardless of
Jastrow factor optimisation. The basis appears to saturate much sooner,
and this may allow for limited optimization, thus avoiding (at least in part)
the very time-consuming step in quantum Monte Carlo. Fixed-node and
variational accuracy are both improved over the cusp-corrected Gaussians.
The Slater-type orbital basis State Specific results are (gaps in eV):

E-DMC(CAS(2,2), vb) = -191.8494(5)   -191.7107(5)   gap = 3.77(2)
E-DMC(CAS(6,5), vb) = -191.8479(5)   -191.7154(5)   gap = 3.61(2)

whereas the previous (cusp-corrected Gaussian) State Specific results are:

E-DMC(CAS(2,2), BS3) = -191.729(4)   -191.608(2)    gap = 3.29(12)
E-DMC(CAS(6,5), BS3) = -191.744(5)   -191.596(2)    gap = 4.03(14)
The gap energies fluctuate much less with the configurations taken into
account for STO than for cusp-corrected GTO, because the nodal
structure is closer to that of the exact solution.
When the basis is increased progressively from VB1 to VB3 through
VB2 there is a systematic improvement, both variationally and in the
7.9.3.3. Perspectives
Time-consuming optimisation of the Jastrow factor, which introduces correlation through explicit r12 dependence, may be reduced by using a trial
wave-function over Slater-type orbitals, which enables the Hartree–Fock limit
to be approached with a smaller basis (i.e. the basis saturates sooner) than
for Gaussians.
Recently, Gill derived a simple closed-form expression for correlation in a model two-particle system in hyper-spherical coordinates [142]. Coordinates have also been given by Fano for three particles. A simple
expression would lead to considerable time-gains for Jastrow factor optimisation, since these terms dominate the calculation time to a large extent
when they are necessary.
Early molecular electronic structure work, from the first diatomic molecules
tackled before 1930 to the advent of mainframe computers in the mid-1950s,
these functions instead. The scaling we use enables us to avoid the underflows and overflows that may occur in direct computation of $I_{n+1/2}(x)$ and
$K_{n+1/2}(x)$ for large values of n; it is thus an important ingredient of our
method. This also allows us to scale the BCLFs appropriately. In order to
end up with BCLFs that have double-precision accuracy, in our method
we compute both the functions $I_{n+1/2}(x)$ and $K_{n+1/2}(x)$ and the BCLFs in
extended-precision arithmetic; quadruple-precision
arithmetic is shown to suffice, and it is offered by some high-level programming language compilers used for scientific applications, e.g. for Fortran
77 and C. As the number of arithmetic operations required is very small
(of the order of wN, where N is the number of BCLFs computed and w is
a small integer), the use of quadruple-precision arithmetic does not significantly increase
the computation time. An error analysis for the
procedure we use to compute the scaled modified spherical Bessel functions, which shows that the procedure is indeed very accurate, is provided in previous
work [116].
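The need for scaling can be seen already in double precision. SciPy exposes exponentially scaled variants (`ive`, `kve`) that play the role of the scaled functions discussed here; the argument values below are merely illustrative:

```python
import numpy as np
from scipy.special import iv, ive, kv, kve

# Direct evaluation overflows/underflows in double precision...
assert np.isinf(iv(0.5, 800.0))     # I_{1/2}(800) ~ e^800: overflow
assert kv(10.5, 750.0) == 0.0       # K_{21/2}(750) ~ e^-750: underflow
# ...while the scaled versions e^{-x} I_v(x) and e^{x} K_v(x) stay representable
assert np.isfinite(ive(0.5, 800.0)) and ive(0.5, 800.0) > 0

# The product I_v(a) K_v(r), a < r, is then recovered stably:
a, r, v = 700.0, 750.0, 10.5
prod = ive(v, a) * kve(v, r) * np.exp(a - r)
assert np.isfinite(prod) and prod > 0
```

The exponential factors cancel analytically in products such as those of Eq. (7.37), which is precisely why the scaled functions are the right intermediates.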
Finally, in [116], we also provide three appendices that contain several
results that seem to be new. In the first, we analyze the asymptotic behavior
of the modified Bessel functions $I_\nu(x)$ and $K_\nu(x)$ as $\nu \to \infty$. We derive
two sets of full asymptotic expansions that have some quite interesting
properties. The scalings we use in [116] are based on the results of this
appendix. In the second, we obtain explicit power series expansions for
products of modified spherical Bessel functions. In the third appendix
of [116], we derive asymptotic expansions of BCLFs as their order tends to
infinity.
$A^n_{\nu+1/2}$ being the BCLFs. From this relation, it is seen that $R^{n-1}e^{-\lambda R}$ serves
as a generating function for the BCLFs. Since

$$\int_{-1}^{+1} \left[P_\nu(x)\right]^2 dx = \frac{2}{2\nu+1}, \quad \nu = 0,1,\ldots, \qquad (7.34)$$

we immediately deduce from (7.33) that

$$A^n_{\nu+1/2}(\lambda,a,r) = \frac{\sqrt{ar}}{2}\int_{-1}^{+1} R^{n-1}e^{-\lambda R}\,P_\nu(x)\,dx, \quad \nu = 0,1,\ldots. \qquad (7.35)$$

Clearly, the $A^n_{\nu+1/2}(\lambda,a,r)$ are symmetric functions of a and r, that is,

$$A^n_{\nu+1/2}(\lambda,a,r) = A^n_{\nu+1/2}(\lambda,r,a), \qquad (7.36)$$

because the function $R^{n-1}e^{-\lambda R}$ is.
A simple expression for BCLFs with n = 0 and $\nu = 0,1,\ldots$ is known
(see Abramowitz and Stegun [102], p. 445, formula 10.2.35):

$$A^0_{\nu+1/2}(\lambda,a,r) = I_{\nu+1/2}(\lambda\alpha)\,K_{\nu+1/2}(\lambda\beta); \quad \alpha = \min\{a,r\},\ \beta = \max\{a,r\}. \qquad (7.37)$$
Here, $I_{\nu+1/2}(x)$ and $K_{\nu+1/2}(x)$ are the modified spherical Bessel functions.
The functions $I_{\nu+1/2}(x)$ and $K_{\nu+1/2}(x)$ satisfy three-term recursion relations in $\nu$ that are given in this work, and are defined for all integer values
of $\nu$. Those $I_{\nu+1/2}(x)$ with $\nu \ge 0$ are called modified spherical Bessel
functions of the first kind, while those with $\nu < 0$ are called modified
spherical Bessel functions of the second kind. The $K_{\nu+1/2}(x)$ are called
modified spherical Bessel functions of the third kind. Each of the two pairs
$[I_{\nu+1/2}(x)$ and $I_{-\nu-1/2}(x)]$ and $[I_{\nu+1/2}(x)$ and $K_{\nu+1/2}(x)]$ is a linearly
independent set of solutions of the modified spherical Bessel equation of
order $\nu$ (see Abramowitz and Stegun [102], Chapter 10), of the first and
third kind, respectively. Because $I_{\nu+1/2}(x)$ and $K_{\nu+1/2}(x)$ are defined for
all integer values of $\nu$, we let (7.37) define $A^0_{\nu+1/2}(\lambda,a,r)$ for $\nu < 0$ as
well. This is an important step that enables us to define $A^n_{\nu+1/2}(\lambda,a,r)$ for
$\nu < 0$ as well, which is what we consider next (see [106]).
From the integral representation in (7.35), it follows that, for $n \ge 0$,

$$A^{n+1}_{\nu+1/2}(\lambda,a,r) = -\frac{\partial}{\partial\lambda}\,A^{n}_{\nu+1/2}(\lambda,a,r), \qquad (7.38)$$

and hence

$$A^{n}_{\nu+1/2}(\lambda,a,r) = (-1)^{n}\,\frac{\partial^{n}}{\partial\lambda^{n}}\,A^{0}_{\nu+1/2}(\lambda,a,r). \qquad (7.39)$$
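Equations (7.36)–(7.38) are straightforward to check numerically. The sketch below (with arbitrary test values) builds $A^0$ from SciPy's Bessel routines, forms $A^1 = -\partial A^0/\partial\lambda$ analytically via the standard derivative identities $I_v'(z) = \tfrac12(I_{v-1}+I_{v+1})$ and $K_v'(z) = -\tfrac12(K_{v-1}+K_{v+1})$, and compares with a finite difference:

```python
import numpy as np
from scipy.special import iv, kv

def A0(lam, a, r, nu):
    """Eq. (7.37): A^0 = I_{nu+1/2}(lam*alpha) * K_{nu+1/2}(lam*beta)."""
    al, be = min(a, r), max(a, r)
    return iv(nu + 0.5, lam * al) * kv(nu + 0.5, lam * be)

def A1(lam, a, r, nu):
    """Eq. (7.38): A^1 = -dA^0/dlam, via the Bessel derivative identities."""
    al, be = min(a, r), max(a, r)
    v = nu + 0.5
    dI = 0.5 * (iv(v - 1, lam * al) + iv(v + 1, lam * al))   # I_v'(z)
    dK = -0.5 * (kv(v - 1, lam * be) + kv(v + 1, lam * be))  # K_v'(z)
    return -(al * dI * kv(v, lam * be) + be * iv(v, lam * al) * dK)

lam, a, r, nu = 1.3, 0.7, 2.1, 2
assert np.isclose(A0(lam, a, r, nu), A0(lam, r, a, nu))      # symmetry (7.36)
fd = -(A0(lam + 1e-6, a, r, nu) - A0(lam - 1e-6, a, r, nu)) / 2e-6
assert np.isclose(A1(lam, a, r, nu), fd, rtol=1e-5)
```

Repeated differentiation in the same style generates the higher $A^n$ of Eq. (7.39).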
1978 Filter and Steinborn: Fourier transform work. B-functions and plane-
wave expansion of Coulomb operator.
1981 ETO Conference in Tallahassee. Weatherford and Jones.
1994 First (published) STOP (Slater-type orbital package, QCPE 667 1996)
code. Bouferguene and Hoggan.
2001 First SMILES (Slater molecular integrals for large electronic systems)
code. Fernandez Rico, Lopez et al.
2002 Complete analytic study of two-center repulsion integrals using
Maple symbolic algebra. F. Harris
2009 Gill: Coulomb resolution. Model exact pair correlated wave-
functions.
See Annalen der Physik, 402, 7 (1931), pp. 868–872 (in German):
In molecular quantum chemistry, major contributions to the electronic
interaction energy for diatomic systems are exchange integrals of the form:

$$I_{ex} = \int\!\!\int \frac{\varphi^{(1)}_a(1)\,\varphi^{(2)}_b(1)\,\varphi^{(1)}_a(2)\,\varphi^{(2)}_b(2)}{r_{12}}\,d\tau_1\,d\tau_2,$$

where $\varphi^{(1)}$ and $\varphi^{(2)}$ are (hydrogen-like) atomic orbitals centered at a and
b. This integral was solved in 1927 for ground-state H2 by Sugiura.9
In general, these two-center exchange integrals may be expressed as
polynomials multiplied in turn by exponentials, logarithms, and exponential integral functions. The sum is finite for equal orbital exponents, $\zeta$,
or for orbitals with the same principal quantum number.
Some standard integrals are required.10
where r is a rational function. Integration may easily be carried out using the formula

$$\int^{x} r(t)\,e^{\lambda t}\,dt = e^{\lambda x}\sum_{i\ge 0}(-1)^{i}\,\frac{r^{(i)}(x)}{\lambda^{i+1}} \qquad (r^{(i)} = i\text{-th derivative of } r).$$
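For a polynomial r the sum terminates, and the textbook antiderivative $\int^x r(t)e^{\lambda t}dt = e^{\lambda x}\sum_i(-1)^i r^{(i)}(x)/\lambda^{i+1}$ is easy to verify symbolically (the sample polynomial is arbitrary):

```python
import sympy as sp

x, t, lam = sp.symbols('x t lam', positive=True)
r = 3 * t**2 + 2 * t + 1                      # a sample polynomial r(t)

# Closed form: e^{lam x} * sum_i (-1)^i r^(i)(x) / lam^(i+1)
rx = r.subs(t, x)
closed = sp.exp(lam * x) * sum(
    (-1)**i * sp.diff(rx, x, i) / lam**(i + 1) for i in range(3))
direct = sp.integrate(r * sp.exp(lam * t), t).subs(t, x)
assert sp.simplify(sp.expand(closed - direct)) == 0
```

Differentiating the closed form back with respect to x telescopes to $r(x)e^{\lambda x}$, which is the quickest pencil-and-paper check of the same identity.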
$$I_{ex} = \frac{\alpha^{(1)}\alpha^{(2)}\,D^{5}}{8}\; e^{-(\alpha^{(1)}+\alpha^{(2)})}\, J,$$

with

$$J = a_0(\alpha^{(1)}) + \log \alpha^{(1)} - a_1(\alpha^{(1)})\,e^{2\alpha^{(1)}}\,\mathrm{Ei}(-2\alpha^{(1)}) + a_2(\alpha^{(1)})\,e^{2\alpha^{(2)}}\,\mathrm{Ei}(-2\alpha^{(2)})$$
$$\quad + a_0(\alpha^{(1)})\,e^{2(\alpha^{(1)}+\alpha^{(2)})}\,\mathrm{Ei}(-2(\alpha^{(1)}+\alpha^{(2)})) + e^{\alpha^{(1)}}\int_1^{\infty} A_1(x)\,e^{-\alpha^{(1)}x}\,dx$$
$$\quad + e^{\alpha^{(2)}}\int_1^{\infty} A_2(x)\,e^{-\alpha^{(2)}x}\,dx + e^{\alpha^{(1)}+\alpha^{(2)}}\int_1^{\infty} \left[b(x) - A_0(x)\right] e^{-(\alpha^{(1)}+\alpha^{(2)})x}\,dx.$$
$$f_i = 2^{3/2}\, h_{nl}(r)\, Y_l^m(\theta,\varphi).$$
Here, $h_n(x)$ is the nth member of any set of functions that are complete
and orthonormal on the interval $[0,+\infty)$, such as the nth-order polynomial
function (i.e. polynomial factor of an exponential). The choice made in [69]
is to use parabolic cylinder functions (see also another application [54]),
i.e. functions with the even-order Hermite polynomials as a factor. This is
not the only possibility, and a more natural and convenient choice is based
and analytical expressions of $V_{nl}(r)$ with non-zero l are also readily obtained
by recurrence.
These radial potentials can generally be expressed in terms of hypergeometric functions, whether the choice of polynomial is the present one, i.e.
Laguerre, or Hermite polynomials, as in [69]. This structure has been used to
confirm the results of [70] using a rapid code in C [71]. Spherical harmonics
are translated using Talman's approach [72]. The displaced potential in one
factor of the product of auxiliaries, arising from four-center integrals, is readily
expanded in two-center overlaps after applying Euler's hypergeometric
transformation [73, 74].
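Euler's hypergeometric transformation, $_2F_1(a,b;c;z) = (1-z)^{c-a-b}\,{}_2F_1(c-a,c-b;c;z)$, is quickly confirmed numerically (the parameter values are arbitrary):

```python
from scipy.special import hyp2f1

a, b, c, z = 0.3, 1.2, 2.5, 0.4
lhs = hyp2f1(a, b, c, z)
rhs = (1 - z)**(c - a - b) * hyp2f1(c - a, c - b, c, z)
assert abs(lhs - rhs) < 1e-12
```

In practice the transformation is used to trade a slowly convergent series for a better-conditioned one, which is what makes the two-center overlap expansion tractable.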
The auxiliary overlap integrals $\langle \rho(\mathbf{r}_1)\,v_i(\mathbf{r}_1)\rangle$ and $\langle v_i(\mathbf{r}_2)\,\rho(\mathbf{r}_2)\rangle$ will
involve densities obtained from atomic orbitals centered on two different
atoms in exchange and multi-center two-electron integrals. The overlap integrals required in an ETO basis are thus of the type:

$$\langle \chi_a(\mathbf{r}_1)\,\chi_b(\mathbf{r}_1)\,v_i(\mathbf{r}_1)\rangle = \sum_{\mu=0}^{\mu_{\max}} N(n_1,n_2,n_i,l_i,|m_i|)\; s(n_1,l_1,m,n_2,l_2,\mu). \qquad (7.49)$$
The operator $(\vec{r}\cdot\vec{r}_N\,\delta_{\alpha\beta} - r_{N,\alpha}\,r_{\beta})/r_N^3$ can be expressed as a combination
Here we have used the Gaunt coefficients [124, 139] and the Condon
and Shortley phase convention for the spherical harmonics $Y_l^m(\theta_{\vec{r}},\varphi_{\vec{r}})$ [120].
Consequently, the integral (7.31) is reduced to a sum of terms of the form:

$$\frac{Y_{1}^{j}(\theta_{\vec{r}_N},\varphi_{\vec{r}_N})}{r_{N}^{2}}. \qquad (7.56)$$
$$\mathcal{F}\!\left[\frac{Y_1^{j}(\theta_{\vec{r}_N},\varphi_{\vec{r}_N})}{r_N^{2}}\right](\vec{p}) = \frac{-4\pi i\; Y_1^{j}(\theta_{\vec{p}},\varphi_{\vec{p}})}{|\vec{p}|}. \qquad (7.57)$$

This immediately allows us to write the inverse Fourier transform:

$$\frac{Y_1^{j}(\theta_{\vec{r}_N},\varphi_{\vec{r}_N})}{r_N^{2}} = \frac{-i}{2\pi^{2}}\int \frac{e^{i\vec{p}\cdot\vec{r}_N}\;Y_1^{j}(\theta_{\vec{p}},\varphi_{\vec{p}})}{|\vec{p}|}\,d\vec{p}. \qquad (7.58)$$
Now, this places us in a position to write the Fourier integral for the present
term in the NMR nuclear shielding tensor calculation. After expanding the
Slater-type orbitals in terms of B-functions and substituting (7.58) in (7.56),
the present integral becomes:
I = (−i/(2π^2)) ∫ e^{ip·RN} [Y_1^j(θp, φp) / |p|]
⟨B_{n1,l1}^{m1}(ζ1, r) | e^{ip·r} | B_{n2,l2}^{m2}(ζ2, r − R2)⟩ d^3p,   (7.59)
whereas the three-center nuclear attraction integral is:
I = (1/(2π^2)) ∫ [e^{ip·RN} / |p|^2]
⟨B_{n1,l1}^{m1}(ζ1, r) | e^{ip·r} | B_{n2,l2}^{m2}(ζ2, r − R2)⟩ d^3p.   (7.60)
The three-center dipolar integral (7.13) appears in a form closely related to
that of the three-center nuclear attraction integrals required at the HF-SCF
level of electronic structure calculation (and also used in electronic DFT
work). In both of the above integrals, note the presence of the common factor
in B-function Fourier-transform work first studied by the Steinborn group,
Eqs. (7.4) and (7.5); see [25], i.e.:
⟨B_{n1,l1}^{m1}(ζ1, r) | e^{ip·r} | B_{n2,l2}^{m2}(ζ2, r − R2)⟩.   (7.61)
The analytical treatment developed here requires no hypothesis on the
relative positions of the nuclei (aligned or not) and no restriction on
quantum numbers. Consequently, equation (7.59) is completely general
and may be evaluated directly with routines available in quantum chemistry
software.
Note that such an integral satisfies all applicability conditions of the
non-linear transformations for extrapolation described by A. Sidi [133].
Previous work on three-center nuclear integral evaluation [132] has been
used to develop an efficient program to compute this dipolar 1/rN^3
three-center integral.
Bibliography
[14] R.S. Mulliken and C.C.J. Roothaan, Proc. Natl. Acad. Sci. U.S. 45, 394 (1959).
[15] S.F. Boys, Proc. Roy. Soc. (London). A200, 542 (1950).
[16] I. Shavitt and M. Karplus, J. Chem. Phys. 36, 550 (1962).
[17] E. Clementi and D.L. Raimondi, J. Chem. Phys. 38, 2686 (1963).
[18] C.W. Scherr, J. Chem. Phys. 23, 569 (1955).
[19] S.J. Smith and B.T. Sutcliffe, in Reviews in Computational Chemistry, edited by
D.B. Boyd and K.B. Lipkowitz 10, 271 (1997).
[20] POLYATOM: D.B. Newmann, H. Basch, R.L. Korregay, L.C. Snyder, J. Moskowitz,
C. Hornback, and P. Liebman, Quantum Chemistry Program Exchange, Indiana
University, No. 199; I.G. Csizmadia, M.C. Harrison, J.W. Moscowitz, B.T. Sutcliffe,
Theoret. Chim. Acta. 6, 191 (1966).
[21] IBMOL: D.J. David, CDC 6600 Version. Technical Report of C.C. ENSJF and Lab.
de Chimie ENS, Paris (1969). E. Clementi, D.R. Davis, J. Comput. Phys. 1, 223–244
(1967); A. Veillard, IBMOL: Computation of wave-functions for molecules of
general geometry, Version 4; IBM Research Laboratory, San Jose.
[22] A.D. McLean, M. Yoshimine, B.H. Lengsfield, P.S. Bagus, and B. Liu, in Modern
Techniques in Computational Chemistry, MOTECC 91, edited by Clementi E.
(Elsevier, Leiden, 1991) pp. 233–353.
[23] W.J. Hehre, W.A. Lathan, R. Ditchfield, M.D. Newton, and J.A. Pople. GAUSSIAN
70: Ab Initio SCF-MO Calculations on Organic Molecules QCPE 11, Programme
number 236 (1973).
[24] A. Bouferguene, M. Fares, and P.E. Hoggan, Int. J. Quantum Chem. 57, 801
(1996).
[25] H.H.H. Homeier, E. Joachim Weniger, and E.O. Steinborn, Comp. Phys. Comm.
72, 269 (1992). H.H.H. Homeier and E.O. Steinborn, Comp. Phys. Comm. 77, 135
(1993).
[26] J. Fernández Rico, R. López, I. Ema, and G. Ramírez, J. Comp. Chem. 19, 1284
(1998); ibid., J. Comp. Chem. 25, 1347 (2004).
[27] CADPAC: The Cambridge Analytic Derivatives Package, R.D. Amos, I.L. Alberts,
J.S. Andrews, S.M. Colwell, N.C. Handy, D. Jayatilaka, P.J. Kowles, R. Kobayashi,
K.E. Laidig, G. Laming, A.M. Lee, P.E. Maslen, C.W. Murray, J.E. Rice, E.D.
Simandiras, A.J. Stone, M.D. Su, and D.J. Tozer, Cambridge, (1995).
[28] E.J. Baerends, D.E. Ellis, and P. Ros, Chem. Phys. 2, 17 (1973).
[29] G. Te Velde, F.M. Bickelhaupt, E.J. Baerends, C. Fonseca Guerra, S.J. A. Van Gis-
bergen, J.G. Snijders, and T. Ziegler, J. Comp. Chem. 22, 931 (2001). ADF: Ams-
terdam Density Functional, available at http://www.scm.com/.
[30] C.F. Bunge, R. Jauregui, and E. Ley-Koo, Can. J. Phys. 76, 421 (1998).
[31] D. Pinchon and P.E. Hoggan, J. Phys. A: Math. Theor. 40, 1597 (2007).
[32] T. Ozdogan, Int. J. Quantum Chem. 92, 419 (2003).
[33] S.A. Hagstrom and H. Shull, Rev. Mod. Phys. 35, 624 (1963).
[34] D.C. Clary, Mol. Phys. 34, 793 (1977).
[35] J.S. Sims and S.A. Hagstrom, J. Chem. Phys. 124, 094101 (2006).
[36] M.B. Ruiz and K. Peuker, in Recent Advances in Computational Chemistry:
Molecular Integrals over Slater Orbitals, edited by T. Ozdogan and M.B. Ruiz,
(Transworld Research Network, Kerala, 2008) pp. 100–144.
[37] E. Filter and E.O. Steinborn, Phys. Rev. A. 18, 1 (1978).
Chapter 8
Scheme 8.1. 1: the covalent (spin-paired) VB structure of an A–B (e.g. H–H) bond;
2: the ionic structures; 3: the doubly occupied MO σg.
By itself, each Slater determinant in Eq. (8.1) is not much lower in energy
than the two separate atoms, and is therefore practically non-bonding [2].
It is the superposition of the two determinants, or resonance between the
two spin arrangements, that creates bonding, as represented in 1 in Scheme
8.1. This early description was remarkably successful since it accounted
for 75% of the total bond energy of H2 . For a complete description, it is
necessary to include two ionic terms, as is done in Eq. (8.2) that represents
the full VB wave function for a general single bond between two atoms A
and B:
ΨVB(A−B) = C1 (|a b̄| − |ā b|) + C2 |a ā| + C3 |b b̄|   (8.2)
where the two last Slater determinants represent the A⁻B⁺ and A⁺B⁻ ionic
situations, as in 2 (Scheme 8.1). Thus, the full VB wave function is a linear
combination of three VB functions (generally referred to as VB struc-
tures), each representing a particular bonding scheme. The AOs that are
used to construct the VB structures in Eq. (8.2), are defined as linear com-
binations of the basis functions χμ, centered on a single atom, Eq. (8.3):
φi = Σμ Tμi χμ   (8.3)
and taken from standard basis sets. The AOs are 1s types for hydrogen
atoms, and hybrids of ns and np basis functions for heavier atoms, giving
rise to the concept of hybridization. For example, the resulting hybrid
atomic orbitals (HAOs) of carbon in a C-H bond in different molecules
will resemble the sp3 , sp2 or sp types, well known from the important
hybridization concept.
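The transformation of Eq. (8.3) and the sp3 hybridization it can encode are easy to illustrate with an explicit coefficient matrix. The sketch below (an illustration, not taken from the chapter) assumes an idealized orthonormal AO basis {s, px, py, pz} on carbon and builds the four textbook sp3 combinations:

```python
import numpy as np

# Basis order: [s, px, py, pz] on the carbon atom (assumed orthonormal).
# Rows of T are the four sp3 HAOs, i.e. the T_mu_i coefficients of Eq. (8.3).
T = 0.5 * np.array([[1,  1,  1,  1],
                    [1,  1, -1, -1],
                    [1, -1,  1, -1],
                    [1, -1, -1,  1]])

# Each hybrid is normalized and the four hybrids are mutually orthogonal,
# pointing toward the four corners of a tetrahedron.
print(np.allclose(T @ T.T, np.eye(4)))  # True
```

In a real VB calculation the coefficients are of course determined variationally, and the hybrids carry small delocalization tails; the idealized matrix above only shows the structure of the transformation.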
The VB description of the AB bond can be compared with the simple
MO description, in which a unique delocalized MO is doubly occupied, 3
in Scheme 8.1, Eq. (8.4):
ΨMO(A−B) = |σg σ̄g|;   σg ∝ (a + b)   (8.4)
ΨMOCI(A−B) = c1 |σg σ̄g| + c2 (|σg σ̄u| + |σu σ̄g|) + c3 |σu σ̄u|;
σu ∝ (a − b)   (8.5)
being optimized with freedom to delocalize over the two centers. This is
exemplified in Eq. (8.9) for the AB bond:
ΦCF(A−B) = |φa φ̄b| − |φ̄a φb|   (8.9a)
φa = a + λb   (8.9b)
φb = b + μa   (8.9c)
Here a and b are purely localized AOs (or HAOs), while φa and φb
are delocalized orbitals. In fact, experience shows that the Coulson–Fischer
orbitals φa and φb, which result from the energy minimization, are generally
not extensively delocalized (λ, μ < 1), and as such they can be viewed
as distorted atomic orbitals. However, minor as this may look, this slight
delocalization renders the Coulson–Fischer wave function equivalent to the
ΨVB(A−B) wave function (Eq. (8.2)) with the three classical VB structures.
A straightforward expansion of the Coulson–Fischer wave function leads
to the linear combination of the classical structures in Eq. (8.10):
ΦCF(A−B) = (1 + λμ)(|a b̄| − |ā b|) + 2μ |a ā| + 2λ |b b̄|   (8.10)
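This expansion can be verified symbolically. The sketch below works with the spatial part of the two-electron singlet only; the symbols a1, b2, etc. are hypothetical shorthand for "orbital a occupied by electron 1", and lam, mu are the small delocalization coefficients of Eqs. (8.9b,c):

```python
import sympy as sp

a1, a2, b1, b2, lam, mu = sp.symbols('a1 a2 b1 b2 lamda mu')

# Coulson-Fischer orbitals for each electron label (Eqs. 8.9b,c)
phi_a1, phi_a2 = a1 + lam*b1, a2 + lam*b2
phi_b1, phi_b2 = b1 + mu*a1, b2 + mu*a2

# Symmetric spatial part of the CF singlet: phi_a(1)phi_b(2) + phi_a(2)phi_b(1)
cf = sp.expand(phi_a1*phi_b2 + phi_a2*phi_b1)

# Covalent part weighted (1 + lam*mu), ionic parts weighted 2*mu and 2*lam
target = sp.expand((1 + lam*mu)*(a1*b2 + a2*b1) + 2*mu*a1*a2 + 2*lam*b1*b2)
print(sp.simplify(cf - target) == 0)  # True
```

The determinantal bookkeeping of the spin part is suppressed here; it only fixes which spatial products correspond to which Slater determinants.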
Thus, the Coulson–Fischer representation keeps the simplicity of the
covalent picture while treating the covalent/ionic balance by embedding
the effect of the ionic terms in a variational way, through the delocalization
tails of the VB orbitals. The Coulson–Fischer idea was later generalized
to polyatomic molecules and gave rise to the generalized valence bond
(GVB) [7] and spin-coupled (SC) [8] methods. The advantage of using wave
functions of Coulson–Fischer type becomes obvious when one wishes to
treat all the bonds of a molecular system in a VB way. For example, the
GVB wave function representing methane with its four C−H bonds needs a
single formally covalent structure (Eq. (8.11)),
ΨGVB = |(ω1 h̄1 − ω̄1 h1)(ω2 h̄2 − ω̄2 h2)(ω3 h̄3 − ω̄3 h3)(ω4 h̄4 − ω̄4 h4)|
(8.11)
where the ωi's are the four HAOs of the carbon atom, which are singlet-coupled
to the orbitals hi of the hydrogen atoms; both ωi and hi are
localized on their respective centers while bearing small delocalization tails
to the other centers. This is a great simplification compared with the mixed
covalent/ionic wave function that possesses 81 mixed structures based on
localized orbitals.
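The count of 81 follows from simple combinatorics: each single bond expands into its 3 classical structures (one covalent plus two ionic, cf. Eq. (8.2)), so the structure count grows as a power of the number of bonds:

```python
# One covalent + two ionic classical VB structures per single bond (Eq. 8.2)
structures_per_bond = 3
n_bonds = 4  # the four C-H bonds of methane

print(structures_per_bond ** n_bonds)  # 81
```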
Letting all the orbitals in Eq. (8.11) be determined variationally leads
to four sp3-type HAOs ωi pointing in tetrahedral directions toward the
corresponding hydrogen atoms, as shown in Scheme 8.3. These HAOs,
Scheme 8.3. The four GVB hybrids ω1−ω4 of carbon in methane, each pointing
toward the corresponding hydrogen orbital h1−h4.
which come out from a variational calculation without the input of any
qualitative preconception, clearly demonstrate the validity of the universally
used hybridization model. Incidentally, this GVB wave function is much
lower in energy than the simple MO wave function with its delocalized
canonical MOs [9].
Scheme 8.4. VB structures 5 (X H3C−Y), 6 (X−CH3 Y) and 7 (X H3C⁺ Y)
describing an X/CH3/Y system.
field method (VBSCF) [10], which optimizes the coefficients CK and the
orbitals of the VB structures K simultaneously. The VBSCF method has
been implemented in ab initio codes by van Lenthe et al. [11], with effi-
cient algorithms of orbital optimization that get rid of the N! problem. The
algorithm has been further improved just recently by Wu et al. [12], and
even faster versions are currently in progress.
C1 (A• •B) + C2 (A⁻ B⁺) + C3 (A⁺ B⁻)
(8: covalent structure; 9 and 10: ionic structures)
Scheme 8.5. Representation of the AB bond by the VBSCF (a) and BOVB (b)
methods. The spectator orbitals in black may be lone pairs or bond orbitals of bonds
between A and/or B to substituents on A and B.
atoms in 8, versus ions in 9 and 10. Clearly, a better wave function would
be allowed to have different orbitals for different VB structures. Such a
wave function is represented in Scheme 8.5b, where it is seen that the
orbitals surrounding, e.g. A in 9 or B in 10, are drawn bigger than those
surrounding A and B in 8. This is the essence of the breathing-orbital
valence bond method (BOVB) [13], and this improvement, that keeps the
wave function as compact as in VBSCF, brings some dynamic correlation
that is necessary for getting accurate dissociation energies.
Another recent VB method that takes care of dynamic correlation is the
VBCI method [14]. This is a post-VBSCF approach, where the VBSCF
wave function serves as a reference wave function for the CI procedure.
Thus, excited VB structures are generated from the reference wave function
by replacing occupied (optimized VBSCF) orbitals with virtual orbitals, and
CI is performed between the reference VB structures and the excited ones.
To generate physically meaningful excited structures, the virtual orbitals are
constructed to be strictly localized, like the occupied VB orbitals. After the
CI has been done, the reference and all the excited VB structures that rep-
resent the same bonding scheme are condensed into a single structure. In this
manner, the extensive VBCI wave function is condensed to a minimal set of
fundamental structures, which ensures that VBCI keeps the VB advantage
of compactness.
A much faster variant than VBCI is the VBPT2 method [15], in which
the excited VB structures are treated by perturbation theory to second order
July 20, 2011 9:6 9in x 6in b1189-ch08 Solving the Schrodinger Equation
energy of 37.9 kcal/mol, very close to the experimental value, with only
three VB structures (8–10 in Scheme 8.5) [22]. By comparison, in the MO
framework, an MCSCF treatment must go far beyond the valence CASSCF
level, and the resulting dissociation energy oscillates between too-small and
too-large values until as many as 968 configurations are included and the
dissociation energy converges [23].
VB theory provides a clear picture of the very important concept of
electron correlation. For a single bond, the static electron correlation is
accounted for if the three VB structures are given optimized coefficients, as
in Scheme 8.5a, whereas the weights of the ionic structures are systematically
overestimated at the Hartree–Fock level. This equilibration of ionic
vs. covalent coefficients is also called left–right correlation. On the other
hand, the subtle breathing-orbital effect, by which the orbitals rearrange
in size and shape to follow the charge fluctuation in the bond (as shown
in Scheme 8.5b), is associated with dynamic correlation, more precisely
with the change in dynamic correlation that attends bond-breaking/bond-formation
of the A−B bond (also called differential dynamic correlation).
This latter term is the dominant correlation term in three-electron/two-center
(3e/2c) bonds, in which there is no left–right correlation. To illustrate
this point, consider the AB⁻ anion, which possesses a three-electron bond,
denoted [A∴B]⁻. At the Hartree–Fock level, the bonding and antibonding
orbitals σg and σu (see Scheme 8.1) are doubly and singly occupied, respectively.
It turns out that the Hartree–Fock wave function is equivalent to the
VBSCF description, i.e. a resonance between two Lewis structures, as an
expansion of the Hartree–Fock determinant would show (Eq. (8.13)):
ΨMO[A∴B]⁻ = |σg σ̄g σu| = |b b̄ a| − |a ā b|;
σg ∝ (a + b),  σu ∝ (a − b)   (8.13)
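The determinant identity of Eq. (8.13) can be checked numerically by antisymmetrizing explicit spin-orbital vectors (a sketch; the MOs are left unnormalized, so the two sides agree up to a factor of 2):

```python
import itertools
import numpy as np

# Spin-orbital basis: a(alpha), a(beta)=abar, b(alpha), b(beta)=bbar
a, abar, b, bbar = np.eye(4)

def parity(perm):
    """Sign of a permutation given as a tuple of indices."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def slater(*orbs):
    """Unnormalized Slater determinant of the given spin-orbitals, as a tensor."""
    n = len(orbs)
    psi = np.zeros((4,) * n)
    for perm in itertools.permutations(range(n)):
        term = orbs[perm[0]]
        for k in perm[1:]:
            term = np.multiply.outer(term, orbs[k])  # build the product column by column
        psi = psi + parity(perm) * term
    return psi

# Unnormalized MOs: sigma_g = a + b, sigma_u = a - b (alpha and beta versions)
sg, sgbar = a + b, abar + bbar
su = a - b

lhs = slater(sg, sgbar, su)                    # |g gbar u|
rhs = slater(b, bbar, a) - slater(a, abar, b)  # |b bbar a| - |a abar b|
print(np.allclose(lhs, 2 * rhs))  # True: factor 2 comes from the unnormalized MOs
```

Determinants with two identical columns vanish under the antisymmetrization, which is exactly why only the two Lewis-structure terms survive the multilinear expansion.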
Since the Hartree–Fock description involves the two Lewis structures
that can be drawn for this system with the right coefficients (50–50 if A and B
have equivalent electronegativities), no left–right correlation is necessary to
re-equilibrate these coefficients by CI, so that the only electron correlation
that has to be accounted for is dynamic [24]. In accord, a simple BOVB
description of the [A∴B]⁻ bond, in terms of two VB structures 11 and
12 each having their own set of optimized orbitals, contains all the physics
of 3e/2c interactions and provides accurate bonding energies [25]. Besides
the very simple picture provided by VB theory for dynamic correlation,
it should be noted that the contribution of this term to the 3e/2c bonding
energy is very large in all cases [26,27]. For example, the bonding energy
of F2, as calculated at the Hartree–Fock level, is close to zero, compared
(Structures 11 and 12: the two Lewis structures, A• :B⁻ and A:⁻ •B, of [A∴B]⁻.)
structure (e.g. 11), to the detriment of the other (12). Then the VB structures
have different energies and 11 ends up having a larger coefficient than 12. In
this symmetry-broken solution, the resonance energy is diminished relative
to the symmetry-adapted solution. Here is the dilemma: at the Hartree–Fock
level, one cannot have simultaneously good orbitals and full resonance
energy [32]. A classical remedy consists of imposing the symmetry and
doing CI. However, in many cases there is only quasi-symmetry (e.g. in
[A∴B]⁻ with A ≠ B), and in such a case there is no way to avoid the
artificial favoring of one structure over the other. As a consequence, it is
then very difficult to correct the initial deficiency by subsequent CI.
While the problem is currently solved in the MO framework with
elaborate methods such as coupled-cluster calculations using Brueckner
orbitals [33], the symmetry-breaking artefact vanishes at the BOVB level.
Indeed, as this method provides a superposition of two VB structures each
having its optimal set of orbitals, the BOVB wave function involves at the
same time both optimal orbitals and full resonance effect at any molecular
geometry, and the root cause for the symmetry-breaking disappears. It
follows that the BOVB method is, by nature, free from the symmetry-
breaking artefact. Historically, the first calculation of that kind was done by
Jackels and Davidson in 1976 for the NO2 radical [34]. A standard BOVB
calculation was later performed for the potential surface of the HOOH
anion [35].
Fig. 8.1. VBSCD for a general reaction R → P. R and P are the ground states of
reactants and products; R* and P* are promoted excited states.
between two diabatic curves, which represent the energy profiles of the VB
state curves of the reactants and products.
The nature of the R* and P* promoted states depends on the reaction
type and will be specified below using a few examples. In all cases, the
promoted state R* is the electronic image of P in the geometry of R,
while P* is the image of R at the geometry of P. The G terms are the
corresponding promotion energy gaps, B is the resonance energy of the
transition state (TS), ΔE‡ is the energy barrier, and ΔErp is the reaction
energy. The simplest expression for the barrier is given by Eq. (8.14):
ΔE‡ = f Gr − B   (8.14)
Here, the term fGr is the height of the crossing point, expressed as some
fraction (f) of the promotion gap at the reactant side (Gr).
A more explicit expression is Eq. (8.15):
ΔE‡ = f0 G0 − B + 0.5 ΔErp   (8.15)
which shows the effects of the two promotion gaps and f factors through
their average quantities, G0 and f0.
Equation (8.15) expresses the barrier as a balance of the contributions of
an intrinsic term, f0G0 − B, and a driving-force term, 0.5 ΔErp. The model
is general and has been described in detail before [3,4,37], and applied to a
large number of reactions of different types. Here we will briefly summarize
some VB computational applications on hydrogen abstraction reactions and
various SN 2 reactions.
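Before turning to these applications, the two barrier expressions above are simple to evaluate. The sketch below uses made-up values of f, G, B and ΔErp (all in kcal/mol), purely to illustrate the algebra, not numbers from the chapter:

```python
def barrier_simple(f, G_r, B):
    """Eq. (8.14): crossing-point height f*G_r minus the TS resonance energy B."""
    return f * G_r - B

def barrier_explicit(f0, G0, B, dE_rp):
    """Eq. (8.15): intrinsic term f0*G0 - B plus the driving-force term 0.5*dE_rp."""
    return f0 * G0 - B + 0.5 * dE_rp

# Hypothetical values (kcal/mol): f = 0.3, G_r = 120, B = 15
print(barrier_simple(0.3, 120.0, 15.0))            # 21.0
# Same intrinsic part, with an exothermic driving force dE_rp = -10
print(barrier_explicit(0.3, 120.0, 15.0, -10.0))   # 16.0
```

The comparison shows the qualitative content of the model directly: a larger promotion gap or a smaller resonance energy raises the barrier, while an exothermic reaction energy lowers it.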
X• + H−Y → X−H + •Y   (8.16)
Ψr = C1 (X• H−Y) + C2 (X• H⁺ :Y⁻) + C3 (X• H:⁻ Y⁺)   (8.17)
[Plot: barriers ΔE from Eq. (21) versus ab initio VB barriers, both in kcal/mol
(range 5–45); correlation R2 = 0.974.]
in which ΨVB is the fully delocalized ground state, and ΨL is the reference
Lewis structure (which, depending on the reference state, may be represented
by a single VB structure or by a group of VB structures). Equation
(8.22) has been used with ab initio VB calculations to calculate the res-
onance energies of benzene [39], butadiene [40], allyl radical and ions
[41, 42], transition states of organic reactions [3, 29, 31, 4346], and so on,
and to quantify the σ-aromaticity of cyclopropane [47]. Equation (8.22) has
also been used to calculate the resonance energy arising from the mixing
of covalent and ionic VB structures in a bond, leading to the discovery of
a new type of chemical bonding (see next subsection) [22, 48, 49].
Another technique for calculating resonance energies or delocalization
energies consists of defining the reference Lewis structure, by a so-called
block-localized wave function (BLW), in which the orbitals are doubly
occupied but optimized with some localization constraints. Thus, the
orbitals that represent, e.g., a π-bond in a conjugated system can be optimized
while being constrained to be strictly localized on the two bonded atoms,
and the orbitals that represent a lone pair are localized on a single atom. The
orbital optimization can be carried out at the HartreeFock level [50], but a
recent version using orbital optimization at the DFT level also exists [51].
The resonance energy is then calculated as the energy difference between the
BLW wave function representing the reference Lewis structure and the fully
delocalized wave function of the ground state. More generally, the BLW method
can be used to calculate delocalization energies by defining a diabatic state
in which delocalization is turned off. In this latter state, the molecule or
interacting system is partitioned into subgroups, and each localized MO
is expanded in terms of basis functions belonging to only one subgroup.
As the BLW method involves optimization of non-orthogonal orbitals, and
since the BLW wave function represents a Lewis structure, the BLW tech-
nique can be considered as belonging to the VB family, actually the simplest
VB-variant.
The above BLW method has been used to calculate the resonance
energies of many organic molecules. For example, it has been used to
quantify the role of resonance in the rotational barriers of amides [52], and
in the acidities of carboxylic acids and enols as compared to alcohols [53].
It was also used to provide accurate estimates of the vertical and adiabatic
resonance energies of benzene [54], allyl radical and anions [55], and so on.
Calculations of delocalization energies by VBSCF or BLW methods
have also been used to get accurate estimates of the magnitudes of hyper-
conjugation. This has been applied to trace the origin of Saytzeff's rule
[56], the role of hyperconjugation in the rotational barrier of ethane [57]
or in the exceptional short bond length of tetrahedranyl-tetrahedrane [58],
and so on.
Fig. 8.3. VBSCDs showing the crossing and avoided crossing of the Kekulé
structures of benzene along the bond-alternating mode b2u, for: (a) π-only curves,
(b) full σ + π curves.
Since the ΔEST value for an isolated π-bond is well over 100 kcal/mol,
Eq. (8.23) places the π-electronic system in the region of large gaps.
Consequently, the π-component of benzene is predicted by the VBSCD model
to be an unstable transition state, 1A1g(π), as illustrated in Fig. 8.3(a). This
π-transition state prefers a distorted Kekuléan geometry with bond alternation,
but is forced by the σ-frame, with its strong symmetrizing driving
force, to adopt the regular D6h geometry. This prediction, which was derived
at the time based on qualitative considerations of G in the VBSCD of
isoelectronic series [61], was later confirmed by a variety of rigorous ab initio
σ−π separation methods [62]. The prediction was further linked [63] to
experimental data associated with the vibrational frequencies of the excited
states of benzene.
Here ΔE(σ+π) stands for the total (σ and π) distortion energy; the term
5.0(2n+1) represents the σ resisting effect, which is 5.0 kcal/mol for an
adjacent pair of π-bonds, whereas the negative term, −5.4(2n), accounts
for the π-distortivity. This expression predicts that for n = 7, namely the
C30H30 annulene, ΔE(σ+π) becomes negative and the annulene undergoes
bond localization. If we increase the π-distortivity coefficient by just a tiny
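The bond-localization threshold implied by this expression can be checked numerically; the 5.0 and 5.4 kcal/mol coefficients are those quoted above, and n indexes the [4n+2] annulene:

```python
def delta_E_total(n, sigma=5.0, pi=5.4):
    """Total distortion energy of a [4n+2] annulene:
    sigma resistance 5.0(2n+1) minus pi distortivity 5.4(2n), in kcal/mol."""
    return sigma * (2 * n + 1) - pi * (2 * n)

for n in range(1, 9):
    print(f"n = {n}  C{4*n + 2}H{4*n + 2}  dE = {delta_E_total(n):+.1f} kcal/mol")
# dE first turns negative at n = 7 (C30H30), which therefore localizes its bonds
```

With such a small margin (about −0.6 kcal/mol at n = 7), it is clear why a tiny change of the π-distortivity coefficient can shift the predicted localization threshold.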
To get better DMC trial functions, several workers had the idea to use
trial wave functions related to VB theory [76, 77], expecting two advan-
tages: (i) for the same amount of electron correlation being taken into
account, VB expansions can be more compact than MCSCF ones; (ii)
because VB orbitals are generally localized on one or two centers, a VB-based
trial wave function could be cheaper than a trial wave function
of the same expansion length based on MOs delocalized over the entire
molecule.
Lester et al. [77] calculated the BDE for the acetylenic C−H bond by
performing a DMC calculation using a trial BOVB wave function with a
polarized triple-ζ basis set of Slater orbitals. The accuracy is excellent,
with a C−H BDE of 132.4 ± 0.9 kcal/mol, practically equivalent to the
recommended experimental value of 132.8 ± 0.7 kcal/mol. These values
are to be compared with DMC results obtained with single-determinant
trial wave functions, using Hartree–Fock orbitals (137.5 ± 0.5 kcal/mol)
and local spin density (LDA) Kohn–Sham orbitals (135.6 ± 0.5 kcal/mol).
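The quality of the three trial functions can be summarized by their deviations from the recommended experimental value (the numbers are exactly those quoted above):

```python
exp = 132.8  # recommended experimental acetylenic C-H BDE, kcal/mol

results = {
    "BOVB trial function": 132.4,
    "Hartree-Fock trial function": 137.5,
    "LDA Kohn-Sham trial function": 135.6,
}

for name, bde in results.items():
    print(f"{name}: error {bde - exp:+.1f} kcal/mol")
```

The BOVB-based trial function is the only one whose error (−0.4 kcal/mol) lies within the combined error bars of theory and experiment.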
Very recently, Goddard et al. [78] used simple GVB wave functions
as guess functions for DMC calculations, and applied this approach to the
adiabatic singlet-triplet splitting in methylene, the vertical and adiabatic
singlet-triplet splitting in ethylene, [2+2] cycloaddition, and Be2 bond
breaking. In all these cases, this approach was accurate within a few tenths
of a kcal/mol. Less accurate results, however, were found for the very
difficult test case of the N→V transition energy of ethylene, for which
dynamic correlation is crucially important.
With the very recent (and actually on-going) progress of QMC algo-
rithms, trial wave functions of VBSCF quality yield results as accurate as
former BOVB trial functions. This improvement should allow quite large
systems to be treated by VB-QMC methods, with possibly up to 100 VB
structures, with an accuracy close to experimental error bars. Moreover,
work is in progress to use QMC methods to perform calculation of VB
type, allowing calculations of the weights of VB structures, and calcula-
tions of individual VB structures, i.e. diabatic states [79].
Last but not least, a further advantage of QMC methods, and soon of
VB-QMC methods, is that QMC algorithms scale as N^3 (N being the
number of electrons), and possibly as N^2 in the near future [79].
8.3.5. Prospective
The question that may be posed at this point is whether VB theory will
ever return as a mainstream method that chemists use as a conceptual
framework for chemistry, notwithstanding the saddening fact that, for
historical reasons, VB is no longer taught in quantum chemistry courses as
a mainstream method. However, the recent monograph written on VB
theory [3] may ease the way for those who are willing to try to teach
or study elements of VB theory. Once this wall is broken, chemists will
find a beautiful theory, which can easily be incorporated into a thought
process.
This short chapter has intended to show that ab initio VB theory has
enjoyed impressive progress during the past two or three decades. As a
result, ab initio VB algorithms are today much faster than they used to
be, and by several orders of magnitude. Moreover, modern VB has also
reasonably accurate computational methods, which can provide bonding
energies and reaction barriers with accuracies comparable to sophisticated
MO-CI methods. This has been achieved by incorporating dynamic cor-
relation effects in the VB calculations, and this without complicating the
wave functions that remain compact and easily interpretable. In addition,
the modern VB methods can be combined with a solvent model, thereby
providing a method that can handle molecules and reactions in solution and in
proteins. Thus, from a quantitative point of view, VB theory today enables
calculations on real chemical problems for organic molecules, as well
as molecules that contain transition metals, and all of these can be done in the
gas phase or in solution. Further improvements in speed and capabilities of
VB methods are currently in progress. Especially promising are the VBPT2
method, which is emerging as a fast and accurate method, and the combination
of VB and QMC methods, which may make it possible, in the future, to handle
much larger systems than those presented here.
Another aspect of VB theory that is emphasized in this chapter is insight.
Thus, despite the sophistication and accuracy of the above VB methods, all
of them rely on a compact wave function that includes a minimal number of
structures in the VB-structure set. The insight of this compact wave function
is projected by a set of applications including bonding in main group ele-
ments, quantitative evaluation of common paradigms such as resonance
energies, hyperconjugation, aromaticity and antiaromaticity in conjugated
systems, distortivity of the π-system of benzene and related molecules
[65], and general models of chemical reactivity. Many other applications
e.g. to photochemistry, excited states, polyradicals, etc. have appeared in a
recent monograph [3]. VB theory also provides a great deal of insight into
Other than the GVB method, which is by now implemented in many packages,
here are brief descriptions of the main VB software packages that we are
aware of and with which we have had some experience, to varying degrees.
Bibliography
(b) P.C. Hiberty, S. Humbel, J.H. van Lenthe, and C.P. Byrman, J. Chem. Phys. 101,
5969 (1994).
(c) P.C. Hiberty and S. Shaik, Theor. Chem. Acc. 108, 255 (2002).
[14] (a) W. Wu, L. Song, Z. Cao, Q. Zhang, and S. Shaik, J. Phys. Chem. A 106, 2721
(2002).
(b) L. Song, W. Wu, Q. Zhang, and S. Shaik, J. Comput. Chem. 25, 472 (2004).
[15] Z. Chen, J. Song, S. Shaik, P.C. Hiberty, and W. Wu, J. Phys. Chem. A 113, 11560
(2009).
[16] L. Song, W. Wu, Q. Zhang, and S. Shaik, J. Phys. Chem. A 108, 6017 (2004).
[17] (a) C.J. Cramer and D.G. Truhlar, Chem. Rev. 99, 2161 (1999).
(b) C.P. Kelly, C.J. Cramer, and D.G. Truhlar, J. Chem. Theory Comput. 1, 1133
(2005).
(c) A.V. Marenich, R.M. Olson, C.P. Kelly, C.J. Cramer, and D.G. Truhlar, J. Chem.
Theory Comput. 3, 2011 (2007).
[18] P. Su, W. Wu, C.P. Kelly, C.J. Cramer, and D.G. Truhlar, J. Phys. Chem. A 112, 12761
(2008).
[19] (a) A. Shurki and H.A. Crown, J. Phys. Chem. B 109, 23638 (2005).
(b) A. Sharir-Ivry, H.A. Crown, W. Wu and A. Shurki, J. Phys. Chem. A 112, 2489
(2008).
[20] The bonding energy is found negative at the Hartree–Fock/6-31G level,
−33.4 kcal/mol at an interatomic distance of 1.43 Å (see Ref. 13c).
[21] K.P. Huber and G. Herzberg, Molecular Spectra and Molecular Structure. IV.
Constants of Diatomic Molecules (Van Nostrand Reinhold, New York, 1979).
[22] S. Shaik, D. Danovich, B. Silvi, D. Lauvergnat, and P.C. Hiberty, Chem. Eur. J. 11,
6358 (2005) (see the Supporting Information Document).
[23] M.V. Rama Krishna and K.D. Jordan, Chem. Phys. 115, 405 (1987).
[24] P.C. Hiberty, S. Humbel, D. Danovich, and S. Shaik, J. Amer. Chem. Soc. 117, 9003
(1995).
[25] P.C. Hiberty, S. Humbel, and P. Archirel, J. Phys. Chem. 98, 11697 (1994).
[26] T. Clark, J. Amer. Chem. Soc. 110, 1672 (1988).
[27] P.M.W. Gill and L. Radom, J. Amer. Chem. Soc. 110, 4931 (1988).
[28] L. Song, W. Wu, P.C. Hiberty, D. Danovich, and S. Shaik, Chem. Eur. J. 9, 4540
(2003).
[29] L. Song, W. Wu, P.C. Hiberty, and S. Shaik, Chem. Eur. J. 12, 7458 (2006).
[30] P. Su, F. Ying, W. Wu, P.C. Hiberty, and S. Shaik, Chem. Phys. Chem. 8, 2603
(2007).
[31] (a) S. Shaik, W. Wu, K. Dong, L. Song, and P.C. Hiberty, J. Phys. Chem. A 105, 8226
(2001).
(b) L. Song, W. Wu, K. Dong, P.C. Hiberty, and S. Shaik, J. Phys. Chem. A 106,
11361 (2002).
(c) P. Su, L. Song, W. Wu, P.C. Hiberty, and S. Shaik, J. Amer. Chem. Soc. 126, 13539
(2004).
[32] P.-O. Löwdin, Rev. Mod. Phys. 35, 496 (1963).
[33] K.A. Brueckner, Phys. Rev. 96, 508 (1954).
[34] C.F. Jackels and E.R. Davidson, J. Chem. Phys. 64, 2908 (1976).
[35] S. Humbel, I. Demachy, and P.C. Hiberty, Chem. Phys. Lett. 247, 126 (1995).
[36] S.S. Shaik, J. Amer. Chem. Soc. 103, 3692 (1981).
[37] (a) S. Shaik and P.C. Hiberty, in Adv. Quant. Chem., Vol. 26, edited by P.-O. Löwdin,
(Academic Press, 1995).
(b) A. Pross, Theoretical and Physical Principles of Organic Reactivity (Wiley-
Interscience, New York, 1995).
(c) S. Shaik and A. Shurki, Angew. Chem. Int. Ed. 38, 586 (1999).
(d) S. Shaik, Phys. Chem. Chem. Phys. 12, 8706 (2010).
[38] (a) S. Shaik, D. Kumar, and S.P. de Visser, J. Amer. Chem. Soc. 130, 10128, erratum
p. 14016 (2008).
(b) S. Shaik, W. Lai, H. Chen, and Y. Wang, Acc. Chem. Res. 43, 1154 (2010). In
these papers, an f factor of 0.3 was used together with vertical bond energies, i.e. not
including the geometrical and electronic relaxations of the dissociated radicals.
[39] J.M. Norbeck and G.A. Gallup, J. Amer. Chem. Soc. 96, 3386 (1974).
[40] A.F. Voter and W.A. Goddard III, J. Amer. Chem. Soc. 108, 2830 (1986).
[41] Y.R. Mo, Z.Y. Lin, W. Wu, and Q.N. Zhang, J. Phys. Chem. 100, 6469 (1996).
[42] M. Linares, S. Humbel, and B. Braida, J. Phys. Chem. A 112, 13249 (2008).
[43] G. Sini, G. Ohanessian, P.C. Hiberty, and S.S. Shaik, J. Amer. Chem. Soc. 112, 1407
(1990).
[44] P. Maître, F. Volatron, P.C. Hiberty, and S.S. Shaik, Inorg. Chem. 29, 3047 (1990).
[45] S.S. Shaik, E. Duzy, and A. Bartuv, J. Phys. Chem. 94, 6574 (1990).
[46] S. Shaik and A.C. Reddy, J. Chem. Soc., Faraday Trans. 90, 1631 (1994).
[47] W. Wu, B. Ma, J.I.-C. Wu, P.v.R. Schleyer, and Y.R. Mo, Chem. Eur. J. 15, 9730
(2009).
[48] (a) S. Shaik, P. Maître, G. Sini, and P.C. Hiberty, J. Amer. Chem. Soc. 114, 7861
(1992).
(b) P.C. Hiberty, C. Megret, L. Song, W. Wu, and S. Shaik, J. Amer. Chem. Soc. 128,
2836 (2006).
(c) P.C. Hiberty, R. Ramozzi, L. Song, W. Wu, and S. Shaik, Faraday Discuss. 135,
261 (2007).
(d) L. Zhang, F. Ying, W. Wu, P.C. Hiberty, and S. Shaik, Chem. Eur. J. 15, 2979
(2009).
[49] (a) W. Wu, J. Gu, J. Song, S. Shaik, and P.C. Hiberty, Angew. Chem. Int. Ed. 48, 1407
(2009).
(b) S. Shaik, D. Danovich, W. Wu, and P.C. Hiberty, Nature Chem. 1, 443 (2009).
(c) S. Shaik, Z. Chen, W. Wu, A. Stanger, D. Danovich, and P.C. Hiberty, Chem.
Phys. Chem. 10, 2658 (2009).
[50] Y. Mo and S.D. Peyerimhoff, J. Chem. Phys. 109, 1687 (1998).
[51] Y. Mo, L. Song, and Y. Lin, J. Phys. Chem. A 111, 8291 (2007).
[52] D. Lauvergnat and P.C. Hiberty, J. Amer. Chem. Soc. 119, 9478 (1997).
[53] P.C. Hiberty and C.P. Byrman, J. Amer. Chem. Soc. 117, 9875 (1995).
[54] (a) Y. Mo, J. Phys. Chem. A 113, 5163 (2009).
(b) Y. Mo and P.v.R. Schleyer, Chem. Eur. J. 12, 2009 (2006).
[55] Y.R. Mo, L.C. Song, and Y.C. Lin, J. Phys. Chem. A 111, 8291 (2007).
[56] B. Braida, V. Prana, and P.C. Hiberty, Angew. Chem. Int. Ed. 48, 5724 (2009).
[57] (a) Y. Mo, W. Wu, L. Song, M. Lin, Q. Zhang, and J. Gao, Angew. Chem. Int. Ed. 43,
1986 (2004).
(b) Y. Mo and J. Gao, Acc. Chem. Res. 40, 113 (2007).
[58] Y. Mo, Org. Lett. 8, 535 (2006).
[59] P. Su, L. Song, W. Wu, S. Shaik, and P.C. Hiberty, J. Phys. Chem. A 112, 2988
(2008).
[60] (a) S.P. de Visser, D. Danovich, W. Wu, and S. Shaik, J. Phys. Chem. A 106, 4961
(2002).
(b) D. Danovich, W. Wu, and S. Shaik, J. Amer. Chem. Soc. 121, 3165 (1999).
(c) D. Danovich and S. Shaik, J. Chem. Theory Comput. 6, 1479 (2010).
[61] S.S. Shaik and R. Bar, Nouv. J. Chim. 8, 411 (1984).
[62] (a) S.S. Shaik, P.C. Hiberty, J.-M. Lefour, and G. Ohanessian, J. Amer. Chem. Soc.
109, 363 (1987).
(b) S.S. Shaik, P.C. Hiberty, G. Ohanessian, and J.-M. Lefour, J. Phys. Chem. 92,
5086 (1988).
(c) P.C. Hiberty, D. Danovich, A. Shurki, and S. Shaik, J. Amer. Chem. Soc. 117,
7760 (1995).
[63] Y. Haas and S. Zilberg, J. Amer. Chem. Soc. 117, 5387 (1995).
[64] E.C. da Silva, J. Gerratt, D.L. Cooper, and M. Raimondi, J. Chem. Phys. 101, 3866
(1994).
[65] S. Shaik, A. Shurki, D. Danovich, and P.C. Hiberty, Chem. Rev. 101, 1501 (2001).
[66] F. Prosser and S. Hagstrom, Int. J. Quant. Chem. 2, 89 (1968).
[67] J. Verbeek and J.H. van Lenthe, Int. J. Quant. Chem. 40, 201 (1991).
[68] (a) X. Li and Q. Zhang, Int. J. Quant. Chem. 36, 599 (1989).
(b) R. McWeeny, Int. J. Quant. Chem. 34, 25 (1988).
[69] (a) J. Li and W. Wu, Theor. Chim. Acta 89, 105 (1994).
(b) J. Li, Theor. Chim. Acta 93, 35 (1996).
[70] (a) J. Li and R. Pauncz, Int. J. Quantum Chem. 62, 245 (1997).
(b) J. Li, Theor. Chim. Acta 93, 35 (1996).
(c) J. Li, J. Math. Chem. 17, 295 (1995).
(d) J. Li and W. Wu, Theor. Chim. Acta 89, 105 (1994).
[71] (a) W. Wu, A. Wu, Y. Mo, and Q. Zhang, Science in China (English Ed.) B39, 35
(1996).
(b) W. Wu, A. Wu, Y. Mo, M. Lin, and Q. Zhang, Int. J. Quant. Chem. 67, 287 (1998).
[72] L. Song, J. Song, Y. Mo, and W. Wu, J. Comput. Chem. 30, 399 (2009).
[73] W. Wu, personal communication.
[74] (a) B.L. Hammond, W.A. Lester Jr., and P.J. Reynolds, World Scientific Lecture and
Course Notes in Chemistry, Vol. 1 (World Scientific, Singapore, 1994).
(b) J.B. Anderson, in Reviews in Computational Chemistry, Vol. 13, edited by K.B.
Lipkowitz and D.B. Boyd, (John Wiley & Sons, New York, 1999) pp. 133-182.
(c) D.M. Ceperley and L. Mitas, in New Methods in Computational Quantum
Mechanics, Vol. 93, edited by I. Prigogine and S.A. Rice, (John Wiley & Sons,
New York, 1996) pp. 1-38.
(d) W.M.C. Foulkes, L. Mitas, R.J. Needs, and G. Rajagopal, Rev. Mod. Phys. 73, 33
(2001).
[75] R.N. Barnett, Z. Sun, and W.A. Lester Jr., J. Chem. Phys. 114, 2013 (2001).
[76] (a) M. Casula, C. Attaccalite, and S. Sorella, J. Chem. Phys. 121, 7110 (2004).
(b) S. Sorella, M. Casula, and D. Rocca, J. Chem. Phys. 127, 014105 (2007).
[77] D. Domin, B. Braida, and W.A. Lester Jr., J. Phys. Chem. A 112, 8964 (2008).
[78] A.G. Anderson and W.A. Goddard III, J. Chem. Phys. 132, 164110 (2010).
[79] B. Braida, personal communication.
Chapter 9
9.1. Introduction
Very often, the first step carried out in quantum chemistry is the opti-
mization of a wave function describing the particular system under study.
With this available, one could then estimate properties such as binding
energy, electron density moments, and so on.
parameter set p. The first step of the general Monte Carlo approach is to
rewrite the above equation in a way that can be recognized as identical
to the calculation of a general expectation value of a position-dependent
observable over the coordinate space in statistical physics, i.e.
$$\langle O\rangle = \int d\mathbf{R}\,O(\mathbf{R})\,P(\mathbf{R}).\qquad(9.2)$$
If we assume for the time being that such a device is indeed available
(vide infra for possible approaches), an estimate for the expectation value
⟨O_loc(p)⟩ is provided by
$$\langle O_{\rm loc}(\mathbf{p})\rangle \simeq \frac{1}{M}\sum_{j=1}^{M} O_{\rm loc}(\mathbf{R}_j,\mathbf{p}),\qquad(9.5)$$
where M is the total number of points sampled from the distribution
P_T(R, p). Such an estimate converges to the exact expectation value
as M → ∞. Normally, numerical integration
approaches based on a discrete grid of points, such as the generalized
trapezoidal rule, have a well-defined error term that depends systematically
on the discretization interval of the grid, and their convergence to the
exact result is monotonic. This is not so for statistical methods, whose
convergence fluctuates due to the underlying stochastic
machinery.
If it is reasonable to assume that all the sampled points {R_j} drawn from
P_T(R, p) are statistically independent, it is possible to associate a statistical
uncertainty (or standard error) with the average value just computed. This reads
$$\sigma(O) = \sqrt{\frac{\langle O_{\rm loc}^{2}\rangle - \langle O_{\rm loc}\rangle^{2}}{M-1}},\qquad(9.6)$$
where it is understood that the average values are computed with respect to
the distribution P_T(R, p), and ⟨O_loc²⟩ − ⟨O_loc⟩² represents the variance of
the local operator O_loc(R, p) over the same distribution.
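The sample mean of Eq. (9.5) and the standard error of Eq. (9.6) are simple to compute in practice. The following Python sketch assumes a hypothetical local observable `o_loc` and a hypothetical `sampler` routine standing in for the distribution P_T; neither comes from the chapter.

```python
import math
import random

def mc_expectation(o_loc, sampler, m):
    """Estimate <O> via Eq. (9.5) and its standard error via Eq. (9.6).

    o_loc   -- local value O_loc(R) at a sampled point (hypothetical)
    sampler -- draws one point R distributed as P_T (hypothetical)
    m       -- number of (assumed independent) samples
    """
    values = [o_loc(sampler()) for _ in range(m)]
    mean = sum(values) / m
    # <O_loc^2> - <O_loc>^2, the variance over the sampled distribution
    var = sum((v - mean) ** 2 for v in values) / m
    return mean, math.sqrt(var / (m - 1))

# Toy check: for x ~ N(0, 1) the expectation of x^2 is exactly 1.
random.seed(0)
mean, err = mc_expectation(lambda x: x * x,
                           lambda: random.gauss(0.0, 1.0), 20000)
```

Note that the error bar shrinks only as 1/√M, independently of dimension, which is the stochastic counterpart of the grid-rule error terms discussed above.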
Before turning to the task of distributing points according to P_T(R, p),
it is interesting to notice a few formal properties associated with the
calculation of the expectation value of the Hamiltonian. In the limit
Ψ_T(R, p) = Ψ_0(R), with Ψ_0(R) being the exact wave function, and for
O_loc(R, p) = H_loc(R, p), the local operator no longer depends on the
position in configuration space and assumes a constant value, the energy
eigenvalue E_0. In this case, the variance of the local operator vanishes and
one reaches infinite precision in the estimate provided above. It should
also be evident that the same property, usually called the zero-variance prin-
ciple, applies to any operator that commutes with the Hamiltonian, although
there are not many such operators. This makes the approach
described above more and more efficient the closer
Ψ_T(R, p) is to Ψ_0(R).
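The zero-variance principle admits a one-line illustration (my example, not taken from the chapter): for the hydrogen atom with trial function Ψ_T = e^{−αr}, the local energy in atomic units is E_loc = −α²/2 + (α − 1)/r. At α = 1 this is the constant eigenvalue −1/2 at every point, whereas any other α leaves a position-dependent, fluctuating E_loc.

```python
def local_energy(alpha, r):
    """E_loc = (H psi)/psi for psi = exp(-alpha*r), hydrogen atom (a.u.).

    For an s-type exponential, laplacian(psi)/psi = alpha**2 - 2*alpha/r,
    hence E_loc = -alpha**2/2 + (alpha - 1)/r.
    """
    return -0.5 * alpha ** 2 + (alpha - 1.0) / r

radii = (0.1, 1.0, 5.0)
exact = [local_energy(1.0, r) for r in radii]    # constant: always -0.5
approx = [local_energy(0.8, r) for r in radii]   # fluctuates with r
```

With the exact wave function every sampled point returns the eigenvalue, so the Monte Carlo error bar collapses to zero regardless of the sample size.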
For non-commuting operators, the property just introduced does not
apply; it is nevertheless possible to improve the statistical accuracy asso-
ciated with the estimate of an expectation value by implementing an idea
9.2.1.2. Sampling of PT
In this section, we will dwell for a while on a few approaches that can
be used to generate a sample of points in configuration space distributed
according to P_T(R, p). To make sure that the concept of being distributed
according to a specific distribution is as clear as possible, we assume for the time
being that our probability density is one-dimensional and that it differs
from zero only over a finite interval on the real axis.
If we discretize such an interval into subintervals of width h and collect the
number of points generated during our sampling that fall inside a spe-
cific subinterval, we could build a histogram of the frequency, or prob-
ability, of hitting a region on the definition interval. In the limit of an
infinite number of points sampled and of infinitesimally small subintervals,
the frequency of falling in any subinterval would become proportional to
its width so that one could define the value of the associated probability
distribution as
$$p(x) = \lim_{\substack{M\to\infty\\ \Delta x\to 0}}\frac{M(x,\,x+\Delta x)}{M\,\Delta x}.\qquad(9.9)$$
Here, M(x, x + Δx) is the number of points that have fallen in the interval [x, x + Δx]
(see Fig. 9.1), and the limit is taken in such a way that the number of samples
largely exceeds 1/Δx.
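At finite M and Δx, this limiting construction is exactly what a histogram computes. A minimal sketch, purely illustrative:

```python
import random

def histogram_density(samples, lo, hi, n_bins):
    """Estimate p(x) on [lo, hi) as M(x, x + dx) / (M * dx), Eq. (9.9)."""
    dx = (hi - lo) / n_bins
    counts = [0] * n_bins
    for s in samples:
        if lo <= s < hi:
            counts[int((s - lo) / dx)] += 1
    m = len(samples)
    return [c / (m * dx) for c in counts]

# Uniform variates on [0, 1): the estimated density should be close to 1.
random.seed(0)
density = histogram_density([random.random() for _ in range(100000)],
                            0.0, 1.0, 10)
```

The per-bin fluctuations shrink as the sample grows, mirroring the condition that the number of samples must largely exceed 1/Δx.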
In discussing the approaches commonly used to sample points in 3N
dimensions, we start noticing that the analytical form of a distribution can
be quite complex (e.g. it may contain several maxima). Thus, standard trans-
formation approaches [6] that start with random numbers {ξ} distributed
according to a given distribution and transform them into a new set
{f(ξ)} by means of an auxiliary function f(x) are not suitable for this
task. Unfortunately, this is so despite the advantageous fact that the new
set would contain independent variates (another name for random numbers)
whenever independent objects compose the original one.
The general framework used to produce a sample of points is based on
the theory of Markov chains (see Box 9.1), whose main idea is to iteratively
displace points in configuration space with an appropriate displacement
rule so that in the limit of many iterations the arbitrarily chosen initial
configurations will end up being distributed according to P_T(R, p). The
displacement rule T(R → R′) gives the probability that a system
in position R will jump to position R′. As discussed in Box 9.1, the
stochastic transition matrix can be written as the product of a displacement
probability D(R → R′) and an acceptance probability A(R → R′).
In this way, the overall jump R → R′ is broken down into a part that
can be interpreted as a random displacement with arbitrary distribution and
a correction (acceptance) step that eliminates the bias introduced by a non-
optimal choice for D(R → R′).
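A one-dimensional sketch of such a chain, with a symmetric uniform displacement D and a Metropolis-style acceptance A, might look as follows (an illustration of the scheme, not the chapter's implementation):

```python
import math
import random

def metropolis_sample(log_p, x0, n_steps, step=1.0, seed=0):
    """Markov chain sampling of the density p(x) = exp(log_p(x))/Z.

    Only the ratio p(x')/p(x) enters the acceptance test, so the
    normalization Z is never needed -- a key practical advantage.
    """
    rng = random.Random(seed)
    xs, x = [], x0
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)   # displacement D(x -> x')
        # acceptance A(x -> x') = min(1, p(x')/p(x))
        if rng.random() < min(1.0, math.exp(log_p(x_new) - log_p(x))):
            x = x_new
        xs.append(x)   # a rejected move repeats the current point
    return xs

# Sample psi_T^2 proportional to exp(-x^2): mean 0, variance 1/2.
xs = metropolis_sample(lambda x: -x * x, 0.0, 50000)
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
```

Note that successive points are correlated, which is precisely the issue raised in the discussion that follows.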
Is the choice of the analytical form for D(R → R′) and A(R → R′)
inconsequential? Provided D(R → R′) guarantees that all the states available
to the system can be visited with non-zero probability within a finite, albeit very
large, number of steps, the form of the displacement matrix controls only the
efficiency of the exploration. Consequently, it also defines the quality of
the statistical properties associated with the averages to be computed.
To comprehend why this happens, one must remember that the
sequential strategy employed to generate samples distributed according to
PT (R, p) does not produce independent configurations, but rather points in
quantum state has sufficient flexibility. However, both the cost of selecting
an appropriate parameter set and the quality of the results depend on the
strategy employed as described in the next sections.
throwing darts randomly at the target: the more spread out the darts are, the
more of them are needed to hit the maximum score.
In practical calculations, the distribution of configurations is usually
chosen as the square of a trial wave function with guessed parameters;
after a few minimization steps (e.g. using a conjugated gradients routine),
such a distribution is updated by running a Monte Carlo sampling of the newly
obtained Ψ_T²(R, p) in order to reduce the weight dispersion during the following
few minimization steps [11].
It is also important to notice that the variance of the Monte Carlo esti-
mator for the dispersion of the local energy may not be bounded, despite the
fact that the variance of the local energy itself is always well defined
for an electronic system. In other words, whereas ⟨H²⟩ exists, ⟨H⁴⟩ may
not. This happens when the electron-nucleus cusp conditions are not exactly
satisfied, owing to the presence of a 1/r⁴ term that diverges too quickly to be com-
pensated by the local volume element. To eliminate this difficulty, one has
to use a trial wave function that exactly satisfies all the interparticle cusp
conditions in order to compensate the divergence of the potential with an
opposite one in the local kinetic energy. The net effect of this requirement is
to forbid the use of Gaussian atomic basis sets unless local correction mea-
sures are taken [12]. Alternatively, one can favour the usage of Slater-type
basis sets [13].
As a final comment, we highlight the fact that minimizing the variance
of the local energy can also be interpreted as a fitting problem [11, 14].
Here, the parameters of the trial wave function should be chosen so as
to reduce the least-squares error between local and average energies. Under
this light, a relevant issue is the implicit assumption of normally distributed
errors made in the Monte Carlo estimator. A recent analysis of this problem
[14] has clearly identified the latter assumption as a problematic one due
to the inappropriate weighting assigned to outlier configurations. The latter
are necessarily present when optimizing molecular wave functions due to
a non-Gaussian shape of the probability distribution function for the local
energy. To correct for this difficulty, it has been proposed to minimize the
average absolute deviation of the local energy from its average,
$$\frac{1}{M}\sum_{i=1}^{M}\bigl|E_{\rm loc}(\mathbf{R}_i,\mathbf{p}) - \bar{E}_{\rm loc}(\mathbf{p})\bigr|,\qquad(9.16)$$
a choice that implicitly assumes a distribution for the values of the local
energy ∝ exp[−|E_loc(R, p) − Ē_loc(p)|] (i.e. with a slowly decaying tail at
low energy).
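The robustness argument can be made concrete: a single outlier configuration inflates a least-squares (variance) objective far more than the absolute-deviation objective of Eq. (9.16). A small sketch with made-up local-energy values:

```python
def mad_objective(e_loc):
    """Mean absolute deviation of the local energy, as in Eq. (9.16)."""
    e_bar = sum(e_loc) / len(e_loc)
    return sum(abs(e - e_bar) for e in e_loc) / len(e_loc)

def var_objective(e_loc):
    """Least-squares counterpart: the variance of the local energy."""
    e_bar = sum(e_loc) / len(e_loc)
    return sum((e - e_bar) ** 2 for e in e_loc) / len(e_loc)

clean = [-0.50, -0.49, -0.51, -0.50]   # well-behaved configurations
spoiled = clean + [-3.0]               # one outlier configuration
var_blowup = var_objective(spoiled) / var_objective(clean)
mad_blowup = mad_objective(spoiled) / mad_objective(clean)
# The squared-deviation objective is inflated by a far larger factor,
# so the optimizer is dragged around by rare outlier configurations.
```

This is the quantitative reason why the absolute-deviation objective gives outliers a more appropriate, lighter weight during the parameter optimization.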
$$\frac{1}{\Psi}\frac{\partial^{2}\Psi}{\partial p^{2}} = \frac{\partial^{2}\ln\Psi}{\partial p^{2}} + \left(\frac{\partial\ln\Psi}{\partial p}\right)^{2},\qquad(9.18)$$
with p representing a generic parameter of the wave function. This set
of equations clearly paved the way for the use of Newton's method
in the search for the minimum in parameter space. In this respect, the
great advantage provided by a quadratic method such as Newton's is its
intrinsic ability to define the vector of displacements in parameter
In the most general case, the correct spatial symmetry can be enforced
using the Young tableaux [19], which for chosen N↑ and N↓ values allow
one to build the minimum number of permutation operators that satisfy the
antisymmetry requirement for identical-spin electrons in the absence of both
internal and external magnetic fields. Unfortunately, the cost of evaluating
all the permuted terms scales as
$$2^{\min(N_{\uparrow},\,N-N_{\uparrow})}\,\frac{N_{\uparrow}(N_{\uparrow}-1)}{2}\,\frac{N_{\downarrow}(N_{\downarrow}-1)}{2},\qquad(9.21)$$
thus making this approach fairly time consuming even though it remains
very general. In this respect, it should come as no surprise that it has been
found feasible only for small or exotic systems containing up to five
leptons [20-24], for which, however, the results have challenged and occa-
sionally superseded more standard molecular orbital approaches.
For larger systems, it has now become customary to exploit as much as
possible the information provided by ab initio electronic structure codes
and to write Ψ_T as
$$\Psi_T = e^{J}\sum_{l=1}^{N_{\rm conf}} c_{l}\,\det[\phi^{\uparrow}_{i,l}]\,\det[\phi^{\downarrow}_{j,l}].\qquad(9.22)$$
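The structure of this Jastrow-Slater expansion, Eq. (9.22), can be made concrete with a schematic Python evaluation. The 2×2 orbital-value matrices and coefficients below are hypothetical placeholders, only meant to show the shape of the expression:

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix of orbital values phi_i(r_j)."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def psi_t(jastrow, configs):
    """psi_T = exp(J) * sum_l c_l det[phi_up_l] det[phi_dn_l], Eq. (9.22).

    configs: list of (c_l, up_matrix, dn_matrix) triples built from the
    orbitals of an ab initio calculation, evaluated at the electron
    positions (hypothetical values in this sketch).
    """
    ci_sum = sum(c * det2(up) * det2(dn) for c, up, dn in configs)
    return math.exp(jastrow) * ci_sum

# One configuration, orthonormal orbital values, no Jastrow: psi_T = 1.
configs = [(1.0, [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])]
value = psi_t(0.0, configs)
```

The expensive objects are the determinants; real codes therefore update them incrementally as single electrons are moved rather than recomputing them from scratch.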
To set the stage for introducing the diffusion Monte Carlo method, we feel
it is important to reiterate a few key ideas about what has been described
in the preceding sections. First, VMC allows one complete freedom in
choosing a trial wave function and in estimating the corresponding average
values of the properties. Second, the results still depend on the chosen TWF
despite the increased freedom of choice. Thus, one may strive to invent
methods that correct TWF deficiencies in a systematic way, as
is done for Hartree-Fock TWFs using Configuration Interaction or CC
theories.
In the realm of MC techniques, one can get much better results, in
principle exact ones, by means of the diffusion Monte Carlo method. The
latter is capable of projecting out the excited state components from any
TWF and of sampling the exact (or quasi-exact) ground state by exploiting
a mathematical similarity between the Schrodinger equation and the generalized
diffusion equation, first noticed by Fermi in 1945.
9.3.1. Generalities
The original derivation of the diffusion Monte Carlo method (DMC) [25]
was based on the observation that the time-dependent Schrodinger equation
in imaginary time τ = it,
$$\frac{\partial \Psi}{\partial \tau} = \frac{1}{2}\nabla^{2}\Psi - V(\mathbf{R})\Psi,\qquad(9.23)$$
is analogous to a generalized diffusion equation in the presence of sinks
and sources,
$$\frac{\partial C}{\partial t} = D\,\nabla^{2}C - k(\mathbf{R})\,C,\qquad(9.24)$$
assuming a diffusion constant D = 1/2 and a rate constant k dependent
on the position R. In a diffusion process, molecules move around randomly,
weights along the simulation diverges: in the long time limit, a walker whose
weight has exponentially grown dominates the rest of the sample. To avoid
that, the reweighting process is replaced by a stochastic birth-death process
generating n copies, with n = int[ξ + w], where ξ is a uniform random
number from the interval [0,1]. Walkers experiencing an unfavourable
potential are deleted, while n copies of those feeling a favourable potential
are replicated. The fluctuation of the walker number introduces a bias that
can be avoided introducing a minimal stochastic reconfiguration of the
population [27].
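The birth-death rule n = int[ξ + w] can be sketched in a few lines (a toy illustration; production codes also apply the population-control measures of Ref. [27]):

```python
import random

def branch(walkers, seed=1):
    """One stochastic branching step: a walker of weight w yields
    n = int(xi + w) copies, xi uniform in [0, 1), so that <n> = w."""
    rng = random.Random(seed)
    new_population = []
    for config, w in walkers:
        n = int(rng.random() + w)
        new_population.extend([config] * n)   # n = 0 deletes the walker
    return new_population

# Walkers in favourable regions (w = 2.0) always duplicate, while those
# in unfavourable regions (w = 0.1) survive only about 10% of the time.
pop = branch([("good", 2.0)] * 100 + [("bad", 0.1)] * 100)
```

Since the expected number of copies equals the weight, the average is preserved while no single walker can accumulate a dominating weight.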
For potentials unbound from below, the branching process can result in
wild fluctuations, but so-called importance sampling can cure this problem.
In fact, multiplying Ψ_0 by a known trial function Ψ_T, a new mixed distri-
bution f(R, τ) = Ψ_0(R, τ)Ψ_T(R) is sampled by simulating the equation
$$\frac{\partial f(\mathbf{R},\tau)}{\partial\tau} = \frac{1}{2}\nabla^{2}f(\mathbf{R},\tau) - \nabla\cdot[\mathbf{F}(\mathbf{R})f(\mathbf{R},\tau)] - (E_{\rm loc}-E_{\rm ref})f(\mathbf{R},\tau)
= \hat{L}f(\mathbf{R},\tau) - (E_{\rm loc}-E_{\rm ref})f(\mathbf{R},\tau),\qquad(9.29)$$
where F(R) = Ψ_T⁻¹∇Ψ_T is the quantum force, E_loc = Ψ_T⁻¹ĤΨ_T is the local
energy, and L̂ is the Fokker-Planck (Smoluchowski) operator. First, a
drift term is introduced that drives the walkers towards regions where Ψ_T is
large. Second, the reweighting term now depends on the local energy instead
of on the potential. Also, the fluctuations of the local energy of a decent trial
wave function are fairly small, thanks to a cancellation of diverging terms in
the potential and the local kinetic energy. As approximate representations
for e^{τL̂} spoil the detailed balance condition, the walker move is usually
accepted with probability
form: in this way DMC enables the calculation of very accurate results, but
on a model problem.
Worth stressing is the fact that the lowest state of a given space and/or
spin symmetry can be easily computed, provided a trial wave function of
appropriate symmetry is used. For excited states of the same symmetry as lower states,
the assumption that the DMC energy is greater than or equal to the energy of
the lowest exact eigenfunction with the same symmetry as the trial function
is valid only if the trial function transforms according to a one-dimensional
irreducible representation of the symmetry group of the Hamiltonian [40].
Furthermore, each state requires an independent simulation and so it is
difficult to extract energy differences. To cope with these issues, a method
has been devised that combines DMC and the variational principle [41],
simultaneously generating matrix elements and hence many states orthogonal
to lower states. However, this method suffers from statistical noise, an
occurrence directly connected with the sign problem.
DMC simulations have, however, not yet been attempted due to its non-local
nature.
Second, Assaraf and Caffarel's approach does not take into account
that there are additional contributions to the force values that come from
the variation of the wave function with the nuclear coordinates, unless the
wave function used to represent the electronic state is exact or fully opti-
mized with respect to the average energy. Such contributions are known as
Pulay's correction [51] and are familiar within the quantum chemistry com-
munity. They can also be introduced in the QMC context, as demonstrated
by Casalegno et al. [15], who reported substantial improvements in the
average values of the forces, equilibrium distances, harmonic frequencies
and their anharmonic corrections [52]. However, this comes at the cost of
an extensive optimization of some of the parameters of the wave function
by minimizing the energy, which no longer represents a major problem as
discussed in previous sections.
As an alternative, one could try to satisfy the Hellmann-Feynman
theorem by sampling the exact electron density, i.e. obtaining Ψ_0² instead
of the mixed distribution. Even this, however, is only a partial solution
of the problem given the fact that for fermionic systems the exact nodal
surfaces are not known in advance. In turn, this means that the Ψ_0² distri-
bution is exact only with respect to the boundary condition imposed by the
particular choice of trial wave function and that the related electron density
still contains an error difficult to estimate a priori [53].
In principle, the problem of structural determination can also be
confronted without force calculations, using only energies with mod-
erate statistical uncertainties, by means of a Bayesian inference method [54]. In this
case, however, foreseeing the realm of possible applications is
made difficult by the lack of direct experience from a wide community of
researchers.
Before concluding this section, it is important to recall that the capa-
bility of computing reliable atomic forces would also pave the way for
the calculation of other energy derivatives, thus opening the chance for
a very accurate estimate of infrared intensities as well as other response
properties.
these, we can certainly list the Dirac delta operator that is required to model
spectroscopic methods such as electron spin resonance and nuclear magnetic
resonance by means of electronic structure calculations [55]. Whereas such
a task is quite simple to carry out whenever a model wave function is written
as a linear combination of molecular orbital determinants so as to require
only inexpensive post-processing of the wave function, the main issue con-
fronted by mainstream quantum chemistry is the selection of appropriate
models since it requires a substantial amount of correlation to be injected.
The requirements also include the development and use of specialized basis
sets capable of describing electron-nucleus coalescence regions with high
accuracy.
Clearly, QMC may play an important role in this arena thanks to its
intrinsic capability of building fully correlated, and yet compact, models.
However, the intrinsically stochastic nature of QMC and its discrete
representation of the electron density raise a few issues when it comes
to producing an estimate for operators such as the Dirac delta, which differs
from zero only on a vanishingly small set of points.
In this respect, many initial attempts at estimating δ(r − r_A), with r_A
being the position of a specific nucleus, addressed the issue by substituting
a sequence of simple functions that converge weakly to the correct operator
in the limit of some parameter [24, 56]. Despite some success, such
an approach is intrinsically biased due to the diverging variance of the
estimators employed in the calculation, thus limiting its application to small
systems for which simulations can always be run long enough to control
the statistical errors.
Alternatively, one could exploit the differential identity 4πδ(r) = −∇²(1/|r|)
and integrate by parts to obtain an estimator that contains only
position-dependent quantities. Among the latter, however, a term propor-
tional to 1/r² is also present, which clearly has an unbounded variance
and therefore requires the regularization approach discussed above to be
applied [57].
More recently, Chiesa et al. [58] recast an approach
developed by Ceperley and Alder [59] for computing muon-sticking prob-
abilities into a practical form for estimating the coalescence probability
of electron-positron pairs, by separating the contributions coming
from the different parts of the estimator. Given the excellent performance
demonstrated by the latter idea, Hakansson and Mella [60] adapted
the same approach in order to improve its efficiency for
computing the electron and spin densities on top of nuclei. In short, the general
idea behind this scheme is to analytically eliminate the Dirac delta for a
In this way, any usual VMC scheme devoted to sampling Ψ_T² can be
used to estimate the electron density, at least in principle, by averaging
$$\frac{f(\mathbf{r}-\mathbf{r}')\,\Psi_T^{2}(\mathbf{r}',\mathbf{r}_2,\ldots,\mathbf{r}_n)}{\Psi_T^{2}(\mathbf{r},\mathbf{r}_2,\ldots,\mathbf{r}_n)}.$$
In practice, however, a robust estimate requires the
use of an alternative sampling distribution that is strictly positive; this
is due to the fact that the numerator and denominator in the new estimator do
not have completely overlapping zero sets, which introduces diverging
behaviour in some regions of configuration space. To make the overall
scheme efficient, the alternative distribution is built by shifting the deter-
minantal part of the trial wave function, i.e. by re-using quantities that are
already computed; with the proper implementation, the new approach [60]
was found to be an order of magnitude more efficient than the one
previously implemented [57].
9.5. Conclusions
Bibliography
Chapter 10
This chapter first discusses real-space grid methods for solving the Kohn-
Sham equations of density functional theory. These approaches possess
advantages due to the relatively localized nature of the Hamiltonian
operator on a spatial grid. This computational locality and the physical
locality due to the decay of the one-particle density matrix allow for
the development of low-scaling algorithms. The localized nature of the
real-space representation leads to a drawback, however; iterative pro-
cesses designed to update the wave functions tend to stall due to the
long-wavelength components of the error. Multigrid methods aimed at
overcoming the stalling are discussed. The chapter then moves in a
different direction motivated both by 1) the relatively large computa-
tional and storage overheads of wave-function-based methods and 2)
possible new opportunities for computing based on special-purpose mas-
sively parallel architectures. Potential alternative approaches for large-
scale electronic structure are discussed that employ ideas from quantum
Monte Carlo and reduced density-matrix descriptions. Preliminary work
on a Feynman-Kac method that solves directly for the one-particle
density matrix using random walks in localized regions of space is
outlined.
10.1. Introduction
surprises have emerged. Most of the quantum simulations have been con-
ducted at the DFT level with gradient-corrected exchange-correlation func-
tionals [40]. This level of theory allows for efficient numerical solution
of the Kohn-Sham equations, but possesses a drawback too: namely, dis-
persion interactions are not properly represented [41-44]. Dispersion inter-
actions are universal and are due to electron correlation effects [45, 46].
The interactions can be non-local, and thus the near-local gradient-cor-
rected functionals cannot mimic the exact behavior. Dispersion accounts for
about 30% of the binding energy of the water dimer [44]. Computations of
the phase diagram of water with DFT simulations yield a density nearly
20% too low for liquid water at atmospheric pressure [41]. This is a rather
sobering result, considering the degree of computational effort required for
the simulations. If dispersion interactions are included at an approximate
level, the density is closer to the experimental value [42]. Nevertheless,
these results point out that we still have a long way to go in developing
accurate models of liquid water at the quantum mechanical level. Finally,
it has been shown that nuclear quantum effects are significant in water,
adding another layer of complexity [47-49]. Progress is sorely needed, and
that progress won't come purely from larger parallel computers (although
such machines may stimulate part of the progress).
The QMC method offers one possible alternative for the long term
[14-19]. In recent years, QMC has become a method of choice for accurate
quantum calculations on relatively large systems. QMC can provide predic-
tions of electronic energies that include nearly all of the electron correlation
[14-19]. Extensive algorithmic progress has been made in extending the
method to larger systems, and several software packages exist for use by
a wider range of condensed matter scientists [50-52]. The QMC method
scales more gently with system size than the other correlated electronic
structure methods (roughly as N³ or less [15]), and Monte Carlo methods
are perhaps the easiest to implement on parallel machines. Still, the compu-
tational overhead is quite large. Also, although there has been some recent
progress [53-55], there is no widely available method for computing forces
and modeling the dynamics of large systems.
This chapter presents a discussion reflecting a progression in one
author's (TLB) group from grid-based methods toward alternatives related
to the QMC approach. The chapter will first present an overview of the
development of real-space grid algorithms for DFT electronic structure
calculations. Over the last ten years, several reviews [6-8, 11-13], and
two recent texts [56, 57], have appeared that discuss the methodology and
applications of real-space approaches. These numerical methods have now
reached a fairly high level of maturity, and are being applied to large-scale
calculations in chemistry and physics.
Challenging applications have included large biological molecules
[13, 58, 59] and novel nano-structured materials [6, 13], with system sizes
of well over 1,000 atoms. These large-scale applications illustrate important
aspects of frontier problems for quantum modeling. First, the systems are
very large, and thus require efficient algorithms that do not possess severe
scaling bottlenecks. Second, they are inhomogeneous. A biological system
might consist of a peptide [13, 59] or a short DNA strand [58] solvated in
water; each of these is a very large molecule embedded in a sea of many
smaller molecules, with strong intra-molecular and molecule-water inter-
actions. A nano-structure of current interest consists of a large organic
molecule sandwiched between two conducting electrode materials (for
example, gold); computing the current through the molecule as a function
of applied voltage can lead to insights into the possible switching behavior
of the molecular device [13, 60]. This last example displays another inho-
mogeneity: most of the region between the electrodes, except near
the organic molecule, is vacuum. Can we devise a computational method
that allows us to neglect this vacuum region in solving the Schrodinger
equation?
As we discuss in detail below, the motivation for the development of
real-space methods came from some drawbacks of traditional plane-wave
calculations. First, the real-space approach leads to a relatively spatially
localized (or banded) representation of the Hamiltonian operator, in contrast
to the plane-wave method. The localized representation in turn makes the
resulting algorithms more suitable for parallel computing, a major thrust of
modern computational science. Second, at the physical level, the effect of
moving one atom a small amount propagates only a short distance in space
for most systems [61, 62]. Thus a more localized representation coupled
with the physical localization can lead to efficient low-scaling algorithms
[63–65].
Moving to the real-space representation comes with a cost, however.
Namely, the iterative methods typically used to update the wave func-
tions tend to stall (critical slowing down, or CSD) [66, 67]. This effect
occurs due to the long-wavelength components of the errors in the initial
approximation to the wave functions. The localized iterations cannot effi-
ciently remove that error. Multigrid methods [66, 67], developed in applied
mathematics in the 1970s by Achi Brandt and others, attempt to overcome
occurs between the electrons and nuclei, and the energy is approximately
conserved for the nuclear motions. After updating the orbitals, they need
to be re-orthogonalized, an N³ scaling step. While the CP simulations are
costly, they have led to a whole generation of quantum models of complex
condensed phase systems at the DFT level. The CP method is described in
detail in [56]. Other algorithms converge the electronic degrees of freedom
to the ground state for each time step in a dynamics simulation [70].
Besides all of the advantages of plane-wave methods, there are some
drawbacks. The major one is that plane waves are completely delocalized
in space. Imagine moving one nucleus at a specific site in a large system.
The electrons then redistribute so as to screen out the effect of moving
the nucleus. For systems with a band gap, the screening is exponential
[62, 71]; even for metals at zero temperature, the screening is algebraic
(and likely exponential at finite temperature) [72]. Thus the disturbance
created by moving the nucleus a small amount is localized. In a plane-wave
calculation, all of the plane-wave coefficients need to be updated to recreate
the local disturbance. As another example, if plane waves are used to study
a molecule or finite cluster, they must add up to yield a near-zero result for
the electron density away from the region of interest. This basic physical
feature led to an interest in representing the orbitals directly on grids in real
space [7]. These real-space methods are also fully numerical in the sense
that a single parameter, the grid spacing, can be reduced to yield a desired
level of convergence. It is easy to see that if the physical effect of moving an
atom is relatively localized in space, the orbitals only need to be updated in
that local region. We mention that one of the linear-scaling approaches, the
Order-N Electronic Total Energy Package (ONETEP), employs a localized
psinc basis built from plane waves, and exploits advantages of both plane-
wave and localized real-space representations to optimize computational
efficiency [59].
To get an intuitive idea of grid methods, we will consider the simplest
possible case, solution of the Schrodinger equation in one dimension
(atomic units are assumed in this chapter):
$$
-\frac{1}{2}\frac{d^2\psi(x)}{dx^2} + V(x)\psi(x) = E\psi(x). \qquad (10.1)
$$
For the present discussion, we will consider the finite-difference (FD)
method [7, 8, 73]. The resulting equations for the finite-element (FE) method
turn out to be quite similar [74–76]; there are important differences between
the FD and FE methods, however, including the fact that the FE method is
a variational (localized basis set) method, while the FD method is not.
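To make the banded structure concrete, the following short Python sketch (our illustration, not part of the original development) discretizes Eq. (10.1) with the second-order FD Laplacian for an assumed harmonic well V(x) = x²/2; the Hamiltonian becomes a tridiagonal matrix whose lowest eigenvalues approach the exact oscillator spectrum 0.5, 1.5, 2.5, ... au as the grid spacing h is reduced:

```python
import numpy as np

# Minimal FD discretization of Eq. (10.1) on a uniform grid (illustrative
# sketch; the harmonic potential V(x) = x^2/2 is an assumed example).
n, L = 400, 10.0                     # interior grid points, box size
h = L / (n + 1)                      # grid spacing, the single convergence parameter
x = -L / 2 + h * np.arange(1, n + 1)
V = 0.5 * x**2
# -1/2 d^2/dx^2 -> second-order FD stencil (-1, 2, -1)/(2 h^2): a banded matrix
main = 1.0 / h**2 + V
off = -0.5 / h**2 * np.ones(n - 1)
H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
E = np.linalg.eigvalsh(H)
print(E[:3])                         # near 0.5, 1.5, 2.5 for small h
```

Reducing h, the single convergence parameter mentioned above, drives the eigenvalues systematically toward the exact values.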
$$
\frac{\delta S[\psi]}{\delta \psi^*} = \left[-\frac{1}{2}\frac{d^2}{dx^2} + V - E\right]\psi = 0. \qquad (10.7)
$$
We see that we get back the Schrodinger equation with an energy eigenvalue
E. That eigenvalue comes from the normalization constraint.
With what we just did, we can obtain the ground state wave function and
eigenvalue, but what about the higher lying states? To extend this approach
to multiple states as we need to do, for example, in KohnSham DFT, then
we take the functional derivative with respect to each of the states, and we
maintain the orthonormality of all the states. That is, we need to make sure
each state is normalized, and each pair of states is orthogonal. A Gram
Schmidt procedure is a numerical approach to enforce these constraints
[79]. The problem with this (for large systems) is that enforcing the con-
straints is a global operation that needs to be performed over the whole
system volume, and that leads to N³ scaling for the solver. These obser-
vations led to the development of methods that scale better with system
size [7, 10, 12, 22, 59, 80]. Those methods are based on the fact that the
one-particle density matrix (discussed below) decays in magnitude as we
move away from the diagonal element in real space.
How do we obtain an iterative procedure for solving for the eigenfunc-
tions and eigenvalues? Once we have the functional derivative of Eq. (10.7),
coarse-grid solution, and iterate a few more times on the fine grid. If this
process is continued recursively to successively coarser grids, in principle
errors of all wavelengths can be efficiently removed, and the solver can
obtain the solution with only several (maybe ten) iterations on the finest
grid. This is impressive!
The origin of the CSD problem is related to the update matrix in the
chosen iteration scheme (here we'll assume we are using the weighted
Jacobi iteration [67]). The eigenvalues of the update matrix determine
the rate of convergence for the modes of a given wavelength. The closer the
eigenvalue is to one, the slower the rate of convergence. Basically, the
update matrix decimates errors over a range of wavelengths, and the longer-
wavelength modes possess eigenvalues that approach one as the grid spacing
shrinks. The eigenvalues for those longer-wavelength modes (small k) are
given approximately by
$$
\lambda_k \approx 1 - \frac{\omega k^2 h^2}{2}, \qquad (10.9)
$$
where ω is the weighting parameter in the weighted Jacobi iteration scheme
(which must be less than one), and h is the grid spacing (see Ref. [8] for
the derivation). Thus we can see that, as the grid spacing gets smaller, the
eigenvalue for the longest-wavelength mode approaches one. By passing
to a coarser grid, the update matrix possesses long-wavelength eigenvalues
that decrease relative to the finer level, thus improving the convergence for
those modes of the error. This is a basic principle of the multigrid method.
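The slowdown can be made quantitative with a few lines of Python (an illustrative sketch; the 1D FD Laplacian on the unit interval with weight ω = 2/3 is an assumed example). The weighted-Jacobi update matrix for this operator has the known eigenvalues λ_m = 1 − 2ω sin²(k_m h/2) for wavenumbers k_m = mπ, whose small-k limit is the estimate of Eq. (10.9):

```python
import numpy as np

# Critical slowing down for weighted Jacobi on the 1D FD Laplacian
# (illustrative sketch; unit interval, assumed weight omega = 2/3).
omega = 2.0 / 3.0

def smoothest_eigenvalue(n):
    h = 1.0 / (n + 1)                    # grid spacing
    k = np.pi * np.arange(1, n + 1)      # mode wavenumbers k_m = m*pi
    lam = 1.0 - 2.0 * omega * np.sin(0.5 * k * h) ** 2
    return h, lam[0]                     # longest-wavelength (m = 1) mode

for n in (31, 63, 127):
    h, lam1 = smoothest_eigenvalue(n)
    approx = 1.0 - omega * np.pi**2 * h**2 / 2.0   # Eq. (10.9) with k = pi
    print(n, lam1, approx)               # lam1 creeps toward one as h shrinks
```

The longest-wavelength eigenvalue approaches one as h shrinks, which is precisely the stalling that the coarser grids are introduced to cure.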
An important point is that the coarse-grid problem needs to be con-
structed in a highly specific way in order for the algorithm to fully converge
to the exact numerical result on the grid; that is, if we iterate the multigrid
process many times, we should be able to drive the solution to machine
precision errors on the finest grid. Alternatively, we can say that, if we had
the exact fine grid solution and passed it to the next coarser grid, nothing
should happen.
Consider a simple problem such as numerical solution of the Poisson
equation of electrostatics:
$$
\nabla^2 \phi(\mathbf{r}) = -4\pi\rho(\mathbf{r}), \qquad (10.10)
$$
where φ is the electrostatic potential and ρ is the charge density. On the
finest grid labeled by h, we can express this equation in FD form as
$$
L^h U^h = f^h, \qquad (10.11)
$$
where L^h is the FD Laplacian operator, U^h is the exact grid solution for the
potential, and f^h is the right side of Eq. (10.10). The current approximation
to the fine grid. These operations are discussed in more detail in Refs. [8]
and [67].
It is relatively easy to see that we need an equation different from
Eq. (10.11) on the coarse grid. If we were to set the coarse-grid equation to
$$
L^H U^H = f^H \equiv I_h^H f^h \qquad (10.12)
$$
and iterate the problem there, followed by a correction on the fine grid, we
would see that, even with the exact solution from the fine grid, there would
be a net correction. Achi Brandt's idea [66] was to modify the coarse-grid
equations to remove this problem:
$$
L^H U^H = f^H + \tau^H, \qquad (10.13)
$$
where
$$
\tau^H = L^H I_h^H u^h - I_h^H L^h u^h. \qquad (10.14)
$$
The grid function τ^H is called the defect correction (which is only a property
of the current fine-grid approximation). Here we used the current approximation
to the potential u^h; note that τ^H changes as the solution evolves
towards the exact grid result (after which it does not change). Now notice
that, if we insert the exact solution from the fine grid, we obtain an identity:
$$
\begin{aligned}
L^H U^H &= f^H + L^H I_h^H U^h - I_h^H L^h U^h\\
&= f^H + L^H I_h^H U^h - I_h^H f^h\\
&= L^H I_h^H U^h. \qquad (10.15)
\end{aligned}
$$
Thus, with the inclusion of the defect correction, we have an equation that
satisfies the important condition of zero-correction-at-convergence. So the
problem is passed to the coarser grid, iterated there using Eq. (10.13), and
then the fine-grid solution is corrected as follows:
$$
u^h \leftarrow u^h + I_H^h\left(u^H - I_h^H u^h\right). \qquad (10.16)
$$
The above discussion pertains to two grids only; the process can be extended
to a range of coarse grids, with some minor modifications [8, 66]. Also,
similar strategies, with additional features, are available for solving eigen-
value problems [80, 85–88]. The inclusion of the defect correction in
Eq. (10.13) is termed the Full Approximation Scheme (FAS). The FAS
method can be used to solve nonlinear problems also, such as the Poisson–
Boltzmann equation of electrostatics [7].
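The zero-correction-at-convergence identity of Eq. (10.15) is easy to verify numerically. The Python sketch below (illustrative; a 1D Poisson problem with full-weighting restriction and direct solves standing in for the smoothing sweeps) builds the defect correction of Eq. (10.14) and confirms that the exact fine-grid solution produces no net coarse-grid correction:

```python
import numpy as np

# Two-grid check of the FAS defect correction, Eqs. (10.13)-(10.16)
# (illustrative sketch; 1D Poisson with Dirichlet boundaries assumed).
def laplacian(n, h):
    return (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
            + np.diag(np.ones(n - 1), -1)) / h**2

nf = 15                        # fine-grid interior points (odd, so the coarse grid nests)
hf = 1.0 / (nf + 1)
nc, hc = (nf - 1) // 2, 2.0 / (nf + 1)
xf = hf * np.arange(1, nf + 1)
f = np.sin(np.pi * xf)         # right-hand side on the fine grid
Lh, LH = laplacian(nf, hf), laplacian(nc, hc)
Uh = np.linalg.solve(Lh, f)    # exact fine-grid solution

R = np.zeros((nc, nf))         # full-weighting restriction I_h^H
for i in range(nc):
    j = 2 * i + 1              # coarse point i sits at fine index 2i+1
    R[i, j - 1:j + 2] = [0.25, 0.5, 0.25]

fH = R @ f
tau = LH @ (R @ Uh) - R @ (Lh @ Uh)    # defect correction, Eq. (10.14)
UH = np.linalg.solve(LH, fH + tau)     # coarse-grid FAS equation, Eq. (10.13)
correction = UH - R @ Uh               # coarse-grid change fed back in Eq. (10.16)
print(np.max(np.abs(correction)))      # ~ 0: the exact solution is a fixed point
```

Repeating the experiment with tau set to zero yields a spurious nonzero correction, which is exactly the problem Brandt's modification removes.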
For relatively simple linear problems like solving the Poisson equation,
the ideal multigrid solver behavior is observed, meaning a total of about ten
or fewer iterations on the finest grid, and linear-scaling computational cost.
For solving the nonlinear self-consistent eigenvalue problems in quantum
chemistry, however, other issues arise [88]. These issues have limited (to
some extent) the general utility of multiscale approaches for these tough
problems.
The main limitation in developing multiscale eigenvalue solvers has
been the fact that the higher eigenfunctions are oscillatory. Crudely
speaking, the coarsest grid must have enough resolution to approximately
represent the wiggles in the highest energy eigenfunction. If we go to even
coarser grids, the solver will stall or even diverge. Thus, we can obtain sig-
nificant enhancement in the solver efficiency by using a couple of coarser
grid levels to accelerate the solution process, but the advertised multigrid
efficiency is lost. In practice, that might mean the requirement of a few tens
of iterations on the finest scale to obtain adequate convergence, compared
with ten or fewer iterations for a Poisson problem.
A great deal of development has occurred related to real-space solvers in
DFT, and the progress has by no means been related solely to FD represen-
tations and multigrid solvers. Many of these developments and extensive
applications are covered in a recent Physica Status Solidi B issue [89]
and the review of Ref. [8]. Time-dependent extensions are discussed in
Refs. [90] and [91]. Having seen the progression from a few real-space
papers in the early 1990s to many large-scale real-space solver methods at
the present time, it is clear that real-space methods will continue to com-
prise one major avenue for further developments in large-scale electronic
structure.
$$
\gamma_1(\mathbf{r}_1, \mathbf{r}_1') = N \int \psi^*(\mathbf{r}_1', \mathbf{r}_2, \ldots, \mathbf{r}_N)\, \psi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)\, d\mathbf{r}_2 \cdots d\mathbf{r}_N \qquad (10.17)
$$
$$
\gamma_2(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_1', \mathbf{r}_2) = \frac{N(N-1)}{2} \int \psi^*(\mathbf{r}_1', \mathbf{r}_2, \mathbf{r}_3, \ldots, \mathbf{r}_N)\, \psi(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3, \ldots, \mathbf{r}_N)\, d\mathbf{r}_3 \cdots d\mathbf{r}_N. \qquad (10.18)
$$
We add the subscripts 1 and 2 to the DMs here in order to keep track of
the one and two particle forms. We can obtain the 1-DM from the 2-DM
by integration over the r2 coordinates, so the 1-DM information is con-
tained within the 2-DM. Note that the prime is omitted on the second r2 in
Eq. (10.18). We can restrict the variables in this way since the exact total
function with Monte Carlo methods would be a longer-term goal, and that
will be briefly discussed at the end of the chapter.
A clear discussion showing how the Hartree–Fock total energy can be
expressed in terms of the 1-DM is given in the book by Parr and Yang [20].
Similarly, the total energy in Kohn–Sham DFT can be expressed in terms of
the 1-DM. Ref. [20] also discusses the constrained minimization that leads
to the Hartree–Fock equations in 1-DM form. There are two constraints,
one for conservation of the number of electrons, and one for idempotency
of the 1-DM:
$$
\int \gamma(\mathbf{r}, \mathbf{r})\, d\mathbf{r} = N \qquad (10.20)
$$
$$
\int \gamma(\mathbf{r}, \mathbf{r}'')\, \gamma(\mathbf{r}'', \mathbf{r}')\, d\mathbf{r}'' = \gamma(\mathbf{r}, \mathbf{r}'). \qquad (10.21)
$$
The second constraint, Eq. (10.21), follows from the fact that, for a 1-DM
constructed from a wave function that is a single Slater determinant, the
density matrix is given by
$$
\gamma(\mathbf{r}, \mathbf{r}') = 2 \sum_{i=1}^{N/2} \psi_i^*(\mathbf{r})\, \psi_i(\mathbf{r}'), \qquad (10.22)
$$
where the ψ_i(r) are the eigenfunctions that solve the Hartree–Fock or Kohn–
Sham equations. (Here we assume doubly occupied states with no net spin.)
The first constraint, Eq. (10.20), is global but easy to enforce: simply update
the density matrix by some process and then rescale to maintain the correct
particle number. The second, Eq. (10.21), is again global, and very difficult
to maintain during an iterative process. Various numerical approaches, such
as the McWeeny purification (discussed in Ref. [56], pages 463–464), have been
implemented in large-scale codes to iteratively enforce idempotency as we
move toward convergence [59]. Is there another way to effectively enforce
this difficult constraint?
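One widely used answer is purification itself. The Python sketch below (illustrative; the random nearly idempotent matrix is an assumed stand-in for a density matrix that has drifted off the constraint during iteration) applies the McWeeny map γ → 3γ² − 2γ³ in its discrete matrix form; occupation eigenvalues near one are driven to one, those near zero to zero, restoring the matrix analogue of Eq. (10.21):

```python
import numpy as np

# McWeeny purification, gamma -> 3*gamma^2 - 2*gamma^3 (illustrative sketch;
# the perturbed 0/1 occupations below are an assumed example).
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))     # random orthogonal basis
occ = np.array([1.05, 0.97, 1.02, 0.95, 0.03, -0.02, 0.04, 0.01])
gamma = Q @ np.diag(occ) @ Q.T                   # nearly idempotent 1-DM

for _ in range(8):
    g2 = gamma @ gamma
    gamma = 3.0 * g2 - 2.0 * gamma @ g2          # McWeeny step

err = np.linalg.norm(gamma @ gamma - gamma)      # idempotency residual
print(err, np.trace(gamma))                      # residual ~ 0, trace -> 4
```

The iteration converges rapidly near the constraint surface, but each step costs two matrix multiplications over the whole system, which is the global-operation expense discussed above.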
Another point to note is that, in order to construct the total energy and
the electron density (two of the primary goals of the calculation), say in a
Kohn–Sham-type DFT calculation, all we need is the diagonal element of
γ(r, r′) (the electron density), and nearby points that determine the kinetic
energy:
$$
T = -\frac{1}{2}\int \left[\nabla_{\mathbf{r}}^2\, \gamma(\mathbf{r}, \mathbf{r}')\right]_{\mathbf{r}'=\mathbf{r}} d\mathbf{r}.
$$
diffusion process should sample the ground state wave function at long
pseudo-times. If we vary the parameter E in order to stabilize the diffusion
process (which is equivalent to approximately maintaining the norm of the
wave function), we should also obtain one estimate of the energy eigenvalue.
While the above picture is correct, it turns out that, for strongly inter-
acting systems like electrons and nuclei, the statistical noise gets completely
out of hand if Eq. (10.26) is solved directly with Monte Carlo methods. To
deal with this issue, Grimm and Storer [110] introduced an importance
sampling alternative to Eq. (10.26) that uses a trial function to enhance
the sampling in important regions. Instead of sampling directly the wave
function ψ, we attempt to sample the product of ψ and a trial function ψ_T
that is chosen to be as close as possible to the true wave function; we call
this composite function f = ψψ_T. It is then a relatively easy exercise to
show that the following differential equation for f reduces to Eq. (10.26)
(here a one-dimensional notation is used for simplicity, but all the formulas
are easily generalized to the high-dimensional case):
$$
\begin{aligned}
\frac{\partial f}{\partial \tau} &= \frac{1}{2}\frac{\partial}{\partial x}\left[\frac{\partial f}{\partial x} - F f\right] - (E_T(x) - E)f\\
&= \frac{1}{2}\frac{\partial}{\partial x}\left[\frac{\partial f}{\partial x} - f\,\frac{\partial \ln \psi_T^2}{\partial x}\right] - (E_T(x) - E)f\\
&= \frac{1}{2}\frac{\partial^2 f}{\partial x^2} - \frac{\partial}{\partial x}\left[f\,\frac{\partial \ln \psi_T}{\partial x}\right] - (E_T(x) - E)f, \qquad (10.27)
\end{aligned}
$$
where F is a drift force that leads to enhanced sampling in regions where
the trial function has a large magnitude. The trial energy ET (x) is given by
$$
E_T(x) = \frac{H\psi_T(x)}{\psi_T(x)}. \qquad (10.28)
$$
The trial energy is a function of position since the trial wave function is
not the exact eigenfunction; it becomes increasingly smooth as the trial
function becomes more accurate, however. The drift force
$$
F = \frac{\partial \ln \psi_T^2}{\partial x} \qquad (10.29)
$$
can be interpreted as minus the derivative of a (drift or guiding) potential
$$
V_D(x) = -\ln \psi_T^2, \qquad (10.30)
$$
where the minimum in the potential occurs at the maximum values of
the trial function. Thus addition of this drift term drives the sampling
into the regions of large trial function values; if the trial function closely
approximates the true wave function, then the sampling noise is reduced
significantly.
What has been gained by transforming to Eq. (10.27)? Two improve-
ments have been made: (1) Now we have a diffusion process with drift
(coming from the trial function) that results in importance sampling in
the more important regions determined by the trial function, and (2) the
potential operator V(x) has been replaced by the trial energy ET (x). So
long as the trial function accurately reflects the important properties of the
true wave function, the trial energy is much smoother than the bare (singular)
potential V(x) that is typically the Coulomb potential in electronic structure
calculations. This importance sampling transformation has allowed for real-
istic calculations that would not have been possible without it.
It is relatively easy to generate the Monte Carlo trajectories that lead to
sampling of the distribution f . An equation like Eq. (10.27) is a diffusion
equation that is termed a forward Kolmogorov equation in the mathematics
literature [111], or a Fokker–Planck equation in physics and chemistry
[112]. The purpose of the random walks for this case is, at long times, to
produce sampling of the equilibrium distribution (and not to yield the actual
solution to the differential equation, f , that in turn would give the exact
wave function ). Below we will discuss an alternative view, the backward
equation, which does yield the solution to the differential equation. The
trajectories that yield the desired sampling are Langevin trajectories:
$$
x_{l+1} = x_l + b\, d\tau + \sqrt{d\tau}\, \eta, \qquad (10.31)
$$
where
$$
b = \frac{F}{2} = \frac{\partial \ln \psi_T}{\partial x} \qquad (10.32)
$$
and η is a Gaussian random number with unit variance. The last two terms
on the right side of Eq. (10.31) are the drift and diffusion terms, respectively.
We select the initial location with a probability determined by ψ_T² and then
initiate the trajectories determined by Eq. (10.31). The sampling becomes
more accurate as the time step size dτ approaches zero. The Langevin-type
Eq. (10.31) is called a stochastic differential equation (SDE) in mathematics
[113] (written in discrete form here with a finite time step size dτ for imple-
mentation on a computer); see below for more discussion of SDEs. The code
for numerically solving the SDE is very simple, only requiring a Gaussian
random number (GRN) generator to produce the η values. A walker is a
realization of a trajectory given by Eq. (10.31) on a computer.
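A complete walker update is indeed only a few lines. The Python sketch below (illustrative; the Gaussian trial function ψ_T = exp(−x²/2) is an assumed example, giving drift b = ∂ln ψ_T/∂x = −x) propagates an ensemble of walkers with Eq. (10.31); at long times they sample ψ_T² ∝ exp(−x²), whose variance is 1/2:

```python
import numpy as np

# Walker realization of the Langevin update, Eq. (10.31) (illustrative
# sketch; assumed trial function psi_T = exp(-x^2/2), so drift b = -x).
rng = np.random.default_rng(1)
nwalkers, dt, nsteps = 20000, 0.01, 2000
x = rng.normal(size=nwalkers)            # arbitrary starting positions

for _ in range(nsteps):
    b = -x                               # drift toward the trial-function maximum
    x = x + b * dt + np.sqrt(dt) * rng.normal(size=nwalkers)

print(x.mean(), x.var())                 # mean ~ 0, variance ~ 1/2
```

Because this assumed ψ_T is the exact oscillator ground state, E_T(x) is constant and the branching factor from the (E_T(x) − E) term in Eq. (10.27) plays no role; for a realistic trial function that weight must also be carried along each trajectory.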
most common trial functions are built from Hartree–Fock or DFT orbitals,
and terms are added (called Jastrow functions) to improve the treatment of
electron correlation. The DMC approach outlined here can recover a very
high fraction of the correlation energy.
The above discussion has ignored one important point. We have
implicitly assumed that we are locating the ground state for a one-particle
system; that ground state possesses no nodes. In reality, the wave function
for a large system depends on 3N coordinates, and the wave function must
be anti-symmetric with respect to the interchange of electron coordinates.
This leads to spatial nodes in the ground state wave function for a many-
electron system, and the exact nodal locations are unknown. The most
common approximation is then to set the nodes at locations determined
by the trial functions, and restrict the sampling to regions with a single
sign for the wave function [19]. There has been extensive theoretical and
computational work showing that such sampling adequately covers the con-
figuration space [15]. The fixed-node approximation is just that, however,
an approximation.
Hamiltonian to act on the 1-DM, and multiply and divide by the 1-DM on
the right side to obtain
$$
H_x \gamma(x, y) = 2 \sum_{i=1}^{N/2} \epsilon_i\, \psi_i(x)\, \psi_i(y)
= \left[\frac{2\sum_{i=1}^{N/2} \epsilon_i\, \psi_i(x)\, \psi_i(y)}{2\sum_{i=1}^{N/2} \psi_i(x)\, \psi_i(y)}\right] \gamma(x, y) \equiv E(x, y)\, \gamma(x, y) \qquad (10.36)
$$
or
$$
-\frac{1}{2}\frac{d^2 \gamma(x, y)}{dx^2} + V(x)\, \gamma(x, y) = E(x, y)\, \gamma(x, y), \qquad (10.37)
$$
where the Hamiltonian acts on the x coordinate, and we have assumed here
that the eigenfunctions are real (which is true for practical DFT calcula-
tions). The function E(x, y) tends to be relatively smooth, but certainly not
constant.
So instead of having an eigenvalue problem with a single parameter, E,
as in the traditional QMC approach, we now have a differential equation for
the 1-DM with a function E(x, y) that is spatially dependent. We maintain
the symbol E here in analogy to the Schrodinger equation. Dawson and
March [116] pointed out that the function E(x, y) is essentially the Lagrange
multiplier for the idempotency constraint Eq. (10.21). The principal dif-
ficulty introduced then by the reduced 1-DM representation is how to
determine the function E(x, y) as opposed to the simple E when the full
wave function is sampled.
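These properties of E(x, y) are easy to see in a small model. The Python sketch below (illustrative; six non-interacting electrons in three doubly occupied harmonic-oscillator orbitals with energies 0.5, 1.5, 2.5 au are an assumed example) evaluates E(x, y) from the ratio in Eq. (10.36) and checks Eq. (10.37) by finite differences:

```python
import numpy as np

# E(x, y) of Eq. (10.36) for an assumed model: three doubly occupied
# harmonic-oscillator orbitals (illustrative sketch, atomic units).
EPS = np.array([0.5, 1.5, 2.5])          # orbital eigenvalues

def orbitals(x):
    g = np.pi ** -0.25 * np.exp(-0.5 * x**2)
    return np.array([g,
                     np.sqrt(2.0) * x * g,
                     (2.0 * x**2 - 1.0) / np.sqrt(2.0) * g])

def gamma(x, y):
    return 2.0 * np.dot(orbitals(x), orbitals(y))      # Eq. (10.22), 1D

def E_xy(x, y):
    return 2.0 * np.dot(EPS * orbitals(x), orbitals(y)) / gamma(x, y)

print(E_xy(0.0, 0.0), E_xy(1.5, 0.0))    # clearly position dependent

# Finite-difference check of Eq. (10.37) at an arbitrary point (x0, y0):
h, x0, y0 = 1e-3, 0.7, 0.2
lhs = (-0.5 * (gamma(x0 + h, y0) - 2.0 * gamma(x0, y0) + gamma(x0 - h, y0)) / h**2
       + 0.5 * x0**2 * gamma(x0, y0))
rhs = E_xy(x0, y0) * gamma(x0, y0)
print(lhs, rhs)                          # agree to O(h^2)
```

The printed values of E(x, y) differ strongly between the two sampled points, while the differential equation is satisfied pointwise to the expected finite-difference accuracy.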
Given the differential equation in Eq. (10.37), we can next invent a
diffusion-type equation, analogous to Eq. (10.27), that at equilibrium
solves Eq. (10.37):
$$
\frac{\partial \gamma(x, y)}{\partial \tau} = \frac{1}{2}\frac{\partial^2 \gamma(x, y)}{\partial x^2} - \left[V(x) - E(x, y)\right] \gamma(x, y). \qquad (10.38)
$$
We can go through the same exercise as in DMC to develop an importance-
sampling version of Eq. (10.38):
$$
\frac{\partial f(x, y)}{\partial \tau} = \frac{1}{2}\frac{\partial^2 f(x, y)}{\partial x^2} - \frac{\partial}{\partial x}\left[f\,\frac{\partial \ln \gamma_T}{\partial x}\right] - \left[E_T(x, y) - E(x, y)\right] f, \qquad (10.39)
$$
where f = γγ_T, γ_T is the trial 1-DM, and E_T(x, y) = Hγ_T(x, y)/γ_T(x, y).
The defect correction form analogous to Eq. (10.35) is
$$
\frac{\partial g(x, y)}{\partial \tau} = \frac{1}{2}\frac{\partial^2 g(x, y)}{\partial x^2} - \frac{\partial}{\partial x}\left[g\,\frac{\partial \ln \gamma_T}{\partial x}\right] - \cdots
$$
integral textbooks [118], one author (TLB) began to read more broadly
the literature on numerical solutions of differential equations using SDEs
(like Eq. (10.31) above). As mentioned above, the purpose of the random
walks in a Fokker–Planck approach is to properly sample the equilibrium
(here ground state) distribution; expectation values of operators of interest
can be computed as averages over these random walks. By a slight rear-
rangement of the diffusion-type equation (into the backward Kolmogorov
form), however, we can obtain the actual solution to the differential equation
(f in Eq. (10.39)). This came as a surprise to someone not well-versed in
SDEs. A helpful mathematics book by Freidlin [113] lays out the back-
ground theory to this approach; this book is challenging for chemists and
physicists but is also clearly written and gives the solutions required for the
problems addressed here. The text by Gardiner [111] gives a clear physical
explanation of the backward equation, but does not develop the functional
integral formulas for its solution.
Functional integration was introduced into quantum mechanics by
Feynman with his path integral formula for the evolution of a quantum
system in time [119]. In the path (functional) integral method, quantities
are expressed in terms of averages over many paths linking the initial and
final points. The method was subsequently extended to equilibrium statis-
tical mechanics by going to imaginary time as we did above in looking at
both the grid methods and QMC. Mark Kac, inspired by Feynman's new
way of looking at quantum mechanics, examined the mathematical structure
of the formulation in imaginary time [120]. The resulting theory is called
the Feynman–Kac approach. It is worthwhile to work through the chapter
on functional integration in Ref. [120] to begin to understand how solutions
to differential equations can emerge from the path integral approach.
Consider Eq. (10.39) above, but write out the derivative of the second
term to obtain
$$
\frac{\partial f(x, y)}{\partial \tau} = \frac{1}{2}\frac{\partial^2 f(x, y)}{\partial x^2} - \frac{\partial \ln \gamma_T}{\partial x}\frac{\partial f}{\partial x}
- \left[\frac{\partial^2 \ln \gamma_T}{\partial x^2} + E_T(x, y) - E(x, y)\right] f(x, y). \qquad (10.41)
$$
This equation is in the backward form (indicated by having the drift term
in front of the first derivative of f ) and can be written in shorthand as
$$
\frac{\partial f(\tau, x)}{\partial \tau} = Lf(\tau, x) - c(x)\, f(\tau, x), \qquad (10.42)
$$
where we suppress the y dependence (we choose a value of y and keep this
fixed to solve for f as a function of x), and
$$
L = \frac{1}{2}\frac{\partial^2}{\partial x^2} + b(x)\frac{\partial}{\partial x}. \qquad (10.43)
$$
The drift term in the backward form is
$$
b(x) = -\frac{\partial \ln \gamma_T}{\partial x}, \qquad (10.44)
$$
which we can see has the opposite sign from the forward case in Eq. (10.32).
We could view this as the walkers moving on an inverted potential, or
alternatively as moving backwards in time in relation to Eq. (10.31). The
potential operator c(x) includes the three terms of the last contribution (in
brackets) to Eq. (10.41).
What does the seemingly trivial rearrangement of Eq. (10.39) gain for
us? This is where the beautiful mathematics of the SDEs comes in: as
claimed above, it allows us to then solve directly for the function sought,
here f , using random walks. The relation of the averages over random
walks to the solution of the differential equation is proved in Ref. [113].
Even after looking at these results for some time, it still seems somewhat
mysterious that adding up quantities computed along the random walks
yields a numerical approximation to the exact solution (to within statistical
errors based on the finite sampling, and finite time-step errors).
We take as our initial condition for f(τ, x) in Eq. (10.42) the values of
the trial function squared, h_T = γ_T². Also, we attempt to solve the equation
only over a finite domain located within the volume defined by the first
nodal surface of γ_T, and we assume that we know the values of f on that
surface (for example, we might set those values also to the square of the
trial function, which would be an approximation). We will label with k the
values of f on the boundary ∂D of the domain D:
$$
f(\tau, x)\big|_{\partial D} = k(x). \qquad (10.45)
$$
Then it is remarkable that the exact solution can be written as [113]
$$
f(\tau, x) = \left\langle h_T\big(X_\tau^x\big)\, \chi_{\tau<\tau_D} \exp\left(-\int_0^{\tau} c\big(X_s^x\big)\, ds\right)\right\rangle
+ \left\langle k\big(X_{\tau_D}^x\big)\, \chi_{\tau>\tau_D} \exp\left(-\int_0^{\tau_D} c\big(X_s^x\big)\, ds\right)\right\rangle. \qquad (10.46)
$$
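To see the formula in action in its simplest setting, the Python sketch below (illustrative; no absorbing boundary, so only the first term of Eq. (10.46) appears, and the walkers carry zero drift) takes c(x) = x²/2 and h(x) = exp(−x²/2). Since this h is the ground state of −½ d²/dx² + c(x) with eigenvalue 1/2, the exact solution is f(τ, x) = e^{−τ/2} h(x), and the exponentially weighted walker average reproduces it:

```python
import numpy as np

# Feynman-Kac average (first term of Eq. (10.46), free boundary, zero drift;
# illustrative sketch with assumed c(x) = x^2/2 and h(x) = exp(-x^2/2)):
# f(t, x) = < h(X_t^x) exp(-int_0^t c(X_s^x) ds) > solves df/dt = f''/2 - c f.
rng = np.random.default_rng(3)
t, x0 = 1.0, 0.4
nwalkers, nsteps = 100000, 200
dt = t / nsteps

X = np.full(nwalkers, x0)
logw = np.zeros(nwalkers)                 # accumulated -int c ds along each walk
for _ in range(nsteps):
    logw -= 0.5 * 0.5 * X**2 * dt         # trapezoid-like: half before the step
    X += np.sqrt(dt) * rng.normal(size=nwalkers)
    logw -= 0.5 * 0.5 * X**2 * dt         # half after the step

f_mc = np.mean(np.exp(-0.5 * X**2) * np.exp(logw))
f_exact = np.exp(-0.5 * t) * np.exp(-0.5 * x0**2)
print(f_mc, f_exact)                      # agree up to statistical/time-step error
```

This is the imaginary-time Schrodinger problem again, now solved pointwise at x₀ by nothing more than averaging over independent random walks.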
Fig. 10.1. Feynman–Kac solution for the 1-DM for six non-interacting electrons
in a harmonic well. The y value was taken as 0. The exact 1-DM is shown as a solid
line, the trial 1-DM is a dashed line, and the diamonds are the numerical solution
generated by Monte Carlo sampling. All units are au.
y chosen at the center of the oscillator well, is shown in Fig. 10.1, along
with the trial function and the numerical solution.
$$
\delta\rho = \rho[u + \delta u] - \rho[u] = 0. \qquad (10.48)
$$
The above equation again uses the concept of a functional, as we did in using a
variational approach for minimizing the action in Eq. (10.6). In Eq. (10.48)
the density is a functional of the potential u.
Here we carry out a similar strategy, namely by setting the variation
of f with respect to changes in the potential c(x) to zero. For the present
discussion we will omit the second term in Eq. (10.46) to keep the notation
simpler, but the results are easily generalized to include this term. To first
order, the functional variation in f is then
$$
f(\tau, x; c + \delta c) = f(\tau, x; c) - \left\langle h_T\big(X_\tau^x\big)\, \chi_{\tau<\tau_D} \left(\int_0^{\tau} \delta c\, ds\right) \exp\left(-\int_0^{\tau} c\big(X_s^x\big)\, ds\right)\right\rangle. \qquad (10.49)
$$
the total energy. The walkers would move in a localized region around the
chosen point y, up to near the first nodal surface in the trial 1-DM. We
would be attempting to compute the 1-DM, and the full 3N-dimensional
wave function would not be involved. Thus the method's cost would scale
strictly linearly with the number of electrons.
The ideas proposed here are in a sense intermediate between the standard
Kohn–Sham eigenvalue approach and the integral formulation of DFT men-
tioned above [20]. That integral formulation reflects the true goal of DFT,
namely, to represent the problem entirely in terms of the electron density.
There a path integral formalism is invoked, and the density is computed
from the inverse Laplace transform of the thermal Green's function. The
result is a beautiful formal expression in which the kinetic energy piece
contains an oscillatory integrand. Those oscillations can be problematic
computationally, however. The above approach leads to estimates of the
1-DM that do not contain the same problem with oscillations. Rather, the
sign problem is pushed into the trial 1-DM via the fixed-node approximation
using the trial 1-DM.
What are the concerns with the above discussion? First, does Eq. (10.50)
provide an accurate estimate of E(x, y)? Is the iterative scheme of
Eq. (10.50) stable? What is the physical meaning of setting the first-order
variation to 0? Is the approximation of making the boundary values in
the second term of Eq. (10.46) equal to the trial function squared (that is
k = hT ) a reasonably accurate one for three-dimensional problems? Do we
need physical information from outside the first nodal surface to obtain an
accurate E(x, y) estimate?
Another question could be raised: for the low-dimensional 1-DM
problem, why not just use grid methods to solve for the 1-DM? Gen-
erally, grid methods might be considered more efficient for this task. A
first point to make is that, in the discussion here, we are not performing
a high-dimensional integration; rather, we are proposing to solve a differ-
ential equation using Monte Carlo methods (the Feynman–Kac approach).
For example, we could attempt to solve Eq. (10.41) by iteration on a grid.
Several points can be made that argue against this approach, however: (1)
There are limitations in both accuracy and stability when integrating a
diffusion-type equation on a grid. (2) While the Monte Carlo approach
yields in principle the actual solution to the differential equation with suf-
ficient sampling, the constraints of normalization and idempotency would
have to be imposed during the iterations on a grid. Alternatively, we could
imagine minimizing the total energy by varying the 1-DM values, but we
are still stuck with the nonlocal constraint issue (Eq. (10.21)). The strategy
The symbols 1, 1′ here refer to the initial and final space-time points for the
particle trajectories, and the plus sign means t2 approaches t1 from above.
The last term contains the interaction potential v between the particles.
This equation can be transformed to the imaginary-time domain as we did
above. We see then that there is a differential equation for the 1-GF, but
it involves a higher-order Green's function, namely the 2-GF (leading to a
hierarchy of equations). Since we only need the 1-GF to express the exact
ground state energy, is there a way to estimate the 2-GF, perhaps by some
successive approximations scheme, so as to yield a good approximation to
the true 1-GF? Ref. [105] points out that the 1-GF can be thought of as the
time-dependent extension of the 1-DM: the 1-DM is the zero-time limit
of the 1-GF. Also, in computing the ground state energy from the 1-GF,
we take the limit as 1′ → 1, but the time-dependent behavior of the 1-GF
is clearly important in allowing it to be used to compute the exact energy,
which is not available from the 1-DM.
The pipe dream would be to develop a stochastic differential equation
approach for solving Eq. (10.51) for the 1-GF, based on successive approx-
imations to the 2-GF. The 1-GF should possess localization properties
similar to the 1-DM, and thus relatively local sampling should again be
possible.
10.5. Summary
(1) We are reaching a wall in computer technology that will limit the
speed of commodity processors. Special purpose machines may need
to be developed in order to move forward in large-scale applications of
quantum mechanics to bio-molecules and nano-materials.
(2) We are reaching a wall in wave function approaches, as argued by
Kohn [115]. We should thus focus on methods based on the electron
density, the reduced density matrices, and/or Green's-function-based
theories.
(3) Real-space grid methods were a first step towards exploiting the near-
locality of the 1-DM.
(4) But wave-function-based grid methods still require large computational
and storage overheads, and the special-purpose machine likely won't
allow for that.
(5) Stochastic differential equation methods for calculating the 1-DM, 2-
DM, and/or the 1-GF may provide a way out, along with clever physical
approximations.
(6) Monte Carlo is the most parallel algorithm (and typically requires
limited storage overhead), so let's exploit that, and sample only in local
regions of space to build up the 1-DM etc. We can imagine 10^7 walkers
on 10^7 compute nodes, all working so as to construct the reduced density
matrices in localized regions of space.
(7) Can we develop new ways of incorporating SDE methods into solving
for the one-particle Green's function of condensed matter theory?
(8) Has everything been tried? Certainly not. In fact, we might just be
entering a new computing era that may require some radical new ideas
in electronic structure calculations.
(9) A major goal is a new set of computational tools to accurately model
problems involving tens of thousands of atoms moving over nano-
or micro-second time scales. Applications might include drug design,
enzymes, ion channels, and the development of novel materials for
nano-scale engineering.
It is our hope that the ideas outlined here might serve to stimulate some alter-
native research directions for solving the Schrodinger equation on novel
massively parallel architectures yet to be developed.
Acknowledgments
Bibliography
[1] A. Szabo and N.S. Ostlund, Modern Quantum Chemistry (McGraw-Hill, New York,
1989).
[2] E.V. Lenthe and E. Baerends, J. Comput. Chem. 24, 1142 (2003).
[3] J. Soler, E. Artacho, J. Gale, A. Garcia, J. Junquera, P. Ordejon, and D. Sanchez-
Portal, J. Phys.-Cond. Matt. 14, 2745 (2002).
[4] M. Payne, M. Teter, D. Allan, T. Arias, and J. Joannopoulos, Rev. Mod. Phys. 64,
1045 (1992).
[5] M. Alemany, M. Jain, M. Tiago, Y. Zhou, Y. Saad, and J. Chelikowsky, Comput.
Phys. Commun. 177, 339 (2007).
[6] L. Kronik, A. Makmal, M. Tiago, M. Alemany, M. Jain, X. Huang, Y. Saad, and
J. Chelikowsky, Phys. Stat. Sol. B 243, 1063 (2006).
[7] T.L. Beck, Rev. Mod. Phys. 72, 1041 (2000).
[8] T.L. Beck, Rev. Comput. Chem. 26, 223 (2009).
[9] N. Modine, G. Zumbach, and E. Kaxiras, Phys. Rev. B 55, 10289 (1997).
[10] J. Fattebert, J. Phys.-Cond. Matt. 20, 294210 (2008).
[11] T. Torsti, T. Eirola, J. Enkovaara, T. Hakala, P. Havu, V. Havu, T. Hoynalanmaa,
J. Ignatius, M. Lyly, I. Makkonen, T. Rantala, J. Ruokolainen, K. Ruotsalainen, E.
Rasanen, H. Saarikoski, and M. Puska, Phys. Stat. Sol. B 243, 1016 (2006).
[12] D. Bowler, R. Choudhury, M. Gillan, and T. Miyazaki, Phys. Stat. Sol. B 243, 989
(2006).
[13] J. Bernholc, M. Hodak, and W. Lu, J. Phys.-Cond. Matt. 20, 294205 (2008).
[14] D. M. Ceperley and L. Mitas, Adv. Chem. Phys. 93, 1 (1996).
[15] W. Foulkes, L. Mitas, R. Needs, and G. Rajagopal, Rev. Mod. Phys. 73, 33 (2001).
[16] W. Lester, L. Mitas, and B. Hammond, Chem. Phys. Lett. 478, 1 (2009).
[17] M. Towler, Phys. Stat. Sol. B 243, 2573 (2006).
[18] R. Needs, M. Towler, N. Drummond, and P. Rios, J. Phys.-Cond. Matt. 22, 023201
(2010).
[19] J.B. Anderson, Rev. Comput. Chem. 13, 133 (1999).
[20] R.G. Parr and W. Yang, Density-Functional Theory of Atoms and Molecules (Oxford
University Press, Oxford, 1989).
[21] Y. Zhao, N. Schultz, and D. Truhlar, J. Chem. Theor. Comput. 2, 364 (2006).
[22] S. Goedecker, Rev. Mod. Phys. 71, 1085 (1999).
[23] B. Doser, D. Lambrecht, J. Kussmann, and C. Ochsenfeld, J. Chem. Phys. 130,
064107 (2009).
[24] T.L. Beck, M.E. Paulaitis, and L.R. Pratt, The Potential Distribution Theorem and
Models of Molecular Solutions (Cambridge University Press, New York, 2006).
[25] D. Asthagiri, L.R. Pratt, M.E. Paulaitis, and S.B. Rempe, J. Am. Chem. Soc. 126,
1285 (2004).
[26] D. Asthagiri, L.R. Pratt, and H.S. Ashbaugh, J. Chem. Phys. 119, 2702 (2003).
[27] D.M. Rogers and T.L. Beck, J. Chem. Phys. 129, 134505 (2008).
[28] D. Rogers and T. Beck, J. Chem. Phys. 132, 014505 (2010).
[29] Z. Zhao, D.M. Rogers, and T.L. Beck, J. Chem. Phys. 132, 014502 (2010).
[30] E. Guardia, I. Skarmoutsos, and M. Masia, J. Chem. Theor. Comput. 5, 1449 (2009).
[31] M. Dal Peraro, S. Raugei, P. Carloni, and M.L. Klein, Chem. Phys. Chem. 6, 1715
(2005).
[32] A.V. Marenich, R.M. Olson, A.C. Chamberlin, C.J. Cramer, and D.G. Truhlar, J.
Chem. Theor. Comput. 3, 2055 (2007).
[33] S.Y. Noskov and B. Roux, Biophys. Chem. 124, 279 (2006).
[34] D. Asthagiri, P.D. Dixit, S. Merchant, M.E. Paulaitis, L.R. Pratt, S.B. Rempe, and
S. Varma, Chem. Phys. Lett. 485, 1 (2010).
[35] A. Accardi and C. Miller, Nature 427, 803 (2004).
[36] J. Yin, Z. Kuang, U. Mahankali, and T. Beck, Proteins: Struct., Func., and Bioinform.
57, 414 (2004).
[37] Z. Kuang, U. Mahankali, and T.L. Beck, Proteins: Struct., Func., and Bioinform.
68, 26 (2007).
[38] D. Bucher, S. Raugei, L. Guidoni, M. Dal Peraro, U. Rothlisberger, P. Carloni, and
M. L. Klein, Biophys. Chem. 124, 292 (2006).
[39] S. Varma, D. Sabo, and S. Rempe, J. Molec. Biol. 376, 13 (2008).
[40] D. Asthagiri, L.R. Pratt, and J.D. Kress, Phys. Rev. E 68, 041505 (2003).
[41] M.J. McGrath, J.I. Siepmann, I.F. Kuo, C.J. Mundy, J. VandeVondele, J. Hutter,
F. Mohamed, and M. Krack, J. Phys. Chem. A 110, 640 (2006).
[42] J. Schmidt, J. VandeVondele, I. Kuo, D. Sebastiani, J. Siepmann, J. Hutter, and
C. Mundy, J. Phys. Chem. B 113, 11959 (2009).
[43] B. Santra, A. Michaelides, M. Fuchs, A. Tkatchenko, C. Filippi, and M. Scheffler,
J. Chem. Phys. 129, 194111 (2008).
[44] F. Sterpone, L. Spanu, L. Ferraro, S. Sorella, and L. Guidoni, J. Chem. Theor.
Comput. 4, 1428 (2008).
[45] J. Mahanty and B.W. Ninham, Dispersion Forces (Academic Press, New York,
1976).
[46] W. Kunz, P. Lo Nostro, and B.W. Ninham, Curr. Opin. Colloid Interface Sci. 9, 1
(2004).
[47] T.L. Beck, in Free energy calculations: Theory and applications in chemistry and
biology, edited by A. Pohorille and C. Chipot (Springer, New York, 2007) p. 389.
[48] E. Schwegler, J.C. Grossman, F. Gygi, and G. Galli, J. Chem. Phys. 121, 5400
(2004).
[49] P. Sit and N. Marzari, J. Chem. Phys. 122, 204510 (2005).
[50] CASINO available at http://www.tcm.phy.cam.ac.uk/mdt26/casino2.html.
[51] CHAMP available at http://pages.physics.cornell.edu/cyrus/champ.html.
[52] QWALK available at http://www.qwalk.org/wiki/index.php?title=Main_Page.
[53] A. Badinski and R. Needs, Phys. Rev. E 76, 036707 (2007).
[54] A. Badinski, P. Haynes, J. Trail, and R. Needs, J. Phys.-Cond. Matt. 22, 074202
(2010).
[55] S. Chiesa, D.M. Ceperley, and S. Zhang, Phys. Rev. Lett. 94, 036404 (2005).
[56] R.M. Martin, Electronic Structure: Basic Theory and Practical Methods
(Cambridge University Press, New York, 2004).
[57] K. Hirose, T. Ono, Y. Fujimoto, and S. Tsukamoto, First-Principles Calculations in
Real-Space Formalism (Imperial College Press, London, 2005).
[58] T. Otsuka, T. Miyazaki, T. Ohno, D. Bowler, and M. Gillan, J. Phys.-Cond. Matt.
20, 294201 (2008).
[59] P. Haynes, C. Skylaris, A. Mostofi, and M. Payne, Phys. Stat. Sol. B 243, 2489
(2006).
[60] G. Feng, N. Wijesekera, and T. Beck, IEEE Trans. Nanotech. 6, 238 (2007).
[61] W. Kohn, Phys. Rev. Lett. 76, 3168 (1996).
[62] E. Prodan and W. Kohn, Proc. Natl. Acad. Sci. USA 102, 11635 (2005).
[63] F. Shimojo, R. Kalia, A. Nakano, and P. Vashishta, Comput. Phys. Commun. 167,
151 (2005).
[64] Z. Zhao, J. Meza, and L. Wang, J. Phys.-Cond. Matt. 20, 294203 (2008).
[65] S. Burger and W. Yang, J. Phys.-Cond. Matt. 20, 294209 (2008).
[66] A. Brandt, Math. Comput. 31, 333 (1977).
[67] W.L. Briggs, V.E. Henson, and S.F. McCormick, A Multigrid Tutorial (SIAM,
Philadelphia, 2000).
[68] W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical Recipes
in C (Cambridge University Press, Cambridge, 1992).
[69] R. Car and M. Parrinello, Phys. Rev. Lett. 55, 2471 (1985).
[70] G. Kresse and J. Furthmuller, Phys. Rev. B 54, 11169 (1996).
[71] S. Ismail-Beigi and T.A. Arias, Phys. Rev. Lett. 82, 2127 (1999).
[72] S. Goedecker, Phys. Rev. B 58, 3501 (1998).
[73] M. Alemany, M. Jain, L. Kronik, and J. Chelikowsky, Phys. Rev. B 69, 075101
(2004).
[74] S.C. Brenner and L.R. Scott, The Mathematical Theory of Finite Element Methods
(Springer, New York, 1994).
[75] J. Pask, B. Klein, P. Sterne, and C. Fong, Comput. Phys. Commun. 135, 1 (2001).
[76] J. Pask and P. Sterne, Model. Simul. Mater. Sci. Engr. 13, R71 (2005).
[110] R.C. Grimm and R.G. Storer, J. Comput. Phys. 7, 134 (1971).
[111] C.W. Gardiner, Handbook of Stochastic Methods for Physics, Chemistry, and the
Natural Sciences (Springer-Verlag, Berlin, 1985).
[112] N.G. van Kampen, Stochastic Processes in Physics and Chemistry (Elsevier, New
York, 1992).
[113] M. Freidlin, Functional Integration and Partial Differential Equations (Princeton
University Press, Princeton, 1985).
[114] J. Anderson, J. Chem. Phys. 112, 9699 (2000).
[115] W. Kohn, Electronic Structure of Matter: Wave Functions and Density Func-
tionals. Available at http://nobelprize.org/nobel_prizes/chemistry/laureates/1998/
kohn-lecture.html.
[116] K. Dawson and N. March, J. Chem. Phys. 81, 5850 (1984).
[117] L. Pratt, G. Hoffman, and R. Harris, J. Chem. Phys. 92, 6687 (1990).
[118] L.S. Schulman, Techniques and Applications of Path Integration (Wiley-
Interscience, New York, 1981).
[119] R.P. Feynman and A.R. Hibbs, Quantum Mechanics and Path Integrals
(McGraw-Hill, New York, 1965).
[120] M. Kac, Probability and Related Topics in Physical Sciences (American
Mathematical Society, Providence, 1959).
[121] R.A. Harris and L.R. Pratt, J. Chem. Phys. 82, 856 (1985).
Chapter 11
July 19, 2011 11:29 9in x 6in b1189-ch11 Solving the Schrodinger Equation
The key motivation in the design of efficient linear algebra algorithms for
advanced-architecture computers involves the storage and retrieval of data.
Designers wish to minimize the frequency with which data moves between
different levels of the memory hierarchy. Once data is in registers or the
fastest cache, all processing required for this data should be performed
before it gets evicted back to the main memory. Thus, the main algorithmic
approach for exploiting both vectorization and parallelism in our imple-
mentations uses block-partitioned algorithms, particularly in conjunction
with highly tuned kernels for performing matrix-vector and matrix-matrix
operations (the Level-2 and Level-3 BLAS). Block partitioning means that
the data is divided into blocks, each of which should fit within a cache
memory or a vector register file.
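To make the idea concrete, here is a minimal, hypothetical sketch (not code from this chapter) of a block-partitioned matrix-matrix multiply: each pass works on NB x NB blocks so that the operands stay resident in cache while they are reused.

```c
#include <stddef.h>

#define NB 64  /* block size; chosen so three NB x NB blocks fit in cache */

static inline int min_int(int a, int b) { return a < b ? a : b; }

/* C += A * B for n x n column-major matrices, processed block by block.
 * Each block of C is updated by a sequence of small multiplies whose
 * working set stays cache resident, instead of streaming whole rows
 * and columns through the memory hierarchy. */
void dgemm_blocked(int n, const double *A, const double *B, double *C)
{
    for (int jb = 0; jb < n; jb += NB)
        for (int kb = 0; kb < n; kb += NB)
            for (int ib = 0; ib < n; ib += NB)
                for (int j = jb; j < min_int(jb + NB, n); j++)
                    for (int k = kb; k < min_int(kb + NB, n); k++) {
                        double bkj = B[(size_t)j * n + k];
                        for (int i = ib; i < min_int(ib + NB, n); i++)
                            C[(size_t)j * n + i] += A[(size_t)k * n + i] * bkj;
                    }
}
```

The arithmetic is identical to the textbook triple loop; only the order of traversal changes, which is exactly the point of block partitioning.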
The computer architectures considered in this chapter are:
Vector machines
RISC computers with cache hierarchies
Parallel systems with distributed memory (the communication between
compute nodes happens by explicitly exchanging messages: the memory
is physically and programmatically distributed)
Multi-core computers
Secondly, RISC computers were introduced in the late 1980s and early
1990s. While their clock rates might have been comparable to those of the
vector machines, the computing speed lagged behind due to their lack of
vector registers. Another deficiency was their creation of a deep memory
hierarchy with multiple levels of cache memory to alleviate the scarcity of
bandwidth that was, in turn, caused mostly by a limited number of memory
banks. The eventual success of this architecture is commonly attributed
to the right price point and astonishing improvements in performance over
time as predicted by Moores Law. With RISC computers, the linear algebra
algorithms had to be redone yet again. This time, the formulations had to
expose as many matrix-matrix operations as possible, which guaranteed
good cache reuse.
Thirdly, a natural way of achieving even greater performance levels
with both vector and RISC processors is by connecting them together with
a network and letting them cooperate to solve a problem bigger than would
be feasible on just one processor. This advance results in parallel systems
with distributed memory. Many hardware configurations followed this path,
so the matrix algorithms had to follow this as well. It was quickly discovered
that good local performance has to be combined with good global parti-
tioning of the matrices and vectors.
Any trivial division of matrix data quickly uncovered scalability
problems dictated by so-called Amdahl's Law: the observation that the
time taken by the sequential portion of a computation provides a lower
bound for the entire execution time, and therefore limits the gains achievable
from parallel processing. In other words, unless most of the computations
can be done independently, the point of diminishing returns is reached,
and adding more processors to the hardware mix will not result in faster
processing.
Finally, the class of multi-core architectures includes both symmetric
multiprocessing (SMP) and single-chip multi-core machines, for the sake
of simplicity. Single-chip multi-core processors constitute a new paradigm
in commodity hardware: instead of increasing frequency and complexity
of a chip, new cores are added on a single die. This is probably an unfair
simplification, as the SMP machines usually have better memory systems.
But when applied to matrix algorithms, both yield good performance results
with very similar algorithmic approaches: these combine local cache reuse
and independent computation with explicit control of data dependences.
The initial success of vector computers in the 1970s was driven by raw
performance. The introduction of this type of computer system started the
era of supercomputing (see Fig. 11.1). In the 1980s the availability of
Fig. 11.1. Performance of the fastest computer systems for the last six decades
compared to Moore's Law. (The DEC VAX-11/780, with its speed of 1 MIPS, is not
featured as it couldn't compete with its contemporaries from CDC and Cray.)
Ax = b, (11.1)
A = LU, (11.2)
where L is a lower triangular matrix (a matrix that has only zeros above the
diagonal) with ones on the diagonal, and U is upper triangular (with only
zeros below the diagonal). During the decomposition process, diagonal
elements of A (called pivots) are used to divide the elements below the
diagonal. If matrix A has a zero pivot, the process will break with a division-
by-zero error. Also, small values of the pivots excessively amplify the
numerical errors of the process. So, for numerical stability, the method
needs to interchange rows of the matrix to make sure the pivots are as large
(in absolute value) as possible. This observation leads to a row permutation
matrix P and modifies the factored form to:
PA = LU. (11.3)
x = A^{-1} b (11.4)
and the use of L and U factors suggests the following algorithm for solving
the system of equations:
1. Factor PA into LU (P is applied as we factor A).
2. Solve the system Ly = Pb (this comes by replacing Ux with y in
LUx = Pb).
3. Solve the system Ux = y.
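Steps 2 and 3 are inexpensive triangular solves. A minimal sketch in C (a hypothetical illustration; the row interchanges of step 1 are omitted for brevity, so this assumes already-factored, unpermuted data):

```c
#include <stddef.h>

/* Solve A x = b given the in-place LU factors of A (unit lower triangle L
 * stored below the diagonal, U on and above it), ignoring row interchanges
 * for brevity.  Column-major storage with leading dimension n; b is
 * overwritten with the solution x. */
void lu_solve(int n, const double *LU, double *b)
{
    /* Step 2: forward substitution, L y = b (overwrites b with y). */
    for (int k = 0; k < n; k++)
        for (int i = k + 1; i < n; i++)
            b[i] -= LU[(size_t)k * n + i] * b[k];

    /* Step 3: back substitution, U x = y (overwrites b with x). */
    for (int k = n - 1; k >= 0; k--) {
        b[k] /= LU[(size_t)k * n + k];
        for (int i = 0; i < k; i++)
            b[i] -= LU[(size_t)k * n + i] * b[k];
    }
}
```

Both loops touch each matrix entry once, which is where the O(n^2) cost of the solve phase comes from.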
This approach to matrix computations through decomposition has
proven very useful for several reasons. First, the approach separates the
computation into two stages: the computation of a decomposition, followed
by the use of the decomposition to solve the problem at hand. Such sepa-
ration can be important, for example, if different right hand sides are present
and need to be solved at different points in the process. The matrix needs
to be factored only once and reused for the different right hand sides. This
is particularly important because the factorization of A, step 1, requires
O(n^3) operations, whereas the solutions, steps 2 and 3, require only O(n^2)
operations. Another aspect of the algorithm's strength is in storage: the
L and U factors do not require extra storage, but can take over the space
occupied initially by the original matrix A.
For the discussion of coding this algorithm, we present only the com-
putationally intensive part of the process, which is step 1, the factorization
of the matrix.
In the second half of the 1970s the introduction of vector computer systems
marked the beginning of modern Supercomputing. These systems offered
a performance advantage of at least one order of magnitude over con-
ventional systems of that time. Raw performance was the main, if not
the only, selling argument. In the first half of the 1980s the integration
of vector systems into conventional computing environments became more
important. Only the manufacturers that provided standard programming
environments, operating systems and key applications were successful
in attracting industrial customers and survived. Performance was mainly
increased by improved chip technologies and by producing shared memory
multi-processor systems. Vector systems were able, in one step, to perform a
single operation on a relatively large number of operands stored in vector
registers. Expressing matrix algorithms as vector-vector operations was a
natural fit for this type of machine. However, some of the vector designs had
a limited ability to load and store the vector registers in main memory. A technique
      subroutine dgefa(a,lda,n,ipvt,info)
      integer lda,n,ipvt(1),info
      double precision a(lda,1)
      double precision t
      integer idamax,j,k,kp1,l,nm1
c
c     gaussian elimination with partial pivoting
c
      info = 0
      nm1 = n - 1
      if (nm1 .lt. 1) go to 70
      do 60 k = 1, nm1
         kp1 = k + 1
c
c        find l = pivot index
c
         l = idamax(n-k+1,a(k,k),1) + k - 1
         ipvt(k) = l
c
c        zero pivot implies this column is already triangularized
c
         if (a(l,k) .eq. 0.0d0) go to 40
c
c        interchange if necessary
c
         if (l .eq. k) go to 10
            t = a(l,k)
            a(l,k) = a(k,k)
            a(k,k) = t
   10    continue
c
c        compute multipliers
c
         t = -1.0d0/a(k,k)
         call dscal(n-k,t,a(k+1,k),1)
c
c        row elimination with column indexing
c
         do 30 j = kp1, n
            t = a(l,j)
            if (l .eq. k) go to 20
               a(l,j) = a(k,j)
               a(k,j) = t
   20       continue
            call daxpy(n-k,t,a(k+1,k),1,a(k+1,j),1)
   30    continue
         go to 50
   40    continue
         info = k
   50    continue
   60 continue
   70 continue
      ipvt(n) = n
      if (a(n,n) .eq. 0.0d0) info = n
      return
      end
into a simple, single vector operation. This avoided leaving the opti-
mization up to the compiler and explicitly exposed a performance-critical
operation.
In a sense, then, the beauty of the original code was regained with the
use of a new vocabulary to describe the algorithms: the BLAS. Over time,
the BLAS became a widely adopted standard and were most likely the first
to enforce two key aspects of software: modularity and portability. Again,
these are taken for granted today, but at the time they were not. One could
have the cake of compact algorithm representation and eat it too, because
the resulting Fortran code was portable.
Most algorithms in linear algebra can be easily vectorized. However,
to gain the most out of such architectures, simple vectorization is usually
not enough. Some vector computers are limited by having only one path
between memory and the vector registers. This creates a bottleneck if a
program loads a vector from memory, performs some arithmetic operations,
and then stores the results. In order to achieve top performance, the scope of
the vectorization must be expanded to facilitate chaining operations together
and to minimize data movement, in addition to using vector operations.
Recasting the algorithms in terms of matrix-vector operations makes it
easy for a vectorizing compiler to achieve these goals.
Thus, as computer architectures became more complex in the design of
their memory hierarchies, it became necessary to increase the scope of the
BLAS routines from Level-1 to Level-2 and Level-3.
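The change of scope can be illustrated schematically (hypothetical reference code, not the actual BLAS interfaces): the same trailing-matrix update of Gaussian elimination can be phrased as a sequence of Level-1 axpy calls, one column at a time, or as a single Level-2 rank-1 update, which hands the compiler or library one large operation to chain and block.

```c
#include <stddef.h>

/* Level-1 BLAS building block: y := y + alpha * x. */
static void daxpy_ref(int n, double alpha, const double *x, double *y)
{
    for (int i = 0; i < n; i++)
        y[i] += alpha * x[i];
}

/* Elimination update written in Level-1 terms: one axpy per column of
 * the m x n trailing matrix A (column-major, leading dimension lda). */
void update_level1(int m, int n, const double *col, const double *row,
                   double *A, int lda)
{
    for (int j = 0; j < n; j++)
        daxpy_ref(m, row[j], col, &A[(size_t)j * lda]);
}

/* The same update as one Level-2 operation, a rank-1 update
 * A := A + col * row^T (the operation behind the BLAS routine DGER).
 * Presenting the whole update at once lets an implementation chain
 * loads, multiplies and stores and minimize data movement. */
void update_level2(int m, int n, const double *col, const double *row,
                   double *A, int lda)
{
    for (int j = 0; j < n; j++) {
        double rj = row[j];
        for (int i = 0; i < m; i++)
            A[(size_t)j * lda + i] += col[i] * rj;
    }
}
```

Grouping several such updates together is what then leads to the matrix-matrix (Level-3) formulations.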
As mentioned before, the introduction in the late 1970s and early 1980s
of vector machines brought about the development of another variant of
algorithms for dense linear algebra. This variant was centered on the
multiplication of a matrix by a vector. These subroutines were meant to
give improved performance over the dense linear algebra subroutines in
LINPACK, which were based on Level-1 BLAS. In the late 1980s and
early 1990s, with the introduction of RISC-type microprocessors (the "killer
micros") and other machines with cache-type memories, we saw the
development of LAPACK Level-3 algorithms for dense linear algebra. A Level-3
code is typified by the main Level-3 BLAS, which, in this case, is matrix
multiplication.
The original goal of the LAPACK project was to make the widely
used LINPACK library run efficiently on vector and shared-memory par-
allel processors. On these machines, LINPACK is inefficient because its
memory access patterns disregard the multilayered memory hierarchies
of the machines, thereby spending too much time moving data instead of
doing useful floating-point operations. LAPACK addresses this problem by
reorganizing the algorithms to use block matrix operations, such as matrix
multiplication, in the innermost loops (see the paper by E. Anderson and
J. Dongarra under Further Reading). These block operations can be opti-
mized for each architecture to account for its memory hierarchy, and so
provide a transportable way to achieve high efficiency on diverse modern
machines.
Here we use the term "transportable" instead of "portable" because,
for fastest possible performance, LAPACK requires that highly optimized
block matrix operations be implemented already on each machine. In other
words, the correctness of the code is portable, but high performance is
not, if we limit ourselves to a single Fortran source code.
LAPACK can be regarded as a successor to LINPACK in terms of
functionality, although it doesn't always use the same function-calling
sequences. As such a successor, LAPACK was a win for the scientific
community because it could keep LINPACKs functionality while getting
improved use out of new hardware.
Most of the computational work in the algorithm from Fig. 11.3 is
contained in three routines:
One of the key parameters in the algorithm is the block size, called
NB here. If NB is too small or too large, poor performance can result
11.6. Clusters
Traditional design focus for MPP systems was the very high end of perfor-
mance. In the early 1990s the SMP systems of various workstation man-
ufacturers as well as the IBM SP series, which targeted the lower and
medium market segments, gained great popularity (see Fig. 11.4). Their
price/performance ratios were better due to the missing overhead in the
design for support of the very large configurations and due to cost advan-
tages of the larger production numbers. Due to the vertical integration of
performance it was no longer economically feasible to produce and focus
on the highest end of computing power alone. The design focus for new
systems shifted to the market of medium performance systems.
The acceptance of MPP systems not only for engineering applications
but also for new commercial applications especially for database applica-
tions emphasized different criteria for market success such as the stability of
the system, continuity of the manufacturer and price/performance. Success
in commercial environments became a new important requirement for a
successful Supercomputer business towards the end of the 1990s. Due to
these factors and the consolidation in the number of vendors in the market,
Fig. 11.4. Main architectural categories seen in the TOP500 (number of systems,
0-500, from 1993 to 2008: Vector Systems, Single Processor, SMP, Constellations,
Cluster, MPP).
hierarchical systems built with components designed for the broader com-
mercial market replaced homogeneous systems at the very high end
of performance. The marketplace adopted clusters of SMPs readily, while
academic research focused on clusters of workstations and PCs.
In the early 2000s, clusters built with off-the-shelf components gained
more and more attention, not only as academic research objects but also as
computing platforms for end-users of HPC systems. By 2004 these
clusters represented the majority of new systems on the TOP500
in a broad range of application areas. One major consequence of this trend
was the rapid rise in the utilization of Intel processors in HPC systems.
While virtually absent in the high end at the beginning of the decade, Intel
processors are now used in the majority of HPC systems. Clusters in the
1990s were mostly self-made systems designed and built by small groups
of dedicated scientists or application experts. This changed rapidly as soon
as the market for clusters based on PC technology matured. Nowadays the
large majority of TOP500-class clusters are manufactured and integrated by
either a few traditional large HPC manufacturers, such as IBM or Hewlett-
Packard, or numerous small, specialized integrators of such systems.
At the end of the 1990s clusters were common in academia but mostly as
research objects and not primarily as general purpose computing platforms
for applications. Most of these clusters were of comparably small scale,
and as a result the November 1999 edition of the TOP500 listed only seven
cluster systems. This changed dramatically as industrial and commercial
ScaLAPACK's changes were much more drastic: the same mathematical
operation now required large amounts of tedious work. Both the users
and the library writers were now forced into explicitly controlling data
storage intricacies, because data locality became paramount for perfor-
mance. The victim was the readability of the code, despite efforts to mod-
ularize the code according to the best software engineering practices of
the day.
thread calls this routine, the others wait idly. And since the performance
of DGETF2 is bound by memory bandwidth (rather than processor speed),
this bottleneck will exacerbate scalability problems as systems with more
cores are introduced.
The multithreaded version of the algorithm attacks this problem head-on
by introducing the notion of look-ahead: calculating things ahead of time
to avoid potential stagnation in the progress of the computations. This of
course requires additional synchronization and bookkeeping not present in
the previous versions: a trade-off between code complexity and perfor-
mance. Another aspect of the multi-threaded code is the use of recursion
in the panel factorization. It turns out that the use of recursion can give
even greater performance benefits for tall panel matrices than it does for
the square ones.
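A minimal sketch of such a recursive panel factorization (hypothetical, unpivoted and unblocked for brevity, so not the production kernel): the panel's columns are split in half, the left half is factored recursively, applied to the right half, and the recursion continues on the trailing block.

```c
#include <stddef.h>

/* Recursive LU of an m x n panel (m >= n), column-major with leading
 * dimension lda, no pivoting (a simplifying assumption).  On exit the
 * unit lower trapezoid L and upper triangle U overwrite A. */
void panel_lu_rec(int m, int n, double *A, int lda)
{
    if (n == 1) {                 /* base case: scale below the diagonal */
        double piv = A[0];
        for (int i = 1; i < m; i++)
            A[i] /= piv;
        return;
    }
    int n1 = n / 2, n2 = n - n1;
    double *A12 = A + (size_t)n1 * lda;        /* top-right n1 x n2 block  */
    double *A22 = A12 + n1;                    /* bottom-right trailing block */

    panel_lu_rec(m, n1, A, lda);               /* factor the left half     */

    /* A12 := L11^{-1} A12 (unit lower triangular solve, n1 x n2) */
    for (int j = 0; j < n2; j++)
        for (int k = 0; k < n1; k++)
            for (int i = k + 1; i < n1; i++)
                A12[(size_t)j * lda + i] -=
                    A[(size_t)k * lda + i] * A12[(size_t)j * lda + k];

    /* A22 := A22 - L21 * A12 (rank-n1, matrix-matrix update) */
    for (int j = 0; j < n2; j++)
        for (int k = 0; k < n1; k++) {
            double ukj = A12[(size_t)j * lda + k];
            for (int i = 0; i < m - n1; i++)
                A22[(size_t)j * lda + i] -=
                    A[(size_t)k * lda + n1 + i] * ukj;
        }

    panel_lu_rec(m - n1, n2, A22, lda);        /* recurse on the rest      */
}
```

Because the updates are matrix-matrix operations on progressively larger blocks, most of the work lands in Level-3-style kernels even for tall, narrow panels, which is where the benefit for tall panel matrices comes from.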
The algorithm is the same for each thread (the SIMD paradigm), and the
matrix data is partitioned among threads in a cyclic manner using panels
with pw columns in each panel (except maybe the last). The pw parameter
corresponds to the blocking parameter NB of LAPACK. The difference is the
logical assignment of panels (blocks of columns) to threads. (Physically,
all panels are equally accessible, because the code operates in a shared
memory regime.) The benefits of blocking in a thread are the same as they
were in LAPACK: better cache reuse and less stress on the memory bus.
Assigning a portion of the matrix to a thread seems an artificial requirement
at first, but it simplifies the code and the bookkeeping data structures; most
importantly, it provides better memory affinity. It turns out that multi-core
chips are not symmetric in terms of memory access bandwidth, so min-
imizing the number of reassignments of memory pages to cores directly
benefits performance.
The standard components of LU factorization are represented by the
pfactor() and pupdate() functions (see Fig. 11.6). As one might
expect, the former factors a panel, whereas the latter updates a panel using
one of the previously factored panels.
The main loop makes each thread iterate over each panel in turn. If
necessary, the panel is factored by the owner thread while other threads
wait (if they happen to need this panel for their updates).
The look-ahead logic is inside the nested loop (prefaced by the comment
"for each panel to be updated") that replaces DGEMM or PDGEMM from
previous algorithms. Before each thread updates one of its panels, it checks
whether it's already feasible to factor its first unfactored panel. This
minimizes the number of times the threads have to wait, because each thread
constantly attempts to eliminate the potential bottleneck.
void SMP_dgetrf(int n, double *a, int lda, int *ipiv, int pw,
                int tid, int tsize, int *pready, ptm *mtx, ptc *cnd) {
    int pcnt, pfctr, ufrom, uto, ifrom, p;
    double *pa = a, *pl, *pf, *lp;

    pcnt = n / pw; /* number of panels */

    /* first panel that should be factored by this thread after the
       very first panel (number 0) gets factored */
    pfctr = tid + (tid ? 0 : tsize);

    /* this is a pointer to the last panel */
    lp = a + (size_t)(n - pw) * (size_t)lda;

    /* for each panel (that is used as a source of updates) */
    for (ufrom = 0; ufrom < pcnt;
         ufrom++, pa += (size_t)pw * (size_t)(lda + 1)) {
        p = ufrom * pw; /* column number */

        /* if the panel to be used for updates has not been factored yet;
           the test of 'ipiv' is not strictly required, but it possibly
           avoids accesses to 'pready' */
        if (! ipiv[p + pw - 1] || ! pready[ufrom]) {
            if (ufrom % tsize == tid) { /* if this is this thread's panel */
                pfactor( n - p, pw, pa, lda, ipiv + p, pready, ufrom, mtx, cnd );
            } else if (ufrom < pcnt - 1) { /* if this is not the last panel */
                LOCK( mtx );
                while (! pready[ufrom]) { WAIT( cnd, mtx ); }
                UNLOCK( mtx );
            }
        }

        /* for each panel to be updated */
        for (uto = first_panel_to_update( ufrom, tid, tsize );
             uto < pcnt; uto += tsize) {
            /* if there are still panels to factor by this thread and the
               preceding panel has been factored; the test of 'ipiv' could
               be skipped, but it is there to decrease the number of
               accesses to 'pready' */
            if (pfctr < pcnt && ipiv[pfctr * pw - 1] && pready[pfctr - 1]) {
                /* for each panel that (still) has to update panel 'pfctr' */
                for (ifrom = ufrom + (uto > pfctr ? 1 : 0);
                     ifrom < pfctr; ifrom++) {
                    p = ifrom * pw;
                    pl = a + (size_t)p * (size_t)(lda + 1);
                    pf = pl + (size_t)(pfctr - ifrom) * (size_t)pw * (size_t)lda;
                    pupdate( n - p, pw, pl, pf, lda, p, ipiv, lp );
                }
                p = pfctr * pw;
                pl = a + (size_t)p * (size_t)(lda + 1);
                pfactor( n - p, pw, pl, lda, ipiv + p, pready, pfctr, mtx, cnd );
                pfctr += tsize; /* move to this thread's next panel */
            }

            /* if panel 'uto' hasn't been factored (if it was, it certainly
               has been updated, so no update is necessary) */
            if (uto > pfctr || ! ipiv[uto * pw]) {
                p = ufrom * pw;
                pf = pa + (size_t)(uto - ufrom) * (size_t)pw * (size_t)lda;
                pupdate( n - p, pw, pa, pf, lda, p, ipiv, lp );
            }
        }
    }
}
The multicore processors do not resemble the SMP systems of the past,
nor do they resemble distributed memory systems. In comparison to SMPs,
multicores are much more starved for memory due to the fast increase in
the number of cores, which is not followed by a proportional increase in
bandwidth. Owing to that, data access locality is of much higher importance
in case of multicores. At the same time, they do follow to a large extent the
memory model where the main memory serves as a central (not distributed)
repository for data. For those reasons, the best performing algorithms or
multicores happen to be parallel versions of what used to be known as
out-of-core algorithms (algorithms developed in the past for situations
where data does not fit in the main memory and has to be explicitly moved
between the memory and the disc).
In dense linear algebra the Tile Algorithms are direct descendants of
out-of-core algorithms. The Tile Algorithms are based on the idea of
processing the matrix by square submatrices, referred to as tiles, of
relatively small size. This makes the operations efficient in terms of cache
and TLB use. The Cholesky factorization lends itself readily to tile
formulation; the same is not true, however, for the LU and QR factorizations.
The tile algorithms for them are constructed by factorizing the diagonal
tile first and then incrementally updating the factorization using the entries
below the diagonal tile. This is a well-known concept that dates back
to the work of Gauss. The idea was initially used to build out-of-core
algorithms and was recently rediscovered as a very efficient method for
implementing linear algebra operations on multicore processors. (It is
crucial to note that processing the matrix by square tiles yields
satisfactory performance only when accompanied by a data organization based
on square tiles. This layout is referred to as Square Block Layout or,
simply, Tile Layout.)
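The address arithmetic behind Tile Layout can be made concrete with a small sketch (an illustration with names of our choosing; it assumes a square matrix whose dimension is a multiple of the tile size, with tiles stored column-major inside a column-major tile grid):

```c
#include <stddef.h>

/* Offset of element (i, j) in Tile Layout: the matrix is an mt x mt grid
   of nb x nb tiles; tiles are numbered column-major in the grid and each
   tile is itself stored column-major.  Consecutive elements of a tile are
   therefore contiguous in memory, which is what makes the layout cache-
   and TLB-friendly. */
static size_t tile_offset(int i, int j, int n, int nb) {
    int mt = n / nb;                        /* tiles per column          */
    int ti = i / nb, tj = j / nb;           /* tile coordinates          */
    int ii = i % nb, jj = j % nb;           /* position inside the tile  */
    size_t tile = (size_t)tj * mt + ti;     /* tile number, column-major */
    return tile * (size_t)nb * nb + (size_t)jj * nb + ii;
}
```

In the standard column-major (LAPACK) layout the same element sits at offset j*lda + i, so traversing one tile strides through memory with step lda; in Tile Layout the whole tile occupies one contiguous block of nb*nb elements.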
For parallel execution those algorithms can be scheduled either statically
or dynamically. For static execution (Fig. 11.7) the work for each core is
predetermined and each core follows the cycle: check task dependencies
(and wait if necessary), perform a task, update dependencies, transition
to the next task (using a static transition function). For regular algorithms,
such as dense matrix factorizations, static scheduling is straightforward and
very robust.
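The static transition function is embedded in the Fig. 11.7 listing; extracted and lightly renamed for clarity (first_task/next_task are our names; mt x nt is the tile grid, and rank/size play the roles of PLASMA_RANK/PLASMA_SIZE), it reads:

```c
static int min_int(int a, int b) { return a < b ? a : b; }

/* This thread's first task (k, m, n) under the cyclic distribution of
   tile columns used in Fig. 11.7. */
static void first_task(int *k, int *m, int *n, int nt, int rank) {
    *k = 0;
    *n = rank;
    while (*n >= nt) { (*k)++; *n = *n - nt + *k; }
    *m = *k;
}

/* Advance (k, m, n) to this thread's next task: sweep down column n of
   step k; when the column is exhausted, jump 'size' columns ahead,
   rolling over into the next step k where necessary. */
static void next_task(int *k, int *m, int *n, int mt, int nt, int size) {
    (*m)++;
    if (*m == mt) {
        *n += size;
        while (*n >= nt && *k < min_int(mt, nt)) { (*k)++; *n = *n - nt + *k; }
        *m = *k;
    }
}
```

With one thread and a 2 x 2 tile grid this enumerates (0,0,0), (0,1,0), (0,0,1), (0,1,1), (1,1,1), i.e. the panel factorization, its subdiagonal factorization, the two updates, and the final diagonal factorization, in the order the listing executes them.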
An alternative approach, which emphasizes the ease of development,
is based on writing a serial algorithm and the use of a dynamic scheduler,
which traverses the code and queues tasks for parallel execution, while
automatically keeping track of data dependencies (Fig. 11.8). This approach
relies on the availability of such a scheduler, which is not trivial to develop,
but offers multiple advantages, such as pipelining/streaming of different
stages of the computation (e.g. factorization and solve).
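What "automatically keeping track of data dependencies" amounts to can be illustrated with a deliberately simplified serial sketch (our own illustration, not any real scheduler's API): each task names one input and one output datum, and at insertion time the scheduler records the last writer of every datum and derives which earlier task the new one must wait for.

```c
#include <string.h>

#define MAX_DATA 16

/* One task reads datum 'in' and writes datum 'out'; 'dep' receives the
   index of the task it must wait for, or -1 if it is ready at once.
   (Real schedulers handle arbitrary argument lists and also
   write-after-read hazards; this sketch tracks only RAW and WAW.) */
struct task { int in, out, dep; };

static void resolve_deps(struct task *t, int ntasks) {
    int last_writer[MAX_DATA];
    memset(last_writer, -1, sizeof last_writer);  /* every byte 0xFF -> -1 */
    for (int i = 0; i < ntasks; i++) {
        int raw = last_writer[t[i].in];    /* read-after-write hazard     */
        int waw = last_writer[t[i].out];   /* write-after-write hazard    */
        t[i].dep = raw > waw ? raw : waw;  /* wait for the later writer   */
        last_writer[t[i].out] = i;         /* this task is now the writer */
    }
}
```

If task 0 factors a panel (reads datum 0, writes datum 1) and tasks 1 and 2 both consume the factor (read datum 1, writing data 2 and 3 respectively), the pass leaves tasks 1 and 2 depending only on task 0, so an update and a solve step can overlap — the pipelining mentioned above.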
plasma_unpack_args_3(A, L, IPIV);
work = (PLASMA_Complex64_t*)plasma_private_alloc(plasma, L.mb*L.nb, L.dtyp);
ss_init(A.mt, A.nt, -1);
k = 0; n = PLASMA_RANK;
while (n >= A.nt) {k++; n = n-A.nt+k;}
m = k;
next_m++;
if (next_m == A.mt) {
next_n += PLASMA_SIZE;
while (next_n >= A.nt && next_k < min(A.mt, A.nt)) {next_k++; next_n = next_n-A.nt+next_k;}
next_m = next_k;
}
if (n == k) {
if (m == k) {
ss_cond_wait(k, k, k-1);
CORE_zgetrf(k == A.mt-1 ? A.m-k*A.nb : A.nb, k == A.nt-1 ? A.n-k*A.nb : A.nb, L.mb, A(k, k), A.nb,
IPIV(k, k), &iinfo);
if (PLASMA_INFO == 0 && iinfo > 0 && m == A.mt-1)
PLASMA_INFO = iinfo + A.nb*k;
ss_cond_set(k, k, k);
}
else {
ss_cond_wait(m, k, k-1);
CORE_ztstrf(m == A.mt-1 ? A.m-m*A.nb : A.nb, k == A.nt-1 ? A.n-k*A.nb : A.nb, L.mb, A.nb, A(k, k),
A.nb, A(m, k), A.nb, L(m, k), L.mb, IPIV(m, k), work, L.nb, &iinfo);
if (PLASMA_INFO == 0 && iinfo > 0 && m == A.mt-1)
PLASMA_INFO = iinfo + A.nb*k;
ss_cond_set(m, k, k);
}
}
else {
if (m == k) {
ss_cond_wait(k, k, k);
ss_cond_wait(k, n, k-1);
CORE_zgessm(k == A.mt-1 ? A.m-k*A.nb : A.nb, n == A.nt-1 ? A.n-n*A.nb : A.nb, A.nb, L.mb, IPIV(k, k),
A(k, k), A.nb, A(k, n), A.nb);
}
else {
ss_cond_wait(m, k, k);
ss_cond_wait(m, n, k-1);
CORE_zssssm(A.nb, m == A.mt-1 ? A.m-m*A.nb : A.nb, n == A.nt-1 ? A.n-n*A.nb : A.nb, L.mb, A.nb,
A(k, n), A.nb, A(m, n), A.nb, L(m, k), L.mb, A(m, k), A.nb, IPIV(m, k));
ss_cond_set(m, n, k);
}
}
n = next_n; m = next_m; k = next_k;
}
plasma_private_free(plasma, work);
ss_finalize();
}
Fig. 11.7. Factorization for multicore execution using the SPMD programming
model with static scheduling of work (C code).
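The listing relies on ss_* progress-table primitives whose definitions are not shown. A minimal pthreads-based sketch of their presumed semantics (an assumption about the interface, not the actual PLASMA code) is:

```c
#include <pthread.h>

#define MAX_MT 32
#define MAX_NT 32

/* progress[m][n] holds the highest step k already applied to tile (m, n). */
static int progress[MAX_MT][MAX_NT];
static pthread_mutex_t ss_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ss_cnd = PTHREAD_COND_INITIALIZER;

static void ss_init(int mt, int nt, int val) {
    for (int m = 0; m < mt; m++)
        for (int n = 0; n < nt; n++)
            progress[m][n] = val;
}

/* Announce that tile (m, n) has been updated through step k. */
static void ss_cond_set(int m, int n, int k) {
    pthread_mutex_lock(&ss_mtx);
    progress[m][n] = k;
    pthread_cond_broadcast(&ss_cnd);
    pthread_mutex_unlock(&ss_mtx);
}

/* Block until tile (m, n) has been updated through step k. */
static void ss_cond_wait(int m, int n, int k) {
    pthread_mutex_lock(&ss_mtx);
    while (progress[m][n] < k)
        pthread_cond_wait(&ss_cnd, &ss_mtx);
    pthread_mutex_unlock(&ss_mtx);
}
```

A single mutex and condition variable for the whole table is the simplest correct choice; a production implementation would shard them to reduce contention.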
plasma = plasma_context_self();
Fig. 11.8. Factorization for multicore execution using dynamic task scheduling
(C code).
Another feature common to all the versions presented is the operation
count: they all perform (2/3)n³ floating-point multiplications and/or
additions. What differentiates them is the order of these operations. There
exist algorithms that increase the amount of floating-point work to save
on memory traffic or network transfers (especially for distributed-memory
parallel algorithms). But because the algorithms shown in this chapter have
the same operation count, it is valid to compare them for performance. The
computational rate (the number of floating-point operations per second) may
be used instead of the time to solution, provided that the matrix size is
the same. Comparing computational rates is sometimes preferable, however,
because it allows algorithms to be compared when the matrix sizes differ.
For example, a sequential algorithm on a single processor can be directly
compared with a parallel one working on a much bigger matrix on a large
cluster.
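As a small worked example (the function name is ours), the conversion from time to computational rate under the (2/3)n³ operation count is:

```c
/* Computational rate in Gflop/s for an algorithm performing (2/3) n^3
   floating-point operations, such as the factorizations of this chapter. */
static double gflops_lu(int n, double seconds) {
    double flops = (2.0 / 3.0) * (double)n * (double)n * (double)n;
    return flops / seconds / 1.0e9;
}
```

A solver that factors an n = 10000 matrix in 30 s thus runs at roughly 22 Gflop/s, and that rate, unlike the raw time, can be compared against runs on other matrix sizes.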