
Quantum Nostalgia: Rebuilding Pekeris’ Helium Solver in Python

Try the code yourself!

Click the button below to launch an IPython notebook on Google Colab that implements the code developed in this post:

Open In Colab

In 1950, Chaim L. Pekeris wrote to John von Neumann about the numerical difficulty of the helium atom. Von Neumann replied with characteristic optimism:

“The differential equations which you give as determining the problem of Helium II do not look vicious. How bad are they from the numerical point of view? They should certainly not be bad for high-speed computer.”

That verdict was both correct and premature. The helium atom was not analytically impossible, but it was computationally vicious enough that Pekeris and Accad eventually needed WEIZAC, one of the earliest stored-program electronic computers, to solve it properly.

The Pekeris method was the starting point for my PhD work on high-accuracy quantum mechanics of many-body atoms and molecules, and here I want to do it justice: showing its theoretical elegance, but also how to implement it in a modern programming language, Python. When I first read the 1958 paper it took time to understand some of the intricacies, especially how to build the determinant, which we will cover in detail.

Why Helium Is Hard

The hydrogen atom is separable and elegant; see my previous blog post here. Helium is neither.

In the infinite-nuclear-mass, non-relativistic approximation, the helium ground state depends on three inter-particle distances,

\[r_1,\quad r_2,\quad r_{12},\]

where $r_1$ and $r_2$ are the electron-nucleus distances and $r_{12}$ is the electron-electron separation. The fact that we can reduce to these three scalars already removes a great deal of complexity: we do not need to track the individual Cartesian positions of each particle, because the ground state has no preferred orientation in space.

That reduction still leaves a three-variable partial differential equation. Writing it out in full:

\[\frac{\partial^2\Psi}{\partial r_1^2} + \frac{2}{r_1}\frac{\partial \Psi}{\partial r_1} + \frac{\partial^2\Psi}{\partial r_2^2} + \frac{2}{r_2}\frac{\partial\Psi}{\partial r_2} + 2\frac{\partial^2\Psi}{\partial r_{12}^2} + \frac{4}{r_{12}}\frac{\partial\Psi}{\partial r_{12}} + \frac{r_1^2-r_2^2+r_{12}^2}{r_1r_{12}}\frac{\partial^2\Psi}{\partial r_1\partial r_{12}} + \frac{r_2^2-r_1^2+r_{12}^2}{r_2r_{12}}\frac{\partial^2\Psi}{\partial r_2\partial r_{12}} + 2\left(E + \frac{Z}{r_1} + \frac{Z}{r_2} - \frac{1}{r_{12}}\right)\Psi = 0.\]

Each group of terms has a clear physical origin:

  • $\partial^2/\partial r_1^2 + (2/r_1)\,\partial/\partial r_1$ and the corresponding terms in $r_2$: these are the radial kinetic energies for each electron considered in isolation, written in spherical coordinates. The $2/r$ coefficient arises from the angular part of the Laplacian when the wavefunction has no angular dependence (the $s$-state).
  • $2\,\partial^2/\partial r_{12}^2 + (4/r_{12})\,\partial/\partial r_{12}$: this is the kinetic energy associated with the relative motion of the two electrons. The factor of 2 in front of the second derivative comes from the reduced mass of the electron pair.
  • The cross-derivative terms $\partial^2/\partial r_1\partial r_{12}$ and $\partial^2/\partial r_2\partial r_{12}$: these arise because $r_1$, $r_2$, and $r_{12}$ are geometrically coupled. When $r_1$ changes at fixed $r_2$, the angle between the two electron-nucleus vectors changes, which also changes $r_{12}$. The angular dependence of the geometry mixes the three coordinates, producing these off-diagonal second derivatives with their $r$-dependent prefactors.
  • $2(E + Z/r_1 + Z/r_2 - 1/r_{12})\Psi$: the potential energy. $-Z/r_1$ and $-Z/r_2$ are the nuclear attractions, $+1/r_{12}$ is the electron-electron repulsion, and $E$ shifts the equation to seek the bound-state energies.

Why hydrogen separates but helium does not. For hydrogen there is one electron and one nucleus: the Schrödinger equation in spherical coordinates separates cleanly into radial and angular parts, each solvable exactly. Helium has two electrons. The $1/r_{12}$ repulsion term couples them irremovably. No change of variables will decouple it into a product of single-particle functions, because it depends on the angle between the two electrons and cannot be expressed as a function of $r_1$ and $r_2$ alone.

The triangle inequality problem. Beyond the coupling in the PDE itself, the domain is geometrically constrained. The three distances must satisfy the triangle inequality for every configuration of the three particles:

\[|r_1 - r_2| \le r_{12} \le r_1 + r_2,\]

and analogous inequalities for the other pairs. This means the domain is not the simple orthant $r_1, r_2, r_{12} \ge 0$: it is a complicated curved region whose shape depends on $r_1$ and $r_2$. Any basis expansion must either work on this constrained domain, imposing complicated boundary conditions on every face, or find coordinates that make the constraint invisible.
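Before fixing this with a change of coordinates, it is easy to convince yourself numerically that the constraint always holds. A minimal sketch (plain NumPy, nucleus at the origin; not part of the solver):

```python
import numpy as np

rng = np.random.default_rng(0)
violations = 0
for _ in range(1000):
    # Random positions for the two electrons; the nucleus sits at the origin.
    e1, e2 = rng.normal(size=3), rng.normal(size=3)
    r1, r2 = np.linalg.norm(e1), np.linalg.norm(e2)
    r12 = np.linalg.norm(e1 - e2)
    # |r1 - r2| <= r12 <= r1 + r2 must hold for every configuration.
    if not (abs(r1 - r2) - 1e-12 <= r12 <= r1 + r2 + 1e-12):
        violations += 1
print(violations)  # 0
```

No sampled configuration can violate the triangle inequality, which is exactly why the domain restriction is geometric rather than dynamical.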

Pekeris’ first key insight was to remove that geometrical obstacle entirely.

Pekeris’ Main Ideas

1. Use Perimetric Coordinates

Parametrising the energy. A bound state has negative energy. Write it as

\[E = -\epsilon^2, \quad \epsilon > 0,\]

so that $\epsilon = \sqrt{-E}$ is a positive real number with units of inverse length. It will serve both as the energy parameter we are solving for and as a natural length scale for the coordinates.

Defining the perimetric coordinates. Using $\epsilon$, define three new variables:

\[\begin{aligned} u &= \epsilon(r_2 + r_{12} - r_1), \\ v &= \epsilon(r_1 + r_{12} - r_2), \\ w &= 2\epsilon(r_1 + r_2 - r_{12}). \end{aligned}\]

Each one measures the “excess” of a triangle perimeter over one side, scaled by $\epsilon$. Geometrically, $u/\epsilon$ is the amount by which the sum $r_2 + r_{12}$ exceeds $r_1$; if the three distances formed a degenerate triangle with $r_1 = r_2 + r_{12}$, then $u$ would be zero.

Why $u, v, w \ge 0$. This is the key insight, and it follows directly from the triangle inequality. The three particles, nucleus, electron 1, and electron 2, define a triangle with sides $r_1$, $r_2$, $r_{12}$. For any physical configuration:

\[r_{12} \le r_1 + r_2 \;\Rightarrow\; w = 2\epsilon(r_1+r_2-r_{12}) \ge 0,\] \[r_1 \le r_2 + r_{12} \;\Rightarrow\; u = \epsilon(r_2+r_{12}-r_1) \ge 0,\] \[r_2 \le r_1 + r_{12} \;\Rightarrow\; v = \epsilon(r_1+r_{12}-r_2) \ge 0.\]

The three triangle inequalities become precisely the three non-negativity conditions. Because those inequalities must always hold for any physical configuration, $u$, $v$, and $w$ are automatically non-negative and there is no further constraint between them. They are free to range over $[0,\infty)$ independently.

The geometric obstacle has been absorbed into the definition of the coordinates. There is no constrained domain left to worry about.

The image below visualises inter-particle coordinates on the left and perimetric coordinates on the right. The trick is that each interparticle coordinate is shared between two sides of the triangle, so each perimetric coordinate carries information from two edges. In the image the perimetric coordinates are labelled using $z_i$ instead of $u,v,w$; these are the unscaled variants without the $\epsilon$ factor. (I was lazy and grabbed the image from my thesis…).

Desktop View

The inverse transformation. Adding pairs of the defining equations recovers the original interparticle distances:

\[r_{12} = \frac{u+v}{2\epsilon}, \quad r_1 = \frac{2v+w}{4\epsilon}, \quad r_2 = \frac{2u+w}{4\epsilon}.\]

This shows that the transformation is invertible everywhere in the domain $u, v, w \ge 0$, confirming that no physical configuration has been lost.
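A small sanity check of the forward and inverse maps (the helper names are my own, not from the paper):

```python
import numpy as np

def to_perimetric(r1, r2, r12, eps):
    # u, v, w as defined by Pekeris, scaled by eps
    u = eps * (r2 + r12 - r1)
    v = eps * (r1 + r12 - r2)
    w = 2 * eps * (r1 + r2 - r12)
    return u, v, w

def from_perimetric(u, v, w, eps):
    # Inverse transformation: r1, r2, r12
    return (2*v + w) / (4*eps), (2*u + w) / (4*eps), (u + v) / (2*eps)

eps = 1.7                      # arbitrary positive energy parameter
r1, r2, r12 = 0.9, 1.4, 2.0    # a valid triangle of distances
u, v, w = to_perimetric(r1, r2, r12, eps)
assert min(u, v, w) >= 0       # triangle inequality absorbed into the domain
back = from_perimetric(u, v, w, eps)
assert np.allclose(back, (r1, r2, r12))   # round trip recovers the distances
```

The round trip is exact for any valid triangle, and the non-negativity of $(u,v,w)$ is just the triangle inequality in disguise.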

2. Factor Out the Exponential Decay

The asymptotic form. Any normalisable bound-state wavefunction must vanish as the electrons recede to infinity. For large $r_1$ at fixed $r_2$, electron 1 sees the nucleus screened by electron 2, giving an effective nuclear charge approaching $Z-1$. For helium ($Z=2$) the effective charge approaches 1. The leading asymptotic behaviour of the wavefunction is therefore

\[\Psi \sim e^{-\epsilon\, r_1} \cdot e^{-\epsilon\, r_2} = e^{-\epsilon(r_1 + r_2)}.\]

The connection to perimetric coordinates. Using the inverse transformation, the sum of the two electron-nucleus distances is

\[r_1 + r_2 = \frac{2v+w}{4\epsilon} + \frac{2u+w}{4\epsilon} = \frac{u+v+w}{2\epsilon}.\]

So the asymptotic factor becomes

\[e^{-\epsilon(r_1+r_2)} = e^{-(u+v+w)/2}.\]

Pekeris therefore writes

\[\Psi(u,v,w) = e^{-(u+v+w)/2}F(u,v,w),\]

where $F$ is a smooth, algebraically-growing function with no exponential tail. The advantage is twofold. First, the exponential decay that any bound-state basis must represent has been handled exactly upfront. Second, and more importantly, the standard Laguerre polynomials are orthogonal on $[0,\infty)$ with respect to the weight $e^{-x}$. The prefactor $e^{-(u+v+w)/2}$ is precisely the product weight for the three coordinates, so the basis functions $e^{-(u+v+w)/2}L_l(u)L_m(v)L_n(w)$ are orthogonal over the full domain.
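Both claims are easy to verify numerically. The sketch below checks the exponent identity directly from the definitions of $u,v,w$, and uses Gauss-Laguerre quadrature (which integrates against the weight $e^{-x}$) to confirm the orthogonality statement; it is an illustration, not part of the solver:

```python
import numpy as np
from numpy.polynomial.laguerre import laggauss, lagval

eps, r1, r2, r12 = 1.3, 0.8, 1.1, 1.5
u = eps * (r2 + r12 - r1)
v = eps * (r1 + r12 - r2)
w = 2 * eps * (r1 + r2 - r12)
# u + v + w collapses to 2*eps*(r1 + r2), so the two exponentials agree
assert np.isclose(np.exp(-eps * (r1 + r2)), np.exp(-(u + v + w) / 2))

# Gauss-Laguerre quadrature integrates f(x) e^{-x} over [0, inf) exactly for
# polynomial f, so it can confirm <L_2, L_3> = 0 and <L_2, L_2> = 1.
x, wts = laggauss(20)
L2 = lagval(x, [0, 0, 1])      # L_2 in the Laguerre basis
L3 = lagval(x, [0, 0, 0, 1])   # L_3
assert abs(np.sum(wts * L2 * L3)) < 1e-10
assert np.isclose(np.sum(wts * L2 * L2), 1.0)
```

The standard Laguerre polynomials are orthonormal under $e^{-x}$, which is why the factored-out exponential is exactly the right weight.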

3. Expand in Laguerre Polynomials

With the exponential factor peeled off, Pekeris expands $F$ in a triple product of Laguerre polynomials:

\[F(u,v,w) = \sum_{l,m,n=0}^{\infty} A(l,m,n)L_l(u)L_m(v)L_n(w),\]

where $L_n$ are the standard Laguerre polynomials.

This is not a generic polynomial choice; it is physics-aware on several levels:

  • Laguerre polynomials are defined naturally on $[0,\infty)$, exactly the domain of each perimetric coordinate.
  • Together with the exponential weight $e^{-x}$, they form a complete orthogonal set, so any smooth function on $[0,\infty)$ can be represented exactly in the limit of infinitely many terms.
  • Associated Laguerre polynomials appear as the exact solutions to the hydrogen radial equation, so the standard Laguerre polynomials here are a natural generalisation of what works for the one-electron problem.

Why these polynomials convert the PDE into a recurrence. The transformed wave equation (after expanding $\Psi = e^{-(u+v+w)/2}F$ and substituting into the PDE) contains terms of the form $u\cdot F$, $u\cdot \partial F/\partial u$, and $u\cdot \partial^2 F/\partial u^2$, and similarly for $v$ and $w$. The crucial property of Laguerre polynomials is that each of these operations maps $L_n(x)$ to a finite linear combination of its neighbours:

\[\begin{aligned} xL_n''(x) &= (x-1)L_n'(x) - nL_n(x), \\ xL_n'(x) &= nL_n(x) - nL_{n-1}(x), \\ xL_n(x) &= -(n+1)L_{n+1}(x) + (2n+1)L_n(x) - nL_{n-1}(x). \end{aligned}\]

To see this concretely, consider the simplest contribution: the term $u\cdot L_l(u)$ that appears when the Coulomb factor $1/r_1 = 4\epsilon/(2v+w)$ is rewritten in perimetric form, denominators are cleared, and the coordinate $u$ is left as a prefactor. Applying the third identity:

\[u\cdot L_l(u) = -(l+1)L_{l+1}(u) + (2l+1)L_l(u) - l\,L_{l-1}(u).\]

This replaces a product involving the continuous variable $u$ with a linear combination of coefficients at integer indices $l-1$, $l$, and $l+1$. No integration is needed; the continuous structure of the PDE has been converted into a discrete shift in an integer index. The derivative identities work analogously: $u \cdot L_l'(u) = l\,L_l(u) - l\,L_{l-1}(u)$ shifts $l$ by 0 and $-1$, and $u \cdot L_l''(u)$ can be reduced via the first identity and then the second.

When the full PDE is expanded in this way across all three coordinates $u$, $v$, $w$ simultaneously, every continuous differential operator is replaced by a finite combination of integer index shifts $l \to l + \alpha$, $m \to m + \beta$, $n \to n + \gamma$. The result is a purely algebraic recurrence relation for the coefficients $A(l,m,n)$.
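The three Laguerre identities can be checked numerically with NumPy's Laguerre module; `L` and `dL` below are my own convenience wrappers, not solver code:

```python
import numpy as np
from numpy.polynomial.laguerre import lagval, lagder

def L(n, x):
    """Evaluate the standard Laguerre polynomial L_n at x."""
    c = np.zeros(n + 1); c[n] = 1.0
    return lagval(x, c)

def dL(n, x, k=1):
    """Evaluate the k-th derivative of L_n at x."""
    c = np.zeros(n + 1); c[n] = 1.0
    return lagval(x, lagder(c, k))

x = np.linspace(0.1, 5.0, 7)
n = 4
# x L_n'' = (x - 1) L_n' - n L_n   (the Laguerre differential equation)
assert np.allclose(x * dL(n, x, 2), (x - 1) * dL(n, x) - n * L(n, x))
# x L_n' = n L_n - n L_{n-1}
assert np.allclose(x * dL(n, x), n * L(n, x) - n * L(n - 1, x))
# x L_n = -(n+1) L_{n+1} + (2n+1) L_n - n L_{n-1}
assert np.allclose(x * L(n, x),
                   -(n + 1) * L(n + 1, x) + (2 * n + 1) * L(n, x)
                   - n * L(n - 1, x))
```

Each identity maps a coordinate-multiplied term to a short stencil of neighbouring indices, which is precisely what turns the PDE into a recurrence.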

4. Derive the 33-Term Recurrence

This is the core of the Pekeris method.

After applying the Laguerre identities to every term in the transformed PDE, collecting like terms, and accounting for the symmetry between $u$ and $v$, the result is Eq. (22): a 33-term recurrence relation of the form

\[\sum_{\alpha,\beta,\gamma} C_{\alpha\beta\gamma}(l,m,n)\, A(l+\alpha,\,m+\beta,\,n+\gamma) = 0.\]

Where do the 33 terms come from? Each Laguerre identity shifts one index by at most $\pm 1$ or $\pm 2$. The transformed PDE in perimetric coordinates contains:

  • Pure second-derivative terms in $u$ and $v$ (from the radial kinetic energy and the $\partial^2/\partial r_i^2$ operators): after applying $xL_n'' = (x-1)L_n' - nL_n$ and then $xL_n' = nL_n - nL_{n-1}$, these generate shifts of $0, \pm1, \pm2$ in $l$ and $m$, producing the 6 “raising” terms $(2,0,0)$, $(0,2,0)$, $(1,1,0)$, $(1,0,1)$, $(0,1,1)$, $(0,0,2)$ and their 6 “lowering” mirrors $(-2,0,0)$, $(0,-2,0)$, $(-1,-1,0)$, $(-1,0,-1)$, $(0,-1,-1)$, $(0,0,-2)$.
  • First-derivative terms in $u$, $v$, and $w$ (from the first-order operators in the PDE): applying $xL_n' = nL_n - nL_{n-1}$ generates single-step shifts $(\pm1,0,0)$, $(0,\pm1,0)$, $(0,0,\pm1)$; 6 more terms.
  • Cross-derivative terms $\partial^2/\partial r_1\partial r_{12}$ and $\partial^2/\partial r_2\partial r_{12}$: in perimetric coordinates these mix shifts across different indices, giving the 4 mixed terms $(0,2,-1)$, $(2,0,-1)$, $(-1,0,2)$, $(0,-1,2)$ and their 4 counterparts $(1,0,-2)$, $(0,1,-2)$, $(-2,0,1)$, $(0,-2,1)$.
  • The diagonal term $(0,0,0)$: every $(2n+1)L_n$ and $(2l+1)L_l$ middle-index contribution from all three $xL_n$ identities accumulates here, making it the most algebraically complex single entry but always guaranteed to land in the basis.
  • Exchange terms: the symmetry between $u$ and $v$ (corresponding to the two electrons) generates 6 mixed-exchange shifts $(-1,1,0)$, $(1,-1,0)$, $(-1,0,1)$, $(0,-1,1)$, $(1,0,-1)$, $(0,1,-1)$.

Counting: $12 + 6 + 8 + 1 + 6 = 33$.
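As a sanity check on the bookkeeping, the shift tuples listed in the groups above can be collected and counted (the group names are mine):

```python
# Shift tuples (alpha, beta, gamma) as enumerated in the groups above
raising  = [(2,0,0), (0,2,0), (1,1,0), (1,0,1), (0,1,1), (0,0,2)]
lowering = [tuple(-x for x in s) for s in raising]   # the 6 mirrors
single   = [(1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)]
mixed    = [(0,2,-1), (2,0,-1), (-1,0,2), (0,-1,2),
            (1,0,-2), (0,1,-2), (-2,0,1), (0,-2,1)]
diagonal = [(0,0,0)]
exchange = [(-1,1,0), (1,-1,0), (-1,0,1), (0,-1,1), (1,0,-1), (0,1,-1)]

shifts = raising + lowering + single + mixed + diagonal + exchange
assert len(shifts) == 33          # 12 + 6 + 8 + 1 + 6
assert len(set(shifts)) == 33     # every shift is distinct
```

Every one of the 33 shifts appears exactly once, matching the 33 terms of Eq. (22).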

Each coefficient $C_{\alpha\beta\gamma}$ splits into a part independent of $\epsilon$ (denoted $a$) and a part linear in $\epsilon$ (denoted $b$). This split is inevitable: the $\epsilon$ appears in the perimetric coordinates and in the energy $E = -\epsilon^2$, so the PDE coefficients are at most linear in $\epsilon$ after clearing denominators.

Instead of solving a differential equation in continuous variables, the whole problem is now a structured algebraic system on integer indices. That is the step that made WEIZAC useful at all; the machine did not need to integrate differential equations, it needed to multiply numbers and accumulate sums into a matrix.

From Recurrence to Determinant

This is the part that took me the most time to understand the first time through, so let me slow it down.

The recurrence is an infinite system: for every triple $(l,m,n)$ with $l,m,n \ge 0$, Eq. (22) gives one equation coupling $A(l,m,n)$ to 32 neighbours. We have infinitely many unknowns and infinitely many equations. Pekeris rewrites this schematically as

\[\sum_{\alpha,\beta,\gamma} C_{\alpha,\beta,\gamma}(l,m,n)\, A(l+\alpha,m+\beta,n+\gamma)=0.\]

That means: pick one specific basis triple $(l,m,n)$, write down the 33 shifted terms from Eq. (22), and that gives one linear equation in many unknown coefficients $A(l',m',n')$.

So one basis triple produces one row of the matrix. The next three steps turn the infinite system into a finite, solvable one.

Symmetry Reduction

The helium ground state has the spin configuration $1^1S$, a singlet, meaning the two electrons have opposite spin. For a singlet, the spin part of the wavefunction is antisymmetric under electron exchange, so the spatial part must be symmetric:

\[\Psi(r_1, r_2, r_{12}) = \Psi(r_2, r_1, r_{12}).\]

In the perimetric coordinate expansion, swapping the two electrons is equivalent to swapping $u \leftrightarrow v$ (since $u$ involves $r_2$ while $v$ involves $r_1$, and both involve $r_{12}$). Because $L_l(u)L_m(v)$ and $L_m(u)L_l(v)$ are exchanged under $u \leftrightarrow v$, the symmetry constraint on the coefficients is:

\[A(l,m,n)=A(m,l,n).\]

This means the coefficients for $(l,m,n)$ and $(m,l,n)$ are identical, so we only need to store one of each pair. Pekeris keeps the states where $l \le m$. For states with $l = m$ only one copy exists anyway; for states with $l \ne m$, keeping only $l \le m$ cuts the basis size roughly in half.

Pekeris calls this the symmetrical case. An antisymmetric spatial wavefunction would correspond to a triplet spin state ($1^3S$), which gives the first excited state of helium, not the ground state.

Truncation

The infinite set of all triples $(l,m,n)$ with $l \le m$ and $l,m,n \ge 0$ is truncated to a finite basis by imposing a maximum polynomial degree:

\[l+m+n \le \omega.\]

The parameter $\omega$ is what Pekeris calls the polynomial order (Table III of the paper). The basis size grows cubically with $\omega$:

| $\omega$ | Basis size |
|---------:|-----------:|
| 1        | 3          |
| 3        | 13         |
| 5        | 34         |
| 10       | 161        |
| 12       | 252        |

Increasing $\omega$ enlarges the subspace, includes more basis functions, and allows the energy to converge toward the exact result. The truncation is the only approximation in the method: the recurrence relation, the Laguerre expansion, and the perimetric transformation are all exact.
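The basis sizes in the table can be reproduced by a brute-force count of the symmetry-reduced basis (a quick sketch, not the solver's own routine):

```python
def basis_size(omega: int) -> int:
    """Count triples (l, m, n) with l + m + n <= omega and l <= m."""
    return sum(1
               for l in range(omega + 1)
               for m in range(omega + 1 - l)
               for n in range(omega + 1 - l - m)
               if l <= m)

sizes = {w: basis_size(w) for w in (1, 3, 5, 10, 12)}
print(sizes)  # {1: 3, 3: 13, 5: 34, 10: 161, 12: 252}
```

The counts match the table above, confirming that the symmetry reduction roughly halves the unrestricted $l+m+n \le \omega$ count.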

Ordering the Basis

The basis triples must now be turned into a one-dimensional index. Pekeris gives a closed-form map in the paper, but for implementation the simplest thing is just to generate the states in the same order as his Table I:

def symmetric_basis_for_omega(omega: int) -> list[tuple[int, int, int]]:
    basis = []
    for ww in range(omega + 1):
        for vv in range(ww + 1):
            for uu in range(vv + 1):
                l = uu
                m = vv - uu
                n = ww - vv
                if l + m + n <= omega and l <= m:
                    basis.append((l, m, n))
    return basis

For $\omega=3$ this starts

\[(0,0,0),\ (0,0,1),\ (0,1,0),\ (0,0,2),\ (0,1,1),\ldots\]

which is exactly the pattern shown in Table I for the symmetrical case.

What the Row and Column Indices Mean

This is the key bookkeeping idea:

  • row $j$ means: write Eq. (22) for the basis state $(l,m,n)$ at position $j$;
  • column $k$ means: look at the coefficient multiplying the basis state $(l',m',n')$ at position $k$.

So the matrix entry is simply:

the coefficient of $A(l',m',n')$ in the recurrence equation written for $A(l,m,n)$.

That is all.

How One Entry Is Built

Suppose the current row corresponds to $(l,m,n)$. One term in Eq. (22) might be, for example,

\[4(l+1)(l+2)\left[-Z+\epsilon(1+m+n)\right]A(l+2,m,n).\]

That tells us immediately:

  • the target column is the state $(l+2,m,n)$,
  • the constant contribution is $-4(l+1)(l+2)Z$,
  • the $\epsilon$ contribution is $4(l+1)(l+2)(1+m+n)$.

If that target triple is valid and survives truncation, we add those two pieces into the matrix entry for that row and that column.

If the shifted target comes out with $l' > m'$, we fold it back into the singlet basis by using the symmetry

\[A(l',m',n') = A(m',l',n').\]

For me this was the reason the $(l,m,n)$ to $(l',m',n')$ bookkeeping felt confusing: the raw shifted target is not always already in canonical $l' \le m'$ order.

The Determinant Condition

After applying symmetry reduction and truncation, the matrix system $\mathbf{C}\,\mathbf{B} = \mathbf{0}$ is finite. Each entry has the form

\[C_{jk} = a_{jk} + \epsilon\, b_{jk},\]

where $a_{jk}$ is the constant-in-$\epsilon$ part of the coefficient of $A(l_k,m_k,n_k)$ in the recurrence for row $j$, and $b_{jk}$ is the $\epsilon$-linear part.

Why the determinant must vanish. The system $\mathbf{C}\,\mathbf{B} = \mathbf{0}$ is homogeneous. We want a non-trivial wavefunction (not $\mathbf{B} = \mathbf{0}$). A homogeneous linear system has a non-zero solution only when the matrix is singular:

\[\det(\mathbf{C}) = \det(\mathbf{a} + \epsilon\,\mathbf{b}) = 0.\]

Since each entry is linear in $\epsilon$, the determinant is a polynomial in $\epsilon$ of degree at most $N$ (the matrix size). Its roots are the allowed values of $\epsilon$, i.e., the allowed energies $E = -\epsilon^2$. The ground state corresponds to the largest root $\epsilon$ (largest $\epsilon$ $\Rightarrow$ largest $\epsilon^2$ $\Rightarrow$ most negative energy).

From determinant to eigenvalue problem. Finding roots of a high-degree polynomial is expensive and numerically fragile. The standard approach is to rewrite the determinant condition as a generalised eigenvalue problem:

\[(-\mathbf{a})\mathbf{x} = \epsilon\,\mathbf{b}\mathbf{x}.\]

The sign flip on $\mathbf{a}$ puts the problem into the form expected by standard numerical solvers. The eigenvalues $\epsilon$ of this pencil are the allowed energies, computed in $O(N^3)$ time. Once the largest positive real eigenvalue $\epsilon$ is found:

\[E = -\epsilon^2.\]

Smaller positive eigenvalues from the same solve give excited states in the same symmetry class; running the analogous antisymmetric (triplet) recurrence is how Pekeris also computed the $2^3S$ state.
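A toy $2\times 2$ pencil makes the equivalence between the determinant condition and the generalised eigenvalue problem concrete (the matrices are made up for illustration, not taken from the helium problem):

```python
import numpy as np
from scipy.linalg import eig

# Toy pencil: det(a + eps*b) = (eps - 2)(eps - 3), so the roots are 2 and 3.
a = np.array([[-2.0, 0.0],
              [0.0, -3.0]])
b = np.eye(2)

# (-a) x = eps b x has exactly those roots as its eigenvalues
eigenvalues = eig(-a, b, left=False, right=False)
roots = np.sort(eigenvalues.real)
assert np.allclose(roots, [2.0, 3.0])
# "Ground state" = largest root: eps = 3, so E = -eps**2 = -9 in this toy.
```

The eigenvalue solve finds all roots of the degree-$N$ determinant polynomial at once, in $O(N^3)$ time, without ever forming the polynomial.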

The Direct Python Implementation

The full direct implementation is in PekerisCode/pekeris.py, and the interactive notebook is linked at the top of this post. Below I walk through each function in the order it is called, explaining both what it does and why it is structured the way it is. This is just the way I programmed it; it is not optimised, since the purpose is explanation and understanding rather than computational speed.

Step 1: Storing results cleanly

from dataclasses import dataclass

@dataclass(frozen=True)
class SolveResult:
    omega: int
    basis_size: int
    epsilon: float
    energy_hartree: float
    build_seconds: float
    solve_seconds: float
    total_seconds: float
    nuclear_charge: int

A frozen dataclass rather than a plain tuple or dictionary means every field is named, typed, and immutable. The benchmark loop returns a list of SolveResult objects, and each one carries everything needed to build the convergence table: the polynomial order, the basis size, the energy, and the timing breakdown.

Step 2: Generating the symmetry-reduced basis

def symmetric_basis_for_omega(omega: int) -> list[tuple[int, int, int]]:
    basis = []
    for ww in range(omega + 1):
        for vv in range(ww + 1):
            for uu in range(vv + 1):
                l = uu
                m = vv - uu
                n = ww - vv
                if l + m + n <= omega and l <= m:
                    basis.append((l, m, n))
    return basis

The critical constraint is l <= m. For the $1^1S$ ground state the wavefunction is symmetric under electron exchange:

\[A(l,m,n) = A(m,l,n).\]

This means we only need to store one of each pair $(l,m,n)$ and $(m,l,n)$ when $l \ne m$. Keeping both would double-count those contributions and corrupt the matrix.

The auxiliary variables ww, vv, uu parameterise all non-negative integer triples $(l,m,n)$ with $l+m+n \le \omega$. Specifically:

  • ww $= l+m+n$ (the total polynomial degree),
  • vv $= l+m$,
  • so $l = $ uu, $m = $ vv - uu, $n = $ ww - vv.

This ordering matches Pekeris’ Table I exactly. For $\omega = 3$ the function returns:

| Index | $(l, m, n)$ | $l+m+n$ |
|------:|:-----------:|:-------:|
| 0     | $(0, 0, 0)$ | 0       |
| 1     | $(0, 0, 1)$ | 1       |
| 2     | $(0, 1, 0)$ | 1       |
| 3     | $(0, 0, 2)$ | 2       |
| 4     | $(0, 1, 1)$ | 2       |
| 5     | $(0, 2, 0)$ | 2       |
| 6     | $(1, 1, 0)$ | 2       |
| 7     | $(0, 0, 3)$ | 3       |
| 8     | $(0, 1, 2)$ | 3       |
| 9     | $(0, 2, 1)$ | 3       |
| 10    | $(1, 1, 1)$ | 3       |
| 11    | $(0, 3, 0)$ | 3       |
| 12    | $(1, 2, 0)$ | 3       |

Step 3: Folding shifted states back into canonical form

When Eq. (22) applies a shift $(\Delta l, \Delta m, \Delta n)$ to the current state $(l, m, n)$, the result $(l', m', n')$ may come out with $l' > m'$, placing it in the half of the space we discarded. The singlet symmetry immediately rescues us:

\[A(l', m', n') = A(m', l', n').\]

So we swap the first two indices whenever needed:

def canonical_singlet_state(l: int, m: int, n: int) -> tuple[int, int, int]:
    return (l, m, n) if l <= m else (m, l, n)

Without this, shifted targets with $l' > m'$ would be absent from the index dictionary and their contributions would be silently lost from the matrix. The physical meaning is simply that the two electrons are indistinguishable, so $A(l', m', n')$ and $A(m', l', n')$ are the same expansion coefficient.
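Repeating the two-line helper so the example runs on its own, the folding acts like this:

```python
def canonical_singlet_state(l: int, m: int, n: int) -> tuple[int, int, int]:
    # Fold any (l, m, n) into the stored l <= m half of the basis
    return (l, m, n) if l <= m else (m, l, n)

# A raw shifted target with l' > m' is folded onto its stored twin:
assert canonical_singlet_state(2, 1, 1) == (1, 2, 1)
# Targets already satisfying l <= m pass through unchanged:
assert canonical_singlet_state(0, 1, 2) == (0, 1, 2)
```

The fold is idempotent, so it is safe to apply it to every shifted target before the column lookup.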

Step 4: Encoding Eq. (22) the 33-term recurrence

This is the algebraic heart of the implementation. For each basis state $(l, m, n)$, Pekeris’ Eq. (22) gives a linear combination of 33 shifted coefficients:

\[\sum_{\alpha,\beta,\gamma} C_{\alpha\beta\gamma}(l,m,n)\, A(l+\alpha, m+\beta, n+\gamma) = 0.\]

Each coefficient splits into a constant part and a part linear in $\epsilon$:

\[C_{\alpha\beta\gamma}(l,m,n) = a_{\alpha\beta\gamma}(l,m,n) + \epsilon \cdot b_{\alpha\beta\gamma}(l,m,n).\]

The function eq22_terms returns all 33 entries as (shift, a, b) triples. The terms fall into natural groups based on which Laguerre identity produced them:

# +2 shifts: from the xL_n''(x) identity on u, v, and their cross terms
add((2, 0, 0), -4*(l+1)*(l+2)*z,       4*(l+1)*(l+2)*(1+m+n))
add((0, 2, 0), -4*(m+1)*(m+2)*z,       4*(m+1)*(m+2)*(1+l+n))
add((1, 1, 0),  4*(l+1)*(m+1)*(1-2*z), 4*(l+1)*(m+1)*(2+l+m))
add((1, 0, 1),  2*(l+1)*(n+1)*(1-2*z), 2*(l+1)*(n+1)*(2+2*m+n))
add((0, 1, 1),  2*(m+1)*(n+1)*(1-2*z), 2*(m+1)*(n+1)*(2+2*l+n))
add((0, 0, 2),  (n+1)*(n+2),            0)

# +1 shifts: from the xL_n'(x) identity
add((1, 0, 0),  (l+1)*(4*z*(4*l+4*m+2*n+7) - 8*m - 4*n - 6),
               -2*(l+1)*((m+n)*(4*m+12*l) + n**2 + 12*l + 18*m + 15*n + 14))
add((0, 1, 0),  (m+1)*(4*z*(4*l+4*m+2*n+7) - 8*l - 4*n - 6),
               -2*(m+1)*((l+n)*(4*l+12*m) + n**2 + 12*m + 18*l + 15*n + 14))
add((0, 0, 1),  4*(n+1)*(z*(2*l+2*m+2) - l - m - n - 2),
                4*(n+1)*(l**2 + m**2 - 4*l*m - 2*l*n - 2*m*n - 3*l - 3*m - 2*n - 2))

# mixed +2/-1: from cross-derivative terms in the perimetric PDE
add((0, 2, -1),  0,  4*(m+1)*(m+2)*n)
add((2, 0, -1),  0,  4*(l+1)*(l+2)*n)
add((-1, 0, 2),  0,  2*l*(n+1)*(n+2))
add((0, -1, 2),  0,  2*m*(n+1)*(n+2))

# diagonal (0,0,0): collects all middle-index contributions from every xL_n identity
add((0, 0, 0),
    4*(2*l+1)*(2*m+1) + 4*(2*n+1)*(l+m+1) + 6*n**2 + 6*n + 2
    - 4*z*((l+m)*(6*l+6*m+4*n+12) - 4*l*m + 4*n + 8),
    4*((l+m)*(10*l*m + 10*m*n + 10*l*n + 10*l + 10*m + 18*n + 4*n**2 + 16)
       + l*m*(4 - 12*n) + 8 + 12*n + 4*n**2))

# mixed +1/-1: off-diagonal parts of the cross-derivative PDE terms
add((-1,  1, 0),  4*l*(m+1)*(1-2*z),    4*l*(m+1)*(1+l+m))
add(( 1, -1, 0),  4*(l+1)*m*(1-2*z),    4*(l+1)*m*(1+l+m))
add((-1,  0, 1),  2*l*(n+1)*(1-2*z),    2*l*(n+1)*(2*m - 4*l - n))
add(( 0, -1, 1),  2*m*(n+1)*(1-2*z),    2*m*(n+1)*(2*l - 4*m - n))
add(( 1,  0,-1),  2*(l+1)*n*(1-2*z),    2*(l+1)*n*(2*m - 4*l - n - 3))
add(( 0,  1,-1),  2*(m+1)*n*(1-2*z),    2*(m+1)*n*(2*l - 4*m - n - 3))

# -1 shifts: lowering terms from the xL_n'(x) identity
add((-1, 0, 0),  2*l*(-(4*m+2*n+3) + z*(8*l+8*m+4*n+6)),
                -2*l*((m+n+1)*(12*l+4*m+2) + n + n**2))
add(( 0,-1, 0),  2*m*(-(4*l+2*n+3) + z*(8*l+8*m+4*n+6)),
                -2*m*((l+n+1)*(12*m+4*l+2) + n + n**2))
add(( 0, 0,-1),  4*n*(-(l+m+n+1) + z*(2*l+2*m+2)),
                -4*n*((l+m)*(1+2*n-l-m) + 6*l*m + 2*n))

# ±2 shifts: from L_n''(x) lowering and mixed cross terms
add(( 1, 0,-2),  0,  2*n*(n-1)*(l+1))
add(( 0, 1,-2),  0,  2*n*(n-1)*(m+1))
add((-2, 0, 1),  0,  4*l*(l-1)*(n+1))
add(( 0,-2, 1),  0,  4*m*(m-1)*(n+1))
add((-2, 0, 0), -4*l*(l-1)*z,  4*l*(l-1)*(1+m+n))
add(( 0,-2, 0), -4*m*(m-1)*z,  4*m*(m-1)*(1+l+n))
add(( 0, 0,-2),  n*(n-1),       0)

# double -1: coupling to coefficients lower in two indices simultaneously
add((-1,-1, 0),  4*l*m*(1-2*z),  4*l*m*(l+m))
add((-1, 0,-1),  2*l*n*(1-2*z),  2*l*n*(2*m+n+1))
add(( 0,-1,-1),  2*m*n*(1-2*z),  2*m*n*(2*l+n+1))

A few structural observations worth noting:

When a = 0: the shift arises purely from the coordinate-multiplication Laguerre identity $xL_n(x) = -(n+1)L_{n+1} + (2n+1)L_n - nL_{n-1}$. It carries an index shift but no additional Coulomb or kinetic factor, so it contributes only to $\mathbf{b}$.

When b = 0: the shift arises from the pure second-derivative identity applied to $w$, which introduces no factor of the coordinate. The $(0,0,\pm2)$ terms have $b = 0$ for this reason.

The diagonal term (0,0,0) collects the “middle index” contributions from all three $xL_n$ identities at once. It is the algebraically densest expression in the recurrence, and because its shift is $(0,0,0)$ it always targets the diagonal column and is never discarded by the truncation check.

Natural vanishing at the boundary: many terms vanish automatically without any index check. For example, the shift $(-2,0,0)$ has prefactor $-4l(l-1)Z$: when $l=0$ or $l=1$ that factor is zero regardless, so the term contributes nothing even before the bounds check fires.

Step 5: Assembling the two matrices

import numpy as np

def build_paper_matrices(omega, z=2):
    basis = symmetric_basis_for_omega(omega)
    index = {state: i for i, state in enumerate(basis)}
    size  = len(basis)
    a = np.zeros((size, size))
    b = np.zeros((size, size))

    for row, (l, m, n) in enumerate(basis):
        for (dl, dm, dn), const_part, linear_part in eq22_terms(l, m, n, z):
            lp, mp, np_ = l + dl, m + dm, n + dn

            # Discard out-of-range indices
            if lp < 0 or mp < 0 or np_ < 0:
                continue
            # Discard targets beyond the truncation boundary
            if lp + mp + np_ > omega:
                continue

            # Fold back into l <= m canonical form and look up the column
            state = canonical_singlet_state(lp, mp, np_)
            col = index.get(state)
            if col is None:
                continue

            a[row, col] += const_part
            b[row, col] += linear_part

    return a, b, basis

Row $j$ of the matrix is built by writing Eq. (22) for the basis state at position $j$ in the list. Each of the 33 shifts nominates a column. Three filters decide whether to accept it:

  1. Non-negativity check: a term like $l \cdot A(l-1, \ldots)$ is zero when $l = 0$ because of the prefactor, but at $l > 0$ the index $l-1$ must be non-negative. Negative indices cannot be in any basis, so they are skipped.

  2. Truncation boundary: the polynomial order $\omega$ caps the basis. Any shifted target with $l' + m' + n' > \omega$ is outside our finite subspace. Skipping it is the approximation; the energy improves as $\omega$ increases because we are enlarging the subspace.

  3. Canonical lookup: after folding with canonical_singlet_state, we look up the column via index.get(state). This returns None for any state not in the basis (which can happen near the boundary after folding), so those are skipped too.

The += accumulation is essential: multiple different shifts from Eq. (22) can target the same column. For example, two independent terms might both shift into $(0, 1, 2)$ from the same row; their contributions add together at that matrix entry.

After the loop, $\mathbf{a}$ holds all the constant-in-$\epsilon$ coefficients and $\mathbf{b}$ holds all the $\epsilon$-linear coefficients.

Step 6: Solving for $\epsilon$ and the energy

The determinant condition $\det(\mathbf{a} + \epsilon\, \mathbf{b}) = 0$ is equivalent to the generalised eigenvalue problem:

\[(-\mathbf{a})\mathbf{x} = \epsilon\, \mathbf{b}\mathbf{x}.\]

The sign flip on $\mathbf{a}$ converts the problem into the standard form expected by SciPy, where the physical $\epsilon$ appears as a positive real eigenvalue:

import numpy as np
from scipy.linalg import eig

def solve_paper_determinant(omega, z=2):
    a, b, basis = build_paper_matrices(omega, z=z)

    eigenvalues = eig(-a, b, left=False, right=False, check_finite=False)

    # Discard non-physical eigenvalues in three stages
    finite   = eigenvalues[np.isfinite(eigenvalues)]
    real     = finite[np.abs(finite.imag) < 1e-9].real
    positive = real[real > 0]
    epsilon  = float(np.max(positive))

    energy_hartree = -(epsilon ** 2)

The three-stage filter matters:

  • Discard infinities: near-singular pencils at small $\omega$ can produce inf from nearly linearly dependent rows. These are artefacts of the truncation.
  • Discard complex eigenvalues: the truncated matrices are not guaranteed to be perfectly symmetric when assembled via accumulation, so eig can return eigenvalues with tiny spurious imaginary parts. Only eigenvalues whose imaginary part is below $10^{-9}$ are treated as physical.
  • Take the largest positive eigenvalue: the ground-state $\epsilon$ is the largest. Smaller positive eigenvalues correspond to excited states; negative ones belong to a different spectral branch.
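
A toy illustration of the three stages, applied to a synthetic eigenvalue array (hypothetical values, standing in for the output of eig):

```python
import numpy as np

# synthetic eigenvalues: one infinity, one genuinely complex, one negative,
# the physical root with a tiny numerical imaginary part, and an excited state
eigenvalues = np.array([np.inf + 0j, 1.5 + 1e-3j, -0.8 + 0j,
                        1.704 + 1e-12j, 0.3 + 0j])

finite   = eigenvalues[np.isfinite(eigenvalues)]      # drops the infinity
real     = finite[np.abs(finite.imag) < 1e-9].real    # drops 1.5 + 1e-3j
positive = real[real > 0]                             # drops -0.8
epsilon  = float(np.max(positive))                    # picks 1.704, not 0.3
```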

Once $\epsilon$ is in hand, $E = -\epsilon^2$ gives the energy in hartree. For $\omega = 10$ this yields $E \approx -2.903724\,E_\mathrm{h}$, converging toward the exact non-relativistic value $-2.903724377\,E_\mathrm{h}$ as $\omega$ increases.

The Matrix Visualiser

I said I would make this part clear… I built a small companion visualiser, which can be seen below.

It does three useful things:

  1. It generates the same symmetric basis as the direct solver.
  2. It builds determinant entries from the actual Eq. (22) shifts.
  3. It shows each selected entry as
\[a_{jk} + \epsilon b_{jk},\]

including the individual shifted terms that landed in that row/column position.

You can explore it directly here in the post:

If you want to open it on its own page, use this standalone version.

So if the notation $(l,m,n)$ and $(l',m',n')$ ever starts to feel foggy, hopefully the visualiser helps:

  • choose $\omega$,
  • click a coloured cell,
  • see which recurrence terms landed there,
  • and watch how the raw shifted target is folded back into the canonical singlet basis if necessary.

Running the Code

The direct solver can be run as:

```shell
python PekerisCode/pekeris.py --omega 10
```

On my machine that gives

```
 omega  size    epsilon        energy / Ha        build / s   solve / s   total / s
    10   161  1.704031722  -2.903724111149     0.0023     0.0105     0.0128
```

Which is correct! (So I did extract the 33-term recurrence relation correctly.)

Convergence with Matrix Order

One of the nicest aspects of Pekeris’ method is that it is variational. Increase $\omega$, enlarge the determinant, and the energy becomes more negative but will never go below the exact non-relativistic ground-state energy (although floating-point rounding error can sometimes make it seem like it has).

For the direct Eq. (22) implementation I get:

| $\omega$ | Basis size | $\epsilon$ | Energy / Ha |
|---------:|-----------:|-----------:|------------:|
| 1 | 3 | 1.700159826 | -2.890543433666 |
| 2 | 7 | 1.703898372 | -2.903269662751 |
| 3 | 13 | 1.703994144 | -2.903596041997 |
| 4 | 22 | 1.704021416 | -2.903688986121 |
| 5 | 34 | 1.704028201 | -2.903712109027 |
| 6 | 50 | 1.704030388 | -2.903719562451 |
| 7 | 70 | 1.704031186 | -2.903722283080 |
| 8 | 95 | 1.704031511 | -2.903723389072 |
| 9 | 125 | 1.704031654 | -2.903723878621 |
| 10 | 161 | 1.704031722 | -2.903724111149 |
| 11 | 203 | 1.704031757 | -2.903724228322 |
| 12 | 252 | 1.704031775 | -2.903724290411 |
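
The variational claim is easy to check numerically from the table: each increase of $\omega$ lowers the energy, yet every value stays above the exact non-relativistic result quoted earlier.

```python
import numpy as np

E_exact = -2.903724377  # exact non-relativistic ground-state energy / Ha

# energies from the convergence table, omega = 1 ... 12
energies = np.array([
    -2.890543433666, -2.903269662751, -2.903596041997, -2.903688986121,
    -2.903712109027, -2.903719562451, -2.903722283080, -2.903723389072,
    -2.903723878621, -2.903724111149, -2.903724228322, -2.903724290411,
])

# each enlargement of the basis lowers the energy ...
assert np.all(np.diff(energies) < 0)
# ... but never takes it below the exact value
assert np.all(energies > E_exact)
```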

Why Pekeris’ Method Was Novel

There are several layers of novelty here, which is why I find this method so elegant.

1. He picked the right coordinates

Perimetric coordinates remove the triangle constraint and make the domain rectangular in the new variables. Most of the time a problem becomes much simpler to solve if we select the correct coordinate system to represent it.

2. He picked a basis that matched the problem

The Laguerre expansion is tailored to exponentially decaying bound states on semi-infinite intervals. This is a mathematically natural and physically clever choice.

3. He converted a hard PDE into sparse algebra

Instead of approximating derivatives on a grid, he used recurrence identities to convert the PDE into a structured linear system with only a modest number of nonzero couplings per row.

4. He exploited exchange symmetry ruthlessly

On modern hardware halving a basis can feel nice. On WEIZAC it could be the difference between fitting in memory and failing entirely. We are spoiled by modern hardware.

5. He combined symbolic and numeric thinking decades early

One of the main reasons I love this paper is that Pekeris’ workflow feels surprisingly modern: symbolic reduction first, numerical linear algebra second. That is why the method still feels elegant rather than historical.

Conclusions

Pekeris’ helium calculation is not just an old numerical result. It is a wonderful example of the fact that the hardest part of computational physics is often finding the right representation.

The chain of ideas is the whole story:

  1. reduce the three-body geometry to inter-particle distances,
  2. remove the triangle inequality with perimetric coordinates,
  3. factor out the asymptotic decay,
  4. expand in a Laguerre basis,
  5. convert the PDE into a 33-term recurrence,
  6. truncate and symmetry-reduce the coefficient space,
  7. solve the resulting determinant problem.

In my opinion, this was a profound piece of computational mathematics for 1958, and for me at least it remains deeply instructive.

References

[1] Von Neumann to Pekeris, Feb. 23, 1950 (WIA 3-96-72).

[2] L. Corry and R. Leviathan, WEIZAC: An Israeli Pioneering Adventure in Electronic Computing (1945–1963), Springer, 2019.

[3] C. Koutschan and D. Zeilberger, “The 1958 Pekeris-Accad-WEIZAC Ground-Breaking Collaboration that Computed Ground States of Two-Electron Atoms (and its 2010 Redux),” Math. Intelligencer 33, 52-57, 2011.

[4] C. L. Pekeris, “Ground State of Two-Electron Atoms,” Physical Review 112, 1649-1658, 1958.

[5] C. L. Pekeris, “1^1S and 2^3S States of Helium,” Physical Review 115, 1216-1221, 1959.

This post is licensed under CC BY 4.0 by the author.