Post

T>T: The First Solution is Rarely the Best: A Lesson in Numerical Integration

T>T: The First Solution is Rarely the Best: A Lesson in Numerical Integration

If you spend enough time developing scientific software, you eventually learn that the first solution to a problem is rarely the best one. Science, and the code we write to execute it, is a continual learning process which is one of the things I love most about the field. I have been repeatedly guilty of brute-forcing a mathematical derivation to prove it can be done, only to realise later that I was doing it the hard way.

Today, I want to pull back the curtain on a specific problem I tackled a few years ago involving the exact calculation of Hartree-Fock (HF) two-electron integrals for two-electron atoms, and how I only realised today that there is a much simpler trick which I was not aware of at the time.

The Original Approach: The 7-Nested Summation

Back in 2018, I co-authored a paper titled Hartree-Fock implementation using a Laguerre-based wavefunction for the ground-state and correlation energies of two-electron atoms which can be found here. Our goal was to implement the HF method using a Laguerre-based wavefunction in perimetric coordinates.

Because the Laguerre functions are orthogonal on the interval $[0, \infty)$, the one-electron integrals were elegantly solved using series solutions. However, the two-electron Coulomb and exchange integrals were more difficult. The integration generated terms that did not satisfy the Laguerre orthogonality condition.

To solve them analytically, we converted the integrals to perimetric coordinates, expressed the Laguerre polynomials as power series via the binomial expansion, and solved them term-by-term. Mathematically, it was rigorous. Computationally, it was a behemoth resulting in the following 7-nested summation

\[\begin{aligned} &\sum_{p_i,q_i,u_i,v_i=0}^{p,q,u,v} \sum_{a_i=0}^{p_i+q_i} \sum_{b_i=0}^{p_i} \sum_{c_i=0}^{a_i+b_i-\phi+1} (-1)^{\phi+1} \pi^2 \\ &\quad \times \Big( a_i^2 - 2a_i b_i + b_i^2 - p^2 - 2p_i q_i - 2p_i u_i - 2p_i v_i - q_i^2 \\ &\qquad - 2q_i u_i - 2q_i v_i - u^2 - 2u_i v_i - v^2 + a_i + b_i - 7\phi - 10 \Big) \\ &\quad \times (\phi - a_i - b_i)! (p_i + q_i)! (u_i + v_i)! \\ &\quad \times \frac{ p!q!u!v! }{ p_i!^2 q_i!^2 u_i!^2 v_i!^2 (p_i+q_i-a_i)! (q_i+u_i-b_i)! (p-p_i)! (q-q_i)! (u-u_i)! (v-v_i)! }. \end{aligned}\]

What is the problem? Just look at it… also catastrophic cancellation. When dealing with alternating signs $(-1)^{\phi+1}$ and large factorials, standard 64-bit floating-point numbers fail. At higher basis set sizes, the polynomial terms reach magnitudes of $10^{25}$, while the final integrated volume is around $10^2$ so we lose all significant digits.

To bypass this, I originally wrote a dedicated C++ program utilising the Arb ball arithmetic library to handle the massive factorials and enforce rigourous error bounds. It worked, but it was a heavy, memory-hungry hammer (although I did learn how to use the Arb library which I think is fantastic).

The Epiphany: Gauss-Laguerre Quadrature

Recently, while solving an unrelated problem, I realised I had missed a more elegant mathematical mapping. Instead of expanding the polynomials algebraically and fighting the resulting factorials, we can evaluate the integral exactly using Gauss-Laguerre Quadrature.

Because the $1/r_{12}$ Coulomb operator perfectly cancels with the $r_{12}$ term in the perimetric Jacobian, the spatial integration becomes a pure polynomial (up to degree $4m$) multiplied by a separable exponential decay $e^{-A(r_1+r_2)}$.

This takes the exact form required for Gauss-Laguerre quadrature

\[\iiint_0^\infty P(z_1, z_2, z_3) e^{-\frac{A}{2}z_1} e^{-\frac{A}{2}z_2} e^{-Az_3} dz_1 dz_2 dz_3\]

I have used Gaussian quadrature numerous times before and automatically think of it being an approximate, numerical technique but it is not so here; an $N$-point rule (where $N = 2m + 1$) integrates a polynomial of degree $\le 4m$ exactly.

By mapping the integral to the quadrature nodes, the 3D integral algebraically separates into independent 1D projections. The $O(m^7)$ nightmare collapses into an $O(m^4)$ dynamic evaluation.

The Arbitrary Precision Solution with Python

Even with quadrature, the massive polynomial values at $m=20$ still cause precision issues in standard NumPy float arrays. But instead of relying on heavy C++ binaries, we can solve this cleanly in pure Python using mpmath for arbitrary precision.

Here is the core logic of the new implementation. It drops the C++ overhead entirely and allows us to generate the full $(pq|uv)$ tensor with a canonical Yoshimine sort (see my original Hartree Fock post to learn more about this) in seconds, setting the environment to 50 decimal places to absorb any residual cancellation.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
import mpmath
import time

# Set arbitrary precision to 50 decimal places
mpmath.mp.dps = 50


def get_gauss_laguerre_mp(N):
    """
    Generates Gauss-Laguerre roots and weights using the Golub-Welsch
    algorithm in arbitrary precision.
    """
    T = mpmath.matrix(N, N)
    for i in range(N):
        # Diagonal elements: alpha_i = 2i + 1
        T[i, i] = mpmath.mpf(2 * i + 1)
        if i < N - 1:
            # Off-diagonal elements: sqrt(beta_i) = i + 1
            val = mpmath.mpf(i + 1)
            T[i, i + 1] = val
            T[i + 1, i] = val

    # Diagonalize the symmetric Jacobi matrix
    E, ER = mpmath.eigsy(T)

    roots = [E[i] for i in range(N)]
    # Weights are the square of the first component of each normalised eigenvector
    weights = [ER[0, i] ** 2 for i in range(N)]

    return roots, weights


def generate_canonical_integrals_mp(m, A):
    """
    Computes and canonically sorts the unique (pq|uv) two-electron repulsion
    integrals using arbitrary precision Gauss-Laguerre Quadrature.
    """
    N = 2 * m + 1

    print(
        f"-> Generating Gauss-Laguerre grid (N={N}) at {mpmath.mp.dps} decimal places..."
    )
    roots, weights = get_gauss_laguerre_mp(N)

    print("-> Pre-computing coordinate grid and Laguerre polynomials...")
    # 2D Grid: X_jk = x_j + 0.5 * x_k
    X_jk = [
        [roots[j] + mpmath.mpf("0.5") * roots[k] for k in range(N)] for j in range(N)
    ]

    L_grid = []
    for p in range(m):
        # mpmath.laguerre(n, alpha, x)
        grid_p = [
            [mpmath.laguerre(p, 0, X_jk[j][k]) for k in range(N)] for j in range(N)
        ]
        L_grid.append(grid_p)

    print("-> Separating 3D integral and evaluating inner sums S_pq...")
    S = {}
    for p in range(m):
        for q in range(p, m):
            S_pq = []
            for k in range(N):
                val = mpmath.mpf(0)
                for j in range(N):
                    term = weights[j] * X_jk[j][k] * L_grid[p][j][k] * L_grid[q][j][k]
                    val += term
                S_pq.append(val)
            # Store symmetrically
            S[(p, q)] = S_pq
            S[(q, p)] = S_pq

    num_pairs = m * (m + 1) // 2
    total_unique = num_pairs * (num_pairs + 1) // 2
    print(f"-> Assembling {total_unique} canonically sorted unique integrals...")

    scalar = (mpmath.mpf(8) * mpmath.pi**2) / (mpmath.mpf(A) ** 5)

    unique_integrals = []
    index_1d = 0

    # Yoshimine Sort: p >= q, u >= v, pq >= uv
    for p in range(m):
        for q in range(p + 1):
            idx_pq = p * (p + 1) // 2 + q
            for u in range(m):
                for v in range(u + 1):
                    idx_uv = u * (u + 1) // 2 + v

                    if idx_pq >= idx_uv:
                        # Final contraction over k
                        val = mpmath.mpf(0)
                        for k in range(N):
                            term = weights[k] * S[(p, q)][k] * S[(u, v)][k]
                            val += term

                        final_val = scalar * val
                        unique_integrals.append((index_1d, p, q, u, v, final_val))
                        index_1d += 1

    return unique_integrals


# ==========================================
# Execution Example
# ==========================================
if __name__ == "__main__":
    m_basis = 20
    A_val = 1.0  # Set to 1.0 for unscaled reference validation

    start_time = time.time()
    canonical_integrals = generate_canonical_integrals_mp(m_basis, A_val)
    elapsed = time.time() - start_time

    total_unique = len(canonical_integrals)
    print(f"\nCompleted {total_unique} integrals in {elapsed:.2f} seconds.")

    output_filename = f"canonical_integrals_mpmath_m{m_basis}.txt"
    print(f"Writing arbitrary-precision output to {output_filename}...")

    with open(output_filename, "w") as f:
        f.write(f"# Canonical Two-Electron Integrals (m={m_basis}, A={A_val})\n")
        f.write(f"# Computed with mpmath at {mpmath.mp.dps} decimal places\n")
        f.write("# Format: 1D_Index | p  q  u  v | Integral_Value\n")
        f.write("-" * 65 + "\n")

        for row in canonical_integrals:
            idx = int(row[0])
            # +1 shift for 1-based orbital index notation
            p = int(row[1]) + 1
            q = int(row[2]) + 1
            u = int(row[3]) + 1
            v = int(row[4]) + 1

            # Print to 30 decimal places
            val_str = mpmath.nstr(row[5], 30, min_fixed=0)

            # Formatting to handle negative signs cleanly
            f.write(f"{idx:6d} | {p:2d} {q:2d} {u:2d} {v:2d} |  {val_str}\n")

    print("Done.")

This code is much simpler than the original and significantly faster whilst offering the required accuracy for each integral. Even better on my MacBook I can generate all 22155 integrals for a 20-term Hartree Fock wavefunction in just 3.7 seconds! The more you learn.

This post is licensed under CC BY 4.0 by the author.