Pre-normalizing Legendre Polynomials

Legendre.jl Series, Part III

Legendre.jl Series: Parts 1, 2, 3, 4, 5

In the previous part we computed Legendre polynomials via a few recurrence relations, but the polynomial values grow rapidly with degree and quickly overflow the range of standard finite floating point numbers. In this posting we explore baking in a normalization factor into the recurrence relation — such as used when the Legendre polynomials are used to calculate the spherical harmonics — that eliminates the overflow.


Introduction

At the end of the previous article we saw that the growth of the Legendre polynomials polynomials rapidly leads to numerical overflow. For common uses such as calculating the spherical harmonics in CMB analysis, we require calculating the polynomials to much higher degrees than where the overflows occur, so our goal in this article is to explain how to mitigate this problem.

As a reminder, recall that we calculate the associated Legendre polynomials by evaluating the following three recurrence relations — starting from an appropriate initial condition — for any integer degree $\ell$ and order $m$.

\begin{align} P_\ell^\ell(x) &= -(2\ell-1)\sqrt{1-x^2} P_{\ell-1}^{\ell-1}(x) \label{eqn:std_rr_1term_lm} \\ P_\ell^{\ell-1}(x) &= (2\ell-1) x P_{\ell-1}^{\ell-1}(x) \label{eqn:std_rr_1term_l} \\ P_\ell^m(x) &= \frac{2\ell-1}{\ell - m} x P_{\ell-1}^m(x) - \frac{\ell+m-1}{\ell - m} P_{\ell-2}^m(x) \label{eqn:std_rr_2term} \end{align}

Note that compared to Eqs. 1–3 in the previous article, we have shifted the indexing so that $\ell+1 \rightarrow \ell$ and the left-hand side is for degree $\ell$ (not $\ell+1$). (This will prove to be more convenient shortly when we start generalizing these equations.)

This article will be structured in two primary parts. In the first part we’ll generalize the recurrence relations to prepare for supporting a wider range of normalizations. Then in the second part we’ll show how to pre-normalize the Legendre functions for use in the spherical harmonics, avoiding the overflow which would otherwise prohibit complete analysis of CMB skies.

Generalized recurrence relations

We’ll rewrite the recurrence relations for a generalized form of the Legendre polynomials $p_\ell^m(x)$ which include some arbitrary normalization factor in terms of four undetermined coefficients $\mu_\ell$, $\nu_\ell$, $\alpha_\ell^m$, and $\beta_\ell^m$:

\begin{align} p_\ell^\ell(x) &= -\mu_\ell \sqrt{1-x^2} p_{\ell-1}^{\ell-1}(x) \label{eqn:cus_rr_1term_lm} \\ p_\ell^{\ell-1}(x) &= \nu_\ell x p_{\ell-1}^{\ell-1}(x) \label{eqn:cus_rr_1term_l} \\ p_\ell^m(x) &= \alpha_\ell^m x p_{\ell-1}^m(x) - \beta_\ell^m p_{\ell-2}^m(x) \label{eqn:cus_rr_2term} \end{align}

For the standard associated Legendre polynomials where $P_0^0(x) = 1$ (which will from here on out call the unit-normalized Legendre polynomials), we can simply read off the coefficients from the original form of the recurrence relations and identify each of the coefficients as:

\begin{align}\begin{aligned} \mu_\ell &= 2\ell - 1 & \alpha_\ell^m &= \frac{2\ell - 1}{\ell - m} \\ \nu_\ell &= 2\ell - 1 & \beta_\ell^m &= \frac{\ell + m - 1}{\ell - m} \end{aligned}\end{align}

To prepare for a fully generalized implementation, we’ll encode the calculation of each coefficient term and the initial condition into functions which are dispatched on a trait type1. At this point the addition of a trait type might seem unnecessary, but the use of traits will let us use a single implementation to calculate any arbitrary normalization2.

# Unit-normalization trait type
struct UnitNorm end

# Initial condition and recurrence relation coefficients for unit normalization
initcond(::UnitNorm) = 1.0
coeff_μ(::UnitNorm, l) = 2l - 1
coeff_ν(::UnitNorm, l) = 2l - 1
coeff_α(::UnitNorm, l, m) = (2l - 1) / (l - m)
coeff_β(::UnitNorm, l, m) = (l + m - 1) / (l - m)

Now we update the implementation of legendre() that we wrote in the previous article to instead call these helper functions instead of having the initial condition and recurrence coefficients written inline. Note that at the moment we hard-code calls to use the unit normalization (such as in initcond(UnitNorm())), so the implementation is not yet generic, but this gets us a step closer.

function legendre(, m, x)
    P = convert(typeof(x), initcond(UnitNorm()))
     == 0 && m == 0 && return P

    # Use recurrence relation 1 to increase to the target m
    y = sqrt(one(x) - x^2)
    for m′ in 1:m
        μ = coeff_μ(UnitNorm(), m′)
        P = -μ * y * P
    end
     == m && return P

    # Setup for 2-term recurrence, using recurrence relation 2 to set
    # Pmin1
    ν = coeff_ν(UnitNorm(), m + 1)
    Pmin2 = P
    Pmin1 = ν * x * Pmin2
     == m+1 && return Pmin1

    # Use recurrence relation 3
    for ℓ′ in m+2:
        α = coeff_α(UnitNorm(), ℓ′, m)
        β = coeff_β(UnitNorm(), ℓ′, m)
        P = α * x * Pmin1 - β * Pmin2
        Pmin2, Pmin1 = Pmin1, P
    end

    return P
end

Comparison of legendre() from the previous posting versus the code cell above.3

 function legendre(ℓ, m, x)
-    P = one(x)
+    P = convert(typeof(x), initcond(UnitNorm())
     ℓ == 0 && m == 0 && return P
 
     # Use recurrence relation 1 to increase to the target m
     y = sqrt(one(x) - x^2)
     for m′ in 1:m
-        P = -(2m′ - 1) * y * P
+        μ = coeff_μ(UnitNorm(), m′)
+        P = -μ * y * P
     end
     ℓ == m && return P
 
     # Setup for 2-term recurrence, using recurrence relation 2 to set
     # Pmin1
+    ν = coeff_ν(UnitNorm(), m + 1)
     Pmin2 = P
-    Pmin1 = (2*(m+1) - 1) * x * Pmin2
+    Pmin1 = ν * x * Pmin2
     ℓ == m+1 && return Pmin1
 
     # Use recurrence relation 3
     for ℓ′ in m+2:ℓ
-        P = ((2ℓ′ - 1) * x * Pmin1 - (ℓ′ + m - 1) * Pmin2) / (ℓ′ - m)
+        α = coeff_α(UnitNorm(), ℓ′, m)
+        β = coeff_β(UnitNorm(), ℓ′, m)
+        P = α * x * Pmin1 - β * Pmin2
         Pmin2, Pmin1 = Pmin1, P
     end
 
     return P
 end

No functionality has yet actually changed for an end-user of the function. We can exactly reproduce the figure from last time showing the $m = 1$ associated Legendre polynomials:

using PyPlot

x = range(-1.0, 1.0, length=1000)
for m in 1:5
    plot(x, legendre.(m, 1, x), label="\$P_{$m}^1\$")
end
axhline.([-1, 0, 1], color="gray", linewidth=0.5)
axvline.([-1, 0, 1], color="gray", linewidth=0.5)
legend(loc = "lower right")
title("Associated Legendre Polynomials (\$m = 1\$)")

We’re now ready to tackle the issue of the Legendre polynomials growing in magnitude with increasing degree and order by considering how the spherical harmonics are normalized.

Legendre Polynomials in the Spherical Harmonics

The spherical harmonics are critical to CMB analysis, so accurately and efficiently calculating them is worth the effort. In part 1 we defined the spherical harmonics as

\begin{align} Y_{\ell m}(\theta,\phi) = \sqrt{\frac{2\ell+1}{4\pi} \frac{(\ell-m)!}{(\ell+m)!}} P_\ell^m(\cos\theta) e^{im\phi} \end{align}

which composes the Legendre polynomials (as a function of colatitude $\theta$) with a complex exponential (acting on the azimuth $\phi$) and a normalization constant. We then saw in the previous part that the Legendre polynomials result in values greater than what is representable in standard floating point types at very modest combinations of degree and order (around $\ell,m \sim 150$ for 64-bit floating point numbers).

legendre(152, 150, 0.2)
Inf

We discussed last time that the normalization counteracts growth rate of the polynomials, but it’s not sufficient to calculate the polynomials and normalization independently. In fact, the normalization underflows to identically zero when we know that mathematically it’s non-zero (just very small):

Nlm(l, m; type=Float64) = sqrt((2l + 1) / 4π / reduce(*, l-m+1:l+m, init=one(type)))

Nlm(152, 150)
0.0

A big feature of Julia is that it specializes your code depending on the types of the input arguments, and that allows one workaround for under/overflowing to be to upgrade to using extended-precision BigFloat values instead of standard Float64 types. With BigFloat, we can calculate both the polynomial value and normalization and then convert back to Float64:

using Printf
n = Nlm(152, 150, type=BigFloat)
p = legendre(152, 150, big(0.2))
λ = Float64(n * p)
@printf """
    N_152^150 ≈ %0.9g
    P_152^150 ≈ %0.9g
    λ_152^150 ≈ %0.9g
    """ n p λ
N_152^150 ≈ 1.32090549e-309
P_152^150 ≈ 2.94031628e+308
λ_152^150 ≈ 0.388387991

Using BigFloat comes at a heavy cost, though. Compare the computation time for a standard 64-bit machine float versus calculating on BigFloat values:

using BenchmarkTools

print(" Float64:"); @btime legendre(152, 150, 0.2);
print("BigFloat:"); @btime legendre(152, 150, big(0.2));
 Float64:  146.484 ns (0 allocations: 0 bytes)
BigFloat:  20.763 μs (624 allocations: 34.13 KiB)

The calculation using BigFloat values takes over two orders of magnitude longer to calculate than the equivalent Float64 value. Doing similar calculations for thousands of degrees and orders, for potentially thousands more input values, means the extra cost is unacceptable if there’s any other option.

Spherical Harmonic Normalization

Since it’s been promised already, the obvious answer is that we can instead bake-in the normalization into the recurrence relations so that we calculate the final value directly.

The first step is to note that we can define the spherical harmonic pre-normalized Legendre polynomials $\lambda_\ell^m(x)$ as

\begin{align} \lambda_\ell^m(x) \equiv N_\ell^m P_\ell^m(x) \quad\text{where}\quad N_\ell^m \equiv \sqrt{\frac{2\ell+1}{4\pi}\frac{(\ell-m)!}{(\ell+m)!}} \end{align}

by combining the normalization and Legendre polynomials into a single function so that the spherical harmonics are rewritten simply as

\begin{align} Y_{\ell m}(\theta,\phi) = \lambda_\ell^m(\cos\theta) e^{im\phi} \end{align}

Our remaining task is to determine the expressions for the coefficients in Eqs. $\ref{eqn:cus_rr_1term_lm}$–$\ref{eqn:cus_rr_2term}$ that produce the pre-normalized functions $\lambda_\ell^m(x)$. We do that by multiplying the original [unit-normlized] recurrences (Eqs. $\ref{eqn:std_rr_1term_lm}$–$\ref{eqn:std_rr_2term}$) through by $N_\ell^m$. The left-hand side will by definition be $\lambda_\ell^m$, and on the right-hand side we rearrange terms until we can similarly identify the $\lambda_{\ell’}^{m’}$ terms.

For example, $\alpha_\ell^m$ and $\beta_\ell^m$ are determined by starting with Eq. $\ref{eqn:cus_rr_2term}$. Multiplying through by $N_\ell^m$ and identifying $\lambda_\ell^m$ on the left-hand side, that leaves us with

\begin{align} \begin{split} \lambda_\ell^m &= \frac{2\ell-1}{\ell-m} x \sqrt{\frac{2\ell+1}{4\pi} \frac{(\ell-m)!}{(\ell+m)!}} P_{\ell-1}^m(x) - \frac{\ell+m-1}{\ell-m} \sqrt{\frac{2\ell+1}{4\pi} \frac{(\ell-m)!}{(\ell+m)!}} P_{\ell-2}^m(x) \end{split} \end{align}

Through judicious use of algebra, the terms on the right-hand side can be manipulated to gather terms of the form $N_{\ell-1}^m P_{\ell-1}^m(x)$ and $N_{\ell-2}^m P_{\ell-2}^m(x)$, leaving us with

\begin{align} \lambda_\ell^m &= \sqrt{\frac{2\ell+1}{2\ell-3} \frac{4(\ell-1)^2 - 1}{\ell^2 - m^2}} x \lambda_{\ell-1}^m(x) - \sqrt{\frac{2\ell+1}{2\ell-3} \frac{(\ell-1)^2 - m^2}{\ell^2 - m^2}} \lambda_{\ell-2}^m(x) \end{align}

We identify each of the two square root terms as $\alpha_\ell^m$ and $\beta_\ell^m$ since they are the cofficients appropriate for generating $\lambda_\ell^m(x)$.

Proceeding similarly with the other two recurrence relations, we obtain:

\begin{align}\begin{aligned} \mu_\ell &= \sqrt{1 + \frac{1}{2\ell}} & \alpha_\ell^m &= \sqrt{\frac{2\ell+1}{2\ell-3} \frac{4(\ell-1)^2 - 1}{\ell^2 - m^2}} \\ \nu_\ell &= \sqrt{1 + 2\ell} & \beta_\ell^m &= \sqrt{\frac{2\ell+1}{2\ell-3} \frac{(\ell-1)^2 - m^2}{\ell^2 - m^2}} \end{aligned}\end{align}

Finally, we also require the intial condition $\lambda_0^0(x)$. This too is straight forward given the definition:

\begin{align} \lambda_0^0(x) &= N_0^0 P_0^0(x) = \sqrt{\frac{1}{4\pi}} \end{align}

Computing this normalization is accomplished by definining a new trait type SphereNorm and overloading the initial condition and coefficient functions defined above, dispatching on SphereNorm.

struct SphereNorm end

# Initial condition and recurrence relation coefficients for unit normalization
initcond(::SphereNorm) = 1.0 / sqrt(4π)
coeff_μ(::SphereNorm, l) = sqrt(1 + 1/2l)
coeff_ν(::SphereNorm, l) = sqrt(1 + 2l)
coeff_α(::SphereNorm, l, m) = sqrt((2l+1)/(2l-3) * (4*(l-1)^2 - 1)/(l^2 - m^2))
coeff_β(::SphereNorm, l, m) = sqrt((2l+1)/(2l-3) * ((l-1)^2 - m^2)/(l^2 - m^2))

To make use of the new normalization, we now simply update legendre() to accept the normalization as an argument and replace each use of UnitNorm() with the normalization argument.

function legendre(norm, , m, x)
    P = convert(typeof(x), initcond(norm))
     == 0 && m == 0 && return P

    # Use recurrence relation 1 to increase to the target m
    y = sqrt(one(x) - x^2)
    for m′ in 1:m
        μ = coeff_μ(norm, m′)
        P = -μ * y * P
    end
     == m && return P

    # Setup for 2-term recurrence, using recurrence relation 2 to set
    # Pmin1
    ν = coeff_ν(norm, m + 1)
    Pmin2 = P
    Pmin1 = ν * x * Pmin2
     == m+1 && return Pmin1

    # Use recurrence relation 3
    for ℓ′ in m+2:
        α = coeff_α(norm, ℓ′, m)
        β = coeff_β(norm, ℓ′, m)
        P = α * x * Pmin1 - β * Pmin2
        Pmin2, Pmin1 = Pmin1, P
    end

    return P
end

With the normalization now baked into the recurrence, we have no problem calculating the appropriately-normalized polynomial value for $\lambda_{152}^{150}(0.2)$.

legendre(SphereNorm(), 152, 150, 0.2)
0.3883879907461426

It could be seen as unfortunate that the call syntax requires we explicitly state which normalization is to used, but it’s trivial to define wrapper functions that are named for each normalization style. In accordance with the mathematical notation of this article, we could choose to define Plm and λlm for the unit and spherical harmonic normalizations, respectively.

Plm(l, m, x) = legendre(UnitNorm(), l, m, x)
λlm(l, m, x) = legendre(SphereNorm(), l, m, x)

Using the pre-normalization, we can now update the plot from the previous posting which had missing values due to overflow in the middle and show the two functions entirely:

plot(x, λlm.(151, 150, x), label="\$λ_{151}^{150}\$")
plot(x, λlm.(152, 150, x), label="\$λ_{152}^{150}\$")
legend()
title("Spherical Harmonic normalized Legendre polynomials")

Conclusions

The trait-based dispatch mechanism is one of the core features of the Legendre.jl package, and we could undoubtedly use this simple implementation in real use cases. To achieve highly efficient and high-accuracy results, though, the package implements several other optimization and numerical tricks which we will introduce and discuss in future postings. We’ll explore some of these enhancements as well, using several goal calculations to demonstrate existing deficiencies and how I have resolved them.


  1. Traits in Julia are sometimes also called “Holy Traits” in recognition of a core Julia contributor, Tim Holy, who early-on recognized the power of Julia’s multiple dispatch combined with singleton instance types as a way to make extensible code and type hierarchies. Trait types are now used extensively throughout base Julia, such as in defining the iterator interface (e.g. HasLength(), HasEltype(), etc). ↩︎

  2. The implementation in this article is simplified from the version actually used in Legendre.jl. For instance, we’ve skipped using an abstract supertype to unify all normalizations under one type hierarchy. Furthermore, the interface functions have all been simplified with the assumption that we’re only interested in Float64 results; supporting type-stable results over all numeric types requires passing additional contextual information to the calculations. ↩︎

  3. The diff is computed against a version of legendre() which has had the indexing updated for the $\ell+1 \rightarrow \ell$ change, so the before state does not exactly correspond to the legendre() function shown in the previous posting. This has been done to highlight the trait-based interface related changes. ↩︎