In the previous part we computed Legendre polynomials via a few recurrence relations, but the polynomial values grow rapidly with degree and quickly overflow the range of standard finite floating point numbers. In this posting we explore baking in a normalization factor into the recurrence relation — such as used when the Legendre polynomials are used to calculate the spherical harmonics — that eliminates the overflow.
Introduction¶
At the end of the previous article we saw that the growth of the Legendre polynomials polynomials rapidly leads to numerical overflow. For common uses such as calculating the spherical harmonics in CMB analysis, we require calculating the polynomials to much higher degrees than where the overflows occur, so our goal in this article is to explain how to mitigate this problem.
As a reminder, recall that we calculate the associated Legendre polynomials by evaluating the following three recurrence relations — starting from an appropriate initial condition — for any integer degree $\ell$ and order $m$.
Note that compared to Eqs. 1–3 in the previous article, we have shifted the indexing so that $\ell+1 \rightarrow \ell$ and the left-hand side is for degree $\ell$ (not $\ell+1$). (This will prove to be more convenient shortly when we start generalizing these equations.)
This article will be structured in two primary parts. In the first part we’ll generalize the recurrence relations to prepare for supporting a wider range of normalizations. Then in the second part we’ll show how to pre-normalize the Legendre functions for use in the spherical harmonics, avoiding the overflow which would otherwise prohibit complete analysis of CMB skies.
Generalized recurrence relations¶
We’ll rewrite the recurrence relations for a generalized form of the Legendre polynomials $p_\ell^m(x)$ which include some arbitrary normalization factor in terms of four undetermined coefficients $\mu_\ell$, $\nu_\ell$, $\alpha_\ell^m$, and $\beta_\ell^m$:
For the standard associated Legendre polynomials where $P_0^0(x) = 1$ (which will from here on out call the unit-normalized Legendre polynomials), we can simply read off the coefficients from the original form of the recurrence relations and identify each of the coefficients as:
To prepare for a fully generalized implementation, we’ll encode the calculation of each coefficient term and the initial condition into functions which are dispatched on a trait type1. At this point the addition of a trait type might seem unnecessary, but the use of traits will let us use a single implementation to calculate any arbitrary normalization2.
# Unit-normalization trait type
struct UnitNorm end
# Initial condition and recurrence relation coefficients for unit normalization
initcond(::UnitNorm) = 1.0
coeff_μ(::UnitNorm, l) = 2l - 1
coeff_ν(::UnitNorm, l) = 2l - 1
coeff_α(::UnitNorm, l, m) = (2l - 1) / (l - m)
coeff_β(::UnitNorm, l, m) = (l + m - 1) / (l - m)
Now we update the implementation of legendre()
that we wrote in the
previous article to instead call these helper functions instead of having the
initial condition and recurrence coefficients written inline.
Note that at the moment we hard-code calls to use the unit normalization
(such as in initcond(UnitNorm())
), so the implementation is not yet generic,
but this gets us a step closer.
function legendre(ℓ, m, x)
P = convert(typeof(x), initcond(UnitNorm()))
ℓ == 0 && m == 0 && return P
# Use recurrence relation 1 to increase to the target m
y = sqrt(one(x) - x^2)
for m′ in 1:m
μ = coeff_μ(UnitNorm(), m′)
P = -μ * y * P
end
ℓ == m && return P
# Setup for 2-term recurrence, using recurrence relation 2 to set
# Pmin1
ν = coeff_ν(UnitNorm(), m + 1)
Pmin2 = P
Pmin1 = ν * x * Pmin2
ℓ == m+1 && return Pmin1
# Use recurrence relation 3
for ℓ′ in m+2:ℓ
α = coeff_α(UnitNorm(), ℓ′, m)
β = coeff_β(UnitNorm(), ℓ′, m)
P = α * x * Pmin1 - β * Pmin2
Pmin2, Pmin1 = Pmin1, P
end
return P
end
Comparison of legendre()
from the
previous posting
versus the code cell above.3
function legendre(ℓ, m, x)
- P = one(x)
+ P = convert(typeof(x), initcond(UnitNorm())
ℓ == 0 && m == 0 && return P
# Use recurrence relation 1 to increase to the target m
y = sqrt(one(x) - x^2)
for m′ in 1:m
- P = -(2m′ - 1) * y * P
+ μ = coeff_μ(UnitNorm(), m′)
+ P = -μ * y * P
end
ℓ == m && return P
# Setup for 2-term recurrence, using recurrence relation 2 to set
# Pmin1
+ ν = coeff_ν(UnitNorm(), m + 1)
Pmin2 = P
- Pmin1 = (2*(m+1) - 1) * x * Pmin2
+ Pmin1 = ν * x * Pmin2
ℓ == m+1 && return Pmin1
# Use recurrence relation 3
for ℓ′ in m+2:ℓ
- P = ((2ℓ′ - 1) * x * Pmin1 - (ℓ′ + m - 1) * Pmin2) / (ℓ′ - m)
+ α = coeff_α(UnitNorm(), ℓ′, m)
+ β = coeff_β(UnitNorm(), ℓ′, m)
+ P = α * x * Pmin1 - β * Pmin2
Pmin2, Pmin1 = Pmin1, P
end
return P
end
No functionality has yet actually changed for an end-user of the function. We can exactly reproduce the figure from last time showing the $m = 1$ associated Legendre polynomials:
using PyPlot
x = range(-1.0, 1.0, length=1000)
for m in 1:5
plot(x, legendre.(m, 1, x), label="\$P_{$m}^1\$")
end
axhline.([-1, 0, 1], color="gray", linewidth=0.5)
axvline.([-1, 0, 1], color="gray", linewidth=0.5)
legend(loc = "lower right")
title("Associated Legendre Polynomials (\$m = 1\$)")
We’re now ready to tackle the issue of the Legendre polynomials growing in magnitude with increasing degree and order by considering how the spherical harmonics are normalized.
Legendre Polynomials in the Spherical Harmonics¶
The spherical harmonics are critical to CMB analysis, so accurately and efficiently calculating them is worth the effort. In part 1 we defined the spherical harmonics as
which composes the Legendre polynomials (as a function of colatitude $\theta$) with a complex exponential (acting on the azimuth $\phi$) and a normalization constant. We then saw in the previous part that the Legendre polynomials result in values greater than what is representable in standard floating point types at very modest combinations of degree and order (around $\ell,m \sim 150$ for 64-bit floating point numbers).
legendre(152, 150, 0.2)
Inf
We discussed last time that the normalization counteracts growth rate of the polynomials, but it’s not sufficient to calculate the polynomials and normalization independently. In fact, the normalization underflows to identically zero when we know that mathematically it’s non-zero (just very small):
Nlm(l, m; type=Float64) = sqrt((2l + 1) / 4π / reduce(*, l-m+1:l+m, init=one(type)))
Nlm(152, 150)
0.0
A big feature of Julia is that it specializes your code depending on the
types of the input arguments, and that allows one workaround for
under/overflowing to be to upgrade to using extended-precision BigFloat
values instead of standard Float64
types.
With BigFloat
, we can calculate both the polynomial value and normalization
and then convert back to Float64
:
using Printf
n = Nlm(152, 150, type=BigFloat)
p = legendre(152, 150, big(0.2))
λ = Float64(n * p)
@printf """
N_152^150 ≈ %0.9g
P_152^150 ≈ %0.9g
λ_152^150 ≈ %0.9g
""" n p λ
N_152^150 ≈ 1.32090549e-309
P_152^150 ≈ 2.94031628e+308
λ_152^150 ≈ 0.388387991
Using BigFloat
comes at a heavy cost, though.
Compare the computation time for a standard 64-bit machine float versus
calculating on BigFloat
values:
using BenchmarkTools
print(" Float64:"); @btime legendre(152, 150, 0.2);
print("BigFloat:"); @btime legendre(152, 150, big(0.2));
Float64: 146.484 ns (0 allocations: 0 bytes)
BigFloat: 20.763 μs (624 allocations: 34.13 KiB)
The calculation using BigFloat
values takes over two orders of magnitude
longer to calculate than the equivalent Float64
value.
Doing similar calculations for thousands of degrees and orders, for
potentially thousands more input values, means the extra cost is unacceptable
if there’s any other option.
Spherical Harmonic Normalization¶
Since it’s been promised already, the obvious answer is that we can instead bake-in the normalization into the recurrence relations so that we calculate the final value directly.
The first step is to note that we can define the spherical harmonic pre-normalized Legendre polynomials $\lambda_\ell^m(x)$ as
by combining the normalization and Legendre polynomials into a single function so that the spherical harmonics are rewritten simply as
Our remaining task is to determine the expressions for the coefficients in Eqs. $\ref{eqn:cus_rr_1term_lm}$–$\ref{eqn:cus_rr_2term}$ that produce the pre-normalized functions $\lambda_\ell^m(x)$. We do that by multiplying the original [unit-normlized] recurrences (Eqs. $\ref{eqn:std_rr_1term_lm}$–$\ref{eqn:std_rr_2term}$) through by $N_\ell^m$. The left-hand side will by definition be $\lambda_\ell^m$, and on the right-hand side we rearrange terms until we can similarly identify the $\lambda_{\ell’}^{m’}$ terms.
For example, $\alpha_\ell^m$ and $\beta_\ell^m$ are determined by starting with Eq. $\ref{eqn:cus_rr_2term}$. Multiplying through by $N_\ell^m$ and identifying $\lambda_\ell^m$ on the left-hand side, that leaves us with
Through judicious use of algebra, the terms on the right-hand side can be manipulated to gather terms of the form $N_{\ell-1}^m P_{\ell-1}^m(x)$ and $N_{\ell-2}^m P_{\ell-2}^m(x)$, leaving us with
We identify each of the two square root terms as $\alpha_\ell^m$ and $\beta_\ell^m$ since they are the cofficients appropriate for generating $\lambda_\ell^m(x)$.
Proceeding similarly with the other two recurrence relations, we obtain:
Finally, we also require the intial condition $\lambda_0^0(x)$. This too is straight forward given the definition:
Computing this normalization is accomplished by definining a new trait type
SphereNorm
and overloading the initial condition and coefficient functions
defined above, dispatching on SphereNorm
.
struct SphereNorm end
# Initial condition and recurrence relation coefficients for unit normalization
initcond(::SphereNorm) = 1.0 / sqrt(4π)
coeff_μ(::SphereNorm, l) = sqrt(1 + 1/2l)
coeff_ν(::SphereNorm, l) = sqrt(1 + 2l)
coeff_α(::SphereNorm, l, m) = sqrt((2l+1)/(2l-3) * (4*(l-1)^2 - 1)/(l^2 - m^2))
coeff_β(::SphereNorm, l, m) = sqrt((2l+1)/(2l-3) * ((l-1)^2 - m^2)/(l^2 - m^2))
To make use of the new normalization, we now simply update legendre()
to
accept the normalization as an argument and replace each use of UnitNorm()
with the normalization argument.
function legendre(norm, ℓ, m, x)
P = convert(typeof(x), initcond(norm))
ℓ == 0 && m == 0 && return P
# Use recurrence relation 1 to increase to the target m
y = sqrt(one(x) - x^2)
for m′ in 1:m
μ = coeff_μ(norm, m′)
P = -μ * y * P
end
ℓ == m && return P
# Setup for 2-term recurrence, using recurrence relation 2 to set
# Pmin1
ν = coeff_ν(norm, m + 1)
Pmin2 = P
Pmin1 = ν * x * Pmin2
ℓ == m+1 && return Pmin1
# Use recurrence relation 3
for ℓ′ in m+2:ℓ
α = coeff_α(norm, ℓ′, m)
β = coeff_β(norm, ℓ′, m)
P = α * x * Pmin1 - β * Pmin2
Pmin2, Pmin1 = Pmin1, P
end
return P
end
With the normalization now baked into the recurrence, we have no problem calculating the appropriately-normalized polynomial value for $\lambda_{152}^{150}(0.2)$.
legendre(SphereNorm(), 152, 150, 0.2)
0.3883879907461426
It could be seen as unfortunate that the call syntax requires we explicitly
state which normalization is to used, but it’s trivial to define wrapper
functions that are named for each normalization style.
In accordance with the mathematical notation of this article, we could
choose to define Plm
and λlm
for the unit and spherical harmonic
normalizations, respectively.
Plm(l, m, x) = legendre(UnitNorm(), l, m, x)
λlm(l, m, x) = legendre(SphereNorm(), l, m, x)
Using the pre-normalization, we can now update the plot from the previous posting which had missing values due to overflow in the middle and show the two functions entirely:
plot(x, λlm.(151, 150, x), label="\$λ_{151}^{150}\$")
plot(x, λlm.(152, 150, x), label="\$λ_{152}^{150}\$")
legend()
title("Spherical Harmonic normalized Legendre polynomials")
Conclusions¶
The trait-based dispatch mechanism is one of the core features of the
Legendre.jl
package, and
we could undoubtedly use this simple implementation in real use cases.
To achieve highly efficient and high-accuracy results, though, the package
implements several other optimization and numerical tricks which we will
introduce and discuss in future postings.
We’ll explore some of these enhancements as well, using several goal
calculations to demonstrate existing deficiencies and how I have resolved
them.
Traits in Julia are sometimes also called “Holy Traits” in recognition of a core Julia contributor, Tim Holy, who early-on recognized the power of Julia’s multiple dispatch combined with singleton instance types as a way to make extensible code and type hierarchies. Trait types are now used extensively throughout base Julia, such as in defining the iterator interface (e.g.
HasLength()
,HasEltype()
, etc). ↩︎The implementation in this article is simplified from the version actually used in
Legendre.jl
. For instance, we’ve skipped using an abstract supertype to unify all normalizations under one type hierarchy. Furthermore, the interface functions have all been simplified with the assumption that we’re only interested inFloat64
results; supporting type-stable results over all numeric types requires passing additional contextual information to the calculations. ↩︎The diff is computed against a version of
legendre()
which has had the indexing updated for the $\ell+1 \rightarrow \ell$ change, so the before state does not exactly correspond to thelegendre()
function shown in the previous posting. This has been done to highlight the trait-based interface related changes. ↩︎