1 · The Problem & the MOM Thesis

For: anyone who has read the landing page and wants to know what the package actually does and why MOM is the right tool. Atmospheric scientists, retrieval developers, method developers, theorists, contributors.
Prev: Home · Next: 2 · Vector RTE & Discretization

vSmartMOM solves a well-defined problem with a well-chosen method. This page states the problem, explains why the matrix operator method is the right tool for it, and lists the eight things that distinguish this implementation from the other MOM codes you might compare it to. Every other page in the Concepts arc is one segment of the spine introduced at the bottom of this page.

The problem

Compute the polarized radiance leaving a layered, scattering, absorbing atmosphere illuminated by the sun, plus analytic Jacobians, fast enough for retrievals, on whatever hardware the user has.

The Stokes vector at the top of atmosphere (TOA) and bottom of atmosphere (BOA) is a function of: solar geometry (), viewing geometry (), and the vertical structure of the atmosphere — gas absorption, Rayleigh scattering, aerosols, clouds, the surface BRDF. Modern hyperspectral instruments (OCO-2/3, GOSAT, EMIT, …) measure (and increasingly , ) at millions of monochromatic points, and the retrieval algorithms that infer trace gas columns, aerosol microphysics, or surface properties from those measurements need not just but the Jacobian against every retrieval state , evaluated inside a Levenberg-Marquardt loop.

That is a four-axis problem: angles × Stokes components × wavelength × retrieval parameters. It needs to run in seconds, not hours, for production retrievals.

Why MOM

The matrix operator method (Plass & Kattawar, Hovenier, Sanghavi 2014) maps the per-layer vector RTE to a finite-dimensional matrix operator . Layers compose by linear algebra — specifically by the adding equations (Sanghavi 2014, Eqs 23–28). That single design choice gives three structural advantages over the discrete-ordinates, two-stream, and Monte-Carlo families:

Full multiple scattering by construction. Layer composition is matrix multiplication. There is no "single-scatter only" mode that you graduate out of, no fixed-point iteration, no Markov-chain convergence. The first layer added contributes its single-scatter contribution; each subsequent addition picks up all higher orders automatically.
Large optical thickness is cheap. Doubling reaches optical depth in doublings — logarithmic in . Discrete-ordinates solvers march in optical depth and slow down with thicker atmospheres; Monte Carlo loses photons to absorption. MOM does neither.
Operator-level analytic Jacobians. Each elemental, doubling, and adding step has an exact tangent-linear partner (Sanghavi 2014 App. C). The chain rule on the adding-doubling formulas has a compact closed form, so the Jacobian costs roughly the same as the forward run. Other codes resort to AD-everywhere or finite differences — both pay multiplicative overhead.

That's why MOM, in the abstract. The next bullet is what makes this particular MOM implementation different from the others.

What sets vSmartMOM apart

Every claim below has a file.jl:LINE next to it. No marketing.

Operator-level analytic linearization. The RT solver itself is differentiated by hand (Sanghavi 2014 App. C). ForwardDiff is used upstream at the optical-property boundary — Mie cross-sections, atmospheric profile, surface BRDF parameters — never through the adding-doubling kernels. That separation keeps the hot loop pure-FT and lets the analytic chain rule run on GPU. Evidence: src/CoreRT/CoreKernel/lin_added_layer_all_params.jl, src/CoreRT/CoreKernel/{elemental,doubling,interaction}_lin.jl.
One @kernel source compiles for CPU, CUDA, and Metal. Backend dispatch is a one-line method override per extension; the kernel bodies are unchanged. Metal works because KernelAbstractions.@kernel is the abstraction layer, not vendor BLAS. Evidence: src/Architectures.jl:33–96, ext/vSmartMOMCUDAExt.jl:21–27, ext/vSmartMOMMetalExt.jl:19–22.
Hybrid AD across the GPU boundary. ForwardDiff.Dual numbers flow through NNlib.batched_mul on CuArray — values and partials carried through CUBLAS together, no host-device round trips. Evidence: ext/gpu_batched_cuda.jl:141–177.
Polarization is a type, not a runtime branch. Stokes_I (), Stokes_IQ (), Stokes_IQU (), Stokes_IQUV () each subtype AbstractPolarizationType. The kernels specialize at compile time on the type — there's no if pol_type.n == 4 branch inside the inner loops. Evidence: src/Scattering/types.jl:92–143.
Optical properties as algebra. mixes scatterers in a layer (τϖ-weighted averaging); stacks layers vertically. Users compose layer optics like operands. Evidence: src/CoreRT/types.jl:1063–1101.
Weak GPU dependency via Julia 1.9 package extensions. CPU-only installs are first-class; CUDA and Metal load at runtime if available and otherwise stay out of your dependency graph. Evidence: ext/vSmartMOMCUDAExt.jl.
Three core variables per layer. The RT kernel differentiates against these directly, not a black-box scalar optical depth, which is why the chain rule is tractable end-to-end. Evidence: src/CoreRT/types_lin.jl:119–149, src/CoreRT/types.jl:1032+.
Exact finite-δ elemental, not the linear limit. Sanghavi 2014 Eqs. (19)–(20) are written in the limit (linear in ); other MOM codes that stop there need very thin elemental layers (large ) for that approximation to hold. vSmartMOM's elemental kernel uses the exact finite- single-scatter formulas (Fell 1997 Eqs 1.52–1.56, restated as SF2023-II Eqs 10–11), coded with -expm1(-x) and expdiff_neg(a, b) for numerical stability in Float32 on GPU. The elemental layer can be thicker at the same single-scatter accuracy → smaller → less round-off accumulating through doubling. See Concepts/04 § Elemental for the side-by-side equation comparison. Evidence: src/CoreRT/CoreKernel/elemental.jl:207–252.

The narrative thread

PROBLEM      Polarized radiance from a layered scattering+absorbing
             atmosphere, plus Jacobians, fast, on any hardware.
                |
                v
WHY MOM?     Per-layer matrix operator → linear-algebra composition →
             full multiple scattering by construction; large τ cheap
             (logarithmic via doubling); exact analytic Jacobians.
                |
                v
DISCRETIZE   Fourier in φ (m=0..max_m), Gauss/Radau quadrature in μ.
                |
                v
4 ARRAYS     Per Fourier moment m, each layer reduces to four arrays
             of shape (NquadN, NquadN, nSpec):
                τ, ϖ, Z⁺⁺, Z⁻⁺
                |
                v
BUILD        Gas absorption (HITRAN LBL) + Mie/Rayleigh scattering
LAYER        + τϖ-weighted mixing + δ-M truncation + vertical stacking
OPTICS       (RT basics 3, 3a, 3b, 3c).
                |
                v
SOLVE        Per Fourier moment m, layer iz from TOA → BOA:
(MOM)            elemental!  →  doubling!  →  interaction!
             (RT basics 4)
                |
                v
SURFACE      BRDF as the bottom-most AddedLayer (RT basics 5).
                |
                v
LINEARIZE    Operator-level chain rule on adding-doubling.
             ParameterLayout names the Jacobian columns.
             Forward + linearized < 2× forward-only (RT basics 6).
                |
                v
EVERYWHERE   One @kernel source compiles for CPU + CUDA + Metal.
ON ANY       All wavelengths in parallel via the spectral batch axis.
HARDWARE     ForwardDiff Duals flow through GPU batched paths.
             (RT basics 7)

That's the spine. Every page in this RT basics arc is one segment of it.

Where this thread continues

2 · Vector RTE & Discretization — write the equation, discretize it, introduce the four arrays.
3 · Layer Optical Properties — the abstraction every layer reduces to.
3a · Gas Absorption, 3b · Mie & Rayleigh, 3c · Mixing & δ-M — how those four arrays are built from physics.
4 · The MOM Solver — Elemental → Doubling → Adding kernels.
5 · Surfaces — BRDF as the bottom-most AddedLayer.
6 · Linearization — the operator-level chain rule.
7 · Architecture-Agnostic Code — one kernel source, CPU + CUDA + Metal, all wavelengths in parallel.
8 · Inelastic Extension — Raman / Cabannes (brief).

If you only have time for two pages, read 3 (Layer Optics) and 4 (MOM Solver). The rest is detail around them.

Hands-on tutorials

Runnable examples with Plotly figures:

Quick Start

References

Plass, G. N. & Kattawar, G. W. (1973). Matrix operator theory of radiative transfer. Appl. Opt. 12:314.
Hovenier, J. W. (1971). Multiple scattering of polarized light in planetary atmospheres. Astron. Astrophys. 13:7.
Sanghavi, S., Davis, A. B., Eldering, A. (2014). vSmartMOM: A vector matrix operator method-based radiative transfer model linearized with respect to aerosol properties. JQSRT 133:412–433, doi:10.1016/j.jqsrt.2013.09.004. (Primary methodological reference.)
Jeyaram, R., Sanghavi, S., Frankenberg, C. (2022). vSmartMOM.jl: An open-source Julia package for atmospheric radiative transfer and remote sensing tools. JOSS 7(80):4575, doi:10.21105/joss.04575. (Cite this for the package itself.)
Full crib sheet: docs/dev_notes/theory_references.md.

1 · The Problem & the MOM Thesis ​

The problem ​

Why MOM ​

What sets vSmartMOM apart ​

The narrative thread ​

Where this thread continues ​

Hands-on tutorials ​

References ​