
Preprocessing API

For narrative coverage of the source × target dispatch model and per-source pipelines, see Preprocessing overview and the per-source pages ERA5 spectral path and GEOS native cubed-sphere.

AtmosTransport.Preprocessing.AbstractBinaryWriter Type
julia
AbstractBinaryWriter{G <: AbstractTargetGeometry, FT,
                     Basis <: AbstractMassBasis}

Typed nominal for the topology's streaming binary writer. The third type parameter encodes the on-disk mass-basis convention (reusing State.AbstractMassBasis so the same nominal flows through the runtime reader path) so a writer↔reader pairing mismatch is a compile-time MethodError.

P1 ships only the abstract type; concrete subtypes land in P2.

source
AtmosTransport.Preprocessing.AbstractChainPolicy Type
julia
AbstractChainPolicy

Type-system tag for cross-day mass-state carry. Concrete subtypes are NoChain (no carry; each day starts from the raw source endpoint) and ChainedMass{T} where T is the array shape of the seed (e.g. NTuple{6, Array{Float32, 3}} for GEOS cubed-sphere).

source
AtmosTransport.Preprocessing.AbstractERA5GRIBSettings Type
julia
AbstractERA5GRIBSettings <: AbstractMetSettings

Abstract supertype for ERA5 native-GRIB sources. Concrete subtypes pick the source mesh via the flavor parameter on ERA5GRIBSettings.

source
AtmosTransport.Preprocessing.AbstractFieldKind Type
julia
AbstractFieldKind

Singleton-type tag selecting the vertical-reduction rule apply_vertical! uses for one payload field. Concrete subtypes (all zero-size singletons) below.

source
AtmosTransport.Preprocessing.AbstractMetReader Type
julia
AbstractMetReader{FT, S, CP}

Typed met-reader nominal. Bundles a met-source's settings, per-day handle context, and chained-mass-policy carry into one struct that the unified preprocessor driver dispatches on. Type parameters:

  • FT <: AbstractFloat — preprocessing float type (Float32 for GPU, Float64 for diagnostic runs).

  • S <: AbstractMetSettings — concrete settings type (GEOSITSettings, GEOSFPSettings, ERA5SpectralSettings, …).

  • CP <: AbstractChainPolicy — NoChain or ChainedMass{T} for some seed array type T.

Concrete subtypes implement the seven trait functions below.

source
AtmosTransport.Preprocessing.AbstractMetSettings Type
julia
AbstractMetSettings

Top-level supertype for typed met-data source descriptors used by the preprocessor. Concrete subtypes (e.g. GEOSITSettings, MERRA2Settings, ERA5SpectralSettings) carry source-specific paths and parameters and implement the read_window! / source_grid / windows_per_day interface.

process_day(date, grid::AbstractTargetGeometry, settings::AbstractMetSettings, vertical; ...) dispatches on settings to pick the reader.

source
AtmosTransport.Preprocessing.AbstractVerticalTransform Type
julia
AbstractVerticalTransform

Typed nominal selecting how native source levels are mapped to output levels. Concrete subtypes:

  • IdentityVertical — no merge.

  • MergeByIndex — explicit native-level groups.

  • MergeLayersThinnerThan — automatic local coarsening.

  • MergeAbovePressure — upper-atmosphere coarsening.

  • LevelSelection — echlevs-style level selection.

  • PressureOverlap — pressure-thickness overlap onto a different hybrid coordinate.

A concrete transform T is consumed by plan_vertical(transform, native_vc) to produce a VerticalPlan{FT, T}. apply_vertical! dispatches on (plan, ::FieldKind) to select the right per-field rule.

source
AtmosTransport.Preprocessing.AbstractWindowContract Type
julia
AbstractWindowContract{G <: AbstractTargetGeometry, FT}

Typed nominal owning a topology's per-window gate policy and the worst-window accumulator state. Concrete subtypes:

  • CubedSphereContract{FT} (cubed_sphere_contracts.jl)

  • LatLonContract{FT} (latlon_contracts.jl)

  • ReducedGaussianContract{FT} (reduced_gaussian_contracts.jl)

A concrete contract validates its policy at construction time (so e.g. positivity_cfl_limit = 0.0 errors before any window is preprocessed) and exposes the three-method trait surface:

julia
verify_window!(window, contract, win_idx) -> (replay, positivity)
update_accumulator!(contract, positivity_diag, win_idx) -> nothing
summarize_status!(contract; quarantine_path) -> nothing

window is the topology-specific window payload (NamedTuple of typed buffers today, P2-typed ReadyWindow{G, FT} later).

Closes foot-gun (A) from DESIGN.md: contract knobs aren't drift-prone kwargs anymore — each topology constructs its own contract once from config, with whatever fields IT needs.

source
AtmosTransport.Preprocessing.AbstractWindowWorkspace Type
julia
AbstractWindowWorkspace{G <: AbstractTargetGeometry, FT}

Typed nominal for the per-day target-shape workspace buffers. P1 ships only the abstract type; concrete subtypes land alongside the unified driver cutover in P2 (today's workspaces are NamedTuples constructed inside each topology's process_day orchestrator).

source
AtmosTransport.Preprocessing.ChainedMass Type
julia
ChainedMass{T} <: AbstractChainPolicy

The reader carries an end-of-day mass field of shape T into the next day's open. The shape T is the seed array type (e.g. NTuple{6, Array{Float32, 3}}); the actual seed VALUE lives in the reader struct, but its STATIC type is encoded here so end_of_day_seed return-type is known at compile time.

source
AtmosTransport.Preprocessing.ConvectionInterfaceFlux Type

Convective interface flux (e.g. cmfmc). Same selection rule as PressureFluxField.

source
AtmosTransport.Preprocessing.ConvectionTendencyField Type

Convective center tendency (e.g. dtrain). Extensive at the layer; sum within merged group.

source
AtmosTransport.Preprocessing.CubedSphereContract Type
julia
CubedSphereContract{FT} <: AbstractWindowContract{CubedSphereTargetGeometry, FT}

Typed nominal owning a CS preprocessor's per-window gate policy and worst-window positivity accumulator.

Fields:

  • replay_tol::Float64 — relative replay tolerance.

  • positivity_cfl_limit::Float64 — per-substep positivity CFL gate. Must satisfy 0 < limit ≤ 1; validated at construction.

  • require_substep_positivity::Bool — whether summarize_status! errors (true) or warns (false) on a positivity violation.

  • steps_per_window::Int — for the recommended-steps message in the summary's escape-hatch detail. Must be ≥ 1.

  • halo_width::Int — passed through to verify_substep_positivity_cs!.

  • worst::NamedTuple — mutable accumulator (initially zero).

Construct with explicit kwargs; defaults match the CS round-2/round-3 production policy.
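The construction-time validation described above can be sketched with an inner constructor. This is a minimal standalone illustration, not the package's implementation: `ToyContract` and its keyword defaults are hypothetical, and only the field names and the stated constraints (0 < limit ≤ 1, steps ≥ 1) are taken from the docstring.

```julia
# Hypothetical stand-in for a window contract; not the package's type.
struct ToyContract
    replay_tol::Float64
    positivity_cfl_limit::Float64
    steps_per_window::Int

    function ToyContract(; replay_tol = 1e-10,          # default is an assumption
                           positivity_cfl_limit = 1.0,  # default is an assumption
                           steps_per_window = 1)
        # Reject bad policy values before any window is processed.
        0 < positivity_cfl_limit <= 1 ||
            throw(ArgumentError("positivity_cfl_limit must satisfy 0 < limit ≤ 1, got $positivity_cfl_limit"))
        steps_per_window >= 1 ||
            throw(ArgumentError("steps_per_window must be ≥ 1, got $steps_per_window"))
        return new(replay_tol, positivity_cfl_limit, steps_per_window)
    end
end

ok = ToyContract(positivity_cfl_limit = 0.95, steps_per_window = 4)

# An invalid limit fails at construction time, before any window runs.
failed = try
    ToyContract(positivity_cfl_limit = 0.0)
    false
catch err
    err isa ArgumentError
end
```

Keeping the checks in the inner constructor means no instance with an invalid policy can exist, so downstream code does not need to re-validate.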

source
AtmosTransport.Preprocessing.ERA5C180RegridFields Type
julia
ERA5C180RegridFields{FT}

Per-window output container for fields regridded onto the C180 cubed-sphere target. Holds:

  • ps — 2D (Nc, Nc) matrix per panel (Pa, moist surface pressure),

  • u, v, t, qv — 3D (Nc, Nc, Nz) arrays per panel (U/V in geographic east/north frame; rotation to panel-local axes happens in the breakpoint-F glue where the panel basis is known).

Mass fields (m_dry, delp_dry, ps_dry) are not regridded directly; they are re-derived on the C180 mesh from the regridded PS + Q so that Σ_k DELP_dry == PS_dry to roundoff on the target side as well.

source
AtmosTransport.Preprocessing.ERA5C180RegridWorkspace Type
julia
ERA5C180RegridWorkspace{FT, R}

Owns the conservative regridder + flat scratch buffers used by regrid_n320_to_c180!. Allocated once per (source_grid, target_grid) pair and reused across every window.

source
AtmosTransport.Preprocessing.ERA5GRIBDayHandles Type
julia
ERA5GRIBDayHandles{S<:AbstractERA5GRIBSettings}

Per-day source-file context. Carries the resolved on-disk paths to the day's GRIB streams plus the optional next-day core path that supplies the right endpoint of the last window.

convection_path is nothing unless settings.include_convection is set; likewise surface_path is nothing unless settings.include_surface is set. next_core_path is nothing either when the caller passed next_day_handle=false or when the next-day file is not on disk (last day of the available archive).

source
AtmosTransport.Preprocessing.ERA5GRIBSettings Type
julia
ERA5GRIBSettings{flavor} <: AbstractERA5GRIBSettings

Typed settings for one ERA5 native-GRIB flavor:

  • flavor = :n320 — reduced linear-Gaussian N320, the default MARS native grid for ERA5 analyses (1280 longitudes at the equator, 137 hybrid levels).

ERA5 archives store hybrid model levels top-down (k = 1 is the TOA-side cap). level_orientation = :top_down reflects that, matching era5.toml. The reader flips to the project's runtime convention downstream — same path used by the GEOS-IT bottom-up source.

source
AtmosTransport.Preprocessing.ERA5N320ConvectionFields Type
julia
ERA5N320ConvectionFields{FT}

Per-window convection forecast fields on the N320 source mesh. All four fields are (n_cells, Nz):

  • udmf — updraft convective mass flux (kg m⁻² s⁻¹)

  • ddmf — downdraft convective mass flux (kg m⁻² s⁻¹)

  • udrf — updraft detrainment rate (kg m⁻³ s⁻¹)

  • ddrf — downdraft detrainment rate (kg m⁻³ s⁻¹)

Downstream conversion to GEOS-style CMFMC + DTRAIN (or TM5-style entu/entd) happens in the breakpoint-F glue once both source and target geometries are known.

source
AtmosTransport.Preprocessing.ERA5N320DryMassFields Type
julia
ERA5N320DryMassFields{FT}

Per-window dry-basis output container. m_dry is dry-air mass per cell per layer (kg), delp_dry is dry pressure thickness per cell per layer (Pa), ps_dry is the column-integrated dry surface pressure (Pa). ps_dry_acc is a Float64 accumulator that backs ps_dry so Σ_k DELP_dry stays accurate even when FT = Float32 (137-layer summation with ~10 hPa values per layer would otherwise lose on the order of 1 Pa to single-precision rounding).
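The effect of the accumulator type can be seen in a small standalone sketch. The uniform synthetic column and the helper `column_sum` are assumptions made for illustration; the actual drift depends on the layer values and on the summation order.

```julia
# Sum 137 layer thicknesses in Float32 and in Float64 and compare.
Nz = 137
delp = fill(Float32(101325 / Nz), Nz)   # synthetic, uniform layers (assumption)

function column_sum(::Type{Acc}, delp) where {Acc}
    acc = zero(Acc)
    for dp in delp
        acc += Acc(dp)      # sequential accumulation in the accumulator type
    end
    return acc
end

ps32 = column_sum(Float32, delp)
ps64 = column_sum(Float64, delp)
drift = abs(Float64(ps32) - ps64)   # rounding drift of the Float32 accumulator
```

The inputs are Float32 in both cases; only the running sum changes precision, which is the role the docstring assigns to ps_dry_acc.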

source
AtmosTransport.Preprocessing.ERA5N320SpectralWorkspace Type
julia
ERA5N320SpectralWorkspace{FT, G}

Per-window workspace for ERA5 N320 spectral synthesis. Owns:

  • the spectral coefficient cubes for T, VO, D, LNSP for the current hour (sized (T+1) × (T+1) × Nz in ComplexF64),

  • per-level scratch matrices u_spec / v_spec reused inside the synthesis loop,

  • a ReducedSpectralThreadCache with Legendre column buffer plus FFT and real ring buffers sized to every unique ring length in the source mesh,

  • a read_buf scratch used by read_spectral_coeffs! for the raw ecCodes codes_get_double_array payload,

  • completion bookkeeping (have_t / have_vo / have_d / have_lnsp).

The cubes dominate memory: at T = 639 and Nz = 137 each cube is ≈ 0.9 GB. Workspaces are intended to be allocated once per day-handle and reused across the 24 hours.

Spectral buffers (vo_spec, d_spec, t_spec, lnsp_spec, u_spec, v_spec, synth_cache.P_buf) are unconditionally ComplexF64 / Float64 regardless of FT; read_spectral_coeffs! and vod2uv! only operate at that precision. FT only controls the eltype of the downstream gridpoint fields written via read_era5_n320_window_fields!.
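The cube size quoted above follows directly from the stated dimensions. A short check, using a hypothetical helper `cube_bytes`:

```julia
# Bytes for one (T+1) × (T+1) × Nz ComplexF64 coefficient cube.
cube_bytes(T, Nz) = (T + 1)^2 * Nz * sizeof(ComplexF64)

bytes = cube_bytes(639, 137)
gb = bytes / 1e9      # decimal gigabytes
```

With four such cubes (T, VO, D, LNSP) sized this way, the coefficient storage alone would be roughly four times this figure, which is why the workspace is allocated once and reused.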

source
AtmosTransport.Preprocessing.ERA5N320ToC180Pipeline Type
julia
ERA5N320ToC180Pipeline{FT, RW, CSGrid, SrcGrid}

All-in-one container for the per-day ERA5 N320 → C180 preprocessing workspace. Holds the source-grid descriptor, the hybrid coordinate, the shared per-cell area vector, every read/derive/regrid workspace from breakpoints B-D, the convection workspace from E, and the per-window output fields on both the source mesh and the C180 target.

One pipeline allocated per day-handle, reused across the 24 hourly windows.

source
AtmosTransport.Preprocessing.ERA5N320WindowFields Type
julia
ERA5N320WindowFields{FT}

Per-window output container. u / v (m/s, geographic frame), t (K), qv (kg/kg specific humidity) are (n_cells, Nz); ps (Pa) is (n_cells,). Dry-basis layer mass derivation, regridding, and convection conversion are downstream of this struct.

source
AtmosTransport.Preprocessing.ERA5PhysicsBinaryHeader Type
julia
ERA5PhysicsBinaryHeader

In-memory view of the ERA5 physics BIN header. Parsed from / serialized to a 4 KB JSON block at the start of each BIN file. Fields:

  • format_version — int, currently 1. Readers error on mismatch.

  • date — Date, the calendar day the BIN represents (hours 00-23).

  • Nlon, Nlat, Nlev, Nt — grid shape. Nt is always 24 for the hourly calendar-day BIN.

  • var_offsets_bytes — NamedTuple mapping var name → byte offset into the file (where that variable's payload starts, after the header block).

  • var_nelems — NamedTuple mapping var name → total element count.

  • latitude_convention — :S_to_N (AtmosTransport orientation).

  • longitude_range — (first, last, step) in degrees.

  • latitude_range — (first, last, step) in degrees (S to N).

  • provenance — Dict with source NC paths, timestamp, git sha.

source
AtmosTransport.Preprocessing.ERA5PhysicsBinaryReader Type
julia
ERA5PhysicsBinaryReader

Mmap view of one ERA5 physics BIN file. Per-variable getters return reshaped views with zero allocation on subsequent calls.

Fields

  • path — absolute BIN path.

  • io — open IOStream (keep alive for mmap lifetime).

  • header — parsed ERA5PhysicsBinaryHeader.

  • mmap — single flat Vector{Float32} over the entire payload. Individual variable views reshape into this.

Usage

julia
reader = open_era5_physics_binary(bin_dir, Date(2021, 12, 1))
try
    udmf = get_era5_physics_field(reader, :udmf)    # 4D (Nlon, Nlat, Nlev, 24)
    slab = @view udmf[:, :, :, 5]                    # one hour
    # ... use slab ...
finally
    close_era5_physics_binary(reader)
end
source
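The "single flat vector, reshaped per variable" layout can be sketched without touching the file system. Everything here is a simplified assumption: the payload is an ordinary in-memory vector rather than an mmap, offsets are counted in elements rather than bytes, and `field_view` is a hypothetical helper, not the package's getter.

```julia
# Flat Float32 payload standing in for the mmap'd vector (in-memory here).
Nlon, Nlat, Nlev, Nt = 4, 3, 2, 24
nelems = Nlon * Nlat * Nlev * Nt

# Hypothetical layout: two variables stored back to back.
var_offsets = (udmf = 0, ddmf = nelems)          # element offsets, not bytes
payload = Float32.(1:(2 * nelems))

# Return a 4D array that aliases the flat payload (no copy of the data).
function field_view(payload, offsets, name::Symbol, dims)
    start = offsets[name] + 1
    return reshape(view(payload, start:(start + prod(dims) - 1)), dims)
end

ddmf = field_view(payload, var_offsets, :ddmf, (Nlon, Nlat, Nlev, Nt))
slab = @view ddmf[:, :, :, 5]     # one hour
ddmf[1, 1, 1, 1] = -1f0           # writes through to the flat payload
```

Because the reshaped array aliases the payload, the backing storage must outlive every view, which matches the "keep alive for mmap lifetime" note on the io field.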
AtmosTransport.Preprocessing.ERA5SpectralReader Type
julia
ERA5SpectralReader{FT, S} <: AbstractMetReader{FT, S, NoChain}

Typed reader nominal for the ERA5 spectral path. ChainPolicy is fixed at NoChain because today's spectral path does not carry cross-day mass state (it pins global-mean ps per window instead). The per-window read still fuses with merge in today's process_window! and is deferred to P2.

source
AtmosTransport.Preprocessing.ERA5SpectralSettings Type
julia
ERA5SpectralSettings <: AbstractMetSettings

Typed wrapper for the historical ERA5 spectral settings NamedTuple. The spectral preprocessor still performs spectral synthesis inside the topology workspaces, but the source axis is now explicit: TOML parsing constructs this settings type and topology methods dispatch on it.

source
AtmosTransport.Preprocessing.GEOSDayHandles Type
julia
GEOSDayHandles

Open NCDataset handles for one UTC day's GEOS collections plus the resolved level orientation and the hybrid coordinate (loaded once for endpoint-DELP reconstruction). The orchestrator opens these once at the start of process_day and closes them at the end.

source
AtmosTransport.Preprocessing.GEOSNativeReader Type
julia
GEOSNativeReader{FT, S, CP, V, H} <: AbstractMetReader{FT, S, CP}

Typed reader wrapping (GEOSSettings, handles) for one day, plus optional chained-mass state. Type parameters:

  • FT — preprocessing float type.

  • S <: AbstractGEOSSettings — concrete settings (GEOSITSettings / GEOSFPSettings / …).

  • CP <: AbstractChainPolicy — NoChain or ChainedMass{T}.

  • V — seed-field slot type. Nothing for NoChain; Union{Nothing, NTuple{6, Array{FT, 3}}} for ChainedMass. The V parameter is fixed at construction (Julia-style review round-1: value-dependent V was the type-instability foot-gun; we pin V to the unconditional Union on the chained path so the inner constructor specializes once).

  • H — concrete handles type. Pinned to typeof(open_day(...)) at construction so reader.handles access is type-stable (replaces the previous :: Any field). Different GEOS flavors produce different handle structs (GEOSDayHandles for GEOS-IT, GEOSFPNativeDayHandles for GEOS-FP); each becomes its own concrete reader type via this parameter.

Constructor: open_reader(settings::AbstractGEOSSettings, date, FT; seed, next_day_handle). See the function docstring.

source
AtmosTransport.Preprocessing.GEOSSettings Type
julia
GEOSSettings{flavor} <: AbstractGEOSSettings

Settings for one of the two supported GEOS flavors:

  • flavor = :geosit — GEOS-IT (file pattern GEOSIT.{date}.{collection}.C{Nc}.nc).

  • flavor = :geosfp — GEOS-FP (file pattern GEOSFP.{date}.{collection}.C{Nc}.nc).

Auto-detection of level orientation runs at open_geos_day time when level_orientation = :auto. Set explicitly to :bottom_up or :top_down to skip the heuristic.
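Dispatch on a symbol-valued type parameter, as used for flavor, can be sketched as follows. `ToyGEOSSettings`, `file_prefix`, and `collection_filename` are hypothetical names; the date format and the collection name are assumptions, and only the flavor symbols and the overall file pattern come from the docstring.

```julia
using Dates

# Hypothetical settings type parameterized by a flavor symbol.
struct ToyGEOSSettings{flavor}
    Nc::Int
end

file_prefix(::ToyGEOSSettings{:geosit}) = "GEOSIT"
file_prefix(::ToyGEOSSettings{:geosfp}) = "GEOSFP"

# The date format and collection name used below are assumptions.
function collection_filename(s::ToyGEOSSettings, date::Date, collection::AbstractString)
    stamp = Dates.format(date, "yyyymmdd")
    return string(file_prefix(s), ".", stamp, ".", collection, ".C", s.Nc, ".nc")
end

name = collection_filename(ToyGEOSSettings{:geosit}(180), Date(2021, 12, 1), "CTM_A1")
```

An unsupported flavor has no `file_prefix` method, so it fails with a MethodError at the first call instead of silently producing a wrong path.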

source
AtmosTransport.Preprocessing.IdentityVertical Type

No-op identity vertical transform. Nz_output = Nz_native, merge_map[k] = k.

source
AtmosTransport.Preprocessing.IntensiveCenterField Type

Center-level intensive field (e.g. T, Q). Mass-weighted mean within merged group; weights provided positionally.

source
AtmosTransport.Preprocessing.LatLonContract Type
julia
LatLonContract{FT} <: AbstractWindowContract{LatLonTargetGeometry, FT}

Typed nominal owning an LL preprocessor's per-window gate policy and worst-window positivity accumulator. Construction validates positivity_cfl_limit ∈ (0, 1] and steps_per_window ≥ 1 so an invalid TOML value fails before any window runs.

source
AtmosTransport.Preprocessing.LevelSelection Type
julia
LevelSelection(echlevs)

Typed wrapper for the existing select_levels_echlevs algorithm. echlevs is a vector of native level INTERFACE indices (0-based, bottom-up); levels between selected interfaces are summed. See vertical_coordinates.jl:64 and the ECHLEVS_ML137_* constants.

source
AtmosTransport.Preprocessing.MassField Type

Center-level extensive mass (e.g. delp, m). Sum native layers within each merged group.

source
AtmosTransport.Preprocessing.MassFluxField Type

Horizontal face mass flux (e.g. am, bm). Already integrated over the layer thickness; sum within merged group.

source
AtmosTransport.Preprocessing.MergeAbovePressure Type
julia
MergeAbovePressure(pressure_Pa; target_min_thickness_Pa = Inf,
                                 reference_surface_pressure_Pa = 101325.0)

Upper-atmosphere coarsening: native layers whose midpoint pressure is LOWER than pressure_Pa (physically ABOVE the cutoff altitude) get greedily merged until each merged layer exceeds target_min_thickness_Pa. Below the cutoff, the native grid is preserved verbatim.

target_min_thickness_Pa = Inf merges every above-cutoff native layer into one top cap. The GEOS-IT L72 use case is pressure_Pa = 100.0 with target_min_thickness_Pa = 50.0, which merges the ~14 Pa mesospheric layers into ~50 Pa output layers while keeping the troposphere and stratosphere at native resolution.

source
AtmosTransport.Preprocessing.MergeByIndex Type
julia
MergeByIndex(groups)

Explicit native center-level groups. groups[l] is the UnitRange{Int} of native center levels merged into output level l. Validation at plan_vertical:

  • groups[1] starts at 1; groups[end] ends at Nz_native;

  • groups are contiguous (groups[l+1].start == groups[l].stop + 1);

  • each range is non-empty.

This is the most auditable transform for production reruns — the group list lives in the run TOML and is version-controlled.
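The three validation rules listed above can be written as a short standalone check. `validate_groups` is a hypothetical helper for illustration; the package performs its own validation inside plan_vertical.

```julia
# Check the three rules listed above for a vector of native-level ranges.
function validate_groups(groups::Vector{UnitRange{Int}}, Nz_native::Int)
    isempty(groups) && throw(ArgumentError("groups must not be empty"))
    first(groups[1]) == 1 ||
        throw(ArgumentError("groups[1] must start at 1"))
    last(groups[end]) == Nz_native ||
        throw(ArgumentError("groups[end] must end at Nz_native = $Nz_native"))
    for (l, g) in enumerate(groups)
        isempty(g) && throw(ArgumentError("group $l is empty"))
        if l > 1 && first(g) != last(groups[l - 1]) + 1
            throw(ArgumentError("group $l is not contiguous with group $(l - 1)"))
        end
    end
    return true
end

valid = validate_groups([1:3, 4:4, 5:8], 8)

# A gap between groups (native level 4 is missing) is rejected.
gap_rejected = try
    validate_groups([1:3, 5:8], 8)
    false
catch err
    err isa ArgumentError
end
```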

source
AtmosTransport.Preprocessing.MergeLayersThinnerThan Type
julia
MergeLayersThinnerThan(min_thickness_Pa; reference_surface_pressure_Pa = 101325.0)

Typed wrapper for the existing merge_thin_levels algorithm: greedily merge adjacent native layers until each output layer exceeds min_thickness_Pa at the reference surface pressure.
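A greedy merge of this kind can be sketched as below. This is an illustration under stated assumptions, not the merge_thin_levels algorithm itself: the thicknesses are taken as already evaluated at the reference surface pressure, a group closes once its accumulated thickness reaches the minimum, and a too-thin remainder at the end of the column is folded into the previous group. `greedy_merge` is a hypothetical name.

```julia
# Greedy pass over native layer thicknesses (Pa), ordered along the column.
function greedy_merge(thickness::Vector{Float64}, min_thickness::Float64)
    groups = UnitRange{Int}[]
    start, acc = 1, 0.0
    for k in eachindex(thickness)
        acc += thickness[k]
        if acc >= min_thickness
            push!(groups, start:k)      # close the current group
            start, acc = k + 1, 0.0
        end
    end
    # Assumption: a too-thin remainder is folded into the last complete group.
    if start <= lastindex(thickness)
        if isempty(groups)
            push!(groups, start:lastindex(thickness))
        else
            groups[end] = first(groups[end]):lastindex(thickness)
        end
    end
    return groups
end

groups = greedy_merge([10.0, 20.0, 30.0, 500.0, 800.0, 5.0], 50.0)
```

Here the three thin layers at the start merge into one group, the thick layers pass through unchanged, and the 5 Pa remainder joins the group before it.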

source
AtmosTransport.Preprocessing.NoChain Type
julia
NoChain <: AbstractChainPolicy

The reader does not carry mass state across days. Day N starts from the raw source endpoint. end_of_day_seed(::reader{…, NoChain}) returns nothing (statically inferred).
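The chain-policy pattern can be sketched with a toy reader whose policy is a type parameter. All names below (`ToyChainPolicy`, `ToyNoChain`, `ToyChainedMass`, `ToyReader`) are hypothetical stand-ins, and this `end_of_day_seed` is a local function defined for the sketch rather than the package's method.

```julia
abstract type ToyChainPolicy end
struct ToyNoChain <: ToyChainPolicy end
struct ToyChainedMass{T} <: ToyChainPolicy end

# The policy is a type parameter; V is the type of the seed slot.
struct ToyReader{CP <: ToyChainPolicy, V}
    seed::V
end

ToyReader(::Type{ToyNoChain}) = ToyReader{ToyNoChain, Nothing}(nothing)
ToyReader(seed::T) where {T <: Array} = ToyReader{ToyChainedMass{T}, T}(seed)

# Dispatch on the policy parameter selects the seed behaviour.
end_of_day_seed(::ToyReader{ToyNoChain}) = nothing
end_of_day_seed(r::ToyReader{ToyChainedMass{T}}) where {T} = r.seed

fresh   = ToyReader(ToyNoChain)
chained = ToyReader(zeros(Float32, 2, 2, 3))
```

Because the seed type appears in the policy parameter, the compiler can infer the return type of `end_of_day_seed` from the reader type alone.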

source
AtmosTransport.Preprocessing.PreprocessorRunCache Type
julia
PreprocessorRunCache{G, FT}()

Small typed cache for artifacts that should be built once per preprocessing run instead of once per day/window (for example spectral LL->CS regridders or RG compressed Laplacians). P2b only introduces the nominal and storage; concrete drivers decide which keys they own as they migrate.

source
AtmosTransport.Preprocessing.PressureFluxField Type

Vertical interface mass flux (e.g. cm). Interfaces are selected (not summed); top/bottom zeros preserved.

source
AtmosTransport.Preprocessing.PressureOverlap Type
julia
PressureOverlap(target_coeff_path)

Remap native layer integrals onto an independent target hybrid coordinate by pressure-thickness overlap. The target half-level coefficients are loaded from target_coeff_path. Full apply_vertical! implementation lands in P1 alongside the spectral driver cutover; plan_vertical constructs the plan today.

source
AtmosTransport.Preprocessing.PreverifiedWindow Type
julia
PreverifiedWindow(ready::ReadyWindow, contract_diag; accumulated=false)

Typed event wrapper for topology hooks that have already run verify_window! before handing a window to the generic driver. When accumulated=true, the hook also guarantees it has already folded the positivity diagnostic into the contract accumulator.

source
AtmosTransport.Preprocessing.RawWindow Type
julia
RawWindow{FT, A2, A3}

Per-window source-grid intermediate carrying both window endpoints (t_n and t_{n+1}) and the window-integrated horizontal mass fluxes between them.

The right endpoint of window n is the left endpoint of window n+1, so the reader caches it across calls. For the last window of the day, the right endpoint comes from the next day's first instantaneous file (the existing next_day_hour0 plumbing in the orchestrator).

Cell-center winds u, v (geographic frame) are filled only when the target grid differs from the source grid and fluxes must be reconstructed downstream. For native passthrough (source mesh == target mesh) u/v stay nothing and am/bm are written through directly after vertical merging.

cmfmc/dtrain are filled only when the source supports convection and the user has enabled it via settings.include_convection. vdiff carries optional Holtslag-Boville VDIFF source fields (u, v, t, qv) when the source can archive them into the transport binary for runtime diffusion.

source
AtmosTransport.Preprocessing.ReadyWindow Type
julia
ReadyWindow{G, FT}(index, payload::NamedTuple)

Typed wrapper for a topology-specific window payload that is ready to be verified and written. G is the target geometry, FT is the on-disk float type, and payload is the existing topology NamedTuple (m_cur/am/... for LL/CS, m_cur/hflux/... for RG).

Unknown property access is forwarded to payload, so existing verify_window!(window, contract, win_idx) methods can accept either a raw NamedTuple or a ReadyWindow.
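Property forwarding of this kind is usually done by overloading `Base.getproperty`. The sketch below uses a hypothetical `ToyReadyWindow` without the G and FT parameters, and `total_mass` is an illustrative consumer, not a package function.

```julia
# Hypothetical wrapper; forwards unknown properties to the wrapped NamedTuple.
struct ToyReadyWindow{P <: NamedTuple}
    index::Int
    payload::P
end

function Base.getproperty(w::ToyReadyWindow, name::Symbol)
    # Own fields win; anything else is looked up in the payload.
    name in fieldnames(ToyReadyWindow) && return getfield(w, name)
    return getproperty(getfield(w, :payload), name)
end

# A function written against the payload's field names accepts both forms.
total_mass(window) = sum(window.m_cur)

payload = (m_cur = [1.0, 2.0, 3.0], am = [0.1, -0.1])
ready = ToyReadyWindow(1, payload)
```

Inside the overload, `getfield` is used instead of dot access so the method does not call itself recursively.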

source
AtmosTransport.Preprocessing.ReducedGaussianContract Type
julia
ReducedGaussianContract{FT} <: AbstractWindowContract{ReducedGaussianTargetGeometry, FT}

Typed nominal owning an RG preprocessor's per-window gate policy and worst-window positivity accumulator. Holds the face connectivity (face_left / face_right) so the per-window call site doesn't need to thread it through every call.

Construction validates replay_tol, positivity_cfl_limit ∈ (0, 1], steps_per_window ≥ 1, boundary_stub_tol ≥ 0, and length(face_left) == length(face_right).

boundary_stub_tol (default 0.0) is the absolute tolerance for the boundary-stub flux gate (verify_boundary_stub_flux_rg). The default is strict; loosen only with caution.

source
AtmosTransport.Preprocessing.SurfaceField Type

2D surface field (e.g. ps, pblh). No vertical reduction — identity passthrough.

source
AtmosTransport.Preprocessing.TM5PreprocessingWorkspace Type
julia
TM5PreprocessingWorkspace{FT, R, B}

Per-day scratch for the TM5 convection preprocessor. Holds the per-column vectors reused across all columns of an hour, plus the 4×(Nlon_src, Nlat_src, Nz_native) native-vertical buffers and the 4×(Nlon_src, Nlat_src, Nz) merged-vertical buffers on the ERA5 source grid. regridder is the conservative regridder from the ERA5 LL source mesh to the target mesh, or nothing when source and target are the same LL grid (identity fast-path).

physics_bufs::B holds optional FT-typed conversion buffers for the six physics inputs (udmf, ddmf, udrf, ddrf, t, q). When the workspace FT matches the physics BIN eltype (Float32 today), B === Nothing and the kernel reads the BIN views directly with zero copy. When FT differs (e.g. the F64 path), B === NTuple{6, Array{FT, 3}} and compute_tm5_merged_hour_on_source! upcasts hour views into them in-place — alloc once per workspace, reused across all 24 hours per day.

source
AtmosTransport.Preprocessing.TracerMassField Type

Center-level extensive tracer mass (e.g. qv-mass). Sum native layers within each merged group.

source
AtmosTransport.Preprocessing.UnifiedPreprocessorDay Type
julia
UnifiedPreprocessorDay(reader, workspace, contract, writer; context=nothing)

Bundle the four typed axes a unified preprocessing day needs. context is an opaque topology/source adapter payload used by hook methods during migration.

source
AtmosTransport.Preprocessing.VerticalPlan Type
julia
VerticalPlan{FT, T <: AbstractVerticalTransform}

Result of plan_vertical. Type-parameterized by the originating transform so the apply_vertical! dispatch is statically resolved.

Fields:

  • transform : the originating AbstractVerticalTransform value.

  • native_vc : the input native hybrid coordinate.

  • merged_vc : the output hybrid coordinate.

  • merge_map : Vector{Int} such that merge_map[k_native] is the output-level index for merge-map flavors. Int[] for PressureOverlap (uses overlap coefficients instead).

  • groups : Vector{UnitRange{Int}} of native center-level ranges that map to each output level (derived from merge_map for the merge-map flavors).

  • Nz_output : the output level count.

  • Nz_native : the input level count (cached for cheap access).
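The relation between merge_map and groups for the merge-map flavors can be sketched as below. `groups_from_merge_map` is a hypothetical helper that assumes a monotone map in which every output level owns a contiguous, non-empty run of native levels.

```julia
# Derive per-output-level ranges from a monotone merge map.
function groups_from_merge_map(merge_map::Vector{Int})
    Nz_output = maximum(merge_map)
    groups = Vector{UnitRange{Int}}(undef, Nz_output)
    for l in 1:Nz_output
        ks = findall(==(l), merge_map)
        groups[l] = first(ks):last(ks)   # assumes each group is contiguous
    end
    return groups
end

merge_map = [1, 1, 1, 2, 3, 3]           # merge_map[k_native] -> output level
groups = groups_from_merge_map(merge_map)
```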

source
AtmosTransport.Preprocessing.TM5CleanupStats Method
julia
TM5CleanupStats() -> NamedTuple of Ref{Int} counters

Diagnostic counters bumped by ec2tm_from_rates!. Zero overhead when the function runs without stats (pass nothing); when stats are passed, each counter increments once per level/column that hit the corresponding cleanup branch.

  • columns_processed — total columns the function was called on.

  • no_updraft — columns with no level satisfying udmf > 0 (after small-value clipping). entu/detu zeroed out.

  • no_downdraft — columns with no level satisfying ddmf < 0.

  • levels_udmf_clipped, levels_ddmf_clipped — levels where half-level mass flux magnitude was below 1e-6 kg/m²/s and got zeroed.

  • levels_udrf_clipped, levels_ddrf_clipped — full-level detrainment rates zeroed (|rate| < 1e-10 kg/m³/s).

  • levels_entu_neg, levels_detu_neg, levels_entd_neg, levels_detd_neg — levels where the indicated output went negative and got fixed via symmetric redistribution with its complementary rate.

Interpretation

On clean data we expect ~0% clipped levels and ~0 no-updraft columns (outside pure stratospheric columns). O(1%) redistribution firings are normal TM5 behaviour. O(50%) firings indicate a data pathology (wrong param IDs, wrong stream, bad units).
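The counter pattern described above (a NamedTuple of Ref{Int}, with nothing to disable counting) can be sketched as follows. The two-counter NamedTuple, `bump!`, and `clip_small!` are hypothetical simplifications and do not reproduce ec2tm_from_rates!.

```julia
# Hypothetical two-counter version of the stats NamedTuple.
make_stats() = (columns_processed = Ref(0), levels_clipped = Ref(0))

# No-op method when the caller passes `nothing`.
bump!(::Nothing, ::Symbol, n::Int = 1) = nothing
bump!(stats::NamedTuple, name::Symbol, n::Int = 1) = (stats[name][] += n; nothing)

# Zero out small fluxes in one column, counting what was clipped.
function clip_small!(column::Vector{Float64}, threshold::Float64, stats = nothing)
    bump!(stats, :columns_processed)
    for k in eachindex(column)
        if column[k] != 0 && abs(column[k]) < threshold
            column[k] = 0.0
            bump!(stats, :levels_clipped)
        end
    end
    return column
end

stats = make_stats()
clip_small!([0.2, 5e-7, 0.0, 3e-8], 1e-6, stats)   # with counters
clip_small!([0.2, 5e-7], 1e-6)                      # without counters
```

Dispatching on `Nothing` lets the compiler drop the counter updates entirely in the no-stats path.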

source
AtmosTransport.Preprocessing.allocate_era5_c180_regrid_workspace Method
julia
allocate_era5_c180_regrid_workspace(source_grid, target_grid, Nz; cache_dir=nothing)

Build (or load from cache_dir) the N320 → C180 conservative regridder and allocate the flat scratch buffers. The regridder's intersections matrix is the expensive piece — on first run it is built and serialised to JLD2 under cache_dir, and subsequent runs load it in milliseconds.

source
AtmosTransport.Preprocessing.allocate_era5_n320_spectral_workspace Method
julia
allocate_era5_n320_spectral_workspace(source_grid, T, Nz)

Allocate a fresh workspace sized to source_grid (build via discover_era5_n320_source_grid), spectral truncation T (from discover_era5_spectral_truncation) and vertical level count Nz (defaults to 137 in callers, but the workspace itself stays generic).

source
AtmosTransport.Preprocessing.allocate_era5_n320_to_c180_pipeline Method
julia
allocate_era5_n320_to_c180_pipeline(handles, target_grid;
                                    Nz=ERA5_NATIVE_LEVEL_COUNT,
                                    cache_dir=nothing,
                                    include_convection=true)

Build the full per-window pipeline for the ERA5 source described by handles (resolved via open_era5_day) and the C-tier target target_grid. Discovers the source mesh and spectral truncation from the day's core GRIB, loads the hybrid coordinate file declared in the settings, builds (or JLD2-loads from cache_dir) the conservative regridder, and allocates every B/C/D/E workspace.

Convection workspace allocation is gated on include_convection so the caller can opt out for a scalar-only smoke test.

source
AtmosTransport.Preprocessing.allocate_raw_window Function
julia
allocate_raw_window(settings::AbstractMetSettings;
                    FT::Type, Nc=nothing, Nz=nothing) -> RawWindow

Allocate a pre-zeroed RawWindow sized for one window of settings's source data. The orchestrator calls this once before the per-window loop and reuses the same buffer across all windows in a day. The shape parameters (Nc, Nz, …) are source-specific — concrete subtypes pick the keys they need.

source
AtmosTransport.Preprocessing.allocate_raw_window Method
julia
allocate_raw_window(settings::GEOSSettings; FT, Nz) -> RawWindow

Pre-allocate a per-window workspace for the GEOS reader: 6 zero-filled panel arrays each for m, m_next, qv, qv_next, am, bm (shape (Nc, Nc, Nz)) and ps, ps_next (shape (Nc, Nc)).

When settings.include_convection, also allocates cmfmc (interfaces, shape (Nc, Nc, Nz + 1) per panel) and dtrain (centers, shape (Nc, Nc, Nz) per panel). Cross-topology winds (u, v) stay nothing here — they are produced by the orchestrator only when the target grid differs from the source.

source
AtmosTransport.Preprocessing.allocate_tm5_workspace Method
julia
allocate_tm5_workspace(Nlon_src, Nlat_src, Nz_native, Nz, FT;
                       regridder=nothing, target_nlon=Nlon_src,
                       target_nlat=Nlat_src) -> TM5PreprocessingWorkspace

Allocate the TM5 preprocessing workspace. Pass regridder=nothing for identity (source and target grids match). Otherwise pass a ConservativeRegridding.Regridder built via build_regridder(source_mesh, target_mesh) from src/Regridding/.

source
AtmosTransport.Preprocessing.allocate_window_workspace Function
julia
allocate_window_workspace(args...; kwargs...)

Construct the topology-specific AbstractWindowWorkspace{G, FT} for one preprocessing day. Concrete methods land as production drivers migrate.

source
AtmosTransport.Preprocessing.apply_vertical! Function
julia
apply_vertical!(buf_out, buf_in, plan::VerticalPlan, kind::AbstractFieldKind, args...)

Apply the vertical transform to one window of buf_in, writing the result into buf_out. Dispatches on the (plan.transform, kind) combination:

  • Extensive center fields (MassField, TracerMassField, MassFluxField, ConvectionTendencyField) sum native layers within each output group.

  • Interface fields (PressureFluxField, ConvectionInterfaceFlux) select the kept half-level interfaces.

  • IntensiveCenterField takes an additional positional weights argument (native mass-per-layer); produces the mass-weighted mean within each output group.

  • SurfaceField is a passthrough copy (no vertical reduction).

buf_out and buf_in must be 3D arrays with the vertical axis on dim 3 (or 2D for SurfaceField).
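The two center-field rules (sum for extensive fields, mass-weighted mean for intensive fields) can be sketched on plain arrays. `sum_groups!` and `weighted_mean_groups!` are hypothetical helpers operating on a groups vector directly; they do not use VerticalPlan or the field-kind dispatch.

```julia
# Sum an extensive 3D field over each native-level group (vertical axis on dim 3).
function sum_groups!(out, inp, groups)
    fill!(out, zero(eltype(out)))
    for (l, g) in enumerate(groups), k in g
        @views out[:, :, l] .+= inp[:, :, k]
    end
    return out
end

# Mass-weighted mean of an intensive field, weighted by native layer mass.
function weighted_mean_groups!(out, inp, mass, groups)
    wsum = similar(out)
    sum_groups!(out, inp .* mass, groups)   # Σ q·m per group
    sum_groups!(wsum, mass, groups)         # Σ m per group
    out ./= wsum
    return out
end

groups = [1:2, 3:3]
mass = ones(2, 2, 3); mass[:, :, 2] .= 3.0
temp = cat(fill(200.0, 2, 2), fill(240.0, 2, 2), fill(280.0, 2, 2); dims = 3)

m_out = sum_groups!(zeros(2, 2, 2), mass, groups)
t_out = weighted_mean_groups!(zeros(2, 2, 2), temp, mass, groups)
```

In the merged first level the heavier native layer dominates the mean, which is the reason intensive fields need the native mass as weights.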

source
AtmosTransport.Preprocessing.build_target_geometry Method
julia
build_target_geometry(cfg_grid, FT=Float64) -> AbstractTargetGeometry

Dispatch configuration-driven target-geometry construction from the user-facing grid.type string.

source
AtmosTransport.Preprocessing.build_target_geometry Method
julia
build_target_geometry(::Val{:cubed_sphere}, cfg_grid, FT)

Build a cubed-sphere target geometry from the [grid] config section.

Required keys:

  • Nc :: Int — cells per panel edge

Optional keys:

  • definition — "equiangular_gnomonic" (legacy synthetic) or "gmao" (GEOS-IT/GEOS-FP equal-distance gnomonic)

  • panel_convention or convention — "gnomonic" (default) or "geos_native". If definition is omitted, "geos_native" selects "gmao" and "gnomonic" selects "equiangular_gnomonic".

  • regridder_cache_dir — directory for CR.jl weight cache (default ~/.cache/AtmosTransport/cr_regridding)

  • staging_nlon, staging_nlat — override the internal LL staging grid size (defaults: max(4Nc, 360) × max(2Nc+1, 181))
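
The default staging-grid size quoted in the last bullet can be written as a one-line function. The name `default_staging_size` is hypothetical; only the `max(4Nc, 360) × max(2Nc+1, 181)` rule comes from the text above.

```julia
# Hypothetical helper: default (nlon, nlat) of the internal LL staging grid.
default_staging_size(Nc::Integer) = (max(4Nc, 360), max(2Nc + 1, 181))

default_staging_size(48)    # small Nc falls back to the 360 x 181 floor
default_staging_size(180)   # larger Nc scales with the panel resolution
```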

source
AtmosTransport.Preprocessing.build_target_geometry Method
julia
build_target_geometry(::Val{:era5_native_reduced_gaussian}, cfg_grid, FT)

Build the reduced-Gaussian geometry metadata from a native ERA5 GRIB file. This is currently intended for geometry discovery and future native-grid preprocessing work rather than the active lat-lon synthesis path.

source
AtmosTransport.Preprocessing.build_target_geometry Method
julia
build_target_geometry(::Val{:latlon}, cfg_grid, FT) -> LatLonTargetGeometry

Build the regular lat-lon target geometry used by the current v4 spectral preprocessor.

source
AtmosTransport.Preprocessing.build_target_geometry Method
julia
build_target_geometry(::Val{:synthetic_reduced_gaussian}, cfg_grid, FT)

Build a reduced-Gaussian target geometry from scratch without needing a source GRIB file. cfg_grid["gaussian_number"] controls the truncation N so the mesh has 2N Gauss-Legendre latitude rings. cfg_grid["nlon_mode"] selects the longitude layout:

  • "regular" (default): every ring has 4N cells — the classical "regular reduced" Gaussian grid used e.g. in the TL159/N80 family.

  • "octahedral": ECMWF octahedral layout nlon = 4k + 16 per hemisphere, mirrored between the two hemispheres with k = 1 at the pole-adjacent ring.

Mesh latitudes are produced from FastGaussQuadrature.gausslegendre(2N), which returns ascending Gauss-Legendre nodes on (-1, 1). geometry_source_grib is left as an empty string to mark the grid as synthetic.
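
The two nlon_mode layouts described above can be sketched as a per-ring cell count. The function name `ring_nlon` is an assumption for illustration, and the sketch only covers the longitude counts, not the Gauss-Legendre latitudes.

```julia
# Hypothetical helper: number of cells on each of the 2N latitude rings.
function ring_nlon(N::Integer, mode::AbstractString = "regular")
    if mode == "regular"
        return fill(4N, 2N)                      # every ring has 4N cells
    elseif mode == "octahedral"
        hemi = [4k + 16 for k in 1:N]            # k = 1 is the pole-adjacent ring
        return vcat(hemi, reverse(hemi))         # mirrored across the equator
    else
        throw(ArgumentError("unknown nlon_mode: $mode"))
    end
end

o320 = ring_nlon(320, "octahedral")
n80  = ring_nlon(80)
```

Summing the octahedral counts gives 4N(N + 9) cells in total, which is a quick consistency check when validating a generated mesh.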

source
AtmosTransport.Preprocessing.close_day! Function
julia
close_day!(ctx)

Close all resources held by a day context. Must be idempotent; safe to call from a finally block.

source
AtmosTransport.Preprocessing.close_day! Method

Canonical-contract alias for close_geos_day!.

source
AtmosTransport.Preprocessing.close_era5_day! Method
julia
close_era5_day!(handles::ERA5GRIBDayHandles)

Release any resources held by the day handle. Idempotent — safe to call from a finally block. This is a no-op today because the handle only stores paths, but the symbol exists so future flavors that pin file descriptors don't change the call surface.

source
AtmosTransport.Preprocessing.close_era5_physics_binary Method
julia
close_era5_physics_binary(reader) -> nothing

Release the mmap and close the underlying IOStream.

source
AtmosTransport.Preprocessing.close_geos_day! Method

Close all handles. Idempotent.

source
AtmosTransport.Preprocessing.close_reader! Function
julia
close_reader!(reader::AbstractMetReader) -> nothing

Close all per-day file handles and release scratch held by the reader. Idempotent — safe to call from a finally block. Concrete readers implement.

source
AtmosTransport.Preprocessing.close_streaming_binary! Function
julia
close_streaming_binary!(writer)

Close a streaming binary writer. Implementations should be idempotent.

source
AtmosTransport.Preprocessing.compute_tm5_merged_hour_on_source! Method
julia
compute_tm5_merged_hour_on_source!(ws, reader, hour, ps_hour, ak_full, bk_full,
                                     Nz_native, merge_map; stats)

Phase 1 of the per-hour TM5 step. Reads native-L137 ERA5 physics fields for hour from the mmap-backed reader, runs the TM5 math column-by-column into ws.*_native, then collapses to merged Nz into ws.*_merged_src. All work is on the ERA5-native horizontal grid (720×361 for standard physics BINs).

This path requires source and target grids to match (Nx, Ny) == (720, 361). PS comes from the caller (typically the preprocessor's transform.sp after spectral synthesis) because the physics BIN does not carry PS.

stats::Union{Nothing, NamedTuple} is the TM5CleanupStats bundle; when non-nothing, counters accumulate across all columns of all hours processed.

Shape guards:

  • reader fields shape (Nlon_src, Nlat_src, Nz_native, Nt) — must match ws.entu_native's leading dims.

  • ps_hour shape (Nlon_src, Nlat_src).

  • merge_map length Nz_native.

source
AtmosTransport.Preprocessing.contract_cfl_limit Function
julia
contract_cfl_limit(contract::AbstractWindowContract) -> Float64

Per-substep positivity CFL gate the contract enforces.

source
AtmosTransport.Preprocessing.contract_replay_tolerance Function
julia
contract_replay_tolerance(contract::AbstractWindowContract) -> Float64

Relative replay tolerance the contract's verify_window! uses. Concrete contracts return their stored policy value.

source
AtmosTransport.Preprocessing.contract_require_positivity Function
julia
contract_require_positivity(contract::AbstractWindowContract) -> Bool

Whether summarize_status! errors (true) or warns (false) on a positivity violation. Closes-the-escape-hatch toggle from CS round-2.

source
AtmosTransport.Preprocessing.convert_era5_physics_nc_to_bin Method
julia
convert_era5_physics_nc_to_bin(nc_dir, bin_dir, date;
                                force_rewrite=false, verbose=true) -> bin_path

Build one calendar-day ERA5 physics BIN from the archive NCs.

  • nc_dir: directory containing era5_convection_YYYYMMDD.nc + era5_thermo_ml_YYYYMMDD.nc.

  • bin_dir: output staging directory; BIN lands at <bin_dir>/<YYYY>/era5_physics_<YYYYMMDD>.bin.

  • date: calendar day (00:00-23:00) the BIN represents.

  • force_rewrite: when false and the BIN already exists with a valid header, skip (idempotent). When true, overwrite.

  • verbose: log progress (default true).

Requires the convection NC for both date - 1 day and date (because convection is forecast-based and a calendar-day BIN needs hours 00-06 from the previous day's file and hours 07-23 from the current day's file). Raises if either is missing.

The thermo NC is calendar-day aligned so only the target-date file is needed.

Writes the BIN atomically: first to <name>.bin.tmp, then renames. This is safe to run concurrently with a reader: the rename step is an atomic fs operation on a single filesystem.
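
The write-then-rename pattern described above can be sketched in a few lines. This is an illustration under stated assumptions, not the converter's actual code: `write_atomically` is a hypothetical name, and the sketch calls `Base.Filesystem.rename`, which is not exported and may fall back to copy-and-delete when the two paths are on different filesystems.

```julia
# Hypothetical helper: write to <final_path>.tmp, then rename into place.
# `write_payload` is any function that takes the open IO and writes the content.
function write_atomically(write_payload, final_path::AbstractString)
    tmp_path = final_path * ".tmp"
    open(write_payload, tmp_path, "w")           # file is closed when this returns
    # rename replaces an existing destination in one step on a single filesystem
    Base.Filesystem.rename(tmp_path, final_path)
    return final_path
end

dir  = mktempdir()
path = joinpath(dir, "era5_physics_example.bin")
write_atomically(io -> write(io, "abc"), path)
write_atomically(io -> write(io, "xyz"), path)   # overwrite goes through the same path
```

A reader that opens `path` therefore sees either the old complete file or the new complete file, never a partially written one.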

source
AtmosTransport.Preprocessing.derive_c180_dry_mass! Method
julia
derive_c180_dry_mass!(m_dry, delp_dry, ps_dry, ps_dry_acc,
                       ps_panels, qv_panels, vc, cell_areas; grav=GRAV) -> nothing

Cubed-sphere variant of derive_n320_dry_mass!. Builds the dry-air layer mass, dry pressure thickness, and dry surface pressure for each of the 6 C-tier panels from the regridded moist PS + Q on C180. Same formula as the GEOS endpoint dry-mass derivation so Σ_k DELP_dry = PS_dry to roundoff on the target mesh too.

Inputs and outputs are NTuple{6, ...} panel tuples; ps_dry_acc is a panel-tuple Float64 accumulator (so a Float32 output preserves the multi-level summation precision). cell_areas[i, j] is shared across all 6 panels (the CS mesh is isotropic per panel).

source
AtmosTransport.Preprocessing.derive_n320_dry_mass! Method
julia
derive_n320_dry_mass!(dry, window, vc, cell_areas; grav=GRAV) -> dry

Reconstruct dry-air mass per layer, dry pressure thickness, and dry surface pressure from a populated window::ERA5N320WindowFields (moist PS, Q) using the hybrid coordinate vc (A and B arrays of length Nz+1, A in Pa and B dimensionless, ordered top-down) and the per-cell areas (m²). Matches the GEOS endpoint_dry_mass! formula so the runtime sees a nominally bit-identical dry-basis contract regardless of source.

Asserts shapes and length(vc.A) == length(vc.B) == Nz + 1 so a coefficient table that does not match the workspace Nz fails immediately with a clear DimensionMismatch.

source
AtmosTransport.Preprocessing.detect_level_orientation Method
julia
detect_level_orientation(ctm_a1::NCDataset) -> Symbol

Return :bottom_up if k=1 is the surface (mass-thicker) and :top_down if k=1 is TOA. Heuristic is unambiguous: surface DELP is O(1000 Pa), TOA DELP is O(1 Pa).
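
The heuristic can be illustrated on a plain array. This sketch is an assumption about how such a check might look: it takes a DELP array with the vertical axis on dim 3 rather than an NCDataset, and the name `level_orientation_sketch` is hypothetical.

```julia
# Hypothetical helper: compare total DELP of the first and last level.
# The thicker (in pressure) level is taken to be the surface.
function level_orientation_sketch(delp::AbstractArray{<:Real, 3})
    first_level = sum(@view delp[:, :, 1])
    last_level  = sum(@view delp[:, :, end])
    return first_level > last_level ? :bottom_up : :top_down
end

delp = zeros(2, 2, 3)
delp[:, :, 1] .= 1500.0     # surface-like thickness at k = 1
delp[:, :, 2] .= 200.0
delp[:, :, 3] .= 1.0        # TOA-like thickness at k = end

level_orientation_sketch(delp)
```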

source
AtmosTransport.Preprocessing.discover_era5_n320_source_grid Method
julia
discover_era5_n320_source_grid(core_path; FT=Float64) -> ReducedGaussianTargetGeometry

Build the N320 source-grid descriptor from the first gridType=reduced_gg message in core_path (the q field is always present). The resulting mesh ring order is south→north — the GRIB stores rings north→south and the read_era5_reduced_gaussian_* helpers flip into the project convention.

source
AtmosTransport.Preprocessing.discover_era5_spectral_truncation Method
julia
discover_era5_spectral_truncation(core_path) -> Int

Read the first spectral (gridType=sh) message in core_path and return its triangular truncation J = K = M. ERA5 native model-level analyses are T639 in the current archive; the helper avoids hard-coding that value in case a future archive convention shifts.

source
AtmosTransport.Preprocessing.drain_ready_windows! Function
julia
drain_ready_windows!(workspace) -> iterator of `ReadyWindow`

Return the windows that became write-ready after the last ingest.

source
AtmosTransport.Preprocessing.driver_after_write_window! Method
julia
driver_after_write_window!(workspace, reader, ready, context)

Post-write migration hook. GEOS-native CS uses this to advance its chained endpoint state; most topologies do nothing.

source
AtmosTransport.Preprocessing.driver_drain_ready_windows! Method
julia
driver_drain_ready_windows!(workspace, contract, win, context)

Return ready windows produced by the most recent ingest. A hook may return a single ReadyWindow, a single PreverifiedWindow, or any iterator of either shape.

source
AtmosTransport.Preprocessing.driver_flush_final_windows! Method
julia
driver_flush_final_windows!(workspace, reader, contract, context)

Return final ready windows after all source windows have been ingested.

source
AtmosTransport.Preprocessing.driver_ingest_window! Method
julia
driver_ingest_window!(workspace, reader, win, context)

Migration hook for ingesting one source window into the target workspace. Topology/source adapters may override while legacy workspace signatures are being collapsed.

source
AtmosTransport.Preprocessing.driver_windows_per_day Method
julia
driver_windows_per_day(reader, context) -> Int

Migration hook for source readers whose window count needs adapter context.

source
AtmosTransport.Preprocessing.dz_hydrostatic_constT! Method
julia
dz_hydrostatic_constT!(dz, ps, ak, bk, Nz; T_ref=260) -> dz

Constant-temperature hydrostatic layer thickness, a fallback for use when T and Q are unavailable. Matches the shortcut in main's Julia port (T_ref = 260 K). Biases entu/detu magnitudes by ~10-20% vs the virtual-temperature version; dz_hydrostatic_virtual! is preferred when T + Q are downloaded together with the convection fields.

source
AtmosTransport.Preprocessing.dz_hydrostatic_virtual! Method
julia
dz_hydrostatic_virtual!(dz, T_col, Q_col, ps, ak, bk, Nz) -> dz

Compute layer thickness dz[1:Nz] (m) at layer centers from a single column's temperature T_col[1:Nz] (K) and specific humidity Q_col[1:Nz] (kg/kg), plus surface pressure ps (Pa) and the hybrid-sigma coefficients ak, bk (length Nz+1).

Uses the hydrostatic approximation with virtual temperature:

julia
p_top[k] = ak[k]   + bk[k]   * ps        (Pa, higher-altitude side)
p_bot[k] = ak[k+1] + bk[k+1] * ps        (Pa, lower-altitude side)
dp[k]    = p_bot[k] - p_top[k]           (> 0 in AtmosTransport orientation)
p_mid[k] = 0.5 · (p_top[k] + p_bot[k])
T_v[k]   = T_col[k] · (1 + 0.608 · Q_col[k])
dz[k]    = R · T_v[k] / g · dp[k] / p_mid[k]

Orientation: AtmosTransport (k=1=TOA, k=Nz=surface). ak/bk are the full (Nz+1)-length ERA5 L137 hybrid coefficients.

For the TOA half-level we fall back to an ordinary scale-height estimate (T_v=T_top, p_mid=p_bot) when p_top → 0. In practice the top-level dz is never in the convection window so the approximation is irrelevant; it's a guard against divide-by-zero.
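
The formulas above translate into a short column loop. The sketch below is not the package implementation: the name `dz_virtual_sketch!`, the constants `R_DRY` and `G0`, and the exact guard condition `p_top > 0` are assumptions made for illustration.

```julia
const R_DRY = 287.05     # J/(kg K), assumed dry-air gas constant
const G0    = 9.80665    # m/s², assumed gravitational acceleration

# Hypothetical single-column sketch, k = 1 at TOA and k = Nz at the surface.
function dz_virtual_sketch!(dz, T_col, Q_col, ps, ak, bk)
    for k in eachindex(dz)
        p_top = ak[k]     + bk[k]     * ps
        p_bot = ak[k + 1] + bk[k + 1] * ps
        dp    = p_bot - p_top
        # Guard for the top layer: fall back to p_mid = p_bot when p_top is zero.
        p_mid = p_top > 0 ? 0.5 * (p_top + p_bot) : p_bot
        T_v   = T_col[k] * (1 + 0.608 * Q_col[k])
        dz[k] = R_DRY * T_v / G0 * dp / p_mid
    end
    return dz
end

ak = [0.0, 0.0, 0.0]          # toy 2-layer coefficients, not ERA5 L137
bk = [0.0, 0.9, 1.0]
dz = dz_virtual_sketch!(zeros(2), [250.0, 280.0], [0.0, 0.01], 100_000.0, ak, bk)
```

With `p_top = 0` the guard reduces the top layer to a scale height `R · T_v / g`, which is the fallback the paragraph above describes.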

source
AtmosTransport.Preprocessing.ec2tm! Method
julia
ec2tm!(entu, detu, entd, detd,
        mflu_ec, mfld_ec, detu_ec, detd_ec) -> (entu, detu, entd, detd)

Convert ECMWF convective mass-flux fields into TM5 (entu, detu, entd, detd) layer-center fields. All inputs / outputs in AtmosTransport orientation (k=1=TOA, k=Nz=surface). Operates in place on the four output arrays (no allocations).

Shapes:

  • entu, detu, entd, detd: (..., Nz) — layer centers.

  • mflu_ec, mfld_ec: (..., Nz+1) — half levels (interfaces). Interface k is the TOP of layer k; interface Nz+1 is the surface boundary.

  • detu_ec, detd_ec: (..., Nz) — layer centers.

The leading dimensions are arbitrary (scalar, (Nx, Ny), (ncells,), or per-panel (Nc, Nc)) as long as the arrays all have consistent shape. This function is backend-agnostic pure Julia; call from the preprocessor (CPU) ahead of binary writeout.

Small negative values in detu_ec / detd_ec are clipped to zero, following the clean-up of ECMWF rounding artifacts documented in phys_convec_ec2tm.F90 (ECMWF diagnostics can produce values of about -1e-19 from rounding).

For every location, the conversion closes the layer-wise mass budget. The relations below are written in TM5 convention (k=1=surface):

julia
k = 1:          entu[1]   = detu[1]   (updraft starts in layer 1)
                entd[1]   = detd[1]   (no flux at TOA)
k = 2:Nz:       entu[k]   = detu[k]   + mflu_ec[k+1] - mflu_ec[k]
                entd[k]   = detd[k]   - mfld_ec[k]   + mfld_ec[k-1]
                (where mfld_ec sign-flipped internally)

The same budget is restated explicitly in AtmosTransport orientation (k=1=TOA), which is the orientation this function works in:

  • Layer k has interface k at its TOP (higher altitude side) and interface k+1 at its BOTTOM (lower altitude side).

  • Updraft flows upward: from interface k+1 into layer k via entrainment entu[k], out of layer k via interface k. Continuity: mflu_out − mflu_in + detu − entu = 0, so entu[k] = detu[k] + mflu_ec[k] − mflu_ec[k+1], where mflu_ec[k] is the flux through interface k (above layer k) and mflu_ec[k+1] is the flux through interface k+1 (below layer k). Convention: mflu_ec >= 0 (positive upward).

  • Downdraft flows downward: from interface k into layer k via entrainment entd[k], out through interface k+1 via detrainment. Sign convention in ECMWF: mfld_ec <= 0 (negative). We define mfdo_abs = -mfld_ec >= 0, then entd[k] = detd[k] + mfdo_abs[k+1] − mfdo_abs[k].

Boundary conditions:

  • mflu_ec[1] = 0 (no updraft above TOA).

  • mflu_ec[Nz+1] = 0 (no updraft into ground from below — the surface is the sink).

  • mfld_ec[1] = 0 (no downdraft above TOA).

  • mfld_ec[Nz+1] = 0 (no downdraft escapes the surface).

These boundaries are typically already zero in ECMWF output; we enforce explicitly by construction below.

source
AtmosTransport.Preprocessing.ec2tm_from_rates! Method
julia
ec2tm_from_rates!(entu, detu, entd, detd,
                   udmf, ddmf, udrf_rate, ddrf_rate,
                   dz, Nz; stats=nothing) -> nothing

Column-level port of TM5's ECconv_to_TMconv (see deps/tm5/base/src/phys_convec_ec2tm.F90:87-237). Fills the output arrays entu, detu, entd, detd (kg/m²/s at layer centers) in place from raw ERA5 physics inputs. All arrays use AtmosTransport orientation (k=1=TOA, k=Nz=surface).

Inputs

  • udmf::AbstractVector{FT}, length Nz+1 — updraft mass flux at half levels (kg/m²/s). Half level k is the interface at the TOP of layer k; udmf[Nz+1] is the surface interface. Must be ≥ 0.

  • ddmf::AbstractVector{FT}, length Nz+1 — downdraft mass flux at half levels, ECMWF convention (≤ 0).

  • udrf_rate::AbstractVector{FT}, length Nz — updraft detrainment RATE at layer centers (kg/m³/s).

  • ddrf_rate::AbstractVector{FT}, length Nz — downdraft detrainment RATE at layer centers (kg/m³/s).

  • dz::AbstractVector{FT}, length Nz — layer thickness (m), positive. Compute from dz_hydrostatic_virtual!.

  • Nz::Int — number of full levels.

  • stats::Union{Nothing, NamedTuple} — cleanup-stats counters from TM5CleanupStats(). When nothing, no counter work.

Algorithm (line-for-line match to F90 lines 132-236)

  1. Copy-with-clipping (F90 L132-144). |udmf| < 1e-6 → 0, |ddmf| < 1e-6 → 0 (applied as ddmf > -1e-6 → 0), udrf_rate < 1e-10 → 0, ddrf_rate < 1e-10 → 0.

  2. dz integration (F90 L146-151): detu = udrf × dz (kg/m³/s → kg/m²/s). Same for detd.

  3. uptop/dotop search (F90 L153-173): find the first level from TOA (k=1..Nz) with nonzero flux. If none, zero out everything in that direction.

  4. Mass-budget closure (F90 L175-212): for each active direction, entu[k] = udmf[k] - udmf[k+1] + detu[k] from uptop down. Above uptop-1 stays zero.

  5. Symmetric negative redistribution (F90 L214-232): if entu[k] < 0, add -entu[k] to detu[k] and zero entu[k]. Same for detu<0 (adds to entu), and the same two for downdraft.

Mass conservation

Within the cloud window, entu[k] - detu[k] - (udmf[k] - udmf[k+1]) should be zero per layer, which is the mass-budget closure of step 4. Negative redistribution (step 5) preserves this because it only shifts mass between entu and detu (or entd and detd) without changing the net entu - detu.
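
Steps 2, 4 and 5 for the updraft can be sketched for one column as below. This is a simplified illustration, not the ported routine: the name `updraft_closure_sketch!` is hypothetical, and the clipping thresholds of step 1, the uptop search of step 3, and the detu < 0 branch of step 5 are deliberately left out.

```julia
# Hypothetical sketch of the updraft budget; k = 1 at TOA, udmf has Nz + 1 entries.
function updraft_closure_sketch!(entu, detu, udmf, udrf_rate, dz)
    for k in eachindex(dz)
        detu[k] = udrf_rate[k] * dz[k]               # step 2: kg/m³/s -> kg/m²/s
        entu[k] = udmf[k] - udmf[k + 1] + detu[k]    # step 4: mass-budget closure
        if entu[k] < 0                               # step 5: negative redistribution
            detu[k] -= entu[k]                       # move the deficit into detu
            entu[k]  = zero(eltype(entu))
        end
    end
    return entu, detu
end

udmf      = [0.0, 0.01, 0.03, 0.02, 0.0]     # half levels, zero at both boundaries
udrf_rate = [0.0, 1e-5, 2e-5, 0.0]           # layer centers
dz        = [500.0, 400.0, 300.0, 200.0]
entu, detu = updraft_closure_sketch!(zeros(4), zeros(4), udmf, udrf_rate, dz)
```

After the call, `entu[k] - detu[k]` equals `udmf[k] - udmf[k+1]` in every layer, including the layers where the redistribution branch fired.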

source
AtmosTransport.Preprocessing.end_of_day_seed Function
julia
end_of_day_seed(reader::AbstractMetReader) -> seed_or_nothing

Return the seed value to thread into the next day's open_reader(..., seed = ...). Type-system guarantees:

  • end_of_day_seed(::AbstractMetReader{FT, S, NoChain}) returns nothing (statically inferred).

  • end_of_day_seed(::AbstractMetReader{FT, S, ChainedMass{T}}) returns a value of type T (or throws if the reader has not yet produced the end-of-day endpoint).

Closes foot-gun (D).
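
The dispatch guarantee can be illustrated with toy types. Everything prefixed `Toy` or `toy_` below is hypothetical and only mirrors the shape of the real nominals (a chain-policy type parameter on the reader); it is a sketch of the idea, not the package's definitions.

```julia
abstract type ToyChainPolicy end
struct ToyNoChain <: ToyChainPolicy end
struct ToyChainedMass{T} <: ToyChainPolicy end

# Toy reader: the chain policy is a type parameter, the endpoint is the payload.
struct ToyReader{CP <: ToyChainPolicy, M}
    endpoint::M
end
ToyReader{CP}(endpoint::M) where {CP, M} = ToyReader{CP, M}(endpoint)

# No carry: the seed is always `nothing`, known from the type alone.
toy_end_of_day_seed(::ToyReader{ToyNoChain}) = nothing

# Chained mass: return a value of the seed type T, or throw if it is missing.
function toy_end_of_day_seed(r::ToyReader{ToyChainedMass{T}}) where {T}
    r.endpoint === nothing && error("end-of-day endpoint not produced yet")
    return r.endpoint::T
end

plain   = ToyReader{ToyNoChain}(nothing)
chained = ToyReader{ToyChainedMass{Vector{Float64}}}([1.0, 2.0])
pending = ToyReader{ToyChainedMass{Vector{Float64}}}(nothing)
```

Because the policy lives in the type, a driver loop can pass the result straight into the next day's reader without branching on the source.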

source
AtmosTransport.Preprocessing.endpoint_dry_mass! Method
julia
endpoint_dry_mass!(delp_dry, ps_dry, ps_total, qv, vc) -> (delp_dry, ps_dry)

Reconstruct dry DELP and dry PS at one endpoint hour from the moist PS and QV provided by GEOS, using the hybrid coordinate vc. The output is on top-down level convention (k=1 = TOA).

Algorithm:

julia
DELP_full[k] = ΔA[k] + ΔB[k] * PS_total
DELP_dry[k]  = (1 - QV[k]) * DELP_full[k]
PS_dry       = Σ DELP_dry[k]

In-place: writes into delp_dry and ps_dry.
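
For a single column the algorithm reads as follows. The name `endpoint_dry_mass_sketch!` and the explicit `A`, `B` interface vectors (standing in for the hybrid coordinate vc) are assumptions; the real method works on panel arrays.

```julia
# Hypothetical single-column sketch; A and B hold Nz + 1 interface coefficients,
# ordered top-down (k = 1 at TOA).
function endpoint_dry_mass_sketch!(delp_dry, A, B, ps_total, qv)
    ps_dry = 0.0                                   # accumulate in Float64
    for k in eachindex(delp_dry)
        delp_full   = (A[k + 1] - A[k]) + (B[k + 1] - B[k]) * ps_total
        delp_dry[k] = (1 - qv[k]) * delp_full
        ps_dry     += delp_dry[k]
    end
    return delp_dry, ps_dry
end

A = [1.0, 5000.0, 0.0]        # toy coefficients (Pa), not a real GEOS table
B = [0.0, 0.5, 1.0]
delp_dry, ps_dry = endpoint_dry_mass_sketch!(zeros(2), A, B, 100_000.0, [0.0, 0.01])
```

By construction `ps_dry` is the sum of `delp_dry`, so the two outputs agree to roundoff.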

source
AtmosTransport.Preprocessing.endpoint_dry_mass Method

Allocate (delp_dry, ps_dry) for one panel and run endpoint_dry_mass!.

source
AtmosTransport.Preprocessing.era5_convection_hour_address Method
julia
era5_convection_hour_address(hour) -> (use_prev_day, data_time_hhmm, step_range)

Map a UTC hour h ∈ 0..23 to the GRIB header tuple that addresses the matching ECMWF convection forecast sample (see read_era5_n320_convection_window! docstring for the cycle layout). Returns (::Bool, ::Int, ::String).

source
AtmosTransport.Preprocessing.era5_grib_path Method
julia
era5_grib_path(settings, date, stream) -> String

Resolve the on-disk GRIB path for stream on date. stream must be one of :core, :convection, or :surface. Existence is not checked here — the caller (typically open_era5_day) decides whether a missing file is fatal or merely "no next-day endpoint available".

source
AtmosTransport.Preprocessing.flush_final_windows! Function
julia
flush_final_windows!(workspace, args...; kwargs...)

Emit any final cross-day/zero-tendency windows once the source stream is exhausted.

source
AtmosTransport.Preprocessing.geos_collection_path Method
julia
geos_collection_path(settings::GEOSITSettings, date::Date, collection) -> String

Resolve the on-disk path of one GEOS-IT collection for date. Searches a flat root_dir and the per-day root_dir/YYYYMMDD/ layout.

source
AtmosTransport.Preprocessing.get_era5_physics_field Method
julia
get_era5_physics_field(reader, var::Symbol) -> Array view

Return a zero-allocation reshaped view into the mmap for var ∈ (:udmf, :ddmf, :udrf_rate, :ddrf_rate, :t, :q). Shape (Nlon, Nlat, Nlev, Nt).

source
AtmosTransport.Preprocessing.has_convection Method
julia
has_convection(settings::AbstractMetSettings) -> Bool

Whether this source can populate RawWindow.cmfmc and RawWindow.dtrain. Defaults to false; sources that support convection override and gate the actual output behind a user flag (e.g. settings.include_convection).

source
AtmosTransport.Preprocessing.has_surface Method
julia
has_surface(settings::AbstractMetSettings) -> Bool

Whether this source can populate raw PBL surface fields in RawWindow.surface. Sources gate actual reads behind their settings flags.

source
AtmosTransport.Preprocessing.has_vdiff_fields Method
julia
has_vdiff_fields(settings::AbstractMetSettings) -> Bool

Whether this source can populate RawWindow.vdiff with layer-center fields needed by the GCHP Holtslag-Boville vertical-diffusion data contract.

source
AtmosTransport.Preprocessing.ingest_window! Function
julia
ingest_window!(workspace, args...; kwargs...) -> nothing

Consume one source/met window into the topology workspace. Ready windows are exposed by drain_ready_windows!.

source
AtmosTransport.Preprocessing.init_cs_positivity_accumulator Method
julia
init_cs_positivity_accumulator() -> NamedTuple

Zero-valued state for accumulating the worst per-window positivity diagnostic across a preprocessing loop. Pair with update_cs_positivity_accumulator!.

source
AtmosTransport.Preprocessing.init_ll_positivity_accumulator Method
julia
init_ll_positivity_accumulator() -> NamedTuple

Zero-valued state for accumulating the worst per-window LL positivity diagnostic across a preprocessing loop. Pair with update_ll_positivity_accumulator.

source
AtmosTransport.Preprocessing.init_rg_positivity_accumulator Method
julia
init_rg_positivity_accumulator() -> NamedTuple

Zero-valued state for accumulating the worst per-window RG positivity diagnostic across a preprocessing loop.

source
AtmosTransport.Preprocessing.load_hybrid_coefficients Method
julia
load_hybrid_coefficients(coeff_path::String) -> HybridSigmaPressure

Load all hybrid sigma-pressure interface coefficients from a TOML file. Unlike load_era5_vertical_coordinate, this does not slice — useful for sources whose level count comes from the file rather than a config knob (e.g. GEOS-72, MERRA-2, native ERA5 L137 without sub-tropo selection).

source
AtmosTransport.Preprocessing.load_met_settings Method
julia
load_met_settings(toml_path::String; root_dir, kwargs...) -> AbstractMetSettings

Construct a typed met-source descriptor from toml_path. The TOML's [source].name key picks the concrete settings type, and per-type _build_met_settings methods consume the source-specific TOML tables.

root_dir is the on-disk directory holding the source's daily files. Additional kwargs are forwarded to the constructor and override any values supplied by the TOML.

source
AtmosTransport.Preprocessing.log_tm5_cleanup_stats Method
julia
log_tm5_cleanup_stats(stats, date_str)

Pretty-print the per-day TM5 cleanup counters produced by TM5CleanupStats(). Zero-valued counters are omitted to keep the output compact; a fully-clean day yields a single line.

source
AtmosTransport.Preprocessing.mass_basis_from_symbol Method
julia
mass_basis_from_symbol(s::Symbol) -> AbstractMassBasis

Construct the matching basis singleton from a header Symbol. Throws ArgumentError for unknown values.

source
AtmosTransport.Preprocessing.mass_basis_symbol Method
julia
mass_basis_symbol(::AbstractMassBasis) -> Symbol

Map a basis singleton (State.DryBasis / State.MoistBasis) to the on-disk binary-header value (:dry / :moist). Inverse of mass_basis_from_symbol.
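
The inverse relationship between the two functions can be shown with toy singletons. The `Toy` types and `toy_` functions are hypothetical stand-ins for State.DryBasis / State.MoistBasis and the two documented functions.

```julia
abstract type ToyMassBasis end
struct ToyDryBasis   <: ToyMassBasis end
struct ToyMoistBasis <: ToyMassBasis end

# Singleton -> header symbol
toy_basis_symbol(::ToyDryBasis)   = :dry
toy_basis_symbol(::ToyMoistBasis) = :moist

# Header symbol -> singleton, rejecting anything else
function toy_basis_from_symbol(s::Symbol)
    s === :dry   && return ToyDryBasis()
    s === :moist && return ToyMoistBasis()
    throw(ArgumentError("unknown mass basis: $s"))
end

roundtrip = toy_basis_symbol(toy_basis_from_symbol(:dry))
```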

source
AtmosTransport.Preprocessing.merge_tm5_field_3d! Method
julia
merge_tm5_field_3d!(merged, native, merge_map)

Accumulate a native-level TM5 field (Nlon, Nlat, Nz_native) onto merged output levels (Nlon, Nlat, Nz_merged) using the native-to- merged level map. Matches the semantics of merge_cell_field! for mass/flux fields: sum native layers that map to the same merged layer.

For TM5 entrainment/detrainment fluxes (kg/m²/s), summing over a consolidated layer preserves the column-integrated mass budget exactly.

source
AtmosTransport.Preprocessing.n320_cell_areas Method
julia
n320_cell_areas(source_grid) -> Vector{Float64}

Materialise per-cell areas (m²) for the N320 source mesh. Always returns Vector{Float64} regardless of source_grid's element type — downstream dry-mass arithmetic runs in Float64 for precision and the per-cell mesh quadrature is itself Float64. Cached by callers that derive dry mass for many windows of the same day.

source
AtmosTransport.Preprocessing.native_vertical Function
julia
native_vertical(reader::AbstractMetReader) -> HybridSigmaPressure{FT_v}

Native vertical coordinate of the source data (NOT the merged/output coordinate). The vertical-transform axis (P0b) consumes this to plan its mapping.

source
AtmosTransport.Preprocessing.open_day Function
julia
open_day(settings::AbstractMetSettings, date::Date; ...) -> ctx

Open per-day source-specific context (file handles, caches, vertical coefficients, …). The orchestrator calls this once per day at the start of process_day, threads ctx through every read_window! call, and calls close_day!(ctx) in a finally block.

ctx is opaque to the orchestrator — only the source knows its layout.
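
The open / thread / close-in-finally lifecycle can be sketched with a toy context. The `ToyDayContext` type and the `toy_` functions are hypothetical; they only show the pattern the orchestrator is described as following, including an idempotent close.

```julia
# Hypothetical day context holding open file handles.
mutable struct ToyDayContext
    handles::Vector{IO}
    closed::Bool
end

toy_open_day(paths) = ToyDayContext(IO[open(p, "r") for p in paths], false)

# Idempotent: a second call is a no-op, so it is safe in a finally block.
function toy_close_day!(ctx::ToyDayContext)
    ctx.closed && return nothing
    foreach(close, ctx.handles)
    ctx.closed = true
    return nothing
end

# Orchestrator side: open once, use the context, always close.
function toy_process_day(paths)
    ctx = toy_open_day(paths)
    try
        return [read(io, String) for io in ctx.handles]
    finally
        toy_close_day!(ctx)
    end
end

dir  = mktempdir()
file = joinpath(dir, "window.txt")
write(file, "payload")
contents = toy_process_day([file])
```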

source
AtmosTransport.Preprocessing.open_day Method
julia
open_day(settings::GEOSSettings, date::Date; next_day_handle=true) -> GEOSDayHandles

Canonical-contract alias for open_geos_day. The orchestrator calls this once per day and threads the returned handles through every per-window read_window!.

source
AtmosTransport.Preprocessing.open_era5_day Method
julia
open_era5_day(settings, date; next_day_handle=true) -> ERA5GRIBDayHandles

Resolve the GRIB stream paths for date and assert that today's required files are on disk. When next_day_handle=true and <date+1> has a core GRIB available, records its path so the last window's right endpoint can be read from the next day's hour-0 fields.

Errors are explicit about which file is missing so a misconfigured root_dir or an incomplete download is immediately visible.

source
AtmosTransport.Preprocessing.open_era5_physics_binary Method
julia
open_era5_physics_binary(bin_dir, date) -> ERA5PhysicsBinaryReader

Open the BIN for date under bin_dir (with YYYY subdir), mmap the payload, parse the header. Caller must close_era5_physics_binary when done (or wrap in a try/finally).

source
AtmosTransport.Preprocessing.open_geos_day Method
julia
open_geos_day(settings, date; next_day_handle=true) -> GEOSDayHandles

Open per-collection NCDataset handles for date. When next_day_handle is true and the next-day CTM_I1 file exists, opens it too so the last window of date has its right endpoint available.

source
AtmosTransport.Preprocessing.open_reader Function
julia
open_reader(settings::AbstractMetSettings, date::Date, ::Type{FT};
            seed = nothing, next_day_handle::Bool = true)
    -> reader::AbstractMetReader{FT, typeof(settings), CP}

Construct a typed met-reader for one calendar day. The reader owns the underlying day-handle context (file handles, vertical coefficients, chained-mass seed) and exposes the per-window trait surface.

seed carries cross-day mass state for chained-mass readers; pass the return value of end_of_day_seed(prev_day_reader). For NoChain readers, seed must be nothing.

next_day_handle controls whether the reader opens a handle into the next day's first-hour instantaneous file. Required when the day produces endpoint mass for the last window (the standard case for hourly sources).

Dispatch by settings type to select the concrete reader; concrete readers register their own method.

source
AtmosTransport.Preprocessing.open_reader Method
julia
open_reader(settings::ERA5SpectralSettings, date::Date, ::Type{FT};
            seed = nothing, next_day_handle::Bool = true)

P0a: opens the spectral day's GRIB and caches the typed nominal. seed/next_day_handle accepted for signature parity with the GEOS constructor; the spectral path's "next-day endpoint" is handled by the existing next_day_hour0 helper in configuration.jl:349 until P2.

source
AtmosTransport.Preprocessing.plan_vertical Function
julia
plan_vertical(transform::AbstractVerticalTransform,
              native_vc::HybridSigmaPressure{FT}) -> VerticalPlan{FT, typeof(transform)}

Materialize the planned vertical-coordinate mapping for one day's run. Called once per day (or once per run if native_vc is invariant); the returned plan is reused across all windows.

source
AtmosTransport.Preprocessing.process_day Method
julia
process_day(date::Date, grid::AbstractTargetGeometry, settings, vertical; next_day_hour0=nothing)

Topology-specific daily transport-binary preprocessor extension point.

Concrete target geometries implement this method with ordinary Julia multiple dispatch:

  • LatLonTargetGeometry writes structured directional LL binaries.

  • ReducedGaussianTargetGeometry writes face-indexed RG binaries.

  • CubedSphereTargetGeometry writes panel-local CS binaries.

Every implementation must satisfy the same transport contract:

  • use explicit forward endpoint mass targets for every window, including the final cross-day window when next_day_hour0 is available;

  • write declared payload semantics, including delta_semantics;

  • run a write-time replay check unless explicitly disabled for diagnostics;

  • produce binaries that the runtime driver can load with replay validation.

This fallback rejects unsupported source/target pairs after config parsing has already produced an AbstractTargetGeometry.

source
AtmosTransport.Preprocessing.process_day Method
julia
process_day(date, grid::CubedSphereTargetGeometry, settings::AbstractERA5GRIBSettings,
            vertical; out_path, FT, mass_basis, dt_met_seconds, positivity_cfl_limit,
            kwargs...)

Adapter that the unified preprocessor CLI calls into. Forwards to process_era5_n320_to_cs_day with the kwargs the underlying function actually accepts; the rest of the unified-CLI day-kwargs (e.g. chain_mass, adaptive_substeps, min_steps_per_window, seed_m) are absorbed by the trailing kwargs... and silently ignored — ERA5 N320 has no day-to-day mass-chain state, and the writer currently uses a fixed substep count rather than the adaptive policy.

Returns (; final_m = nothing) so the unified CLI's seed_m = get(result, :final_m, nothing) chain remains a no-op.

source
AtmosTransport.Preprocessing.process_day Method
julia
process_day(date, grid::CubedSphereTargetGeometry,
            settings::AbstractGEOSSettings, vertical;
            out_path,
            dt_met_seconds = 3600.0,
            FT = Float64,
            mass_basis = :dry,
            replay_tol = replay_tolerance(FT),
            seed_m = nothing,
            next_day_hour0 = nothing,
            chain_mass = true) -> NamedTuple

Build a v4 cubed-sphere transport binary at out_path from one UTC day of native GEOS data. Source mesh and target mesh must match (CS passthrough).

Stored mass targets the raw GEOS dry endpoint (DELP_dry) transformed to the output vertical grid. The native horizontal fluxes are column-balanced to that endpoint, then cm is diagnosed so the replay and positivity contracts are checked against the same endpoint the runtime will see.

For multi-day preprocessing with chain_mass = true, seed_m carries the raw endpoint from the previous day so adjacent daily binaries share a boundary mass: pass nothing (default) on day 1 to seed from raw GEOS DELP_dry, and on day N+1 pass the final_m returned by day N's process_day. With chain_mass = false, seed_m is ignored and every window reinitializes from raw GEOS mass.

When chain_mass = true, the returned NamedTuple includes final_m::NTuple{6, Array{FT, 3}}, the raw-endpoint state at the END of the last window. With chain_mass = false, final_m is nothing.

next_day_hour0 is part of the inherited topology-dispatch contract but unused — the GEOS reader handles next-day endpoints internally via next_ctm_i1.

source
AtmosTransport.Preprocessing.process_day Method
julia
process_day(date, grid::CubedSphereTargetGeometry, settings, vertical; ...)

Spectral→CS transport binary: spectral synthesis to an internal LL staging grid, conservative regridding to CS panels, endpoint continuity closure, and streaming binary write. No on-disk LL intermediate.

source
AtmosTransport.Preprocessing.process_day Method
julia
process_day(date, grid::LatLonTargetGeometry, settings, vertical;
            next_day_hour0=nothing, positivity_cfl_limit=0.95,
            require_substep_positivity=true)

Run the full one-day preprocessing workflow for the structured lat-lon target: read spectral input, process all windows, close continuity against forward mass endpoints, and write the final binary.

source
AtmosTransport.Preprocessing.process_day Method
julia
process_day(date, grid::ReducedGaussianTargetGeometry, settings, vertical;
            next_day_hour0=nothing, positivity_cfl_limit=0.95,
            require_substep_positivity=true)

Streaming one-day preprocessing for reduced-Gaussian targets.

Uses a 2-window sliding buffer: at any time only two windows' worth of (m, hflux, cm, ps) are held in memory. Each window is Poisson-balanced and written to disk before the next pair is computed. This reduces peak memory from O(Nt) to O(1) and enables O160/O320 binary generation.

Pipeline per window: spectral synthesis → mass fix → level merge → (wait for next window) → Poisson balance using (m_cur, m_next) → cm recomputation → stream-write
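The sliding buffer can be illustrated with a toy loop. This is a sketch under stated assumptions: synthesize and balance_and_write! are hypothetical placeholders for the real stages, and the "balance" here is just a mass difference.

```julia
# Hypothetical stand-ins: `synthesize(w)` plays the role of
# "spectral synthesis → mass fix → level merge", and `balance_and_write!`
# plays the role of "Poisson balance → cm recomputation → stream-write".
synthesize(w::Int) = (; idx = w, m = fill(Float64(w), 4))

function balance_and_write!(sink, cur, nxt)
    # The balance step needs both endpoints of the window: (m_cur, m_next).
    push!(sink, (cur.idx, sum(nxt.m .- cur.m)))
    return sink
end

function stream_day(n_windows::Int)
    sink = Tuple{Int, Float64}[]
    cur = synthesize(1)
    for w in 1:n_windows
        # Only `cur` and `nxt` are alive here, so peak memory does not grow
        # with n_windows. Window n_windows + 1 stands in for next-day hour 0.
        nxt = synthesize(w + 1)
        balance_and_write!(sink, cur, nxt)
        cur = nxt   # slide the buffer
    end
    return sink
end

written = stream_day(24)
```

Each window is written as soon as its successor exists, so nothing older than the current pair needs to stay in memory.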

source
AtmosTransport.Preprocessing.process_day Method
julia
process_day(cfg::Dict; day_override=nothing, start_date=nothing, end_date=nothing)

Top-level TOML-driven preprocessor entry. Detects source type from cfg:

  • [source].toml = "config/met_sources/<source>.toml" → typed AbstractMetSettings path, supports cross-day state carry (e.g. GEOS pressure-fixer chained mass) and --start/--end date ranges.

  • otherwise → typed ERA5 spectral config path ([input].spectral_dir).

Both paths converge on process_day(date, grid::AbstractTargetGeometry, settings, vertical; ...) for the per-day work. There is no parallel GEOS-only or per-source CLI — new sources plug in via AbstractMetSettings + load_met_settings.
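The detection rule can be sketched on an already-parsed config. The function name detect_source_path, the returned symbols, and the example file names are hypothetical; only the [source].toml and [input].spectral_dir keys come from the text above.

```julia
using TOML

# Sketch of the detection rule: a [source] table with a `toml` key selects the
# typed met-settings path; anything else falls back to the ERA5 spectral path.
function detect_source_path(cfg::AbstractDict)
    source = get(cfg, "source", Dict{String, Any}())
    return haskey(source, "toml") ? :met_settings : :era5_spectral
end

cfg_typed = TOML.parse("""
[source]
toml = "config/met_sources/example.toml"
""")

cfg_spectral = TOML.parse("""
[input]
spectral_dir = "/data/spectral"
""")

paths = (detect_source_path(cfg_typed), detect_source_path(cfg_spectral))
```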

source
AtmosTransport.Preprocessing.process_era5_n320_to_cs_day Method
julia
process_era5_n320_to_cs_day(date, settings, target_grid;
                             out_path,
                             FT = Float32,
                             mass_basis = :dry,
                             Nz = ERA5_NATIVE_LEVEL_COUNT,
                             dt_met_seconds = 3600.0,
                             steps_per_window = 8,
                             cs_balance_tol = 1e-14,
                             cs_balance_project_every = 50,
                             positivity_cfl_limit = 0.95,
                             cache_dir = nothing,
                             include_convection = false)

Generate a v4 cubed-sphere transport binary for one UTC date from the ERA5 native-GRIB source described by settings, written to out_path.

mass_basis is fixed to :dry here — the writer pulls dry-basis layer mass and dry surface pressure (re-derived on C180 from regridded PS + Q). A :moist request would need the moist-basis runtime contract, which is not the project's runtime default (feedback_dry_basis_default.md).

steps_per_window controls the number of Strang substeps per met window written into the binary. Each am / bm per-face slot stores the substep-mass amount; the runtime CFL is cfl = am[i,j,k] / m[i-1,j,k] per substep so a larger value softens the per-substep CFL at the cost of a larger binary.
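A small arithmetic sketch of that trade-off follows. It assumes, without confirmation from the text, that the stored per-substep amount is the window-integrated face mass divided evenly over the substeps. The numbers are made up.

```julia
# Assumption: the stored per-substep amount is the window-integrated face
# mass divided evenly over the substeps.
substep_amount(window_mass, steps_per_window) = window_mass / steps_per_window

# Per-substep CFL as stated above: face amount over the donor cell mass
# (the absolute value is taken here so the sign of the flux does not matter).
substep_cfl(am, m_donor) = abs(am) / m_donor

window_mass = 6.0e9   # kg crossing one face during the window (made-up value)
m_donor     = 1.0e9   # kg in the donor cell (made-up value)

cfl_by_steps = Dict(s => substep_cfl(substep_amount(window_mass, s), m_donor)
                    for s in (4, 8, 16))
```

Under this assumption, doubling steps_per_window halves the per-substep CFL, while the binary stores twice as many substeps per window.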

The function stages writes to out_path.tmp and promotes to out_path on success — a partial run leaves no usable file at the requested path.
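The stage-then-promote pattern can be sketched with Base file operations. write_staged is a hypothetical helper, not the package's writer, and removing the staging file on failure is an assumption of this sketch.

```julia
# Hypothetical helper illustrating stage-then-promote.
function write_staged(build!, out_path::AbstractString)
    tmp_path = out_path * ".tmp"
    try
        open(build!, tmp_path, "w")            # all writes go to the staging file
        mv(tmp_path, out_path; force = true)   # promote only after a clean finish
    catch
        rm(tmp_path; force = true)             # assumption: drop the partial staging file
        rethrow()
    end
    return out_path
end

dir = mktempdir()
good = write_staged(io -> write(io, "payload"), joinpath(dir, "good.bin"))

bad = joinpath(dir, "bad.bin")
failed = try
    write_staged(io -> error("validation failed"), bad)
    false
catch
    true
end
```

After the failed call nothing exists at the requested path, so a downstream consumer cannot mistake a partial run for a finished binary.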

source
AtmosTransport.Preprocessing.process_era5_n320_window! Method
julia
process_era5_n320_window!(pipeline, handles, date, hour) -> pipeline

Drive the per-window pipeline for (date, hour):

  1. Synthesise PS / U / V / T / Q on the N320 source mesh (breakpoint B).

  2. Derive dry-air mass + DELP_dry + PS_dry on the source mesh (breakpoint C).

  3. Optionally read UDMF / DDMF / UDRF / DDRF for the matching forecast sample (breakpoint E), if the pipeline was built with include_convection = true.

  4. Conservatively regrid PS / U / V / T / Q to the C-tier target (breakpoint D).

After the call, pipeline.window_fields, pipeline.dry_fields, pipeline.convection_fields, and pipeline.c180_fields carry the window's data on their respective grids.

source
AtmosTransport.Preprocessing.promote_streaming_binary! Function
julia
promote_streaming_binary!(writer)

Close and promote a staged binary file into its final path.

source
AtmosTransport.Preprocessing.quarantine_streaming_binary! Function
julia
quarantine_streaming_binary!(writer)

Close and remove a staged binary file after a failed validation pass.

source
AtmosTransport.Preprocessing.read_era5_n320_convection_window! Method
julia
read_era5_n320_convection_window!(fields, handles, mesh, date, hour) -> fields

Fill fields with one hour's worth of N320 convection forecast fields for (date, hour). The reader picks handles.convection_path or handles.prev_convection_path based on the hour-address mapping and forward-iterates that file, matching messages whose (dataTime, stepRange, shortName) triple falls within the requested sample. Each matching message is decoded directly into the appropriate (n_cells, Nz) output slot via _reorder_grib_reduced_gg_to_mesh!.

Completeness gates name the missing field in any error so a corrupt or partial download is immediately visible. All four fields × 137 levels must be present for the call to succeed.

source
AtmosTransport.Preprocessing.read_era5_n320_window_fields! Method
julia
read_era5_n320_window_fields!(fields, workspace, handles, date, hour) -> fields

Fill fields with one window's worth of N320 source-grid fields for (date, hour). Performs one forward pass over handles.core_path, decoding every message whose (dataDate, dataTime) matches into the workspace's level-indexed spectral cubes (for gridType=sh) or directly into the output qv array (for gridType=reduced_gg). After the pass the function synthesizes spectral T per level, applies vod2uv! per level and synthesizes U/V, and synthesizes LNSP → PS. Errors loudly if any required level/field is absent so a partial download is immediately visible.

The function does not allocate beyond read_buf resizing inside read_spectral_coeffs!. Reusing (fields, workspace) across the 24 windows of a day is the intended call pattern.

source
AtmosTransport.Preprocessing.read_window! Function
julia
read_window!(raw::RawWindow, settings::AbstractMetSettings, ctx,
             date::Date, win_idx::Int) -> raw

Fill raw in place with one window of source data on the source's native grid. ctx is the day context returned by open_day. Implementations must NOT allocate per call beyond bounded scratch — raw is a pre-allocated workspace owned by the orchestrator — and should be idempotent in (date, win_idx).

source
AtmosTransport.Preprocessing.read_window! Method
julia
read_window!(raw, settings, handles, date, win_idx) -> raw

Fill raw in place with one window of GEOS data on the source CS grid. Both endpoints (t_n, t_{n+1}) carry dry DELP + dry PS reconstructed from PS_total via the hybrid coordinate, plus the original QV. Dynamics-step am/bm are MFXC/MFYC scaled by 1/mass_flux_dt.

The signature matches the canonical AbstractMetSettings contract declared in met_sources.jl::read_window!.

source
AtmosTransport.Preprocessing.regrid_ll_binary_to_cs Method
julia
regrid_ll_binary_to_cs(ll_binary_path, cs_grid, out_path; FT=Float64, mass_basis=nothing)

Regrid an existing LL transport binary to a cubed-sphere binary.

Reads each window from the LL binary, recovers cell-center winds from am/bm, conservatively regrids m/ps/winds to CS panels, rotates winds to panel-local coordinates, reconstructs CS face fluxes, closes continuity against forward mass endpoints, and stream-writes the CS binary.

This reuses the entire CS regrid/continuity/write infrastructure from the spectral→CS path — the only difference is the data source (binary reader instead of spectral synthesis).

Timestep metadata (dt_met_seconds, steps_per_window) is read directly from the source header by default. Passing steps_per_window overrides the output substep count: LL winds are recovered with the source scaling, then CS face fluxes are reconstructed with the output scaling.

Keyword arguments

  • FT::Type = Float64 — on-disk float type for the output CS binary.

  • mass_basis::Union{Nothing, Symbol} = nothing — output mass-basis label. nothing (default) = match the source. Setting this to a value that differs from the source's mass_basis currently errors: actual dry↔moist conversion requires loading the source's qv and applying apply_dry_basis_native!, which this function does not do. Invariant 14 mandates :dry end-to-end; use a dry source.

  • steps_per_window::Union{Nothing, Integer} = nothing — output substep count. nothing = match the source; larger values reduce stored per-substep fluxes while preserving the window-integrated transport.

  • allow_terminal_zero_tendency::Bool = false — diagnostic-only escape hatch for legacy LL sources that do not carry dm. Production-safe regrids should leave this at false so the final CS window is closed against an explicit endpoint target instead of an inferred zero-tendency fallback.

  • run_cache = nothing — optional PreprocessorRunCache used to reuse the LL→CS conservative regridder across calls in the same preprocessing run.

source
AtmosTransport.Preprocessing.regrid_n320_to_c180! Method
julia
regrid_n320_to_c180!(c180_fields, n320_window, workspace, target_grid) -> c180_fields

Conservatively regrid PS (2D) and U, V, T, Q (3D) from the N320 source mesh to the C180 cubed-sphere target. All five fields are intensive scalars; the regridder weights apply directly. The flat scratch buffers in workspace hold the intermediate Float64 arrays so the call is allocation-free after warm-up.

source
AtmosTransport.Preprocessing.regrid_transport_binary Method
julia
regrid_transport_binary(input_path, target_grid, out_path; kwargs...)

Generic transport-binary regrid extension point. Concrete target/source combinations implement methods here instead of adding new topology-specific scripts. The currently implemented production pair is LL transport binary to cubed sphere.

source
AtmosTransport.Preprocessing.reset_workspace! Function
julia
reset_workspace!(workspace, day_state) -> workspace

Reset a reusable workspace before ingesting a new day/source stream.

source
AtmosTransport.Preprocessing.resolve_tm5_convection_settings Method
julia
resolve_tm5_convection_settings(cfg) -> NamedTuple

Parse the optional [tm5_convection] section. When enable=true the preprocessor reads ERA5 physics binaries (built by convert_era5_physics_nc_to_bin, plan 24 Commit 2), computes TM5 entu/detu/entd/detd per hour via tm5_native_fields_for_hour! (plan 24 Commit 3), merges to the transport Nz, conservatively regrids to the target horizontal grid, and writes the four TM5 sections into the transport binary.

Fields:

  • tm5_convection_enable :: Bool — master switch.

  • tm5_physics_bin_dir :: String — NVMe directory holding era5_physics_YYYYMMDD.bin files produced by Commit 2's converter. Empty when disabled.

source
AtmosTransport.Preprocessing.run_unified_preprocessor_day! Method
julia
run_unified_preprocessor_day!(day::UnifiedPreprocessorDay; close_reader=true)

Execute the additive unified-driver lifecycle for one day. The function is generic over concrete reader/workspace/contract/writer types and depends on hook methods for topology-specific ingest/drain/flush behavior.

The writer is closed before summarize_status!, so fatal positivity summaries can quarantine a closed staging file. Any exception before promotion closes and quarantines the writer, then closes the reader.
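The ordering guarantees in that paragraph can be sketched with an event log. run_day! and its step names are hypothetical; the sketch records only the order of operations and performs no I/O.

```julia
# Hypothetical lifecycle sketch: each step appends its name to `events`, and
# `fail_at` forces an error at the named step.
function run_day!(events::Vector{Symbol}; fail_at = nothing, close_reader = true)
    step!(name) = (name === fail_at && error("failed at $name"); push!(events, name))
    promoted = false
    try
        step!(:ingest_windows)
        step!(:close_writer)       # closed first, so a fatal summary sees a closed staging file
        step!(:summarize_status)
        step!(:promote)
        promoted = true
    catch
        promoted || push!(events, :quarantine)   # any failure before promotion quarantines
        rethrow()
    finally
        close_reader && push!(events, :close_reader)
    end
    return events
end

ok_events = run_day!(Symbol[])

bad_events = Symbol[]
threw = try
    run_day!(bad_events; fail_at = :summarize_status)
    false
catch
    true
end
```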

source
AtmosTransport.Preprocessing.set_end_of_day_seed! Method
julia
set_end_of_day_seed!(reader::GEOSNativeReader, seed) -> reader

Set the end-of-day mass seed produced by the orchestrator's last window. Called once per day at the end of process_day. For NoChain readers this is a no-op.

source
AtmosTransport.Preprocessing.source_grid Function
julia
source_grid(settings::AbstractMetSettings)

Return the source grid mesh that read_window! produces data on. Used by the orchestrator to build the appropriate regridder for the target grid (or to detect the source==target passthrough case).

source
AtmosTransport.Preprocessing.source_grid Method
julia
source_grid(settings::GEOSSettings) -> CubedSphereMesh

The native source mesh GEOS data is archived on (Nc × Nc per panel, GEOS-native panel convention).

source
AtmosTransport.Preprocessing.summarize_cs_positivity_status Method
julia
summarize_cs_positivity_status(worst; cfl_limit, steps_per_window,
                               require_substep_positivity = true,
                               quarantine_path = nothing)

Post-loop summary helper. Logs the worst-window outcome, and if it exceeds cfl_limit:

  • deletes quarantine_path (if given) so a downstream consumer cannot pick up the half-written binary;

  • errors when require_substep_positivity = true, otherwise warns.

The error message includes a recommended steps_per_window value that would satisfy the gate, computed from the observed worst ratio.
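One way such a recommendation could be computed is sketched below. The formula is an assumption: it treats the worst ratio as scaling inversely with steps_per_window, which the text does not state. recommended_steps is a hypothetical name.

```julia
# Assumption: the worst ratio scales as 1 / steps_per_window, so the smallest
# count that meets the gate is ceil(steps * ratio / limit).
function recommended_steps(worst_ratio::Real, steps_per_window::Integer, cfl_limit::Real)
    isfinite(worst_ratio) || return nothing        # Inf or NaN: no substep count can rescue it
    worst_ratio <= cfl_limit && return steps_per_window
    return ceil(Int, steps_per_window * worst_ratio / cfl_limit)
end

suggestion = recommended_steps(1.5, 8, 0.95)
```

A ratio of 1.5 at 8 substeps against a 0.95 limit needs at least 8 × 1.5 / 0.95 ≈ 12.6 substeps, hence 13 in this sketch.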

source
AtmosTransport.Preprocessing.summarize_ll_positivity_status Method
julia
summarize_ll_positivity_status(worst; cfl_limit, steps_per_window,
                               require_substep_positivity = true,
                               quarantine_path = nothing)

Post-loop summary helper for the LL positivity accumulator. Logs the worst-window outcome, and if it exceeds cfl_limit:

  • deletes quarantine_path (if given) so a downstream consumer cannot pick up the half-written binary;

  • errors when require_substep_positivity = true, otherwise warns.

The error/warn message includes a recommended steps_per_window value that would satisfy the gate, computed from the observed worst ratio. The "no representable rescue" branch from CS round-3 is mirrored here.

source
AtmosTransport.Preprocessing.summarize_rg_positivity_status Method
julia
summarize_rg_positivity_status(worst; cfl_limit, steps_per_window,
                               require_substep_positivity = true,
                               quarantine_path = nothing)

Post-loop summary helper for the RG positivity accumulator. Mirrors summarize_cs_positivity_status (CS round-2 + round-3 escape-hatch semantics).

source
AtmosTransport.Preprocessing.summarize_status! Function
julia
summarize_status!(contract::AbstractWindowContract;
                   quarantine_path::Union{Nothing, AbstractString} = nothing)
    -> nothing

Post-loop summary helper. Logs the worst-window outcome; if the contract's policy demands it and the worst window exceeds the gate, deletes quarantine_path (if given) and errors. Otherwise warns when the gate is violated but the policy is set to record-and-continue.

source
AtmosTransport.Preprocessing.summarize_status! Method
julia
summarize_status!(contract::CubedSphereContract;
                   quarantine_path::Union{Nothing, AbstractString} = nothing)

Run the CS positivity post-loop summary using the contract's worst accumulator and policy fields. May log, warn, or error depending on the accumulator state and require_substep_positivity.

source
AtmosTransport.Preprocessing.target_summary Method
julia
target_summary(grid) -> String

Return a short human-readable summary of the configured target grid.

source
AtmosTransport.Preprocessing.tm5_copy_or_regrid_ll! Method
julia
tm5_copy_or_regrid_ll!(dst_3d, ws_field, ws)

LL phase-2 helper: copy (identity, when shapes match) or regrid (conservative) a single source-grid merged TM5 field into the per-window target array dst_3d of shape (Nx, Ny, Nz).

source
AtmosTransport.Preprocessing.tm5_native_fields_for_hour! Method
julia
tm5_native_fields_for_hour!(entu, detu, entd, detd,
                              udmf_hour, ddmf_hour, udrf_hour, ddrf_hour,
                              t_hour, q_hour, ps_hour,
                              ak_full, bk_full, Nz_native;
                              stats=nothing, scratch=nothing) -> nothing

Grid-level entry point: for each column (i, j) in the 2D grid, compute dz from (T, Q, ps) then call ec2tm_from_rates!. Writes results into the 3D (Nlon, Nlat, Nz_native) output arrays.

All input 3D arrays are native-level (137 layers for ERA5); 2D ps_hour is surface pressure in Pa. stats counters are bumped across all columns. scratch is an optional 4-tuple of length-Nz_native vectors to avoid per-column allocation; when nothing, fresh ones are allocated inside.

source
AtmosTransport.Preprocessing.update_accumulator! Function
julia
update_accumulator!(contract::AbstractWindowContract,
                     positivity_diag, win_idx::Int) -> nothing

Fold one window's positivity diagnostic into the contract's worst-window accumulator. Concrete contracts mutate their internal state and return nothing. Idempotent if positivity_diag.ratio does not exceed the current worst.

source
AtmosTransport.Preprocessing.update_accumulator! Method
julia
update_accumulator!(contract::CubedSphereContract, positivity_diag, win_idx::Int)

Fold one window's positivity diagnostic into the CS contract's worst-window accumulator. Mutates contract.worst.

source
AtmosTransport.Preprocessing.update_cs_positivity_accumulator Method
julia
update_cs_positivity_accumulator(worst, diag, win_idx) -> NamedTuple

Return an updated accumulator from a fresh per-window diagnostic.
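A non-mutating accumulator of this kind can be sketched as a fold over per-window diagnostics. The accumulator layout and the names empty_accumulator, fold_worst, and worst_over are hypothetical.

```julia
# Hypothetical accumulator layout: worst ratio so far and the window that produced it.
empty_accumulator() = (; ratio = 0.0, win_idx = 0, direction = nothing)

function fold_worst(worst::NamedTuple, diag::NamedTuple, win_idx::Int)
    # Keep the current worst unless this window is strictly worse.
    diag.ratio > worst.ratio || return worst
    return (; ratio = diag.ratio, win_idx, direction = diag.direction)
end

function worst_over(diags)
    worst = empty_accumulator()
    for (win_idx, diag) in enumerate(diags)
        worst = fold_worst(worst, diag, win_idx)
    end
    return worst
end

diags = [(; ratio = 0.40, direction = :x),
         (; ratio = 0.97, direction = :z),
         (; ratio = 0.60, direction = :y)]

worst = worst_over(diags)
```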

source
AtmosTransport.Preprocessing.update_ll_positivity_accumulator Method
julia
update_ll_positivity_accumulator(worst, diag, win_idx) -> NamedTuple

Return an updated LL accumulator from a fresh per-window diagnostic.

source
AtmosTransport.Preprocessing.update_rg_positivity_accumulator Method
julia
update_rg_positivity_accumulator(worst, diag, win_idx) -> NamedTuple

Return an updated RG accumulator from a fresh per-window diagnostic.

source
AtmosTransport.Preprocessing.verify_boundary_stub_flux_rg Method
julia
verify_boundary_stub_flux_rg(hflux, face_left, face_right;
                             tol = 0.0) -> NamedTuple

Explicit-invariant scan: any non-zero hflux value on a boundary stub (face_left ≤ 0 or face_right ≤ 0) is a contract violation. The runtime advection silently discards such fluxes (StrangSplitting.jl:279), so a writer that produces them is emitting data the runtime cannot apply, almost always because of a sign-flip or boundary-masking bug in preprocessing.

Returns (violated, worst_flux, worst_face, worst_level):

  • violated :: Bool — true iff any |flux| > tol on a boundary stub.

  • worst_flux :: Float64 — signed value of the worst-magnitude violation, or 0.0 if none.

  • worst_face :: Int — face index of the worst violation, or 0.

  • worst_level :: Int — k-index of the worst violation, or 0.

tol is the absolute tolerance below which a "near-zero" stub flux is permitted. Default 0.0 (strict) — RG writers should explicitly zero boundary stubs.
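The scan can be sketched as follows. This is an illustration, not the package's code: scan_boundary_stubs is a hypothetical name, and the (n_faces, n_levels) layout of hflux is an assumption.

```julia
# Sketch of the scan, assuming hflux is (n_faces, n_levels) and face_left /
# face_right hold one cell index per face, with a value ≤ 0 marking a stub.
function scan_boundary_stubs(hflux::AbstractMatrix, face_left, face_right; tol = 0.0)
    violated, worst_flux, worst_face, worst_level = false, 0.0, 0, 0
    for k in axes(hflux, 2), f in axes(hflux, 1)
        (face_left[f] <= 0 || face_right[f] <= 0) || continue   # skip interior faces
        mag = abs(Float64(hflux[f, k]))
        if mag > tol && mag > abs(worst_flux)
            violated = true
            worst_flux, worst_face, worst_level = Float64(hflux[f, k]), f, k
        end
    end
    return (; violated, worst_flux, worst_face, worst_level)
end

face_left  = [1, 0, 2]          # face 2 has no left cell
face_right = [2, 1, 0]          # face 3 has no right cell
hflux = [1.0  2.0;              # interior face: ignored by the scan
         0.0 -3.0;              # stub with a non-zero flux at level 2
         0.5  0.0]              # stub with a non-zero flux at level 1

strict  = scan_boundary_stubs(hflux, face_left, face_right)
lenient = scan_boundary_stubs(hflux, face_left, face_right; tol = 5.0)
```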

source
AtmosTransport.Preprocessing.verify_cs_window_contract! Method
julia
verify_cs_window_contract!(m_cur, am, bm, cm, m_next, steps_per_window, win_idx;
                           replay_tol, positivity_cfl_limit = 0.95, halo_width = 0)

Single canonical per-window CS binary contract check. Runs the replay gate (verify_write_replay_cs!, errors on failure) followed by the per-substep positivity scan (verify_substep_positivity_cs!, returns a diagnostic). Every CS-producing preprocessor (spectral, regrid, GEOS-native) should call this so no path can silently skip a gate.

Returns (; replay, positivity) with both diagnostics. Positivity is non-fatal here — callers aggregate the worst window and pass it to summarize_cs_positivity_status after the loop, where the run-level require_substep_positivity policy decides whether to error or warn.

source
AtmosTransport.Preprocessing.verify_ll_window_contract! Method
julia
verify_ll_window_contract!(m_cur, am, bm, cm, m_next, steps_per_window, win_idx;
                           replay_tol, positivity_cfl_limit = 0.95,
                           div_scratch = nothing)

Single canonical per-window LL binary contract check. Runs the replay gate (errors on failure) followed by the per-substep positivity scan (verify_substep_positivity_ll!, returns a diagnostic).

div_scratch may be pre-allocated by the caller (workspace-owned scratch from P2) to suppress the per-window Array{Float64} the default-allocating verify_window_continuity_ll would otherwise produce. Default nothing → allocate locally.

Returns (; replay, positivity). Positivity is non-fatal here — callers aggregate the worst window across the loop and pass it to summarize_ll_positivity_status where the run-level require_substep_positivity policy decides whether to error or warn.

source
AtmosTransport.Preprocessing.verify_rg_window_contract! Method
julia
verify_rg_window_contract!(m_cur, hflux, cm, m_next, face_left, face_right,
                           steps_per_window, win_idx;
                           replay_tol, positivity_cfl_limit = 0.95,
                           div_scratch = nothing,
                           outgoing_h = nothing, bad_h = nothing,
                           boundary_stub_tol = 0.0)

Single canonical per-window RG binary contract check. Runs three gates in order:

  1. Boundary-stub flux gate — errors hard if any boundary stub (face_left ≤ 0 / face_right ≤ 0) carries non-zero hflux above boundary_stub_tol. No require_* escape hatch: such fluxes are silently discarded by the runtime (StrangSplitting.jl:279), so emitting them is always a writer bug.

  2. Replay gate — verify_window_continuity_rg; errors on failure.

  3. Per-substep positivity scan — verify_substep_positivity_rg!, returns a non-fatal diagnostic; the run-level accumulator + summarize_rg_positivity_status decides fatal-vs-warn.

div_scratch, outgoing_h, bad_h may be pre-allocated by the caller (workspace-owned scratch from P2) to suppress per-window allocation. Default nothing → allocate locally.

Returns (; replay, positivity). Boundary-stub failure does not return; it errors out before the replay gate so a broken writer cannot silently emit a binary the runtime would partially evaluate.

source
AtmosTransport.Preprocessing.verify_substep_positivity_cs! Method
julia
verify_substep_positivity_cs!(m, am, bm, cm; cfl_limit = 0.95,
                              halo_width = 0, m_next = nothing)

Verify the per-substep horizontal+vertical positivity contract that the runtime's _cs_static_subcycle_count depends on. For every interior cell on every panel:

  1. The cell air mass itself must be positive (m > 0). A non-positive cell mass is an immediate contract violation — the runtime divides by m and would produce Inf or NaN in the CFL scan. Such a cell is reported with ratio = Inf regardless of flux magnitude.

  2. The combined Strang-palindrome outgoing budget 2 * (out_x + out_y + out_z) must not exceed cfl_limit * m_ref, where m_ref = min(m, m_next) when m_next is supplied by a caller that wants endpoint tightening, and m_ref = m otherwise. The factor of 2 is required because the CS runtime applies the direction sequence X-Y-Z-Z-Y-X for every met substep.

Returns a NamedTuple (direction, ratio, location, ok):

  • direction :: Union{Symbol, Nothing} — the dominant contributor among :x, :y, :z, or nothing when no cell was inspected.

  • ratio :: Float64 — worst observed palindrome outgoing budget over the reference mass, or Inf if any inspected cell had invalid mass or flux.

  • location :: NTuple{4, Int} — (panel, i, j, k) of the worst cell.

  • ok :: Bool — true iff ratio <= cfl_limit.

The replay gate (verify_write_replay_cs!) only checks endpoint continuity. A binary that drives a cell mass negative mid-sweep can still pass replay because the cell re-fills from inflow before the window ends — but the runtime cannot recover. This gate is the actual contract the runtime depends on.

halo_width defaults to 0 (panel arrays are stored unhaloed at preprocess time); pass > 0 to scan only the interior of a haloed buffer.
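The palindrome budget from item 2 can be sketched for a single unhaloed panel. The staggered array layout and the sign convention are assumptions that the text does not confirm, and palindrome_ratio is a hypothetical name.

```julia
# Assumptions: fluxes are staggered, with am sized (Nx+1, Ny, Nz),
# bm (Nx, Ny+1, Nz), cm (Nx, Ny, Nz+1), and a positive value means flow
# toward the higher index.
outgoing(lo, hi) = max(0.0, -lo) + max(0.0, hi)

function palindrome_ratio(m, am, bm, cm; cfl_limit = 0.95, m_next = nothing)
    worst, loc = 0.0, (0, 0, 0)
    for k in axes(m, 3), j in axes(m, 2), i in axes(m, 1)
        m_ref = m_next === nothing ? m[i, j, k] : min(m[i, j, k], m_next[i, j, k])
        out = outgoing(am[i, j, k], am[i + 1, j, k]) +
              outgoing(bm[i, j, k], bm[i, j + 1, k]) +
              outgoing(cm[i, j, k], cm[i, j, k + 1])
        # Non-positive or non-finite inputs are reported as Inf.
        valid = isfinite(m_ref) && m_ref > 0 && isfinite(out)
        ratio = valid ? 2 * out / m_ref : Inf   # factor 2: X-Y-Z-Z-Y-X applies each flux twice
        if ratio > worst
            worst, loc = ratio, (i, j, k)
        end
    end
    return (; ratio = worst, location = loc, ok = worst <= cfl_limit)
end

m  = fill(10.0, 2, 1, 1)
am = zeros(3, 1, 1); am[2, 1, 1] = 3.0    # 3 units leave cell 1 toward cell 2
bm = zeros(2, 2, 1)
cm = zeros(2, 1, 2)

loose = palindrome_ratio(m, am, bm, cm)
tight = palindrome_ratio(m, am, bm, cm; m_next = reshape([4.0, 16.0], 2, 1, 1))
```

The second call shows the endpoint tightening: the same flux passes against m but fails against min(m, m_next) because cell 1 ends the window with less mass.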

source
AtmosTransport.Preprocessing.verify_substep_positivity_ll! Method
julia
verify_substep_positivity_ll!(m, am, bm, cm; cfl_limit = 0.95)

Per-substep horizontal+vertical positivity scan for a structured LL window. Mirrors verify_substep_positivity_cs! but operates on a single LL window (no panel dimension).

For every cell (i, j, k):

  1. m > 0. A non-positive cell mass is reported with ratio = Inf and short-circuits this cell's CFL ratios; the runtime divides by m and would otherwise produce Inf or NaN.

  2. Outgoing mass per substep, per direction, ≤ cfl_limit * m.

NaN/Inf cell mass and NaN/Inf fluxes are flagged as ratio = Inf (see CS round-2 fix in cubed_sphere_contracts.jl).

Returns (direction, ratio, location, ok) with:

  • direction :: Union{Symbol, Nothing} — :x / :y / :z / nothing (no inspection).

  • ratio :: Float64 — worst outgoing / m.

  • location :: NTuple{3, Int} — (i, j, k).

  • ok :: Bool — ratio ≤ cfl_limit.

source
AtmosTransport.Preprocessing.verify_substep_positivity_rg! Method
julia
verify_substep_positivity_rg!(m, hflux, cm, face_left, face_right;
                              cfl_limit = 0.95,
                              outgoing_h = nothing, bad_h = nothing)

Per-substep horizontal+vertical positivity scan for a face-indexed RG window. Mirrors verify_substep_positivity_cs! / ..._ll! but operates on the face-indexed RG mass-flux representation.

For every cell (c, k):

  1. m > 0. A non-positive cell mass is reported with ratio = Inf.

  2. Horizontal outgoing mass per substep ≤ cfl_limit * m. Only interior faces (face_left > 0 && face_right > 0) contribute, matching the runtime advection in StrangSplitting.jl:279. Boundary stubs are not counted as outflow here — see verify_boundary_stub_flux_rg for the separate "non-zero flux on a boundary stub" invariant.

  3. Vertical outgoing mass per substep ≤ cfl_limit * m. Same as CS / LL: max(0, -cm[c, k]) + max(0, cm[c, k+1]).

NaN/Inf cell mass and NaN/Inf fluxes are flagged as ratio = Inf (matches the CS round-2 fix).

outgoing_h and bad_h can be passed in as workspace-owned scratch to suppress per-window allocation once P2 wires this into the unified driver. Default nothing → allocate locally.

Returns (direction, ratio, location, ok) with:

  • direction :: Union{Symbol, Nothing} — :h / :z / nothing.

  • ratio :: Float64 — worst outgoing / m.

  • location :: NTuple{2, Int} — (cell, level).

  • ok :: Bool — ratio ≤ cfl_limit.
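The face-indexed scan can be sketched as below. It is an illustration under assumptions: rg_positivity is a hypothetical name, hflux is taken to be (n_faces, n_levels), cm to be (n_cells, n_levels + 1), and a positive hflux is taken to flow from the face_left cell to the face_right cell.

```julia
function rg_positivity(m, hflux, cm, face_left, face_right; cfl_limit = 0.95)
    n_cells, n_levels = size(m)
    outgoing_h = zeros(Float64, n_cells, n_levels)
    for k in 1:n_levels, f in eachindex(face_left)
        l, r = face_left[f], face_right[f]
        (l > 0 && r > 0) || continue        # boundary stubs do not count as outflow
        h = hflux[f, k]
        if h > 0
            outgoing_h[l, k] += h           # assumed convention: positive leaves the left cell
        else
            outgoing_h[r, k] -= h           # negative leaves the right cell
        end
    end
    direction, worst, loc = nothing, 0.0, (0, 0)
    for k in 1:n_levels, c in 1:n_cells
        out_z = max(0.0, -cm[c, k]) + max(0.0, cm[c, k + 1])
        for (dir, out) in ((:h, outgoing_h[c, k]), (:z, out_z))
            valid = isfinite(m[c, k]) && m[c, k] > 0 && isfinite(out)
            ratio = valid ? out / m[c, k] : Inf
            if direction === nothing || ratio > worst
                direction, worst, loc = dir, ratio, (c, k)
            end
        end
    end
    return (; direction, ratio = worst, location = loc, ok = worst <= cfl_limit)
end

m          = fill(10.0, 3, 1)
face_left  = [1, 2, 3]
face_right = [2, 3, 0]                       # face 3 is a boundary stub
hflux      = reshape([2.0, -4.0, 9.0], 3, 1) # the stub flux 9.0 is ignored here
cm         = zeros(3, 2)

diag = rg_positivity(m, hflux, cm, face_left, face_right)
```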

source
AtmosTransport.Preprocessing.verify_window! Function
julia
verify_window!(window, contract::AbstractWindowContract, win_idx::Int)
    -> (; replay, positivity)

Run the contract's replay and positivity gates for one window. The replay gate throws on violation; the positivity gate is non-fatal at this layer (the run-level accumulator + summarize_status! decides fatal-vs-warn based on require_substep_positivity).

window is the topology's per-window payload (a NamedTuple in P1, a typed ReadyWindow{G, FT} in P2).

source
AtmosTransport.Preprocessing.verify_window! Method
julia
verify_window!(window, contract::CubedSphereContract, win_idx::Int)
    -> (; replay, positivity)

Run the per-window CS contract on a NamedTuple window with fields m_cur, am, bm, cm, m_next (each a 6-tuple of panel arrays). Delegates to verify_cs_window_contract!; the replay gate throws on violation, the positivity gate is non-fatal here.

source
AtmosTransport.Preprocessing.verify_write_replay_cs! Method
julia
verify_write_replay_cs!(m_cur, am, bm, cm, m_next, steps_per_window, tol_rel, win_idx;
                        div_scratch = nothing)

Run the CS write-time replay gate for one window and return its diagnostic.

The check integrates the stored panel-local fluxes from m_cur under the runtime palindrome-continuity contract and verifies that the result matches the explicit endpoint m_next. A failure here means the binary would produce a runtime day-boundary or window-boundary mass inconsistency.

div_scratch may be pre-allocated by the caller (workspace-owned scratch from P2 / contract-owned lazy scratch from P1) so the gate doesn't allocate the panel-shared div_h per call. Default nothing → allocate locally. Shape must match size(m_cur[1]).

source
AtmosTransport.Preprocessing.window_metadata Function
julia
window_metadata(reader::AbstractMetReader) -> NamedTuple

Per-source window timing metadata. Standard fields:

  • windows::Int — windows_per_day(reader).

  • substeps::Int — sub-windows per write window (e.g. GEOS's dt_met_seconds ÷ mass_flux_dt).

  • dt_substep::Float64 — substep wall-clock in seconds.

Concrete readers may add source-specific keys (e.g. GEOS's mass_flux_dt).
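As an illustration, a metadata tuple of this shape could look like the following. The timing values are made up, and equating dt_substep with mass_flux_dt is an assumption of this sketch.

```julia
# Made-up timing values for illustration only.
dt_met_seconds = 3600
mass_flux_dt   = 450

meta = (; windows = 24,
          substeps = dt_met_seconds ÷ mass_flux_dt,   # sub-windows per write window
          dt_substep = Float64(mass_flux_dt),         # assumption: one substep per flux interval
          mass_flux_dt)                               # source-specific extra key
```

With these numbers, windows × substeps × dt_substep covers exactly one day of 86400 seconds.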

source
AtmosTransport.Preprocessing.windows_per_day Function
julia
windows_per_day(settings::AbstractMetSettings, date::Date) -> Int

Number of preprocessing windows per UTC day for this source on date. For most sources this is constant (24 for hourly), but date-dependent sources (e.g. a leap-second day) may override.

source
AtmosTransport.Preprocessing.write_window! Function
julia
write_window!(writer, ready)

Write one validated ready window through a topology-specific binary writer. Concrete methods are topology-indexed by AbstractBinaryWriter and ReadyWindow so writer/window mismatches fail by dispatch.
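The "fail by dispatch" behavior can be shown with a miniature of the pairing. None of these types are the package's own: Geometry, Writer, Ready, and write_one! are hypothetical names chosen for the sketch.

```julia
# Hypothetical miniature of the writer / window pairing.
abstract type Geometry end
struct LatLon <: Geometry end
struct CubedSphere <: Geometry end

struct Writer{G <: Geometry}
    log::Vector{String}
end

struct Ready{G <: Geometry}
    idx::Int
end

# Both arguments must share the same topology parameter G.
function write_one!(w::Writer{G}, r::Ready{G}) where {G <: Geometry}
    push!(w.log, "window $(r.idx) on $(nameof(G))")
    return w
end

writer = Writer{CubedSphere}(String[])
write_one!(writer, Ready{CubedSphere}(1))

# A lat-lon window handed to a cubed-sphere writer has no matching method.
mismatch = try
    write_one!(writer, Ready{LatLon}(2))
    nothing
catch err
    err
end
```

Because the mismatch has no applicable method, the bad pairing is rejected before any bytes are written.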

source
AtmosTransport.Preprocessing.write_window! Method
julia
write_window!(io, win_idx, storage, settings, merged, last_hour_next) -> Int64

Write one window's payload blocks to the output stream in v4 on-disk order.

source