Preprocessing API
For narrative coverage of the source × target dispatch model and per-source pipelines, see Preprocessing overview and the per-source pages ERA5 spectral path and GEOS native cubed-sphere.
AtmosTransport.Preprocessing.AbstractBinaryWriter Type
AbstractBinaryWriter{G <: AbstractTargetGeometry, FT,
                     Basis <: AbstractMassBasis}
Typed nominal for the topology's streaming binary writer. The third type parameter encodes the on-disk mass-basis convention (reusing State.AbstractMassBasis so the same nominal flows through the runtime reader path), so a writer↔reader pairing mismatch fails with a MethodError at dispatch instead of silently mixing conventions.
P1 ships only the abstract type; concrete subtypes land in P2.
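The pairing guarantee described above can be illustrated with a minimal, self-contained sketch. Every name below (SketchWriter, SketchReader, check_pairing, DryBasis, MoistBasis, and the local AbstractMassBasis stand-in) is hypothetical and defined only for this example; none of it is the package's actual writer or reader API.

```julia
# Hypothetical stand-ins; these are not the package's real types.
abstract type AbstractMassBasis end
struct DryBasis   <: AbstractMassBasis end
struct MoistBasis <: AbstractMassBasis end

struct SketchWriter{B <: AbstractMassBasis} end
struct SketchReader{B <: AbstractMassBasis} end

# Only defined when writer and reader share the same basis parameter B.
check_pairing(::SketchWriter{B}, ::SketchReader{B}) where {B <: AbstractMassBasis} = B

matched = check_pairing(SketchWriter{DryBasis}(), SketchReader{DryBasis}())

# A mismatched pair has no applicable method, so dispatch raises a MethodError.
mismatch_rejected = try
    check_pairing(SketchWriter{DryBasis}(), SketchReader{MoistBasis}())
    false
catch err
    err isa MethodError
end
```

Because the basis lives in the type rather than in a runtime field, no explicit comparison code is needed: the absence of a matching method is the check.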
AtmosTransport.Preprocessing.AbstractChainPolicy Type
AbstractChainPolicy
Type-system tag for cross-day mass-state carry. Concrete subtypes are NoChain (no carry; each day starts from the raw source endpoint) and ChainedMass{T} where T is the array shape of the seed (e.g. NTuple{6, Array{Float32, 3}} for GEOS cubed-sphere).
AtmosTransport.Preprocessing.AbstractERA5GRIBSettings Type
AbstractERA5GRIBSettings <: AbstractMetSettings
Abstract supertype for ERA5 native-GRIB sources. Concrete subtypes pick the source mesh via the flavor parameter on ERA5GRIBSettings.
AtmosTransport.Preprocessing.AbstractFieldKind Type
AbstractFieldKind
Singleton-type tag selecting the vertical-reduction rule apply_vertical! uses for one payload field. Concrete subtypes (all zero-size singletons) below.
AtmosTransport.Preprocessing.AbstractMetReader Type
AbstractMetReader{FT, S, CP}
Typed met-reader nominal. Bundles a met-source's settings, per-day handle context, and chained-mass-policy carry into one struct that the unified preprocessor driver dispatches on. Type parameters:
FT <: AbstractFloat - preprocessing float type (Float32 for GPU, Float64 for diagnostic runs).
S <: AbstractMetSettings - concrete settings type (GEOSITSettings, GEOSFPSettings, ERA5SpectralSettings, …).
CP <: AbstractChainPolicy - NoChain or ChainedMass{T} for some seed array type T.
Concrete subtypes implement the seven trait functions below.
AtmosTransport.Preprocessing.AbstractMetSettings Type
AbstractMetSettings
Top-level supertype for typed met-data source descriptors used by the preprocessor. Concrete subtypes (e.g. GEOSITSettings, MERRA2Settings, ERA5SpectralSettings) carry source-specific paths and parameters and implement the read_window! / source_grid / windows_per_day interface.
process_day(date, grid::AbstractTargetGeometry, settings::AbstractMetSettings, vertical; ...) dispatches on settings to pick the reader.
AtmosTransport.Preprocessing.AbstractVerticalTransform Type
AbstractVerticalTransform
Typed nominal selecting how native source levels are mapped to output levels. Concrete subtypes:
IdentityVertical - no merge.
MergeByIndex - explicit native-level groups.
MergeLayersThinnerThan - automatic local coarsening.
MergeAbovePressure - upper-atmosphere coarsening.
LevelSelection - echlevs-style level selection.
PressureOverlap - pressure-thickness overlap onto a different hybrid coordinate.
A concrete transform T is consumed by plan_vertical(transform, native_vc) to produce a VerticalPlan{FT, T}. apply_vertical! dispatches on (plan, ::FieldKind) to select the right per-field rule.
AtmosTransport.Preprocessing.AbstractWindowContract Type
AbstractWindowContract{G <: AbstractTargetGeometry, FT}
Typed nominal owning a topology's per-window gate policy and the worst-window accumulator state. Concrete subtypes:
CubedSphereContract{FT} (cubed_sphere_contracts.jl)
LatLonContract{FT} (latlon_contracts.jl)
ReducedGaussianContract{FT} (reduced_gaussian_contracts.jl)
A concrete contract validates its policy at construction time (so e.g. positivity_cfl_limit = 0.0 errors before any window is preprocessed) and exposes the three-method trait surface:
verify_window!(window, contract, win_idx) -> (replay, positivity)
update_accumulator!(contract, positivity_diag, win_idx) -> nothing
summarize_status!(contract; quarantine_path) -> nothing
window is the topology-specific window payload (NamedTuple of typed buffers today, P2-typed ReadyWindow{G, FT} later).
Closes foot-gun (A) from DESIGN.md: contract knobs aren't drift-prone kwargs anymore — each topology constructs its own contract once from config, with whatever fields IT needs.
AtmosTransport.Preprocessing.AbstractWindowWorkspace Type
AbstractWindowWorkspace{G <: AbstractTargetGeometry, FT}
Typed nominal for the per-day target-shape workspace buffers. P1 ships only the abstract type; concrete subtypes land alongside the unified driver cutover in P2 (today's workspaces are NamedTuples constructed inside each topology's process_day orchestrator).
AtmosTransport.Preprocessing.ChainedMass Type
ChainedMass{T} <: AbstractChainPolicy
The reader carries an end-of-day mass field of shape T into the next day's open. The shape T is the seed array type (e.g. NTuple{6, Array{Float32, 3}}); the actual seed VALUE lives in the reader struct, but its STATIC type is encoded here so that the return type of end_of_day_seed is known at compile time.
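How a chain-policy tag can fix the return type of end_of_day_seed is sketched below. The tag types mirror the names documented here, but SketchReader, its constructors, and its field layout are hypothetical stand-ins written for this example, not the package's reader implementation.

```julia
abstract type AbstractChainPolicy end
struct NoChain <: AbstractChainPolicy end
struct ChainedMass{T} <: AbstractChainPolicy end

# Hypothetical reader: CP is the chain-policy tag, V the seed slot type.
struct SketchReader{CP <: AbstractChainPolicy, V}
    seed::V
end

SketchReader(::Type{NoChain}) = SketchReader{NoChain, Nothing}(nothing)
SketchReader(::Type{ChainedMass{T}}, seed::T) where {T} =
    SketchReader{ChainedMass{T}, T}(seed)

# The tag alone decides the method, so the return type follows from the reader type.
end_of_day_seed(::SketchReader{NoChain}) = nothing
function end_of_day_seed(r::SketchReader{ChainedMass{T}}) where {T}
    return r.seed::T
end

# Six small panels standing in for a cubed-sphere mass seed.
seed = ntuple(_ -> ones(Float32, 2, 2, 1), 6)
chained = SketchReader(ChainedMass{typeof(seed)}, seed)
unchained = SketchReader(NoChain)
```

In this sketch the seed value is stored in the reader, while the policy tag carries only its type, which matches the split between VALUE and STATIC type described above.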
AtmosTransport.Preprocessing.ConvectionInterfaceFlux Type
Convective interface flux (e.g. cmfmc). Same selection rule as PressureFluxField.
AtmosTransport.Preprocessing.ConvectionTendencyField Type
Convective center tendency (e.g. dtrain). Extensive at the layer; sum within merged group.
AtmosTransport.Preprocessing.CubedSphereContract Type
CubedSphereContract{FT} <: AbstractWindowContract{CubedSphereTargetGeometry, FT}
Typed nominal owning a CS preprocessor's per-window gate policy and worst-window positivity accumulator.
Fields:
replay_tol::Float64 - relative replay tolerance.
positivity_cfl_limit::Float64 - per-substep positivity CFL gate. Must satisfy 0 < limit ≤ 1; validated at construction.
require_substep_positivity::Bool - whether summarize_status! errors (true) or warns (false) on a positivity violation.
steps_per_window::Int - for the recommended-steps message in the summary's escape-hatch detail. Must be ≥ 1.
halo_width::Int - passed through to verify_substep_positivity_cs!.
worst::NamedTuple - mutable accumulator (initially zero).
Construct with explicit kwargs; defaults match the CS round-2/round-3 production policy.
AtmosTransport.Preprocessing.ERA5C180RegridFields Type
ERA5C180RegridFields{FT}
Per-window output container for fields regridded onto the C180 cubed-sphere target. Holds:
ps - 2D (Nc, Nc) matrix per panel (Pa, moist surface pressure),
u, v, t, qv - 3D (Nc, Nc, Nz) arrays per panel (U/V in geographic east/north frame; rotation to panel-local axes happens in the breakpoint-F glue where the panel basis is known).
Mass fields (m_dry, delp_dry, ps_dry) are not regridded directly; they are re-derived on the C180 mesh from the regridded PS + Q so that Σ_k DELP_dry == PS_dry to roundoff on the target side as well.
AtmosTransport.Preprocessing.ERA5C180RegridWorkspace Type
ERA5C180RegridWorkspace{FT, R}
Owns the conservative regridder + flat scratch buffers used by regrid_n320_to_c180!. Allocated once per (source_grid, target_grid) pair and reused across every window.
AtmosTransport.Preprocessing.ERA5GRIBDayHandles Type
ERA5GRIBDayHandles{S<:AbstractERA5GRIBSettings}
Per-day source-file context. Carries the resolved on-disk paths to the day's GRIB streams plus the optional next-day core path that supplies the right endpoint of the last window.
convection_path is nothing unless settings.include_convection is set; likewise surface_path is nothing unless settings.include_surface is set. next_core_path is nothing either when the caller passed next_day_handle=false or when the next-day file is not on disk (last day of the available archive).
AtmosTransport.Preprocessing.ERA5GRIBSettings Type
ERA5GRIBSettings{flavor} <: AbstractERA5GRIBSettings
Typed settings for one ERA5 native-GRIB flavor:
flavor = :n320 - reduced linear-Gaussian N320, the default MARS native grid for ERA5 analyses (640 latitude rings, 1280 longitudes at the equator, 137 hybrid levels).
ERA5 archives store hybrid model levels top-down (k = 1 is the TOA-side cap). level_orientation = :top_down reflects that, matching era5.toml. The reader flips to the project's runtime convention downstream — same path used by the GEOS-IT bottom-up source.
AtmosTransport.Preprocessing.ERA5N320ConvectionFields Type
ERA5N320ConvectionFields{FT}
Per-window convection forecast fields on the N320 source mesh. All four fields are (n_cells, Nz):
udmf - updraft convective mass flux (kg m⁻² s⁻¹)
ddmf - downdraft convective mass flux (kg m⁻² s⁻¹)
udrf - updraft detrainment rate (kg m⁻³ s⁻¹)
ddrf - downdraft detrainment rate (kg m⁻³ s⁻¹)
Downstream conversion to GEOS-style CMFMC + DTRAIN (or TM5-style entu/entd) happens in the breakpoint-F glue once both source and target geometries are known.
AtmosTransport.Preprocessing.ERA5N320DryMassFields Type
ERA5N320DryMassFields{FT}
Per-window dry-basis output container. m_dry is dry-air mass per cell per layer (kg), delp_dry is dry pressure thickness per cell per layer (Pa), ps_dry is the column-integrated dry surface pressure (Pa). ps_dry_acc is a Float64 accumulator that backs ps_dry so Σ_k DELP_dry stays accurate even when FT = Float32 (a 137-layer summation with ~10 hPa values per layer would otherwise accumulate up to ~1 Pa of single-precision rounding error, since one Float32 ulp near 10⁵ Pa is about 0.008 Pa).
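The effect of backing a Float32 column sum with a Float64 accumulator can be checked with a short experiment. The layer thicknesses below are synthetic values invented for this sketch, and the two helper functions are hypothetical; they are not the package's derivation code.

```julia
# Hypothetical column: 137 dry pressure thicknesses (Pa) stored as Float32.
Nz = 137
delp_dry = Float32[500 + 3k + 0.37f0 * k for k in 1:Nz]

# Accumulate in the storage precision.
function column_sum_f32(delp)
    acc = 0.0f0
    for v in delp
        acc += v
    end
    return acc
end

# Accumulate in Float64, as an accumulator like ps_dry_acc would.
function column_sum_f64(delp)
    acc = 0.0
    for v in delp
        acc += Float64(v)
    end
    return acc
end

# Exact sum of the stored Float32 values, computed in extended precision.
reference = sum(big.(delp_dry))
err_f32 = abs(big(column_sum_f32(delp_dry)) - reference)
err_f64 = abs(big(column_sum_f64(delp_dry)) - reference)
println("Float32 accumulator error (Pa): ", Float64(err_f32))
println("Float64 accumulator error (Pa): ", Float64(err_f64))
```

For inputs of this size the Float64 accumulation is exact, because every partial sum fits in a 53-bit significand, whereas the Float32 accumulation rounds at each step.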
AtmosTransport.Preprocessing.ERA5N320SpectralWorkspace Type
ERA5N320SpectralWorkspace{FT, G}
Per-window workspace for ERA5 N320 spectral synthesis. Owns:
the spectral coefficient cubes for T, VO, D, LNSP for the current hour (sized (T+1) × (T+1) × Nz in ComplexF64),
per-level scratch matrices u_spec / v_spec reused inside the synthesis loop,
a ReducedSpectralThreadCache with Legendre column buffer plus FFT and real ring buffers sized to every unique ring length in the source mesh,
a read_buf scratch used by read_spectral_coeffs! for the raw ecCodes codes_get_double_array payload,
completion bookkeeping (have_t / have_vo / have_d / have_lnsp).
The cubes dominate memory: at T = 639 and Nz = 137 each cube is ≈ 0.9 GB. Workspaces are intended to be allocated once per day-handle and reused across the 24 hours.
Spectral buffers (vo_spec, d_spec, t_spec, lnsp_spec, u_spec, v_spec, synth_cache.P_buf) are unconditionally ComplexF64 / Float64 regardless of FT — read_spectral_coeffs! and vod2uv! only operate at that precision. FT only controls the eltype of the downstream gridpoint fields written via read_era5_n320_window_fields!.
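The per-cube memory estimate quoted above follows directly from the stated cube shape and element type. The arithmetic below is a plain check of that figure and uses no package code; the variable names are chosen for this sketch only.

```julia
T  = 639                                  # spectral truncation quoted above
Nz = 137                                  # native ERA5 level count
bytes_per_coeff = sizeof(ComplexF64)      # 16 bytes per complex coefficient

# One (T+1) × (T+1) × Nz cube of ComplexF64 coefficients.
cube_bytes = (T + 1) * (T + 1) * Nz * bytes_per_coeff
cube_GB = cube_bytes / 1e9

println("one spectral cube: ", cube_bytes, " bytes ≈ ", round(cube_GB, digits = 2), " GB")
```

Since these buffers stay ComplexF64 regardless of FT, choosing FT = Float32 does not shrink this part of the footprint.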
AtmosTransport.Preprocessing.ERA5N320ToC180Pipeline Type
ERA5N320ToC180Pipeline{FT, RW, CSGrid, SrcGrid}
All-in-one container for the per-day ERA5 N320 → C180 preprocessing workspace. Holds the source-grid descriptor, the hybrid coordinate, the shared per-cell area vector, every read/derive/regrid workspace from breakpoints B-D, the convection workspace from E, and the per-window output fields on both the source mesh and the C180 target.
One pipeline allocated per day-handle, reused across the 24 hourly windows.
AtmosTransport.Preprocessing.ERA5N320WindowFields Type
ERA5N320WindowFields{FT}
Per-window output container. u / v (m/s, geographic frame), t (K), qv (kg/kg specific humidity) are (n_cells, Nz); ps (Pa) is (n_cells,). Dry-basis layer mass derivation, regridding, and convection conversion are downstream of this struct.
AtmosTransport.Preprocessing.ERA5PhysicsBinaryHeader Type
ERA5PhysicsBinaryHeader
In-memory view of the ERA5 physics BIN header. Parsed from / serialized to a 4 KB JSON block at the start of each BIN file. Fields:
format_version - int, currently 1. Readers error on mismatch.
date - Date, the calendar day the BIN represents (hours 00-23).
Nlon, Nlat, Nlev, Nt - grid shape. Nt is always 24 for the hourly calendar-day BIN.
var_offsets_bytes - NamedTuple mapping var name → byte offset into the file (where that variable's payload starts, after the header block).
var_nelems - NamedTuple mapping var name → total element count.
latitude_convention - :S_to_N (AtmosTransport orientation).
longitude_range - (first, last, step) in degrees.
latitude_range - (first, last, step) in degrees (S to N).
provenance - Dict with source NC paths, timestamp, git sha.
AtmosTransport.Preprocessing.ERA5PhysicsBinaryReader Type
ERA5PhysicsBinaryReader
Mmap view of one ERA5 physics BIN file. Per-variable getters return reshaped views with zero allocation on subsequent calls.
Fields
path - absolute BIN path.
io - open IOStream (keep alive for mmap lifetime).
header - parsed ERA5PhysicsBinaryHeader.
mmap - single flat Vector{Float32} over the entire payload. Individual variable views reshape into this.
Usage
reader = open_era5_physics_binary(bin_dir, Date(2021, 12, 1))
try
    udmf = get_era5_physics_field(reader, :udmf)   # 4D (Nlon, Nlat, Nlev, 24)
    slab = @view udmf[:, :, :, 5]                  # one hour
    # ... use slab ...
finally
    close_era5_physics_binary(reader)              # always release the mmap and IOStream
end
AtmosTransport.Preprocessing.ERA5SpectralReader Type
ERA5SpectralReader{FT, S} <: AbstractMetReader{FT, S, NoChain}
Typed reader nominal for the ERA5 spectral path. ChainPolicy is fixed at NoChain because today's spectral path does not carry cross-day mass state (it pins global-mean ps per window instead). The per-window read still fuses with merge in today's process_window! and is deferred to P2.
AtmosTransport.Preprocessing.ERA5SpectralSettings Type
ERA5SpectralSettings <: AbstractMetSettings
Typed wrapper for the historical ERA5 spectral settings NamedTuple. The spectral preprocessor still performs spectral synthesis inside the topology workspaces, but the source axis is now explicit: TOML parsing constructs this settings type and topology methods dispatch on it.
AtmosTransport.Preprocessing.GEOSDayHandles Type
GEOSDayHandles
Open NCDataset handles for one UTC day's GEOS collections plus the resolved level orientation and the hybrid coordinate (loaded once for endpoint-DELP reconstruction). The orchestrator opens these once at the start of process_day and closes them at the end.
AtmosTransport.Preprocessing.GEOSNativeReader Type
GEOSNativeReader{FT, S, CP, V, H} <: AbstractMetReader{FT, S, CP}
Typed reader wrapping (GEOSSettings, handles) for one day, plus optional chained-mass state. Type parameters:
FT - preprocessing float type.
S <: AbstractGEOSSettings - concrete settings (GEOSITSettings / GEOSFPSettings / …).
CP <: AbstractChainPolicy - NoChain or ChainedMass{T}.
V - seed-field slot type. Nothing for NoChain; Union{Nothing, NTuple{6, Array{FT, 3}}} for ChainedMass. The V parameter is fixed at construction (Julia-style review round-1: value-dependent V was the type-instability foot-gun; we pin V to the unconditional Union on the chained path so the inner constructor specializes once).
H - concrete handles type. Pinned to typeof(open_day(...)) at construction so reader.handles access is type-stable (replaces the previous ::Any field). Different GEOS flavors produce different handle structs (GEOSDayHandles for GEOS-IT, GEOSFPNativeDayHandles for GEOS-FP); each becomes its own concrete reader type via this parameter.
Constructor: open_reader(settings::AbstractGEOSSettings, date, FT; seed, next_day_handle). See the function docstring.
AtmosTransport.Preprocessing.GEOSSettings Type
GEOSSettings{flavor} <: AbstractGEOSSettings
Settings for one of the two supported GEOS flavors:
flavor = :geosit - GEOS-IT (file pattern GEOSIT.{date}.{collection}.C{Nc}.nc).
flavor = :geosfp - GEOS-FP (file pattern GEOSFP.{date}.{collection}.C{Nc}.nc).
Auto-detection of level orientation runs at open_geos_day time when level_orientation = :auto. Set explicitly to :bottom_up or :top_down to skip the heuristic.
AtmosTransport.Preprocessing.IdentityVertical Type
No-op identity vertical transform. Nz_output = Nz_native, merge_map[k] = k.
AtmosTransport.Preprocessing.IntensiveCenterField Type
Center-level intensive field (e.g. T, Q). Mass-weighted mean within merged group; weights provided positionally.
AtmosTransport.Preprocessing.LatLonContract Type
LatLonContract{FT} <: AbstractWindowContract{LatLonTargetGeometry, FT}
Typed nominal owning an LL preprocessor's per-window gate policy and worst-window positivity accumulator. Construction validates positivity_cfl_limit ∈ (0, 1] and steps_per_window ≥ 1 so an invalid TOML value fails before any window runs.
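Construction-time validation of this kind is commonly written as an inner constructor. The sketch below assumes a reduced field set, and its default values are invented for illustration; SketchContract is a hypothetical type, not the package's LatLonContract.

```julia
# Hypothetical contract with a reduced field set; defaults are illustrative only.
struct SketchContract{FT <: AbstractFloat}
    replay_tol::Float64
    positivity_cfl_limit::Float64
    steps_per_window::Int

    function SketchContract{FT}(; replay_tol = 1e-10,
                                positivity_cfl_limit = 1.0,
                                steps_per_window = 4) where {FT <: AbstractFloat}
        # Reject invalid policy values before any window is processed.
        0 < positivity_cfl_limit <= 1 ||
            throw(ArgumentError("positivity_cfl_limit must be in (0, 1], got $positivity_cfl_limit"))
        steps_per_window >= 1 ||
            throw(ArgumentError("steps_per_window must be ≥ 1, got $steps_per_window"))
        return new{FT}(replay_tol, positivity_cfl_limit, steps_per_window)
    end
end

contract = SketchContract{Float32}(positivity_cfl_limit = 0.95)

# A zero CFL limit is rejected at construction time.
zero_limit_rejected = try
    SketchContract{Float32}(positivity_cfl_limit = 0.0)
    false
catch err
    err isa ArgumentError
end
```

Because the only way to obtain an instance is through the validating constructor, every later call site can assume the policy values are in range.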
AtmosTransport.Preprocessing.LevelSelection Type
LevelSelection(echlevs)
Typed wrapper for the existing select_levels_echlevs algorithm. echlevs is a vector of native level INTERFACE indices (0-based, bottom-up); levels between selected interfaces are summed. See vertical_coordinates.jl:64 and the ECHLEVS_ML137_* constants.
AtmosTransport.Preprocessing.MassField Type
Center-level extensive mass (e.g. delp, m). Sum native layers within each merged group.
AtmosTransport.Preprocessing.MassFluxField Type
Horizontal face mass flux (e.g. am, bm). Already integrated over the layer thickness; sum within merged group.
AtmosTransport.Preprocessing.MergeAbovePressure Type
MergeAbovePressure(pressure_Pa; target_min_thickness_Pa = Inf,
                   reference_surface_pressure_Pa = 101325.0)
Upper-atmosphere coarsening: native layers whose midpoint pressure is LOWER than pressure_Pa (physically ABOVE the cutoff altitude) get greedily merged until each merged layer exceeds target_min_thickness_Pa. Below the cutoff, the native grid is preserved verbatim.
target_min_thickness_Pa = Inf merges every above-cutoff native layer into one top cap. The GEOS-IT L72 use case is pressure_Pa = 100.0 with target_min_thickness_Pa = 50.0, which merges the ~14 Pa mesospheric layers into ~50 Pa output layers while keeping the troposphere and stratosphere at native resolution.
AtmosTransport.Preprocessing.MergeByIndex Type
MergeByIndex(groups)
Explicit native center-level groups. groups[l] is the UnitRange{Int} of native center levels merged into output level l. Validation at plan_vertical:
groups[1] starts at 1;
groups[end] ends at Nz_native;
groups are contiguous (groups[l+1].start == groups[l].stop + 1);
each range is non-empty.
This is the most auditable transform for production reruns — the group list lives in the run TOML and is version-controlled.
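The validation rules listed for MergeByIndex, and the merge_map that such groups imply, can be sketched in a few lines. Both functions below are hypothetical helpers written for this example; they are not the package's plan_vertical implementation, and the error messages are invented.

```julia
# Check the four documented rules on a list of native-level groups.
function validate_groups(groups::Vector{UnitRange{Int}}, Nz_native::Int)
    isempty(groups) && throw(ArgumentError("groups must not be empty"))
    all(!isempty, groups) || throw(ArgumentError("each range must be non-empty"))
    first(groups[1]) == 1 || throw(ArgumentError("groups[1] must start at 1"))
    last(groups[end]) == Nz_native ||
        throw(ArgumentError("groups[end] must end at Nz_native = $Nz_native"))
    for l in 1:length(groups) - 1
        first(groups[l + 1]) == last(groups[l]) + 1 ||
            throw(ArgumentError("groups $l and $(l + 1) are not contiguous"))
    end
    return groups
end

# merge_map[k_native] is the output level that native level k_native feeds.
function merge_map_from_groups(groups, Nz_native)
    merge_map = zeros(Int, Nz_native)
    for (l, g) in enumerate(groups), k in g
        merge_map[k] = l
    end
    return merge_map
end

groups = validate_groups([1:2, 3:3, 4:6], 6)
merge_map = merge_map_from_groups(groups, 6)
```

With the three groups above, six native levels collapse to three output levels, and a list with a gap such as [1:2, 4:6] is rejected before any data is touched.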
AtmosTransport.Preprocessing.MergeLayersThinnerThan Type
MergeLayersThinnerThan(min_thickness_Pa; reference_surface_pressure_Pa = 101325.0)
Typed wrapper for the existing merge_thin_levels algorithm: greedily merge adjacent native layers until each output layer exceeds min_thickness_Pa at the reference surface pressure.
AtmosTransport.Preprocessing.NoChain Type
NoChain <: AbstractChainPolicy
The reader does not carry mass state across days. Day N starts from the raw source endpoint. end_of_day_seed(::reader{…, NoChain}) returns nothing (statically inferred).
AtmosTransport.Preprocessing.PreprocessorRunCache Type
PreprocessorRunCache{G, FT}()
Small typed cache for artifacts that should be built once per preprocessing run instead of once per day/window (for example spectral LL->CS regridders or RG compressed Laplacians). P2b only introduces the nominal and storage; concrete drivers decide which keys they own as they migrate.
AtmosTransport.Preprocessing.PressureFluxField Type
Vertical interface mass flux (e.g. cm). Interfaces are selected (not summed); top/bottom zeros preserved.
AtmosTransport.Preprocessing.PressureOverlap Type
PressureOverlap(target_coeff_path)
Remap native layer integrals onto an independent target hybrid coordinate by pressure-thickness overlap. The target half-level coefficients are loaded from target_coeff_path. Full apply_vertical! implementation lands in P2 alongside the spectral driver cutover; plan_vertical constructs the plan today.
AtmosTransport.Preprocessing.PreverifiedWindow Type
PreverifiedWindow(ready::ReadyWindow, contract_diag; accumulated=false)
Typed event wrapper for topology hooks that have already run verify_window! before handing a window to the generic driver. When accumulated=true, the hook also guarantees it has already folded the positivity diagnostic into the contract accumulator.
AtmosTransport.Preprocessing.RawWindow Type
RawWindow{FT, A2, A3}
Per-window source-grid intermediate carrying both window endpoints (t_n and t_{n+1}) and the window-integrated horizontal mass fluxes between them.
The right endpoint of window n is the left endpoint of window n+1, so the reader caches it across calls. For the last window of the day, the right endpoint comes from the next day's first instantaneous file (the existing next_day_hour0 plumbing in the orchestrator).
Cell-center winds u, v (geographic frame) are filled only when the target grid differs from the source grid and fluxes must be reconstructed downstream. For native passthrough (source mesh == target mesh) u/v stay nothing and am/bm are written through directly after vertical merging.
cmfmc/dtrain are filled only when the source supports convection and the user has enabled it via settings.include_convection. vdiff carries optional Holtslag-Boville VDIFF source fields (u, v, t, qv) when the source can archive them into the transport binary for runtime diffusion.
AtmosTransport.Preprocessing.ReadyWindow Type
ReadyWindow{G, FT}(index, payload::NamedTuple)
Typed wrapper for a topology-specific window payload that is ready to be verified and written. G is the target geometry, FT is the on-disk float type, and payload is the existing topology NamedTuple (m_cur/am/... for LL/CS, m_cur/hflux/... for RG).
Unknown property access is forwarded to payload, so existing verify_window!(window, contract, win_idx) methods can accept either a raw NamedTuple or a ReadyWindow.
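Property forwarding of this kind is usually done by overloading Base.getproperty. The sketch below shows one way it could look; SketchReadyWindow, LatLonTag, and total_mass are hypothetical names, and the extra type parameter P is an assumption of this example rather than a documented part of ReadyWindow.

```julia
struct LatLonTag end   # hypothetical geometry tag used only in this sketch

struct SketchReadyWindow{G, FT, P <: NamedTuple}
    index::Int
    payload::P
end

# Convenience constructor mirroring the documented (index, payload) call shape.
SketchReadyWindow{G, FT}(index, payload::P) where {G, FT, P <: NamedTuple} =
    SketchReadyWindow{G, FT, P}(index, payload)

# Real fields resolve first; every other name is forwarded to the payload.
function Base.getproperty(w::SketchReadyWindow, name::Symbol)
    name === :index   && return getfield(w, :index)
    name === :payload && return getfield(w, :payload)
    return getproperty(getfield(w, :payload), name)
end

# A consumer written against property access works for both representations.
total_mass(window) = sum(window.m_cur)

raw = (m_cur = [1.0, 2.0, 3.0], am = [0.1, -0.1])
ready = SketchReadyWindow{LatLonTag, Float32}(1, raw)
```

Here total_mass(raw) and total_mass(ready) take the same code path, which is what lets existing verification methods accept either form during migration.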
AtmosTransport.Preprocessing.ReducedGaussianContract Type
ReducedGaussianContract{FT} <: AbstractWindowContract{ReducedGaussianTargetGeometry, FT}
Typed nominal owning an RG preprocessor's per-window gate policy and worst-window positivity accumulator. Holds the face connectivity (face_left / face_right) so the per-window call site doesn't need to thread it through every call.
Construction validates replay_tol, positivity_cfl_limit ∈ (0, 1], steps_per_window ≥ 1, boundary_stub_tol ≥ 0, and length(face_left) == length(face_right).
boundary_stub_tol (default 0.0) is the absolute tolerance for the boundary-stub flux gate (verify_boundary_stub_flux_rg). The default is strict; loosen it only with caution.
AtmosTransport.Preprocessing.SurfaceField Type
2D surface field (e.g. ps, pblh). No vertical reduction — identity passthrough.
AtmosTransport.Preprocessing.TM5PreprocessingWorkspace Type
TM5PreprocessingWorkspace{FT, R, B}
Per-day scratch for the TM5 convection preprocessor. Holds the per-column vectors reused across all columns of an hour, plus the 4×(Nlon_src, Nlat_src, Nz_native) native-vertical buffers and the 4×(Nlon_src, Nlat_src, Nz) merged-vertical buffers on the ERA5 source grid. regridder is the conservative regridder from the ERA5 LL source mesh to the target mesh, or nothing when source and target are the same LL grid (identity fast-path).
physics_bufs::B holds optional FT-typed conversion buffers for the six physics inputs (udmf, ddmf, udrf, ddrf, t, q). When the workspace FT matches the physics BIN eltype (Float32 today), B === Nothing and the kernel reads the BIN views directly with zero copy. When FT differs (e.g. the F64 path), B === NTuple{6, Array{FT, 3}} and compute_tm5_merged_hour_on_source! upcasts hour views into them in-place — alloc once per workspace, reused across all 24 hours per day.
AtmosTransport.Preprocessing.TracerMassField Type
Center-level extensive tracer mass (e.g. qv-mass). Sum native layers within each merged group.
AtmosTransport.Preprocessing.UnifiedPreprocessorDay Type
UnifiedPreprocessorDay(reader, workspace, contract, writer; context=nothing)
Bundle the four typed axes a unified preprocessing day needs. context is an opaque topology/source adapter payload used by hook methods during migration.
AtmosTransport.Preprocessing.VerticalPlan Type
VerticalPlan{FT, T <: AbstractVerticalTransform}
Result of plan_vertical. Type-parameterized by the originating transform so the apply_vertical! dispatch is statically resolved.
Fields:
transform: the originating AbstractVerticalTransform value.
native_vc: the input native hybrid coordinate.
merged_vc: the output hybrid coordinate.
merge_map: Vector{Int} such that merge_map[k_native] is the output-level index for merge-map flavors. Int[] for PressureOverlap (uses overlap coefficients instead).
groups: Vector{UnitRange{Int}} of native center-level ranges that map to each output level (derived from merge_map for the merge-map flavors).
Nz_output: the output level count.
Nz_native: the input level count (cached for cheap access).
AtmosTransport.Preprocessing.TM5CleanupStats Method
TM5CleanupStats() -> NamedTuple of Ref{Int} counters
Diagnostic counters bumped by ec2tm_from_rates!. There is zero overhead when the function runs without stats (pass nothing); when stats are passed, each counter increments once per level/column that hit the corresponding cleanup branch.
columns_processed - total columns the function was called on.
no_updraft - columns with no level satisfying udmf > 0 (after small-value clipping). entu/detu zeroed out.
no_downdraft - columns with no level satisfying ddmf < 0.
levels_udmf_clipped, levels_ddmf_clipped - levels where half-level mass flux magnitude was below 1e-6 kg/m²/s and got zeroed.
levels_udrf_clipped, levels_ddrf_clipped - full-level detrainment rates zeroed (|rate| < 1e-10 kg/m³/s).
levels_entu_neg, levels_detu_neg, levels_entd_neg, levels_detd_neg - levels where the indicated output went negative and got fixed via symmetric redistribution with its complementary rate.
Interpretation
On clean data we expect ~0% clipped levels and ~0 no-updraft columns (outside pure stratospheric columns). O(1%) redistribution firings are normal TM5 behaviour. O(50%) firings indicate a data pathology (wrong param IDs, wrong stream, bad units).
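The "pass nothing for no overhead" pattern for optional counters can be sketched with two small methods selected by dispatch. The counter bundle below holds only two of the documented counters, and sketch_cleanup_stats, bump!, and clip_small_fluxes! are hypothetical names; the decision to skip exact zeros when counting is an assumption of this sketch, not documented behaviour of ec2tm_from_rates!.

```julia
# Hypothetical reduced counter bundle (two of the documented counters).
sketch_cleanup_stats() = (columns_processed = Ref(0), levels_udmf_clipped = Ref(0))

# With `nothing` the call is an empty method; with a bundle it increments a counter.
bump!(::Nothing, ::Symbol) = nothing
bump!(stats::NamedTuple, name::Symbol) = (getproperty(stats, name)[] += 1; nothing)

function clip_small_fluxes!(udmf::AbstractVector, stats; threshold = 1e-6)
    bump!(stats, :columns_processed)
    for k in eachindex(udmf)
        # Assumption: exact zeros are left alone and not counted as clipped.
        if udmf[k] != 0 && abs(udmf[k]) < threshold
            udmf[k] = 0
            bump!(stats, :levels_udmf_clipped)
        end
    end
    return udmf
end

stats = sketch_cleanup_stats()
column = [0.0, 5e-7, 2e-3, -3e-7]
clip_small_fluxes!(column, stats)

# The same kernel runs unchanged when no diagnostics are requested.
quiet_column = clip_small_fluxes!([5e-7, 1e-3], nothing)
```

Dividing levels_udmf_clipped[] by the total number of levels processed gives the clipped fraction to compare against the percentages in the interpretation note above.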
AtmosTransport.Preprocessing.allocate_era5_c180_regrid_workspace Method
allocate_era5_c180_regrid_workspace(source_grid, target_grid, Nz; cache_dir=nothing)
Build (or load from cache_dir) the N320 → C180 conservative regridder and allocate the flat scratch buffers. The regridder's intersections matrix is the expensive piece: on first run it is built and serialised to JLD2 under cache_dir, and subsequent runs load it in milliseconds.
AtmosTransport.Preprocessing.allocate_era5_n320_spectral_workspace Method
allocate_era5_n320_spectral_workspace(source_grid, T, Nz)
Allocate a fresh workspace sized to source_grid (build via discover_era5_n320_source_grid), spectral truncation T (from discover_era5_spectral_truncation) and vertical level count Nz (defaults to 137 in callers, but the workspace itself stays generic).
AtmosTransport.Preprocessing.allocate_era5_n320_to_c180_pipeline Method
allocate_era5_n320_to_c180_pipeline(handles, target_grid;
                                    Nz=ERA5_NATIVE_LEVEL_COUNT,
                                    cache_dir=nothing,
                                    include_convection=true)
Build the full per-window pipeline for the ERA5 source described by handles (resolved via open_era5_day) and the C-tier target target_grid. Discovers the source mesh and spectral truncation from the day's core GRIB, loads the hybrid coordinate file declared in the settings, builds (or JLD2-loads from cache_dir) the conservative regridder, and allocates every B/C/D/E workspace.
Convection workspace allocation is gated on include_convection so the caller can opt out for a scalar-only smoke.
AtmosTransport.Preprocessing.allocate_raw_window Function
allocate_raw_window(settings::AbstractMetSettings;
                    FT::Type, Nc=nothing, Nz=nothing) -> RawWindow
Allocate a pre-zeroed RawWindow sized for one window of settings's source data. The orchestrator calls this once before the per-window loop and reuses the same buffer across all windows in a day. The shape parameters (Nc, Nz, …) are source-specific; concrete subtypes pick the keys they need.
AtmosTransport.Preprocessing.allocate_raw_window Method
allocate_raw_window(settings::GEOSSettings; FT, Nz) -> RawWindow
Pre-allocate a per-window workspace for the GEOS reader: 6 zero-filled panel arrays each for m, m_next, qv, qv_next, am, bm (shape (Nc, Nc, Nz)) and ps, ps_next (shape (Nc, Nc)).
When settings.include_convection, also allocates cmfmc (interfaces, shape (Nc, Nc, Nz + 1) per panel) and dtrain (centers, shape (Nc, Nc, Nz) per panel). Cross-topology winds (u, v) stay nothing here — they are produced by the orchestrator only when the target grid differs from the source.
AtmosTransport.Preprocessing.allocate_tm5_workspace Method
allocate_tm5_workspace(Nlon_src, Nlat_src, Nz_native, Nz, FT;
                       regridder=nothing, target_nlon=Nlon_src,
                       target_nlat=Nlat_src) -> TM5PreprocessingWorkspace
Allocate the TM5 preprocessing workspace. Pass regridder=nothing for identity (source and target grids match). Otherwise pass a ConservativeRegridding.Regridder built via build_regridder(source_mesh, target_mesh) from src/Regridding/.
AtmosTransport.Preprocessing.allocate_window_workspace Function
allocate_window_workspace(args...; kwargs...)
Construct the topology-specific AbstractWindowWorkspace{G, FT} for one preprocessing day. Concrete methods land as production drivers migrate.
AtmosTransport.Preprocessing.apply_vertical! Function
apply_vertical!(buf_out, buf_in, plan::VerticalPlan, kind::AbstractFieldKind, args...)
Apply the vertical transform to one window of buf_in, writing the result into buf_out. Dispatches on the (plan.transform, kind) combination:
Extensive center fields (MassField, TracerMassField, MassFluxField, ConvectionTendencyField) sum native layers within each output group.
Interface fields (PressureFluxField, ConvectionInterfaceFlux) select the kept half-level interfaces.
IntensiveCenterField takes an additional positional weights argument (native mass-per-layer); produces the mass-weighted mean within each output group.
SurfaceField is a passthrough copy (no vertical reduction).
buf_out and buf_in must be 3D arrays with the vertical axis on dim 3 (or 2D for SurfaceField).
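The two center-field reduction rules described above (sum for extensive fields, mass-weighted mean for intensive fields) can be sketched on plain 3D arrays. The function names and the tiny single-column test data are hypothetical; this is an illustration of the rules, not the package's apply_vertical! methods.

```julia
# Extensive rule: sum native layers within each output group (vertical axis = dim 3).
function sum_groups!(buf_out, buf_in, groups)
    fill!(buf_out, zero(eltype(buf_out)))
    for (l, g) in enumerate(groups), k in g
        @views buf_out[:, :, l] .+= buf_in[:, :, k]
    end
    return buf_out
end

# Intensive rule: mean within each group, weighted by native mass per layer.
function weighted_mean_groups!(buf_out, buf_in, groups, weights)
    for (l, g) in enumerate(groups)
        num = sum(buf_in[:, :, k] .* weights[:, :, k] for k in g)
        den = sum(weights[:, :, k] for k in g)
        buf_out[:, :, l] .= num ./ den
    end
    return buf_out
end

# One column, three native layers merged into two output layers.
groups = [1:2, 3:3]
mass   = reshape([2.0, 1.0, 1.0], 1, 1, 3)          # extensive
temp   = reshape([300.0, 200.0, 100.0], 1, 1, 3)    # intensive

mass_out = sum_groups!(zeros(1, 1, 2), mass, groups)
temp_out = weighted_mean_groups!(zeros(1, 1, 2), temp, groups, mass)
```

In the merged first layer the temperature is (2·300 + 1·200) / 3, which shows why the weights must be the native layer masses rather than a simple layer count.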
AtmosTransport.Preprocessing.build_target_geometry Method
build_target_geometry(cfg_grid, FT=Float64) -> AbstractTargetGeometry
Dispatch configuration-driven target-geometry construction from the user-facing grid.type string.
AtmosTransport.Preprocessing.build_target_geometry Method
build_target_geometry(::Val{:cubed_sphere}, cfg_grid, FT)
Build a cubed-sphere target geometry from the [grid] config section.
Required keys:
Nc :: Int - cells per panel edge
Optional keys:
definition - "equiangular_gnomonic" (legacy synthetic) or "gmao" (GEOS-IT/GEOS-FP equal-distance gnomonic)
panel_convention or convention - "gnomonic" (default) or "geos_native". If definition is omitted, "geos_native" selects "gmao" and "gnomonic" selects "equiangular_gnomonic".
regridder_cache_dir - directory for CR.jl weight cache (default ~/.cache/AtmosTransport/cr_regridding)
staging_nlon, staging_nlat - override the internal LL staging grid size (defaults: max(4Nc, 360) × max(2Nc+1, 181))
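The default rules in the key list above can be written out as two small helpers. Both functions are hypothetical and operate on a plain Dict standing in for the parsed [grid] section; the precedence of panel_convention over convention when both are present is an assumption of this sketch.

```julia
# Staging grid size: explicit overrides win, otherwise the documented defaults apply.
function staging_size(cfg_grid::AbstractDict)
    Nc = Int(cfg_grid["Nc"])
    nlon = Int(get(cfg_grid, "staging_nlon", max(4Nc, 360)))
    nlat = Int(get(cfg_grid, "staging_nlat", max(2Nc + 1, 181)))
    return (nlon, nlat)
end

# Panel definition: an explicit `definition` wins, otherwise it follows the convention.
function resolve_definition(cfg_grid::AbstractDict)
    haskey(cfg_grid, "definition") && return String(cfg_grid["definition"])
    # Assumption: `panel_convention` takes precedence over `convention`.
    convention = get(cfg_grid, "panel_convention", get(cfg_grid, "convention", "gnomonic"))
    return convention == "geos_native" ? "gmao" : "equiangular_gnomonic"
end

c180 = Dict{String, Any}("Nc" => 180, "panel_convention" => "geos_native")
c48  = Dict{String, Any}("Nc" => 48)

staging_c180 = staging_size(c180)
staging_c48  = staging_size(c48)
```

For Nc = 180 the defaults give a 720 × 361 staging grid, while for small Nc the floor values 360 × 181 take over.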
AtmosTransport.Preprocessing.build_target_geometry Method
build_target_geometry(::Val{:era5_native_reduced_gaussian}, cfg_grid, FT)
Build the reduced-Gaussian geometry metadata from a native ERA5 GRIB file. This is currently intended for geometry discovery and future native-grid preprocessing work rather than the active lat-lon synthesis path.
AtmosTransport.Preprocessing.build_target_geometry Method
build_target_geometry(::Val{:latlon}, cfg_grid, FT) -> LatLonTargetGeometry
Build the regular lat-lon target geometry used by the current v4 spectral preprocessor.
AtmosTransport.Preprocessing.build_target_geometry Method
build_target_geometry(::Val{:synthetic_reduced_gaussian}, cfg_grid, FT)
Build a reduced-Gaussian target geometry from scratch without needing a source GRIB file. cfg_grid["gaussian_number"] controls the truncation N so the mesh has 2N Gauss-Legendre latitude rings. cfg_grid["nlon_mode"] selects the longitude layout:
"regular" (default): every ring has 4N cells, the classical "regular reduced" Gaussian grid used e.g. in the TL159/N80 family.
"octahedral": ECMWF octahedral layout nlon = 4k + 16 per hemisphere, mirrored between the two hemispheres with k = 1 at the pole-adjacent ring.
Mesh latitudes are produced from FastGaussQuadrature.gausslegendre(2N), which returns ascending Gauss-Legendre nodes on (-1, 1). geometry_source_grib is left as an empty string to mark the grid as synthetic.
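The two longitude layouts can be made concrete by listing the cell count per ring. The helper below is a hypothetical sketch that only computes ring lengths; it does not compute latitudes (so it needs no FastGaussQuadrature call) and is not the package's geometry builder.

```julia
# Cells per latitude ring, pole to pole, for a Gaussian number N.
function ring_lengths(N::Int; nlon_mode::AbstractString = "regular")
    if nlon_mode == "regular"
        hemisphere = fill(4N, N)                  # every ring has 4N cells
    elseif nlon_mode == "octahedral"
        hemisphere = [4k + 16 for k in 1:N]       # k = 1 at the pole-adjacent ring
    else
        throw(ArgumentError("unknown nlon_mode: $nlon_mode"))
    end
    # Mirror the hemisphere so the layout is symmetric about the equator.
    return vcat(hemisphere, reverse(hemisphere))
end

regular_n80    = ring_lengths(80)
octahedral_320 = ring_lengths(320; nlon_mode = "octahedral")

println("N80 regular: ", length(regular_n80), " rings, ", sum(regular_n80), " cells")
println("N320 octahedral: ", length(octahedral_320), " rings, ", sum(octahedral_320), " cells")
```

The octahedral layout starts at 20 cells next to each pole and grows by 4 cells per ring toward the equator, so it has far fewer cells near the poles than the regular layout at the same N.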
AtmosTransport.Preprocessing.close_day! Function
close_day!(ctx)
Close all resources held by a day context. Must be idempotent; safe to call from a finally block.
AtmosTransport.Preprocessing.close_era5_day! Method
close_era5_day!(handles::ERA5GRIBDayHandles)
Release any resources held by the day handle. Idempotent — safe to call from a finally block. This is a no-op today because the handle only stores paths, but the symbol exists so future flavors that pin file descriptors don't change the call surface.
AtmosTransport.Preprocessing.close_era5_physics_binary Method
close_era5_physics_binary(reader) -> nothing
Release the mmap and close the underlying IOStream.
AtmosTransport.Preprocessing.close_reader! Function
close_reader!(reader::AbstractMetReader) → nothing
Close all per-day file handles and release scratch held by the reader. Idempotent — safe to call from a finally block. Concrete readers implement this method.
AtmosTransport.Preprocessing.close_streaming_binary! Function
close_streaming_binary!(writer)
Close a streaming binary writer. Implementations should be idempotent.
AtmosTransport.Preprocessing.compute_tm5_merged_hour_on_source! Method
compute_tm5_merged_hour_on_source!(ws, reader, hour, ps_hour, ak_full, bk_full,
Nz_native, merge_map; stats)
Phase 1 of the per-hour TM5 step. Reads native-L137 ERA5 physics fields for hour from the mmap-backed reader, runs the TM5 math column-by-column into ws.*_native, then collapses to merged Nz into ws.*_merged_src. All work is on the ERA5-native horizontal grid (720×361 for standard physics BINs).
This path requires source and target grids to match (Nx, Ny) == (720, 361). PS comes from the caller (typically the preprocessor's transform.sp after spectral synthesis) because the physics BIN does not carry PS.
stats::Union{Nothing, NamedTuple} is the TM5CleanupStats bundle; when non-nothing, counters accumulate across all columns of all hours processed.
Shape guards:
reader fields shape (Nlon_src, Nlat_src, Nz_native, Nt) — must match ws.entu_native's leading dims.
ps_hour shape (Nlon_src, Nlat_src).
merge_map length Nz_native.
AtmosTransport.Preprocessing.contract_cfl_limit Function
contract_cfl_limit(contract::AbstractWindowContract) -> Float64
Per-substep positivity CFL gate the contract enforces.
AtmosTransport.Preprocessing.contract_replay_tolerance Function
contract_replay_tolerance(contract::AbstractWindowContract) -> Float64
Relative replay tolerance the contract's verify_window! uses. Concrete contracts return their stored policy value.
AtmosTransport.Preprocessing.contract_require_positivity Function
contract_require_positivity(contract::AbstractWindowContract) -> Bool
Whether summarize_status! errors (true) or warns (false) on a positivity violation. Closes-the-escape-hatch toggle from CS round-2.
AtmosTransport.Preprocessing.convert_era5_physics_nc_to_bin Method
convert_era5_physics_nc_to_bin(nc_dir, bin_dir, date;
force_rewrite=false, verbose=true) -> bin_path
Build one calendar-day ERA5 physics BIN from the archive NCs.
nc_dir: directory containing era5_convection_YYYYMMDD.nc + era5_thermo_ml_YYYYMMDD.nc.
bin_dir: output staging directory; BIN lands at <bin_dir>/<YYYY>/era5_physics_<YYYYMMDD>.bin.
date: calendar day (00:00-23:00) the BIN represents.
force_rewrite: when false and the BIN already exists with a valid header, skip (idempotent). When true, overwrite.
verbose: log progress (default true).
Requires the convection NC for both date - 1 day and date (because convection is forecast-based and a calendar-day BIN needs hours 00-06 from the previous day's file and hours 07-23 from the current day's file). Raises if either is missing.
The thermo NC is calendar-day aligned so only the target-date file is needed.
Writes the BIN atomically: first to <name>.bin.tmp, then renames. This is safe to run concurrently with a reader: the rename step is an atomic fs operation on a single filesystem.
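The write-then-rename pattern described above can be sketched with Julia's standard library. This is an illustration under stated assumptions, not the converter's actual code: the helper `write_atomically` is hypothetical, and it relies on `mv` performing a rename when source and destination sit on the same filesystem. Whether replacing an already existing destination is atomic depends on the Julia version and platform.

```julia
# Hypothetical helper: stage bytes at <path>.tmp, then promote to <path>.
function write_atomically(path::AbstractString, bytes::Vector{UInt8})
    tmp = path * ".tmp"
    open(tmp, "w") do io
        write(io, bytes)          # a reader never sees this partial file at `path`
    end
    mv(tmp, path; force = true)   # promotion step: rename within one filesystem
    return path
end

dir = mktempdir()
bin_path = write_atomically(joinpath(dir, "era5_physics_20200101.bin"), UInt8[1, 2, 3])
println(isfile(bin_path), " ", isfile(bin_path * ".tmp"))
```

A reader that opens the final path therefore sees either no file or a complete one, provided the staging file lives in the same directory as the target.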
AtmosTransport.Preprocessing.derive_c180_dry_mass! Method
derive_c180_dry_mass!(m_dry, delp_dry, ps_dry, ps_dry_acc,
ps_panels, qv_panels, vc, cell_areas; grav=GRAV) -> nothing
Cubed-sphere variant of derive_n320_dry_mass!. Builds the dry-air layer mass, dry pressure thickness, and dry surface pressure for each of the 6 C-tier panels from the regridded moist PS + Q on C180. Same formula as the GEOS endpoint dry-mass derivation so Σ_k DELP_dry = PS_dry to roundoff on the target mesh too.
Inputs and outputs are NTuple{6, ...} panel tuples; ps_dry_acc is a panel-tuple Float64 accumulator (so a Float32 output preserves the multi-level summation precision). cell_areas[i, j] is shared across all 6 panels (the CS mesh is isotropic per panel).
AtmosTransport.Preprocessing.derive_n320_dry_mass! Method
derive_n320_dry_mass!(dry, window, vc, cell_areas; grav=GRAV) -> dry
Reconstruct dry-air mass per layer, dry pressure thickness, and dry surface pressure from a populated window::ERA5N320WindowFields (moist PS, Q) using the hybrid coordinate vc (A and B arrays of length Nz+1, in Pa and dimensionless respectively, top-down) and the per-cell areas (m²). Matches the GEOS endpoint_dry_mass! formula so the runtime sees a nominally bit-identical dry-basis contract regardless of source.
Asserts shapes and length(vc.A) == length(vc.B) == Nz + 1 so a coefficient table that does not match the workspace Nz fails immediately with a clear DimensionMismatch.
AtmosTransport.Preprocessing.detect_level_orientation Method
detect_level_orientation(ctm_a1::NCDataset) -> Symbol
Return :bottom_up if k=1 is the surface (mass-thicker) and :top_down if k=1 is TOA. The heuristic is unambiguous: surface DELP is O(1000 Pa), TOA DELP is O(1 Pa).
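The heuristic amounts to comparing the layer thickness at the two ends of a column. A minimal sketch on a plain vector of DELP values follows; the helper `orientation_from_delp` is hypothetical, and the documented method reads from an NCDataset with a test whose exact form is not shown here:

```julia
# Hypothetical helper, not the package implementation: classify a DELP column (Pa).
function orientation_from_delp(delp::AbstractVector{<:Real})
    # Surface layers are O(1000 Pa) thick and TOA layers O(1 Pa),
    # so the thicker end of the column is the surface.
    return first(delp) > last(delp) ? :bottom_up : :top_down
end

println(orientation_from_delp([1500.0, 800.0, 1.0]))
```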
AtmosTransport.Preprocessing.discover_era5_n320_source_grid Method
discover_era5_n320_source_grid(core_path; FT=Float64) -> ReducedGaussianTargetGeometry
Build the N320 source-grid descriptor from the first gridType=reduced_gg message in core_path (the q field is always present). The resulting mesh ring order is south→north — the GRIB stores rings north→south and the read_era5_reduced_gaussian_* helpers flip into the project convention.
AtmosTransport.Preprocessing.discover_era5_spectral_truncation Method
discover_era5_spectral_truncation(core_path) -> Int
Read the first spectral (gridType=sh) message in core_path and return its triangular truncation J = K = M. ERA5 native model-level analyses are T639 in the current archive; the helper avoids hard-coding that value in case a future archive convention shifts.
AtmosTransport.Preprocessing.drain_ready_windows! Function
drain_ready_windows!(workspace) -> iterator of `ReadyWindow`
Return the windows that became write-ready after the last ingest.
AtmosTransport.Preprocessing.driver_after_write_window! Method
driver_after_write_window!(workspace, reader, ready, context)
Post-write migration hook. GEOS-native CS uses this to advance its chained endpoint state; most topologies do nothing.
AtmosTransport.Preprocessing.driver_drain_ready_windows! Method
driver_drain_ready_windows!(workspace, contract, win, context)
Return ready windows produced by the most recent ingest. A hook may return a single ReadyWindow, a single PreverifiedWindow, or any iterator of either shape.
AtmosTransport.Preprocessing.driver_flush_final_windows! Method
driver_flush_final_windows!(workspace, reader, contract, context)
Return final ready windows after all source windows have been ingested.
AtmosTransport.Preprocessing.driver_ingest_window! Method
driver_ingest_window!(workspace, reader, win, context)
Migration hook for ingesting one source window into the target workspace. Topology/source adapters may override while legacy workspace signatures are being collapsed.
AtmosTransport.Preprocessing.driver_windows_per_day Method
driver_windows_per_day(reader, context) -> Int
Migration hook for source readers whose window count needs adapter context.
AtmosTransport.Preprocessing.dz_hydrostatic_constT! Method
dz_hydrostatic_constT!(dz, ps, ak, bk, Nz; T_ref=260) -> dz
Constant-temperature hydrostatic layer thickness — fallback for use when T and Q are unavailable. Matches the shortcut used in main's Julia port (T_ref = 260 K). Biases entu/detu magnitudes by ~10-20% vs the virtual-temperature version; dz_hydrostatic_virtual! is preferred when T + Q are downloaded together with the convection fields.
AtmosTransport.Preprocessing.dz_hydrostatic_virtual! Method
dz_hydrostatic_virtual!(dz, T_col, Q_col, ps, ak, bk, Nz) -> dz
Compute layer thickness dz[1:Nz] (m) at layer centers from a single column's temperature T_col[1:Nz] (K) and specific humidity Q_col[1:Nz] (kg/kg), plus surface pressure ps (Pa) and the hybrid-sigma coefficients ak, bk (length Nz+1).
Uses the hydrostatic approximation with virtual temperature:
p_top[k] = ak[k] + bk[k] * ps (Pa, higher-altitude side)
p_bot[k] = ak[k+1] + bk[k+1] * ps (Pa, lower-altitude side)
dp[k] = p_bot[k] - p_top[k] (> 0 in AtmosTransport orientation)
p_mid[k] = 0.5 · (p_top[k] + p_bot[k])
T_v[k] = T_col[k] · (1 + 0.608 · Q_col[k])
dz[k] = R · T_v[k] / g · dp[k] / p_mid[k]
Orientation: AtmosTransport (k=1=TOA, k=Nz=surface). ak/bk are the full (Nz+1)-length ERA5 L137 hybrid coefficients.
For the TOA half-level we fall back to an ordinary scale-height estimate (T_v=T_top, p_mid=p_bot) when p_top → 0. In practice the top-level dz is never in the convection window so the approximation is irrelevant; it's a guard against divide-by-zero.
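The formulas above translate into a short column loop. The following is a sketch under stated assumptions rather than the package's implementation: the function name `dz_virtual_sketch!` is hypothetical, the constants `R_DRY` and `G0` are assumed values for the dry-air gas constant and gravity, and the TOA guard is reduced to replacing p_mid by p_bot when p_top is zero.

```julia
const R_DRY = 287.05    # J/(kg·K), assumed dry-air gas constant
const G0 = 9.80665      # m/s², assumed gravitational acceleration

# Hypothetical sketch: hydrostatic layer thickness with virtual temperature,
# AtmosTransport orientation (k = 1 is TOA, k = Nz is the surface).
function dz_virtual_sketch!(dz, T_col, Q_col, ps, ak, bk)
    for k in eachindex(dz)
        p_top = ak[k] + bk[k] * ps          # higher-altitude interface (Pa)
        p_bot = ak[k + 1] + bk[k + 1] * ps  # lower-altitude interface (Pa)
        dp = p_bot - p_top
        # Guard against p_top → 0 at the model top: fall back to p_mid = p_bot.
        p_mid = p_top > 0 ? 0.5 * (p_top + p_bot) : p_bot
        T_v = T_col[k] * (1 + 0.608 * Q_col[k])
        dz[k] = R_DRY * T_v / G0 * dp / p_mid
    end
    return dz
end

# Two-layer toy column; the coefficients are illustrative, not ERA5 L137 values.
ak_toy = [0.0, 5000.0, 0.0]
bk_toy = [0.0, 0.5, 1.0]
dz_toy = dz_virtual_sketch!(zeros(2), [260.0, 260.0], [0.0, 0.0], 100_000.0, ak_toy, bk_toy)
println(dz_toy)
```

With Q = 0 the virtual temperature equals T, so the result for an isothermal 260 K column matches what a constant-temperature estimate would give for the same pressure levels.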
AtmosTransport.Preprocessing.ec2tm! Method
ec2tm!(entu, detu, entd, detd,
mflu_ec, mfld_ec, detu_ec, detd_ec) -> (entu, detu, entd, detd)
Convert ECMWF convective mass-flux fields into TM5 (entu, detu, entd, detd) layer-center fields. All inputs / outputs in AtmosTransport orientation (k=1=TOA, k=Nz=surface). Operates in place on the four output arrays (no allocations).
Shapes:
entu, detu, entd, detd: (..., Nz) — layer centers.
mflu_ec, mfld_ec: (..., Nz+1) — half levels (interfaces). Interface k is the TOP of layer k; interface Nz+1 is the surface boundary.
detu_ec, detd_ec: (..., Nz) — layer centers.
The leading dimensions are arbitrary (scalar, (Nx, Ny), (ncells,), or per-panel (Nc, Nc)) as long as the arrays all have consistent shape. This function is backend-agnostic pure Julia; call from the preprocessor (CPU) ahead of binary writeout.
Negative small values (<= 0) in detu_ec / detd_ec are clipped to zero, following the ECMWF-rounding-artifact clean-up documented in phys_convec_ec2tm.F90 (ECMWF diagnostics can produce ~-1e-19 values from rounding).
For every location, the function computes the layer budgets below. The TM5 source states the same budget with k=1 at the surface; everything here is written in AtmosTransport orientation (k=1=TOA).
Layer k has interface k at its TOP (higher-altitude side) and interface k+1 at its BOTTOM (lower-altitude side).
Updraft flows upward: from interface k+1 into layer k via entrainment entu[k], out of layer k via interface k. Continuity: mflu_out − mflu_in + detu − entu = 0 ⟹ entu[k] = detu[k] + mflu_ec[k] − mflu_ec[k+1], where mflu_ec[k] is the flux through interface k (above layer k) and mflu_ec[k+1] is the flux through interface k+1 (below layer k). Convention: mflu_ec >= 0 (positive upward).
Downdraft flows downward: from interface k into layer k via entrainment entd[k], out through interface k+1 via detrainment. Sign convention in ECMWF: mfld_ec <= 0 (negative). We define mfld_abs = -mfld_ec >= 0, then entd[k] = detd[k] + mfld_abs[k+1] − mfld_abs[k].
Boundary conditions:
mflu_ec[1] = 0 (no updraft above TOA).
mflu_ec[Nz+1] = 0 (no updraft into ground from below — the surface is the sink).
mfld_ec[1] = 0 (no downdraft above TOA).
mfld_ec[Nz+1] = 0 (no downdraft escapes the surface).
These boundaries are typically already zero in ECMWF output; we enforce explicitly by construction below.
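For a single column, the two continuity relations and the boundary conditions can be sketched as follows. This is an illustration of the budget in AtmosTransport orientation, not the package's `ec2tm!`: the name `ec2tm_column_sketch!` is hypothetical, only vectors are handled, and no further clean-up of negative entrainment is attempted.

```julia
# Hypothetical single-column sketch (k = 1 is TOA, k = Nz is the surface).
function ec2tm_column_sketch!(entu, detu, entd, detd,
                              mflu_ec, mfld_ec, detu_ec, detd_ec)
    Nz = length(entu)
    FT = eltype(entu)
    # Interface fluxes with the boundary conditions enforced by construction:
    # zero at interface 1 (TOA) and at interface Nz + 1 (surface).
    up(k) = (k == 1 || k == Nz + 1) ? zero(FT) : FT(mflu_ec[k])
    dn(k) = (k == 1 || k == Nz + 1) ? zero(FT) : FT(-mfld_ec[k])  # flip ECMWF sign
    for k in 1:Nz
        detu[k] = max(detu_ec[k], zero(FT))    # clip rounding-level negatives
        detd[k] = max(detd_ec[k], zero(FT))
        entu[k] = detu[k] + up(k) - up(k + 1)  # updraft continuity
        entd[k] = detd[k] + dn(k + 1) - dn(k)  # downdraft continuity
    end
    return entu, detu, entd, detd
end

# Toy column: updraft entrains in the surface layer and detrains in layer 2;
# downdraft entrains in layer 1 and detrains in layer 2.
Nz = 3
entu = zeros(Nz); detu = zeros(Nz); entd = zeros(Nz); detd = zeros(Nz)
mflu_ec = [0.0, 0.0, 0.02, 0.0]
mfld_ec = [0.0, -0.01, 0.0, 0.0]
detu_ec = [0.0, 0.02, 0.0]
detd_ec = [0.0, 0.01, -1e-19]
ec2tm_column_sketch!(entu, detu, entd, detd, mflu_ec, mfld_ec, detu_ec, detd_ec)
println((entu, entd))
```

Because the boundary fluxes are zero, the column sums of entrainment and detrainment agree for each direction, which is a convenient sanity check on real data.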
AtmosTransport.Preprocessing.ec2tm_from_rates! Method
ec2tm_from_rates!(entu, detu, entd, detd,
udmf, ddmf, udrf_rate, ddrf_rate,
dz, Nz; stats=nothing) -> nothing
Column-level port of TM5's ECconv_to_TMconv (see deps/tm5/base/src/phys_convec_ec2tm.F90:87-237). Fills the output arrays entu, detu, entd, detd (kg/m²/s at layer centers) in place from raw ERA5 physics inputs. All arrays use AtmosTransport orientation (k=1=TOA, k=Nz=surface).
Inputs
udmf::AbstractVector{FT}, length Nz+1 — updraft mass flux at half levels (kg/m²/s). Half level k is the interface at the TOP of layer k; udmf[Nz+1] is the surface interface. Must be ≥ 0.
ddmf::AbstractVector{FT}, length Nz+1 — downdraft mass flux at half levels, ECMWF convention (≤ 0).
udrf_rate::AbstractVector{FT}, length Nz — updraft detrainment RATE at layer centers (kg/m³/s).
ddrf_rate::AbstractVector{FT}, length Nz — downdraft detrainment RATE at layer centers (kg/m³/s).
dz::AbstractVector{FT}, length Nz — layer thickness (m), positive. Compute from dz_hydrostatic_virtual!.
Nz::Int — number of full levels.
stats::Union{Nothing, NamedTuple} — cleanup-stats counters from TM5CleanupStats(). When nothing, no counter work.
Algorithm (line-for-line match to F90 lines 132-236)
1. Copy-with-clipping (F90 L132-144): |udmf| < 1e-6 → 0, |ddmf| < 1e-6 → 0 (applied as ddmf > -1e-6 → 0), udrf_rate < 1e-10 → 0, ddrf_rate < 1e-10 → 0.
2. dz integration (F90 L146-151): detu = udrf × dz (kg/m³/s → kg/m²/s). Same for detd.
3. uptop/dotop search (F90 L153-173): find the first level from TOA (k=1..Nz) with nonzero flux. If none, zero out everything in that direction.
4. Mass-budget closure (F90 L175-212): for each active direction, entu[k] = udmf[k] - udmf[k+1] + detu[k] from uptop down. Above uptop-1 stays zero.
5. Symmetric negative redistribution (F90 L214-232): if entu[k] < 0, add -entu[k] to detu[k] and zero entu[k]. Same for detu<0 (adds to entu), and the same two for downdraft.
Mass conservation
Within the cloud window, entu[k] - detu[k] - (udmf[k] - udmf[k+1]) should be zero per layer, which is equivalent to the mass-budget closure at step 4. Negative redistribution (step 5) preserves this budget because it only SWAPS between entu↔detu (or entd↔detd) without changing the net entu - detu.
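The claim that negative redistribution leaves the net entu - detu unchanged can be checked with a small sketch of step 5 for one direction. The helper `redistribute_negatives!` is hypothetical and omits the cleanup-stats counters:

```julia
# Hypothetical sketch of the symmetric negative redistribution for one direction.
function redistribute_negatives!(ent::AbstractVector, det::AbstractVector)
    for k in eachindex(ent, det)
        if ent[k] < 0
            det[k] -= ent[k]          # move the deficit to detrainment
            ent[k] = zero(eltype(ent))
        end
        if det[k] < 0
            ent[k] -= det[k]          # move the deficit to entrainment
            det[k] = zero(eltype(det))
        end
    end
    return ent, det
end

ent = [-0.25, 0.5, 0.125]
det = [0.5, -0.25, 0.125]
net_before = ent .- det
redistribute_negatives!(ent, det)
println(ent .- det == net_before)
```

Each branch adds the same amount to one field that it removes from the other, so the per-layer difference is untouched while both fields end up non-negative.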
AtmosTransport.Preprocessing.end_of_day_seed Function
end_of_day_seed(reader::AbstractMetReader) → seed_or_nothing
Return the seed value to thread into the next day's open_reader(..., seed = ...). Type-system guarantees:
end_of_day_seed(::AbstractMetReader{FT, S, NoChain}) returns nothing (statically inferred).
end_of_day_seed(::AbstractMetReader{FT, S, ChainedMass{T}}) returns a value of type T (or throws if the reader has not yet produced the end-of-day endpoint).
Closes foot-gun (D).
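The dispatch pattern behind these guarantees can be illustrated with stand-in types. Everything below is hypothetical (`ToyChainPolicy`, `ToyNoChain`, `ToyChainedMass`, `ToyReader`, `toy_end_of_day_seed`); it mirrors the documented behaviour only in outline and makes no claim about how the package stores its endpoint state.

```julia
# Local stand-ins for the chain-policy tags; not the package's types.
abstract type ToyChainPolicy end
struct ToyNoChain <: ToyChainPolicy end
struct ToyChainedMass{T} <: ToyChainPolicy end

# A toy reader that may or may not have produced its end-of-day endpoint yet.
struct ToyReader{FT, CP <: ToyChainPolicy}
    endpoint::Union{Nothing, Vector{FT}}
end

# No-chain readers never carry a seed.
toy_end_of_day_seed(::ToyReader{FT, ToyNoChain}) where {FT} = nothing

# Chained readers return a value of the seed type T, or throw if it is missing.
function toy_end_of_day_seed(r::ToyReader{FT, ToyChainedMass{T}}) where {FT, T}
    r.endpoint === nothing && error("end-of-day endpoint not produced yet")
    return r.endpoint::T
end

chained = ToyReader{Float32, ToyChainedMass{Vector{Float32}}}(Float32[1, 2])
println(toy_end_of_day_seed(chained))
```

Encoding the policy in a type parameter lets the caller's seed-threading code be checked by dispatch instead of by a runtime flag.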
AtmosTransport.Preprocessing.endpoint_dry_mass! Method
endpoint_dry_mass!(delp_dry, ps_dry, ps_total, qv, vc) -> (delp_dry, ps_dry)
Reconstruct dry DELP and dry PS at one endpoint hour from the moist PS and QV provided by GEOS, using the hybrid coordinate vc. The output is on top-down level convention (k=1 = TOA).
Algorithm:
DELP_full[k] = ΔA[k] + ΔB[k] * PS_total
DELP_dry[k] = (1 - QV[k]) * DELP_full[k]
PS_dry = Σ DELP_dry[k]
In-place: writes into delp_dry and ps_dry.
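For one column, the algorithm reduces to a few lines. The sketch below is an assumption-laden illustration: `endpoint_dry_mass_column!` is a hypothetical name, the hybrid coordinate is passed as plain interface vectors `A` and `B` of length Nz+1 instead of the package's `vc` object, and ΔA[k], ΔB[k] are taken as differences between the bottom and top interface of layer k.

```julia
# Hypothetical single-column sketch, top-down levels (k = 1 is TOA).
function endpoint_dry_mass_column!(delp_dry, ps_total, qv, A, B)
    ps_dry = 0.0                                   # accumulate in Float64
    for k in eachindex(delp_dry)
        delp_full = (A[k + 1] - A[k]) + (B[k + 1] - B[k]) * ps_total
        delp_dry[k] = (1 - qv[k]) * delp_full      # remove the water-vapour share
        ps_dry += delp_dry[k]
    end
    return delp_dry, ps_dry
end

# Two-layer toy column; the coefficients are illustrative only.
A_toy = [0.0, 5000.0, 0.0]
B_toy = [0.0, 0.5, 1.0]
delp, ps_dry = endpoint_dry_mass_column!(zeros(2), 100_000.0, [0.0, 0.01], A_toy, B_toy)
println((delp, ps_dry))
```

Since PS_dry is defined as the sum of the dry layer thicknesses, the identity Σ_k DELP_dry = PS_dry holds by construction in this sketch.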
AtmosTransport.Preprocessing.endpoint_dry_mass Method
Allocate (delp_dry, ps_dry) for one panel and run endpoint_dry_mass!.
AtmosTransport.Preprocessing.era5_convection_hour_address Method
era5_convection_hour_address(hour) -> (use_prev_day, data_time_hhmm, step_range)
Map a UTC hour h ∈ 0..23 to the GRIB header tuple that addresses the matching ECMWF convection forecast sample (see read_era5_n320_convection_window! docstring for the cycle layout). Returns (::Bool, ::Int, ::String).
AtmosTransport.Preprocessing.era5_grib_path Method
era5_grib_path(settings, date, stream) -> String
Resolve the on-disk GRIB path for stream on date. stream must be one of :core, :convection, or :surface. Existence is not checked here — the caller (typically open_era5_day) decides whether a missing file is fatal or merely "no next-day endpoint available".
AtmosTransport.Preprocessing.flush_final_windows! Function
flush_final_windows!(workspace, args...; kwargs...)
Emit any final cross-day/zero-tendency windows once the source stream is exhausted.
AtmosTransport.Preprocessing.geos_collection_path Method
geos_collection_path(settings::GEOSITSettings, date::Date, collection) -> String
Resolve the on-disk path of one GEOS-IT collection for date. Searches a flat root_dir and the per-day root_dir/YYYYMMDD/ layout.
AtmosTransport.Preprocessing.get_era5_physics_field Method
get_era5_physics_field(reader, var::Symbol) -> Array view
Return a zero-allocation reshaped view into the mmap for var ∈ (:udmf, :ddmf, :udrf_rate, :ddrf_rate, :t, :q). Shape (Nlon, Nlat, Nlev, Nt).
AtmosTransport.Preprocessing.has_convection Method
has_convection(settings::AbstractMetSettings) -> Bool
Whether this source can populate RawWindow.cmfmc and RawWindow.dtrain. Defaults to false; sources that support convection override and gate the actual output behind a user flag (e.g. settings.include_convection).
AtmosTransport.Preprocessing.has_surface Method
has_surface(settings::AbstractMetSettings) -> Bool
Whether this source can populate raw PBL surface fields in RawWindow.surface. Sources gate actual reads behind their settings flags.
AtmosTransport.Preprocessing.has_vdiff_fields Method
has_vdiff_fields(settings::AbstractMetSettings) -> Bool
Whether this source can populate RawWindow.vdiff with layer-center fields needed by the GCHP Holtslag-Boville vertical-diffusion data contract.
AtmosTransport.Preprocessing.ingest_window! Function
ingest_window!(workspace, args...; kwargs...) -> nothing
Consume one source/met window into the topology workspace. Ready windows are exposed by drain_ready_windows!.
AtmosTransport.Preprocessing.init_cs_positivity_accumulator Method
init_cs_positivity_accumulator() -> NamedTuple
Zero-valued state for accumulating the worst per-window positivity diagnostic across a preprocessing loop. Pair with update_cs_positivity_accumulator!.
AtmosTransport.Preprocessing.init_ll_positivity_accumulator Method
init_ll_positivity_accumulator() -> NamedTuple
Zero-valued state for accumulating the worst per-window LL positivity diagnostic across a preprocessing loop. Pair with update_ll_positivity_accumulator.
AtmosTransport.Preprocessing.init_rg_positivity_accumulator Method
init_rg_positivity_accumulator() -> NamedTuple
Zero-valued state for accumulating the worst per-window RG positivity diagnostic across a preprocessing loop.
AtmosTransport.Preprocessing.load_hybrid_coefficients Method
load_hybrid_coefficients(coeff_path::String) -> HybridSigmaPressure
Load all hybrid sigma-pressure interface coefficients from a TOML file. Unlike load_era5_vertical_coordinate, this does not slice — useful for sources whose level count comes from the file rather than a config knob (e.g. GEOS-72, MERRA-2, native ERA5 L137 without sub-tropo selection).
AtmosTransport.Preprocessing.load_met_settings Method
load_met_settings(toml_path::String; root_dir, kwargs...) -> AbstractMetSettings
Construct a typed met-source descriptor from toml_path. The TOML's [source].name key picks the concrete settings type, and per-type _build_met_settings methods consume the source-specific TOML tables.
root_dir is the on-disk directory holding the source's daily files. Additional kwargs are forwarded to the constructor and override any values supplied by the TOML.
AtmosTransport.Preprocessing.log_tm5_cleanup_stats Method
log_tm5_cleanup_stats(stats, date_str)
Pretty-print the per-day TM5 cleanup counters produced by TM5CleanupStats(). Zero-valued counters are omitted to keep the output compact; a fully-clean day yields a single line.
AtmosTransport.Preprocessing.mass_basis_from_symbol Method
mass_basis_from_symbol(s::Symbol) -> AbstractMassBasis
Construct the matching basis singleton from a header Symbol. Throws ArgumentError for unknown values.
AtmosTransport.Preprocessing.mass_basis_symbol Method
mass_basis_symbol(::AbstractMassBasis) -> Symbol
Map a basis singleton (State.DryBasis / State.MoistBasis) to the on-disk binary-header value (:dry / :moist). Inverse of mass_basis_from_symbol.
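The singleton-to-symbol round trip can be sketched with local stand-in types. The names below (`ToyMassBasis`, `ToyDryBasis`, `ToyMoistBasis`, `toy_basis_symbol`, `toy_basis_from_symbol`) are hypothetical and only mirror the documented mapping:

```julia
# Local stand-ins for the basis singletons; not the package's State types.
abstract type ToyMassBasis end
struct ToyDryBasis <: ToyMassBasis end
struct ToyMoistBasis <: ToyMassBasis end

# Singleton -> header symbol.
toy_basis_symbol(::ToyDryBasis) = :dry
toy_basis_symbol(::ToyMoistBasis) = :moist

# Header symbol -> singleton, rejecting unknown values.
function toy_basis_from_symbol(s::Symbol)
    s === :dry && return ToyDryBasis()
    s === :moist && return ToyMoistBasis()
    throw(ArgumentError("unknown mass basis: $s"))
end

println(toy_basis_symbol(toy_basis_from_symbol(:dry)))
```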
AtmosTransport.Preprocessing.merge_tm5_field_3d! Method
merge_tm5_field_3d!(merged, native, merge_map)
Accumulate a native-level TM5 field (Nlon, Nlat, Nz_native) onto merged output levels (Nlon, Nlat, Nz_merged) using the native-to-merged level map. Matches the semantics of merge_cell_field! for mass/flux fields: sum native layers that map to the same merged layer.
For TM5 entrainment/detrainment fluxes (kg/m²/s), summing over a consolidated layer preserves the column-integrated mass budget exactly.
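The summation over a level map can be sketched as below. This is an illustration, not the package's method: `merge_levels_sketch!` is a hypothetical name, and zeroing `merged` before accumulating is an assumption about the intended semantics.

```julia
# Hypothetical sketch: sum native layers into merged layers via a level map,
# where merge_map[k] is the merged index that native level k belongs to.
function merge_levels_sketch!(merged, native, merge_map)
    fill!(merged, zero(eltype(merged)))   # assumed: start from a clean accumulator
    for k in axes(native, 3)
        km = merge_map[k]
        @views merged[:, :, km] .+= native[:, :, k]
    end
    return merged
end

native = reshape(collect(1.0:16.0), 2, 2, 4)   # (Nlon, Nlat, Nz_native)
merge_map = [1, 1, 2, 2]                       # four native levels -> two merged levels
merged = merge_levels_sketch!(zeros(2, 2, 2), native, merge_map)
println(sum(merged) == sum(native))
```

Each native value is added to exactly one merged slot, which is why the column-integrated total is preserved.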
AtmosTransport.Preprocessing.n320_cell_areas Method
n320_cell_areas(source_grid) -> Vector{Float64}
Materialise per-cell areas (m²) for the N320 source mesh. Always returns Vector{Float64} regardless of source_grid's element type — downstream dry-mass arithmetic runs in Float64 for precision and the per-cell mesh quadrature is itself Float64. Cached by callers that derive dry mass for many windows of the same day.
AtmosTransport.Preprocessing.native_vertical Function
native_vertical(reader::AbstractMetReader) → HybridSigmaPressure{FT_v}
Native vertical coordinate of the source data (NOT the merged/output coordinate). The vertical-transform axis (P0b) consumes this to plan its mapping.
AtmosTransport.Preprocessing.open_day Function
open_day(settings::AbstractMetSettings, date::Date; ...) -> ctx
Open per-day source-specific context (file handles, caches, vertical coefficients, …). The orchestrator calls this once per day at the start of process_day, threads ctx through every read_window! call, and calls close_day!(ctx) in a finally block.
ctx is opaque to the orchestrator — only the source knows its layout.
AtmosTransport.Preprocessing.open_day Method
open_day(settings::GEOSSettings, date::Date; next_day_handle=true) -> GEOSDayHandles
Canonical-contract alias for open_geos_day. The orchestrator calls this once per day and threads the returned handles through every per-window read_window!.
AtmosTransport.Preprocessing.open_era5_day Method
open_era5_day(settings, date; next_day_handle=true) -> ERA5GRIBDayHandles
Resolve the GRIB stream paths for date and assert that today's required files are on disk. When next_day_handle=true and <date+1> has a core GRIB available, records its path so the last window's right endpoint can be read from the next day's hour-0 fields.
Errors are explicit about which file is missing so a misconfigured root_dir or an incomplete download is immediately visible.
AtmosTransport.Preprocessing.open_era5_physics_binary Method
open_era5_physics_binary(bin_dir, date) -> ERA5PhysicsBinaryReader
Open the BIN for date under bin_dir (with YYYY subdir), mmap the payload, and parse the header. The caller must call close_era5_physics_binary when done (or wrap the work in a try/finally).
AtmosTransport.Preprocessing.open_geos_day Method
open_geos_day(settings, date; next_day_handle=true) -> GEOSDayHandles
Open per-collection NCDataset handles for date. When next_day_handle is true and the next-day CTM_I1 file exists, opens it too so the last window of date has its right endpoint available.
AtmosTransport.Preprocessing.open_reader Function
open_reader(settings::AbstractMetSettings, date::Date, ::Type{FT};
seed = nothing, next_day_handle::Bool = true)
→ reader::AbstractMetReader{FT, typeof(settings), CP}
Construct a typed met-reader for one calendar day. The reader owns the underlying day-handle context (file handles, vertical coefficients, chained-mass seed) and exposes the per-window trait surface.
seed carries cross-day mass state for chained-mass readers; pass the return value of end_of_day_seed(prev_day_reader). For NoChain readers, seed must be nothing.
next_day_handle controls whether the reader opens a handle into the next day's first-hour instantaneous file. Required when the day produces endpoint mass for the last window (the standard case for hourly sources).
Dispatch by settings type to select the concrete reader; concrete readers register their own method.
AtmosTransport.Preprocessing.open_reader Method
open_reader(settings::ERA5SpectralSettings, date::Date, ::Type{FT};
seed = nothing, next_day_handle::Bool = true)
P0a: opens the spectral day's GRIB and caches the typed nominal. seed/next_day_handle are accepted for signature parity with the GEOS constructor; the spectral path's "next-day endpoint" is handled by the existing next_day_hour0 helper in configuration.jl:349 until P2.
AtmosTransport.Preprocessing.plan_vertical Function
plan_vertical(transform::AbstractVerticalTransform,
native_vc::HybridSigmaPressure{FT}) → VerticalPlan{FT, typeof(transform)}
Materialize the planned vertical-coordinate mapping for one day's run. Called once per day (or once per run if native_vc is invariant); the returned plan is reused across all windows.
AtmosTransport.Preprocessing.process_day Method
process_day(date::Date, grid::AbstractTargetGeometry, settings, vertical; next_day_hour0=nothing)
Topology-specific daily transport-binary preprocessor extension point.
Concrete target geometries implement this method with ordinary Julia multiple dispatch:
LatLonTargetGeometry writes structured directional LL binaries.
ReducedGaussianTargetGeometry writes face-indexed RG binaries.
CubedSphereTargetGeometry writes panel-local CS binaries.
Every implementation must satisfy the same transport contract:
use explicit forward endpoint mass targets for every window, including the final cross-day window when next_day_hour0 is available;
write declared payload semantics, including delta_semantics;
run a write-time replay check unless explicitly disabled for diagnostics;
produce binaries that the runtime driver can load with replay validation.
This fallback rejects unsupported source/target pairs after config parsing has already produced an AbstractTargetGeometry.
AtmosTransport.Preprocessing.process_day Method
process_day(date, grid::CubedSphereTargetGeometry, settings::AbstractERA5GRIBSettings,
vertical; out_path, FT, mass_basis, dt_met_seconds, positivity_cfl_limit,
kwargs...)
Adapter that the unified preprocessor CLI calls into. Forwards to process_era5_n320_to_cs_day with the kwargs the underlying function actually accepts; the rest of the unified-CLI day-kwargs (e.g. chain_mass, adaptive_substeps, min_steps_per_window, seed_m) are absorbed by the trailing kwargs... and silently ignored — ERA5 N320 has no day-to-day mass-chain state, and the writer currently uses a fixed substep count rather than the adaptive policy.
Returns (; final_m = nothing) so the unified CLI's seed_m = get(result, :final_m, nothing) chain remains a no-op.
AtmosTransport.Preprocessing.process_day Method
process_day(date, grid::CubedSphereTargetGeometry,
settings::AbstractGEOSSettings, vertical;
out_path,
dt_met_seconds = 3600.0,
FT = Float64,
mass_basis = :dry,
replay_tol = replay_tolerance(FT),
seed_m = nothing,
next_day_hour0 = nothing,
chain_mass = true) -> NamedTuple
Build a v4 cubed-sphere transport binary at out_path from one UTC day of native GEOS data. Source mesh and target mesh must match (CS passthrough).
Stored mass targets the raw GEOS dry endpoint (DELP_dry) transformed to the output vertical grid. The native horizontal fluxes are column-balanced to that endpoint, then cm is diagnosed so the replay and positivity contracts are checked against the same endpoint the runtime will see.
For multi-day preprocessing with chain_mass = true, seed_m carries the raw endpoint from the previous day so adjacent daily binaries share a boundary mass: pass nothing (default) on day 1 to seed from raw GEOS DELP_dry, and on day N+1 pass the final_m returned by day N's process_day. With chain_mass = false, seed_m is ignored and every window reinitializes from raw GEOS mass.
When chain_mass = true, the returned NamedTuple includes final_m::NTuple{6, Array{FT, 3}}, the raw-endpoint state at the END of the last window. With chain_mass = false, final_m is nothing.
next_day_hour0 is part of the inherited topology-dispatch contract but unused — the GEOS reader handles next-day endpoints internally via next_ctm_i1.
AtmosTransport.Preprocessing.process_day Method
process_day(date, grid::CubedSphereTargetGeometry, settings, vertical; ...)
Spectral→CS transport binary: spectral synthesis to an internal LL staging grid, conservative regridding to CS panels, endpoint continuity closure, and streaming binary write. No on-disk LL intermediate.
AtmosTransport.Preprocessing.process_day Method
process_day(date, grid::LatLonTargetGeometry, settings, vertical;
next_day_hour0=nothing, positivity_cfl_limit=0.95,
require_substep_positivity=true)
Run the full one-day preprocessing workflow for the structured lat-lon target: read spectral input, process all windows, close continuity against forward mass endpoints, and write the final binary.
AtmosTransport.Preprocessing.process_day Method
process_day(date, grid::ReducedGaussianTargetGeometry, settings, vertical;
next_day_hour0=nothing, positivity_cfl_limit=0.95,
require_substep_positivity=true)
Streaming one-day preprocessing for reduced-Gaussian targets.
Uses a 2-window sliding buffer: at any time only two windows' worth of (m, hflux, cm, ps) are held in memory. Each window is Poisson-balanced and written to disk before the next pair is computed. This reduces peak memory from O(Nt) to O(1) and enables O160/O320 binary generation.
Pipeline per window: spectral synthesis → mass fix → level merge → (wait for next window) → Poisson balance using (m_cur, m_next) → cm recomputation → stream-write
AtmosTransport.Preprocessing.process_day Method
process_day(cfg::Dict; day_override=nothing, start_date=nothing, end_date=nothing)
Top-level TOML-driven preprocessor entry. Detects source type from cfg:
[source].toml = "config/met_sources/<source>.toml" → typed AbstractMetSettings path, supports cross-day state carry (e.g. GEOS pressure-fixer chained mass) and --start/--end date ranges.
otherwise → typed ERA5 spectral config path ([input].spectral_dir).
Both paths converge on process_day(date, grid::AbstractTargetGeometry, settings, vertical; ...) for the per-day work. There is no parallel GEOS-only or per-source CLI — new sources plug in via AbstractMetSettings + load_met_settings.
AtmosTransport.Preprocessing.process_era5_n320_to_cs_day Method
process_era5_n320_to_cs_day(date, settings, target_grid;
out_path,
FT = Float32,
mass_basis = :dry,
Nz = ERA5_NATIVE_LEVEL_COUNT,
dt_met_seconds = 3600.0,
steps_per_window = 8,
cs_balance_tol = 1e-14,
cs_balance_project_every = 50,
positivity_cfl_limit = 0.95,
cache_dir = nothing,
include_convection = false)
Generate a v4 cubed-sphere transport binary for one UTC date from the ERA5 native-GRIB source described by settings, written to out_path.
mass_basis is fixed to :dry here — the writer pulls dry-basis layer mass and dry surface pressure (re-derived on C180 from regridded PS + Q). A :moist request would need the moist-basis runtime contract, which is not the project's runtime default (feedback_dry_basis_default.md).
steps_per_window controls the number of Strang substeps per met window written into the binary. Each am / bm per-face slot stores the substep-mass amount; the runtime CFL is cfl = am[i,j,k] / m[i-1,j,k] per substep so a larger value softens the per-substep CFL at the cost of a larger binary.
The function stages writes to out_path.tmp and promotes to out_path on success — a partial run leaves no usable file at the requested path.
AtmosTransport.Preprocessing.process_era5_n320_window! Method
process_era5_n320_window!(pipeline, handles, date, hour) -> pipeline
Drive the per-window pipeline for (date, hour):
Synthesise PS / U / V / T / Q on the N320 source mesh (breakpoint B).
Derive dry-air mass + DELP_dry + PS_dry on the source mesh (breakpoint C).
Optionally read UDMF / DDMF / UDRF / DDRF for the matching forecast sample (breakpoint E), if the pipeline was built with include_convection = true.
Conservatively regrid PS / U / V / T / Q to the C-tier target (breakpoint D).
After the call, pipeline.window_fields, pipeline.dry_fields, pipeline.convection_fields, and pipeline.c180_fields carry the window's data on their respective grids.
AtmosTransport.Preprocessing.promote_streaming_binary! Function
promote_streaming_binary!(writer)
Close and promote a staged binary file into its final path.
AtmosTransport.Preprocessing.quarantine_streaming_binary! Function
quarantine_streaming_binary!(writer)
Close and remove a staged binary file after a failed validation pass.
AtmosTransport.Preprocessing.read_era5_n320_convection_window! Method
read_era5_n320_convection_window!(fields, handles, mesh, date, hour) -> fields
Fill fields with one hour's worth of N320 convection forecast fields for (date, hour). The reader picks handles.convection_path or handles.prev_convection_path based on the hour-address mapping and forward-iterates that file, matching messages whose (dataTime, stepRange, shortName) triple falls within the requested sample. Each matching message is decoded directly into the appropriate (n_cells, Nz) output slot via _reorder_grib_reduced_gg_to_mesh!.
Completeness gates name the missing field in any error so a corrupt or partial download is immediately visible. All four fields × 137 levels must be present for the call to succeed.
AtmosTransport.Preprocessing.read_era5_n320_window_fields! Method
read_era5_n320_window_fields!(fields, workspace, handles, date, hour) -> fields
Fill fields with one window's worth of N320 source-grid fields for (date, hour). Performs one forward pass over handles.core_path, decoding every message whose (dataDate, dataTime) matches into the workspace's level-indexed spectral cubes (for gridType=sh) or directly into the output qv array (for gridType=reduced_gg). After the pass the function synthesizes spectral T per level, applies vod2uv! per level and synthesizes U/V, and synthesizes LNSP → PS. Errors loudly if any required level/field is absent so a partial download is immediately visible.
The function does not allocate beyond read_buf resizing inside read_spectral_coeffs!. Reusing (fields, workspace) across the 24 windows of a day is the intended call pattern.
AtmosTransport.Preprocessing.read_window! Function
read_window!(raw::RawWindow, settings::AbstractMetSettings, ctx,
date::Date, win_idx::Int) -> raw
Fill raw in place with one window of source data on the source's native grid. ctx is the day context returned by open_day. Implementations must NOT allocate per call beyond bounded scratch (raw is a pre-allocated workspace owned by the orchestrator) and should be idempotent in (date, win_idx).
AtmosTransport.Preprocessing.read_window! Method
read_window!(raw, settings, handles, date, win_idx) -> raw
Fill `raw` in place with one window of GEOS data on the source CS grid.
Both endpoints (t_n, t_{n+1}) carry dry DELP + dry PS reconstructed from
PS_total via the hybrid coordinate, plus the original QV. Dynamics-step
`am`/`bm` are MFXC/MFYC scaled by `1/mass_flux_dt`.
The signature matches the canonical AbstractMetSettings contract declared in met_sources.jl::read_window!.
AtmosTransport.Preprocessing.regrid_ll_binary_to_cs Method
regrid_ll_binary_to_cs(ll_binary_path, cs_grid, out_path; FT=Float64, mass_basis=nothing)
Regrid an existing LL transport binary to a cubed-sphere binary.
Reads each window from the LL binary, recovers cell-center winds from am/bm, conservatively regrids m/ps/winds to CS panels, rotates winds to panel-local coordinates, reconstructs CS face fluxes, closes continuity against forward mass endpoints, and stream-writes the CS binary.
This reuses the entire CS regrid/continuity/write infrastructure from the spectral→CS path — the only difference is the data source (binary reader instead of spectral synthesis).
Timestep metadata (dt_met_seconds, steps_per_window) is read directly from the source header by default. Passing steps_per_window overrides the output substep count: LL winds are recovered with the source scaling, then CS face fluxes are reconstructed with the output scaling.
Keyword arguments
FT::Type = Float64: on-disk float type for the output CS binary.
mass_basis::Union{Nothing, Symbol} = nothing: output mass-basis label. nothing (default) matches the source. Setting this to a value that differs from the source's mass_basis currently errors: actual dry↔moist conversion requires loading the source's qv and applying apply_dry_basis_native!, which this function does not do. Invariant 14 mandates :dry end-to-end; use a dry source.
steps_per_window::Union{Nothing, Integer} = nothing: output substep count. nothing matches the source; larger values reduce stored per-substep fluxes while preserving the window-integrated transport.
allow_terminal_zero_tendency::Bool = false: diagnostic-only escape hatch for legacy LL sources that do not carry dm. Production-safe regrids should leave this at false so the final CS window is closed against an explicit endpoint target instead of an inferred zero-tendency fallback.
run_cache = nothing: optional PreprocessorRunCache used to reuse the LL→CS conservative regridder across calls in the same preprocessing run.
AtmosTransport.Preprocessing.regrid_n320_to_c180! Method
regrid_n320_to_c180!(c180_fields, n320_window, workspace, target_grid) -> c180_fields
Conservatively regrid PS (2D) and U, V, T, Q (3D) from the N320 source mesh to the C180 cubed-sphere target. All five fields are intensive scalars; the regridder weights apply directly. The flat scratch buffers in workspace hold the intermediate Float64 arrays so the call is allocation-free after warm-up.
AtmosTransport.Preprocessing.regrid_transport_binary Method
regrid_transport_binary(input_path, target_grid, out_path; kwargs...)
Generic transport-binary regrid extension point. Concrete target/source combinations implement methods here instead of adding new topology-specific scripts. The currently implemented production pair is LL transport binary to cubed sphere.
AtmosTransport.Preprocessing.reset_workspace! Function
reset_workspace!(workspace, day_state) -> workspace
Reset a reusable workspace before ingesting a new day/source stream.
AtmosTransport.Preprocessing.resolve_tm5_convection_settings Method
resolve_tm5_convection_settings(cfg) -> NamedTuple
Parse the optional [tm5_convection] section. When enable=true the preprocessor reads ERA5 physics binaries (built by convert_era5_physics_nc_to_bin, plan 24 Commit 2), computes TM5 entu/detu/entd/detd per hour via tm5_native_fields_for_hour! (plan 24 Commit 3), merges to the transport Nz, conservatively regrids to the target horizontal grid, and writes the four TM5 sections into the transport binary.
Fields:
tm5_convection_enable :: Bool: master switch.
tm5_physics_bin_dir :: String: NVMe directory holding era5_physics_YYYYMMDD.bin files produced by Commit 2's converter. Empty when disabled.
AtmosTransport.Preprocessing.run_unified_preprocessor_day! Method
run_unified_preprocessor_day!(day::UnifiedPreprocessorDay; close_reader=true)
Execute the additive unified-driver lifecycle for one day. The function is generic over concrete reader/workspace/contract/writer types and depends on hook methods for topology-specific ingest/drain/flush behavior.
The writer is closed before summarize_status!, so fatal positivity summaries can quarantine a closed staging file. Any exception before promotion closes and quarantines the writer, then closes the reader.
AtmosTransport.Preprocessing.set_end_of_day_seed! Method
set_end_of_day_seed!(reader::GEOSNativeReader, seed) → reader
Set the end-of-day mass seed produced by the orchestrator's last window. Called once per day at the end of process_day. For NoChain readers this is a no-op.
AtmosTransport.Preprocessing.source_grid Function
source_grid(settings::AbstractMetSettings)
Return the source grid mesh that read_window! produces data on. Used by the orchestrator to build the appropriate regridder for the target grid (or to detect the source==target passthrough case).
AtmosTransport.Preprocessing.source_grid Method
source_grid(settings::GEOSSettings) -> CubedSphereMesh
The native source mesh GEOS data is archived on (Nc × Nc per panel, GEOS-native panel convention).
AtmosTransport.Preprocessing.summarize_cs_positivity_status Method
summarize_cs_positivity_status(worst; cfl_limit, steps_per_window,
require_substep_positivity = true,
quarantine_path = nothing)
Post-loop summary helper. Logs the worst-window outcome, and if it exceeds cfl_limit:
deletes quarantine_path (if given) so a downstream consumer cannot pick up the half-written binary;
errors when require_substep_positivity = true, otherwise warns.
The error message includes a recommended steps_per_window value that would satisfy the gate, computed from the observed worst ratio.
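One way such a recommendation could be derived is sketched below. It assumes the worst ratio scales inversely with the substep count, which is an assumption of this sketch rather than the documented formula; `recommended_steps` is a hypothetical name.

```julia
# Hypothetical helper: smallest substep count that would bring the observed
# worst ratio under the limit.
# Assumption: the ratio scales as 1 / steps_per_window.
function recommended_steps(worst_ratio::Real, cfl_limit::Real, steps_per_window::Integer)
    isfinite(worst_ratio) || return nothing          # no finite count can rescue Inf/NaN
    worst_ratio <= cfl_limit && return Int(steps_per_window)
    return ceil(Int, steps_per_window * worst_ratio / cfl_limit)
end

recommended_steps(1.5, 0.75, 4)   # gate violated: a larger count is suggested
recommended_steps(0.5, 0.75, 4)   # gate satisfied: the current count is kept
recommended_steps(Inf, 0.75, 4)   # invalid mass or flux: returns nothing
```

Returning nothing for a non-finite ratio reflects that an invalid mass or flux cannot be fixed by subcycling.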
AtmosTransport.Preprocessing.summarize_ll_positivity_status Method
summarize_ll_positivity_status(worst; cfl_limit, steps_per_window,
require_substep_positivity = true,
quarantine_path = nothing)
Post-loop summary helper for the LL positivity accumulator. Logs the worst-window outcome, and if it exceeds cfl_limit:
deletes quarantine_path (if given) so a downstream consumer cannot pick up the half-written binary;
errors when require_substep_positivity = true, otherwise warns.
The error/warn message includes a recommended steps_per_window value that would satisfy the gate, computed from the observed worst ratio. The "no representable rescue" branch from CS round-3 is mirrored here.
AtmosTransport.Preprocessing.summarize_rg_positivity_status Method
summarize_rg_positivity_status(worst; cfl_limit, steps_per_window,
require_substep_positivity = true,
quarantine_path = nothing)
Post-loop summary helper for the RG positivity accumulator. Mirrors summarize_cs_positivity_status (CS round-2 + round-3 escape-hatch semantics).
AtmosTransport.Preprocessing.summarize_status! Function
summarize_status!(contract::AbstractWindowContract;
quarantine_path::Union{Nothing, AbstractString} = nothing)
-> nothing
Post-loop summary helper. Logs the worst-window outcome. If the worst window exceeds the gate and the contract's policy requires positivity, deletes quarantine_path (if given) and errors. If the gate is violated but the policy is set to record-and-continue, warns instead.
AtmosTransport.Preprocessing.summarize_status! Method
summarize_status!(contract::CubedSphereContract;
quarantine_path::Union{Nothing, AbstractString} = nothing)
Run the CS positivity post-loop summary using the contract's worst accumulator and policy fields. May log, warn, or error depending on the accumulator state and require_substep_positivity.
AtmosTransport.Preprocessing.target_summary Method
target_summary(grid) -> String
Return a short human-readable summary of the configured target grid.
AtmosTransport.Preprocessing.tm5_copy_or_regrid_ll! Method
tm5_copy_or_regrid_ll!(dst_3d, ws_field, ws)
LL phase-2 helper: copy (identity, when shapes match) or regrid (conservative) a single source-grid merged TM5 field into the per-window target array dst_3d of shape (Nx, Ny, Nz).
AtmosTransport.Preprocessing.tm5_native_fields_for_hour! Method
tm5_native_fields_for_hour!(entu, detu, entd, detd,
udmf_hour, ddmf_hour, udrf_hour, ddrf_hour,
t_hour, q_hour, ps_hour,
ak_full, bk_full, Nz_native;
stats=nothing, scratch=nothing) -> nothing
Grid-level entry point: for each column (i, j) in the 2D grid, compute dz from (T, Q, ps) then call ec2tm_from_rates!. Writes results into the 3D (Nlon, Nlat, Nz_native) output arrays.
All input 3D arrays are native-level (137 layers for ERA5); 2D ps_hour is surface pressure in Pa. stats counters are bumped across all columns. scratch is an optional 4-tuple of length-Nz_native vectors to avoid per-column allocation; when nothing, fresh ones are allocated inside.
AtmosTransport.Preprocessing.update_accumulator! Function
update_accumulator!(contract::AbstractWindowContract,
positivity_diag, win_idx::Int) -> nothing
Fold one window's positivity diagnostic into the contract's worst-window accumulator. Concrete contracts mutate their internal state and return nothing. Idempotent if positivity_diag.ratio does not exceed the current worst.
AtmosTransport.Preprocessing.update_accumulator! Method
update_accumulator!(contract::CubedSphereContract, positivity_diag, win_idx::Int)
Fold one window's positivity diagnostic into the CS contract's worst-window accumulator. Mutates contract.worst.
AtmosTransport.Preprocessing.update_cs_positivity_accumulator Method
update_cs_positivity_accumulator(worst, diag, win_idx) -> NamedTuple
Return an updated accumulator from a fresh per-window diagnostic.
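A non-mutating worst-window fold of this shape can be sketched as below. The accumulator layout (`ratio`, `direction`, `location`, `window`) and the names `empty_worst` and `fold_worst` are hypothetical; the real NamedTuple fields may differ.

```julia
# Hypothetical accumulator layout, for illustration only.
empty_worst() = (ratio = 0.0, direction = nothing, location = nothing, window = 0)

# Return a new accumulator; keep the old one unless this window is strictly worse.
function fold_worst(worst::NamedTuple, diag::NamedTuple, win_idx::Integer)
    diag.ratio > worst.ratio || return worst
    return (ratio = diag.ratio, direction = diag.direction,
            location = diag.location, window = Int(win_idx))
end

# Three illustrative per-window diagnostics.
diags = [(ratio = 0.40, direction = :x, location = (1, 3, 7, 2)),
         (ratio = 0.97, direction = :z, location = (4, 9, 1, 30)),
         (ratio = 0.55, direction = :y, location = (2, 2, 2, 2))]

worst = foldl((acc, (idx, d)) -> fold_worst(acc, d, idx), enumerate(diags);
              init = empty_worst())
```

Because the fold returns a fresh NamedTuple, the caller decides where the running state lives, which fits a post-loop summary that only needs the final value.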
AtmosTransport.Preprocessing.update_ll_positivity_accumulator Method
update_ll_positivity_accumulator(worst, diag, win_idx) -> NamedTuple
Return an updated LL accumulator from a fresh per-window diagnostic.
AtmosTransport.Preprocessing.update_rg_positivity_accumulator Method
update_rg_positivity_accumulator(worst, diag, win_idx) -> NamedTuple
Return an updated RG accumulator from a fresh per-window diagnostic.
AtmosTransport.Preprocessing.verify_boundary_stub_flux_rg Method
verify_boundary_stub_flux_rg(hflux, face_left, face_right;
tol = 0.0) -> NamedTuple
Explicit-invariant scan: any non-zero hflux value on a boundary stub (face_left ≤ 0 or face_right ≤ 0) is a contract violation. The runtime advection silently discards such fluxes (StrangSplitting.jl:279), so a writer that produces them is emitting data the runtime cannot apply. This is almost always a sign-flip or boundary-masking bug in preprocessing.
Returns (violated, worst_flux, worst_face, worst_level):
violated :: Bool: true iff any |flux| > tol on a boundary stub.
worst_flux :: Float64: signed value of the worst-magnitude violation, or 0.0 if none.
worst_face :: Int: face index of the worst violation, or 0.
worst_level :: Int: k-index of the worst violation, or 0.
tol is the absolute tolerance below which a "near-zero" stub flux is permitted. Default 0.0 (strict) — RG writers should explicitly zero boundary stubs.
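The scan described above can be sketched in a few lines. This is an illustrative reimplementation under stated assumptions, not the package source: `scan_boundary_stubs` is a hypothetical name, and hflux is assumed to be a (n_faces, Nz) matrix.

```julia
# Hypothetical sketch of a boundary-stub scan.
# Assumption: hflux[f, k] is the flux through face f at level k.
function scan_boundary_stubs(hflux::AbstractMatrix,
                             face_left::AbstractVector{<:Integer},
                             face_right::AbstractVector{<:Integer}; tol::Real = 0.0)
    size(hflux, 1) == length(face_left) == length(face_right) ||
        throw(DimensionMismatch("hflux rows must match the face tables"))
    worst_flux, worst_face, worst_level = 0.0, 0, 0
    for k in axes(hflux, 2), f in axes(hflux, 1)
        (face_left[f] <= 0 || face_right[f] <= 0) || continue   # skip interior faces
        flux = Float64(hflux[f, k])
        # Record the largest-magnitude stub flux that exceeds the tolerance.
        if abs(flux) > tol && abs(flux) > abs(worst_flux)
            worst_flux, worst_face, worst_level = flux, f, k
        end
    end
    return (violated = worst_face != 0, worst_flux = worst_flux,
            worst_face = worst_face, worst_level = worst_level)
end

# Face 2 is a stub (face_left = 0) and carries non-zero flux on both levels.
hflux = [0.0 0.0; 2.0 -3.0; 1.0 1.0]
result = scan_boundary_stubs(hflux, [1, 0, 2], [2, 1, 3])
```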
AtmosTransport.Preprocessing.verify_cs_window_contract! Method
verify_cs_window_contract!(m_cur, am, bm, cm, m_next, steps_per_window, win_idx;
replay_tol, positivity_cfl_limit = 0.95, halo_width = 0)
Single canonical per-window CS binary contract check. Runs the replay gate (verify_write_replay_cs!, errors on failure) followed by the per-substep positivity scan (verify_substep_positivity_cs!, returns a diagnostic). Every CS-producing preprocessor (spectral, regrid, GEOS-native) should call this so no path can silently skip a gate.
Returns (; replay, positivity) with both diagnostics. Positivity is non-fatal here — callers aggregate the worst window and pass it to summarize_cs_positivity_status after the loop, where the run-level require_substep_positivity policy decides whether to error or warn.
AtmosTransport.Preprocessing.verify_ll_window_contract! Method
verify_ll_window_contract!(m_cur, am, bm, cm, m_next, steps_per_window, win_idx;
replay_tol, positivity_cfl_limit = 0.95,
div_scratch = nothing)
Single canonical per-window LL binary contract check. Runs the replay gate (errors on failure) followed by the per-substep positivity scan (verify_substep_positivity_ll!, returns a diagnostic).
div_scratch may be pre-allocated by the caller (workspace-owned scratch from P2) to suppress the per-window Array{Float64} the default-allocating verify_window_continuity_ll would otherwise produce. Default nothing → allocate locally.
Returns (; replay, positivity). Positivity is non-fatal here — callers aggregate the worst window across the loop and pass it to summarize_ll_positivity_status where the run-level require_substep_positivity policy decides whether to error or warn.
AtmosTransport.Preprocessing.verify_rg_window_contract! Method
verify_rg_window_contract!(m_cur, hflux, cm, m_next, face_left, face_right,
steps_per_window, win_idx;
replay_tol, positivity_cfl_limit = 0.95,
div_scratch = nothing,
outgoing_h = nothing, bad_h = nothing,
boundary_stub_tol = 0.0)
Single canonical per-window RG binary contract check. Runs three gates in order:
1. Boundary-stub flux gate: errors hard if any boundary stub (face_left ≤ 0 / face_right ≤ 0) carries non-zero hflux above boundary_stub_tol. No require_* escape hatch: such fluxes are silently discarded by the runtime (StrangSplitting.jl:279), so emitting them is always a writer bug.
2. Replay gate: verify_window_continuity_rg; errors on failure.
3. Per-substep positivity scan: verify_substep_positivity_rg!, returns a non-fatal diagnostic; the run-level accumulator + summarize_rg_positivity_status decides fatal-vs-warn.
div_scratch, outgoing_h, bad_h may be pre-allocated by the caller (workspace-owned scratch from P2) to suppress per-window allocation. Default nothing → allocate locally.
Returns (; replay, positivity). Boundary-stub failure does not return; it errors out before the replay gate so a broken writer cannot silently emit a binary the runtime would partially evaluate.
AtmosTransport.Preprocessing.verify_substep_positivity_cs! Method
verify_substep_positivity_cs!(m, am, bm, cm; cfl_limit = 0.95,
halo_width = 0, m_next = nothing)
Verify the per-substep horizontal+vertical positivity contract that the runtime's _cs_static_subcycle_count depends on. For every interior cell on every panel:
The cell air mass itself must be positive (m > 0). A non-positive cell mass is an immediate contract violation: the runtime divides by m and would produce Inf or NaN in the CFL scan. Such a cell is reported with ratio = Inf regardless of flux magnitude.
The combined Strang-palindrome outgoing budget 2 * (out_x + out_y + out_z) must not exceed cfl_limit * m_ref, where m_ref = min(m, m_next) when m_next is supplied by a caller that wants endpoint tightening, and m_ref = m otherwise. The factor of 2 is required because the CS runtime applies the direction sequence X-Y-Z-Z-Y-X for every met substep.
Returns a NamedTuple (direction, ratio, location, ok):
direction :: Union{Symbol, Nothing}: the dominant contributor among :x, :y, :z, or nothing when no cell was inspected.
ratio :: Float64: worst observed palindrome outgoing budget over the reference mass, or Inf if any inspected cell had invalid mass or flux.
location :: NTuple{4, Int}: (panel, i, j, k) of the worst cell.
ok :: Bool: true iff ratio <= cfl_limit.
The replay gate (verify_write_replay_cs!) only checks endpoint continuity. A binary that drives a cell mass negative mid-sweep can still pass replay because the cell re-fills from inflow before the window ends — but the runtime cannot recover. This gate is the actual contract the runtime depends on.
halo_width defaults to 0 (panel arrays are stored unhaloed at preprocess time); pass > 0 to scan only the interior of a haloed buffer.
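For a single cell, the palindrome budget test can be sketched as follows. `palindrome_ratio` is a hypothetical helper that takes the per-direction outgoing masses as already computed; how the real function derives them from am / bm / cm is not shown here.

```julia
# Hypothetical per-cell helper: ratio of the Strang-palindrome outgoing budget
# to the reference mass. Inputs out_x, out_y, out_z are the outgoing masses per
# substep in each direction (assumed non-negative).
function palindrome_ratio(m::Real, out_x::Real, out_y::Real, out_z::Real;
                          m_next = nothing)
    m_ref = m_next === nothing ? m : min(m, m_next)
    # Non-positive or non-finite reference mass is reported as an infinite ratio.
    (isfinite(m_ref) && m_ref > 0) || return Inf
    budget = 2 * (out_x + out_y + out_z)   # X-Y-Z-Z-Y-X visits each direction twice
    return isfinite(budget) ? budget / m_ref : Inf
end

cfl_limit = 0.95
r_plain   = palindrome_ratio(100.0, 10.0, 15.0, 20.0)                  # budget 90 over m = 100
r_tight   = palindrome_ratio(100.0, 10.0, 15.0, 20.0; m_next = 90.0)   # budget 90 over m_ref = 90
println((r_plain <= cfl_limit, r_tight <= cfl_limit))
```

In this example the same fluxes pass against the current mass but fail once the smaller endpoint mass is used as the reference, which is the effect of endpoint tightening.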
AtmosTransport.Preprocessing.verify_substep_positivity_ll! Method
verify_substep_positivity_ll!(m, am, bm, cm; cfl_limit = 0.95)
Per-substep horizontal+vertical positivity scan for a structured LL window. Mirrors verify_substep_positivity_cs! but operates on a single LL window (no panel dimension).
For every cell (i, j, k):
m > 0. A non-positive cell mass is reported with ratio = Inf and short-circuits this cell's CFL ratios; the runtime divides by m and would otherwise produce Inf/NaN.
Outgoing mass per substep, per direction, ≤ cfl_limit * m.
NaN/Inf cell mass and NaN/Inf fluxes are flagged as ratio = Inf (see CS round-2 fix in cubed_sphere_contracts.jl).
Returns (direction, ratio, location, ok) with:
direction :: Union{Symbol, Nothing}: :x / :y / :z / nothing (no inspection).
ratio :: Float64: worst outgoing / m.
location :: NTuple{3, Int}: (i, j, k).
ok :: Bool: ratio ≤ cfl_limit.
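A per-direction scan with this return shape can be sketched as below. The face staggering (am of size (Nx+1, Ny, Nz), bm of size (Nx, Ny+1, Nz), cm of size (Nx, Ny, Nz+1)) and the sign convention (positive flux leaves through the higher-index face) are assumptions of this sketch, and `ll_positivity_scan` is a hypothetical name.

```julia
# Hypothetical sketch of a per-direction LL positivity scan.
function ll_positivity_scan(m::AbstractArray{<:Real,3}, am, bm, cm; cfl_limit = 0.95)
    Nx, Ny, Nz = size(m)
    (size(am) == (Nx + 1, Ny, Nz) && size(bm) == (Nx, Ny + 1, Nz) &&
     size(cm) == (Nx, Ny, Nz + 1)) || throw(DimensionMismatch("unexpected flux shapes"))
    direction, ratio, location = nothing, 0.0, (0, 0, 0)
    for k in 1:Nz, j in 1:Ny, i in 1:Nx
        mass = m[i, j, k]
        # Outgoing mass per direction: inflow faces contribute nothing.
        outs = (x = max(0, -am[i, j, k]) + max(0, am[i + 1, j, k]),
                y = max(0, -bm[i, j, k]) + max(0, bm[i, j + 1, k]),
                z = max(0, -cm[i, j, k]) + max(0, cm[i, j, k + 1]))
        for (dir, out) in pairs(outs)
            bad = !(isfinite(mass) && mass > 0) || !isfinite(out)
            r = bad ? Inf : out / mass          # invalid mass or flux maps to Inf
            if direction === nothing || r > ratio
                direction, ratio, location = dir, r, (i, j, k)
            end
        end
    end
    return (direction = direction, ratio = ratio, location = location,
            ok = ratio <= cfl_limit)
end

# Two cells in x; a positive flux on the shared face leaves cell 1.
m  = fill(10.0, 2, 1, 1)
am = zeros(3, 1, 1); am[2, 1, 1] = 4.0
diag = ll_positivity_scan(m, am, zeros(2, 2, 1), zeros(2, 1, 2))
```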
AtmosTransport.Preprocessing.verify_substep_positivity_rg! Method
verify_substep_positivity_rg!(m, hflux, cm, face_left, face_right;
cfl_limit = 0.95,
outgoing_h = nothing, bad_h = nothing)
Per-substep horizontal+vertical positivity scan for a face-indexed RG window. Mirrors verify_substep_positivity_cs! / ..._ll! but operates on the face-indexed RG mass-flux representation.
For every cell (c, k):
m > 0. A non-positive cell mass is reported with ratio = Inf.
Horizontal outgoing mass per substep ≤ cfl_limit * m. Only interior faces (face_left > 0 && face_right > 0) contribute, matching the runtime advection in StrangSplitting.jl:279. Boundary stubs are not counted as outflow here; see verify_boundary_stub_flux_rg for the separate "non-zero flux on a boundary stub" invariant.
Vertical outgoing mass per substep ≤ cfl_limit * m. Same as CS / LL: max(0, -cm[c, k]) + max(0, cm[c, k+1]).
NaN/Inf cell mass and NaN/Inf fluxes are flagged as ratio = Inf (matches the CS round-2 fix).
outgoing_h and bad_h can be passed in as workspace-owned scratch to suppress per-window allocation once P2 wires this into the unified driver. Default nothing → allocate locally.
Returns (direction, ratio, location, ok) with:
direction :: Union{Symbol, Nothing}: :h / :z / nothing.
ratio :: Float64: worst outgoing / m.
location :: NTuple{2, Int}: (cell, level).
ok :: Bool: ratio ≤ cfl_limit.
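The horizontal part of the RG scan amounts to accumulating outgoing mass per cell from a face list. A minimal sketch follows, assuming hflux is a (n_faces, Nz) matrix and that positive flux moves mass from the left cell to the right cell; both are assumptions, and `rg_horizontal_outgoing` is a hypothetical name.

```julia
# Hypothetical sketch: accumulate horizontal outgoing mass per (cell, level)
# from face-indexed fluxes, skipping boundary stubs.
function rg_horizontal_outgoing(hflux::AbstractMatrix, face_left, face_right,
                                n_cells::Integer)
    outgoing = zeros(Float64, n_cells, size(hflux, 2))
    for k in axes(hflux, 2), f in axes(hflux, 1)
        l, r = face_left[f], face_right[f]
        (l > 0 && r > 0) || continue          # boundary stubs do not count as outflow
        flux = hflux[f, k]
        # Assumed sign convention: positive flux moves mass from left to right.
        if flux > 0
            outgoing[l, k] += flux            # leaves the left cell
        elseif flux < 0
            outgoing[r, k] -= flux            # leaves the right cell
        end
    end
    return outgoing
end

# Two cells, two interior faces between them, and one boundary stub.
hflux = reshape([3.0, -2.0, 5.0], 3, 1)
out_h = rg_horizontal_outgoing(hflux, [1, 1, 0], [2, 2, 1], 2)
```

Dividing out_h elementwise by the cell mass then gives the horizontal ratio that is compared against cfl_limit.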
AtmosTransport.Preprocessing.verify_window! Function
verify_window!(window, contract::AbstractWindowContract, win_idx::Int)
-> (; replay, positivity)
Run the contract's replay and positivity gates for one window. The replay gate throws on violation; the positivity gate is non-fatal at this layer (the run-level accumulator + summarize_status! decides fatal-vs-warn based on require_substep_positivity).
window is the topology's per-window payload (a NamedTuple in P1, a typed ReadyWindow{G, FT} in P2).
AtmosTransport.Preprocessing.verify_window! Method
verify_window!(window, contract::CubedSphereContract, win_idx::Int)
-> (; replay, positivity)
Run the per-window CS contract on a NamedTuple window with fields m_cur, am, bm, cm, m_next (each a 6-tuple of panel arrays). Delegates to verify_cs_window_contract!; the replay gate throws on violation, the positivity gate is non-fatal here.
AtmosTransport.Preprocessing.verify_write_replay_cs! Method
verify_write_replay_cs!(m_cur, am, bm, cm, m_next, steps_per_window, tol_rel, win_idx;
div_scratch = nothing)
Run the CS write-time replay gate for one window and return its diagnostic.
The check integrates the stored panel-local fluxes from m_cur under the runtime palindrome-continuity contract and verifies that the result matches the explicit endpoint m_next. A failure here means the binary would produce a runtime day-boundary or window-boundary mass inconsistency.
div_scratch may be pre-allocated by the caller (workspace-owned scratch from P2 / contract-owned lazy scratch from P1) so the gate doesn't allocate the panel-shared div_h per call. Default nothing → allocate locally. Shape must match size(m_cur[1]).
AtmosTransport.Preprocessing.window_metadata Function
window_metadata(reader::AbstractMetReader) → NamedTuple
Per-source window timing metadata. Standard fields:
windows::Int: windows_per_day(reader).
substeps::Int: sub-windows per write window (e.g. GEOS's dt_met_seconds ÷ mass_flux_dt).
dt_substep::Float64: substep wall-clock in seconds.
Concrete readers may add source-specific keys (e.g. GEOS's mass_flux_dt).
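The relationship between these fields can be sketched with a small constructor. `window_timing` is a hypothetical helper; it assumes dt_substep equals mass_flux_dt and that dt_met_seconds is an exact multiple of it, and the numeric values are illustrative only.

```julia
# Hypothetical helper: build the standard timing fields from two durations.
# Assumption: one substep lasts mass_flux_dt seconds.
function window_timing(; windows::Integer, dt_met_seconds::Real, mass_flux_dt::Real)
    mass_flux_dt > 0 || throw(ArgumentError("mass_flux_dt must be positive"))
    substeps, leftover = divrem(dt_met_seconds, mass_flux_dt)
    leftover == 0 ||
        throw(ArgumentError("dt_met_seconds must be a multiple of mass_flux_dt"))
    return (windows = Int(windows), substeps = Int(substeps),
            dt_substep = Float64(mass_flux_dt),
            mass_flux_dt = Float64(mass_flux_dt))   # source-specific extra key
end

# Illustrative values: hourly windows split into 450 s substeps.
meta = window_timing(windows = 24, dt_met_seconds = 3600, mass_flux_dt = 450)
```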
AtmosTransport.Preprocessing.windows_per_day Function
windows_per_day(settings::AbstractMetSettings, date::Date) -> Int
Number of preprocessing windows per UTC day for this source on date. For most sources this is constant (24 for hourly), but date-dependent sources (e.g. a leap-second day) may override.
AtmosTransport.Preprocessing.write_window! Function
write_window!(writer, ready)
Write one validated ready window through a topology-specific binary writer. Concrete methods are topology-indexed by AbstractBinaryWriter and ReadyWindow so writer/window mismatches fail by dispatch.
AtmosTransport.Preprocessing.write_window! Method
write_window!(io, win_idx, storage, settings, merged, last_hour_next) -> Int64
Write one window's payload blocks to the output stream in v4 on-disk order.