HITRAN Spectroscopic Data Management
vSmartMOM.jl requires spectroscopic line parameters from the HITRAN database to compute gas absorption cross-sections. This page explains the two available data pathways, how to switch between them, and how version provenance is tracked.
Overview
There are two pathways for obtaining HITRAN data:
| | Legacy Artifacts (default) | Direct Download |
|---|---|---|
| HITRAN edition | HITRAN 2016 | Current edition on hitran.org (HITRAN 2024 as of early 2025) |
| Data source | Pre-packaged tarballs hosted on a Caltech server | Live download from the HITRAN API |
| Isotopologues | All isotopologues per molecule | All isotopologues per molecule |
| Wavenumber range | Full range (0–150,000 cm⁻¹) | Configurable per download |
| Version tracking | Implicit (URL path contains hitran_2016) | Explicit .meta.toml file with SHA-256, download date, source URL |
| Storage | Julia Artifacts cache (~/.julia/artifacts/) | Scratch space (~/.julia/scratchspaces/) |
| Internet required | Only on first access (lazy download) | Only on first access (cached thereafter) |
Backward compatibility: The default behavior is unchanged. Existing code that calls artifact("CO2") or uses YAML configs with molecule names will continue to use the legacy HITRAN 2016 artifacts. You must explicitly opt in to a different edition.
Pathway 1: Legacy Artifacts (HITRAN 2016) – Default
This is the original data pathway and remains the default. HITRAN 2016 line-parameter files (.par format) are distributed as Julia lazy artifacts. They are downloaded automatically on first use from a Caltech server and cached in ~/.julia/artifacts/.
No setup is required. This pathway is active by default when the package loads.
How it works
using vSmartMOM
using vSmartMOM.Absorption
# artifact() returns the path to the .par file for the requested molecule.
# On first call, the artifact is downloaded and cached locally.
co2_path = artifact("CO2")
# Read the line parameters (all isotopologues, filtered by wavenumber range)
hitran_data = read_hitran(co2_path, mol=2, ν_min=6000, ν_max=6400)
# Check which isotopologues are included
println(sort(unique(hitran_data.iso))) # e.g., [1, 2, 3, 4, 5, 6, 7, 8]
Limitations
The data is HITRAN 2016. There is no way to update to a newer edition without switching to the direct download pathway.
Version provenance is not explicitly tracked – the only indication of the edition is in the artifact download URL.
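Because the legacy pathway records no provenance of its own, one option is to log a hash of the file you actually used. The sketch below relies only on Julia's standard libraries; `provenance_record` is a hypothetical helper written for illustration, not a vSmartMOM.jl function, and the field names merely mirror those of the `.meta.toml` files written by the direct download pathway.

```julia
using SHA, TOML, Dates

# Hypothetical helper (not part of vSmartMOM.jl): hash a file and collect
# a few provenance fields in a dictionary.
function provenance_record(path::AbstractString; edition::AbstractString="artifact")
    return Dict(
        "edition" => edition,                        # free-form label
        "sha256" => bytes2hex(open(sha256, path)),   # hex digest of the file contents
        "file_size_bytes" => filesize(path),
        "recorded_date" => string(now()),
    )
end

# A temporary file stands in for a real .par file so the sketch runs offline.
path, io = mktemp()
write(io, "abc")
close(io)

record = provenance_record(path)
TOML.print(stdout, record; sorted=true)
```

In practice you would pass the path returned by `artifact("CO2")` instead of the temporary file and store the printed record next to your results.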
Pathway 2: Direct Download from hitran.org
This pathway downloads data directly from the HITRAN line-by-line API, similar to how the Python HAPI library works. Data is cached locally in a Julia scratch space with full provenance metadata.
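For orientation, each download can be pictured as a single HTTP query. The sketch below assembles a URL in the same form as the `source_url` field recorded in the metadata file shown later on this page. `hitran_query_url` is a hypothetical helper written for illustration, not a vSmartMOM.jl function, and the package may build its requests differently.

```julia
# Hypothetical helper (not part of vSmartMOM.jl): build a line-by-line query
# URL from global isotopologue IDs and a wavenumber range in cm⁻¹.
function hitran_query_url(iso_ids::AbstractVector{<:Integer}, numin::Real, numax::Real)
    ids = join(iso_ids, ",")  # e.g. [7, 8] becomes "7,8"
    return "https://hitran.org/lbl/api?iso_ids_list=$(ids)&numin=$(numin)&numax=$(numax)"
end

# The two main CO2 isotopologues over the 6000–6400 cm⁻¹ window.
url = hitran_query_url([7, 8], 6000, 6400)
println(url)
```

The sketch only constructs the string; it sends no request, so it does not count against the API's daily query limit.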
Switching to the direct download pathway
using vSmartMOM
# Switch the active edition. This affects all subsequent artifact() calls.
set_hitran_edition!("HITRAN2024")
# Now artifact() will use the direct download pathway.
# On first call for each molecule, data is downloaded from hitran.org and cached.
co2_path = artifact("CO2")
After this call, the CO2 line-parameter file is stored at:
~/.julia/scratchspaces/<package-uuid>/hitran_data/HITRAN2024/CO2.par
alongside a metadata file:
~/.julia/scratchspaces/<package-uuid>/hitran_data/HITRAN2024/CO2.meta.toml
Downloading with custom wavenumber ranges
For molecules with very large line lists (e.g., H2O), downloading the full 0–150,000 cm⁻¹ range can be slow. You can restrict the download to the spectral region you need:
# Download only the O2 A-band region
fetch_hitran("O2"; numin=12900, numax=13200, edition="HITRAN2024")
# Download CO2 for the 1.6 um band
fetch_hitran("CO2"; numin=6000, numax=6400, edition="HITRAN2024")
Wavenumber range is fixed at download time
Once a molecule is cached, subsequent artifact() calls return the cached file regardless of the wavenumber range used during download. If you need a different range, either use force=true to re-download or use a different edition label.
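One way to guard against a silently too-narrow cache is to compare the recorded range with the range you need before running a calculation. The sketch below is a minimal illustration: `covers_range` is a hypothetical helper, and it assumes the metadata dictionary exposes the `numin` and `numax` fields shown in the `.meta.toml` example on this page.

```julia
# Hypothetical helper (not part of vSmartMOM.jl): true if the cached download
# spans the requested wavenumber window [numin, numax] in cm⁻¹.
function covers_range(info::AbstractDict, numin::Real, numax::Real)
    return info["numin"] <= numin && info["numax"] >= numax
end

# A stand-in for the metadata of a CO2 file downloaded for 6000–6400 cm⁻¹.
cached = Dict("numin" => 6000, "numax" => 6400)

inside = covers_range(cached, 6100, 6300)     # window lies within the cached range
outside = covers_range(cached, 12900, 13200)  # O2 A-band window is not covered
println((inside, outside))
```

If `hitran_info("CO2")` returns these fields (an assumption, since only `sha256`, `download_date`, and `source_url` are shown being read on this page), a `false` result would be the cue to call `fetch_hitran` with `force=true` or under a different edition label.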
Downloading by explicit isotopologue IDs
If you need only specific isotopologues, use the lower-level fetch_hitran_by_ids:
# Download only the two main CO2 isotopologues (global IDs 7 and 8)
fetch_hitran_by_ids("CO2_main", [7, 8]; numin=6000, numax=6400, edition="HITRAN2024")
Global isotopologue IDs can be looked up with:
using vSmartMOM.Absorption
Absorption.mol_globalID(2, 1) # CO2, isotopologue 1 -> global ID 7
Absorption.mol_globalID(2, 2) # CO2, isotopologue 2 -> global ID 8
Re-downloading (force refresh)
To re-download data (e.g., after a HITRAN database update), use force=true:
fetch_hitran("CO2"; edition="HITRAN2024", force=true)
Version Tracking and Provenance
Every molecule downloaded via the direct pathway has a companion .meta.toml file that records:
# HITRAN download metadata
# Auto-generated by vSmartMOM.jl -- do not edit
molecule = "CO2"
edition = "HITRAN2024"
download_date = "2026-04-01T13:16:47.011"
numin = 0
numax = 150000
iso_ids = "7,8,9,10,11,12,13,14,121,15,120,122"
source_url = "https://hitran.org/lbl/api?iso_ids_list=7,8,9,10,11,12,13,14,121,15,120,122&numin=0&numax=150000"
sha256 = "ad576a2ac2f32619910ceb6af4e56cdcf78c3656e6883d3ebe705bf96b837874"
file_size_bytes = 755251
You can query this programmatically:
set_hitran_edition!("HITRAN2024")
info = hitran_info("CO2")
println(info["sha256"]) # file integrity hash
println(info["download_date"]) # when it was downloaded
println(info["source_url"]) # exact API query used
Edition labels
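The "edition" is a user-assigned label. The HITRAN API always serves the current database edition and has no version selector, so the label is simply a name for the local cache directory and does not by itself determine which data the files contain; the download_date and sha256 fields in the metadata identify the actual download. You can use any label you like: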
# These are all valid edition labels
fetch_hitran("CO2"; edition="HITRAN2024")
fetch_hitran("CO2"; edition="test_narrowband", numin=6000, numax=6100)
fetch_hitran("CO2"; edition="production_v3")
Use available_hitran_editions() to list all editions that exist locally:
available_hitran_editions()
# e.g., ["artifact", "HITRAN2024", "test_narrowband"]
Managing Editions
Switching between editions
# Use HITRAN 2016 (legacy artifacts)
set_hitran_edition!("artifact")
path_2016 = artifact("CO2")
# Use HITRAN 2024 (downloaded from hitran.org)
set_hitran_edition!("HITRAN2024")
path_2024 = artifact("CO2")
# Check which edition is active
get_hitran_edition() # "HITRAN2024"
Checking cache status
hitran_is_cached("CO2") # check current edition
hitran_is_cached("H2O", "HITRAN2024") # check specific edition
Available molecules
using vSmartMOM.Absorption
Absorption.show_molecules() # list all HITRAN molecule names
Absorption.search_molecules("H") # search by substring
Integration with the RT Pipeline
The full RT pipeline (model_from_parameters) uses artifact() internally to obtain HITRAN data. Switching the edition before building the model is all you need:
using vSmartMOM
# Option A: Use HITRAN 2016 (default, no setup needed)
params = read_parameters("config/my_config.yaml")
model = model_from_parameters(params)
R, T = rt_run(model)
# Option B: Use HITRAN 2024
set_hitran_edition!("HITRAN2024")
params = read_parameters("config/my_config.yaml")
model = model_from_parameters(params) # downloads HITRAN 2024 data as needed
R, T = rt_run(model)
No changes to YAML configuration files are required.
HITRAN API Rate Limits
The HITRAN API at hitran.org imposes a daily query limit. If you exceed it, you will see:
ERROR: HITRAN API rate limit exceeded. You have exceeded the daily limit of API queries. Try again tomorrow.
To minimize API calls:
Data is cached locally after the first download. Subsequent calls use the cache.
Use hitran_is_cached("CO2") to check before triggering a download.
Download only the wavenumber range you need (use numin/numax in fetch_hitran).
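The first two points can be combined into a small pre-fetch loop that only touches the API for molecules that are missing locally. The sketch below is illustrative: `fetch_missing` is a hypothetical helper, and it takes the cache check and the download as function arguments so that it runs here with stand-ins instead of real network calls.

```julia
# Hypothetical helper (not part of vSmartMOM.jl): call `fetch` only for the
# molecules that `is_cached` reports as missing. Returns the names that
# triggered a download.
function fetch_missing(is_cached, fetch, molecules)
    downloaded = String[]
    for mol in molecules
        if !is_cached(mol)
            fetch(mol)
            push!(downloaded, mol)
        end
    end
    return downloaded
end

# Stand-ins so the sketch runs without network access or vSmartMOM loaded.
cache = Set(["CO2"])
fake_is_cached(mol) = mol in cache
fake_fetch(mol) = push!(cache, mol)

downloaded = fetch_missing(fake_is_cached, fake_fetch, ["CO2", "H2O", "O2"])
println(downloaded)  # ["H2O", "O2"]
```

With vSmartMOM loaded, the stand-ins could be replaced by closures over the functions described on this page, for example `mol -> hitran_is_cached(mol, "HITRAN2024")` and `mol -> fetch_hitran(mol; numin=6000, numax=6400, edition="HITRAN2024")`.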
Cleaning Up Cached Data
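Downloaded HITRAN data lives in Julia's scratch space system. The call below removes all cached data, but note that without arguments it clears the scratch spaces of every package in the current depot, not only the HITRAN cache. To limit the deletion to one package, Scratch.jl also accepts a module or UUID argument, e.g. Scratch.clear_scratchspaces!(vSmartMOM).
using Scratch
Scratch.clear_scratchspaces!()
Legacy artifacts are managed by Julia's package manager: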
using Pkg
Pkg.gc() # removes unused artifacts
Library Reference
The canonical docstrings for artifact, fetch_hitran, fetch_hitran_by_ids, HITRAN edition preferences, hitran_info, and hitran_is_cached are grouped in the Absorption API.