CDAWeb
Julia interface to NASA's Coordinated Data Analysis Web (CDAWeb) for accessing space physics data.
Installation
using Pkg
Pkg.add("CDAWeb")
An agent skill is included for exploring CDAWeb using natural language. To install it using skills, run:
npx skills add JuliaSpacePhysics/CDAWeb.jl
Quick Example
using CDAWeb
using Dates
# Get dataset description
get_dataset("AC_H0_MFI")
JSON.Object{String, Any} with 14 entries:
"Id" => "AC_H0_MFI"
"Doi" => "10.48322/e0dc-0h53"
"SpaseResourceId" => "spase://NASA/NumericalData/ACE/MAG/L2/PT16S"
"Observatory" => Any["AC"]
"Instrument" => Any["MAG"]
"ObservatoryGroup" => Any["ACE"]
"InstrumentType" => Any["Magnetic Fields (space)"]
"Label" => "H0 - ACE Magnetic Field 16-Second Level 2 Data - N. …
"TimeInterval" => Object{String, Any}("Start"=>"1997-09-02T00:00:12.000…
"PiName" => "N. Ness"
"PiAffiliation" => "Bartol Research Institute"
"Notes" => "https://cdaweb.gsfc.nasa.gov/misc/NotesA.html#AC_H0_…
"DatasetLink" => Any[Object{String, Any}("Title"=>"the ACE Science Cen…
"AdditionalMetadata" => Any[Object{String, Any}("Type"=>"SPASE", "value"=>"ht…
# Get dataset within the time range and display its attributes
ds = get_data("AC_H0_MFI", "2023-01-01", "2023-01-02")
ds.attrib
CommonDataModel.Attributes{CDFDatasets.ConcatCDFDataset{Vector{CommonDataFormat.CDFDataset{CommonDataFormat.NoCompression, Int64}}}} with 26 entries:
"TITLE" => ["ACE> Magnetometer Parameters"]
"Project" => ["ACE>Advanced Composition Explorer", "ISTP>I…
"Discipline" => ["Space Physics>Interplanetary Studies"]
"Source_name" => ["AC>Advanced Composition Explorer"]
"Data_type" => ["H0>16-Sec Level 2 Data"]
"Descriptor" => ["MAG>ACE Magnetic Field Instrument"]
"Data_version" => ["7"]
"Generated_by" => ["ACE Science Center"]
"Generation_date" => ["20230313"]
"LINK_TEXT" => ["Release notes and other info available at"]
"LINK_TITLE" => ["the ACE Science Center Level 2 Data website…
"HTTP_LINK" => ["http://www.srl.caltech.edu/ACE/ASC/level2/i…
"TEXT" => ["MAG - ACE Magnetic Field Experiment", "Refe…
"MODS" => ["Initial Release 9/7/01 ", "12/04/02: Fixed …
"ADID_ref" => ["NSSD0327"]
"Logical_file_id" => ["AC_H0_MFI_20230101_V07"]
"Logical_source" => ["AC_H0_MFI"]
"Logical_source_description" => ["H0 - ACE Magnetic Field 16-Second Level 2 D…
"PI_name" => ["N. Ness"]
⋮ => ⋮
# Fetch solar wind velocity data from OMNI dataset
dataset = "OMNI_COHO1HR_MERGED_MAG_PLASMA"
t0 = DateTime(2020, 1, 1) # Start time
t1 = DateTime(2020, 1, 2) # End time
data = get_data(dataset, "V", t0, t1) # Data is automatically cached for faster subsequent access
V (25)
Datatype: Float32
Dimensions: Epoch
Attributes:
FILLVAL = Float32[-1.0f31]
FIELDNAM = Bulk Flow Speed
VALIDMAX = Float32[1200.0]
CATDESC = Bulk Flow Speed
VALIDMIN = Float32[0.0]
DISPLAY_TYPE = time_series
UNITS = km/s
DEPEND_0 = Epoch
FORMAT = F5.0
VAR_TYPE = data
LABLAXIS = Bulk Flow Speed
DIM_SIZES = Int32[0]
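The FILLVAL, VALIDMIN, and VALIDMAX attributes in the output above describe which samples are usable. The following is a minimal sketch of masking unusable samples before computing statistics. It works on a plain Float32 vector with made-up sample values, and the helper name `mask_invalid` is hypothetical, not part of CDAWeb.jl.

```julia
# Replace fill values and out-of-range samples with NaN.
# Default thresholds are copied from the attributes printed above
# (FILLVAL, VALIDMIN, VALIDMAX); `mask_invalid` is an illustrative helper.
function mask_invalid(values::AbstractVector{Float32};
                      fill = -1.0f31, lo = 0.0f0, hi = 1200.0f0)
    return map(v -> (v == fill || v < lo || v > hi) ? NaN32 : v, values)
end

# Made-up bulk flow speeds in km/s: one fill value and one out-of-range sample.
raw = Float32[412.0, -1.0f31, 398.5, 1500.0, 405.0]

clean = mask_invalid(raw)
valid = filter(!isnan, clean)
mean_speed = sum(valid) / length(valid)
```

Whether out-of-range samples should be dropped or only flagged depends on the analysis; here both cases are treated the same way for brevity.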
Retrieve the original monthly data files and clip to the exact requested time range.
data = get_data(dataset, t0, t1; clip = true)["V"]
View: 1:25
V (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FILLVAL = Float32[-1.0f31]
FIELDNAM = Bulk Flow Speed
VALIDMAX = Float32[1200.0]
CATDESC = Bulk Flow Speed
VALIDMIN = Float32[0.0]
DISPLAY_TYPE = time_series
UNITS = km/s
DEPEND_0 = Epoch
FORMAT = F5.0
VAR_TYPE = data
LABLAXIS = Bulk Flow Speed
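The output above reports a view of 25 records into a variable of length 744. The arithmetic can be reproduced with a small sketch, assuming hourly timestamps that fall exactly on the hour and a clip that includes both bounds. Both are assumptions about this dataset, and `clip_indices` is a hypothetical helper rather than the package's implementation.

```julia
using Dates

# Hypothetical hourly timestamps for January 2020, standing in for the
# Epoch variable of one monthly file.
epoch = collect(DateTime(2020, 1, 1):Hour(1):DateTime(2020, 1, 31, 23))

# Indices of samples with t0 <= t <= t1 (inclusive bounds are an assumption).
clip_indices(times, t0, t1) = findall(t -> t0 <= t <= t1, times)

t0 = DateTime(2020, 1, 1)
t1 = DateTime(2020, 1, 2)
idx = clip_indices(epoch, t0, t1)

println(length(epoch))  # 744 (31 days × 24 hours)
println(length(idx))    # 25 (00:00 on Jan 1 through 00:00 on Jan 2)
```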
Additional Features
Get Metadata from Web Services
Access metadata directly from CDAWeb's RESTful services:
# Get descriptions of the instrument types that are available from CDAS.
instrument_types = get_instrument_types()
18-element Vector{Any}:
JSON.Object{String, Any}("Name" => "Activity Indices")
JSON.Object{String, Any}("Name" => "Electric Fields (space)")
JSON.Object{String, Any}("Name" => "Engineering")
JSON.Object{String, Any}("Name" => "Ephemeris/Attitude/Ancillary")
JSON.Object{String, Any}("Name" => "Ground-Based HF-Radars")
JSON.Object{String, Any}("Name" => "Ground-Based Imagers")
JSON.Object{String, Any}("Name" => "Ground-Based Magnetometers, Riometers, Sounders")
JSON.Object{String, Any}("Name" => "Ground-Based VLF/ELF/ULF, Photometers")
JSON.Object{String, Any}("Name" => "Housekeeping")
JSON.Object{String, Any}("Name" => "Imaging and Remote Sensing (ITM/Earth)")
JSON.Object{String, Any}("Name" => "Imaging and Remote Sensing (Magnetosphere/Earth)")
JSON.Object{String, Any}("Name" => "Imaging and Remote Sensing (Sun)")
JSON.Object{String, Any}("Name" => "Imaging and Remote Sensing(Magnetosphere/Earth)")
JSON.Object{String, Any}("Name" => "Magnetic Fields (Balloon)")
JSON.Object{String, Any}("Name" => "Magnetic Fields (space)")
JSON.Object{String, Any}("Name" => "Particles (space)")
JSON.Object{String, Any}("Name" => "Plasma and Solar Wind")
JSON.Object{String, Any}("Name" => "Radio and Plasma Waves (space)")
See also get_dataviews, get_datasets, get_instruments, get_instrument_types, get_observatories, get_observatory_groups, get_observatory_groups_and_instruments, get_original_file_descs, and get_data_file_descs. These functions are convenience wrappers around the CDAS RESTful Web Services, closely matching the original API.
Accessing Master CDF Metadata
Retrieve metadata without specifying a time range to access the master CDF file:
# Update/download the master CDF files
CDAWeb.update_master_cdf()
# Returns metadata from the master CDF for the ACE magnetic field dataset
get_data("AC_H0_MFI", "BGSEc")
BGSEc (3 × 0)
Datatype: Float32
Dimensions: cartesian × Epoch
Attributes:
FILLVAL = Float32[-1.0f31]
FIELDNAM = Mag Field vector, GSE coord
VALIDMAX = Float32[65534.0, 65534.0, 65534.0]
SCALEMAX = Float32[25.0, 25.0, 25.0]
DEPEND_1 = cartesian
CATDESC = Magnetic Field Vector in GSE Cartesian coordinates (16 sec)
AVG_TYPE =
VALIDMIN = Float32[-65534.0, -65534.0, -65534.0]
DISPLAY_TYPE = time_series
UNITS = nT
VAR_NOTES =
DEPEND_0 = Epoch
LABL_PTR_1 = CommonDataFormat.StaticString{6, UInt8}["Bx GSE", "By GSE", "Bz GSE"]
FORMAT = F9.3
VAR_TYPE = data
SCALEMIN = Float32[-25.0, -25.0, -25.0]
DICT_KEY = magnetic_field
Finding Available Datasets
Search for datasets matching a pattern:
# Find all ACE H0 (high resolution) datasets
find_datasets("AC_H0")
2-element Vector{CDFDatasets.CDFDataset{CommonDataFormat.CDFDataset{CommonDataFormat.NoCompression, Int64}}}:
AC_H0_MFI (17 variables: Epoch, Time_PB5, Magnitude, BGSEc, label_BGSE, BGSM, label_bgsm, dBrms, Q_FLAG, SC_pos_GSE, label_pos_GSE, SC_pos_GSM, …)
AC_H0_SWE (19 variables: Epoch, Time_PB5, unit_time, label_time, format_time, Np, Vp, Tpr, alpha_ratio, V_GSE, label_V_GSE, V_RTN, …)
Cache Management
View cache metadata to inspect what data has been cached locally:
# Show metadata for web-served (processed) cached files
CDAWeb.cache_metadata(false) |> scrollable_table
# Show metadata for original CDF cached files
CDAWeb.cache_metadata(true) |> scrollable_table
API Reference
- CDAWeb.CDAWebProduct
- CDAWeb.CDAWebProducts
- CDAWeb._fetch_and_cache_files!
- CDAWeb._get_cache_db
- CDAWeb._get_stmt_orig_cache
- CDAWeb._get_stmt_variable_cache
- CDAWeb._update_cache!
- CDAWeb.cache_metadata
- CDAWeb.clear_cache!
- CDAWeb.clear_metadata_cache!
- CDAWeb.find_cached_and_missing
- CDAWeb.get_data
- CDAWeb.get_data_file_descs
- CDAWeb.get_data_files
- CDAWeb.get_dataset
- CDAWeb.get_datasets
- CDAWeb.get_dataviews
- CDAWeb.get_instrument_types
- CDAWeb.get_instruments
- CDAWeb.get_inventory
- CDAWeb.get_observatories
- CDAWeb.get_observatory_groups
- CDAWeb.get_observatory_groups_and_instruments
- CDAWeb.get_original_file_descs
- CDAWeb.get_variables
- CDAWeb.group_contiguous_fragments
- CDAWeb.split_into_fragments
- CDAWeb.@cda_str
CDAWeb.CDAWebProduct — Type
CDAWebProduct{P} <: Function
A lazy specification for retrieving CDAWeb data. When called, it fetches data using get_data with clip=true and direct=false by default for better performance.
See also: CDAWebProducts, @cda_str
CDAWeb.CDAWebProducts — Type
CDAWebProducts{T} <: AbstractVector{T}
A vector-like container of CDAWebProducts that is also callable.
See also: CDAWebProduct, @cda_str
CDAWeb._fetch_and_cache_files! — Method
Fetch files from API, download them, and add to cache.
CDAWeb._get_cache_db — Method
Initialize or get existing cache database with proper schema and settings.
CDAWeb._get_stmt_orig_cache — Method
Get the prepared statement for orig cache queries.
CDAWeb._get_stmt_variable_cache — Method
Get the prepared statement for variable cache queries.
CDAWeb._update_cache! — Method
Update orig cache metadata in SQLite database (process-safe, atomic).
CDAWeb._update_cache! — Method
Update variable cache metadata in SQLite database (process-safe, atomic).
CDAWeb.cache_metadata — Function
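Get cache metadata. As in the Cache Management examples, pass true for original CDF cached files and false for web-served (processed) cached files.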
CDAWeb.clear_cache! — Method
Clear cache entries for a specific dataset (process-safe).
CDAWeb.clear_cache! — Method
Clear all cache entries.
CDAWeb.clear_metadata_cache! — Method
Clear the in-memory metadata cache (datasets, variables, etc.).
CDAWeb.find_cached_and_missing — Method
Find cached files and missing time ranges using fragment-based caching and SQL (for orig=false).
CDAWeb.find_cached_and_missing — Method
Find cached files and missing time ranges using SQL query (for orig=true).
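The idea of finding missing time ranges can be illustrated without a database. The sketch below is an in-memory stand-in, not the package's SQL-based implementation: it assumes cached coverage is given as (start, stop) tuples, and the function name `missing_ranges` is hypothetical.

```julia
using Dates

# Return the parts of [t0, t1] that are not covered by any cached interval.
# `cached` is a vector of (start, stop) tuples; this layout is an assumption.
function missing_ranges(t0::DateTime, t1::DateTime, cached)
    gaps = Tuple{DateTime, DateTime}[]
    cursor = t0                      # everything before `cursor` is accounted for
    for (start, stop) in sort(cached; by = first)
        stop <= cursor && continue   # interval lies entirely before the cursor
        start >= t1 && break         # interval lies entirely after the request
        start > cursor && push!(gaps, (cursor, start))
        cursor = max(cursor, stop)
    end
    cursor < t1 && push!(gaps, (cursor, t1))
    return gaps
end

cached = [(DateTime(2020, 1, 2), DateTime(2020, 1, 3)),
          (DateTime(2020, 1, 5), DateTime(2020, 1, 6))]
gaps = missing_ranges(DateTime(2020, 1, 1), DateTime(2020, 1, 7), cached)
```

Only the ranges in `gaps` would then need to be requested from the web service.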
CDAWeb.get_data — Function
get_data(dataset, variable)
get_data(dataset, t0, t1; clip = false, master_attributes = false)
get_data(dataset, variable, t0, t1; direct = true, kw...)
get_data(path, t0, t1; kw...)
Fetch data for a dataset (variable) within a time range (t0, t1).
If no time range is specified, the master CDF dataset is returned.
A path-like format <dataset>/<variable> can also be used to specify the dataset and variable.
Set master_attributes=true to use master CDF attributes. Set clip=true to restrict data to exact time bounds. Set direct=false to fetch the entire dataset first, then index into it.
See get_data_files for caching options.
CDAWeb.get_data_file_descs — Method
get_data_file_descs(dataset, variables, t0, t1; dataview = "sp_phys", format = "cdf", query...)
Get descriptive information about the specified data file for the dataset, variables.
See Get Data for more details.
CDAWeb.get_data_files — Method
get_data_files(dataset, variables, t0, t1; fragment_period = Hour(24), kw...)
Get processed data file paths for a dataset + variables within time range (t0, t1).
Note: this is usually slower, and not suitable when multiple variables are needed, because of CDAWeb's web service processing overhead.
CDAWeb.get_data_files — Method
get_data_files(dataset, t0, t1; kw...)
Get original data file paths for a dataset within time range (t0, t1).
If files are not available in cache, they will be fetched and cached.
CDAWeb.get_dataset — Method
get_dataset(id, start_time, stop_time; kw...)
Get the dataset by id between start_time and stop_time.
If no dataset is available for the specified time range, the corresponding master dataset is returned.
CDAWeb.get_dataset — Method
get_dataset(id; kw...)
Get the dataset description by id.
The value of id may be
- CDAS (e.g., AC_H2_MFI),
- DOI (e.g., 10.48322/fh85-fj47),
- SPASE ResourceID (e.g., spase://NASA/NumericalData/ACE/MAG/L2/PT1H).
See also get_datasets.
CDAWeb.get_datasets — Method
get_datasets(; use_cache = true, query...)
Get descriptions of available datasets for the query.
See Get Datasets for available query parameters.
CDAWeb.get_dataviews — Method
get_dataviews()
Get descriptions of available dataviews.
CDAWeb.get_instrument_types — Method
get_instrument_types(; query...)
Get available instrument types.
See Details for available query parameters.
CDAWeb.get_instruments — Method
get_instruments(; query...)
Get descriptions of available instruments.
See Details for available query parameters.
CDAWeb.get_inventory — Method
get_inventory(dataset, t0, t1; dataview = "sp_phys")
Get descriptions of the available inventory for the dataset.
See Details.
CDAWeb.get_observatories — Method
get_observatories(; query...)
Get descriptions of available observatories.
See Details for available query parameters.
CDAWeb.get_observatory_groups — Method
get_observatory_groups(; query...)
Get descriptions of available observatory groups.
See Details for available query parameters.
CDAWeb.get_observatory_groups_and_instruments — Method
get_observatory_groups_and_instruments(; query...)
Get descriptions of available observatory groups and instruments.
This is a convenience/performance alternative to making multiple calls to Get Observatory Groups, Get Observatories, and Get Instruments.
See Details for available query parameters.
CDAWeb.get_original_file_descs — Method
get_original_file_descs(id, start_time, stop_time; dataview = "sp_phys")
Get descriptive information about original data files from the id dataset.
Original data files may lack updated meta-data and virtual variable values contained in files obtained from the other Get Data services.
See also get_data_file_descs.
CDAWeb.get_variables — Method
get_variables(dataset)
Get descriptions of available variables for the dataset.
See Get Variables for more details.
CDAWeb.group_contiguous_fragments — Method
Group contiguous fragments to minimize API calls.
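As a rough illustration of why grouping helps, the sketch below merges fragments that touch or overlap, so that one request can cover each merged span. It assumes fragments are (start, stop) tuples; the function name `merge_contiguous` is hypothetical and the package's own grouping logic may differ.

```julia
using Dates

# Merge fragments whose boundaries touch or overlap into larger spans.
# `fragments` is a vector of (start, stop) tuples; this layout is an assumption.
function merge_contiguous(fragments::Vector{Tuple{DateTime, DateTime}})
    isempty(fragments) && return Tuple{DateTime, DateTime}[]
    sorted = sort(fragments; by = first)
    merged = [sorted[1]]
    for (start, stop) in sorted[2:end]
        last_start, last_stop = merged[end]
        if start <= last_stop
            # Contiguous or overlapping: extend the current span.
            merged[end] = (last_start, max(last_stop, stop))
        else
            push!(merged, (start, stop))
        end
    end
    return merged
end

day(d) = DateTime(2020, 1, d)
fragments = [(day(1), day(2)), (day(2), day(3)), (day(5), day(6))]
spans = merge_contiguous(fragments)
```

Here three daily fragments collapse into two spans, so two requests would suffice instead of three.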
CDAWeb.split_into_fragments — Method
Split time range into fixed-duration fragments with aligned boundaries.
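One way to picture fixed-duration fragments with aligned boundaries is to round the request outward to multiples of the fragment period. The sketch below uses `floor` and `ceil` from the Dates standard library; the function name `aligned_fragments` and the (start, stop) tuple layout are assumptions, not the package's implementation.

```julia
using Dates

# Split [t0, t1] into fragments of length `period` whose boundaries fall on
# multiples of `period` (for example, midnight when period = Hour(24)).
function aligned_fragments(t0::DateTime, t1::DateTime, period::Period = Hour(24))
    start = floor(t0, period)   # round the start down to a boundary
    stop = ceil(t1, period)     # round the end up to a boundary
    return [(t, t + period) for t in start:period:(stop - period)]
end

frags = aligned_fragments(DateTime(2020, 1, 1, 6), DateTime(2020, 1, 3, 12))
```

Because the boundaries do not depend on the exact request, overlapping requests map to the same fragments, which is what makes fragments reusable as cache keys.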
CDAWeb.@cda_str — Macro
cda"dataset/parameter"
cda"dataset/parameter1,parameter2"
String macro to create a CDAWebProduct from a string identifier. Multiple parameters can be given separated by commas, in which case a CDAWebProducts object (like a vector of CDAWebProduct) is returned.
Examples
# Single parameter
product = cda"OMNI_COHO1HR_MERGED_MAG_PLASMA/flow_speed"
product(t0, t1)
# Multiple parameters
products = cda"OMNI_COHO1HR_MERGED_MAG_PLASMA/flow_speed,Pressure"
products(t0, t1)
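The identifier format used by the macro can be illustrated with a small parsing sketch. This is not the macro's implementation: `parse_product_id` is a hypothetical helper that only shows how a "dataset/parameter1,parameter2" string separates into a dataset name and a list of parameter names.

```julia
# Split "dataset/param1,param2" into a dataset name and parameter names.
# Illustrative only; the real macro may parse and validate differently.
function parse_product_id(id::AbstractString)
    parts = split(id, '/'; limit = 2)
    length(parts) == 2 ||
        throw(ArgumentError("expected \"dataset/parameter\", got \"$id\""))
    dataset = String(parts[1])
    parameters = String.(strip.(split(parts[2], ',')))
    return dataset, parameters
end

dataset, parameters = parse_product_id("OMNI_COHO1HR_MERGED_MAG_PLASMA/flow_speed,Pressure")
```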