CDAWeb

Julia interface to NASA's Coordinated Data Analysis Web (CDAWeb) for accessing space physics data.

Installation

using Pkg
Pkg.add("CDAWeb")

An agent skill is included for exploring CDAWeb using natural language. To install it using skills, run:

npx skills add JuliaSpacePhysics/CDAWeb.jl

Quick Example

using CDAWeb
using Dates

# Get dataset description
get_dataset("AC_H0_MFI")
JSON.Object{String, Any} with 14 entries:
  "Id"                 => "AC_H0_MFI"
  "Doi"                => "10.48322/e0dc-0h53"
  "SpaseResourceId"    => "spase://NASA/NumericalData/ACE/MAG/L2/PT16S"
  "Observatory"        => Any["AC"]
  "Instrument"         => Any["MAG"]
  "ObservatoryGroup"   => Any["ACE"]
  "InstrumentType"     => Any["Magnetic Fields (space)"]
  "Label"              => "H0 - ACE Magnetic Field 16-Second Level 2 Data - N. …
  "TimeInterval"       => Object{String, Any}("Start"=>"1997-09-02T00:00:12.000…
  "PiName"             => "N. Ness"
  "PiAffiliation"      => "Bartol Research Institute"
  "Notes"              => "https://cdaweb.gsfc.nasa.gov/misc/NotesA.html#AC_H0_…
  "DatasetLink"        => Any[Object{String, Any}("Title"=>"the ACE Science Cen…
  "AdditionalMetadata" => Any[Object{String, Any}("Type"=>"SPASE", "value"=>"ht…
# Get dataset within the time range and display its attributes
ds = get_data("AC_H0_MFI", "2023-01-01", "2023-01-02")
ds.attrib
CommonDataModel.Attributes{CDFDatasets.ConcatCDFDataset{Vector{CommonDataFormat.CDFDataset{CommonDataFormat.NoCompression, Int64}}}} with 26 entries:
  "TITLE"                      => ["ACE> Magnetometer Parameters"]
  "Project"                    => ["ACE>Advanced Composition Explorer", "ISTP>I…
  "Discipline"                 => ["Space Physics>Interplanetary Studies"]
  "Source_name"                => ["AC>Advanced Composition Explorer"]
  "Data_type"                  => ["H0>16-Sec Level 2 Data"]
  "Descriptor"                 => ["MAG>ACE Magnetic Field Instrument"]
  "Data_version"               => ["7"]
  "Generated_by"               => ["ACE Science Center"]
  "Generation_date"            => ["20230313"]
  "LINK_TEXT"                  => ["Release notes and other info available at"]
  "LINK_TITLE"                 => ["the ACE Science Center Level 2 Data website…
  "HTTP_LINK"                  => ["http://www.srl.caltech.edu/ACE/ASC/level2/i…
  "TEXT"                       => ["MAG - ACE Magnetic Field Experiment", "Refe…
  "MODS"                       => ["Initial Release 9/7/01 ", "12/04/02: Fixed …
  "ADID_ref"                   => ["NSSD0327"]
  "Logical_file_id"            => ["AC_H0_MFI_20230101_V07"]
  "Logical_source"             => ["AC_H0_MFI"]
  "Logical_source_description" => ["H0 - ACE Magnetic Field 16-Second Level 2 D…
  "PI_name"                    => ["N. Ness"]
  ⋮                            => ⋮
# Fetch solar wind velocity data from OMNI dataset
dataset = "OMNI_COHO1HR_MERGED_MAG_PLASMA"
t0 = DateTime(2020, 1, 1) # Start time
t1 = DateTime(2020, 1, 2) # End time
data = get_data(dataset, "V", t0, t1) # Data is automatically cached for faster subsequent access
V (25)
  Datatype:    Float32
  Dimensions:  Epoch
  Attributes:
   FILLVAL              = Float32[-1.0f31]
   FIELDNAM             = Bulk Flow Speed
   VALIDMAX             = Float32[1200.0]
   CATDESC              = Bulk Flow Speed
   VALIDMIN             = Float32[0.0]
   DISPLAY_TYPE         = time_series
   UNITS                = km/s
   DEPEND_0             = Epoch
   FORMAT               = F5.0
   VAR_TYPE             = data
   LABLAXIS             = Bulk Flow Speed
   DIM_SIZES            = Int32[0]

Alternatively, retrieve the original monthly data files and clip the result to the exact requested time range:

data = get_data(dataset, t0, t1; clip = true)["V"]
View:  1:25
 V  (744)
   Datatype:    Float32
   Dimensions:  Epoch
   Attributes:
    FILLVAL              = Float32[-1.0f31]
    FIELDNAM             = Bulk Flow Speed
    VALIDMAX             = Float32[1200.0]
    CATDESC              = Bulk Flow Speed
    VALIDMIN             = Float32[0.0]
    DISPLAY_TYPE         = time_series
    UNITS                = km/s
    DEPEND_0             = Epoch
    FORMAT               = F5.0
    VAR_TYPE             = data
    LABLAXIS             = Bulk Flow Speed
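
To make the clipping step concrete, here is a minimal sketch in plain Julia that does not call CDAWeb at all. It assumes the monthly file holds one sample per hour for January 2020 (the `epoch` vector is a hypothetical stand-in for the Epoch variable) and that clipping keeps the samples with t0 <= t <= t1. It illustrates the idea only and is not the package's implementation.

```julia
using Dates

# Hypothetical stand-in for the Epoch variable of one monthly file:
# hourly timestamps covering January 2020.
epoch = collect(DateTime(2020, 1, 1):Hour(1):DateTime(2020, 1, 31, 23))

t0 = DateTime(2020, 1, 1)  # Start time
t1 = DateTime(2020, 1, 2)  # End time

# epoch is sorted, so a binary search gives the index range with t0 <= t <= t1.
clip_range = searchsortedfirst(epoch, t0):searchsortedlast(epoch, t1)

println(length(epoch))  # 744 samples in the full month
println(clip_range)     # 1:25, the samples inside the requested range
```

Under these assumptions the full month has 744 samples and the clipped view covers indices 1:25, which is consistent with the "View: 1:25" and "V (744)" lines in the output above.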

Additional Features

Get Metadata from Web Services

Access metadata directly from CDAWeb's RESTful services:

# Get descriptions of the instrument types that are available from CDAS.
instrument_types = get_instrument_types()
18-element Vector{Any}:
 JSON.Object{String, Any}("Name" => "Activity Indices")
 JSON.Object{String, Any}("Name" => "Electric Fields (space)")
 JSON.Object{String, Any}("Name" => "Engineering")
 JSON.Object{String, Any}("Name" => "Ephemeris/Attitude/Ancillary")
 JSON.Object{String, Any}("Name" => "Ground-Based HF-Radars")
 JSON.Object{String, Any}("Name" => "Ground-Based Imagers")
 JSON.Object{String, Any}("Name" => "Ground-Based Magnetometers, Riometers, Sounders")
 JSON.Object{String, Any}("Name" => "Ground-Based VLF/ELF/ULF, Photometers")
 JSON.Object{String, Any}("Name" => "Housekeeping")
 JSON.Object{String, Any}("Name" => "Imaging and Remote Sensing (ITM/Earth)")
 JSON.Object{String, Any}("Name" => "Imaging and Remote Sensing (Magnetosphere/Earth)")
 JSON.Object{String, Any}("Name" => "Imaging and Remote Sensing (Sun)")
 JSON.Object{String, Any}("Name" => "Imaging and Remote Sensing(Magnetosphere/Earth)")
 JSON.Object{String, Any}("Name" => "Magnetic Fields (Balloon)")
 JSON.Object{String, Any}("Name" => "Magnetic Fields (space)")
 JSON.Object{String, Any}("Name" => "Particles (space)")
 JSON.Object{String, Any}("Name" => "Plasma and Solar Wind")
 JSON.Object{String, Any}("Name" => "Radio and Plasma Waves (space)")

See also get_dataviews, get_datasets, get_instruments, get_instrument_types, get_observatories, get_observatory_groups, get_observatory_groups_and_instruments, get_original_file_descs, and get_data_file_descs. These functions are convenience wrappers around the CDAS RESTful Web Services, closely matching the original API.

Accessing Master CDF Metadata

Call get_data without a time range to retrieve metadata from the master CDF file:

# Update/download the master CDF files
CDAWeb.update_master_cdf()
# Returns metadata from the master CDF for the ACE magnetic field dataset
get_data("AC_H0_MFI", "BGSEc")
BGSEc (3 × 0)
  Datatype:    Float32
  Dimensions:  cartesian × Epoch
  Attributes:
   FILLVAL              = Float32[-1.0f31]
   FIELDNAM             = Mag Field vector, GSE coord
   VALIDMAX             = Float32[65534.0, 65534.0, 65534.0]
   SCALEMAX             = Float32[25.0, 25.0, 25.0]
   DEPEND_1             = cartesian
   CATDESC              = Magnetic Field Vector in GSE Cartesian coordinates (16 sec)
   AVG_TYPE             =  
   VALIDMIN             = Float32[-65534.0, -65534.0, -65534.0]
   DISPLAY_TYPE         = time_series
   UNITS                = nT
   VAR_NOTES            =  
   DEPEND_0             = Epoch
   LABL_PTR_1           = CommonDataFormat.StaticString{6, UInt8}["Bx GSE", "By GSE", "Bz GSE"]
   FORMAT               = F9.3
   VAR_TYPE             = data
   SCALEMIN             = Float32[-25.0, -25.0, -25.0]
   DICT_KEY             = magnetic_field
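
The FILLVAL, VALIDMIN, and VALIDMAX attributes shown above are conventionally used to mark missing or out-of-range samples. The following is a minimal sketch of such masking in plain Julia. The `samples` vector is made-up illustrative data standing in for one field component in nT, and the helper logic is an assumption for illustration, not something CDAWeb is documented to do for you.

```julia
# Made-up sample values (nT) standing in for one component of a Float32 variable.
samples = Float32[4.2, -1.0f31, -3.1, 70000.0, 5.0]

# Attribute values copied from the metadata listing above.
fillval  = -1.0f31
validmin = -65534.0f0
validmax = 65534.0f0

# Replace fill values and out-of-range samples with NaN.
clean = map(samples) do v
    (v == fillval || v < validmin || v > validmax) ? NaN32 : v
end

println(count(isnan, clean))  # number of masked samples
```

Using NaN keeps the array length and element type unchanged, so the masked values still line up with the Epoch axis.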

Finding Available Datasets

Search for datasets matching a pattern:

# Find all ACE H0 (high resolution) datasets
find_datasets("AC_H0")
2-element Vector{CDFDatasets.CDFDataset{CommonDataFormat.CDFDataset{CommonDataFormat.NoCompression, Int64}}}:
 AC_H0_MFI (17 variables: Epoch, Time_PB5, Magnitude, BGSEc, label_BGSE, BGSM, label_bgsm, dBrms, Q_FLAG, SC_pos_GSE, label_pos_GSE, SC_pos_GSM, …)
 AC_H0_SWE (19 variables: Epoch, Time_PB5, unit_time, label_time, format_time, Np, Vp, Tpr, alpha_ratio, V_GSE, label_V_GSE, V_RTN, …)

Cache Management

View cache metadata to inspect what data has been cached locally:

# Show metadata for web-served (processed) cached files
CDAWeb.cache_metadata(false)  |> scrollable_table
dataset                         variable  start_time           end_time             path
OMNI_COHO1HR_MERGED_MAG_PLASMA  V         2020-01-01T00:00:00  2020-01-02T00:00:00  /home/runner/.cdaweb/data/OMNI_COHO1HR_MERGED_MAG_PLASMA/V_omni_coho1hrs_merged_mag_plasma_20200101000000_20200102000000_cdaweb.cdf
# Show metadata for original CDF cached files
CDAWeb.cache_metadata(true)  |> scrollable_table
dataset                         start_time           end_time             path
AC_H0_MFI                       2023-01-01T00:00:00  2023-01-02T00:00:00  /home/runner/.cdaweb/data/AC_H0_MFI/ac_h0_mfi_20230101_v07.cdf
OMNI_COHO1HR_MERGED_MAG_PLASMA  2020-01-01T00:00:00  2020-01-31T23:00:00  /home/runner/.cdaweb/data/OMNI_COHO1HR_MERGED_MAG_PLASMA/omni_coho1hr_merged_mag_plasma_20200101_v01.cdf

API Reference

CDAWeb.CDAWebProduct - Type
CDAWebProduct{P} <: Function

A lazy specification for retrieving CDAWeb data. When called, it fetches data using get_data with clip=true and direct=false by default for better performance.

See also: CDAWebProducts, @cda_str
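
The "lazy specification" idea can be pictured as a callable struct: it stores what to fetch and only does the work when called with a time range. The sketch below is a generic illustration of that pattern in plain Julia. `LazyProduct` and its `fetcher` field are hypothetical names, and the fetcher here is a dummy function; none of this is CDAWeb's actual implementation.

```julia
# Hypothetical illustration of a lazy, callable product specification.
struct LazyProduct{F} <: Function
    dataset::String
    variable::String
    fetcher::F  # function (dataset, variable, t0, t1) -> data
end

# Calling the object triggers the fetch for the given time range.
(p::LazyProduct)(t0, t1) = p.fetcher(p.dataset, p.variable, t0, t1)

# Dummy fetcher that just echoes its arguments instead of downloading anything.
echo_fetch(dataset, variable, t0, t1) = (dataset, variable, t0, t1)

product = LazyProduct("OMNI_COHO1HR_MERGED_MAG_PLASMA", "V", echo_fetch)
result = product("2020-01-01", "2020-01-02")
println(result)
```

Because the object is a subtype of Function, it can be passed anywhere a function of (t0, t1) is expected.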

CDAWeb.get_data - Function
get_data(dataset, variable)
get_data(dataset, t0, t1; clip = false, master_attributes = false)
get_data(dataset, variable, t0, t1; direct = true, kw...)
get_data(path, t0, t1; kw...)

Fetch data for a dataset (variable) within a time range (t0, t1).

If no time range is specified, the master CDF dataset is returned.

A path-like format <dataset>/<variable> can also be used to specify the dataset and variable.

Set master_attributes=true to use master CDF attributes. Set clip=true to restrict data to exact time bounds. Set direct=false to fetch the entire dataset first, then index into it.

See get_data_files for caching options.
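
As an illustration of the path-like <dataset>/<variable> format, the sketch below splits such a string into its parts in plain Julia. `split_product_path` is a hypothetical helper written for this example, not a CDAWeb function, and the comma handling is an assumption modeled on the cda"dataset/parameter1,parameter2" form.

```julia
# Hypothetical helper, not part of CDAWeb: split "<dataset>/<variable>[,<variable>...]".
function split_product_path(path::AbstractString)
    parts = split(path, '/'; limit = 2)
    length(parts) == 2 ||
        throw(ArgumentError("expected \"<dataset>/<variable>\", got \"$path\""))
    dataset, variables = parts
    # Return the dataset ID and a vector of variable names.
    return String(dataset), String.(split(variables, ','))
end

dataset, variables = split_product_path("AC_H0_MFI/BGSEc")
println(dataset)    # AC_H0_MFI
println(variables)  # ["BGSEc"]
```

A string without a slash raises an ArgumentError rather than silently guessing which part is the dataset.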

CDAWeb.get_data_file_descs - Method
get_data_file_descs(dataset, variables, t0, t1; dataview = "sp_phys", format = "cdf", query...)

Get descriptive information about the data files for the specified dataset and variables.

See Get Data for more details.

CDAWeb.get_data_files - Method
get_data_files(dataset, variables, t0, t1; fragment_period = Hour(24), kw...)

Get processed data file paths for a dataset + variables within time range (t0, t1).

Note: this is usually slower, and poorly suited to requesting multiple variables, because of CDAWeb's web service processing overhead.

CDAWeb.get_data_files - Method
get_data_files(dataset, t0, t1; kw...)

Get original data file paths for a dataset within time range (t0, t1).

If files are not available in cache, they will be fetched and cached.

CDAWeb.get_dataset - Method
get_dataset(id, start_time, stop_time; kw...)

Get the dataset by id between start_time and stop_time.

If no dataset is available for the specified time range, the corresponding master dataset is returned.

CDAWeb.get_dataset - Method
get_dataset(id; kw...)

Get the dataset description by id.

The value of id may be any of the following identifiers:

  • CDAS dataset ID (e.g., AC_H2_MFI),
  • DOI (e.g., 10.48322/fh85-fj47),
  • SPASE ResourceID (e.g., spase://NASA/NumericalData/ACE/MAG/L2/PT1H).

See also get_datasets.

CDAWeb.get_observatory_groups_and_instruments - Method
get_observatory_groups_and_instruments(; query...)

Get descriptions of available observatory groups and instruments.

This is a convenience/performance alternative to making multiple calls to Get Observatory Groups, Get Observatories, and Get Instruments.

See Details for available query parameters.

CDAWeb.get_original_file_descs - Method
get_original_file_descs(id, start_time, stop_time; dataview = "sp_phys")

Get descriptive information about original data files from the id dataset.

Original data files may lack the updated metadata and virtual variable values contained in files obtained from the other Get Data services.

See also get_data_file_descs.

CDAWeb.@cda_str - Macro
cda"dataset/parameter"
cda"dataset/parameter1,parameter2"

String macro to create a CDAWebProduct from a string identifier. Multiple parameters can be given as a comma-separated list, in which case the macro returns a CDAWebProducts object (which behaves like a vector of CDAWebProduct).

Examples

# Single parameter
product = cda"OMNI_COHO1HR_MERGED_MAG_PLASMA/flow_speed"
product(t0, t1)

# Multiple parameters
products = cda"OMNI_COHO1HR_MERGED_MAG_PLASMA/flow_speed,Pressure"
products(t0, t1)