CDFDatasets

CDFDatasets.jl is a Julia package for reading CDF (Common Data Format) files, which are widely used in space physics and other scientific domains. It exposes CDF files through the CommonDataModel.jl interface.

Installation

using Pkg
Pkg.add("CDFDatasets")

Quickstart

Here's a quick example using OMNI solar wind data:

using CDFDatasets

# Open a CDF dataset
omni_file = joinpath(pkgdir(CDFDatasets), "data", "omni_coho1hr_merged_mag_plasma_20200501_v01.cdf")
ds = CDFDataset(omni_file)
Dataset: 
Group: omni_coho1hr_merged_mag_plasma

Variables
  Epoch   (744)
    Datatype:    CommonDataFormat.Epoch
    Dimensions:  
    Attributes:
     FILLVAL              = [-1.0e31]
     VAR_TYPE             = support_data
     CATDESC              = Epoch Time
     FIELDNAM             = Epoch Time
     VALIDMAX             = CommonDataFormat.Epoch[2020-12-31T23:59:59]
     SCALEMIN             = CommonDataFormat.Epoch[1963-01-01T00:00:00]
     SCALEMAX             = CommonDataFormat.Epoch[2020-12-31T23:59:59]
     VALIDMIN             = CommonDataFormat.Epoch[1963-01-01T00:00:00]
     UNITS                = DD-MMM-YYYY_hr:mm

  heliographicLatitude   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FILLVAL              = Float32[-1.0f31]
     FIELDNAM             = Heliographic Latitude
     VALIDMAX             = Float32[90.0]
     CATDESC              = HelioGraphic Inertial (HGI) latitude
     VALIDMIN             = Float32[-90.0]
     DISPLAY_TYPE         = time_series
     UNITS                = deg
     DEPEND_0             = Epoch
     FORMAT               = f7.1
     VAR_TYPE             = data
     LABLAXIS             = Heliographic Latitude

  heliographicLongitude   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FILLVAL              = Float32[-1.0f31]
     FIELDNAM             = Heliographic Longitude
     VALIDMAX             = Float32[360.0]
     CATDESC              = HGI longitude
     VALIDMIN             = Float32[0.0]
     DISPLAY_TYPE         = time_series
     UNITS                = deg
     DEPEND_0             = Epoch
     FORMAT               = f7.1
     VAR_TYPE             = data
     LABLAXIS             = Heliographic Longitude

  BR   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FILLVAL              = Float32[-1.0f31]
     FIELDNAM             = BR (RTN)
     VALIDMAX             = Float32[1000.0]
     CATDESC              = BR in RTN (Radial-Tangential-Normal) coordinate system
     VALIDMIN             = Float32[-1000.0]
     DISPLAY_TYPE         = time_series
     UNITS                = nT
     DEPEND_0             = Epoch
     FORMAT               = f6.1
     VAR_TYPE             = data
     LABLAXIS             = BR (RTN)

  BT   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FILLVAL              = Float32[-1.0f31]
     FIELDNAM             = BT (RTN)
     VALIDMAX             = Float32[1000.0]
     CATDESC              = BT in RTN coordinate system
     VALIDMIN             = Float32[-1000.0]
     DISPLAY_TYPE         = time_series
     UNITS                = nT
     DEPEND_0             = Epoch
     FORMAT               = f6.1
     VAR_TYPE             = data
     LABLAXIS             = BT (RTN)

  BN   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FILLVAL              = Float32[-1.0f31]
     FIELDNAM             = BN (RTN)
     VALIDMAX             = Float32[1000.0]
     CATDESC              = BN in RTN coordinate system
     VALIDMIN             = Float32[-1000.0]
     DISPLAY_TYPE         = time_series
     UNITS                = nT
     DEPEND_0             = Epoch
     FORMAT               = f6.1
     VAR_TYPE             = data
     LABLAXIS             = BN (RTN)

  ABS_B   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FILLVAL              = Float32[-1.0f31]
     FIELDNAM             = Field Magnitude Avg.
     VALIDMAX             = Float32[100.0]
     CATDESC              = Field Magnitude Average |B  1/N SUM |B|
     VALIDMIN             = Float32[0.0]
     DISPLAY_TYPE         = time_series
     UNITS                = nT
     DEPEND_0             = Epoch
     FORMAT               = F6.1
     VAR_TYPE             = data
     LABLAXIS             = Field Magnitude Avg.

  V   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FILLVAL              = Float32[-1.0f31]
     FIELDNAM             = Bulk Flow Speed
     VALIDMAX             = Float32[1200.0]
     CATDESC              = Bulk Flow Speed
     VALIDMIN             = Float32[0.0]
     DISPLAY_TYPE         = time_series
     UNITS                = km/s
     DEPEND_0             = Epoch
     FORMAT               = F5.0
     VAR_TYPE             = data
     LABLAXIS             = Bulk Flow Speed

  elevAngle   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FILLVAL              = Float32[-1.0f31]
     FIELDNAM             = elevation Angle 
     VALIDMAX             = Float32[90.0]
     CATDESC              = Proton flow elevation angle / latitude (RTN)
     VALIDMIN             = Float32[-90.0]
     DISPLAY_TYPE         = time_series
     UNITS                = Deg
     DEPEND_0             = Epoch
     FORMAT               = F6.1
     VAR_TYPE             = data
     LABLAXIS             = elevation Angle

  azimuthAngle   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FILLVAL              = Float32[-1.0f31]
     FIELDNAM             = azimuth Angle 
     VALIDMAX             = Float32[180.0]
     CATDESC              = Proton flow azimuth angle / longitude (RTN)
     VALIDMIN             = Float32[-180.0]
     DISPLAY_TYPE         = time_series
     UNITS                = Deg
     DEPEND_0             = Epoch
     FORMAT               = F6.1
     VAR_TYPE             = data
     LABLAXIS             = azimuth Angle

  N   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FILLVAL              = Float32[-1.0f31]
     FIELDNAM             = Ion Density
     VALIDMAX             = Float32[150.0]
     CATDESC              = Ion Density
     VALIDMIN             = Float32[0.0]
     DISPLAY_TYPE         = time_series
     UNITS                = N/cm3
     DEPEND_0             = Epoch
     FORMAT               = F6.1
     VAR_TYPE             = data
     LABLAXIS             = Ion density

  T   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FILLVAL              = Float32[-1.0f31]
     FIELDNAM             = Temperature
     VALIDMAX             = Float32[1.0f7]
     CATDESC              = Temperature
     VALIDMIN             = Float32[0.0]
     DISPLAY_TYPE         = time_series
     UNITS                = Deg K
     DEPEND_0             = Epoch
     FORMAT               = F9.0
     VAR_TYPE             = data
     LABLAXIS             = Temperature

Global attributes
  Project              = ["SPDF"]
  Discipline           = ["Space Physics>Interplanetary Studies"]
  Source_name          = ["OMNI (1AU IP Data)>Merged 1 Hour Interplantary OMNI data in RTN system"]
  Data_type            = ["COHO1HR>Definitive Hourly data from cohoweb"]
  Descriptor           = ["merged magnetic field and plasma data from cohoweb"]
  Data_version         = ["1"]
  TITLE                = ["Near-Earth Heliosphere Data (OMNI)"]
  TEXT                 = ["Hourly averaged definitive multispacecraft interplanetary parameters data", "The Heliographic Inertial (HGI) coordinates are Sun-centered and inertially fixed with respect to an X-axis directed along the intersection line of the ecliptic and solar equatorial  planes. The solar equator plane is inclined at 7.25 degrees from the ecliptic. This direction was towards ecliptic longitude of 74.367 degrees on 1 January 1900 at 1200 UT; because of precession of the celestial equator, this longitude increases by 1.4 degrees/century. The Z axis  is  directed perpendicular and northward from the solar equator, and the Y-axis completes the right-handed set. This system differs from the usual heliographic coordinates (e.g. Carrington longitudes) which are fixed in the frame of the rotating Sun.", "The RTN system is fixed at a spacecraft (or the planet). The R axis is directed radially away from the Sun, the T axis is the cross product of the solar rotation axis and the R axis, and the N axis is the cross product of the R and T axes.  At zero heliographic latitude, when the spacecraft is in the solar equatorial plane, the N and solar rotation axes are parallel.", "Latitude and longitude angles of solar wind plasma flow are generally measured  from the radius vector away from the Sun. In all cases, latitude angles are positive for north-going flow.  The flow longitude angles have been treated differently for the near-Earth data, i.e. the OMNI, and for the deep space data. The flow is positive for the  near-Earth data when coming from the right side of the Sun as viewed  from  the Earth, i.e. flowing toward +Y from -X GSE or opposite to the direction of planetary motion. On the other hand, the flow longitudes for the deep space spacecraft use the opposite sign convection, i.e. positive for flow in the +T direction in the RTN system."]
  MODS                 = ["created July 2007;", "conversion to ISTP/IACG CDFs via SKTEditor Feb 2000", "Time tags in CDAWeb version were modified in March 2005 to use the", "CDAWeb convention of having mid-average time tags rather than OMNI's", "original convention of start-of-average time tags."]
  PI_name              = ["J.H. King, N. Papatashvilli"]
  PI_affiliation       = ["AdnetSystems, NASA GSFC"]
  Generation_date      = ["Ongoing"]
  Acknowledgement      = ["NSSDC"]
  ADID_ref             = ["NSSD0110"]
  Rules_of_use         = ["Public"]
  Instrument_type      = ["Plasma and Solar Wind", "Magnetic Fields (space)"]
  Generated_by         = ["King/Papatashvilli"]
  Time_resolution      = ["1 hour"]
  Logical_file_id      = ["omni_coho1hr_merged_mag_plasma_00000000_v01"]
  Logical_source       = ["omni_coho1hr_merged_mag_plasma"]
  Logical_source_description = ["OMNI Combined merged hourly magnetic field, plasma and ephermis data"]
  LINK_TEXT            = ["COHO dataset", "Additional analysis tools for these data from the"]
  LINK_TITLE           = ["Documentation", "COHOWeb service"]
  HTTP_LINK            = ["https://omniweb.gsfc.nasa.gov/coho/html/cw_data.html", "http://cohoweb.gsfc.nasa.gov"]
  alt_logical_source   = ["Combined_OMNI_1AU-MagneticField-Plasma-Particles_mrg1hr_1hour_cdf"]
  Mission_group        = ["OMNI (Combined 1AU IP Data; Magnetic and Solar Indices)", "ACE", "Wind", "IMP (All)", "!___Interplanetary Data near 1 AU"]
  spase_DatasetResourceID = ["spase://NASA/NumericalData/OMNI/COHO/MergedMagPlasma/PT1H"]
  DOI                  = ["https://doi.org/10.48322/6ffx-3441"]

Explore the dataset

julia> println("Variables: ", keys(ds))
Variables: ["Epoch", "heliographicLatitude", "heliographicLongitude", "BR", "BT", "BN", "ABS_B", "V", "elevAngle", "azimuthAngle", "N", "T"]

julia> println("Attributes: ", keys(ds.attrib))
Attributes: ["Project", "Discipline", "Source_name", "Data_type", "Descriptor", "Data_version", "TITLE", "TEXT", "MODS", "PI_name", "PI_affiliation", "Generation_date", "Acknowledgement", "ADID_ref", "Rules_of_use", "Instrument_type", "Generated_by", "Time_resolution", "Logical_file_id", "Logical_source", "Logical_source_description", "LINK_TEXT", "LINK_TITLE", "HTTP_LINK", "alt_logical_source", "Mission_group", "spase_DatasetResourceID", "DOI"]

julia> ds.attrib["Descriptor"]
1-element Vector{String}:
 "merged magnetic field and plasma data from cohoweb"

Access variables

julia> ds["Epoch"]
Epoch (744)
  Datatype:    CommonDataFormat.Epoch
  Dimensions:
  Attributes:
   FILLVAL              = [-1.0e31]
   VAR_TYPE             = support_data
   CATDESC              = Epoch Time
   FIELDNAM             = Epoch Time
   VALIDMAX             = CommonDataFormat.Epoch[2020-12-31T23:59:59]
   SCALEMIN             = CommonDataFormat.Epoch[1963-01-01T00:00:00]
   SCALEMAX             = CommonDataFormat.Epoch[2020-12-31T23:59:59]
   VALIDMIN             = CommonDataFormat.Epoch[1963-01-01T00:00:00]
   UNITS                = DD-MMM-YYYY_hr:mm
julia> ds["Epoch"][[1,end]]
2-element Vector{CommonDataFormat.Epoch}:
 2020-05-01T00:00:00
 2020-05-31T23:00:00

julia> ds["BR"]
BR (744)
  Datatype:    Float32
  Dimensions:  Epoch
  Attributes:
   FILLVAL              = Float32[-1.0f31]
   FIELDNAM             = BR (RTN)
   VALIDMAX             = Float32[1000.0]
   CATDESC              = BR in RTN (Radial-Tangential-Normal) coordinate system
   VALIDMIN             = Float32[-1000.0]
   DISPLAY_TYPE         = time_series
   UNITS                = nT
   DEPEND_0             = Epoch
   FORMAT               = f6.1
   VAR_TYPE             = data
   LABLAXIS             = BR (RTN)
# Calculate magnetic field magnitude
br = ds["BR"]
bt = ds["BT"]
bn = ds["BN"]
b_mag = sqrt.(br.^2 + bt.^2 + bn.^2) |> collect
744-element Vector{Float32}:
 3.544009
 2.9342802
 3.442383
 3.8961518
 0.64031243
 5.3721504
 5.4990907
 5.9514704
 4.3301272
 4.634652
 ⋮
 3.3301651
 2.9949956
 3.226453
 3.0380914
 3.828838
 3.3301651
 3.3674917
 3.749667
 3.8026307
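
The variable listings above report FILLVAL = Float32[-1.0f31] for BR, BT and BN. This page does not say whether fill values are masked automatically, so the following is only a minimal sketch of how they might be handled by hand before computing a magnitude. It uses plain Float32 vectors with synthetic values rather than the OMNI file, and mask_fill is a hypothetical helper, not part of CDFDatasets.

```julia
# Fill value copied from the FILLVAL attribute shown in the variable listings.
const FILLVAL = -1.0f31

# Hypothetical helper (not part of CDFDatasets): replace fill values with NaN
# so that missing samples propagate visibly through later arithmetic.
mask_fill(x::AbstractVector{Float32}; fillval::Float32 = FILLVAL) =
    [v == fillval ? NaN32 : v for v in x]

# Synthetic sample values for illustration only (not taken from the OMNI file).
br = Float32[3.0, FILLVAL, 1.0]
bt = Float32[4.0, 2.0, 2.0]
bn = Float32[0.0, 1.0, 2.0]

br_m, bt_m, bn_m = mask_fill(br), mask_fill(bt), mask_fill(bn)

# Same magnitude formula as in the quickstart; the result is NaN wherever
# any component was a fill value.
b_mag = sqrt.(br_m .^ 2 .+ bt_m .^ 2 .+ bn_m .^ 2)
```

Without the mask, squaring -1.0f31 overflows Float32 and the magnitude for that record becomes Inf, which is easy to overlook in plots and averages. Functions such as filter(!isnan, b_mag) can then drop the masked records.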

API Reference

CDFDatasets.CDFDataset (method)
CDFDataset(file; lazy = true)

Load the CDF dataset at the path file. The dataset supports the API of JuliaGeo/CommonDataModel.jl.

lazy controls whether variable values are loaded immediately or only when accessed by the user. If true (the default), variable values are loaded on demand. If false, all variable values are loaded during parsing.

CDFDatasets.ConcatCDFVariable (method)
ConcatCDFVariable(arrays; metadata = nothing, dim = nothing)

Concatenate multiple CDF variables along the dim dimension. By default, dim is the record dimension, which is the last dimension.

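
To make the default axis concrete, here is a small sketch of concatenation along the last dimension using plain Base arrays as stand-ins for CDF variables. It does not call ConcatCDFVariable and says nothing about how the package implements it; the array shapes and values are made up for illustration.

```julia
# Two hypothetical chunks laid out as (components, records).
a = Float32[1 2; 3 4; 5 6]              # 3 components, 2 records
b = reshape(Float32[7, 8, 9], 3, 1)     # 3 components, 1 record

# The record dimension is taken to be the last one, as described above.
record_dim = ndims(a)

# Join the chunks along that dimension using Base.cat.
joined = cat(a, b; dims = record_dim)

# For one-dimensional variables (such as the 744-element series in the
# quickstart) the last dimension is also the first, so this reduces to vcat.
t1 = Float32[1, 2]
t2 = Float32[3]
series = cat(t1, t2; dims = ndims(t1))
```

Here joined has size (3, 3) and its third column holds the record from b, while series is a three-element vector.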