CDFDatasets

CDFDatasets.jl is a Julia package for reading CDF (Common Data Format) files, which are widely used in space physics and other scientific domains. It exposes CDF files through the CommonDataModel.jl interface. See the CDF reader benchmarks for a comparison with other CDF readers.

Installation

using Pkg
Pkg.add("CDFDatasets")

Quickstart

Here's a quick example using OMNI solar wind data:

using CDFDatasets

# Open a CDF dataset
omni_file = joinpath(pkgdir(CDFDatasets), "data/omni_coho1hr_merged_mag_plasma_20200501_v01.cdf")
ds = cdfopen(omni_file)
Dataset: /home/runner/work/CDFDatasets.jl/CDFDatasets.jl/data/omni_coho1hr_merged_mag_plasma_20200501_v01.cdf
Group: omni_coho1hr_merged_mag_plasma

Data variables
  heliographicLatitude   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FIELDNAM             = Heliographic Latitude
     CATDESC              = HelioGraphic Inertial (HGI) latitude
     VALIDMIN             = Float32[-90.0]
     VALIDMAX             = Float32[90.0]
     UNITS                = deg
     FORMAT               = f7.1
     FILLVAL              = Float32[-1.0f31]
     VAR_TYPE             = data
     DISPLAY_TYPE         = time_series
     DEPEND_0             = Epoch
     LABLAXIS             = Heliographic Latitude

  heliographicLongitude   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FIELDNAM             = Heliographic Longitude
     CATDESC              = HGI longitude
     VALIDMIN             = Float32[0.0]
     VALIDMAX             = Float32[360.0]
     UNITS                = deg
     FORMAT               = f7.1
     FILLVAL              = Float32[-1.0f31]
     VAR_TYPE             = data
     DISPLAY_TYPE         = time_series
     DEPEND_0             = Epoch
     LABLAXIS             = Heliographic Longitude

  BR   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FIELDNAM             = BR (RTN)
     CATDESC              = BR in RTN (Radial-Tangential-Normal) coordinate system
     VALIDMIN             = Float32[-1000.0]
     VALIDMAX             = Float32[1000.0]
     UNITS                = nT
     FORMAT               = f6.1
     FILLVAL              = Float32[-1.0f31]
     VAR_TYPE             = data
     DISPLAY_TYPE         = time_series
     DEPEND_0             = Epoch
     LABLAXIS             = BR (RTN)

  BT   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FIELDNAM             = BT (RTN)
     CATDESC              = BT in RTN coordinate system
     VALIDMIN             = Float32[-1000.0]
     VALIDMAX             = Float32[1000.0]
     UNITS                = nT
     FORMAT               = f6.1
     FILLVAL              = Float32[-1.0f31]
     VAR_TYPE             = data
     DISPLAY_TYPE         = time_series
     DEPEND_0             = Epoch
     LABLAXIS             = BT (RTN)

  BN   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FIELDNAM             = BN (RTN)
     CATDESC              = BN in RTN coordinate system
     VALIDMIN             = Float32[-1000.0]
     VALIDMAX             = Float32[1000.0]
     UNITS                = nT
     FORMAT               = f6.1
     FILLVAL              = Float32[-1.0f31]
     VAR_TYPE             = data
     DISPLAY_TYPE         = time_series
     DEPEND_0             = Epoch
     LABLAXIS             = BN (RTN)

  ABS_B   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FIELDNAM             = Field Magnitude Avg.
     CATDESC              = Field Magnitude Average |B  1/N SUM |B|
     VALIDMIN             = Float32[0.0]
     VALIDMAX             = Float32[100.0]
     UNITS                = nT
     FORMAT               = F6.1
     FILLVAL              = Float32[-1.0f31]
     VAR_TYPE             = data
     DISPLAY_TYPE         = time_series
     DEPEND_0             = Epoch
     LABLAXIS             = Field Magnitude Avg.

  V   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FIELDNAM             = Bulk Flow Speed
     CATDESC              = Bulk Flow Speed
     VALIDMIN             = Float32[0.0]
     VALIDMAX             = Float32[1200.0]
     UNITS                = km/s
     FORMAT               = F5.0
     FILLVAL              = Float32[-1.0f31]
     VAR_TYPE             = data
     DISPLAY_TYPE         = time_series
     DEPEND_0             = Epoch
     LABLAXIS             = Bulk Flow Speed

  elevAngle   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FIELDNAM             = elevation Angle 
     CATDESC              = Proton flow elevation angle / latitude (RTN)
     VALIDMIN             = Float32[-90.0]
     VALIDMAX             = Float32[90.0]
     UNITS                = Deg
     FORMAT               = F6.1
     FILLVAL              = Float32[-1.0f31]
     VAR_TYPE             = data
     DISPLAY_TYPE         = time_series
     DEPEND_0             = Epoch
     LABLAXIS             = elevation Angle

  azimuthAngle   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FIELDNAM             = azimuth Angle 
     CATDESC              = Proton flow azimuth angle / longitude (RTN)
     VALIDMIN             = Float32[-180.0]
     VALIDMAX             = Float32[180.0]
     UNITS                = Deg
     FORMAT               = F6.1
     FILLVAL              = Float32[-1.0f31]
     VAR_TYPE             = data
     DISPLAY_TYPE         = time_series
     DEPEND_0             = Epoch
     LABLAXIS             = azimuth Angle

  N   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FIELDNAM             = Ion Density
     CATDESC              = Ion Density
     VALIDMIN             = Float32[0.0]
     VALIDMAX             = Float32[150.0]
     UNITS                = N/cm3
     FORMAT               = F6.1
     FILLVAL              = Float32[-1.0f31]
     VAR_TYPE             = data
     DISPLAY_TYPE         = time_series
     DEPEND_0             = Epoch
     LABLAXIS             = Ion density

  T   (744)
    Datatype:    Float32
    Dimensions:  Epoch
    Attributes:
     FIELDNAM             = Temperature
     CATDESC              = Temperature
     VALIDMIN             = Float32[0.0]
     VALIDMAX             = Float32[1.0f7]
     UNITS                = Deg K
     FORMAT               = F9.0
     FILLVAL              = Float32[-1.0f31]
     VAR_TYPE             = data
     DISPLAY_TYPE         = time_series
     DEPEND_0             = Epoch
     LABLAXIS             = Temperature

Support variables: Epoch
Global attributes
  Project              = ["SPDF"]
  Discipline           = ["Space Physics>Interplanetary Studies"]
  Source_name          = ["OMNI (1AU IP Data)>Merged 1 Hour Interplantary OMNI data in RTN system"]
  Data_type            = ["COHO1HR>Definitive Hourly data from cohoweb"]
  Descriptor           = ["merged magnetic field and plasma data from cohoweb"]
  Data_version         = ["1"]
  TITLE                = ["Near-Earth Heliosphere Data (OMNI)"]
  TEXT                 = ["Hourly averaged definitive multispacecraft interplanetary parameters data", "The Heliographic Inertial (HGI) coordinates are Sun-centered and inertially fixed with respect to an X-axis directed along the intersection line of the ecliptic and solar equatorial  planes. The solar equator plane is inclined at 7.25 degrees from the ecliptic. This direction was towards ecliptic longitude of 74.367 degrees on 1 January 1900 at 1200 UT; because of precession of the celestial equator, this longitude increases by 1.4 degrees/century. The Z axis  is  directed perpendicular and northward from the solar equator, and the Y-axis completes the right-handed set. This system differs from the usual heliographic coordinates (e.g. Carrington longitudes) which are fixed in the frame of the rotating Sun.", "The RTN system is fixed at a spacecraft (or the planet). The R axis is directed radially away from the Sun, the T axis is the cross product of the solar rotation axis and the R axis, and the N axis is the cross product of the R and T axes.  At zero heliographic latitude, when the spacecraft is in the solar equatorial plane, the N and solar rotation axes are parallel.", "Latitude and longitude angles of solar wind plasma flow are generally measured  from the radius vector away from the Sun. In all cases, latitude angles are positive for north-going flow.  The flow longitude angles have been treated differently for the near-Earth data, i.e. the OMNI, and for the deep space data. The flow is positive for the  near-Earth data when coming from the right side of the Sun as viewed  from  the Earth, i.e. flowing toward +Y from -X GSE or opposite to the direction of planetary motion. On the other hand, the flow longitudes for the deep space spacecraft use the opposite sign convection, i.e. positive for flow in the +T direction in the RTN system."]
  MODS                 = ["created July 2007;", "conversion to ISTP/IACG CDFs via SKTEditor Feb 2000", "Time tags in CDAWeb version were modified in March 2005 to use the", "CDAWeb convention of having mid-average time tags rather than OMNI's", "original convention of start-of-average time tags."]
  PI_name              = ["J.H. King, N. Papatashvilli"]
  PI_affiliation       = ["AdnetSystems, NASA GSFC"]
  Generation_date      = ["Ongoing"]
  Acknowledgement      = ["NSSDC"]
  ADID_ref             = ["NSSD0110"]
  Rules_of_use         = ["Public"]
  Instrument_type      = ["Plasma and Solar Wind", "Magnetic Fields (space)"]
  Generated_by         = ["King/Papatashvilli"]
  Time_resolution      = ["1 hour"]
  Logical_file_id      = ["omni_coho1hr_merged_mag_plasma_00000000_v01"]
  Logical_source       = ["omni_coho1hr_merged_mag_plasma"]
  Logical_source_description = ["OMNI Combined merged hourly magnetic field, plasma and ephermis data"]
  LINK_TEXT            = ["COHO dataset", "Additional analysis tools for these data from the"]
  LINK_TITLE           = ["Documentation", "COHOWeb service"]
  HTTP_LINK            = ["https://omniweb.gsfc.nasa.gov/coho/html/cw_data.html", "http://cohoweb.gsfc.nasa.gov"]
  alt_logical_source   = ["Combined_OMNI_1AU-MagneticField-Plasma-Particles_mrg1hr_1hour_cdf"]
  Mission_group        = ["OMNI (Combined 1AU IP Data; Magnetic and Solar Indices)", "ACE", "Wind", "IMP (All)", "!___Interplanetary Data near 1 AU"]
  spase_DatasetResourceID = ["spase://NASA/NumericalData/OMNI/COHO/MergedMagPlasma/PT1H"]
  DOI                  = ["https://doi.org/10.48322/6ffx-3441"]

Explore the dataset

julia> println("Variables: ", keys(ds))
Variables: ["Epoch", "heliographicLatitude", "heliographicLongitude", "BR", "BT", "BN", "ABS_B", "V", "elevAngle", "azimuthAngle", "N", "T"]
julia> println("Attributes: ", keys(ds.attrib))
Attributes: ["Project", "Discipline", "Source_name", "Data_type", "Descriptor", "Data_version", "TITLE", "TEXT", "MODS", "PI_name", "PI_affiliation", "Generation_date", "Acknowledgement", "ADID_ref", "Rules_of_use", "Instrument_type", "Generated_by", "Time_resolution", "Logical_file_id", "Logical_source", "Logical_source_description", "LINK_TEXT", "LINK_TITLE", "HTTP_LINK", "alt_logical_source", "Mission_group", "spase_DatasetResourceID", "DOI"]
julia> ds.attrib["Descriptor"]
1-element Vector{String}:
 "merged magnetic field and plasma data from cohoweb"
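Global attribute values come back as vectors of strings, even when there is only one entry. A minimal sketch of unwrapping them, using a plain Dict as a stand-in for ds.attrib (the Dict and the variable names below are illustrative and not part of CDFDatasets):

```julia
# Stand-in for ds.attrib: each attribute maps to a vector of entries,
# mirroring the values printed above.
attrib = Dict(
    "Descriptor" => ["merged magnetic field and plasma data from cohoweb"],
    "Instrument_type" => ["Plasma and Solar Wind", "Magnetic Fields (space)"],
)

# `only` returns the single element and throws if there is not exactly one,
# which guards against silently dropping extra entries.
descriptor = only(attrib["Descriptor"])

# Multi-entry attributes can be joined into one display string.
instruments = join(attrib["Instrument_type"], "; ")

println(descriptor)
println(instruments)
```

Using only rather than first makes the single-entry assumption explicit; for attributes such as TEXT or MODS, which hold several entries, join or plain iteration is the safer choice.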

Access variables

julia> ds["Epoch"]
Epoch (744)
  Datatype:    Epoch
  Dimensions:
  Attributes:
   FIELDNAM             = Epoch Time
   CATDESC              = Epoch Time
   VALIDMIN             = Epoch[1963-01-01T00:00:00]
   VALIDMAX             = Epoch[2020-12-31T23:59:59]
   SCALEMIN             = Epoch[1963-01-01T00:00:00]
   SCALEMAX             = Epoch[2020-12-31T23:59:59]
   UNITS                = DD-MMM-YYYY_hr:mm
   FILLVAL              = [-1.0e31]
   VAR_TYPE             = support_data
julia> ds["Epoch"][[1,end]]
2-element Vector{Epoch}:
 2020-05-01T00:00:00
 2020-05-31T23:00:00
julia> ds["BR"]
BR (744)
  Datatype:    Float32
  Dimensions:  Epoch
  Attributes:
   FIELDNAM             = BR (RTN)
   CATDESC              = BR in RTN (Radial-Tangential-Normal) coordinate system
   VALIDMIN             = Float32[-1000.0]
   VALIDMAX             = Float32[1000.0]
   UNITS                = nT
   FORMAT               = f6.1
   FILLVAL              = Float32[-1.0f31]
   VAR_TYPE             = data
   DISPLAY_TYPE         = time_series
   DEPEND_0             = Epoch
   LABLAXIS             = BR (RTN)
# Calculate magnetic field magnitude
br = ds["BR"]
bt = ds["BT"]
bn = ds["BN"]
b_mag = sqrt.(br.^2 + bt.^2 + bn.^2) |> collect
744-element Vector{Float32}:
 3.544009
 2.9342802
 3.442383
 3.8961518
 0.64031243
 5.3721504
 5.4990907
 5.9514704
 4.3301272
 4.634652
 ⋮
 3.3301651
 2.9949956
 3.226453
 3.0380914
 3.828838
 3.3301651
 3.3674917
 3.749667
 3.8026307
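
Each data variable in this file declares FILLVAL = -1.0f31 for missing records. Whether fill values are masked on read is not stated here, so the sketch below assumes the raw values may still contain them and replaces them with NaN before computing the magnitude. It uses plain Float32 vectors as stand-ins for the BR, BT and BN variables; mask_fill and fill_value are hypothetical names, not part of CDFDatasets.

```julia
# Stand-in data: plain Float32 vectors mimicking BR, BT and BN records.
# fill_value mirrors the FILLVAL attribute shown in the dataset listing.
fill_value = -1.0f31

br = Float32[1.0, fill_value, 3.0]
bt = Float32[2.0, 0.5, 4.0]
bn = Float32[2.0, 0.5, 0.0]

# Replace fill values with NaN32 so they propagate through arithmetic
# instead of producing a huge, misleading magnitude.
mask_fill(x; fillval = fill_value) = map(v -> v == fillval ? NaN32 : v, x)

# Elementwise magnitude; the dotted operators fuse into a single loop.
b_mag = sqrt.(mask_fill(br) .^ 2 .+ mask_fill(bt) .^ 2 .+ mask_fill(bn) .^ 2)

println(b_mag)
```

Records where any component is a fill value come out as NaN, so they can be dropped afterwards with filter(!isnan, b_mag) or skipped in statistics.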

API Reference

CDFDatasets.CDFDataset - Method
CDFDataset(file; backend = :julia)

Load the CDF dataset at the file path. The dataset supports the CommonDataModel.jl (JuliaGeo) API.

The backend keyword selects the reader used to load the dataset. Two options are available: :julia (the default) and :PyCDFpp.

For the :PyCDFpp backend, lazy_load = true is the default. With lazy_load = false, all variable values are loaded immediately.

CDFDatasets.ConcatCDFVariable - Method
ConcatCDFVariable(arrays; metadata = nothing, dim = nothing)

Concatenate multiple CDF variables along the dimension dim, which defaults to the record (last) dimension.
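
To make "concatenate along the record (last) dimension" concrete, here is a minimal sketch using plain arrays and Base.cat. It only illustrates the resulting shape and ordering; it does not call ConcatCDFVariable, and whether that type copies data or wraps the inputs lazily is not stated here.

```julia
# Two stand-in "variables" with shape (components, records).
a = reshape(Float32.(1:6), 3, 2)    # 2 records of 3 components each
b = reshape(Float32.(7:15), 3, 3)   # 3 records of 3 components each

# Concatenate along the last dimension, i.e. stack the records of `b`
# after the records of `a` while keeping the component axis unchanged.
joined = cat(a, b; dims = ndims(a))

println(size(joined))
println(joined[:, 3])   # first record that came from `b`
```

All inputs must agree on every dimension except the one being concatenated, which is why both stand-ins have 3 components.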

CDFDatasets.cdfopen - Method
cdfopen(file; kw...) :: CDFDataset
cdfopen(files; kw...) :: ConcatCDFDataset

Open one or more CDF files as an AbstractCDFDataset. A single file yields a CDFDataset; a collection of files yields a ConcatCDFDataset.
