CDFDatasets
CDFDatasets.jl is a Julia package for reading CDF (Common Data Format) files, which are widely used in space physics and other scientific domains. It exposes CDF files through the CommonDataModel.jl interface. See CDF reader benchmarks for a comparison with other CDF readers.
Installation
using Pkg
Pkg.add("CDFDatasets")
Quickstart
Here's a quick example using OMNI solar wind data:
using CDFDatasets
# Open a CDF dataset
omni_file = joinpath(pkgdir(CDFDatasets), "data/omni_coho1hr_merged_mag_plasma_20200501_v01.cdf")
ds = cdfopen(omni_file)
Dataset: /home/runner/work/CDFDatasets.jl/CDFDatasets.jl/data/omni_coho1hr_merged_mag_plasma_20200501_v01.cdf
Group: omni_coho1hr_merged_mag_plasma
Data variables
heliographicLatitude (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FIELDNAM = Heliographic Latitude
CATDESC = HelioGraphic Inertial (HGI) latitude
VALIDMIN = Float32[-90.0]
VALIDMAX = Float32[90.0]
UNITS = deg
FORMAT = f7.1
FILLVAL = Float32[-1.0f31]
VAR_TYPE = data
DISPLAY_TYPE = time_series
DEPEND_0 = Epoch
LABLAXIS = Heliographic Latitude
heliographicLongitude (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FIELDNAM = Heliographic Longitude
CATDESC = HGI longitude
VALIDMIN = Float32[0.0]
VALIDMAX = Float32[360.0]
UNITS = deg
FORMAT = f7.1
FILLVAL = Float32[-1.0f31]
VAR_TYPE = data
DISPLAY_TYPE = time_series
DEPEND_0 = Epoch
LABLAXIS = Heliographic Longitude
BR (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FIELDNAM = BR (RTN)
CATDESC = BR in RTN (Radial-Tangential-Normal) coordinate system
VALIDMIN = Float32[-1000.0]
VALIDMAX = Float32[1000.0]
UNITS = nT
FORMAT = f6.1
FILLVAL = Float32[-1.0f31]
VAR_TYPE = data
DISPLAY_TYPE = time_series
DEPEND_0 = Epoch
LABLAXIS = BR (RTN)
BT (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FIELDNAM = BT (RTN)
CATDESC = BT in RTN coordinate system
VALIDMIN = Float32[-1000.0]
VALIDMAX = Float32[1000.0]
UNITS = nT
FORMAT = f6.1
FILLVAL = Float32[-1.0f31]
VAR_TYPE = data
DISPLAY_TYPE = time_series
DEPEND_0 = Epoch
LABLAXIS = BT (RTN)
BN (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FIELDNAM = BN (RTN)
CATDESC = BN in RTN coordinate system
VALIDMIN = Float32[-1000.0]
VALIDMAX = Float32[1000.0]
UNITS = nT
FORMAT = f6.1
FILLVAL = Float32[-1.0f31]
VAR_TYPE = data
DISPLAY_TYPE = time_series
DEPEND_0 = Epoch
LABLAXIS = BN (RTN)
ABS_B (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FIELDNAM = Field Magnitude Avg.
CATDESC = Field Magnitude Average |B 1/N SUM |B|
VALIDMIN = Float32[0.0]
VALIDMAX = Float32[100.0]
UNITS = nT
FORMAT = F6.1
FILLVAL = Float32[-1.0f31]
VAR_TYPE = data
DISPLAY_TYPE = time_series
DEPEND_0 = Epoch
LABLAXIS = Field Magnitude Avg.
V (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FIELDNAM = Bulk Flow Speed
CATDESC = Bulk Flow Speed
VALIDMIN = Float32[0.0]
VALIDMAX = Float32[1200.0]
UNITS = km/s
FORMAT = F5.0
FILLVAL = Float32[-1.0f31]
VAR_TYPE = data
DISPLAY_TYPE = time_series
DEPEND_0 = Epoch
LABLAXIS = Bulk Flow Speed
elevAngle (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FIELDNAM = elevation Angle
CATDESC = Proton flow elevation angle / latitude (RTN)
VALIDMIN = Float32[-90.0]
VALIDMAX = Float32[90.0]
UNITS = Deg
FORMAT = F6.1
FILLVAL = Float32[-1.0f31]
VAR_TYPE = data
DISPLAY_TYPE = time_series
DEPEND_0 = Epoch
LABLAXIS = elevation Angle
azimuthAngle (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FIELDNAM = azimuth Angle
CATDESC = Proton flow azimuth angle / longitude (RTN)
VALIDMIN = Float32[-180.0]
VALIDMAX = Float32[180.0]
UNITS = Deg
FORMAT = F6.1
FILLVAL = Float32[-1.0f31]
VAR_TYPE = data
DISPLAY_TYPE = time_series
DEPEND_0 = Epoch
LABLAXIS = azimuth Angle
N (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FIELDNAM = Ion Density
CATDESC = Ion Density
VALIDMIN = Float32[0.0]
VALIDMAX = Float32[150.0]
UNITS = N/cm3
FORMAT = F6.1
FILLVAL = Float32[-1.0f31]
VAR_TYPE = data
DISPLAY_TYPE = time_series
DEPEND_0 = Epoch
LABLAXIS = Ion density
T (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FIELDNAM = Temperature
CATDESC = Temperature
VALIDMIN = Float32[0.0]
VALIDMAX = Float32[1.0f7]
UNITS = Deg K
FORMAT = F9.0
FILLVAL = Float32[-1.0f31]
VAR_TYPE = data
DISPLAY_TYPE = time_series
DEPEND_0 = Epoch
LABLAXIS = Temperature
Support variables: Epoch
Global attributes
Project = ["SPDF"]
Discipline = ["Space Physics>Interplanetary Studies"]
Source_name = ["OMNI (1AU IP Data)>Merged 1 Hour Interplantary OMNI data in RTN system"]
Data_type = ["COHO1HR>Definitive Hourly data from cohoweb"]
Descriptor = ["merged magnetic field and plasma data from cohoweb"]
Data_version = ["1"]
TITLE = ["Near-Earth Heliosphere Data (OMNI)"]
TEXT = ["Hourly averaged definitive multispacecraft interplanetary parameters data", "The Heliographic Inertial (HGI) coordinates are Sun-centered and inertially fixed with respect to an X-axis directed along the intersection line of the ecliptic and solar equatorial planes. The solar equator plane is inclined at 7.25 degrees from the ecliptic. This direction was towards ecliptic longitude of 74.367 degrees on 1 January 1900 at 1200 UT; because of precession of the celestial equator, this longitude increases by 1.4 degrees/century. The Z axis is directed perpendicular and northward from the solar equator, and the Y-axis completes the right-handed set. This system differs from the usual heliographic coordinates (e.g. Carrington longitudes) which are fixed in the frame of the rotating Sun.", "The RTN system is fixed at a spacecraft (or the planet). The R axis is directed radially away from the Sun, the T axis is the cross product of the solar rotation axis and the R axis, and the N axis is the cross product of the R and T axes. At zero heliographic latitude, when the spacecraft is in the solar equatorial plane, the N and solar rotation axes are parallel.", "Latitude and longitude angles of solar wind plasma flow are generally measured from the radius vector away from the Sun. In all cases, latitude angles are positive for north-going flow. The flow longitude angles have been treated differently for the near-Earth data, i.e. the OMNI, and for the deep space data. The flow is positive for the near-Earth data when coming from the right side of the Sun as viewed from the Earth, i.e. flowing toward +Y from -X GSE or opposite to the direction of planetary motion. On the other hand, the flow longitudes for the deep space spacecraft use the opposite sign convection, i.e. positive for flow in the +T direction in the RTN system."]
MODS = ["created July 2007;", "conversion to ISTP/IACG CDFs via SKTEditor Feb 2000", "Time tags in CDAWeb version were modified in March 2005 to use the", "CDAWeb convention of having mid-average time tags rather than OMNI's", "original convention of start-of-average time tags."]
PI_name = ["J.H. King, N. Papatashvilli"]
PI_affiliation = ["AdnetSystems, NASA GSFC"]
Generation_date = ["Ongoing"]
Acknowledgement = ["NSSDC"]
ADID_ref = ["NSSD0110"]
Rules_of_use = ["Public"]
Instrument_type = ["Plasma and Solar Wind", "Magnetic Fields (space)"]
Generated_by = ["King/Papatashvilli"]
Time_resolution = ["1 hour"]
Logical_file_id = ["omni_coho1hr_merged_mag_plasma_00000000_v01"]
Logical_source = ["omni_coho1hr_merged_mag_plasma"]
Logical_source_description = ["OMNI Combined merged hourly magnetic field, plasma and ephermis data"]
LINK_TEXT = ["COHO dataset", "Additional analysis tools for these data from the"]
LINK_TITLE = ["Documentation", "COHOWeb service"]
HTTP_LINK = ["https://omniweb.gsfc.nasa.gov/coho/html/cw_data.html", "http://cohoweb.gsfc.nasa.gov"]
alt_logical_source = ["Combined_OMNI_1AU-MagneticField-Plasma-Particles_mrg1hr_1hour_cdf"]
Mission_group = ["OMNI (Combined 1AU IP Data; Magnetic and Solar Indices)", "ACE", "Wind", "IMP (All)", "!___Interplanetary Data near 1 AU"]
spase_DatasetResourceID = ["spase://NASA/NumericalData/OMNI/COHO/MergedMagPlasma/PT1H"]
DOI = ["https://doi.org/10.48322/6ffx-3441"]
Explore the dataset
julia> println("Variables: ", keys(ds))
Variables: ["Epoch", "heliographicLatitude", "heliographicLongitude", "BR", "BT", "BN", "ABS_B", "V", "elevAngle", "azimuthAngle", "N", "T"]

julia> println("Attributes: ", keys(ds.attrib))
Attributes: ["Project", "Discipline", "Source_name", "Data_type", "Descriptor", "Data_version", "TITLE", "TEXT", "MODS", "PI_name", "PI_affiliation", "Generation_date", "Acknowledgement", "ADID_ref", "Rules_of_use", "Instrument_type", "Generated_by", "Time_resolution", "Logical_file_id", "Logical_source", "Logical_source_description", "LINK_TEXT", "LINK_TITLE", "HTTP_LINK", "alt_logical_source", "Mission_group", "spase_DatasetResourceID", "DOI"]

julia> ds.attrib["Descriptor"]
1-element Vector{String}:
 "merged magnetic field and plasma data from cohoweb"
Access variables
julia> ds["Epoch"]
Epoch (744)
Datatype: Epoch
Dimensions:
Attributes:
FIELDNAM = Epoch Time
CATDESC = Epoch Time
VALIDMIN = Epoch[1963-01-01T00:00:00]
VALIDMAX = Epoch[2020-12-31T23:59:59]
SCALEMIN = Epoch[1963-01-01T00:00:00]
SCALEMAX = Epoch[2020-12-31T23:59:59]
UNITS = DD-MMM-YYYY_hr:mm
FILLVAL = [-1.0e31]
VAR_TYPE = support_data

julia> ds["Epoch"][[1,end]]
2-element Vector{Epoch}:
 2020-05-01T00:00:00
 2020-05-31T23:00:00

julia> ds["BR"]
BR (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FIELDNAM = BR (RTN)
CATDESC = BR in RTN (Radial-Tangential-Normal) coordinate system
VALIDMIN = Float32[-1000.0]
VALIDMAX = Float32[1000.0]
UNITS = nT
FORMAT = f6.1
FILLVAL = Float32[-1.0f31]
VAR_TYPE = data
DISPLAY_TYPE = time_series
DEPEND_0 = Epoch
LABLAXIS = BR (RTN)
# Calculate magnetic field magnitude
br = ds["BR"]
bt = ds["BT"]
bn = ds["BN"]
b_mag = sqrt.(br.^2 + bt.^2 + bn.^2) |> collect
744-element Vector{Float32}:
3.544009
2.9342802
3.442383
3.8961518
0.64031243
5.3721504
5.4990907
5.9514704
4.3301272
4.634652
⋮
3.3301651
2.9949956
3.226453
3.0380914
3.828838
3.3301651
3.3674917
3.749667
3.8026307
API Reference
CDFDatasets.CDFDataset
CDFDatasets.ConcatCDFVariable
CDFDatasets.cdfopen
CDFDatasets.materialize
CDFDatasets.replace_fillval_by_nan!
CDFDatasets.sanitize
CDFDatasets.CDFDataset — Method
CDFDataset(file; backend = :julia)
Load the CDF dataset at the file path. The dataset supports the API of JuliaGeo/CommonDataModel.jl.
backend selects the backend used to load the CDF dataset. Two options are available: :julia (the default) and :PyCDFpp.
With the :PyCDFpp backend, lazy_load = true is the default. With lazy_load = false, all variable values are loaded immediately.
CDFDatasets.ConcatCDFVariable — Method
ConcatCDFVariable(arrays; metadata = nothing, dim = nothing)
Concatenate multiple CDF variables along the dim dimension. By default, dim is the record dimension, which is the last dimension.
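The idea of joining along the record (last) dimension can be illustrated with plain Julia arrays. This is only a sketch of the concept, not the ConcatCDFVariable implementation: the arrays a and b and the name record_dim are made up for illustration, and the real type may well avoid copying the data as cat does here.

```julia
# Two hypothetical chunks of a vector-valued variable,
# laid out as (components, records) so records are the last dimension.
a = reshape(Float32.(1:6), 3, 2)    # 3 components x 2 records
b = reshape(Float32.(7:15), 3, 3)   # 3 components x 3 records
chunks = [a, b]

# The record dimension is assumed to be the last dimension of each chunk.
record_dim = ndims(first(chunks))

# Eagerly concatenate the chunks along the record dimension.
combined = cat(chunks...; dims = record_dim)
```

Every dimension other than the record dimension has to match across the chunks; only the number of records may differ.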
CDFDatasets.cdfopen — Method
cdfopen(file; kw...) :: CDFDataset
cdfopen(files; kw...) :: ConcatCDFDataset
Open one or more CDF files as an AbstractCDFDataset.
CDFDatasets.materialize — Method
materialize(var)::MaterializedCDFVariable
Load the variable data from disk into memory.
CDFDatasets.replace_fillval_by_nan! — Method
Replace fill values with NaN for var with floating-point elements.
CDFDatasets.sanitize — Method
sanitize(var::AbstractCDFVariable; replace_fillval = true, replace_invalid = true)
Load variable data as an array with fill values and invalid data replaced by NaN.
See also: replace_fillval_by_nan!
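To make the replacement concrete, here is a minimal sketch in plain Julia that does not use the package. The helper nan_out is hypothetical, and it assumes that "invalid" means outside the VALIDMIN/VALIDMAX range; the package may define invalid data differently. The fill value and limits are those shown for BR in the Quickstart output, while the sample values are invented.

```julia
# Hypothetical stand-in for illustration; not the CDFDatasets implementation.
# Returns a copy of `data` with fill values and out-of-range values set to NaN.
function nan_out(data::AbstractVector{T}; fillval, validmin, validmax) where {T<:AbstractFloat}
    map(data) do x
        (x == fillval || x < validmin || x > validmax) ? T(NaN) : x
    end
end

# Invented samples: a valid value, a fill value, an out-of-range value, a valid value.
br = Float32[2.1, -1.0f31, 1500.0, -3.4]

cleaned = nan_out(br; fillval = -1.0f31, validmin = -1000.0f0, validmax = 1000.0f0)
```

Using NaN keeps the element type floating point, which is presumably why replace_fillval_by_nan! is limited to variables with float elements.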