CDFDatasets
CDFDatasets.jl is a Julia package for reading CDF (Common Data Format) files, which are widely used in space physics and other scientific domains. It exposes CDF files through the CommonDataModel.jl interface.
Installation
using Pkg
Pkg.add("CDFDatasets")
Quickstart
Here's a quick example using OMNI solar wind data:
using CDFDatasets
# Open a CDF dataset
omni_file = joinpath(pkgdir(CDFDatasets), "data/omni_coho1hr_merged_mag_plasma_20200501_v01.cdf")
ds = CDFDataset(omni_file)
Dataset:
Group: omni_coho1hr_merged_mag_plasma
Variables
Epoch (744)
Datatype: CommonDataFormat.Epoch
Dimensions:
Attributes:
FILLVAL = [-1.0e31]
VAR_TYPE = support_data
CATDESC = Epoch Time
FIELDNAM = Epoch Time
VALIDMAX = CommonDataFormat.Epoch[2020-12-31T23:59:59]
SCALEMIN = CommonDataFormat.Epoch[1963-01-01T00:00:00]
SCALEMAX = CommonDataFormat.Epoch[2020-12-31T23:59:59]
VALIDMIN = CommonDataFormat.Epoch[1963-01-01T00:00:00]
UNITS = DD-MMM-YYYY_hr:mm
heliographicLatitude (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FILLVAL = Float32[-1.0f31]
FIELDNAM = Heliographic Latitude
VALIDMAX = Float32[90.0]
CATDESC = HelioGraphic Inertial (HGI) latitude
VALIDMIN = Float32[-90.0]
DISPLAY_TYPE = time_series
UNITS = deg
DEPEND_0 = Epoch
FORMAT = f7.1
VAR_TYPE = data
LABLAXIS = Heliographic Latitude
heliographicLongitude (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FILLVAL = Float32[-1.0f31]
FIELDNAM = Heliographic Longitude
VALIDMAX = Float32[360.0]
CATDESC = HGI longitude
VALIDMIN = Float32[0.0]
DISPLAY_TYPE = time_series
UNITS = deg
DEPEND_0 = Epoch
FORMAT = f7.1
VAR_TYPE = data
LABLAXIS = Heliographic Longitude
BR (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FILLVAL = Float32[-1.0f31]
FIELDNAM = BR (RTN)
VALIDMAX = Float32[1000.0]
CATDESC = BR in RTN (Radial-Tangential-Normal) coordinate system
VALIDMIN = Float32[-1000.0]
DISPLAY_TYPE = time_series
UNITS = nT
DEPEND_0 = Epoch
FORMAT = f6.1
VAR_TYPE = data
LABLAXIS = BR (RTN)
BT (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FILLVAL = Float32[-1.0f31]
FIELDNAM = BT (RTN)
VALIDMAX = Float32[1000.0]
CATDESC = BT in RTN coordinate system
VALIDMIN = Float32[-1000.0]
DISPLAY_TYPE = time_series
UNITS = nT
DEPEND_0 = Epoch
FORMAT = f6.1
VAR_TYPE = data
LABLAXIS = BT (RTN)
BN (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FILLVAL = Float32[-1.0f31]
FIELDNAM = BN (RTN)
VALIDMAX = Float32[1000.0]
CATDESC = BN in RTN coordinate system
VALIDMIN = Float32[-1000.0]
DISPLAY_TYPE = time_series
UNITS = nT
DEPEND_0 = Epoch
FORMAT = f6.1
VAR_TYPE = data
LABLAXIS = BN (RTN)
ABS_B (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FILLVAL = Float32[-1.0f31]
FIELDNAM = Field Magnitude Avg.
VALIDMAX = Float32[100.0]
CATDESC = Field Magnitude Average |B 1/N SUM |B|
VALIDMIN = Float32[0.0]
DISPLAY_TYPE = time_series
UNITS = nT
DEPEND_0 = Epoch
FORMAT = F6.1
VAR_TYPE = data
LABLAXIS = Field Magnitude Avg.
V (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FILLVAL = Float32[-1.0f31]
FIELDNAM = Bulk Flow Speed
VALIDMAX = Float32[1200.0]
CATDESC = Bulk Flow Speed
VALIDMIN = Float32[0.0]
DISPLAY_TYPE = time_series
UNITS = km/s
DEPEND_0 = Epoch
FORMAT = F5.0
VAR_TYPE = data
LABLAXIS = Bulk Flow Speed
elevAngle (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FILLVAL = Float32[-1.0f31]
FIELDNAM = elevation Angle
VALIDMAX = Float32[90.0]
CATDESC = Proton flow elevation angle / latitude (RTN)
VALIDMIN = Float32[-90.0]
DISPLAY_TYPE = time_series
UNITS = Deg
DEPEND_0 = Epoch
FORMAT = F6.1
VAR_TYPE = data
LABLAXIS = elevation Angle
azimuthAngle (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FILLVAL = Float32[-1.0f31]
FIELDNAM = azimuth Angle
VALIDMAX = Float32[180.0]
CATDESC = Proton flow azimuth angle / longitude (RTN)
VALIDMIN = Float32[-180.0]
DISPLAY_TYPE = time_series
UNITS = Deg
DEPEND_0 = Epoch
FORMAT = F6.1
VAR_TYPE = data
LABLAXIS = azimuth Angle
N (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FILLVAL = Float32[-1.0f31]
FIELDNAM = Ion Density
VALIDMAX = Float32[150.0]
CATDESC = Ion Density
VALIDMIN = Float32[0.0]
DISPLAY_TYPE = time_series
UNITS = N/cm3
DEPEND_0 = Epoch
FORMAT = F6.1
VAR_TYPE = data
LABLAXIS = Ion density
T (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FILLVAL = Float32[-1.0f31]
FIELDNAM = Temperature
VALIDMAX = Float32[1.0f7]
CATDESC = Temperature
VALIDMIN = Float32[0.0]
DISPLAY_TYPE = time_series
UNITS = Deg K
DEPEND_0 = Epoch
FORMAT = F9.0
VAR_TYPE = data
LABLAXIS = Temperature
Global attributes
Project = ["SPDF"]
Discipline = ["Space Physics>Interplanetary Studies"]
Source_name = ["OMNI (1AU IP Data)>Merged 1 Hour Interplantary OMNI data in RTN system"]
Data_type = ["COHO1HR>Definitive Hourly data from cohoweb"]
Descriptor = ["merged magnetic field and plasma data from cohoweb"]
Data_version = ["1"]
TITLE = ["Near-Earth Heliosphere Data (OMNI)"]
TEXT = ["Hourly averaged definitive multispacecraft interplanetary parameters data", "The Heliographic Inertial (HGI) coordinates are Sun-centered and inertially fixed with respect to an X-axis directed along the intersection line of the ecliptic and solar equatorial planes. The solar equator plane is inclined at 7.25 degrees from the ecliptic. This direction was towards ecliptic longitude of 74.367 degrees on 1 January 1900 at 1200 UT; because of precession of the celestial equator, this longitude increases by 1.4 degrees/century. The Z axis is directed perpendicular and northward from the solar equator, and the Y-axis completes the right-handed set. This system differs from the usual heliographic coordinates (e.g. Carrington longitudes) which are fixed in the frame of the rotating Sun.", "The RTN system is fixed at a spacecraft (or the planet). The R axis is directed radially away from the Sun, the T axis is the cross product of the solar rotation axis and the R axis, and the N axis is the cross product of the R and T axes. At zero heliographic latitude, when the spacecraft is in the solar equatorial plane, the N and solar rotation axes are parallel.", "Latitude and longitude angles of solar wind plasma flow are generally measured from the radius vector away from the Sun. In all cases, latitude angles are positive for north-going flow. The flow longitude angles have been treated differently for the near-Earth data, i.e. the OMNI, and for the deep space data. The flow is positive for the near-Earth data when coming from the right side of the Sun as viewed from the Earth, i.e. flowing toward +Y from -X GSE or opposite to the direction of planetary motion. On the other hand, the flow longitudes for the deep space spacecraft use the opposite sign convection, i.e. positive for flow in the +T direction in the RTN system."]
MODS = ["created July 2007;", "conversion to ISTP/IACG CDFs via SKTEditor Feb 2000", "Time tags in CDAWeb version were modified in March 2005 to use the", "CDAWeb convention of having mid-average time tags rather than OMNI's", "original convention of start-of-average time tags."]
PI_name = ["J.H. King, N. Papatashvilli"]
PI_affiliation = ["AdnetSystems, NASA GSFC"]
Generation_date = ["Ongoing"]
Acknowledgement = ["NSSDC"]
ADID_ref = ["NSSD0110"]
Rules_of_use = ["Public"]
Instrument_type = ["Plasma and Solar Wind", "Magnetic Fields (space)"]
Generated_by = ["King/Papatashvilli"]
Time_resolution = ["1 hour"]
Logical_file_id = ["omni_coho1hr_merged_mag_plasma_00000000_v01"]
Logical_source = ["omni_coho1hr_merged_mag_plasma"]
Logical_source_description = ["OMNI Combined merged hourly magnetic field, plasma and ephermis data"]
LINK_TEXT = ["COHO dataset", "Additional analysis tools for these data from the"]
LINK_TITLE = ["Documentation", "COHOWeb service"]
HTTP_LINK = ["https://omniweb.gsfc.nasa.gov/coho/html/cw_data.html", "http://cohoweb.gsfc.nasa.gov"]
alt_logical_source = ["Combined_OMNI_1AU-MagneticField-Plasma-Particles_mrg1hr_1hour_cdf"]
Mission_group = ["OMNI (Combined 1AU IP Data; Magnetic and Solar Indices)", "ACE", "Wind", "IMP (All)", "!___Interplanetary Data near 1 AU"]
spase_DatasetResourceID = ["spase://NASA/NumericalData/OMNI/COHO/MergedMagPlasma/PT1H"]
DOI = ["https://doi.org/10.48322/6ffx-3441"]
Explore the dataset
julia> println("Variables: ", keys(ds))
Variables: ["Epoch", "heliographicLatitude", "heliographicLongitude", "BR", "BT", "BN", "ABS_B", "V", "elevAngle", "azimuthAngle", "N", "T"]
julia> println("Attributes: ", keys(ds.attrib))
Attributes: ["Project", "Discipline", "Source_name", "Data_type", "Descriptor", "Data_version", "TITLE", "TEXT", "MODS", "PI_name", "PI_affiliation", "Generation_date", "Acknowledgement", "ADID_ref", "Rules_of_use", "Instrument_type", "Generated_by", "Time_resolution", "Logical_file_id", "Logical_source", "Logical_source_description", "LINK_TEXT", "LINK_TITLE", "HTTP_LINK", "alt_logical_source", "Mission_group", "spase_DatasetResourceID", "DOI"]
julia> ds.attrib["Descriptor"]
1-element Vector{String}:
"merged magnetic field and plasma data from cohoweb"
Access variables
julia> ds["Epoch"]
Epoch (744)
Datatype: CommonDataFormat.Epoch
Dimensions:
Attributes:
FILLVAL = [-1.0e31]
VAR_TYPE = support_data
CATDESC = Epoch Time
FIELDNAM = Epoch Time
VALIDMAX = CommonDataFormat.Epoch[2020-12-31T23:59:59]
SCALEMIN = CommonDataFormat.Epoch[1963-01-01T00:00:00]
SCALEMAX = CommonDataFormat.Epoch[2020-12-31T23:59:59]
VALIDMIN = CommonDataFormat.Epoch[1963-01-01T00:00:00]
UNITS = DD-MMM-YYYY_hr:mm
julia> ds["Epoch"][[1,end]]
2-element Vector{CommonDataFormat.Epoch}:
2020-05-01T00:00:00
2020-05-31T23:00:00
julia> ds["BR"]
BR (744)
Datatype: Float32
Dimensions: Epoch
Attributes:
FILLVAL = Float32[-1.0f31]
FIELDNAM = BR (RTN)
VALIDMAX = Float32[1000.0]
CATDESC = BR in RTN (Radial-Tangential-Normal) coordinate system
VALIDMIN = Float32[-1000.0]
DISPLAY_TYPE = time_series
UNITS = nT
DEPEND_0 = Epoch
FORMAT = f6.1
VAR_TYPE = data
LABLAXIS = BR (RTN)
# Calculate magnetic field magnitude
br = ds["BR"]
bt = ds["BT"]
bn = ds["BN"]
b_mag = sqrt.(br.^2 + bt.^2 + bn.^2) |> collect
744-element Vector{Float32}:
3.544009
2.9342802
3.442383
3.8961518
0.64031243
5.3721504
5.4990907
5.9514704
4.3301272
4.634652
⋮
3.3301651
2.9949956
3.226453
3.0380914
3.828838
3.3301651
3.3674917
3.749667
3.8026307
API Reference
CDFDatasets.CDFDataset
CDFDatasets.ConcatCDFVariable
CDFDatasets.replace_fillval_by_nan!
CDFDatasets.sanitize
CDFDatasets.CDFDataset — Method
CDFDataset(file; lazy = true)
Load the CDF dataset at the file path. The dataset supports the API of JuliaGeo/CommonDataModel.jl.
lazy controls whether variable values are loaded immediately or only when accessed by the user. If true, variable values are loaded on demand. If false, all variable values are loaded during parsing.
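As a usage sketch of the lazy keyword, the snippet below opens the Quickstart sample file both ways. It assumes the package is installed and that the bundled data file is present at the path shown in the Quickstart. The variable names (ds_lazy, ds_eager, and so on) are illustrative only.

```julia
using CDFDatasets

# Sample file shipped with the package, as in the Quickstart
omni_file = joinpath(pkgdir(CDFDatasets), "data/omni_coho1hr_merged_mag_plasma_20200501_v01.cdf")

# Default (lazy = true): values are read only when a variable is indexed
ds_lazy = CDFDataset(omni_file)
br_first_lazy = ds_lazy["BR"][1]

# lazy = false: all variable values are read while the file is parsed
ds_eager = CDFDataset(omni_file; lazy = false)
br_first_eager = ds_eager["BR"][1]
```

Both modes should return the same values. The choice mainly trades start-up cost against the cost of each later access, so lazy loading tends to suit large files where only a few variables are needed.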
CDFDatasets.ConcatCDFVariable — Method
ConcatCDFVariable(arrays; metadata = nothing, dim = nothing)
Concatenate multiple CDF variables along the dim dimension (by default the record dimension, which is the last dimension).
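To illustrate what concatenating along the record (last) dimension means, here is a minimal sketch using plain Julia arrays as stand-ins for CDF variables. It does not call ConcatCDFVariable and says nothing about how the package implements it. The arrays day1 and day2 are made-up data.

```julia
# Stand-in "variables": 3 components per record, records along the last dimension
day1 = reshape(Float32.(1:6), 3, 2)    # 2 records
day2 = reshape(Float32.(7:15), 3, 3)   # 3 records

# The record dimension is the last one, matching the documented default
record_dim = ndims(day1)
merged = cat(day1, day2; dims = record_dim)

println(size(merged))   # component count unchanged, record counts added
```

The leading (per-record) dimensions must agree between inputs. Only the record count grows.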
CDFDatasets.replace_fillval_by_nan! — Method
Replace fill values with NaN for a var with floating-point elements.
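The idea can be sketched on a plain array. The helper fill_to_nan! below is hypothetical and is not the package's implementation. It takes the fill value as an explicit argument, whereas the package function presumably reads it from the variable's FILLVAL attribute.

```julia
# Hypothetical stand-in helper; not part of CDFDatasets
function fill_to_nan!(data::AbstractArray{T}, fillval) where {T<:AbstractFloat}
    fv = T(fillval)
    for i in eachindex(data)
        # Exact comparison is intended: fill values are sentinel constants
        if data[i] == fv
            data[i] = T(NaN)
        end
    end
    return data
end

# FILLVAL for BR in the Quickstart file is Float32[-1.0f31]
br = Float32[3.1, -1.0f31, 2.4]
fill_to_nan!(br, -1.0f31)
```

The restriction to floating-point element types follows from NaN having no integer representation.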
CDFDatasets.sanitize — Method
sanitize(var::AbstractCDFVariable; replace_fillval = true, replace_invalid = true)
Load variable data as an array with fill values and invalid data replaced by NaN.
See also: replace_fillval_by_nan!
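A rough sketch of this kind of cleaning on a plain array follows. It assumes that "invalid" means outside the range given by the VALIDMIN and VALIDMAX attributes, which the source does not state explicitly. The function sanitize_sketch and its keyword arguments are hypothetical and are not the package API.

```julia
# Hypothetical stand-in; assumes "invalid" means outside [validmin, validmax]
function sanitize_sketch(data::AbstractArray{T}; fillval, validmin, validmax) where {T<:AbstractFloat}
    fv = T(fillval)
    # Returns a new array and leaves the input untouched
    return map(data) do x
        (x == fv || x < validmin || x > validmax) ? T(NaN) : x
    end
end

# Bulk flow speed V: FILLVAL = -1.0f31, VALIDMIN = 0.0, VALIDMAX = 1200.0 in the Quickstart file
v = Float32[420.0, -1.0f31, 1500.0, 385.0]
clean = sanitize_sketch(v; fillval = -1.0f31, validmin = 0.0f0, validmax = 1200.0f0)
```

With NaN in place of fill and out-of-range values, NaN-aware reductions (for example filtering with !isnan before taking a mean) can skip the bad samples.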