Adding New Datasets
Observations
Surface
The MELODIES MONET tool has a Command Line Interface that can be used to download and create MELODIES MONET-ready datasets for: AirNow, AERONET, AQS, ISH, ISH-Lite, and OpenAQ. New surface observational datasets formally added to MELODIES MONET should be added to this Command Line Interface.
If you are interested in converting a new observational dataset to our netCDF format on your own for testing within MELODIES MONET, please see the notes below.
The dataset should have these dimensions (in this order):
- time
- y (an optional singleton dimension, included for consistency with model surface datasets)
- x (the site dimension)
The dataset should have these coordinate variables:
- time (UTC time, as timezone-naive datetime64 format in xarray; time dim)
- siteid (unique site identifier, as string; x dim)
- latitude (site latitude, in degrees; x dim)
- longitude (site longitude, in degrees; x dim)
The following variable is required for regulatory metrics and can optionally be used for time series plots. Otherwise, you may omit it:
- time_local (local time, usually local standard time, not including daylight savings, as timezone-naive datetime64 format in xarray; note that this varies in both the time and x dimensions)
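As a rough illustration of why time_local varies in both the time and x dimensions, the sketch below derives it from UTC time and a per-site offset. The site identifiers, timestamps, and the utc_offset_hours array are made-up inputs, and using a fixed hourly offset per site is an assumption; this is not necessarily how the MELODIES MONET Command Line Interface computes it.

```python
import numpy as np
import xarray as xr

# Hypothetical inputs: three hourly UTC timestamps and two sites.
time = np.array(
    ["2023-04-04T00", "2023-04-04T01", "2023-04-04T02"], dtype="datetime64[ns]"
)
# Assumed per-site offsets from UTC in hours (local standard time, no DST).
utc_offset_hours = np.array([-5, -8])

ds = xr.Dataset(
    coords={
        "time": ("time", time),
        "siteid": ("x", np.array(["site_a", "site_b"])),
    }
)

# Convert the integer hour offsets to timedelta64[ns] along the x dimension.
offset = xr.DataArray(
    utc_offset_hours.astype("timedelta64[h]").astype("timedelta64[ns]"),
    dims="x",
)

# Broadcasting (time,) against (x,) gives local time with dims (time, x);
# expand_dims then adds the singleton y dimension for a (time, y, x) layout.
ds["time_local"] = (ds["time"] + offset).expand_dims("y", axis=1)
```

Both time and time_local stay timezone-naive here; the offset is applied as plain datetime arithmetic.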
- It’s good practice to include units attributes for your data variables, though this is not strictly required. Similarly, you may wish to include long_names.
- Site metadata variables (e.g. site name, site elevation, EPA region, etc.) should ideally be stored as varying only in the x dimension, to save space.
- If you have sub-hourly data, you may want to aggregate it to hourly, especially if different sites have different time resolutions.
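One possible way to aggregate sub-hourly data to hourly with pandas is sketched below. The DataFrame contents and column names are invented for illustration, and labeling each hour by its start time and averaging within it are assumptions; pick the convention that matches your dataset.

```python
import pandas as pd

# Hypothetical long-format observations: site_a reports every 30 minutes,
# site_b reports hourly.
df = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2023-04-04 00:00",
                "2023-04-04 00:30",
                "2023-04-04 01:00",
                "2023-04-04 00:00",
                "2023-04-04 01:00",
            ]
        ),
        "siteid": ["site_a", "site_a", "site_a", "site_b", "site_b"],
        "NO2": [10.0, 14.0, 8.0, 5.0, 7.0],
    }
)

# Floor each timestamp to the start of its hour, then average per site and
# hour so that all sites end up on the same hourly time axis.
hourly = (
    df.assign(time=df["time"].dt.floor("h"))
    .groupby(["siteid", "time"], as_index=False)["NO2"]
    .mean()
)
```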
Example abbreviated xarray representation for AirNow demonstrating these qualities:
<xarray.Dataset>
Dimensions:     (time: 289, y: 1, x: 2231)
Coordinates:
  * time        (time) datetime64[ns] 2023-04-04 ... 2023-04-16
    siteid      (x) <U12 ...
    latitude    (x) float64 ...
    longitude   (x) float64 ...
Dimensions without coordinates: y, x
Data variables:
    NO2         (time, y, x) float64 ...
    time_local  (time, y, x) datetime64[ns] ...
    epa_region  (y, x) <U5 ...
You can examine the get_* functions in the Command Line Interface
(melodies_monet/_cli.py) for examples of converting observational datasets
in pandas DataFrame format to xarray Dataset format.
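For orientation, a minimal sketch of such a conversion follows. It is not taken from melodies_monet/_cli.py; the DataFrame contents, the column names, and the ppb units attribute are assumptions chosen to match the layout described above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical long-format observations, one row per (time, site).
df = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2023-04-04 00:00",
                "2023-04-04 00:00",
                "2023-04-04 01:00",
                "2023-04-04 01:00",
            ]
        ),
        "siteid": ["site_a", "site_b", "site_a", "site_b"],
        "latitude": [40.0, 35.0, 40.0, 35.0],
        "longitude": [-105.0, -97.0, -105.0, -97.0],
        "NO2": [10.0, 5.0, 8.0, 7.0],
    }
)

# Site metadata varies only along x: keep one row per site, sorted by id.
sites = df.drop_duplicates("siteid").set_index("siteid").sort_index()

# Pivot the data variable to a (time, site) table with sites in the same
# order as the metadata; missing (time, site) pairs would become NaN.
no2 = df.pivot(index="time", columns="siteid", values="NO2").sort_index(axis=1)

ds = xr.Dataset(
    data_vars={
        # Insert the singleton y dimension between time and x.
        "NO2": (
            ("time", "y", "x"),
            no2.to_numpy()[:, np.newaxis, :],
            {"units": "ppb"},  # assumed units, for illustration only
        ),
    },
    coords={
        "time": ("time", no2.index.to_numpy().astype("datetime64[ns]")),
        "siteid": ("x", sites.index.to_numpy().astype(str)),
        "latitude": ("x", sites["latitude"].to_numpy()),
        "longitude": ("x", sites["longitude"].to_numpy()),
    },
)
```

Printing ds should give a representation with the same structure as the AirNow example above, with far fewer times and sites.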
Aircraft, Sonde, Mobile, and Ground Campaign Data
New aircraft, sonde, mobile, and ground campaign datasets should work in the tool with no changes as long as the data format is NetCDF, ICARTT, or CSV. We are constantly working to generalize our code. If an issue arises, please post on GitHub Issues.
Satellite
Examples for reading satellite datasets can be
found in the monetio/sat folder in the MONETIO repository
on GitHub.
While a part of the MONETIO repository,
the private MELODIES MONET readers are designated with prefix _
and suffix _mm.
Models
Examples for reading model datasets can be
found in the monetio/models folder in the MONETIO repository
on GitHub.
These include, e.g., _cesm_fv_mm.py, _cmaq_mm.py, and _wrfchem_mm.py.
While a part of the MONETIO repository,
the private MELODIES MONET readers are designated with prefix _
and suffix _mm.
Support for additional models is also under development.
Each model reader must compute the standard variables for each capability (surface, aircraft, and satellite), as specified in the table below.
| Capability | Variable Name in Code | Description | Additional Requirements |
|---|---|---|---|
| Surface | time | Time in datetime64[ns] format | Provide only surface model data or, if providing vertical model data, the first level must be the level nearest to the surface. All gases are in ppb and all aerosols are in µg/m³. |
| | latitude | Latitude in degrees | |
| | longitude | Longitude in degrees | |
| Aircraft | time | Time in datetime64[ns] format | Provide vertical model data. All gases are in ppb and all aerosols are in µg/m³. |
| | latitude | Latitude in degrees | |
| | longitude | Longitude in degrees | |
| | pres_pa_mid | Mid-level pressure in pascals (Pa) | |
| | temperature_k | Mid-level temperature in kelvin (K) | |
| Satellites | time | Time in datetime64[ns] format | Provide vertical model data. |
| | latitude | Latitude in degrees | |
| | longitude | Longitude in degrees | |
| | pres_pa_mid | Mid-level pressure in pascals (Pa) | |
| | temperature_k | Mid-level temperature in kelvin (K) | |
| | dz_m | Layer thickness in meters (m) | |
| | surfpres_pa | Surface pressure in pascals (Pa) | |
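When writing a new model reader, a quick check along the following lines may help confirm that the standard variables in the table above are present. REQUIRED and missing_variables are hypothetical names introduced for this sketch, not part of MELODIES MONET or MONETIO, and the dimension names of the example dataset are assumptions.

```python
import numpy as np
import xarray as xr

# Required variable names per capability, copied from the table above.
REQUIRED = {
    "surface": ["time", "latitude", "longitude"],
    "aircraft": ["time", "latitude", "longitude", "pres_pa_mid", "temperature_k"],
    "satellite": [
        "time",
        "latitude",
        "longitude",
        "pres_pa_mid",
        "temperature_k",
        "dz_m",
        "surfpres_pa",
    ],
}


def missing_variables(ds, capability):
    """Return the required names absent from ds (hypothetical helper)."""
    # ds.variables covers both coordinates and data variables.
    return [name for name in REQUIRED[capability] if name not in ds.variables]


# Tiny made-up model dataset; the (time, z, y, x) layout is an assumption.
model = xr.Dataset(
    data_vars={
        "pres_pa_mid": (("time", "z", "y", "x"), np.full((1, 2, 1, 1), 90000.0)),
    },
    coords={
        "time": ("time", np.array(["2023-04-04T00"], dtype="datetime64[ns]")),
        "latitude": (("y", "x"), [[40.0]]),
        "longitude": (("y", "x"), [[-105.0]]),
    },
)

surface_missing = missing_variables(model, "surface")
aircraft_missing = missing_variables(model, "aircraft")
```

The check only covers variable names. Units (ppb for gases, µg/m³ for aerosols) and the ordering of vertical levels still need to be verified separately.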