{ "cells": [ { "cell_type": "markdown", "id": "48080cc2-be7e-4fa2-b34d-329041fccdd7", "metadata": {}, "source": [ "# AEROMMA and UFS-AQM: Read Paired Data and Create Plots" ] }, { "cell_type": "markdown", "id": "1f968f42-85e9-4eab-bbe7-f53e3c54e315", "metadata": {}, "source": [ "Our first example demonstrates the basic MELODIES MONET workflow for comparing UFS-AQM model results against AEROMMA aircraft observations (https://csl.noaa.gov/projects/aeromma/) for ozone (O3), nitric oxide (NO), nitrogen dioxide (NO2), and carbon monoxide (CO).\n", "\n", "This example reads in the AEROMMA and UFS-AQM paired data created by the scripts described in the Aircraft Pairing example on ReadTheDocs. The paired data cover 3 flights over 2 days, resampled to a 30 s interval. To keep the timeseries plot readable, we plot only 2 flights from 1 day, but you are welcome to extend the analysis to the entire period on your own.\n", "\n", "First, we import the {mod}`melodies_monet.driver` module." ] }, { "cell_type": "code", "execution_count": 1, "id": "d7240c01-7c05-49e7-bfca-01e23dc6bed6", "metadata": { "ExecuteTime": { "end_time": "2025-06-18T19:55:07.172300Z", "start_time": "2025-06-18T19:55:01.157955Z" } }, "outputs": [], "source": [ "from melodies_monet import driver" ] }, { "cell_type": "markdown", "id": "0a8484c7-5b57-4d0b-b132-cb30338bac43", "metadata": {}, "source": [ "## Analysis driver class" ] }, { "cell_type": "markdown", "id": "24c2f889-4fde-4e13-9092-35c54c096148", "metadata": {}, "source": [ "Now, let's create an instance of the analysis driver class, {class}`melodies_monet.driver.analysis`. 
It consists of these main parts:\n", "\n", "* model instances\n", "\n", "* observation instances\n", "\n", "* a paired instance of both" ] }, { "cell_type": "code", "execution_count": 2, "id": "45a85e85-8d36-4dd6-8001-c6cc28275746", "metadata": { "ExecuteTime": { "end_time": "2025-06-18T19:55:07.220248Z", "start_time": "2025-06-18T19:55:07.200584Z" } }, "outputs": [], "source": [ "an = driver.analysis()" ] }, { "cell_type": "markdown", "id": "6c5b21f7-9d2d-4b32-b8e8-853f7587e733", "metadata": {}, "source": [ "Initially, most of the analysis object’s attributes are set to `None`, though some have meaningful defaults:" ] }, { "cell_type": "code", "execution_count": 3, "id": "84f0538c-2bc4-468d-8130-5878ef45f600", "metadata": { "ExecuteTime": { "end_time": "2025-06-18T19:55:07.268888Z", "start_time": "2025-06-18T19:55:07.250250Z" } }, "outputs": [ { "data": { "text/plain": [ "analysis(\n", " control='control.yaml',\n", " control_dict=None,\n", " models={},\n", " obs={},\n", " paired={},\n", " start_time=None,\n", " end_time=None,\n", " time_intervals=None,\n", " download_maps=True,\n", " output_dir=None,\n", " output_dir_save=None,\n", " output_dir_read=None,\n", " debug=False,\n", " save=None,\n", " read=None,\n", " regrid=False,\n", ")" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "an" ] }, { "cell_type": "markdown", "id": "8f39e654-28eb-42b0-ad1b-887ef241755c", "metadata": {}, "source": [ "## Control file" ] }, { "cell_type": "markdown", "id": "265ea0ea-d420-4c6f-ab87-65229adb2df5", "metadata": {}, "source": [ "We set the path to the YAML control file and then read it." 
] }, { "cell_type": "code", "execution_count": 4, "id": "cf1a865a-ead4-436a-8287-5bb8cbf4d3fe", "metadata": { "ExecuteTime": { "end_time": "2025-06-18T19:55:08.219544Z", "start_time": "2025-06-18T19:55:08.144225Z" }, "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "{'analysis': {'start_time': '2023-06-27-00:00:00',\n", " 'end_time': '2023-06-28-23:59:00',\n", " 'output_dir': './output/aeromma_ufsaqm',\n", " 'debug': True,\n", " 'read': {'paired': {'method': 'netcdf',\n", " 'filenames': {'aeromma_ufsaqm': ['example:ufsaqm:merge_0627_L1',\n", " 'example:ufsaqm:merge_0627_L2']}}}},\n", " 'model': {'ufsaqm': {'files': 'example:ufsaqm:model_data',\n", " 'mod_type': 'ufs',\n", " 'radius_of_influence': 19500,\n", " 'mapping': {'aeromma': {'no2_ave': 'NO2_LIF',\n", " 'no_ave': 'NO_LIF',\n", " 'o3_ave': 'O3_CL',\n", " 'co': 'CO_LGR'}},\n", " 'variables': {'pres_pa_mid': {'rename': 'pressure_model',\n", " 'unit_scale': 1,\n", " 'unit_scale_method': '*'},\n", " 'temperature_k': {'rename': 'temp_model',\n", " 'unit_scale': 1,\n", " 'unit_scale_method': '*'}},\n", " 'projection': None,\n", " 'plot_kwargs': {'color': 'dodgerblue', 'marker': '^', 'linestyle': ':'}}},\n", " 'obs': {'aeromma': {'filename': 'example:ufsaqm:AEROMMA',\n", " 'obs_type': 'aircraft',\n", " 'time_var': 'Time_Start',\n", " 'resample': '30s',\n", " 'variables': {'O3_CL': {'unit_scale': 1,\n", " 'unit_scale_method': '*',\n", " 'nan_value': -7777,\n", " 'LLOD_value': -8888,\n", " 'LLOD_setvalue': 0.0,\n", " 'ylabel_plot': 'O3 (ppbv)'},\n", " 'NO_LIF': {'unit_scale': 1000.0,\n", " 'unit_scale_method': '/',\n", " 'nan_value': -7777,\n", " 'LLOD_value': -8888,\n", " 'LLOD_setvalue': 0.0,\n", " 'ylabel_plot': 'NO (ppbv)'},\n", " 'NO2_LIF': {'unit_scale': 1000.0,\n", " 'unit_scale_method': '/',\n", " 'nan_value': -7777,\n", " 'LLOD_value': -8888,\n", " 'LLOD_setvalue': 0.0,\n", " 'ylabel_plot': 'NO2 (ppbv)'},\n", " 'CO_LGR': {'nan_value': -7777,\n", " 'LLOD_value': -8888,\n", " 'LLOD_setvalue': 
0.0,\n", " 'ylabel_plot': 'CO (ppbv)'},\n", " 'G_LAT': {'rename': 'latitude', 'unit_scale': 1, 'unit_scale_method': '*'},\n", " 'G_LONG': {'rename': 'longitude',\n", " 'unit_scale': 1,\n", " 'unit_scale_method': '*'},\n", " 'PW': {'rename': 'pressure_obs',\n", " 'unit_scale': 100,\n", " 'unit_scale_method': '*'},\n", " 'TW': {'rename': 'temp_obs', 'unit_scale': 1, 'unit_scale_method': '*'},\n", " 'G_ALT': {'rename': 'altitude', 'unit_scale': 1, 'unit_scale_method': '*'},\n", " 'Time_Start': {'rename': 'time'}}}},\n", " 'plots': {'plot_grp1': {'type': 'timeseries',\n", " 'fig_kwargs': {'figsize': [12, 6]},\n", " 'default_plot_kwargs': {'linewidth': 2.0, 'markersize': 5.0},\n", " 'text_kwargs': {'fontsize': 24.0},\n", " 'domain_type': ['all'],\n", " 'domain_name': ['Los Angeles'],\n", " 'data': ['aeromma_ufsaqm'],\n", " 'data_proc': {'rem_obs_nan': True,\n", " 'ts_select_time': 'time',\n", " 'set_axis': False,\n", " 'altitude_yax2': {'altitude_variable': 'altitude',\n", " 'altitude_ticks': 1000,\n", " 'ylabel2': 'Altitude (m)',\n", " 'plot_kwargs_y2': {'color': 'g'},\n", " 'altitude_unit': 'm',\n", " 'altitude_scaling_factor': 1}}},\n", " 'plot_grp2': {'type': 'vertprofile',\n", " 'fig_kwargs': {'figsize': [10, 14]},\n", " 'default_plot_kwargs': {'linewidth': 4.0, 'markersize': 10.0},\n", " 'text_kwargs': {'fontsize': 36.0},\n", " 'ylabel_vert': 'Altitude (m)',\n", " 'domain_type': ['all'],\n", " 'domain_name': ['Los Angeles'],\n", " 'data': ['aeromma_ufsaqm'],\n", " 'data_proc': {'rem_obs_nan': True,\n", " 'set_axis': False,\n", " 'interquartile_style': 'shading'},\n", " 'altitude_variable': 'altitude',\n", " 'vertprofile_bins': {'range': {'start': 0, 'stop': 4000, 'step': 500}},\n", " 'vmin': -1,\n", " 'vmax': 4001},\n", " 'plot_grp2a': {'type': 'vertprofile',\n", " 'fig_kwargs': {'figsize': [10, 14]},\n", " 'default_plot_kwargs': {'linewidth': 4.0, 'markersize': 10.0},\n", " 'text_kwargs': {'fontsize': 36.0},\n", " 'gridlines': True,\n", " 'ylabel_vert': 'Altitude 
(m)',\n", " 'domain_type': ['all'],\n", " 'domain_name': ['Los Angeles'],\n", " 'data': ['aeromma_ufsaqm'],\n", " 'data_proc': {'rem_obs_nan': True,\n", " 'set_axis': False,\n", " 'interquartile_style': 'box'},\n", " 'altitude_variable': 'altitude',\n", " 'vertprofile_bins': {'range': {'start': 0, 'stop': 4000, 'step': 500}},\n", " 'vmin': -1,\n", " 'vmax': 4001},\n", " 'plot_grp3': {'type': 'violin',\n", " 'fig_kwargs': {'figsize': [10, 8]},\n", " 'text_kwargs': {'fontsize': 24.0},\n", " 'domain_type': ['all'],\n", " 'domain_name': ['Los Angeles'],\n", " 'data': ['aeromma_ufsaqm'],\n", " 'data_proc': {'rem_obs_nan': True, 'set_axis': False}},\n", " 'plot_grp3a': {'type': 'violin',\n", " 'fig_kwargs': {'figsize': [10, 8]},\n", " 'text_kwargs': {'fontsize': 24.0},\n", " 'gridlines': True,\n", " 'domain_type': ['all'],\n", " 'domain_name': ['Los Angeles'],\n", " 'data': ['aeromma_ufsaqm'],\n", " 'data_proc': {'rem_obs_nan': True,\n", " 'set_axis': False,\n", " 'set_stat_sig': True}},\n", " 'plot_grp4': {'type': 'scatter_density',\n", " 'fig_kwargs': {'figsize': [10, 10]},\n", " 'default_plot_kwargs': {'linewidth': 4.0, 'markersize': 10.0},\n", " 'text_kwargs': {'fontsize': 24.0},\n", " 'domain_type': ['all'],\n", " 'domain_name': ['Los Angeles'],\n", " 'data': ['aeromma_ufsaqm'],\n", " 'data_proc': {'rem_obs_nan': True,\n", " 'set_axis': False,\n", " 'vmin_x': None,\n", " 'vmax_x': None,\n", " 'vmin_y': None,\n", " 'vmax_y': None},\n", " 'color_map': {'colors': ['royalblue', 'cyan', 'yellow', 'orange'],\n", " 'over': 'red',\n", " 'under': 'blue'},\n", " 'xlabel': 'Model',\n", " 'ylabel': 'Observation',\n", " 'title': 'Scatter Density Plot',\n", " 'fill': True,\n", " 'shade_lowest': True,\n", " 'vcenter': None,\n", " 'extensions': ['min', 'max']},\n", " 'plot_grp5': {'type': 'taylor',\n", " 'fig_kwargs': {'figsize': [8, 8]},\n", " 'default_plot_kwargs': {'linewidth': 2.0, 'markersize': 10.0},\n", " 'text_kwargs': {'fontsize': 16.0},\n", " 'domain_type': ['all'],\n", " 
'domain_name': ['Los Angeles'],\n", " 'data': ['aeromma_ufsaqm'],\n", " 'data_proc': {'rem_obs_nan': True, 'set_axis': False}},\n", " 'plot_grp6': {'type': 'boxplot',\n", " 'fig_kwargs': {'figsize': [8, 6]},\n", " 'text_kwargs': {'fontsize': 24.0},\n", " 'domain_type': ['all'],\n", " 'domain_name': ['Los Angeles'],\n", " 'data': ['aeromma_ufsaqm'],\n", " 'data_proc': {'rem_obs_nan': True, 'set_axis': False}},\n", " 'plot_grp6a': {'type': 'boxplot',\n", " 'fig_kwargs': {'figsize': [8, 6]},\n", " 'text_kwargs': {'fontsize': 24.0},\n", " 'gridlines': True,\n", " 'domain_type': ['all'],\n", " 'domain_name': ['Los Angeles'],\n", " 'data': ['aeromma_ufsaqm'],\n", " 'data_proc': {'rem_obs_nan': True,\n", " 'set_axis': False,\n", " 'set_stat_sig': True}}}}" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "control_fn='control_read_looped_aircraft_AEROMMA_UFS_AQM.yaml'\n", "an.control=control_fn\n", "an.read_control() \n", "an.control_dict " ] }, { "cell_type": "markdown", "id": "a0afea3b-d7cd-4dbd-afb0-4364ddaf08dd", "metadata": {}, "source": [ "Now, some of our analysis object’s attributes are populated:" ] }, { "cell_type": "code", "execution_count": 5, "id": "1adbb5e0-17a1-4420-978a-7c2b27bc0452", "metadata": { "ExecuteTime": { "end_time": "2025-06-18T19:55:08.267454Z", "start_time": "2025-06-18T19:55:08.251489Z" } }, "outputs": [ { "data": { "text/plain": [ "analysis(\n", " control='control_read_looped_aircraft_AEROMMA_UFS_AQM.yaml',\n", " control_dict=...,\n", " models={},\n", " obs={},\n", " paired={},\n", " start_time=Timestamp('2023-06-27 00:00:00'),\n", " end_time=Timestamp('2023-06-28 23:59:00'),\n", " time_intervals=None,\n", " download_maps=True,\n", " output_dir='./output/aeromma_ufsaqm',\n", " output_dir_save='./output/aeromma_ufsaqm',\n", " output_dir_read='./output/aeromma_ufsaqm',\n", " debug=True,\n", " save=None,\n", " read={'paired': {'method': 'netcdf', 'filenames': {'aeromma_ufsaqm': 
['example:ufsaqm:merge_0627_L1', 'example:ufsaqm:merge_0627_L2']}}},\n", " regrid=False,\n", ")" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "an" ] }, { "cell_type": "markdown", "id": "f31ca9b8-be51-4116-831a-dc8568b964ac", "metadata": {}, "source": [ "## Load the model data" ] }, { "cell_type": "markdown", "id": "bd3879b4-511a-4566-8019-96a07f52bca0", "metadata": {}, "source": [ "The driver automatically loops through the models listed in the `model` section of the YAML file and creates an instance of {class}`melodies_monet.driver.model` for each one. Each instance includes the\n", "\n", "* label\n", "\n", "* mapping information\n", "\n", "* file names (can be expressed using a glob expression)\n", "\n", "* xarray object" ] }, { "cell_type": "code", "execution_count": 6, "id": "70b89dc7-5d11-47b6-b53f-ef1a05b2bf59", "metadata": { "ExecuteTime": { "end_time": "2025-06-18T19:56:02.824085Z", "start_time": "2025-06-18T19:55:08.347335Z" }, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ufs\n", "example:ufsaqm:model_data\n", "**** Reading UFS-AQM or UFS-Chem model output...\n" ] } ], "source": [ "an.open_models()" ] }, { "cell_type": "markdown", "id": "dc72de4e-8379-4dd7-bdb5-747882b800c0", "metadata": {}, "source": [ "Calling `open_models()` populates the `models` attribute." 
] }, { "cell_type": "code", "execution_count": 7, "id": "49e3ca24-3457-4bc6-9b0b-f247661c64de", "metadata": { "ExecuteTime": { "end_time": "2025-06-18T19:56:02.855597Z", "start_time": "2025-06-18T19:56:02.839894Z" } }, "outputs": [ { "data": { "text/plain": [ "{'ufsaqm': model(\n", " model='ufs',\n", " is_global=False,\n", " radius_of_influence=19500,\n", " mod_kwargs={'var_list': ['no_ave', 'no2_ave', 'o3_ave', 'co', 'lat', 'lon', 'phalf', 'tmp', 'pressfc', 'dpres', 'hgtsfc', 'delz']},\n", " file_str='example:ufsaqm:model_data',\n", " label='ufsaqm',\n", " obj=...,\n", " extra_calc=None,\n", " mapping={'aeromma': {'no2_ave': 'NO2_LIF', 'no_ave': 'NO_LIF', 'o3_ave': 'O3_CL', 'co': 'CO_LGR'}},\n", " variable_dict={'temp_model': {'rename': 'temp_model', 'unit_scale': 1, 'unit_scale_method': '*'}, 'pressure_model': {'rename': 'pressure_model', 'unit_scale': 1, 'unit_scale_method': '*'}},\n", " label='ufsaqm',\n", " ...\n", " )}" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "an.models" ] }, { "cell_type": "markdown", "id": "b33c2c76-296c-40da-82bd-e7c33a23c4f3", "metadata": {}, "source": [ "We can access the underlying dataset with the obj attribute." ] }, { "cell_type": "code", "execution_count": 8, "id": "c5343592-fc96-467a-9d09-c3a380b9eed0", "metadata": { "ExecuteTime": { "end_time": "2025-06-18T19:56:03.105010Z", "start_time": "2025-06-18T19:56:02.903101Z" } }, "outputs": [ { "data": { "text/html": [ "
<xarray.Dataset> Size: 880MB\n",
"Dimensions: (time: 1, z: 64, y: 488, x: 775)\n",
"Coordinates:\n",
" latitude (y, x) float64 3MB dask.array<chunksize=(488, 775), meta=np.ndarray>\n",
" longitude (y, x) float64 3MB dask.array<chunksize=(488, 775), meta=np.ndarray>\n",
" * time (time) datetime64[ns] 8B 2023-06-27T13:00:00\n",
"Dimensions without coordinates: z, y, x\n",
"Data variables:\n",
" no_ave (time, z, y, x) float32 97MB dask.array<chunksize=(1, 64, 488, 775), meta=np.ndarray>\n",
" no2_ave (time, z, y, x) float32 97MB dask.array<chunksize=(1, 64, 488, 775), meta=np.ndarray>\n",
" o3_ave (time, z, y, x) float32 97MB dask.array<chunksize=(1, 64, 488, 775), meta=np.ndarray>\n",
" co (time, z, y, x) float32 97MB dask.array<chunksize=(1, 64, 488, 775), meta=np.ndarray>\n",
" temp_model (time, z, y, x) float32 97MB dask.array<chunksize=(1, 64, 488, 775), meta=np.ndarray>\n",
" surfpres_pa (time, y, x) float32 2MB dask.array<chunksize=(1, 488, 775), meta=np.ndarray>\n",
" dp_pa (time, z, y, x) float32 97MB dask.array<chunksize=(1, 64, 488, 775), meta=np.ndarray>\n",
" surfalt_m (time, y, x) float32 2MB dask.array<chunksize=(1, 488, 775), meta=np.ndarray>\n",
" dz_m (time, z, y, x) float32 97MB dask.array<chunksize=(1, 64, 488, 775), meta=np.ndarray>\n",
" pressure_model (time, z, y, x) float32 97MB dask.array<chunksize=(1, 64, 488, 775), meta=np.ndarray>\n",
" alt_msl_m_full (time, z, y, x) float32 97MB dask.array<chunksize=(1, 64, 488, 775), meta=np.ndarray>\n",
"Attributes: (12/15)\n",
" ak: [2.0000000e+01 6.4247002e+01 1.3778999e+02 2.2195799e+02 3....\n",
" bk: [0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00 0....\n",
" cen_lat: 50.0\n",
" cen_lon: -118.0\n",
" dlat: 0.11690814\n",
" dlon: 0.11690814\n",
" ... ...\n",
" lat1: -28.5\n",
" lat2: 28.5\n",
" lon1: -45.25\n",
" lon2: 45.25\n",
" ncnsto: 202\n",
" source: FV3GFS\n",
"<xarray.Dataset> Size: 14kB\n",
"Dimensions: (time: 173)\n",
"Coordinates:\n",
" * time (time) datetime64[ns] 1kB 2023-06-27T16:09:00 ... 2023-06-2...\n",
"Data variables:\n",
" CO_LGR (time) float64 1kB 143.8 143.0 138.4 ... 141.8 140.9 140.4\n",
" pressure_obs (time) float64 1kB 9.025e+04 8.86e+04 ... 9.171e+04 9.246e+04\n",
" temp_obs (time) float64 1kB 293.5 292.1 290.8 ... 295.6 297.4 298.0\n",
" latitude (time) float64 1kB 34.63 34.63 34.65 ... 34.6 34.61 34.62\n",
" longitude (time) float64 1kB -118.1 -118.1 -118.2 ... -118.1 -118.1\n",
" altitude (time) float64 1kB 982.2 1.143e+03 1.303e+03 ... 857.9 799.8\n",
" NO_LIF (time) float64 1kB 0.2573 0.3002 0.2095 ... 0.2122 0.1891\n",
" NO2_LIF (time) float64 1kB nan nan nan nan ... 0.876 0.6945 0.5946\n",
" O3_CL (time) float64 1kB 56.09 56.16 57.02 ... 62.91 63.3 63.07