AirNow and WRF-Chem
Our first example will demonstrate the basics available in MELODIES MONET to compare WRF-Chem model results against AirNow surface observations for ozone and PM2.5.
This example shows how one can compare results from two different model simulations against observations. This particular example compares WRF-Chem results using two different chemical mechanisms (RACM_ESRL and RACM_ESRL_VCP). Simulated surface ozone is slightly improved in WRF-Chem using the RACM_ESRL_VCP mechanism as compared to the AirNow observations.
First, we import the melodies_monet.driver
module.
from melodies_monet import driver
Please install s3fs if retrieving from the Amazon S3 Servers. Otherwise continue with local data
Analysis driver class
Now, lets create an instance of the analysis driver class, melodies_monet.driver.analysis
.
It consists of these main parts:
model instances
observation instances
a paired instance of both
an = driver.analysis()
Initially, most of our analysis
object’s attributes
are set to None
, though some have meaningful defaults:
an
analysis(
control='control.yaml',
control_dict=None,
models={},
obs={},
paired={},
start_time=None,
end_time=None,
time_intervals=None,
download_maps=True,
output_dir=None,
output_dir_save=None,
output_dir_read=None,
debug=False,
save=None,
read=None,
)
Control file
We set the YAML control file and begin by reading the file.
Note
Check out the Description of All YAML Options for info on how to create and modify these files.
an.control = "control_wrfchem_mech-0905_2.yaml"
an.read_control()
an.control_dict
Show code cell output
{'analysis': {'start_time': '2019-09-05-06:00:00',
'end_time': '2019-09-06-06:00:00',
'output_dir': './output/airnow_wrfchem',
'debug': True},
'model': {'RACM_ESRL': {'files': 'example:wrfchem:racm_esrl',
'mod_type': 'wrfchem',
'mod_kwargs': {'mech': 'racm_esrl_vcp', 'surf_only_nc': True},
'radius_of_influence': 12000,
'mapping': {'airnow': {'PM2_5_DRY': 'PM2.5', 'o3': 'OZONE'}},
'projection': None,
'plot_kwargs': {'color': 'magenta', 'marker': 's', 'linestyle': '-'}},
'RACM_ESRL_VCP': {'files': 'example:wrfchem:racm_esrl_vcp',
'mod_type': 'wrfchem',
'mod_kwargs': {'mech': 'racm_esrl_vcp', 'surf_only_nc': True},
'radius_of_influence': 12000,
'mapping': {'airnow': {'PM2_5_DRY': 'PM2.5', 'o3': 'OZONE'}},
'projection': None,
'plot_kwargs': {'color': 'gold', 'marker': 'o', 'linestyle': '-'}}},
'obs': {'airnow': {'use_airnow': True,
'filename': 'example:airnow:2019-09',
'obs_type': 'pt_sfc',
'variables': {'OZONE': {'unit_scale': 1,
'unit_scale_method': '*',
'nan_value': -1.0,
'ylabel_plot': 'Ozone (ppbv)',
'vmin_plot': 15.0,
'vmax_plot': 55.0,
'vdiff_plot': 20.0,
'nlevels_plot': 21},
'PM2.5': {'unit_scale': 1,
'unit_scale_method': '*',
'nan_value': -1.0,
'ylabel_plot': 'PM2.5 (ug/m3)',
'ty_scale': 2.0,
'vmin_plot': 0.0,
'vmax_plot': 22.0,
'vdiff_plot': 15.0,
'nlevels_plot': 23}}}},
'plots': {'plot_grp1': {'type': 'timeseries',
'fig_kwargs': {'figsize': [12, 6]},
'default_plot_kwargs': {'linewidth': 2.0, 'markersize': 10.0},
'text_kwargs': {'fontsize': 24.0},
'domain_type': ['all', 'state_name', 'epa_region'],
'domain_name': ['CONUS', 'CA', 'R9'],
'data': ['airnow_RACM_ESRL', 'airnow_RACM_ESRL_VCP'],
'data_proc': {'rem_obs_nan': True,
'ts_select_time': 'time_local',
'ts_avg_window': 'H',
'set_axis': True}},
'plot_grp2': {'type': 'taylor',
'fig_kwargs': {'figsize': [8, 8]},
'default_plot_kwargs': {'linewidth': 2.0, 'markersize': 10.0},
'text_kwargs': {'fontsize': 16.0},
'domain_type': ['all'],
'domain_name': ['CONUS'],
'data': ['airnow_RACM_ESRL', 'airnow_RACM_ESRL_VCP'],
'data_proc': {'rem_obs_nan': True, 'set_axis': True}},
'plot_grp3': {'type': 'spatial_bias',
'fig_kwargs': {'states': True, 'figsize': [10, 5]},
'text_kwargs': {'fontsize': 16.0},
'domain_type': ['all'],
'domain_name': ['CONUS'],
'data': ['airnow_RACM_ESRL', 'airnow_RACM_ESRL_VCP'],
'data_proc': {'rem_obs_nan': True, 'set_axis': True}},
'plot_grp4': {'type': 'spatial_overlay',
'fig_kwargs': {'states': True, 'figsize': [10, 5]},
'text_kwargs': {'fontsize': 16.0},
'domain_type': ['all', 'epa_region'],
'domain_name': ['CONUS', 'R9'],
'data': ['airnow_RACM_ESRL', 'airnow_RACM_ESRL_VCP'],
'data_proc': {'rem_obs_nan': True, 'set_axis': True}},
'plot_grp5': {'type': 'boxplot',
'fig_kwargs': {'figsize': [8, 6]},
'text_kwargs': {'fontsize': 20.0},
'domain_type': ['all'],
'domain_name': ['CONUS'],
'data': ['airnow_RACM_ESRL', 'airnow_RACM_ESRL_VCP'],
'data_proc': {'rem_obs_nan': True, 'set_axis': False}},
'plot_grp6': {'type': 'multi_boxplot',
'fig_kwargs': {'figsize': [10, 8]},
'text_kwargs': {'fontsize': 20.0},
'domain_type': ['all'],
'domain_name': ['CONUS'],
'region_name': ['epa_region'],
'region_list': ['R1',
'R2',
'R3',
'R4',
'R5',
'R6',
'R7',
'R8',
'R9',
'R10'],
'model_name_list': ['AirNow', 'WRF-Chem', 'WRF-Chem vcp'],
'data': ['airnow_RACM_ESRL', 'airnow_RACM_ESRL_VCP'],
'data_proc': {'rem_obs_nan': True, 'set_axis': False}}},
'stats': {'stat_list': ['MB', 'MdnB', 'R2', 'RMSE'],
'round_output': 2,
'output_table': False,
'output_table_kwargs': {'figsize': [7, 3],
'fontsize': 12.0,
'xscale': 1.4,
'yscale': 1.4,
'edges': 'horizontal'},
'domain_type': ['all'],
'domain_name': ['CONUS'],
'data': ['airnow_RACM_ESRL', 'airnow_RACM_ESRL_VCP']}}
Note: This is the complete file that was loaded.
1# General Description:
2# - Any key that is specific for a plot type will begin with `ts` for timeseries, `ty` for taylor.
3# - Some keys/groups are optional.
4# - For now, all plots except time series average over the analysis window.
5# - Setting axis values
6# - If set_axis = True in data_proc section of each plot_grp,
7# the yaxis for the plot will be set based on the values
8# specified in the obs section for each variable.
9# - If set_axis is set to False, then defaults will be used.
10# - 'vmin_plot' and 'vmax_plot' are needed for
11# 'timeseries', 'spatial_overlay', and 'boxplot'.
12# - 'vdiff_plot' is needed for 'spatial_bias' plots
13# - 'ty_scale' is needed for 'taylor' plots.
14# - 'nlevels' or the number of levels used in the contour plot can also optionally be provided for spatial_overlay plot.
15# - If set_axis = True and the proper limits are not provided in the obs section,
16# a warning will print, and the plot will be created using the default limits.
17analysis:
18 start_time: "2019-09-05-06:00:00" # UTC
19 end_time: "2019-09-06-06:00:00" # UTC
20 output_dir: ./output/airnow_wrfchem # relative to the program using this control file
21 # Currently, the directory must exist or plot saving will error and fail.
22 debug: True
23
24model:
25 RACM_ESRL: # model label
26 files: example:wrfchem:racm_esrl
27 mod_type: "wrfchem"
28 mod_kwargs:
29 mech: "racm_esrl_vcp"
30 surf_only_nc: True # specify that we have only one vertical level; WRF-Chem specific
31 radius_of_influence: 12000 # meters
32 mapping: # of _model_ species name to _obs_ species name
33 airnow: # specifically for the obs labeled 'airnow'
34 PM2_5_DRY: "PM2.5"
35 o3: "OZONE"
36 projection: ~
37 plot_kwargs: # optional
38 color: "magenta"
39 marker: "s"
40 linestyle: "-"
41 RACM_ESRL_VCP:
42 files: example:wrfchem:racm_esrl_vcp
43 mod_type: "wrfchem"
44 mod_kwargs:
45 mech: "racm_esrl_vcp"
46 surf_only_nc: True
47 radius_of_influence: 12000
48 mapping:
49 airnow:
50 PM2_5_DRY: "PM2.5"
51 o3: "OZONE"
52 projection: ~
53 plot_kwargs:
54 color: "gold"
55 marker: "o"
56 linestyle: "-"
57
58obs:
59 airnow: # obs label
60 use_airnow: True
61 filename: example:airnow:2019-09
62 obs_type: pt_sfc
63 variables: # optional
64 OZONE:
65 unit_scale: 1
66 # ^ optional; Scaling factor
67 unit_scale_method: "*"
68 # ^ optional; Multiply = '*' , Add = '+', subtract = '-', divide = '/'
69 nan_value: -1.0
70 # ^ optional; When loading data, set this value to NaN
71 ylabel_plot: "Ozone (ppbv)"
72 # optional; set ylabel in order to include units and/or other info
73 vmin_plot: 15.0
74 # ^ optional; Min for y-axis during plotting.
75 # To apply to a plot, change restrict_yaxis = True.
76 vmax_plot: 55.0
77 # ^ optional; Max for y-axis during plotting.
78 # To apply to a plot, change restrict_yaxis = True.
79 vdiff_plot: 20.0
80 # ^ optional; +/- range to use in bias plots.
81 # To apply to a plot, change restrict_yaxis = True.
82 nlevels_plot: 21
83 # ^ optional; number of levels used in colorbar for contourf plot.
84 PM2.5:
85 unit_scale: 1
86 unit_scale_method: "*"
87 # obs_min: 0
88 # ^ optional; set all values less than this value to NaN
89 # obs_max: 100
90 # ^ optional; set all values greater than this value to NaN
91 nan_value: -1.0
92 # Note: The obs_min, obs_max, and nan_values are set to NaN first
93 # and then the unit conversion is applied.
94 ylabel_plot: "PM2.5 (ug/m3)"
95 ty_scale: 2.0 # optional; `ty_` indicates for Taylor diagram plot
96 vmin_plot: 0.0
97 vmax_plot: 22.0
98 vdiff_plot: 15.0
99 nlevels_plot: 23
100
101plots:
102 plot_grp1:
103 type: "timeseries" # plot type
104 fig_kwargs: # optional; to define figure options
105 figsize: [12, 6] # figure size (width, height) in inches
106 default_plot_kwargs:
107 # ^ optional; Define defaults for all plots.
108 # Important: Model kwargs overwrite these.
109 linewidth: 2.0
110 markersize: 10.
111 text_kwargs: # optional
112 fontsize: 24.
113 domain_type: ["all", "state_name", "epa_region"]
114 # ^ List of domain types: 'all' or any domain in obs file.
115 # (e.g., airnow: epa_region, state_name, siteid, etc.)
116 domain_name: ["CONUS", "CA", "R9"]
117 # ^ List of domain names. If domain_type = all,
118 # the domain name is used in the plot title.
119 data: ["airnow_RACM_ESRL", "airnow_RACM_ESRL_VCP"]
120 # ^ make this a list of pairs in obs_model
121 # where the obs is the obs label and model is the model_label
122 data_proc: # optional??
123 rem_obs_nan: True
124 # ^ True: Remove all points where model or obs variable is NaN.
125 # False: Remove only points where model variable is NaN.
126 ts_select_time: "time_local" # `ts_` indicates this is time series plot-specific
127 # ^ Time used for avg and plotting
128 # Options: 'time' for UTC or 'time_local'
129 ts_avg_window: "H"
130 # ^ Options: None for no averaging, pandas resample rule (e.g., 'H', 'D')
131 set_axis: True
132 # ^ If true, add `vmin_plot` and `vmax_plot` for each variable in obs.
133
134 plot_grp2:
135 type: "taylor"
136 fig_kwargs:
137 figsize: [8, 8]
138 default_plot_kwargs:
139 linewidth: 2.0
140 markersize: 10.
141 text_kwargs:
142 fontsize: 16.
143 domain_type: ["all"]
144 domain_name: ["CONUS"]
145 data: ["airnow_RACM_ESRL", "airnow_RACM_ESRL_VCP"]
146 data_proc:
147 rem_obs_nan: True
148 set_axis: True
149
150 plot_grp3:
151 type: "spatial_bias"
152 fig_kwargs: # optional; For all spatial plots, specify map_kwargs here too.
153 states: True # such as whether to show the state boundaries
154 figsize: [10, 5]
155 text_kwargs:
156 fontsize: 16.
157 domain_type: ["all",]
158 domain_name: ["CONUS"]
159 data: ["airnow_RACM_ESRL", "airnow_RACM_ESRL_VCP"]
160 data_proc:
161 rem_obs_nan: True
162 set_axis: True
163
164 plot_grp4:
165 type: "spatial_overlay"
166 fig_kwargs:
167 states: True
168 figsize: [10, 5]
169 text_kwargs:
170 fontsize: 16.
171 domain_type: ["all", "epa_region"]
172 domain_name: ["CONUS", "R9"]
173 data: ["airnow_RACM_ESRL", "airnow_RACM_ESRL_VCP"]
174 data_proc:
175 rem_obs_nan: True
176 set_axis: True
177
178 plot_grp5:
179 type: "boxplot"
180 fig_kwargs:
181 figsize: [8, 6]
182 text_kwargs:
183 fontsize: 20.
184 domain_type: ["all"]
185 domain_name: ["CONUS"]
186 data: ["airnow_RACM_ESRL", "airnow_RACM_ESRL_VCP"]
187 data_proc:
188 rem_obs_nan: True
189 set_axis: False
190
191 plot_grp6:
192 type: "multi_boxplot"
193 fig_kwargs:
194 figsize: [10, 8]
195 text_kwargs:
196 fontsize: 20.
197 domain_type: ["all"]
198 domain_name: ["CONUS"]
199 region_name: ['epa_region']
200 region_list: ['R1','R2','R3','R4','R5','R6','R7','R8','R9','R10']
201 model_name_list: ['AirNow','WRF-Chem','WRF-Chem vcp']
202 data: ["airnow_RACM_ESRL", "airnow_RACM_ESRL_VCP"]
203 data_proc:
204 rem_obs_nan: True
205 set_axis: False
206
207stats:
208 # Stats require positive numbers, so if you want to calculate temperature use Kelvin!
209 # Wind direction has special calculations for AirNow if obs name is 'WD'
210 stat_list: ["MB", "MdnB", "R2", "RMSE"]
211 # ^ List stats to calculate. Dictionary of definitions included
212 # in submodule `plots/proc_stats`. Only stats listed below are currently working.
213 # Full calc list:
214 # ['STDO', 'STDP', 'MdnNB','MdnNE','NMdnGE',
215 # 'NO', 'NOP', 'NP', 'MO', 'MP', 'MdnO', 'MdnP',
216 # 'RM', 'RMdn', 'MB', 'MdnB', 'NMB', 'NMdnB', 'FB',
217 # 'ME','MdnE','NME', 'NMdnE', 'FE', 'R2', 'RMSE','d1',
218 # 'E1', 'IOA', 'AC']
219 round_output: 2 # optional; defaults to rounding to 3rd decimal place
220 output_table: False
221 # ^ Always outputs a .txt file.
222 # Optional to also output a Matplotlib figure table (image).
223 output_table_kwargs: # optional
224 figsize: [7, 3]
225 fontsize: 12.
226 xscale: 1.4
227 yscale: 1.4
228 edges: "horizontal"
229 domain_type: ["all"]
230 domain_name: ["CONUS"]
231 data: ["airnow_RACM_ESRL", "airnow_RACM_ESRL_VCP"]
Now, some of our analysis
object’s attributes are populated:
an
analysis(
control='control_wrfchem_mech-0905_2.yaml',
control_dict=...,
models={},
obs={},
paired={},
start_time=Timestamp('2019-09-05 06:00:00'),
end_time=Timestamp('2019-09-06 06:00:00'),
time_intervals=None,
download_maps=True,
output_dir='./output/airnow_wrfchem',
output_dir_save='./output/airnow_wrfchem',
output_dir_read='./output/airnow_wrfchem',
debug=True,
save=None,
read=None,
)
Load the model data
The driver will automatically loop through the “models” found in the model
section
of the YAML file and create an instance of melodies_monet.driver.model
for each
that includes the
label
mapping information
file names (can be expressed using a glob expression)
xarray object
Note: Relevant control file section.
1model:
2 RACM_ESRL: # model label
3 files: example:wrfchem:racm_esrl
4 mod_type: "wrfchem"
5 mod_kwargs:
6 mech: "racm_esrl_vcp"
7 surf_only_nc: True # specify that we have only one vertical level; WRF-Chem specific
8 radius_of_influence: 12000 # meters
9 mapping: # of _model_ species name to _obs_ species name
10 airnow: # specifically for the obs labeled 'airnow'
11 PM2_5_DRY: "PM2.5"
12 o3: "OZONE"
13 projection: ~
14 plot_kwargs: # optional
15 color: "magenta"
16 marker: "s"
17 linestyle: "-"
18 RACM_ESRL_VCP:
19 files: example:wrfchem:racm_esrl_vcp
20 mod_type: "wrfchem"
21 mod_kwargs:
22 mech: "racm_esrl_vcp"
23 surf_only_nc: True
24 radius_of_influence: 12000
25 mapping:
26 airnow:
27 PM2_5_DRY: "PM2.5"
28 o3: "OZONE"
29 projection: ~
30 plot_kwargs:
31 color: "gold"
32 marker: "o"
33 linestyle: "-"
an.open_models()
example:wrfchem:racm_esrl
**** Reading WRF-Chem model output...
example:wrfchem:racm_esrl_vcp
**** Reading WRF-Chem model output...
Applying open_models()
populates the models
attribute.
an.models
{'RACM_ESRL': model(
model='wrfchem',
radius_of_influence=12000,
mod_kwargs={'mech': 'racm_esrl_vcp', 'surf_only_nc': True, 'var_list': ['PM2_5_DRY', 'o3']},
file_str='example:wrfchem:racm_esrl',
label='RACM_ESRL',
obj=...,
mapping={'airnow': {'PM2_5_DRY': 'PM2.5', 'o3': 'OZONE'}},
label='RACM_ESRL',
...
),
'RACM_ESRL_VCP': model(
model='wrfchem',
radius_of_influence=12000,
mod_kwargs={'mech': 'racm_esrl_vcp', 'surf_only_nc': True, 'var_list': ['PM2_5_DRY', 'o3']},
file_str='example:wrfchem:racm_esrl_vcp',
label='RACM_ESRL_VCP',
obj=...,
mapping={'airnow': {'PM2_5_DRY': 'PM2.5', 'o3': 'OZONE'}},
label='RACM_ESRL_VCP',
...
)}
We can access the underlying dataset with the
obj
attribute.
an.models['RACM_ESRL'].obj
<xarray.Dataset> Dimensions: (y: 284, x: 440, time: 31, z: 1) Coordinates: longitude (y, x) float32 -122.3 -122.2 -122.1 ... -60.68 -60.52 -60.37 latitude (y, x) float32 21.19 21.22 21.24 21.27 ... 50.33 50.28 50.24 50.2 * time (time) datetime64[ns] 2019-09-05 ... 2019-09-06T06:00:00 Dimensions without coordinates: y, x, z Data variables: PM2_5_DRY (time, z, y, x) float32 3.046 3.048 2.84 ... 0.45 0.4506 0.4512 o3 (time, z, y, x) float32 30.0 30.0 30.0 30.0 ... 34.72 36.87 37.13 Attributes: (12/17) FieldType: 104 MemoryOrder: XYZ description: pm2.5 aerosol dry mass units: ug m^-3 stagger: coordinates: XLONG XLAT XTIME ... ... MOAD_CEN_LAT: 39.617638 STAND_LON: -97.0 MAP_PROJ: 1 CEN_LAT: 39.617638 CEN_LON: -97.77487 mapping_tables_to_airnow: {'OZONE': 'o3', 'PM2.5':...
Load the observational data
As with the model data, the driver will loop through the “observations” found in
the obs
section of the YAML file and create an instance of
melodies_monet.driver.observation
for each.
Note: Relevant control file section.
1obs:
2 airnow: # obs label
3 use_airnow: True
4 filename: example:airnow:2019-09
5 obs_type: pt_sfc
6 variables: # optional
7 OZONE:
8 unit_scale: 1
9 # ^ optional; Scaling factor
10 unit_scale_method: "*"
11 # ^ optional; Multiply = '*' , Add = '+', subtract = '-', divide = '/'
12 nan_value: -1.0
13 # ^ optional; When loading data, set this value to NaN
14 ylabel_plot: "Ozone (ppbv)"
15 # optional; set ylabel in order to include units and/or other info
16 vmin_plot: 15.0
17 # ^ optional; Min for y-axis during plotting.
18 # To apply to a plot, change restrict_yaxis = True.
19 vmax_plot: 55.0
20 # ^ optional; Max for y-axis during plotting.
21 # To apply to a plot, change restrict_yaxis = True.
22 vdiff_plot: 20.0
23 # ^ optional; +/- range to use in bias plots.
24 # To apply to a plot, change restrict_yaxis = True.
25 nlevels_plot: 21
26 # ^ optional; number of levels used in colorbar for contourf plot.
27 PM2.5:
28 unit_scale: 1
29 unit_scale_method: "*"
30 # obs_min: 0
31 # ^ optional; set all values less than this value to NaN
32 # obs_max: 100
33 # ^ optional; set all values greater than this value to NaN
34 nan_value: -1.0
35 # Note: The obs_min, obs_max, and nan_values are set to NaN first
36 # and then the unit conversion is applied.
37 ylabel_plot: "PM2.5 (ug/m3)"
38 ty_scale: 2.0 # optional; `ty_` indicates for Taylor diagram plot
39 vmin_plot: 0.0
40 vmax_plot: 22.0
41 vdiff_plot: 15.0
42 nlevels_plot: 23
an.open_obs()
an.obs
{'airnow': observation(
obs='airnow',
label='airnow',
file='example:airnow:2019-09',
obj=...,
type='pt_src',
type=None,
variable_dict={'OZONE': {'unit_scale': 1, 'unit_scale_method': '*', 'nan_value': -1.0, 'ylabel_plot': 'Ozone (ppbv)', 'vmin_plot': 15.0, 'vmax_plot': 55.0, 'vdiff_plot': 20.0, 'nlevels_plot': 21}, 'PM2.5': {'unit_scale': 1, 'unit_scale_method': '*', 'nan_value': -1.0, 'ylabel_plot': 'PM2.5 (ug/m3)', 'ty_scale': 2.0, 'vmin_plot': 0.0, 'vmax_plot': 22.0, 'vdiff_plot': 15.0, 'nlevels_plot': 23}},
)}
an.obs['airnow'].obj
<xarray.Dataset> Dimensions: (x: 3786, time: 2091, y: 1) Coordinates: * x (x) int64 0 1 2 3 4 5 6 7 ... 3779 3780 3781 3782 3783 3784 3785 * time (time) datetime64[ns] 2019-09-01 ... 2019-09-30T00:30:00 latitude (y, x) float64 ... longitude (y, x) float64 ... siteid (y, x) object ... Dimensions without coordinates: y Data variables: (12/30) BARPR (time, y, x) float64 ... BC (time, y, x) float64 ... CO (time, y, x) float64 ... NO (time, y, x) float64 ... NO2 (time, y, x) float64 ... NO2Y (time, y, x) float64 ... ... ... cmsa_name (y, x) float64 ... msa_code (y, x) float64 ... msa_name (y, x) object ... state_name (y, x) object ... epa_region (y, x) object ... time_local (time, y, x) datetime64[ns] ... Attributes: title: format: NetCDF-4 date_created: 2021-06-07
Pair model and observational data
Now, we create a melodies_monet.driver.pair
for each model–obs pair
using the pair_data()
routine.
%%time
an.pair_data()
Show code cell output
After pairing:
time BARPR BC CO NO NO2 NO2Y NOX NOY OZONE \
0 2019-09-01 00:00:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 25.0
1 2019-09-01 00:15:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 NaN
2 2019-09-01 00:30:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 NaN
3 2019-09-01 01:00:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 24.0
4 2019-09-01 01:15:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 NaN
... ... ... ... ... ... ... ... ... ... ...
7916521 2019-09-29 23:15:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 NaN
7916522 2019-09-29 23:30:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 NaN
7916523 2019-09-30 00:00:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 8.0
7916524 2019-09-30 00:15:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 NaN
7916525 2019-09-30 00:30:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 NaN
... longitude cmsa_name msa_code msa_name state_name \
0 ... -52.8167 -1.0 -1.0 CC
1 ... -52.8167 -1.0 -1.0 CC
2 ... -52.8167 -1.0 -1.0 CC
3 ... -52.8167 -1.0 -1.0 CC
4 ... -52.8167 -1.0 -1.0 CC
... ... ... ... ... ... ...
7916521 ... 69.2725 -1.0 -1.0
7916522 ... 69.2725 -1.0 -1.0
7916523 ... 69.2725 -1.0 -1.0
7916524 ... 69.2725 -1.0 -1.0
7916525 ... 69.2725 -1.0 -1.0
epa_region time_local siteid PM2_5_DRY o3
0 CA 2019-08-31 20:00:00 000010102 NaN NaN
1 CA 2019-08-31 20:15:00 000010102 NaN NaN
2 CA 2019-08-31 20:30:00 000010102 NaN NaN
3 CA 2019-08-31 21:00:00 000010102 NaN NaN
4 CA 2019-08-31 21:15:00 000010102 NaN NaN
... ... ... ... ... ..
7916521 DSUZ 2019-09-30 04:15:00 UZB010001 NaN NaN
7916522 DSUZ 2019-09-30 04:30:00 UZB010001 NaN NaN
7916523 DSUZ 2019-09-30 05:00:00 UZB010001 NaN NaN
7916524 DSUZ 2019-09-30 05:15:00 UZB010001 NaN NaN
7916525 DSUZ 2019-09-30 05:30:00 UZB010001 NaN NaN
[7916526 rows x 36 columns]
After pairing:
time BARPR BC CO NO NO2 NO2Y NOX NOY OZONE \
0 2019-09-01 00:00:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 25.0
1 2019-09-01 00:15:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 NaN
2 2019-09-01 00:30:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 NaN
3 2019-09-01 01:00:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 24.0
4 2019-09-01 01:15:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 NaN
... ... ... ... ... ... ... ... ... ... ...
7916521 2019-09-29 23:15:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 NaN
7916522 2019-09-29 23:30:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 NaN
7916523 2019-09-30 00:00:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 8.0
7916524 2019-09-30 00:15:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 NaN
7916525 2019-09-30 00:30:00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 NaN
... longitude cmsa_name msa_code msa_name state_name \
0 ... -52.8167 -1.0 -1.0 CC
1 ... -52.8167 -1.0 -1.0 CC
2 ... -52.8167 -1.0 -1.0 CC
3 ... -52.8167 -1.0 -1.0 CC
4 ... -52.8167 -1.0 -1.0 CC
... ... ... ... ... ... ...
7916521 ... 69.2725 -1.0 -1.0
7916522 ... 69.2725 -1.0 -1.0
7916523 ... 69.2725 -1.0 -1.0
7916524 ... 69.2725 -1.0 -1.0
7916525 ... 69.2725 -1.0 -1.0
epa_region time_local siteid PM2_5_DRY o3
0 CA 2019-08-31 20:00:00 000010102 NaN NaN
1 CA 2019-08-31 20:15:00 000010102 NaN NaN
2 CA 2019-08-31 20:30:00 000010102 NaN NaN
3 CA 2019-08-31 21:00:00 000010102 NaN NaN
4 CA 2019-08-31 21:15:00 000010102 NaN NaN
... ... ... ... ... ..
7916521 DSUZ 2019-09-30 04:15:00 UZB010001 NaN NaN
7916522 DSUZ 2019-09-30 04:30:00 UZB010001 NaN NaN
7916523 DSUZ 2019-09-30 05:00:00 UZB010001 NaN NaN
7916524 DSUZ 2019-09-30 05:15:00 UZB010001 NaN NaN
7916525 DSUZ 2019-09-30 05:30:00 UZB010001 NaN NaN
[7916526 rows x 36 columns]
CPU times: user 45 s, sys: 39.5 s, total: 1min 24s
Wall time: 1min 19s
an.paired
{'airnow_RACM_ESRL': pair(
type='pt_sfc',
radius_of_influence=1000000.0,
obs='airnow',
model='RACM_ESRL',
model_vars=['PM2_5_DRY', 'o3'],
obs_vars=['PM2.5', 'OZONE'],
filename='airnow_RACM_ESRL.nc',
),
'airnow_RACM_ESRL_VCP': pair(
type='pt_sfc',
radius_of_influence=1000000.0,
obs='airnow',
model='RACM_ESRL_VCP',
model_vars=['PM2_5_DRY', 'o3'],
obs_vars=['PM2.5', 'OZONE'],
filename='airnow_RACM_ESRL_VCP.nc',
)}
an.paired['airnow_RACM_ESRL']
pair(
type='pt_sfc',
radius_of_influence=1000000.0,
obs='airnow',
model='RACM_ESRL',
model_vars=['PM2_5_DRY', 'o3'],
obs_vars=['PM2.5', 'OZONE'],
filename='airnow_RACM_ESRL.nc',
)
Plot
The plotting()
routine produces plots.
Note: Relevant control file section.
1plots:
2 plot_grp1:
3 type: "timeseries" # plot type
4 fig_kwargs: # optional; to define figure options
5 figsize: [12, 6] # figure size (width, height) in inches
6 default_plot_kwargs:
7 # ^ optional; Define defaults for all plots.
8 # Important: Model kwargs overwrite these.
9 linewidth: 2.0
10 markersize: 10.
11 text_kwargs: # optional
12 fontsize: 24.
13 domain_type: ["all", "state_name", "epa_region"]
14 # ^ List of domain types: 'all' or any domain in obs file.
15 # (e.g., airnow: epa_region, state_name, siteid, etc.)
16 domain_name: ["CONUS", "CA", "R9"]
17 # ^ List of domain names. If domain_type = all,
18 # the domain name is used in the plot title.
19 data: ["airnow_RACM_ESRL", "airnow_RACM_ESRL_VCP"]
20 # ^ make this a list of pairs in obs_model
21 # where the obs is the obs label and model is the model_label
22 data_proc: # optional??
23 rem_obs_nan: True
24 # ^ True: Remove all points where model or obs variable is NaN.
25 # False: Remove only points where model variable is NaN.
26 ts_select_time: "time_local" # `ts_` indicates this is time series plot-specific
27 # ^ Time used for avg and plotting
28 # Options: 'time' for UTC or 'time_local'
29 ts_avg_window: "H"
30 # ^ Options: None for no averaging, pandas resample rule (e.g., 'H', 'D')
31 set_axis: True
32 # ^ If true, add `vmin_plot` and `vmax_plot` for each variable in obs.
33
34 plot_grp2:
35 type: "taylor"
36 fig_kwargs:
37 figsize: [8, 8]
38 default_plot_kwargs:
39 linewidth: 2.0
40 markersize: 10.
41 text_kwargs:
42 fontsize: 16.
43 domain_type: ["all"]
44 domain_name: ["CONUS"]
45 data: ["airnow_RACM_ESRL", "airnow_RACM_ESRL_VCP"]
46 data_proc:
47 rem_obs_nan: True
48 set_axis: True
49
50 plot_grp3:
51 type: "spatial_bias"
52 fig_kwargs: # optional; For all spatial plots, specify map_kwargs here too.
53 states: True # such as whether to show the state boundaries
54 figsize: [10, 5]
55 text_kwargs:
56 fontsize: 16.
57 domain_type: ["all",]
58 domain_name: ["CONUS"]
59 data: ["airnow_RACM_ESRL", "airnow_RACM_ESRL_VCP"]
60 data_proc:
61 rem_obs_nan: True
62 set_axis: True
63
64 plot_grp4:
65 type: "spatial_overlay"
66 fig_kwargs:
67 states: True
68 figsize: [10, 5]
69 text_kwargs:
70 fontsize: 16.
71 domain_type: ["all", "epa_region"]
72 domain_name: ["CONUS", "R9"]
73 data: ["airnow_RACM_ESRL", "airnow_RACM_ESRL_VCP"]
74 data_proc:
75 rem_obs_nan: True
76 set_axis: True
77
78 plot_grp5:
79 type: "boxplot"
80 fig_kwargs:
81 figsize: [8, 6]
82 text_kwargs:
83 fontsize: 20.
84 domain_type: ["all"]
85 domain_name: ["CONUS"]
86 data: ["airnow_RACM_ESRL", "airnow_RACM_ESRL_VCP"]
87 data_proc:
88 rem_obs_nan: True
89 set_axis: False
%%time
an.plotting()
Reference std: 5.409628418207226
Warning: ty_scale not specified for OZONE, so default used.
Reference std: 16.454896847070792
Warning: ty_scale not specified for OZONE, so default used.
/home/btang6/MELODIES-MONET/melodies_monet/plots/surfplots.py:341: FutureWarning: The default value of numeric_only in DataFrameGroupBy.mean is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
df_mean=df.groupby(['siteid'],as_index=False).mean()
/home/btang6/MELODIES-MONET/melodies_monet/plots/surfplots.py:341: FutureWarning: The default value of numeric_only in DataFrameGroupBy.mean is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
df_mean=df.groupby(['siteid'],as_index=False).mean()
/home/btang6/MELODIES-MONET/melodies_monet/plots/surfplots.py:341: FutureWarning: The default value of numeric_only in DataFrameGroupBy.mean is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
df_mean=df.groupby(['siteid'],as_index=False).mean()
/home/btang6/MELODIES-MONET/melodies_monet/plots/surfplots.py:341: FutureWarning: The default value of numeric_only in DataFrameGroupBy.mean is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
df_mean=df.groupby(['siteid'],as_index=False).mean()
/home/btang6/MELODIES-MONET/melodies_monet/plots/surfplots.py:693: FutureWarning: The default value of numeric_only in DataFrameGroupBy.mean is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
df_mean=df.groupby(['siteid'],as_index=False).mean()
/home/btang6/MELODIES-MONET/melodies_monet/plots/surfplots.py:693: FutureWarning: The default value of numeric_only in DataFrameGroupBy.mean is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
df_mean=df.groupby(['siteid'],as_index=False).mean()
/home/btang6/MELODIES-MONET/melodies_monet/plots/surfplots.py:693: FutureWarning: The default value of numeric_only in DataFrameGroupBy.mean is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
df_mean=df.groupby(['siteid'],as_index=False).mean()
/home/btang6/MELODIES-MONET/melodies_monet/plots/surfplots.py:693: FutureWarning: The default value of numeric_only in DataFrameGroupBy.mean is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
df_mean=df.groupby(['siteid'],as_index=False).mean()
/home/btang6/MELODIES-MONET/melodies_monet/plots/surfplots.py:693: FutureWarning: The default value of numeric_only in DataFrameGroupBy.mean is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
df_mean=df.groupby(['siteid'],as_index=False).mean()
/home/btang6/MELODIES-MONET/melodies_monet/plots/surfplots.py:693: FutureWarning: The default value of numeric_only in DataFrameGroupBy.mean is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
df_mean=df.groupby(['siteid'],as_index=False).mean()
/home/btang6/MELODIES-MONET/melodies_monet/plots/surfplots.py:693: FutureWarning: The default value of numeric_only in DataFrameGroupBy.mean is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
df_mean=df.groupby(['siteid'],as_index=False).mean()
/home/btang6/MELODIES-MONET/melodies_monet/plots/surfplots.py:693: FutureWarning: The default value of numeric_only in DataFrameGroupBy.mean is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
df_mean=df.groupby(['siteid'],as_index=False).mean()
CPU times: user 2min 53s, sys: 2min 29s, total: 5min 23s
Wall time: 5min 4s
The figures are saved in the directory specified by the
analysis
instance’s
output_dir
attribute.
Statistics
The stats()
routine produces tables of statistics.
Note: Relevant control file section.
1 plot_grp6:
2 type: "multi_boxplot"
3 fig_kwargs:
4 figsize: [10, 8]
5 text_kwargs:
6 fontsize: 20.
7 domain_type: ["all"]
8 domain_name: ["CONUS"]
9 region_name: ['epa_region']
10 region_list: ['R1','R2','R3','R4','R5','R6','R7','R8','R9','R10']
11 model_name_list: ['AirNow','WRF-Chem','WRF-Chem vcp']
12 data: ["airnow_RACM_ESRL", "airnow_RACM_ESRL_VCP"]
13 data_proc:
14 rem_obs_nan: True
15 set_axis: False
16
17stats:
18 # Stats require positive numbers, so if you want to calculate temperature use Kelvin!
19 # Wind direction has special calculations for AirNow if obs name is 'WD'
20 stat_list: ["MB", "MdnB", "R2", "RMSE"]
21 # ^ List stats to calculate. Dictionary of definitions included
22 # in submodule `plots/proc_stats`. Only stats listed below are currently working.
23 # Full calc list:
24 # ['STDO', 'STDP', 'MdnNB','MdnNE','NMdnGE',
25 # 'NO', 'NOP', 'NP', 'MO', 'MP', 'MdnO', 'MdnP',
%%time
an.stats()
CPU times: user 50.1 s, sys: 1min 20s, total: 2min 10s
Wall time: 1min 37s
The stats routine has produced two files (one for each data variable). This is one of them:
Stat_ID,Stat_FullName,airnow_RACM_ESRL,airnow_RACM_ESRL_VCP
MB,Mean_Bias,3.93,3.23
MdnB,Median_Bias,3.86,3.27
R2,Coefficient_of_Determination_(R2),0.56,0.54
RMSE,Root_Mean_Square_Error,11.64,11.59