CMAQv5.1 Tools and Utilities

Release Date: April 2016


Overview

CMAQv5.1 includes optional utility programs to process and prepare data for model evaluation. These programs are located in the $M3MODEL/TOOLS archive.

Updates in v5.1 release compared to v5.0.2 tools release:

  1. Observation data are now available from nine air quality monitoring networks for 2001 - 2014.
    • The formatted observation data files needed for running the sitecmp utility to match model and observed data are available on the CMAS Data Clearinghouse.
    • Previously users had to download these data from various locations on the web and possibly reformat them to be compatible with the sitecmp utility for matching modeled and observed data.
    • Observation files are included for the years 2001 through 2014 for the following networks: AERONET, AMON, AQS (hourly data), CASTNET (hourly and weekly data), CSN, IMPROVE, NADP, NAPS, and SEARCH (hourly and daily data). See the accompanying README file in the obs_data folder for information on the creation dates of each of the observation files.
    • The program used to create these formatted observation data files (merge_aqs_species) has also been included in this release.
  2. The combine utility has been expanded to support post-processing of outputs from the two-way WRF-CMAQ system.
  3. The sitecmp utility for matching observed and modeled data has been updated.
    • The utility now runs faster in most situations because data are read only once if sufficient memory is available.
    • The header of the output .csv files has been changed to allow for easier import into R.
    • A field for the AQS parameter occurrence code was added and is used in conjunction with the station ID to identify a unique observation. For observation files without a POCode field, a default value of 1 is used.
    • Missing values were changed from -99 to -999.
    • Longitudes are no longer forced to be negative.
  4. The process for evaluation of the daily max 8hr ozone metric has been simplified.
    • airs2ext, cast2ext and rd_airs are no longer needed to reformat hourly AQS and CASTNET observed data.
    • A new program, sitecmp_dailyo3, is now provided to match observed and modeled daily max 8hr ozone values. The run script for the new utility is very similar to that of the sitecmp utility.
  5. Four additional Fortran utilities have been released:
    • appendwrf -- user can concatenate variables from multiple WRF input or output files into a single file along the time (unlimited) dimension
    • bldoverlay -- user can create an observation overlay file that can be imported into either PAVE or VERDI
    • hr2day -- user can create gridded IOAPI files with daily values (e.g. daily sum, daily max 8hr average, etc.) from gridded IOAPI files containing hourly values
    • writesite -- user can generate a csv file from an IOAPI data file for a set of species at defined site locations

Installation Instructions

The CMAQv5.1 Evaluation Tools tarfile includes source code and scripts (build and run) for each postprocessing program. To install the CMAQ_TOOLS, first download, install, and build the base version of the model. Then download the CMAQ_TOOLS tar file and untar into the CMAQv5.1 home directory:

cd $M3HOME/../
tar xvzf CMAQv5.1_CMAQ_TOOLS.Apr2016.tar.gz

Use the bldit scripts as you would the base cctm build script, as described for each processing program below.

Note that you will need to have the libraries (I/O API, netCDF, MPI) and model builder (bldmake) required by the base model to compile this version of the code. See the base model README for instructions on building these components.


Build CMAQ_TOOLS executables

Create the processing tools executables following these steps:


cd $M3HOME/scripts/tools/appendwrf
./bldit.appendwrf |& tee bldit.appendwrf.log
cd $M3HOME/scripts/tools/bldoverlay
./bldit.bldoverlay |& tee bldit.bldoverlay.log
cd $M3HOME/scripts/tools/combine
./bldit.combine |& tee bldit.combine.log
cd $M3HOME/scripts/tools/hr2day
./bldit.hr2day |& tee bldit.hr2day.log
cd $M3HOME/scripts/tools/sitecmp
./bldit.sitecmp |& tee bldit.sitecmp.log
cd $M3HOME/scripts/tools/sitecmp_dailyo3
./bldit.sitecmp_dailyo3 |& tee bldit.sitecmp_dailyo3.log
cd $M3HOME/scripts/tools/writesite
./bldit.writesite |& tee bldit.writesite.log

Run Instructions

The CMAQ_TOOLS distribution package includes a full set of test input data needed to run each tool for one simulation day. The general approach for running each CMAQ_TOOLS program is described below.

APPENDWRF utility program

This program concatenates variables from multiple WRF input or output files into a single file along the "Time" (unlimited) dimension. This can be useful in cases where a user may have WRF input or output files that were generated for shorter time periods and wants to combine them into files with longer (e.g. monthly) duration.

Environment variables used

INFILE_1      input file number 1 (max of 15)
INFILE_2      input file number 2 (max of 15)
OUTFILE       output file name
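
For illustration, the relevant portion of run.appendwrf might look like the minimal csh sketch below. The file names and the executable name are assumptions for illustration only, not the actual contents of the distributed script:

# hypothetical excerpt from run.appendwrf
# file names and executable name are illustrative assumptions
setenv INFILE_1 wrfout_d01_2011-07-01
setenv INFILE_2 wrfout_d01_2011-07-02
setenv OUTFILE  wrfout_d01_201107_joined
./appendwrf.exe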


To Run: Edit the sample run script (run.appendwrf), then run:

run.appendwrf |& tee appendwrf.log

Check the log file to ensure complete and correct execution without errors.

BLDOVERLAY utility program

This program creates an observation overlay file that can be imported into either PAVE or VERDI. It requires as input a file containing observed data in a specific format, and then creates a PAVE/VERDI compatible overlay file.

Environment variables used

SDATE      start date in the format: YYYYDDD
EDATE      end date in the format: YYYYDDD
FILETYPE   type of input file to be used (see information below).  Choices are: OBS, SITES (default is OBS)
OLAYTYPE   type of data for the overlay output file.  If the input data are daily, this should be set to DAILY.
           If the input data are hourly, choices are: HOURLY, 1HRMAX, 8HRMAX
SPECIES    list of names of the species in the input file (e.g. setenv SPECIES 'O3,NO,CO')
UNITS      list of units of the species in the input file (e.g. setenv UNITS 'ppb,ppb,ppb')
INFILE     file containing input observed data
VALUE      static value to use as "observed" concentration for SITES filetype (default is 1)
OUTFILE    name of overlay file to create

Input file types and format:

Bldoverlay accepts "OBS" and "SITES" formats (FILETYPE) for the input file. For hourly output data (OLAYTYPE HOURLY) the program assumes that observations are in local standard time (LST) and applies a simple timezone shift to GMT using timezones every 15 degrees longitude. For daily output data (OLAYTYPE DAILY, 1HRMAX or 8HRMAX) no time shifting is done so the output data remains in LST. In this case the user can use the HR2DAY utility to time shift and average hourly model data to create daily model fields in LST.

OBS format:     The OBS format consists of comma separated values in the format YYYYDDD, HH, Site_ID, Longitude, Latitude, Value1[, Value2, Value3,...].
                Note that even if the input data are daily, an hour column (HH) is still required in the input data file.  In this case HH is ignored,
                so the user could set this value to 0 for all records.
SITES format:   Used to create a static overlay file in which each site is assigned the value specified by VALUE (default is 1). The input is a tab delimited file with the structure Site_ID Longitude Latitude.
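
For illustration, a single record in an OBS-format input file for hourly ozone, together with the corresponding run-script settings, might look like the sketch below. The site ID, coordinates, file names, and executable name are hypothetical and are not taken from the distributed test data:

Example OBS record (YYYYDDD, HH, Site_ID, Longitude, Latitude, Value1):

2011182, 0, 010730023, -86.815, 33.553, 41.0

Corresponding csh excerpt:

# hypothetical bldoverlay settings for an hourly ozone overlay
setenv SDATE    2011182
setenv EDATE    2011212
setenv FILETYPE OBS
setenv OLAYTYPE HOURLY
setenv SPECIES  'O3'
setenv UNITS    'ppb'
setenv INFILE   obs_hourly_O3_2011.csv
setenv OUTFILE  O3_overlay_201107.ncf
./bldoverlay.exe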

To Run: Edit the sample run script (run.bldoverlay), then run:

run.bldoverlay |& tee bldoverlay.log

Check the log file to ensure complete and correct execution without errors.

COMBINE utility program

This program combines fields from a set of IOAPI or wrfout input files into an output file. The file assigned to the environment variable SPECIES_DEF defines the new species variables and how they are constructed. This means that all the species listed in the SPECIES_DEF file need to be output when CMAQ is run. One option is to set the ACONC (or CONC) output to include all species.

Environment variables used

GENSPEC      Indicates to generate a new SPECIES_DEF file (does not generate OUTFILE)
SPECIES_DEF  Species definition file defining the new variables of the output file
INFILE1      input file number 1 (max of 9)
OUTFILE      IOAPI output file name, opened as read/write if it does not exist and read/write/update if it already exists

Environment Variables (not required):

IOAPI_ISPH  projection sphere type (use type #20 to match WRF/CMAQ)
            (the default for this program is 20, overriding the ioapi default of 8) 

Record type descriptions in SPECIES_DEF file

/ records are comment lines
! records are comment lines
# records can be used to define parameters
#start   YYYYDDD  HHMMSS
#end     YYYYDDD  HHMMSS
#layer      KLAY     (default is all layers)

All other records are read as variable definition records

format of variable definition records (comma separated fields)
field 1: variable name (maximum of 16 characters)
field 2: units (maximum of 10 characters)
field 3: formula expression (maximum of 512 characters)

Formula expressions support the operators ^ + - * / and are evaluated from left to right using the precedence order ^*/+-. Order of evaluation can be forced by use of parentheses. When part of an expression is enclosed in parentheses, that part is evaluated first. Other supported functions include "LOG", "EXP", "SQRT", and "ABS". In addition, expressions can be combined to create conditional statements of the form "expression_for_condition ? expression_if_true : expression_if_false".

Variables from input files are defined by their name followed by the corresponding file number enclosed in brackets. Once defined in a species definition file, variables can subsequently be referred to by their name and the number zero enclosed in brackets. Adding a + or - sign before the file number within the brackets instructs combine to use the variable value for the next or previous time step instead of the current time step when evaluating the expression. This can be used to define variables that are computed as the difference between the current and previous time step, for example to compute hourly precipitation as the difference in WRF cumulative precipitation values between successive time steps.
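
For illustration, a small hypothetical SPECIES_DEF fragment is sketched below. It assumes file 1 is a CMAQ concentration file containing O3 in ppmV and file 2 is a wrfout file containing the WRF cumulative precipitation variables RAINC and RAINNC in mm; the variable names available in a user's own files may differ:

/ hypothetical SPECIES_DEF fragment; see the sample files distributed with CMAQ_TOOLS for real examples
#layer 1
O3              ,ppbV      ,1000.0*O3[1]
HOURLY_RAIN     ,mm        ,RAINC[2]+RAINNC[2]-RAINC[-2]-RAINNC[-2]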

Examples of possible expressions are shown in the sample SPECIES_DEF files distributed with the CMAQ_TOOLS package.


To run: Edit the sample run script (run.combine.aconc), then run:

run.combine.aconc |& tee combine.aconc.log

A sample run script for creating a combine file for evaluating deposition is also provided (run.combine.dep). Check the log file to ensure complete and correct execution without errors.


Note on the use of wrfout files as input to combine: In previous releases of combine, meteorological variables used as part of the SPECIES_DEF file needed to be in IOAPI files; typically these files would have been generated by MCIP. The ability to use wrfout files as input to combine was added in this release to support post-processing of outputs from the two-way model when MCIP files may not be available. While combine allows a combination of IOAPI and (netCDF) wrfout files as input files, the first input file (i.e. INFILE1) must be an IOAPI file, and its grid description information will be used to define the grid for OUTFILE. Only wrfout variables defined with dimensions "west_east", "south_north", and optionally "bottom_top" can be utilized by combine and referenced in the SPECIES_DEF file. It is assumed that the projection used in the WRF simulation that generated the wrfout files is the same as the projection defined in the IOAPI files, specifically INFILE1. If necessary, combine will window the variables from the wrfout file to the domain specified in INFILE1; this is often the case when the CMAQ domain is a subset of the WRF domain.

Note on time steps: Unless "start" and "end" are defined in the SPECIES_DEF file, combine will determine the longest time period that is common to all input files and will produce outputs for that time period.

HR2DAY utility program

This program creates gridded IOAPI files with daily values from gridded IOAPI files containing hourly values.

Environment variables used

USELOCAL      use local time when computing daily values (default N)
USEDST        use daylight savings time when computing daily values (default N)
TZFILE        location of time zone data file, tz.csv (this is a required input file)
PARTIAL_DAY   allow use of partial days when computing daily values. If this is set to N, the program will require at least 18 out of 24 values to
              be present in the time zone of interest to compute a daily value (default N)
HROFFSET      constant hour offset between desired time zone and GMT to use when computing daily values.
              For example, if one wants to compute daily values with respect to Eastern Standard Time (EST) and the time zone for the IOAPI input
              file is GMT, this should be set to 5 (default 0).  
START_HOUR    starting hour to use when computing daily values (default 0)
END_HOUR      ending hour to use when computing daily values (default 23)
TEMPERATURE   temperature variable to be used in the @MAXT operation (default TEMP2)
INFILE        input IOAPI file name with hourly values. Supported map projections are Lambert conformal, polar stereographic, and lat/lon
OUTFILE       output IOAPI file name with computed daily values

Environment Variables (not required):

IOAPI_ISPH  projection sphere type (use type #20 to match WRF/CMAQ)
            (ioapi default is 8)

Species and operator definitions: Defines the name, units, expression and daily operation for each variable in OUTFILE. These definitions are specified by environment variables SPECIES_[n]

format:  SPECIES_1 = "[variable1_name], [variable1_units], [model_expression1], [operation1]"
         SPECIES_2 = "[variable2_name], [variable2_units], [model_expression2], [operation2]"

variable[n]_name: desired name of the daily output variable, maximum 16 characters
         
variable[n]_units: units of the daily output variable, maximum 16 characters
         
model_expression[n]: Formula expressions support the operators +-*/ and are evaluated from left to right using the precedence order */+-. Order of
                     evaluation can be forced by use of parentheses. When part of an expression is enclosed in parentheses, that part is evaluated
                     first. Other supported functions include "LOG", "EXP", "SQRT", and "ABS". In addition, expressions can be combined to create
                     conditional statements of the form "expression_for_condition ? expression_if_true : expression_if_false".
         
operation[n]: daily operation to perform. Options are
           
SUM - sums the 24 hourly values
AVG - sums the 24 hourly values and divides by 24
MIN - uses the minimum hourly value
MAX - uses the maximum hourly value
HR@MIN - hour of the minimum hourly value
HR@MAX - hour of the maximum hourly value
@MAXT - uses the hourly value at maximum temperature
MAXDIF - uses the maximum hourly change
8HRMAX - uses the maximum 8-hour period
W126 - computes the secondary ozone standard, a weighted average between 8am and 7pm
@8HRMAXO3 - averages the value within the 8-hr-max ozone period
HR@8HRMAX - starting hour of the 8-hr-max period
SUM06 - computes the SUM06 ozone value

examples:
               
setenv SPECIES_1 "O3,ppbV,1000*O3,8HRMAX"    (computes the 8-hr daily maximum value of 1000 * O3 from INFILE (assumed to be in ppmV) and writes the 
                                              result to OUTFILE as O3 with units ppbV)
setenv SPECIES_2 "ASO4J_AVG,ug/m3,ASO4J,AVG" (computes the 24-hr average value of ASO4J from INFILE (assumed to be in ug/m3) and writes the result 
                                              to OUTFILE as ASO4J_AVG with units ug/m3)
setenv SPECIES_3 "ASO4J_MAX,ug/m3,ASO4J,MAX" (computes the daily maximum value of ASO4J from INFILE (assumed to be in ug/m3) and writes the result 
                                              to OUTFILE as ASO4J_MAX with units ug/m3)
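
Tying these settings together, the relevant portion of an hr2day run script might resemble the minimal csh sketch below; the file names, the tz.csv path, and the executable name are illustrative assumptions:

# hypothetical hr2day configuration
setenv USELOCAL Y
setenv USEDST   N
setenv TZFILE   tz.csv
setenv INFILE   COMBINE_ACONC_20110701.ncf
setenv OUTFILE  hr2day_daily_20110701.ncf
setenv SPECIES_1 "O3_8HRMAX,ppbV,1000*O3,8HRMAX"
setenv SPECIES_2 "ASO4J_AVG,ug/m3,ASO4J,AVG"
./hr2day.exe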


MERGE_AQS_SPECIES utility program

This program creates a merged AQS data file from pre-generated files posted on the EPA's AQS website (link below). The user must specify the location where the merged files should be created, the base location of the downloaded AQS files (the files are assumed to be in the YYYY/hourly and YYYY/daily sub-directories under that base directory), the year (YYYY), and whether daily or hourly files are being merged (the script must be run separately for each time average).

The formatted observation files generated from running this script have been provided in this release for 2001 - 2014. This utility is included to allow the user to generate formatted observation files for different years if needed.

This program requires the R script merge_aqs_species.R. The R code will work with the base installation of R (https://cran.r-project.org/) and does not require installation of any additional libraries.
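
As a purely illustrative sketch, the settings a user edits in the run script might resemble the csh fragment below; the variable names shown here are hypothetical and may not match those used in the distributed run.merge.aqs.species script:

# hypothetical settings; the actual variable names in the distributed script may differ
setenv YEAR     2011                  # year of AQS data to merge
setenv AQS_DIR  /data/obs/AQS         # base directory holding the YYYY/hourly and YYYY/daily sub-directories
setenv OUT_DIR  /data/obs/AQS/merged  # directory where the merged file is written
setenv TIME_AVG hourly                # run separately for hourly and daily merges
R CMD BATCH --no-save merge_aqs_species.R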

This utility also requires .csv files downloaded from the EPA's AQS website:

http://aqsdr1.epa.gov/aqsweb/aqstmp/airdata/download_files.html

The required files from that site are:

daily_88101_****.csv
daily_88502_****.csv
daily_81102_****.csv
daily_HAPS_****.csv
daily_SPEC_****.csv
daily_VOCS_****.csv
hourly_88101_****.csv
hourly_88502_****.csv
hourly_81102_****.csv
hourly_SPEC_****.csv
hourly_HAPS_****.csv
hourly_NONOxNOy_****.csv
hourly_44201_****.csv
hourly_42401_****.csv
hourly_42101_****.csv
hourly_42602_****.csv
hourly_WIND_****.csv
hourly_TEMP_****.csv
hourly_PRESS_****.csv
hourly_RH_DP_****.csv
hourly_VOCS_****.csv

By default, the species merged are

daily: "PM25","PM10","SO4","NO3","NH4","OC","EC","Na","Cl","Al","Ca","Fe","Si","Ti",
"Mg","K","Mn","Benzene","Propylene","Toluene","Butadiene","Acrolein","Ethylene",
"Acetaldehyde","Formaldehyde","Isoprene","Ethane" 

hourly:"PM25","PM10","O3","CO","SO2","NO","NO2","NOX","NOY","Pressure","RH",
"Temperature","Dewpoint","Wind_Speed","Wind_Direction","Benzene","Propylene",
"Toluene","Butadiene","Isoprene","Ethane","Ethylene","SO4","NO3","OC","EC"


To Run: Edit the sample run script (run.merge.aqs.species), then run:

run.merge.aqs.species |& tee run.merge.aqs.species.log

Check the log file to ensure complete and correct execution without errors.

SITECMP utility program

This program generates a csv (comma separated values) file that compares CMAQ generated concentrations with an observed dataset.

Environment Variables (required):

TABLE_TYPE  dataset type {IMPROVE, CASTNET, STN, NADP, MDN, SEARCH,
            DEARS, AIRMON, OUT_TABLE}
M3_FILE_n   IOAPI input files containing modeled species data (max of 12). 
            Supported map projections are Lambert conformal, polar stereographic, and lat/lon
SITE_FILE   input file containing site information for each monitor (site-id, longitude, latitude, 
            and optionally time zone offset between local time and GMT) (tab delimited)
IN_TABLE    input file with observed data (comma delimited with header)
OUT_TABLE   file for output data with columns of paired observed and modeled
            values

Environment Variables (not required):

PRECIP      defines the precipitation field used in WETDEP and
            WETCON calculations (default="Precip")
IOAPI_ISPH  projection sphere type (use type #20 to match WRF/CMAQ)
            (ioapi default is 8)
MISSING     string to indicate missing output data values
            (default="-999")
START_DATE  starting date of time period to process (YYYYJJJ)
START_TIME  starting time of time period to process (HHMMSS)
END_DATE    ending date of time period to process (YYYYJJJ)
END_TIME    ending time of time period to process (HHMMSS)
APPLY_DLS   apply daylight savings time (default N)
TIME_SHIFT  number of hours to add when retrieving time steps from M3_FILE_n files during processing. This should only be non-zero if the M3_FILE_n files
            were pre-processed with a utility like m3tshift (default 0)

Species Definitions: Defines the data columns for your output file. Each definition specifies the observed and modeled variables for one of the species you are analyzing. These definitions are specified by environment variables [species-type]_[1-50], where [species-type] is one of the following: {AERO, GAS, WETCON, WETDEP, PREC, CHAR}. See the sample run scripts for additional examples beyond those listed below.


format: [Obs_expression], [Obs_units], [Mod_expression], [Mod_unit], [Variable_name]

expression format: [factor1]*Obs_name1 [+][-] [factor2]*Obs_name2 ...

types: AERO_n  (AEROSOL Variables (1-50) - compute average over time)
       GAS_n   (GAS Variables (1-50)  - compute average over time)
       WETCON_n (Wet Concentration Variables (1-50) - compute
                 volume-weighted average)
       WETDEP_n (Wet Deposition Variables (1-50) - compute
                 accumulated wet deposition)
       PREC_n  (Precipitation Variables (1-50) - compute
                accumulated precipitation)
       CHAR_n  (Character fields (1-50), copies from Obs file)

examples:
       AERO_1="SO4f_val,ug/m3, ASO4T,,sulfate"
             (this defines an aerosol species where the observed values
               are obtained from the "SO4f_val" column setting its units
               to ug/m3, the modeled values are obtained from the "ASO4T"
               variable using its predefined units, both columns will be
               named "sulfate")

       PREC_1="Sub Ppt,mm,10*RT,mm,Precip"
               (this defines a precipitation species where the observed
               values are obtained from the "Sub Ppt" column setting its
               units to mm, the modeled values are obtained by multiplying
               10 times the "RT" variable and setting its units to mm,
               both columns will be named "Precip")

       AERO_2="NH4f_val,ug/m3,,,ammonium"
               (this defines an aerosol species where the observed values
               are obtained from the "NH4f_val" column setting its units
               to ug/m3, there is no modeled values column, the column
               will be named ammonium)

       CHAR_1="NH4f_flag"
               (this defines a character field to copy only from the observed field,
               no units or modeled species are used)
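
Putting the environment variables and species definitions together, a sitecmp configuration for IMPROVE data might resemble the minimal csh sketch below. The file names and the executable name are illustrative assumptions; the species definitions simply reuse the examples above:

# hypothetical sitecmp configuration for the IMPROVE network
setenv TABLE_TYPE IMPROVE
setenv M3_FILE_1  COMBINE_ACONC_201107.ncf
setenv SITE_FILE  IMPROVE_sites.txt
setenv IN_TABLE   IMPROVE_obs_2011.csv
setenv OUT_TABLE  IMPROVE_CMAQ_2011.csv
setenv AERO_1     "SO4f_val,ug/m3,ASO4T,,sulfate"
setenv AERO_2     "NH4f_val,ug/m3,,,ammonium"
setenv CHAR_1     "NH4f_flag"
./sitecmp.exe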

File formats:

SITE_FILE - tab delimited text file containing site-id, longitude,
            latitude, and optionally time zone offset between local time and GMT

M3_FILE_n - IOAPI file containing modeled species data (n=1->12)

IN_TABLE  - text (csv) file containing observed data values
          Each type of dataset requires a site field and fields that define
          the data record's time period. These are the required fields for
          each type.

IMPROVE - site field: "site_code"
          date field: "obs_date"  (YYYYMMDD)
          The time period is 24 hours (midnight to midnight)

NADP    - site field: "Site"
          starting date: "DateOn" (MM/DD/YYYY)
          ending date:   "DateOff" (MM/DD/YYYY)
          The time period is 9:00am to 8:59am

STN     - (Use with CSN data)
          site field: "airs_site_code"
          date field: "DATETIME"  (MM/DD/YYYY)
          The time period is 24 hours (9:00am to 8:59am)

MDN     - site field: "SITE"
          starting date: "START" (MM/DD/YYYY)
          ending date: "STOP"    (MM/DD/YYYY)
          The time period is 9:00am to 8:59am

CASTNET - site field: "Site_id"
          starting date: "DateOn" ("YYYY-MM-DD hh:mm:ss")
          ending date: "DateOff"  ("YYYY-MM-DD hh:mm:ss")

MET     - site field: "site_id"
          starting date: "date_time" ("YYYY-MM-DD hh:mm:ss")
          ending date: 59 minutes added to starting time


SEARCH  - site field: "Site_id"
          starting date: "DateOn" (MM/DD/YYYY hh:mm)
          ending date: "DateOff"  (MM/DD/YYYY hh:mm)

DEARS   - site field: "PID"
          starting date: "StartDate" (MM/DD/YY)
          The time period is 24 hours (9:00am to 8:59am)

AIRMON  - site field: "Site"
          starting date: "Date/Time On" (MM/DD/YYYY hh:mm)
          ending date: "Date/Time Off"  (MM/DD/YYYY hh:mm)


OUT_TABLE - output (csv) text file containing columns of paired observed and
            modeled values

To Run: Edit the sample run script (run.sitecmp*), then run:

run.sitecmp |& tee sitecmp.log

Check the log file to ensure complete and correct execution without errors.

Sample run scripts have been provided for matching model data to observations from the following networks: AERONET, AMON, AQS (hourly data), CASTNET (hourly and weekly data), CSN, IMPROVE, NADP, and SEARCH (hourly and daily data). The formatted observation data files needed for running the sitecmp utility for 2001 through 2014 are available from CMAS. The run scripts for the CSN and SEARCH networks change depending on the year being studied due to changes in what species were reported and what names were used in the original data files. The R code used to format all of the released observation data sets is also included in CMAQ_TOOLS; see the documentation for merge_aqs_species above.

Note that the run scripts rely on model output that has already been processed using the combine utility. The user should first run combine on ACONC and DEP output files to create the necessary COMBINE_ACONC and COMBINE_DEP files that contain the model species that can be matched to available observations. See the sample run scripts for combine using the CMAQv5.1 benchmark case.

SITECMP_DAILYO3 utility program

This program generates a csv (comma separated values) file that compares various daily ozone metrics computed from hourly CMAQ-generated and observed ozone concentrations. The metrics included in the output file are:

  • daily maximum 1-hr ozone concentration
  • daily maximum 1-hr ozone concentration in the nine cells surrounding a monitor
  • time of occurrence of the daily maximum 1-hr ozone concentration
  • daily maximum 8-hr ozone concentration
  • daily maximum 8-hr ozone concentration in the nine cells surrounding a monitor
  • time of occurrence of the daily maximum 8-hr ozone concentration
  • daily W126 ozone value
  • daily SUM06 ozone value


Environment Variables (required):

M3_FILE_n      IOAPI input file(s) containing hourly modeled ozone values (max of 10). Supported map projections are Lambert conformal, polar stereographic, and lat/lon
SITE_FILE      input file containing site information for each monitor (site-id, longitude, latitude, and optionally 
               time zone offset between local time and GMT) (tab delimited)
IN_TABLE       input file containing hourly observed ozone data (comma delimited with header). The file can contain 
               columns with species other than ozone; these will be ignored by sitecmp_dailyo3
OBS_SPECIES    name of the ozone species in the header line of IN_TABLE (default "O3" for AQS; use "OZONE" for CASTNET)
OZONE          comma separated string with expression and units for model ozone in M3_FILE_n ([Mod_expression], [Mod_unit])
               [Mod_expression] format: [factor1]*Mod_name1 [+][-] [factor2]*Mod_name2 ...
               [Mod_unit] is used in OUT_TABLE for the daily maximum 1-hr and 8-hr ozone metrics
               Example: setenv OZONE "1000*O3,ppbV"
OBS_FACTOR     conversion factor needed to convert OBS_SPECIES from IN_TABLE to [Mod_unit] specified in OZONE (default 1)
OUT_TABLE      file for output data with columns of paired observed and modeled daily ozone metrics

Environment Variables (not required):

START_DATE     starting date of time period to process (YYYYJJJ)
START_TIME     starting time of time period to process (HHMMSS)
END_DATE       ending date of time period to process (YYYYJJJ)
END_TIME       ending time of time period to process (HHMMSS)
PARTIAL_DAY    start and end hours for partial day calculations (HH,HH). Leave unset/blank for full day calculations (default is blank).
               Example: setenv PARTIAL_DAY "10,17" 
APPLY_DLS      apply daylight savings time (default N)
TIME_SHIFT     number of hours to add when retrieving time steps from M3_FILE_n files during processing. This should only be non-zero if the M3_FILE_n files
               were pre-processed with a utility like m3tshift (default 0)
QA_FLAG_CHECK  indicates whether IN_TABLE includes a QA flag for ozone values that should be used (default N because no QA flag is present in AQS data; set to Y for CASTNET)
QA_FLAG_HEADER if QA_FLAG_CHECK is Y, name of the ozone QA flag in the header line of IN_TABLE (default "OZONE_F" to correspond to CASTNET data)
QA_FLAG_VALUES if QA_FLAG_CHECK is Y, string composed of single-character QA flags that should be treated as missing values (default "BCDFIMP" to correspond to CASTNET data)
MISSING        string to indicate missing output data values (default "m")
IOAPI_ISPH     projection sphere type (use type #20 to match WRF/CMAQ)(IOAPI default 8)
LAMBXY         include x/y projection values for each site in OUT_TABLE (default N)
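
For illustration, a sitecmp_dailyo3 configuration for hourly AQS ozone might resemble the minimal csh sketch below; the file names and the executable name are illustrative assumptions:

# hypothetical sitecmp_dailyo3 configuration for AQS hourly ozone
setenv M3_FILE_1   COMBINE_ACONC_201107.ncf
setenv SITE_FILE   AQS_sites.txt
setenv IN_TABLE    AQS_hourly_2011.csv
setenv OBS_SPECIES O3
setenv OZONE       "1000*O3,ppbV"
setenv OBS_FACTOR  1
setenv OUT_TABLE   AQS_dailyO3_2011.csv
./sitecmp_dailyo3.exe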


File formats:

SITE_FILE - tab delimited text file containing site-id, longitude,
            latitude, and optionally time zone offset between local time and GMT

M3_FILE_n - IOAPI file containing hourly modeled ozone data (n=1->10)

IN_TABLE  - text (csv) file containing observed hourly ozone values in CASTNET table type format

CASTNET - site field: "Site_id"
          starting date: "DateOn" ("YYYY-MM-DD hh:mm:ss")
          ending date: "DateOff"  ("YYYY-MM-DD hh:mm:ss")

OUT_TABLE - output (csv) text file containing columns of paired observed and
            modeled values


To Run: Edit the sample run script (run.sitecmp_dailyo3), then run:

run.sitecmp_dailyo3 |& tee sitecmp_dailyo3.log

Check the log file to ensure complete and correct execution without errors.

WRITESITE utility program

This program generates a csv file from an IOAPI data file for a set of species at defined site locations.

Options:

  1. The program can shift hourly data to local standard time based on the default time zone file
  2. Output can be generated for all grid cells or for defined site locations
  3. A date range can be specified
  4. A grid layer can be specified


Environment variables:

INFILE         name of IOAPI input file. Supported map projections are Lambert conformal, polar stereographic, and lat/lon
SITE_FILE      name of input file containing sites to process (default is all cells)
DELIMITER      delimiter used in site file (default is <tab>)
USECOLROW      site file contains column/row values (default is N, meaning lon/lat values will be used)
TZFILE         location of time zone data file, tz.csv (this is a required input file)
OUTFILE        name of output file
LAYER          grid layer to output (default is 1)
USELOCAL       adjust to local standard time (default is N)
TIMESHIFT      shifts time of data (default is 0)
PRTHEAD        switch to output header records (default is Y)
PRT_XY         switch to output map projection coordinates (default is Y) 
STARTDATE      first date to process (default is starting date of input file)
ENDDATE        last date to process (default is ending date of input file)
SPECIES_#      list of species to output

Environment Variables (not required):

IOAPI_ISPH  projection sphere type (use type #20 to match WRF/CMAQ)
            (ioapi default is 8)
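
For illustration, a writesite configuration that extracts hourly O3 at selected sites might resemble the minimal csh sketch below; the file names and the executable name are illustrative assumptions:

# hypothetical writesite configuration
setenv INFILE    COMBINE_ACONC_201107.ncf
setenv SITE_FILE AQS_sites.txt
setenv TZFILE    tz.csv
setenv USELOCAL  Y
setenv LAYER     1
setenv SPECIES_1 O3
setenv OUTFILE   writesite_O3_201107.csv
./writesite.exe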

To Run: Edit the sample run script (run.writesite), then run:

run.writesite |& tee writesite.log

Check the log file to ensure complete and correct execution without errors.

Observational Data

Aerometric observations formatted for the CMAQv5.1 Tools and Utilities can be downloaded by year from the CMAS Center. The dataset includes the following networks: AERONET, AQS (daily and hourly), CASTNET (hourly and weekly), CSN, IMPROVE, NADP, NAPS (daily and hourly), SEARCH, AMoN, and MDN.

A README file documenting the source of the data is also provided.