CMAQ version 5.3release UNC10 Compiler Test

From CMASWIKI
Revision as of 12:53, 9 September 2019 by Lizadams (talk | contribs) (Created page with "= Timing Information for CMAQv5.3beta_UNC10 - September 2019 = The Community Multiscale Air Quality (CMAQ) modeling system Version 5.3 (CMAQv5.3) was released in Aug. 2019....")

Timing Information for CMAQv5.3beta_UNC10 - September 2019

The Community Multiscale Air Quality (CMAQ) modeling system Version 5.3 (CMAQv5.3) was released in Aug. 2019.

Release Testing

The CMAS Center tested the CMAQ release package with both the debug and the optimized compiler options and compared the results with the EPA's. All tests were conducted with the U.S. EPA SE52 12km domain July 11, 2011 test dataset. The code used in this testing was obtained from EPA's GitHub account as a pull request to my GitHub account: https://github.com/lizadams/CMAQ/pull/13

Fahim asked how many processors I was running on. I used 8 processors for the optimized version:

  @ NPCOL  =  2; @ NPROW =  4

I tried running the debug version on 8 processors, but it did not finish within 4 hours, so I am currently running the debug version on 16 processors:

  @ NPCOL  =  4; @ NPROW =  4
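The processor counts quoted above follow from the domain decomposition: the number of MPI processes is the product of NPCOL and NPROW. A minimal sketch of that arithmetic, where the function name `mpi_process_count` is hypothetical and used only for illustration:

```python
def mpi_process_count(npcol: int, nprow: int) -> int:
    """Return the number of MPI processes for an NPCOL x NPROW decomposition."""
    if npcol < 1 or nprow < 1:
        raise ValueError("NPCOL and NPROW must be positive integers")
    return npcol * nprow


# The two decompositions quoted above.
optimized = mpi_process_count(npcol=2, nprow=4)
debug = mpi_process_count(npcol=4, nprow=4)
print(optimized, debug)  # prints: 8 16
```

The result is the value that would be passed to `mpirun -np` for each run.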


I obtained Fahim's output from atmos and transferred the first day's output to:

  /proj/ie/proj/CMAS/CMAQ/from_EPA/output_CCTM_v53_gcc9.1_Bench_2016_12SE1
  /proj/ie/proj/CMAS/CMAQ/from_EPA/output_CCTM_v53_gcc9.1_Bench_2016_12SE1_debug

Two classes of tests:

  • Compiler tests used the default benchmark configuration with the GNU debug compiler.

Compiler flags:

  • GCC DEBUG: mpifort -c -ffixed-form -ffixed-line-length-132 -funroll-loops -finit-character=32 -Wall -O0 -g -fcheck=all -ffpe-trap=invalid,zero,overflow -fbacktrace -I /proj/ie/proj/CMAS/CMAQ/CMAQv5.3_branch_UNC8/openmpi_4.0.1_gcc_9.1.0/lib/x86_64/gcc/ioapi/lib -I /proj/ie/proj/CMAS/CMAQ/CMAQv5.3_branch_UNC8/openmpi_4.0.1_gcc_9.1.0/lib/x86_64/gcc/ioapi/include_files -I /proj/ie/proj/CMAS/CMAQ/CMAQv5.3_branch_UNC8/openmpi_4.0.1_gcc_9.1.0/lib/x86_64/gcc/mpi -I. -Dparallel -Dm3dry_opt -DSUBST_BARRIER=SE_BARRIER -DSUBST_GLOBAL_MAX=SE_GLOBAL_MAX -DSUBST_GLOBAL_MIN=SE_GLOBAL_MIN -DSUBST_GLOBAL_MIN_DATA=SE_GLOBAL_MIN_DATA -DSUBST_GLOBAL_TO_LOCAL_COORD=SE_GLOBAL_TO_LOCAL_COORD -DSUBST_GLOBAL_SUM=SE_GLOBAL_SUM -DSUBST_GLOBAL_LOGICAL=SE_GLOBAL_LOGICAL -DSUBST_LOOP_INDEX=SE_LOOP_INDEX -DSUBST_SUBGRID_INDEX=SE_SUBGRID_INDEX -DSUBST_HI_LO_BND_PE=SE_HI_LO_BND_PE -DSUBST_SUM_CHK=SE_SUM_CHK -DSUBST_INIT_ARRAY=SE_INIT_ARRAY -DSUBST_COMM=SE_COMM -DSUBST_MY_REGION=SE_MY_REGION -DSUBST_SLICE=SE_SLICE -DSUBST_GATHER=SE_GATHER -DSUBST_DATA_COPY=SE_DATA_COPY -DSUBST_IN_SYN=SE_IN_SYN -DSUBST_PE_COMM=\"./PE_COMM.EXT\" -DSUBST_CONST=\"./CONST.EXT\" -DSUBST_FILES_ID=\"./FILES_CTM.EXT\" -DSUBST_EMISPRM=\"./EMISPRM.EXT\" -DSUBST_MPI=\"/proj/ie/proj/CMAS/CMAQ/CMAQv5.3_branch_UNC8/openmpi_4.0.1_gcc_9.1.0/lib/x86_64/gcc/mpi/include/mpif.h\" subhfile.F
  • Link step:

mpifort *.o -L/proj/ie/proj/CMAS/CMAQ/CMAQv5.3_branch_UNC8/openmpi_4.0.1_gcc_9.1.0/lib/x86_64/gcc/ioapi/lib -lioapi -L/proj/ie/proj/CMAS/CMAQ/CMAQv5.3_branch_UNC8/openmpi_4.0.1_gcc_9.1.0/lib/x86_64/gcc/netcdf/lib -lnetcdff -lnetcdf -o CCTM_v53.exe

GCC O3 OPT:


Table 1. CMAQv5.3beta compilation testing manifest on UNC Dogwood Cluster

| Scenario | Compiler | netCDF | I/O API | MPI_YN (#P) | Module | Timing, in column order: (1 Lay, 12 Var) real/user/system; (35 Lay, 12 Var); (35 Lay, 215 Var) | Notes |
|---|---|---|---|---|---|---|---|
| Gfortran Single PE Debug | Gfortran version 9.1.0 | 4.0.1 | 3.2 | N | gcc_9.1.0 | (stopped at timestep 19, due to limit of 4 hours in the debug queue) | limit stacksize unlimited; time -p $BLD/$EXEC |
| Gfortran OpenMPI Debug | Gfortran version 9.1.0 | 4.0.1 | 3.2 | Y (32) | openmpi_4.0.1_gcc_9.1.0 | 1837.6 1884.24 | limit stacksize unlimited; time mpirun -np $NPROCS |
| Gfortran OpenMPI (EPA) Debug | Gfortran version 9.1.0 | 4.0.1 | 3.2 | Y (?) | openmpi_4.0.1_gcc_9.1.0 | | limit stacksize unlimited; time mpirun -np $NPROCS |
| Gfortran Openmpi UNC O3 OPT | Gfortran version 9.1.0 | 4.0.1 | 3.2 | Y (32) | openmpi_4.0.1_gcc_9.1.0 | 778 | limit stacksize unlimited; time mpirun -np $NPROCS |

QA plots are generated on longleaf using the interactive queue

R script for comparing model output results from different compilers

  • module load r/3.4.3
  • setenv R_LIBS /nas/longleaf/home/lizadams/R/x86_64-pc-linux-gnu-library/3.4
  • cd /proj/ie/proj/CMAS/CMAQ/CMAQv5.3_branch_UNC3/CMAQ_QA

If you are on the login node, your prompt will contain the login node id:

  • [lizadams@longleaf-login2 ~]

Start an interactive session (do not run on the login node). Your environment variables and R module will be set up in the new session.

  • srun -p interact --pty /bin/csh

Your prompt will then look something like:

  • [lizadams@off03 CMAQ_QA]$
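The prompt check described above can also be scripted. The sketch below assumes, based only on the two prompts shown here, that login nodes have "login" in their hostname; the function name `is_login_node` is hypothetical and the heuristic may not hold on other clusters.

```python
import socket


def is_login_node(hostname: str) -> bool:
    """Heuristic: treat any hostname containing 'login' as a login node."""
    return "login" in hostname.lower()


# Warn before starting a heavy job on the current machine.
if is_login_node(socket.gethostname()):
    print("On a login node: start an interactive session before running R.")
else:
    print("Not on a login node.")
```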

Run the R script that compares model output

  • R CMD BATCH plot_max_diff_compiler_sens_v5.3beta.R

Plots of the difference in values between the base (EPA debug) and sensitivity (UNC debug) cases at the column, row, and layer of the maximum difference between the variables. See: Compilation Testing Plots

Plots of the difference in values between the UNC multiprocessor debug case and the single PE case at the column, row, and layer of the maximum difference between the variables. Note these runs were done for the default output configuration of 1 layer and 12 variables. See: Multi-processor to Single PE Comparison using Debug compile

Plots of the difference in values between the UNC multiprocessor debug and OPT cases versus the EPA debug case at the column, row, and layer of the maximum difference between the variables. Note these runs were done for all variables, all layers. See: Multi-processor to Single PE Comparison using Debug compile (https://dataviewer-dept-cempd.cloudapps.unc.edu/compare.cfm?back_address1=/CMAQv5.3_branch_UNC8_QA/base_gcc9.1_Bench_2016_12SE1_opt_vs_debug/layer1_only&back_address2=/CMAQv5.3_branch_UNC8_QA/base_gcc9.1_Bench_2016_12SE1_opt_vs_debug/all_layers)
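Each comparison above is made at the column, row, and layer where two runs differ the most. The following is a rough illustration of how that location could be found for one variable at one time step. It is not taken from the R script; the function name `max_abs_diff_location` and the nested-list layout `[layer][row][col]` are assumptions made for this sketch.

```python
from typing import List, Tuple

Grid = List[List[List[float]]]  # indexed as [layer][row][col]


def max_abs_diff_location(base: Grid, sens: Grid) -> Tuple[float, int, int, int]:
    """Return (max |sens - base|, col, row, layer) using 1-based indices.

    If several cells tie for the maximum, the first one encountered is returned.
    """
    best = (-1.0, 0, 0, 0)
    for k, (b_layer, s_layer) in enumerate(zip(base, sens), start=1):
        for r, (b_row, s_row) in enumerate(zip(b_layer, s_layer), start=1):
            for c, (b, s) in enumerate(zip(b_row, s_row), start=1):
                d = abs(s - b)
                if d > best[0]:
                    best = (d, c, r, k)
    if best[0] < 0:
        raise ValueError("grids are empty")
    return best


# Toy data: 2 layers x 2 rows x 3 columns, with one cell changed.
base = [[[0.0] * 3 for _ in range(2)] for _ in range(2)]
sens = [[row[:] for row in layer] for layer in base]
sens[1][0][2] = 0.5  # layer 2, row 1, column 3
print(max_abs_diff_location(base, sens))  # prints: (0.5, 3, 1, 2)
```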

I found a compiler option that adds information to an executable so that a user can see which compile options and libraries were used to build it.

The idea is to have users add the -frecord-gcc-switches option to the compile flags for the GNU compiler. I tried it on the CCTM executable and inspected the result with:

  readelf -p .GCC.command.line CCTM_v53.exe

It may prove useful when we are trying to help people understand how their executable was compiled. The output looks like this:



```
String dump of section '.GCC.command.line':

 [     0]  se_bndy_copy_info_ext.f
 [    18]  -ffixed-form
 [    25]  -I /proj/ie/proj/CMAS/CMAQ/CMAQv5.3_branch_UNC8/openmpi_4.0.1_gcc_9.1.0/lib/x86_64/gcc/ioapi/lib
 [    86]  -I /proj/ie/proj/CMAS/CMAQ/CMAQv5.3_branch_UNC8/openmpi_4.0.1_gcc_9.1.0/lib/x86_64/gcc/ioapi/include_files
 [    f1]  -I /proj/ie/proj/CMAS/CMAQ/CMAQv5.3_branch_UNC8/openmpi_4.0.1_gcc_9.1.0/lib/x86_64/gcc/mpi
 [   14c]  -I .
 [   151]  -I /nas/longleaf/apps-dogwood/mpi/gcc_9.1.0/openmpi_4.0.1/include
 [   193]  -I /nas/longleaf/apps-dogwood/mpi/gcc_9.1.0/openmpi_4.0.1/lib
 [   1d1]  -mtune=generic
 [   1e0]  -march=x86-64
 [   1ee]  -g
 [   1f1]  -O0
 [   1f5]  -ffixed-line-length-132
 [   20d]  -funroll-loops
 [   21c]  -finit-character=32
 [   230]  -frecord-gcc-switches
 [   246]  -fintrinsic-modules-path /nas/longleaf/apps/gcc/9.1.0/lib/gcc/x86_64-pc-linux-gnu/9.1.0/finclude

```
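When helping a user, it may be convenient to reduce a string dump like the one above to a plain list of switches. The sketch below parses text in the format shown; the function names `parse_string_dump` and `optimization_flags` are hypothetical, and the sample is a shortened copy of the output above.

```python
import re
from typing import List

# Matches lines such as " [   1f1]  -O0": a hex offset in brackets, then the entry.
_ENTRY = re.compile(r"^\s*\[\s*([0-9a-fA-F]+)\]\s+(.*\S)\s*$")


def parse_string_dump(text: str) -> List[str]:
    """Return the entries of a `readelf -p` string dump, in order."""
    entries = []
    for line in text.splitlines():
        match = _ENTRY.match(line)
        if match:
            entries.append(match.group(2))
    return entries


def optimization_flags(entries: List[str]) -> List[str]:
    """Return the -O<level> switches found among the entries."""
    return [e for e in entries if re.fullmatch(r"-O\w*", e)]


sample = """String dump of section '.GCC.command.line':

 [     0]  se_bndy_copy_info_ext.f
 [    18]  -ffixed-form
 [   1ee]  -g
 [   1f1]  -O0
 [   230]  -frecord-gcc-switches
"""

flags = parse_string_dump(sample)
print(optimization_flags(flags))  # prints: ['-O0']
```

On a real executable the text could be captured with `subprocess.run(["readelf", "-p", ".GCC.command.line", "CCTM_v53.exe"], capture_output=True, text=True).stdout`, assuming readelf is on the PATH and the executable was built with -frecord-gcc-switches.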



I found these instructions at the following link:

https://stackoverflow.com/questions/12112338/get-the-compiler-options-from-a-compiled-executable