Merge pull request #211 from allibco/alli-pycect

moved statistical ensemble test to cesm. added pyCECT to externals

Description of changes
Moved the statistical_ensemble_test files to cesm (created a new tools directory). Also added pyCECT to Externals.cfg (I made it not required). I'll do a separate PR for CIME to remove pyCECT from there.

Specific notes

Contributors other than yourself, if any: none

Fixes: ESMCI/cime#4262

User interface changes?: No

Testing performed (automated tests and/or manual tests):
ESCOMP · Jun 30, 2022 · 5aac0eb
2 parents f3db6b1 + e1b8ca4

Showing 6 changed files with 845 additions and 0 deletions.
diff --git a/Externals.cfg b/Externals.cfg
@@ -125,6 +125,13 @@ local_path = components/pop
 externals = Externals_POP.cfg
 required = True
 
+[pycect]
+tag = 3.2.2
+protocol = git
+repo_url = https://github.com/NCAR/PyCECT
+local_path = tools/statistical_ensemble_test/pyCECT
+required = False
+
 [rtm]
 tag = rtm1_0_78 
 protocol = git
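
The `[pycect]` stanza added above is plain INI syntax, so its fields can be inspected with Python's standard `configparser`. The sketch below parses a copy of the stanza from a string (it makes no assumption about where Externals.cfg lives on disk) and shows that `required = False` reads back as a boolean. The helper name `read_external` is hypothetical, and this is only an illustration, not necessarily how the externals tooling itself processes the file.

```python
import configparser

# Copy of the stanza added by this commit.
STANZA = """
[pycect]
tag = 3.2.2
protocol = git
repo_url = https://github.com/NCAR/PyCECT
local_path = tools/statistical_ensemble_test/pyCECT
required = False
"""


def read_external(text, name):
    """Return a few fields of one externals section (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser[name]
    return {
        "tag": section["tag"],
        "local_path": section["local_path"],
        # getboolean() turns the text "False" into the boolean False
        "required": section.getboolean("required"),
    }


pycect = read_external(STANZA, "pycect")
print(pycect)
```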

diff --git a/tools/statistical_ensemble_test/README b/tools/statistical_ensemble_test/README
@@ -0,0 +1,116 @@
+------------------------------------------
+CESM-ECT (CESM Ensemble Consistency Test):
+------------------------------------------
+
+CESM-ECT is a suite of tests to determine whether a new
+simulation setup (new machine, compiler, etc.) is statistically
+distinguishable from an accepted ensemble.  The verification tools in
+the CESM-ECT suite are:
+
+CAM-ECT - detects issues in CAM and CLM (12 month runs)
+UF-CAM-ECT - detects issues in CAM and CLM (9 time step runs)
+POP-ECT - detects issues in POP and CICE (12 month runs)
+
+The ECT process involves comparing runs generated with
+the new scenario (3 for CAM-ECT and UF-CAM-ECT, and 1 for POP-ECT)
+to an ensemble built on a trusted machine (currently
+cheyenne). The python ECT tools are located in the pyCECT
+subdirectory and at https://github.com/NCAR/PyCECT/releases.
+
+-OR-
+
+We now provide a web server for CAM-ECT and UF-CAM-ECT,  where
+you can upload the (3) generated runs for comparison to our ensemble.
+Please see the webpage at http://www.cesm.ucar.edu/models/cesm2/verification/
+for further instructions.
+
+-----------------------------------
+Creating or obtaining a summary file:
+-----------------------------------
+
+Before the test can be run, a summary file is needed of the ensemble
+runs to which the comparison will be made. Ensemble summary files
+(NetCDF) for existing tags for CAM-ECT, UF-CAM-ECT, and POP-ECT that
+were created by CSEG are located (respectively) in the CESM input data
+directories:
+
+$CESMDATAROOT/inputdata/validation/ensembles
+$CESMDATAROOT/inputdata/validation/uf_ensembles
+$CESMDATAROOT/inputdata/validation/pop_ensembles
+
+If none of our ensembles are suitable for your needs, then you may create
+your own ensemble (and summary file) using the following instructions:
+
+(1) To create a new ensemble, use the ensemble.py script in this directory.
+This script creates and compiles a case, then creates clones of the
+original case, where the initial temperature perturbation is slightly modified
+for each ensemble member.  At this time, cime includes functionality
+to create ensembles for CAM-ECT, UF-CAM-ECT, and POP-ECT.
+
+(2) Use --ect <pop,cam> to specify whether the ensemble is for POP or CAM.
+(See 'python ensemble.py -h' for additional details).
+
+(3) Use --ensemble <size> to specify the ensemble size.
+Recommended ensemble sizes:
+CAM-ECT: 151
+UF-CAM-ECT: 350
+POP-ECT: 40
+
+(4) Examples:
+
+CAM-ECT:
+
+python ensemble.py --case /glade/scratch/cesm_user/cesm_tag/ensemble/ensemble.cesm_tag.000 --mach cheyenne   --ensemble 151 --ect cam --project P99999999
+
+
+UF-CAM-ECT:
+
+python ensemble.py --case /glade/scratch/cesm_user/cesm_tag/uf_ensemble/ensemble.cesm_tag.uf.000 --mach cheyenne  --ensemble 350 --uf --ect cam --project P99999999
+
+POP-ECT:
+
+python ensemble.py --case /glade/scratch/cesm_user/cesm_tag/uf_ensemble/ensemble.cesm_tag.000 --mach cheyenne  --ensemble 40 --ect pop --project P99999999
+
+Notes:
+       (a) ensemble.py accepts (most of) the arguments of create_newcase
+
+       (b) case name must end in ".000" and include the full path
+
+       (c) ensemble size must be specified, and suggested defaults are listed
+           above. Note that for CAM-ECT and UF-CAM-ECT, the ensemble size
+           needs to be larger than the number of variables that ECT will evaluate.
+
+
+(5) Once all ensemble simulations have run successfully, copy every cam history
+file (*.cam.h0.*) (for CAM-ECT and UF-CAM-ECT) or monthly pop history file
+(*.pop.h.*) (for POP-ECT) from each ensemble run directory into a separate directory.
+Next, create the ensemble summary using the pyCECT tool pyEnsSum.py (for CAM-ECT and
+UF-CAM-ECT) or pyEnsSumPop.py (for POP-ECT).  For details, see README_pyEnsSum.rst
+and README_pyEnsSumPop.rst with the pyCECT tools.
+
+-------------------
+Creating test runs:
+-------------------
+
+(1) Once an ensemble summary file has been created, or an existing one
+chosen from $CESMDATAROOT/inputdata/validation, create the simulation
+run(s) to be verified by ECT with the script ensemble.py.
+
+NOTE: It is important that the **same** resolution and compset be used in the
+individual runs as in the ensemble.  The NetCDF ensemble summary file global
+attributes give this information.
+
+(2) For example, for CAM-ECT:
+
+python ensemble.py --case /glade/scratch/cesm_user/cesm_tag/camcase.cesm_tag.000 --ect cam --mach cheyenne --project P99999999 --compset F2000climo --res f19_f19_mg17
+
+For example, for UF-CAM-ECT:
+
+python ensemble.py --case /glade/scratch/cesm_user/cesm_tag/uf.camcase.cesm_tag.000 --ect cam --uf --mach cheyenne --project P99999999 --compset   F2000climo --res f19_f19_mg17
+
+For example, for POP-ECT:
+
+python ensemble.py --case /glade/scratch/cesm_user/cesm_tag/popcase.cesm_tag.000 --ect pop --mach cheyenne  --project P99999999 --compset   G --res T62_g17
+
+(3) Next verify the new simulation(s) with the pyCECT tool pyCECT.py (see
+README_pyCECT.rst with the pyCECT tools).
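
Step (5) of the ensemble-creation instructions in the README above (gathering the history files from every ensemble run directory into one directory before running pyEnsSum.py or pyEnsSumPop.py) can be sketched in Python as follows. The function name `collect_history_files` and its arguments are hypothetical; the README does not prescribe a tool for this step. The sketch also assumes that history file names are unique across ensemble members, so that copies do not overwrite each other.

```python
import glob
import os
import shutil


def collect_history_files(run_dirs, pattern, dest_dir):
    """Copy every file matching `pattern` from each run directory into dest_dir.

    Assumption: file names differ between ensemble members, so nothing
    is overwritten in dest_dir.
    """
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for run_dir in run_dirs:
        for path in sorted(glob.glob(os.path.join(run_dir, pattern))):
            shutil.copy2(path, dest_dir)
            copied.append(os.path.basename(path))
    return copied
```

For CAM-ECT and UF-CAM-ECT the pattern would be `"*.cam.h0.*"`, and for POP-ECT `"*.pop.h.*"`, as given in the README.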
diff --git a/tools/statistical_ensemble_test/addmetadata.sh b/tools/statistical_ensemble_test/addmetadata.sh
@@ -0,0 +1,64 @@
+#!/usr/bin/env bash
+#
+# Adds metadata to netcdf statistical ensemble test files.
+#
+Args=("$@")
+i=0
+while [ $i -lt ${#Args[@]} ]; do
+  case ${Args[$i]} in
+    --caseroot )
+      i=$((i+1))
+      caseroot=${Args[$i]}
+      if [ ! -d "${caseroot}" ]; then
+        echo "ERROR: caseroot not found: $caseroot"
+        exit 2
+      fi
+      if [ ! -f "${caseroot}/xmlquery" ]; then
+        echo "ERROR: Directory $caseroot does not appear to be a cesm case directory"
+        exit 3
+      fi
+    ;;
+    --histfile )
+      i=$((i+1))
+      histfile=${Args[$i]}
+      if [ ! -f "${histfile}" ]; then
+        echo "ERROR: file not found $histfile"
+        exit 4
+      fi
+    ;;
+    --help )
+      echo "usage: addmetadata.sh --caseroot CASEROOT --histfile HISTFILE [--help]
+      "
+      echo "Script to add metadata to validation files.
+      "
+      echo "Arguments:
+     --help           show this help message and exit
+     --caseroot       Full pathname to the CASE directory.
+     --histfile       Full filename of the history file to add the metadata to."
+      exit
+    ;;
+  esac
+  i=$((i+1))
+done
+
+if [ "$caseroot" = "" ] || [ "$histfile" = "" ]; then
+  echo "Please run ./addmetadata.sh --help for correct usage."
+  exit 1
+fi
+
+cd "$caseroot" || exit 1
+
+stop_option=$(./xmlquery --value STOP_OPTION)
+test_type="UF-ECT"
+if [ "$stop_option" = "nmonths" ]; then
+    test_type="ECT"
+elif [ "$stop_option" = "nyears" ]; then
+    test_type="POP-ECT"
+fi
+
+if hash ncks 2>/dev/null; then
+  ncks --glb compset="$(./xmlquery --value COMPSET)" --glb grid="$(./xmlquery --value GRID)" --glb testtype="$test_type" --glb compiler="$(./xmlquery --value COMPILER)" --glb machineid="$(./xmlquery --value MACH)" --glb model_version="$(./xmlquery --value MODEL_VERSION)" "$histfile" "$histfile.tmp"
+  mv "$histfile.tmp" "$histfile"
+else
+  echo "This script requires the ncks tool"; exit 1
+fi
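
The test-type selection in addmetadata.sh above depends only on the case's STOP_OPTION value. Restated as a small Python function, purely to make the mapping explicit (the function name `ect_test_type` is hypothetical and is not part of this commit):

```python
def ect_test_type(stop_option):
    """Mirror the STOP_OPTION -> testtype mapping used in addmetadata.sh."""
    if stop_option == "nmonths":
        return "ECT"
    if stop_option == "nyears":
        return "POP-ECT"
    # Any other stop option falls through to the script's default.
    return "UF-ECT"


for option in ("nmonths", "nyears", "nsteps"):
    print(option, "->", ect_test_type(option))
```

Note that, as in the shell script, "UF-ECT" is the fallback for every value other than `nmonths` and `nyears`; `nsteps` is used here only as an example of such a value.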
diff --git a/tools/statistical_ensemble_test/ensemble.py b/tools/statistical_ensemble_test/ensemble.py
@@ -0,0 +1,200 @@
+#!/usr/bin/python
+from __future__ import print_function
+import os, sys
+import random
+from single_run import process_args_dict, single_case
+
+# ==============================================================================
+# set up and submit 12-month (original) or 9-time step (uf) run.  then create
+# clones for a complete ensemble or a set of (3) test cases
+# ==============================================================================
+
+# pick <num_pick> distinct random integers from [0, end-1]
+# (random.sample never returns duplicates)
+def random_pick(num_pick, end):
+    ar = range(0, end)
+    rand_list = random.sample(ar, num_pick)
+    # for i in rand_list:
+    #    print(i)
+    return rand_list
+
+
+# get the pertlim corresponding to the random int
+def get_pertlim_uf(rand_num):
+    i = rand_num
+    if i == 0:
+        ptlim = "0"  # a string, like the other branches, so callers can concatenate it
+    else:
+        j = 2 * int((i - 1) / 100) + 101
+        k = (i - 1) % 100
+        if i % 2 != 0:
+            ll = j + int(k / 2) * 18
+            ippt = str(ll).zfill(3)
+            ptlim = "0." + ippt + "d-13"
+        else:
+            ll = j + int((k - 1) / 2) * 18
+            ippt = str(ll).zfill(3)
+            ptlim = "-0." + ippt + "d-13"
+    return ptlim
+
+
+def main(argv):
+
+    caller = "ensemble.py"
+
+    # directory with single_run.py and ensemble.py
+    stat_dir = os.path.dirname(os.path.realpath(__file__))
+    print("STATUS: stat_dir = " + stat_dir)
+
+    opts_dict, case_flags = process_args_dict(caller, argv)
+
+    # default is verification mode (3 runs)
+    run_type = "verify"
+    if opts_dict["ect"] == "pop":
+        clone_count = 0
+    else:
+        clone_count = 2
+
+    uf = opts_dict["uf"]
+
+    # check for run_type change (i.e., if doing ensemble instead of verify)
+    ens_size = opts_dict["ensemble"]
+    if ens_size > 0:
+        run_type = "ensemble"
+        clone_count = ens_size - 1
+        if ens_size > 999:
+            print("Error: cannot have an ensemble size greater than 999.")
+            sys.exit(1)
+        print("STATUS: ensemble size = " + str(ens_size))
+
+    # generate random pertlim(s) for verify
+    if run_type == "verify":
+        if opts_dict["ect"] == "pop":
+            rand_ints = random_pick(1, 40)
+        else:  # cam
+            if uf:
+                end_range = 350
+            else:
+                end_range = 150
+            rand_ints = random_pick(3, end_range)
+
+    # now create cases
+    print("STATUS: creating first case ...")
+
+    # create first case - then clone
+    if run_type == "verify":
+        opts_dict["pertlim"] = get_pertlim_uf(rand_ints[0])
+    else:  # full ensemble
+        opts_dict["pertlim"] = "0"
+
+    # first case
+    single_case(opts_dict, case_flags, stat_dir)
+
+    # clone?
+    if clone_count > 0:
+
+        # now clone
+        print("STATUS: cloning additional cases ...")
+
+        # scripts dir
+        print("STATUS: stat_dir = " + stat_dir)
+        os.chdir(stat_dir)
+        os.chdir("../../cime/scripts")
+        scripts_dir = os.getcwd()
+        print("STATUS: scripts dir = " + scripts_dir)
+
+        # we know case name ends in '.000' (already checked)
+        clone_case = opts_dict["case"]
+        case_pfx = clone_case[:-4]
+
+        for i in range(1, clone_count + 1):  # 1: clone_count
+            if run_type == "verify":
+                this_pertlim = get_pertlim_uf(rand_ints[i])
+            else:  # full ensemble
+                this_pertlim = get_pertlim_uf(i)
+
+            iens = "{0:03d}".format(i)
+            new_case = case_pfx + "." + iens
+
+            os.chdir(scripts_dir)
+            print("STATUS: creating new cloned case: " + new_case)
+
+            clone_args = " --keepexe --case " + new_case + " --clone " + clone_case
+            print("        with args: " + clone_args)
+
+            command = scripts_dir + "/create_clone" + clone_args
+            ret = os.system(command)
+
+            print("STATUS: running setup for new cloned case: " + new_case)
+            os.chdir(new_case)
+            command = "./case.setup"
+            ret = os.system(command)
+
+            # adjust perturbation
+            if opts_dict["ect"] == "pop":
+                if run_type == "verify":  # remove old init_ts_perturb
+                    # the with-block closes the file even if a write fails
+                    with open("user_nl_pop", "r+") as f:
+                        all_lines = f.readlines()
+                        f.seek(0)
+                        for line in all_lines:
+                            if "init_ts_perturb" not in line:
+                                f.write(line)
+                        f.truncate()
+                    text = "init_ts_perturb = " + this_pertlim
+                else:
+                    text = "\ninit_ts_perturb = " + this_pertlim
+
+                # now append new pertlim
+                with open("user_nl_pop", "a") as f:
+                    f.write(text)
+
+            else:
+                if run_type == "verify":  # remove old pertlim first
+                    # the with-block closes the file even if a write fails
+                    with open("user_nl_cam", "r+") as f:
+                        all_lines = f.readlines()
+                        f.seek(0)
+                        for line in all_lines:
+                            if "pertlim" not in line:
+                                f.write(line)
+                        f.truncate()
+                    text = "pertlim = " + this_pertlim
+                else:
+                    text = "\npertlim = " + this_pertlim
+
+                # now append new pertlim
+                with open("user_nl_cam", "a") as f:
+                    f.write(text)
+
+            # preview namelists
+            command = "./preview_namelists"
+            ret = os.system(command)
+
+            # submit?
+            if opts_dict["ns"] == False:
+                command = "./case.submit"
+                ret = os.system(command)
+
+    # Final output
+    if run_type == "verify":
+        if opts_dict["ect"] == "pop":
+            print("STATUS: ---POP-ECT VERIFICATION CASE COMPLETE---")
+            print("Set up one case using the following init_ts_perturb value:")
+            print(get_pertlim_uf(rand_ints[0]))
+        else:
+            print("STATUS: ---CAM-ECT VERIFICATION CASES COMPLETE---")
+            print("Set up three cases using the following pertlim values:")
+            print(
+                get_pertlim_uf(rand_ints[0])
+                + "   "
+                + get_pertlim_uf(rand_ints[1])
+                + "   "
+                + get_pertlim_uf(rand_ints[2])
+            )
+    else:
+        print("STATUS: ---ENSEMBLE CASES COMPLETE---")
+
+
+if __name__ == "__main__":
+    main(sys.argv[1:])
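
To make the perturbation scheme in `get_pertlim_uf` above easier to follow, here is a standalone restatement of its arithmetic. The name `pertlim_for_member` is hypothetical, and the sketch returns the string "0" for member 0 so that every result has the same type. Odd member numbers get a positive perturbation and the following even number gets the same magnitude with a negative sign; the magnitude advances in steps of 18 within each block of 100 members, and each new block starts 2 higher (101, 103, ...).

```python
def pertlim_for_member(i):
    """Restate the arithmetic of get_pertlim_uf for ensemble member i."""
    if i == 0:
        return "0"  # member 0 is the unperturbed case
    base = 2 * ((i - 1) // 100) + 101  # 101 for members 1-100, 103 for 101-200, ...
    k = (i - 1) % 100                  # position inside the block of 100
    if i % 2 != 0:
        sign, step = "", k // 2        # odd member: positive perturbation
    else:
        sign, step = "-", (k - 1) // 2  # even member: same magnitude, negative
    return sign + "0." + str(base + 18 * step).zfill(3) + "d-13"


for member in (0, 1, 2, 3, 4, 100, 101):
    print(member, pertlim_for_member(member))
```

For example, members 1 and 2 map to `0.101d-13` and `-0.101d-13`, and member 3 maps to `0.119d-13`. Over the largest recommended ensemble size in the README (350 members plus the unperturbed member 0), all values produced by this arithmetic are distinct.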