O3.7.3 Overcome precompiling every (julia) ensemble member on HPC #331

Closed
1 task done
odunbar opened this issue Oct 3, 2023 · 2 comments

odunbar commented Oct 3, 2023

We would like to recommend packages that make it more convenient to run ensembles on HPC clusters. In particular, we are interested in anything that avoids compiling on every node for every member when all members run the same source code at different parameter values. Such tools exist in a few forms, e.g., PrecompileTools.jl or PackageCompiler.jl.

Tasks

odunbar commented Jan 31, 2024

With @nefrathenrici

Rough benchmarks from a calibration of ClimaAtmos.jl

We added a precompilation workload using PrecompileTools.jl. This effectively requires building the simulator and running one step of it; the typed methods called during that run are then precompiled.
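
As a rough illustration of what such a workload can look like, here is a minimal sketch. Only the `@setup_workload` and `@compile_workload` macros come from PrecompileTools.jl; the module `ToySimulator` and the names `ToyConfig`, `ToyState`, `build`, and `step!` are hypothetical stand-ins and are not taken from ClimaAtmos.jl or CalibrateAtmos.jl.

```julia
# Hypothetical toy package; all names except the two PrecompileTools macros
# are illustrative assumptions, not the actual ClimaAtmos.jl interface.
module ToySimulator

using PrecompileTools: @setup_workload, @compile_workload

struct ToyConfig
    n::Int        # number of state entries
    dt::Float64   # time step
end

mutable struct ToyState
    u::Vector{Float64}
    t::Float64
end

# "Build" the simulator from a configuration.
build(cfg::ToyConfig) = ToyState(zeros(cfg.n), 0.0)

# Advance the state by one step, relaxing it towards `forcing`.
function step!(state::ToyState, cfg::ToyConfig, forcing::Float64)
    state.u .+= cfg.dt .* (forcing .- state.u)
    state.t += cfg.dt
    return state
end

@setup_workload begin
    # Setup code runs while the package is being precompiled, but the
    # caching targets the calls inside @compile_workload.
    cfg = ToyConfig(8, 0.1)
    @compile_workload begin
        # Build the simulator and take one step, so that the methods
        # reached with these concrete types are compiled and cached.
        state = build(cfg)
        step!(state, cfg, 1.0)
    end
end

end # module
```

The workload only takes effect when the module lives in a package that is precompiled, with PrecompileTools.jl listed as a dependency. Only the call paths exercised inside `@compile_workload` are covered, which is why the short run has to resemble the real simulation closely enough to reach the same methods and types.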

Old timings (roughly):
login-node: enter here
1. node 1: Instantiate the project, precompile all packages (takes ~7 min)
2. node 1: Create (or update) the ensemble (compile 10 s, run takes ~1 s)
3. node 2+(0, ..., 99): run ensemble members, update ensemble (compile 1 min, run takes 1 hr)
Repeat steps 2 and 3 until converged

New timings (roughly):
login-node: enter here
1. node 1: Instantiate the project, precompile all packages (takes ~10-15 min)
2. node 1: Create initial ensemble (compile 10 s, run takes ~1 s)
3. node 2(1) - 2(100): run ensemble members, update ensemble (compile 15-30 s, run takes 1 hr)
Repeat steps 2 and 3 until converged

Tentative conclusions

  1. For reasonable performance, applying PrecompileTools.jl may not be a black-box operation: one must configure and run a short instance of the code to reach the methods and types that need to be precompiled.
  2. We generally recommend this procedure, though the benefit appears only after a moderate number of iterations, which may vary between experiments.
  3. We particularly recommend it when configuring calibration algorithm parameters (e.g. trying different priors, inflations, (EKI-)timesteppers, etc.) that affect only the ensemble values and not the simulator.
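
To put a rough number on "a moderate number of iterations", the following sketch computes a break-even count from the approximate timings quoted above. It assumes that the per-member compile time in step 3 sits on the critical path once per iteration (members run in parallel), and that the other costs (the 10 s compile in step 2 and the 1 hr run) are unchanged. The helper `breakeven` is introduced here purely for illustration.

```julia
# Rough timings in seconds, taken from the old/new lists above.
old_setup   = 7 * 60              # one-off precompile, old workflow
new_setup   = (10 * 60, 15 * 60)  # one-off precompile, new workflow (low, high)
old_compile = 60                  # per-iteration compile in step 3, old workflow
new_compile = (15, 30)            # per-iteration compile in step 3, new (low, high)

# Number of iterations needed for the per-iteration saving to recoup
# the extra one-off setup cost (illustrative helper).
breakeven(extra_setup, saving_per_iter) = ceil(Int, extra_setup / saving_per_iter)

best  = breakeven(new_setup[1] - old_setup, old_compile - new_compile[1])
worst = breakeven(new_setup[2] - old_setup, old_compile - new_compile[2])

println("break-even after roughly ", best, " to ", worst, " iterations")
```

Under these assumptions the extra setup cost is recovered after roughly 4 to 16 iterations. Counting compile time summed over all 100 members instead of wall-clock time would shift the break-even point much earlier.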

odunbar commented Apr 17, 2024

Nat's implementation and documentation in CalibrateAtmos.jl address this.

odunbar closed this as completed Apr 17, 2024