Problems with ert=2.21.0 when running large ensembles #367

edubarrosTNO · 2021-03-20T12:52:43Z

I was running a couple of tests last Friday (for PRs #217 and #365) and had difficulty completing runs of the FlowNet experiments in the Norne example (with 500 realizations and 10 ES-MDA iterations). At first I thought the failures were related to problems in the PRs, but then I re-ran the same case with the new release on PyPI (flownet==0.5.2) and it failed again: the ERT process stopped printing to the screen at iter-0 and timed out after some more iterations / simulations had run in the background. I then investigated whether this is an issue with ERT and noticed that all these FlowNet branches and release versions have ert==2.21.0 in common, while I know that I can run the same case with the previous version of FlowNet, which uses ert==2.20.1. As a final test, I tried running the same case locally using the CI config and everything ran successfully (for 2 realizations and 2 ES-MDA iterations), including proper logging / printing of info to the screen after the completion of iter-0.

So my hypothesis is that this new release of ERT might be behaving strangely for larger ensembles (500 or more realizations). Can anyone else test this to confirm the behavior?

wouterjdb · 2021-03-20T14:06:32Z

Have you manually installed the previous release of ert and run the same simulation?

edubarrosTNO · 2021-03-20T18:30:36Z

> Have you manually installed the previous release of ert and run the same simulation?

Yes, I did that using the latest release, flownet==0.5.2, with ert==2.20.1 installed manually, and then the same FlowNet experiment ran. We should check why ert==2.21.0 is not behaving properly and report it.
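To make it easier for others to confirm which ert release their environment actually has, here is a minimal sketch of a version check. It assumes the distribution is installed under the name `ert`, and it only knows about the one release reported as problematic in this thread; the helper names (`is_suspect`, `installed_ert_version`) are made up for illustration and are not part of ERT or FlowNet.

```python
from importlib import metadata

# Only the release reported in this thread; anything else is left unclassified.
SUSPECT_ERT_VERSIONS = {"2.21.0"}


def is_suspect(version: str) -> bool:
    """Return True if the version string is the release reported here."""
    return version.strip() in SUSPECT_ERT_VERSIONS


def installed_ert_version(dist_name: str = "ert"):
    """Return the installed version of the distribution, or None if absent.

    The default distribution name "ert" is an assumption.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_ert_version()
    if version is None:
        print("ert is not installed in this environment")
    elif is_suspect(version):
        print(f"ert=={version} is the release this issue is about")
    else:
        print(f"ert=={version} is not the release reported here")
```

Running this in the environment used for the failing experiment shows whether the run picked up ert==2.21.0 or a manually installed version.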

The problem now is that the experiment run with flownet==0.5.2 does not reproduce the results of the earlier experiment run with flownet==0.5.0. Specifically, more simulations fail with 0.5.2, which interrupts the history matching (HM) because the required percentage of successful realizations is not met, whereas with 0.5.0 this requirement was met. But this is a separate issue from the one on the ert version; maybe more has changed in flownet between the releases.
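For clarity on what "not meeting the required percentage" means, here is a rough sketch of such a check. It is not the actual ERT or FlowNet implementation: the function name, the parameter `min_success_percentage`, and the numbers in the usage example are hypothetical and chosen only for illustration.

```python
def meets_success_requirement(
    n_realizations: int, n_failed: int, min_success_percentage: float
) -> bool:
    """Return True if enough realizations succeeded to continue.

    All names here are hypothetical; the threshold is a percentage (0-100).
    """
    if n_realizations <= 0:
        raise ValueError("n_realizations must be positive")
    if not 0 <= n_failed <= n_realizations:
        raise ValueError("n_failed must be between 0 and n_realizations")
    n_success = n_realizations - n_failed
    # Compare cross-multiplied values to avoid dividing and rounding.
    return 100 * n_success >= min_success_percentage * n_realizations


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the runs in this issue.
    print(meets_success_requirement(500, 40, 90))  # 460 of 500 succeeded
    print(meets_success_requirement(500, 60, 90))  # 440 of 500 succeeded
```

With a fixed threshold, a handful of additional failed simulations is enough to flip the outcome from "continue" to "interrupt", which is consistent with the difference seen between the two flownet releases.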

wouterjdb · 2021-03-20T21:46:00Z

Are you running with the modified Norne model now? Have you ruled out any changes caused by that?

edubarrosTNO · 2021-03-21T11:13:23Z

Yes, I used the same version of the Norne model in both experiments. The first experiment with flownet==0.5.0 was run two weeks ago, so it was the previous version of what is now on the master branch of flownet-testdata (which was updated last Thursday).

edubarrosTNO · 2021-03-24T17:40:02Z

When repeating my tests with fresh installations of flownet==0.5.0 and flownet==0.5.2, I noticed that there is a new version of ERT, ert==2.21.1. I checked the release notes (https://github.com/equinor/ert/releases/tag/2.21.1) and found that a bug has supposedly been fixed:

ert==2.21.1
Bugfix:
Don't assume singular snapshot in CLI. Fixes a problem where ERT would crash on iteration 1 if a realization failed in iteration 0.

I will re-run my tests now and see if the behavior described in this issue is fixed.

wouterjdb · 2021-03-25T08:55:55Z

Problem solved. We can close this issue now.

edubarrosTNO added bug, enhancement, help wanted and removed enhancement labels Mar 20, 2021

wouterjdb added this to Backlog 📝 in FlowNet via automation Mar 24, 2021

wouterjdb moved this from Backlog 📝 to Prioritized 🚀 in FlowNet Mar 24, 2021

wouterjdb assigned edubarrosTNO Mar 24, 2021

wouterjdb mentioned this issue Mar 24, 2021

Installation issue on server #352

Closed

wouterjdb closed this as completed Mar 25, 2021

FlowNet automation moved this from Prioritized 🚀 to Done 🏁 Mar 25, 2021
