Speed up Wind Toolkit tests #338

akeeste · 2024-07-02T18:40:39Z

This PR addresses #315 by:

changes the data called by the Wind Toolkit tests so that they run faster
updates the .csv files that the tests compare against
updates a few descriptions in the metocean example, fixes a sorting issue, reduces the data downloaded there

akeeste · 2024-07-10T19:20:44Z

These changes were faster locally when data was pulled through the metocean example, but are still slower on actions (4h 2m) than the last wind hindcast cache test on master (2h 13m). I'll keep investigating but I have a hard time seeing a good path forward here

akeeste · 2024-07-17T14:23:57Z

Upon rerunning, the most recent wind hindcast cache test took 1h 43m, a substantial improvement if it is consistent. The rest of the hindcast tests only failed on the rerun because I did not rerun the wave hindcast upon which they depend. They passed on prior workflows

ssolson · 2024-07-17T14:50:51Z

That is a substantial improvement. Lets see if it is repeatable by merging the current development branch and pushing up to this feature branch. I am also interested to check that my hindcast logic for the notebook tests is working correctly.

Do you have any ideas for greater time improvements?

akeeste · 2024-07-17T15:30:09Z

@ssolson Will do on pulling the develop branch updates in.

I'm very doubtful whether or not this time improvement will be consistent. If not, I don't think there's a good way for us to actually pull data from the NREL server each time the test runs. One way we could eliminate the problem entirely is by committing the cached data and using it when testing these functions.

This would be less robust, but perhaps not that risky in the end. The biggest risk that I see is not consistently testing our dependency on NREL rex.

ssolson · 2024-07-17T17:37:33Z

Looks like the hindcast notebook logic worked and ran into an issue in the metocean example.

akeeste · 2024-07-17T18:23:47Z

I'll double check that example. I merged some changes and may have missed something.

akeeste · 2024-07-17T19:31:11Z

This would be less robust, but perhaps not that risky in the end. The biggest risk that I see is not consistently testing our dependency on NREL rex.

Apparently I tempted fate... I don't see anything obviously wrong with the notebook on our end.

Notes so far on the latest error:

The line that is throwing errors on the test wind_toolkit.plot_region(requested_region, lat_lon=wtk_inputs["lat_lon"]) was not changed in this PR (lines 725 and 663 would be changed)
The error on the tests gets thrown by NREL-rex several functions down from plot_region.
I can run the notebook locally without errors using Windows, Python 3.11, the cache cleared, and the same version of NREL-rex that the tests use (v0.2.86).

@ssolson any ideas on what is going on with the metocean example?

ssolson · 2024-07-17T20:03:12Z

The notebook example data should be changed to match the wind station used in testing. The idea in the way the tests are designed is that we only hit the API once with the initial cache. Then reuse the data in all the tests.

Is it possible to switch the metocean example to use cached data (the same data used in the metocean tests)?

akeeste · 2024-07-17T21:09:29Z

Got it, thanks @ssolson. I was on the same page with the caching goal, but didn't realize that our testing workflow would actually prevent new data from being downloaded and cached outside of the prepare-wind-hindcast-cache job.

Yes I can add a test to test_wind_toolkit or change the data called in the metocean_example

ssolson · 2024-07-18T12:33:28Z

Got it, thanks @ssolson. I was on the same page with the caching goal, but didn't realize that our testing workflow would actually prevent new data from being downloaded and cached outside of the prepare-wind-hindcast-cache job.

Yes I can add a test to test_wind_toolkit or change the data called in the metocean_example

I was speaking to the tests efficiency, especially since we are already having trouble with the API we should avoid new calls if possible. Nothing in the tests is preventing making new API requests.

ssolson · 2024-07-18T12:49:37Z

It looks like the time was repeated at an hour and forty minutes 🥳. A huge improvement.

ssolson · 2024-07-18T13:51:48Z

Looking at the error this morning I attempted a fresh install and I cannot recreate it locally.

I do not see any recent updates in NREL-Rex, h5py, h5pyd, or hsds.

The error says that the endpoint is not set for retrieving regional lat lon data. This leads me to think the issue is with accessing the .hscfg from the notebook (as no dependencies have recent updates).

However, the environment setup is consistent will all other calls to the hindcast APIs. So without being able to recreate the issue I will see if changing the example data fixes the issue and address based on the outcome.

ssolson · 2024-07-18T15:12:33Z

Looking at the re-run test failures the issue appeared for the wave hindcast example as well.

I added a step which will copy the .hscfg to the examples folder for the notebook run to see if this fixes the issue.

ssolson · 2024-07-22T12:23:00Z

The wind notebook is failing due to hitting a timeout of 20 minutes when hitting the wind hindcast API made by the following call in the final cell of the notebook:

wtk_temp, wtk_metadata = wind_toolkit.request_wtk_point_data(
   '1-hour',
   [
    'temperature_2m',
   'temperature_20m',
   'temperature_40m',
   'temperature_80m',
   'temperature_120m',
   'temperature_160m'
  ] ,
  ((40.748, -124.527),),
  [2018],
)

This differs from the cached test test_multi_parm in only that the the cached test asks for less parameters as shown:

wtk_multiparm, meta = wind_toolkit.request_wtk_point_data(
   '1-hour',
   [
   'temperature_20m',
   'temperature_40m',
  ] ,
  ((40.748, -124.527),),
  [2018],
)

Removing datapoints from the example would make the example less useful so I am going to try adding additional temperatures to the test call and see how it impacts the test time.

ssolson

Adam I made the tests match the notebooks. This increased the WTK tests from 1 hr 40 mins to ~2 hours. Then I setup the notebook tests to have access to the cached data and tightened up the notebook runtimes. Lastly I reduced the size of the dir_spectra.nc used in the test from ~40 MB to ~40 kB. This is ready to go from my side.

This PR addresses #315 by: - Changes the data called by the Wind Toolkit tests so that they run faster - Updates the .csv files that the tests compare against - Updates a few descriptions in the metocean example, fixes a sorting issue, reduces the data downloaded there - Tests in hindcast match the notebooks and use the same cache.

akeeste added 5 commits July 2, 2024 12:01

update data downloaded in WTK tests

fa3fd3d

update descriptions and data in metocean example

953fbf9

update correct wtk test data

fe4d08d

in request_wtk_point_data, pass preferred_region to multiloc check

97d43cb

black formatting

a407b92

akeeste added 2 commits July 17, 2024 10:05

merge develop

e94fd41

merge metocean examples

a3d5a2c

akeeste marked this pull request as ready for review July 17, 2024 16:13

akeeste requested a review from ssolson July 17, 2024 16:13

akeeste changed the base branch from master to develop July 17, 2024 16:13

Copy .hscfg file to examples directory

fc16066

ssolson added 2 commits July 19, 2024 13:05

increase metocean runtime

7ab461b

fix typos

2efc799

ssolson added 3 commits July 22, 2024 08:25

match test call to notebook API call

92010b4

plot_region Offshore_CA

90c38b0

add test for wind to match notebook & cache

d9d4402

ssolson added 5 commits July 26, 2024 09:47

match tests to example notebook calls

8fa05ff

update datasets to match example notebook data

a624b44

download hindcast datain notebooks

9a1d83b

tighten nb run times

3cdf72a

Down sample to reduce test data size in repo

8bc2b6f

ssolson approved these changes Jul 31, 2024

View reviewed changes

ssolson merged commit fec6c2e into MHKiT-Software:develop Aug 8, 2024
40 checks passed

ssolson mentioned this pull request Aug 12, 2024

MHKiT v0.8.2 #347

Merged

akeeste mentioned this pull request Aug 15, 2024

CI: Wind hindcast API Calls #315

Closed

akeeste deleted the speed_up_WTK_test branch August 15, 2024 16:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Speed up Wind Toolkit tests #338

Speed up Wind Toolkit tests #338

akeeste commented Jul 2, 2024

akeeste commented Jul 10, 2024 •

edited

Loading

akeeste commented Jul 17, 2024

ssolson commented Jul 17, 2024

akeeste commented Jul 17, 2024

ssolson commented Jul 17, 2024

akeeste commented Jul 17, 2024

akeeste commented Jul 17, 2024 •

edited

Loading

ssolson commented Jul 17, 2024 •

edited

Loading

akeeste commented Jul 17, 2024

ssolson commented Jul 18, 2024

ssolson commented Jul 18, 2024

ssolson commented Jul 18, 2024 •

edited

Loading

ssolson commented Jul 18, 2024

ssolson commented Jul 22, 2024 •

edited

Loading

ssolson left a comment •

edited

Loading

Speed up Wind Toolkit tests #338

Speed up Wind Toolkit tests #338

Conversation

akeeste commented Jul 2, 2024

akeeste commented Jul 10, 2024 • edited Loading

akeeste commented Jul 17, 2024

ssolson commented Jul 17, 2024

akeeste commented Jul 17, 2024

ssolson commented Jul 17, 2024

akeeste commented Jul 17, 2024

akeeste commented Jul 17, 2024 • edited Loading

ssolson commented Jul 17, 2024 • edited Loading

akeeste commented Jul 17, 2024

ssolson commented Jul 18, 2024

ssolson commented Jul 18, 2024

ssolson commented Jul 18, 2024 • edited Loading

ssolson commented Jul 18, 2024

ssolson commented Jul 22, 2024 • edited Loading

ssolson left a comment • edited Loading

Choose a reason for hiding this comment

akeeste commented Jul 10, 2024 •

edited

Loading

akeeste commented Jul 17, 2024 •

edited

Loading

ssolson commented Jul 17, 2024 •

edited

Loading

ssolson commented Jul 18, 2024 •

edited

Loading

ssolson commented Jul 22, 2024 •

edited

Loading

ssolson left a comment •

edited

Loading