Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalize file splitting for output writers #3515

Merged
merged 29 commits into from
Mar 24, 2024

Conversation

glwagner
Copy link
Member

This PR generalizes file splitting for JLD2OutputWriter, replacing max_filesize with a file_splitting keyword argument. To implement a maximum filesize we use

JLD2OutputWriter(model, outputs; file_splitting=FileSizeLimit(max_filesize), ...)

This feature should also be extended to NetCDFOutputWriter in this PR -- hopefully @josuemtzmo can help!

This PR will enable TimeInterval (or any other schedule --- IterationInterval, WallTimeInterval) to be used for splitting output files.

Closes #2967

@glwagner glwagner marked this pull request as draft March 15, 2024 21:24
@navidcy
Copy link
Collaborator

navidcy commented Mar 16, 2024

We should also add a test similar to the one that was added in #3512.

@glwagner
Copy link
Member Author

We should also add a test similar to the one that was added in #3512.

Doesn't that already exist?

We just have to adapt the test

@glwagner
Copy link
Member Author

Also open to comments about the new names / kwargs.

@navidcy
Copy link
Collaborator

navidcy commented Mar 16, 2024

We should also add a test similar to the one that was added in #3512.

Doesn't that already exist?

We just have to adapt the test

Yes, the test I think only tests the NetCDFOutputWriter.

@josuemtzmo
Copy link
Collaborator

Doesn't that already exist?
We just have to adapt the test

Yes, the test I think only tests the NetCDFOutputWriter.

There is already a test for the JLD2OutputWriter

@josuemtzmo
Copy link
Collaborator

josuemtzmo commented Mar 17, 2024

I will work on the time splitting schedule. Should we do it in this PR? or should I do another one once it is ready?

@@ -258,18 +261,17 @@ end
function write_output!(writer::JLD2OutputWriter, model)

verbose = writer.verbose
path = writer.filepath
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change?

Copy link
Collaborator

@josuemtzmo josuemtzmo Mar 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because path was used later on in the code, but it wasn't updated. Thus the code crashed while creating the file and when the writer.filepath changed but not the path . The easier fix was to replace all instances of path by the writer.filepath.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that's a good reason!

@josuemtzmo josuemtzmo marked this pull request as ready for review March 19, 2024 09:12
@navidcy navidcy self-requested a review March 22, 2024 08:32
Copy link
Collaborator

@navidcy navidcy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@josuemtzmo
Copy link
Collaborator

josuemtzmo commented Mar 22, 2024

Why are some of the tests failing?

@navidcy
Copy link
Collaborator

navidcy commented Mar 22, 2024

Sometimes (often) they just fail to initialize. Don’t assume you broke them necessarily. Uou need to see the log to understand if actually the tests broke or never started.

@navidcy
Copy link
Collaborator

navidcy commented Mar 22, 2024

@glwagner would you like to have a look at this or should we merge when tests pass?

@navidcy navidcy added the feature 🌟 Something new and shiny label Mar 22, 2024
@navidcy
Copy link
Collaborator

navidcy commented Mar 23, 2024

Some of the doctests are failing because of the changes:

https://buildkite.com/clima/oceananigans/builds/15164#018e6aa7-67f8-4280-b331-ad7179892dd1/32-18831

@glwagner
Copy link
Member Author

I think it looks pretty good! Thanks @josuemtzmo for the work on NetCDFOutputWriter.

In a future PR, we can add a test to ensure that file_splitting = TimeInterval(T) works.

Also I'm trying to brainstorm a better word to use than "actuated" for talking about scheduling.

But I think once the tests pass we should merge this.

@navidcy navidcy merged commit 1104449 into main Mar 24, 2024
48 checks passed
@navidcy navidcy deleted the glw/generalized-file-splitting branch March 24, 2024 11:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature 🌟 Something new and shiny output 💾
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Splitting NetCDF output with NetCDFOutputWriter
3 participants