Generalize file splitting for output writers #3515

Merged
merged 29 commits into from
Mar 24, 2024
Changes from 1 commit
Commits
e06d78e
Generalize file splitting for JLD2OutputWriter so that alternative cr…
glwagner Mar 15, 2024
b1ad9b6
Update src/OutputWriters/jld2_output_writer.jl
glwagner Mar 16, 2024
183e729
Export FileSizeLimit and update JLD2OutputWriter test
glwagner Mar 16, 2024
0f5fedb
implementing file splitting in netcdf
josuemtzmo Mar 16, 2024
9276ba6
Properly export FileSizeLimit
glwagner Mar 16, 2024
6c6e07e
fix handeling of path writer.filepath with the file_splitting
josuemtzmo Mar 17, 2024
ab7fec6
merge with glw/generalized-file-splitting
josuemtzmo Mar 17, 2024
2937dc1
add support to file splitting by size in netCDFs
josuemtzmo Mar 17, 2024
4f9af7e
add support to file splitting by size in netCDFs
josuemtzmo Mar 17, 2024
73a94ef
update warning to properly print variable.
josuemtzmo Mar 18, 2024
cd30f3e
return to the use of FileSizeLimit(200KiB) in the test to make it eas…
josuemtzmo Mar 18, 2024
2f0e4f6
update netcdf to match jld2, and add return in update_file_splitting_…
josuemtzmo Mar 18, 2024
09c4abb
fix tests filesize tests
josuemtzmo Mar 18, 2024
db94858
Apply suggestions from code review
navidcy Mar 22, 2024
14ce9a8
Merge branch 'main' into glw/generalized-file-splitting
navidcy Mar 22, 2024
4804fc2
Update test_jld2_output_writer.jl
navidcy Mar 22, 2024
3a846e5
Merge branch 'main' into glw/generalized-file-splitting
josuemtzmo Mar 22, 2024
4c1db46
fix show for NetCDFOutputWriter
navidcy Mar 23, 2024
f9b082f
fix doctests
navidcy Mar 23, 2024
527ac31
Update src/OutputWriters/netcdf_output_writer.jl
navidcy Mar 23, 2024
78998bc
Update src/OutputWriters/jld2_output_writer.jl
navidcy Mar 23, 2024
60de913
fix doctests
navidcy Mar 23, 2024
fc49fb7
Merge branch 'glw/generalized-file-splitting' of github.com:CliMA/Oce…
navidcy Mar 23, 2024
d8233b0
fix doctests
navidcy Mar 23, 2024
28cc203
fix doctest
navidcy Mar 24, 2024
97fa9bb
cleanup unecessary imports
navidcy Mar 24, 2024
71dab3a
fix doctest
navidcy Mar 24, 2024
4c1cb92
fix doctests
navidcy Mar 24, 2024
6c0e974
fix doctests
navidcy Mar 24, 2024
23 changes: 12 additions & 11 deletions src/OutputWriters/jld2_output_writer.jl
@@ -10,7 +10,10 @@ default_included_properties(::ShallowWaterModel) = [:grid, :coriolis, :closure]
default_included_properties(::HydrostaticFreeSurfaceModel) = [:grid, :coriolis, :buoyancy, :closure]

update_file_splitting_schedule!(schedule, filepath) = nothing
update_file_splitting_schedule!(schedule::FileSizeLimit, filepath) = schedule.path = filepath

function update_file_splitting_schedule!(schedule::FileSizeLimit, filepath)
schedule.path = filepath
end

struct NoFileSplitting end
(::NoFileSplitting)(model) = false
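
The hunk above relies on two Julia idioms: callable structs that decide whether to split, and dispatch on the schedule type so that only a size-based schedule stores the file path. A minimal sketch of that pattern follows. The `Sketch…` types and `update_path!` are hypothetical stand-ins, and the size check is an assumption; the real `FileSizeLimit` in Oceananigans is not shown in this diff and may differ.

```julia
# Hypothetical stand-ins for illustration only; not the Oceananigans definitions.
struct SketchNoFileSplitting end
(::SketchNoFileSplitting)(model) = false   # never request a new file

mutable struct SketchFileSizeLimit
    size_limit :: Float64   # limit in bytes
    path :: String          # file whose size is checked
end

# Convenience constructor: the path is filled in later by the writer.
SketchFileSizeLimit(size_limit) = SketchFileSizeLimit(size_limit, "")

# Assumed criterion: split once the tracked file reaches the limit.
(limit::SketchFileSizeLimit)(model) = filesize(limit.path) >= limit.size_limit

# Fallback: schedules that do not track a path ignore the update.
update_path!(schedule, filepath) = nothing

# Size-based schedules remember which file to measure.
function update_path!(schedule::SketchFileSizeLimit, filepath)
    schedule.path = filepath
    return nothing
end
```

Writing the method in long form with an explicit `return nothing` keeps both methods returning the same value; the one-line form `f(s, p) = s.path = p` would return the assigned path instead.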
@@ -184,7 +187,7 @@ function JLD2OutputWriter(model, outputs; filename, schedule,
mkpath(dir)
filename = auto_extension(filename, ".jld2")
filepath = joinpath(dir, filename)
update_file_splitting_schedule!(schedule, filepath)
update_file_splitting_schedule!(file_splitting, filepath)
overwrite_existing && isfile(filepath) && rm(filepath, force=true)

outputs = NamedTuple(Symbol(name) => construct_output(outputs[name], model.grid, indices, with_halos)
@@ -258,18 +261,17 @@ end
function write_output!(writer::JLD2OutputWriter, model)

verbose = writer.verbose
path = writer.filepath
Member Author

why this change?

Collaborator @josuemtzmo, Mar 18, 2024

Because `path` was used later in the function but was never updated. The code crashed when a new file was created, because `writer.filepath` changed but `path` did not. The easiest fix was to replace every instance of `path` with `writer.filepath`.

Member Author

Ok, that's a good reason!
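
The problem described in this thread can be reproduced in isolation. The sketch below uses a hypothetical `SketchWriter` type (not the real `JLD2OutputWriter`) and made-up file names to show that a local variable bound to a field keeps the old value after the field is reassigned.

```julia
# Hypothetical writer type for illustration; only the `filepath` field matters here.
mutable struct SketchWriter
    filepath :: String
end

function stale_path_demo()
    writer = SketchWriter("output_part1.jld2")
    path = writer.filepath                   # local binding taken before the split
    writer.filepath = "output_part2.jld2"    # stands in for starting a new file
    return path, writer.filepath             # the local binding is now out of date
end

old_path, current_path = stale_path_demo()
```

Reading `writer.filepath` at each use site, as the change does, avoids having to remember to refresh the local variable after every split.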

current_iteration = model.clock.iteration

# Some logic to handle writing to existing files
if iteration_exists(path, current_iteration)
if iteration_exists(writer.filepath, current_iteration)

if writer.overwrite_existing
# Something went wrong, so we remove the file and re-initialize it.
rm(path, force=true)
rm(writer.filepath, force=true)
initialize_jld2_file!(writer, model)
else # nothing we can do since we were asked not to overwrite_existing, so we skip output writing
@warn "Iteration $current_iteration was found in $path. Skipping output writing (for now...)"
@warn "Iteration $current_iteration was found in $(writer.filepath). Skipping output writing (for now...)"
end

else # ok let's do this
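
A note on interpolating a struct field into a message such as the `@warn` in this hunk: in a Julia string, `$` applies only to the identifier that immediately follows it, so a field access has to be wrapped in parentheses. The sketch below uses a hypothetical `DemoWriter` type to show the difference.

```julia
# Hypothetical type for illustration only.
struct DemoWriter
    filepath :: String
end

writer = DemoWriter("test.jld2")

# `$writer` interpolates the whole object; ".filepath" stays as literal text.
whole_object = "found in $writer.filepath"

# Parentheses make the field access part of the interpolated expression.
field_only = "found in $(writer.filepath)"
```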
@@ -284,14 +286,13 @@ function write_output!(writer::JLD2OutputWriter, model)

# Start a new file if the file_splitting(model) is true
writer.file_splitting(model) && start_next_file(model, writer)
update_file_splitting_schedule!(schedule, writer.filepath)

update_file_splitting_schedule!(writer.file_splitting, writer.filepath)
# Write output from `data`
verbose && @info "Writing JLD2 output $(keys(writer.outputs)) to $path..."

start_time, old_filesize = time_ns(), filesize(path)
jld2output!(path, model.clock.iteration, model.clock.time, data, writer.jld2_kw)
end_time, new_filesize = time_ns(), filesize(path)
start_time, old_filesize = time_ns(), filesize(writer.filepath)
jld2output!(writer.filepath, model.clock.iteration, model.clock.time, data, writer.jld2_kw)
end_time, new_filesize = time_ns(), filesize(writer.filepath)

verbose && @info @sprintf("Writing done: time=%s, size=%s, Δsize=%s",
prettytime((end_time - start_time) / 1e9),
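
The last hunk brackets the write with `time_ns()` and `filesize(...)` calls to report elapsed time and file growth. A minimal standalone sketch of that measurement pattern is below; `timed_append!` is a hypothetical helper that appends raw bytes to a file, standing in for the actual JLD2 write.

```julia
# Hypothetical helper: append `bytes` to `filepath` and report elapsed
# seconds and the change in file size, mirroring the before/after pattern.
function timed_append!(filepath, bytes)
    start_time, old_filesize = time_ns(), filesize(filepath)
    open(io -> write(io, bytes), filepath, "a")   # file is closed on return
    end_time, new_filesize = time_ns(), filesize(filepath)

    elapsed_seconds = (end_time - start_time) / 1e9   # time_ns() is in nanoseconds
    return elapsed_seconds, new_filesize - old_filesize
end
```

Because the path is passed in and read at call time, the size is always measured on the file that was actually written, which is the property the switch to `writer.filepath` restores.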
4 changes: 3 additions & 1 deletion test/test_jld2_output_writer.jl
@@ -50,6 +50,8 @@ function test_jld2_file_splitting(arch)
file["boundary_conditions/fake"] = π
end

filesizelimit = FileSizeLimit(200KiB)

ow = JLD2OutputWriter(model, (; u=model.velocities.u);
dir = ".",
filename = "test.jld2",
@@ -58,7 +60,7 @@ function test_jld2_file_splitting(arch)
including = [:grid],
array_type = Array{Float64},
with_halos = true,
file_splitting = FileSizeLimit(200KiB),
file_splitting = filesizelimit,
overwrite_existing = true)

push!(simulation.output_writers, ow)