Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Passes grid argument to NetCDFOutputWriter #3576

Merged
merged 25 commits into from
May 10, 2024
Merged

Conversation

tomchor
Copy link
Collaborator

@tomchor tomchor commented May 2, 2024

Related to #3460

@tomchor
Copy link
Collaborator Author

tomchor commented May 3, 2024

Unfortunately this example

using Oceananigans

grid = RectilinearGrid(size = (1, 1, 8), extent = (1,1,1));
model = NonhydrostaticModel(; grid, closure = ScalarDiffusivity=1e-2))

set!(model, u=(x, y, z,) -> z)

simulation = Simulation(model,
                        Δt=0.5*maximum(zspacings(grid, Center())) / maximum(abs, model.velocities.u),
                        stop_time=20)
# Create regular output
simulation.output_writers[:fullfields] = NetCDFOutputWriter(model, (; model.velocities.u),
                                                            filename = "fullfields.nc",
                                                            schedule = TimeInterval(5),
                                                            overwrite_existing = true,)

# Create interpolated u on coarse grid
coarse_grid = RectilinearGrid(size = (grid.Nx, grid.Ny, grid.Nz÷2), extent = (grid.Lx, grid.Ly, grid.Lz))
coarse_u = Field{Face, Center, Center}(coarse_grid)

using Oceananigans.Fields: interpolate!
update_coarse_u(simulation) = interpolate!(coarse_u, simulation.model.velocities.u)
simulation.callbacks[:update_interp] = Callback(update_coarse_u)

# Create coarse output
simulation.output_writers[:coarsefields] = NetCDFOutputWriter(model, (; coarse_u,), coarse_grid;
                                                              filename="coarsefields.nc",
                                                              schedule=TimeInterval(5),
                                                              overwrite_existing=true,)

run!(simulation)

Throws the following error:

ERROR: LoadError: DimensionMismatch: new dimensions (1, 1, 8, 1) must be consistent with array size 4
Stacktrace:
  [1] (::Base.var"#throw_dmrsa#328")(dims::NTuple{4, Int64}, len::Int64)
    @ Base ./reshapedarray.jl:41
  [2] reshape(a::Array{Float64, 3}, dims::NTuple{4, Int64})
    @ Base ./reshapedarray.jl:45
  [3] setindex_disk!(::NCDatasets.Variable{Float64, 4, NCDatasets.NCDataset{Nothing, Missing}}, ::Array{Float64, 3}, ::Function, ::Vararg{Any})
    @ DiskArrays ~/.julia/packages/DiskArrays/bZBJE/src/diskarray.jl:56
  [4] setindex!
    @ ~/.julia/packages/DiskArrays/bZBJE/src/diskarray.jl:229 [inlined]
  [5] setindex!(::CommonDataModel.CFVariable{…}, ::Array{…}, ::Colon, ::Colon, ::Colon, ::UnitRange{…})
    @ CommonDataModel ~/.julia/packages/CommonDataModel/GGvMn/src/cfvariable.jl:477
  [6] save_output!(ds::NCDatasets.NCDataset{…}, output::Field{…}, model::NonhydrostaticModel{…}, ow::NetCDFOutputWriter{…}, time_index::Int64, name::String)
    @ Oceananigans.OutputWriters ~/repos/Oceananigans.jl/src/OutputWriters/netcdf_output_writer.jl:482
  [7] write_output!(ow::NetCDFOutputWriter{…}, model::NonhydrostaticModel{…})
    @ Oceananigans.OutputWriters ~/repos/Oceananigans.jl/src/OutputWriters/netcdf_output_writer.jl:525
  [8] initialize!(sim::Simulation{…})
    @ Oceananigans.Simulations ~/repos/Oceananigans.jl/src/Simulations/run.jl:212
  [9] time_step!(sim::Simulation{…})
    @ Oceananigans.Simulations ~/repos/Oceananigans.jl/src/Simulations/run.jl:112
 [10] run!(sim::Simulation{…}; pickup::Bool)
    @ Oceananigans.Simulations ~/repos/Oceananigans.jl/src/Simulations/run.jl:97
 [11] run!(sim::Simulation{NonhydrostaticModel{…}, Float64, Float64, OrderedCollections.OrderedDict{…}, OrderedCollections.OrderedDict{…}, OrderedCollections.OrderedDict{…}})
    @ Oceananigans.Simulations ~/repos/Oceananigans.jl/src/Simulations/run.jl:85
 [12] top-level scope
    @ ~/repos/Oceananigans.jl/sandbox/mwe.jl:31
 [13] include(fname::String)
    @ Base.MainInclude ./client.jl:489

So it's not as trivial as the single change I just made. From glancing at the code we at least have to modify initialize_nc_file!() to take a grid option as well:

function initialize_nc_file!(filepath,
outputs,
schedule,
array_type,
indices,
with_halos,
global_attributes,
output_attributes,
dimensions,
overwrite_existing,
deflatelevel,
model)

plus a couple of other things. Still pretty easy, but more work/time that I have right now.

@iuryt feel free to jump in here and make these changes if you feel it's necessary, since creating a whole separate model can be a bit onerous and wastes precious GPU memory.

@glwagner
Copy link
Member

glwagner commented May 3, 2024

Still pretty easy, but more work/time that I have right now.

Reading the code this is just one more line. Can you be more specific about how much more time this will take?

To clarify, I'm pretty sure the main difficulty of this PR is the documentation, not the code itself. The documentation was always going to take a bit of effort. I say this to give accurate impression of the effort this stuff requires.

@glwagner glwagner marked this pull request as ready for review May 3, 2024 16:53
@glwagner
Copy link
Member

glwagner commented May 3, 2024

Ok, the above example works by adding grid as a property to the output writer, and then passing grid explicitly to the initialization.

Making this change required a couple minutes. The main technique was to search the file for model.grid, and make the necessary changes to use the user-provided grid instead. In this process I noticed that file initialization, which required the grid, has to occur outside the output writer constructor. This implies that that the "output writer grid" (which is now different from the model grid) must be stored within the output writer. So I added a grid property to the output writer.

These changes took about 5 minutes. However, the main work is still there, to document this and add tests and an example needed.

If there's any other source code changes needed I'm happy to put those in. The documentation will take more effort.

@tomchor
Copy link
Collaborator Author

tomchor commented May 3, 2024

Still pretty easy, but more work/time that I have right now.

Reading the code this is just one more line. Can you be more specific about how much more time this will take?

I wasn't specific about time because I wasn't sure. I often find myself spending way longer on PRs than I initially anticipate so I'm generally not very good at assessing these things lol

But I'm glad it's working now, pending docs. If the example above works without needing to create a second model, then I don't think anything else is needed on the coding part. @iuryt ?

@glwagner
Copy link
Member

glwagner commented May 3, 2024

I wasn't specific about time because I wasn't sure. I often find myself spending way longer on PRs than I initially anticipate so I'm generally not very good at assessing these things lol

ok ok! Here's a rule of thumb: a bugfix or refactor is the least committment, because you can get away with no new tests or docs.

A new feature takes more time because of documentation. One should expect to spend most of their time on docs and tests. If you're spending a lot of time implementing a feature, then either the feature is very complicated / hard, or your workflow can use some improvement.

Among new features, different types of features require different amount of effort. This case is one of the easier cases --- it's extending functionality without changing existing functionality, also its fairly niche (at this point) so simple documentation will suffice. The work required to test the feature also has already been partially completed (the script you posted).

Other features, like adding new physics will take much longer, because often you'll have to run a full validation case + analysis to assess whether things work as expected. So even if the source code change is small to implement new physics, the validation will take a while and dominate the development effort.

If a new feature also requires changing existing functionality / refactoring, that's going to take the most amount of time, because you will probably also have to change existing tests. And many tests are poorly written, so updating test code is a potential rabbit hole.

@glwagner
Copy link
Member

glwagner commented May 3, 2024

Tests pass, so documentation in your courts @iuryt @tomchor. I think for the docstring we can just say that the grid needs to be provided if outputs are on a different grid than model.

But it could be nice to also put an example in the docs, since I think the whole idea of output on a different grid from the model takes a bit to explain. Eg here: https://clima.github.io/OceananigansDocumentation/stable/model_setup/output_writers/

I can contribute a test if there's desire to put in the documentation.

@iuryt
Copy link
Collaborator

iuryt commented May 3, 2024

I can work on the docs

@iuryt
Copy link
Collaborator

iuryt commented May 6, 2024

Once the wording is finalized, I will edit the docstring.

@@ -158,6 +158,36 @@ NetCDFOutputWriter scheduled on IterationInterval(1):
└── file size: 17.8 KiB
```

`NetCDFOutputWriter` can also be configured for `outputs` that are interpolated or regridded to a different grid than `model.grid`.
To use this functionality, the `output_grid` must be passed explicitly when constructing `NetCDFOutputWriter` along with the regridded / interpolated `outputs`.
Copy link
Collaborator Author

@tomchor tomchor May 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we mention that once NetCDFWriter has output_grid passed it can only write outputs on that grid? In other words, you can't have outputs that are the original/full grid and the interpolated/coarse grid in the same NetCDFWriter file (for now at least). I feel this can be a source of confusion for users in the future.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also we don't pass output_grid. We pass the keyword grid, correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also we don't pass output_grid. We pass the keyword grid, correct?

We made it a positional argument. But we can change that and make a keyword argument for sure. What do you think?

Also, I agree 100% that making that more explicit is a good idea. The grid and outputs must be consistent, otherwise output writing will fail.

Also, you often write NetCDFWriter. I like that name because it's a little more concise ("output" is obvious). Should we change this and JLD2?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that NetCDFWriter is better and that we should also change that for JLD.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, after this is merged we will get on that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also we don't pass output_grid. We pass the keyword grid, correct?

We made it a positional argument. But we can change that and make a keyword argument for sure. What do you think?

Ah, true. I have a bit of a preference for keyword arguments, but I don't feel strongly about it.

Also, I agree 100% that making that more explicit is a good idea. The grid and outputs must be consistent, otherwise output writing will fail.

Also, you often write NetCDFWriter. I like that name because it's a little more concise ("output" is obvious). Should we change this and JLD2?

Ah, yeah, sorry. I write that for the exact reason you mentioned, and the fact that NetCDFOutputWriter is kinda long lol

I often thought abut opening an issue to suggest that change myself, but haven't gotten around to it. So I'm very much in favor of making that change!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm ok I agree it should be a kwarg.

Copy link
Member

@glwagner glwagner May 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that the docs examples are identical to the doc string examples. Ideally these would be different... there's no point in copy pasting here. It's more efficient to use Documenter to insert the docstring if we decide that we don't have the bandwidth to come up with unique examples.

But I personally think that the docstring and docs should have different examples. The docstring should have short concise examples and the docs should go into more detail.

Anyways, task for the future I guess.

@iuryt
Copy link
Collaborator

iuryt commented May 8, 2024

Let me know what else is missing or if we just need a third review to merge.

@tomchor
Copy link
Collaborator Author

tomchor commented May 8, 2024

Let me know what else is missing or if we just need a third review to merge.

Tests still aren't passing.

@glwagner
Copy link
Member

glwagner commented May 8, 2024

Some commits got added that overwrote changes that I made

@glwagner
Copy link
Member

glwagner commented May 8, 2024

Hmm ok somehow changes I made were lost, but I can't figure out how. Anyways I have tried to put them back in.

@iuryt
Copy link
Collaborator

iuryt commented May 8, 2024

Hmm ok somehow changes I made were lost, but I can't figure out how. Anyways I have tried to put them back in.

I was wondering the same. I noticed the error was the same we had before.

@iuryt
Copy link
Collaborator

iuryt commented May 9, 2024

Not sure why this is failing now.

@glwagner
Copy link
Member

glwagner commented May 9, 2024

Might just need a retry -- gave it one. Thanks for the help!

@glwagner
Copy link
Member

heck ya

@glwagner glwagner merged commit e5a9c41 into main May 10, 2024
46 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants