Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(0.91.6) Distributed FFTs using Oceananigans' inhouse DiscreteTransforms #3279

Merged
merged 769 commits into from
Aug 7, 2024

Conversation

simone-silvestri
Copy link
Collaborator

@simone-silvestri simone-silvestri commented Sep 20, 2023

This PR removes the PencilFFT library from Oceananigans and builds a distributed FFT solver using Oceananigans' inhouse transforms. This allows us to run on GPUs both periodic and bounded domains.
No stretched mesh is supported at the moment (that will come in a later PR)

The transposition is performed through a custom transpose routine built for Oceananigans' fields that assumes

  • the starting configuration is always a z-free configuration.
  • the transpose directions are z-free -> y-free -> x-free -> y-free -> x-free
  • the y-direction is integer divisible by the number of ranks that divide the x-direction
  • the z-direction is integer divisible by the number of ranks that divide the y-direction

An additional assumption is that:

  • if TY is Bounded, also TZ needs to be Bounded
  • if TX is Bounded, also TY needs to be Bounded

All these assumptions can be relaxed in following PRs

Copy link
Collaborator

@tomchor tomchor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @simone-silvestri this is great and I'm happy with the changes, so I'm approving.

I will mention though, there are still quite a few comments from @glwagner from Jun 28th that seem to be unresolved, especially for files distributed_transpose.jl amd transposable_field.jl. But I see that he already approved the PR so maybe it doesn't matter :)

🚀

src/DistributedComputations/distributed_transpose.jl Outdated Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
src/DistributedComputations/distributed_transpose.jl Outdated Show resolved Hide resolved

zgrid = field_in.grid # We support only a 2D partition in X and Y
ygrid = twin_grid(zgrid; free_dimension = :y)
xgrid = twin_grid(zgrid; free_dimension = :x)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a method defined below?

Comment on lines 54 to 59
1. `storage.zfield`, partitioned over ``(x, y)`` is initialized with the `rhs`.
2. Transform along ``z``.
3 Transpose + communicate to `storage[2]` in layout ``(x, z, y)``,
which is distributed into `(Rx, Ry)` processes in ``(z, y)``.
4. Transform along ``x``.
5. Transpose + communicate to `last(storage)` in layout ``(y, x, z)``,
which is distributed into `(Rx, Ry)` processes in ``(x, z)``.
6. Transform in ``y``.
3 Transpose + communicate to `storage.yfield` partitioned into `(Rx, Ry)` processes in ``(x, z)``.
4. Transform along ``y``.
5. Transpose + communicate to `storage.xfield` partitioned into `(Rx, Ry)` processes in ``(y, z)``.
6. Transform in ``x``.
Copy link
Collaborator

@navidcy navidcy Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds same question as mine.. (the one by @tomchor egarding storage)

@simone-silvestri
Copy link
Collaborator Author

a method defined below?

I don't understand this comment

@simone-silvestri
Copy link
Collaborator Author

Sounds same question as mine.. (the one by @tomchor egarding storage)

I have added a line in the docstring explaining what storage is

@simone-silvestri
Copy link
Collaborator Author

@navidcy are you satisfied with the corrections?

@simone-silvestri simone-silvestri changed the title (0.91.5) Distributed FFTs using Oceananigans' inhouse DiscreteTransforms (0.92.0) Distributed FFTs using Oceananigans' inhouse DiscreteTransforms Aug 5, 2024
@tomchor
Copy link
Collaborator

tomchor commented Aug 5, 2024

@simone-silvestri you are now bumping to 0.92. Can you clarify what's breaking about this PR?

@simone-silvestri
Copy link
Collaborator Author

simone-silvestri commented Aug 5, 2024

Well it adds a new feature. I could just bump to 0.91.6 instead but it seems quite a major feature to justify the bump in version.
Also it removes dependencies on quite some packages.
What do people think?

@glwagner
Copy link
Member

glwagner commented Aug 5, 2024

Only bump to a new version for a breaking change, 91 versions is enough!

@simone-silvestri simone-silvestri changed the title (0.92.0) Distributed FFTs using Oceananigans' inhouse DiscreteTransforms (0.91.6) Distributed FFTs using Oceananigans' inhouse DiscreteTransforms Aug 5, 2024
@simone-silvestri
Copy link
Collaborator Author

merge time?

Copy link
Collaborator

@navidcy navidcy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@simone-silvestri simone-silvestri merged commit c3d0d59 into main Aug 7, 2024
46 checks passed
@simone-silvestri simone-silvestri deleted the ss/distributed-fft branch August 7, 2024 12:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
distributed 🕸️ Our plan for total cluster domination
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants