(0.88.0) MPI communication and computation overlap in the `HydrostaticFreeSurfaceModel` and `NonhydrostaticModel` (#3125)

* comment

* fixed tag problems

* bugfix

* Update scalar_biharmonic_diffusivity.jl

* Update src/Distributed/multi_architectures.jl

Co-authored-by: Navid C. Constantinou <[email protected]>

* Update src/Distributed/partition_assemble.jl

Co-authored-by: Navid C. Constantinou <[email protected]>

* Update src/ImmersedBoundaries/ImmersedBoundaries.jl

Co-authored-by: Navid C. Constantinou <[email protected]>

* Update src/ImmersedBoundaries/active_cells_map.jl

Co-authored-by: Navid C. Constantinou <[email protected]>

* Update src/Distributed/interleave_comm_and_comp.jl

Co-authored-by: Navid C. Constantinou <[email protected]>

* Clean up batched tridiagonal solver and vertically implicit solver

* Fix bug in batched tridiagonal solver

* bugfix

* Try to fix multi region immersed boundary issue

* Hopefully fix immersed boundary grid constructor

* Another fix

* fixed project and manifest

* convert instead of FT

* export KernelParameters

* remove FT

* removed useless where FT

* small bugfix

* update manifest

* remove unbuffered communication

* little bit of a cleanup

* removed `views` comment

* couple of bugfixes

* fixed tests

* probably done

* same thing for nonhydrostatic model

* include file

* bugfix

* prepare for nonhydrostatic multiregion

* also here

* bugfix

* other bugfix

* fix closures

* bugfix

* simplify

* 2D leith requires 2 halos!

* AMD and Smag require 1 halo!

* wrong order

* correct halo handling for diffusivities

* correct Leith formulation + fixes

* `only_local_halos` kwarg in `fill_halo_regions!`

* bugfix

* FT on GPU

* bugfix

* bugfix

* last bugfix?

* removed all offsets from kernels + fixed all tests

* fix `_compute!`

* finished

* fixed broken tests

* fixed docs

* miscellaneous changes

* bugfix

* removed tests for vertical subdivision

* test corner passing

* correction

* retry

* fixed all problems

* Added a validation example

* fixed tests

* try new test

* fill send buffers in the correct place

* fixed comments

* define async

* pass the grid

* bugfix

* fix show method

* RefValue for mpi_tag

* comment

* add catke preprint

* remove warning; add ref to catke preprint

* some code cleanup

* correct the example

* Update src/TurbulenceClosures/vertically_implicit_diffusion_solver.jl

Co-authored-by: Navid C. Constantinou <[email protected]>

* bugfix

* Refactor unit tests

* Generalize regridding for lat-lon

* bugfix

* Add newline

* small correction

* new tests

* bugfix

* bugfix

* back for testing

* update manifest

* more options

* more

* finished

* test hypothesis

* fixed bug - correct speed now

* add space

* bugfix

* test

* more info

* removed left-right connected computation

* bugfix

* remove info

* improve

* typo

* bugfix

* bugfix

* correct comments

* bugfix

* bugfix prescribed velocities

* fixes

* ok on mac

* bugfix

* bug fixed

* bugfixxed

* new default

* bugfix

* remove <<<<HEAD

* bugfix PrescribedVelocityFields

* default in another PR

* bugfix

* flat grids only in Grids

* last bugfix

* bugfix

* try partial cells

* bugfix

* bugfix

* Update test_turbulence_closures.jl

* small fixes

* rework IBG and MRG

* Update src/TurbulenceClosures/vertically_implicit_diffusion_solver.jl

* small bugfix

* remove multiregion ibg with arrays for the moment

* bugfix

* little cleaner

* fixed tests

* see what the error is

* allow changing halos from checkpointer

* test it

* finally fixed it

* better naming

* bugfix

* bugfix

* bugfix

* bugfix

* removed useless tendency

* small fix

* dummy commit

* fix active cell map

* comment

* bugfix

* bugfix

* removed useless tendency

* maybe just keep it does not harm too much

* should have fixed it?

* let's go now

* done

* bugfix

* no need for this

* convert Δt in time stepping

* maximum

* minimum substeps

* more flexibility

* bugfix

* multidimensional

* fallback methods

* test a thing

* change

* change

* change

* change

* update

* update

* new offsets + return to previous KA

* bugfix

* bugfixxed

* remove debugging

* going back

* more robust partitioning

* quite new

* bugfix

* updated Manifest

* build with 1.9.3

* switch boundary_buffer to required_halo_size

* bugfix

* Update src/Models/HydrostaticFreeSurfaceModels/single_column_model_mode.jl

Co-authored-by: Gregory L. Wagner <[email protected]>

* Update src/Models/HydrostaticFreeSurfaceModels/update_hydrostatic_free_surface_model_state.jl

Co-authored-by: Gregory L. Wagner <[email protected]>

* bugfix

* biharmonic requires 2 halos

* bugfix

* compute_auxiliaries!

* bugfix

* fixed it

* little change

* some changes

* bugfix

* bugfix

* bugfixxed

* another bugfix

* compute_diffusivities!

* required halo size

* all fixed

* shorten line

* fix comment

* remove abbreviation

* remove unused functions

* better explanation of the MPI tag

* Update src/ImmersedBoundaries/active_cells_map.jl

Co-authored-by: Navid C. Constantinou <[email protected]>

* Update src/Solvers/batched_tridiagonal_solver.jl

Co-authored-by: Navid C. Constantinou <[email protected]>

* change name

* docstring

* name change on rank

* interior active cells

* calculate -> compute

* fixed tests

* do not compute momentum in prescribed velocities

* DistributedComputations

* DistributedComputations part #2

* bugfix

* fixed the docs

---------

Co-authored-by: Navid C. Constantinou <[email protected]>
Co-authored-by: Gregory L. Wagner <[email protected]>
3 people committed Sep 19, 2023
1 parent a1031f9 commit a2e83df
Showing 119 changed files with 2,706 additions and 1,963 deletions.
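
The headline change is overlapping halo communication with computation. The sketch below illustrates that general pattern in plain Julia and is not the implementation in this commit: `exchange_halos!`, `laplacian`, and `overlapped_laplacian` are hypothetical names, and a periodic copy on a local task stands in for the MPI exchange. The idea is to start the exchange, compute the cells that do not depend on halo values, and only then wait and finish the cells next to the halos.

```julia
# Hypothetical stand-in for a halo exchange: fills the two ghost cells of `u`
# periodically, as a neighbouring rank would in a real MPI run.
function exchange_halos!(u)
    n = length(u)
    u[1] = u[n - 1]
    u[n] = u[2]
    return nothing
end

# Second difference at index i; reads u[i - 1], u[i] and u[i + 1].
laplacian(u, i) = u[i - 1] - 2u[i] + u[i + 1]

function overlapped_laplacian(u)
    n = length(u)
    out = zeros(eltype(u), n)

    # 1. Start the "communication" on a separate task.
    comm = @async exchange_halos!(u)

    # 2. Meanwhile, compute the cells whose stencil never touches a halo cell.
    for i in 3:n-2
        out[i] = laplacian(u, i)
    end

    # 3. Wait for the halos, then finish the two cells adjacent to them.
    wait(comm)
    for i in (2, n - 1)
        out[i] = laplacian(u, i)
    end

    return out
end

u = [0.0, 1.0, 4.0, 9.0, 16.0, 0.0]   # u[1] and u[end] are halo cells
out = overlapped_laplacian(u)
```

With real MPI the task would be replaced by non-blocking sends and receives, and the interior loop by the interior tendency kernels.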
4 changes: 2 additions & 2 deletions Project.toml
@@ -1,7 +1,7 @@
 name = "Oceananigans"
 uuid = "9e8cae18-63c1-5223-a75c-80ca9d6e9a09"
 authors = ["Climate Modeling Alliance and contributors"]
-version = "0.87.4"
+version = "0.88.0"
 
 [deps]
 Adapt = "79e6a3ab-5dfb-504d-930d-738a2a938a0e"
@@ -38,6 +38,7 @@ StructArrays = "09ab397b-f2b6-538f-b94a-2f83cf4a842a"
 [compat]
 Adapt = "3"
 CUDA = "4"
+KernelAbstractions = "^0.9"
 Crayons = "4"
 CubedSphere = "0.1, 0.2"
 Distances = "0.10"
@@ -47,7 +48,6 @@ Glob = "1.3"
 IncompleteLU = "0.2"
 IterativeSolvers = "0.9"
 JLD2 = "^0.4"
-KernelAbstractions = "0.9"
 MPI = "0.16, 0.17, 0.18, 0.19, 0.20"
 NCDatasets = "0.12.10"
 OffsetArrays = "1.4"

4 changes: 2 additions & 2 deletions benchmark/distributed_nonhydrostatic_model_mpi.jl
@@ -6,7 +6,7 @@ using JLD2
 using BenchmarkTools
 
 using Oceananigans
-using Oceananigans.Distributed
+using Oceananigans.DistributedComputations
 
 Logging.global_logger(OceananigansLogger())
 
@@ -28,7 +28,7 @@ local_rank = MPI.Comm_rank(comm)
 @info "Setting up distributed nonhydrostatic model with N=($Nx, $Ny, $Nz) grid points and ranks=($Rx, $Ry, $Rz) on rank $local_rank..."
 
 topo = (Periodic, Periodic, Periodic)
-arch = DistributedArch(CPU(), topology=topo, ranks=(Rx, Ry, Rz), communicator=MPI.COMM_WORLD)
+arch = Distributed(CPU(), topology=topo, ranks=(Rx, Ry, Rz), communicator=MPI.COMM_WORLD)
 distributed_grid = RectilinearGrid(arch, topology=topo, size=(Nx, Ny, Nz), extent=(1, 1, 1))
 model = NonhydrostaticModel(grid=distributed_grid)
 
4 changes: 2 additions & 2 deletions benchmark/distributed_shallow_water_model_mpi.jl
@@ -6,7 +6,7 @@ using JLD2
 using BenchmarkTools
 
 using Oceananigans
-using Oceananigans.Distributed
+using Oceananigans.DistributedComputations
 using Benchmarks
 
 Logging.global_logger(OceananigansLogger())
@@ -30,7 +30,7 @@ Ry = parse(Int, ARGS[4])
 @info "Setting up distributed shallow water model with N=($Nx, $Ny) grid points and ranks=($Rx, $Ry) on rank $local_rank..."
 
 topo = (Periodic, Periodic, Flat)
-arch = DistributedArch(CPU(), topology=topo, ranks=(Rx, Ry, 1), communicator=MPI.COMM_WORLD)
+arch = Distributed(CPU(), topology=topo, ranks=(Rx, Ry, 1), communicator=MPI.COMM_WORLD)
 distributed_grid = RectilinearGrid(arch, topology=topo, size=(Nx, Ny), extent=(1, 1))
 model = ShallowWaterModel(grid=distributed_grid, gravitational_acceleration=1.0)
 set!(model, h=1)

2 changes: 1 addition & 1 deletion docs/src/appendix/library.md
@@ -61,7 +61,7 @@ Private = false
 ## Distributed
 
 ```@autodocs
-Modules = [Oceananigans.Distributed]
+Modules = [Oceananigans.DistributedComputations]
 Private = false
 ```
 
14 changes: 5 additions & 9 deletions src/AbstractOperations/computed_field.jl
@@ -4,7 +4,7 @@
 
 using KernelAbstractions: @kernel, @index
 using Oceananigans.Grids: default_indices
-using Oceananigans.Fields: FieldStatus, reduced_dimensions, validate_indices, offset_compute_index
+using Oceananigans.Fields: FieldStatus, reduced_dimensions, validate_indices, offset_index
 using Oceananigans.Utils: launch!
 
 import Oceananigans.Fields: Field, compute!
@@ -75,17 +75,13 @@ end
 
 function compute_computed_field!(comp)
     arch = architecture(comp)
-    launch!(arch, comp.grid, size(comp), _compute!, comp.data, comp.operand, comp.indices)
+    parameters = KernelParameters(size(comp), map(offset_index, comp.indices))
+    launch!(arch, comp.grid, parameters, _compute!, comp.data, comp.operand)
     return comp
 end
 
 """Compute an `operand` and store in `data`."""
-@kernel function _compute!(data, operand, index_ranges)
+@kernel function _compute!(data, operand)
     i, j, k = @index(Global, NTuple)
-
-    i′ = offset_compute_index(index_ranges[1], i)
-    j′ = offset_compute_index(index_ranges[2], j)
-    k′ = offset_compute_index(index_ranges[3], k)
-
-    @inbounds data[i′, j′, k′] = operand[i′, j′, k′]
+    @inbounds data[i, j, k] = operand[i, j, k]
 end
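
The change above moves the index shift out of the kernel body and into the launch parameters. As a rough one-dimensional illustration in plain Julia (no KernelAbstractions or Oceananigans involved), the two styles can be compared as follows. All function names here are hypothetical, and `launch_with_offset!` is only a stand-in for what `launch!` with `KernelParameters` appears to do, not its actual implementation.

```julia
# Old style: the "kernel" receives a 1-based index and shifts it itself.
function copy_with_offset_in_kernel!(data, operand, range)
    offset = first(range) - 1
    for i in 1:length(range)
        i′ = i + offset
        data[i′] = operand[i′]
    end
    return data
end

# New style: the kernel body uses the index it is handed, unchanged.
copy_kernel!(data, operand, i) = (data[i] = operand[i]; nothing)

# Hypothetical launcher: applies the offset once, before calling the kernel.
function launch_with_offset!(kernel!, worksize, offset, args...)
    for i in 1:worksize
        kernel!(args..., i + offset)
    end
    return nothing
end

operand = collect(10:10:100)
range = 4:7

data_old = zeros(Int, 10)
copy_with_offset_in_kernel!(data_old, operand, range)

data_new = zeros(Int, 10)
launch_with_offset!(copy_kernel!, length(range), first(range) - 1, data_new, operand)
```

Both variants write the same entries; the second keeps every kernel body free of offset arithmetic, which matches the "removed all offsets from kernels" entry in the commit message.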
3 changes: 1 addition & 2 deletions src/Advection/Advection.jl
@@ -54,8 +54,7 @@ abstract type AbstractUpwindBiasedAdvectionScheme{B, FT} <: AbstractAdvectionSch
 # Note that it is not possible to compile schemes for `advection_buffer = 41` or higher.
 const advection_buffers = [1, 2, 3, 4, 5, 6]
 
-@inline boundary_buffer(::AbstractAdvectionScheme{B}) where B = B
-@inline required_halo_size(scheme::AbstractAdvectionScheme{B}) where B = B
+@inline required_halo_size(::AbstractAdvectionScheme{B}) where B = B
 
 include("centered_advective_fluxes.jl")
 include("upwind_biased_advective_fluxes.jl")
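
`required_halo_size` reads the buffer size straight from the scheme's type parameter `B`. The following self-contained sketch shows the same dispatch pattern with made-up types: `AbstractScheme`, `HypotheticalCentered`, `HypotheticalUpwind`, and `minimum_halo` are assumptions for illustration and do not exist in Oceananigans.

```julia
# Hypothetical scheme hierarchy; `B` plays the role of the buffer size.
abstract type AbstractScheme{B} end
struct HypotheticalCentered{B} <: AbstractScheme{B} end
struct HypotheticalUpwind{B}   <: AbstractScheme{B} end

# Same pattern as in the diff: the value is recovered from the type parameter,
# so the argument itself does not need a name.
required_halo_size(::AbstractScheme{B}) where B = B

# Hypothetical helper: the smallest halo that satisfies every scheme in use.
minimum_halo(schemes...) = maximum(required_halo_size, schemes)

centered = HypotheticalCentered{2}()
upwind   = HypotheticalUpwind{3}()
halo     = minimum_halo(centered, upwind)
```

Because `B` is part of the type, the call should resolve at compile time, which is presumably why the source marks it `@inline`.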
2 changes: 1 addition & 1 deletion src/Advection/flat_advective_fluxes.jl
@@ -3,7 +3,7 @@
 ##### Flat Topologies
 #####
 
-using Oceananigans.Operators: XFlatGrid, YFlatGrid, ZFlatGrid
+using Oceananigans.Grids: XFlatGrid, YFlatGrid, ZFlatGrid
 
 for SchemeType in [:CenteredScheme, :UpwindScheme]
     @eval begin

22 changes: 11 additions & 11 deletions src/Advection/reconstruction_coefficients.jl
@@ -123,19 +123,19 @@ Examples
 julia> using Oceananigans.Advection: calc_reconstruction_stencil
 
 julia> calc_reconstruction_stencil(1, :right, :x)
-:(+(FT(coeff1_right[1]) * ψ[i + 0, j, k]))
+:(+(convert(FT, coeff1_right[1]) * ψ[i + 0, j, k]))
 
 julia> calc_reconstruction_stencil(1, :left, :x)
-:(+(FT(coeff1_left[1]) * ψ[i + -1, j, k]))
+:(+(convert(FT, coeff1_left[1]) * ψ[i + -1, j, k]))
 
 julia> calc_reconstruction_stencil(1, :symmetric, :x)
-:(FT(coeff2_symmetric[2]) * ψ[i + -1, j, k] + FT(coeff2_symmetric[1]) * ψ[i + 0, j, k])
+:(convert(FT, coeff2_symmetric[2]) * ψ[i + -1, j, k] + convert(FT, coeff2_symmetric[1]) * ψ[i + 0, j, k])
 
 julia> calc_reconstruction_stencil(2, :symmetric, :x)
-:(FT(coeff4_symmetric[4]) * ψ[i + -2, j, k] + FT(coeff4_symmetric[3]) * ψ[i + -1, j, k] + FT(coeff4_symmetric[2]) * ψ[i + 0, j, k] + FT(coeff4_symmetric[1]) * ψ[i + 1, j, k])
+:(convert(FT, coeff4_symmetric[4]) * ψ[i + -2, j, k] + convert(FT, coeff4_symmetric[3]) * ψ[i + -1, j, k] + convert(FT, coeff4_symmetric[2]) * ψ[i + 0, j, k] + convert(FT, coeff4_symmetric[1]) * ψ[i + 1, j, k])
 
 julia> calc_reconstruction_stencil(3, :left, :x)
-:(FT(coeff5_left[5]) * ψ[i + -3, j, k] + FT(coeff5_left[4]) * ψ[i + -2, j, k] + FT(coeff5_left[3]) * ψ[i + -1, j, k] + FT(coeff5_left[2]) * ψ[i + 0, j, k] + FT(coeff5_left[1]) * ψ[i + 1, j, k])
+:(convert(FT, coeff5_left[5]) * ψ[i + -3, j, k] + convert(FT, coeff5_left[4]) * ψ[i + -2, j, k] + convert(FT, coeff5_left[3]) * ψ[i + -1, j, k] + convert(FT, coeff5_left[2]) * ψ[i + 0, j, k] + convert(FT, coeff5_left[1]) * ψ[i + 1, j, k])
 ```
 """
 @inline function calc_reconstruction_stencil(buffer, shift, dir, func::Bool = false)
@@ -154,16 +154,16 @@ julia> calc_reconstruction_stencil(3, :left, :x)
 c = n - buffer - 1
 if func
     stencil_full[idx] = dir == :x ?
-        :(FT($coeff[$(order - idx + 1)]) * ψ(i + $c, j, k, grid, args...)) :
+        :(convert(FT, $coeff[$(order - idx + 1)]) * ψ(i + $c, j, k, grid, args...)) :
         dir == :y ?
-        :(FT($coeff[$(order - idx + 1)]) * ψ(i, j + $c, k, grid, args...)) :
-        :(FT($coeff[$(order - idx + 1)]) * ψ(i, j, k + $c, grid, args...))
+        :(convert(FT, $coeff[$(order - idx + 1)]) * ψ(i, j + $c, k, grid, args...)) :
+        :(convert(FT, $coeff[$(order - idx + 1)]) * ψ(i, j, k + $c, grid, args...))
 else
     stencil_full[idx] = dir == :x ?
-        :(FT($coeff[$(order - idx + 1)]) * ψ[i + $c, j, k]) :
+        :(convert(FT, $coeff[$(order - idx + 1)]) * ψ[i + $c, j, k]) :
         dir == :y ?
-        :(FT($coeff[$(order - idx + 1)]) * ψ[i, j + $c, k]) :
-        :(FT($coeff[$(order - idx + 1)]) * ψ[i, j, k + $c])
+        :(convert(FT, $coeff[$(order - idx + 1)]) * ψ[i, j + $c, k]) :
+        :(convert(FT, $coeff[$(order - idx + 1)]) * ψ[i, j, k + $c])
 end
 end
 return Expr(:call, :+, stencil_full...)
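
`calc_reconstruction_stencil` builds a sum of `coefficient * ψ[...]` terms as an expression, and this commit changes the generated coefficient cast from `FT(x)` to `convert(FT, x)`. The sketch below reproduces that code-generation pattern in one dimension under simplifying assumptions: `stencil_expr` and `interpolate` are hypothetical names, and the coefficients are literal numbers rather than the package's coefficient tables.

```julia
# Build `convert(FT, c1) * ψ[i + o1] + convert(FT, c2) * ψ[i + o2] + ...`
# from literal coefficients and index offsets.
function stencil_expr(coeffs, offsets)
    terms = [:(convert(FT, $c) * ψ[i + $o]) for (c, o) in zip(coeffs, offsets)]
    return Expr(:call, :+, terms...)
end

# Splice the generated expression into a method body. `FT` in the expression is
# bound by the `where FT` clause, so coefficients follow the array's element type.
@eval interpolate(ψ::AbstractVector{FT}, i) where FT = $(stencil_expr((0.5, 0.5), (-1, 0)))

ψ32 = Float32[1, 3, 5]
ψ64 = Float64[1, 3, 5]

v32 = interpolate(ψ32, 2)
v64 = interpolate(ψ64, 3)
```

Here the `Float64` literals are converted before the multiplication, so the `Float32` input yields a `Float32` result instead of being promoted.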
14 changes: 7 additions & 7 deletions src/Advection/topologically_conditional_interpolation.jl
@@ -26,12 +26,12 @@ const AUGXYZ = AUG{<:Any, <:Bounded, <:Bounded, <:Bounded}
 # Left-biased buffers are smaller by one grid point on the right side; vice versa for right-biased buffers
 # Center interpolation stencil look at i + 1 (i.e., require one less point on the left)
 
-@inline outside_symmetric_bufferᶠ(i, N, adv) = (i >= boundary_buffer(adv) + 1) & (i <= N + 1 - boundary_buffer(adv))
-@inline outside_symmetric_bufferᶜ(i, N, adv) = (i >= boundary_buffer(adv)) & (i <= N + 1 - boundary_buffer(adv))
-@inline outside_left_biased_bufferᶠ(i, N, adv) = (i >= boundary_buffer(adv) + 1) & (i <= N + 1 - (boundary_buffer(adv) - 1))
-@inline outside_left_biased_bufferᶜ(i, N, adv) = (i >= boundary_buffer(adv)) & (i <= N + 1 - (boundary_buffer(adv) - 1))
-@inline outside_right_biased_bufferᶠ(i, N, adv) = (i >= boundary_buffer(adv)) & (i <= N + 1 - boundary_buffer(adv))
-@inline outside_right_biased_bufferᶜ(i, N, adv) = (i >= boundary_buffer(adv) - 1) & (i <= N + 1 - boundary_buffer(adv))
+@inline outside_symmetric_haloᶠ(i, N, adv) = (i >= required_halo_size(adv) + 1) & (i <= N + 1 - required_halo_size(adv))
+@inline outside_symmetric_haloᶜ(i, N, adv) = (i >= required_halo_size(adv)) & (i <= N + 1 - required_halo_size(adv))
+@inline outside_left_biased_haloᶠ(i, N, adv) = (i >= required_halo_size(adv) + 1) & (i <= N + 1 - (required_halo_size(adv) - 1))
+@inline outside_left_biased_haloᶜ(i, N, adv) = (i >= required_halo_size(adv)) & (i <= N + 1 - (required_halo_size(adv) - 1))
+@inline outside_right_biased_haloᶠ(i, N, adv) = (i >= required_halo_size(adv)) & (i <= N + 1 - required_halo_size(adv))
+@inline outside_right_biased_haloᶜ(i, N, adv) = (i >= required_halo_size(adv) - 1) & (i <= N + 1 - required_halo_size(adv))
 
 # Separate High order advection from low order advection
 const HOADV = Union{WENO,
@@ -60,7 +60,7 @@ for bias in (:symmetric, :left_biased, :right_biased)
     @eval @inline $alt_interp(i, j, k, grid::$GridType, scheme::LOADV, args...) = $interp(i, j, k, grid, scheme, args...)
 end
 
-outside_buffer = Symbol(:outside_, bias, :_buffer, loc)
+outside_buffer = Symbol(:outside_, bias, :_halo, loc)
 
 # Conditional high-order interpolation in Bounded directions
 if ξ == :x
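
To see what the renamed `outside_*_halo` predicates select, the two face-located variants can be evaluated on a small grid. This sketch copies the inequalities from the diff but takes the halo size `H` as a plain integer instead of calling `required_halo_size(adv)`. The ASCII function names and the values `N = 8`, `H = 2` are chosen here for illustration only.

```julia
# Same inequality as `outside_symmetric_haloᶠ`, with the halo size passed directly.
outside_symmetric_halo_face(i, N, H) = (i >= H + 1) & (i <= N + 1 - H)

# Same inequality as `outside_left_biased_haloᶠ`.
outside_left_biased_halo_face(i, N, H) = (i >= H + 1) & (i <= N + 1 - (H - 1))

N, H = 8, 2

# Face indices 1:N+1 at which each predicate holds.
symmetric_ok   = [i for i in 1:N+1 if outside_symmetric_halo_face(i, N, H)]
left_biased_ok = [i for i in 1:N+1 if outside_left_biased_halo_face(i, N, H)]
```

The left-biased range extends one index further to the right than the symmetric one, in line with the source comment that left-biased buffers are smaller by one grid point on the right side.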

2 comments on commit a2e83df

@navidcy
Collaborator

@JuliaRegistrator

Registration pull request created: JuliaRegistries/General/91747

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the GitHub interface, or via:

git tag -a v0.88.0 -m "<description of version>" a2e83dfe54079f2939d514380a9aae65d8a0bc43
git push origin v0.88.0
