(0.88.0) MPI communication and computation overlap in the `HydrostaticFreeSurfaceModel` and `NonhydrostaticModel` (#3125)

* comment

* fixed tag problems

* bugfix

* Update scalar_biharmonic_diffusivity.jl

* Update src/Distributed/multi_architectures.jl

Co-authored-by: Navid C. Constantinou <[email protected]>

* Update src/Distributed/partition_assemble.jl

Co-authored-by: Navid C. Constantinou <[email protected]>

* Update src/ImmersedBoundaries/ImmersedBoundaries.jl

Co-authored-by: Navid C. Constantinou <[email protected]>

* Update src/ImmersedBoundaries/active_cells_map.jl

Co-authored-by: Navid C. Constantinou <[email protected]>

* Update src/Distributed/interleave_comm_and_comp.jl

Co-authored-by: Navid C. Constantinou <[email protected]>

* Clean up batched tridiagonal solver and vertically implicit solver

* Fix bug in batched tridiagonal solver

* bugfix

* Try to fix multi region immersed boundary issue

* Hopefully fix immersed boundary grid constructor

* Another fix

* fixed project and manifest

* convert instead of FT

* export KernelParameters

* remove FT

* removed useless where FT

* small bugfix

* update manifest

* remove unbuffered communication

* little bit of a cleanup

* removed `views` comment

* couple of bugfixes

* fixed tests

* probably done

* same thing for nonhydrostatic model

* include file

* bugfix

* prepare for nonhydrostatic multiregion

* also here

* bugfix

* other bugfix

* fix closures

* bugfix

* simplify

* 2D leith requires 2 halos!

* AMD and Smag require 1 halo!

* wrong order

* correct halo handling for diffusivities

* correct Leith formulation + fixes

* `only_local_halos` kwarg in `fill_halo_regions!`

* bugfix

* FT on GPU

* bugfix

* bugfix

* last bugfix?

* removed all offsets from kernels + fixed all tests

* fix `_compute!`

* finished

* fixed broken tests

* fixed docs

* miscellaneous changes

* bugfix

* removed tests for vertical subdivision

* test corner passing

* correction

* retry

* fixed all problems

* Added a validation example

* fixed tests

* try new test

* fill send buffers in the correct place

* fixed comments

* define async

* pass the grid

* bugfix

* fix show method

* RefValue for mpi_tag

* comment

* add catke preprint

* remove warning; add ref to catke preprint

* some code cleanup

* correct the example

* Update src/TurbulenceClosures/vertically_implicit_diffusion_solver.jl

Co-authored-by: Navid C. Constantinou <[email protected]>

* bugfix

* Refactor unit tests

* Generalize regridding for lat-lon

* bugfix

* Add newline

* small correction

* new tests

* bugfix

* bugfix

* back for testing

* update manifest

* more options

* more

* finished

* test hypothesis

* fixed bug - correct speed now

* add space

* bugfix

* test

* more info

* removed left-right connected computation

* bugfix

* remove info

* improve

* typo

* bugfix

* bugfix

* correct comments

* bugfix

* bugfix prescribed velocities

* fixes

* ok on mac

* bugfix

* bug fixed

* bugfixxed

* new default

* bugfix

* remove <<<<HEAD

* bugfix PrescribedVelocityFields

* default in another PR

* bugfix

* flat grids only in Grids

* last bugfix

* bugfix

* try partial cells

* bugfix

* bugfix

* Update test_turbulence_closures.jl

* small fixes

* rework IBG and MRG

* Update src/TurbulenceClosures/vertically_implicit_diffusion_solver.jl

* small bugfix

* remove multiregion ibg with arrays for the moment

* bugfix

* little cleaner

* fixed tests

* see what the error is

* allow changing halos from checkpointer

* test it

* finally fixed it

* better naming

* bugfix

* bugfix

* bugfix

* bugfix

* removed useless tendency

* small fix

* dummy commit

* fix active cell map

* comment

* bugfix

* bugfix

* removed useless tendency

* maybe just keep it does not harm too much

* should have fixed it?

* let's go now

* done

* bugfix

* no need for this

* convert Δt in time stepping

* maximum

* minimum substeps

* more flexibility

* bugfix

* multidimensional

* fallback methods

* test a thing

* change

* change

* change

* change

* update

* update

* new offsets + return to previous KA

* bugfix

* bugfixxed

* remove debugging

* going back

* more robust partitioning

* quite new

* bugfix

* updated Manifest

* build with 1.9.3

* switch boundary_buffer to required_halo_size

* bugfix

* Update src/Models/HydrostaticFreeSurfaceModels/single_column_model_mode.jl

Co-authored-by: Gregory L. Wagner <[email protected]>

* Update src/Models/HydrostaticFreeSurfaceModels/update_hydrostatic_free_surface_model_state.jl

Co-authored-by: Gregory L. Wagner <[email protected]>

* bugfix

* biharmonic requires 2 halos

* buggfix

* compute_auxiliaries!

* bugfix

* fixed it

* little change

* some changes

* bugfix

* bugfix

* bugfixxed

* another bugfix

* compute_diffusivities!

* required halo size

* all fixed

* shorten line

* fix comment

* remove abbreviation

* remove unused functions

* better explanation of the MPI tag

* Update src/ImmersedBoundaries/active_cells_map.jl

Co-authored-by: Navid C. Constantinou <[email protected]>

* Update src/Solvers/batched_tridiagonal_solver.jl

Co-authored-by: Navid C. Constantinou <[email protected]>

* change name

* docstring

* name change on rank

* interior active cells

* calculate -> compute

* fixed tests

* do not compute momentum in prescribed velocities

* DistributedComputations

* DistributedComputations part #2

* bugfix

* fixed the docs

---------

Co-authored-by: Navid C. Constantinou <[email protected]>
Co-authored-by: Gregory L. Wagner <[email protected]>
3 people committed Sep 20, 2023
1 parent 9e3bb48 commit 68cb4c5
Showing 119 changed files with 2,706 additions and 1,963 deletions.
4 changes: 2 additions & 2 deletions Project.toml
@@ -1,7 +1,7 @@
 name = "Oceananigans"
 uuid = "9e8cae18-63c1-5223-a75c-80ca9d6e9a09"
 authors = ["Climate Modeling Alliance and contributors"]
-version = "0.87.4"
+version = "0.88.0"

 [deps]
 Adapt = "79e6a3ab-5dfb-504d-930d-738a2a938a0e"
@@ -38,6 +38,7 @@ StructArrays = "09ab397b-f2b6-538f-b94a-2f83cf4a842a"
 [compat]
 Adapt = "3"
 CUDA = "4"
+KernelAbstractions = "^0.9"
 Crayons = "4"
 CubedSphere = "0.1, 0.2"
 Distances = "0.10"
@@ -47,7 +48,6 @@ Glob = "1.3"
 IncompleteLU = "0.2"
 IterativeSolvers = "0.9"
 JLD2 = "^0.4"
-KernelAbstractions = "0.9"
 MPI = "0.16, 0.17, 0.18, 0.19, 0.20"
 NCDatasets = "0.12.10"
 OffsetArrays = "1.4"
4 changes: 2 additions & 2 deletions benchmark/distributed_nonhydrostatic_model_mpi.jl
@@ -6,7 +6,7 @@ using JLD2
 using BenchmarkTools

 using Oceananigans
-using Oceananigans.Distributed
+using Oceananigans.DistributedComputations

 Logging.global_logger(OceananigansLogger())

@@ -28,7 +28,7 @@ local_rank = MPI.Comm_rank(comm)
 @info "Setting up distributed nonhydrostatic model with N=($Nx, $Ny, $Nz) grid points and ranks=($Rx, $Ry, $Rz) on rank $local_rank..."

 topo = (Periodic, Periodic, Periodic)
-arch = DistributedArch(CPU(), topology=topo, ranks=(Rx, Ry, Rz), communicator=MPI.COMM_WORLD)
+arch = Distributed(CPU(), topology=topo, ranks=(Rx, Ry, Rz), communicator=MPI.COMM_WORLD)
 distributed_grid = RectilinearGrid(arch, topology=topo, size=(Nx, Ny, Nz), extent=(1, 1, 1))
 model = NonhydrostaticModel(grid=distributed_grid)

4 changes: 2 additions & 2 deletions benchmark/distributed_shallow_water_model_mpi.jl
@@ -6,7 +6,7 @@ using JLD2
 using BenchmarkTools

 using Oceananigans
-using Oceananigans.Distributed
+using Oceananigans.DistributedComputations
 using Benchmarks

 Logging.global_logger(OceananigansLogger())
@@ -30,7 +30,7 @@ Ry = parse(Int, ARGS[4])
 @info "Setting up distributed shallow water model with N=($Nx, $Ny) grid points and ranks=($Rx, $Ry) on rank $local_rank..."

 topo = (Periodic, Periodic, Flat)
-arch = DistributedArch(CPU(), topology=topo, ranks=(Rx, Ry, 1), communicator=MPI.COMM_WORLD)
+arch = Distributed(CPU(), topology=topo, ranks=(Rx, Ry, 1), communicator=MPI.COMM_WORLD)
 distributed_grid = RectilinearGrid(arch, topology=topo, size=(Nx, Ny), extent=(1, 1))
 model = ShallowWaterModel(grid=distributed_grid, gravitational_acceleration=1.0)
 set!(model, h=1)
2 changes: 1 addition & 1 deletion docs/src/appendix/library.md
@@ -61,7 +61,7 @@ Private = false
 ## Distributed

 ```@autodocs
-Modules = [Oceananigans.Distributed]
+Modules = [Oceananigans.DistributedComputations]
 Private = false
 ```

14 changes: 5 additions & 9 deletions src/AbstractOperations/computed_field.jl
@@ -4,7 +4,7 @@

 using KernelAbstractions: @kernel, @index
 using Oceananigans.Grids: default_indices
-using Oceananigans.Fields: FieldStatus, reduced_dimensions, validate_indices, offset_compute_index
+using Oceananigans.Fields: FieldStatus, reduced_dimensions, validate_indices, offset_index
 using Oceananigans.Utils: launch!

 import Oceananigans.Fields: Field, compute!
@@ -75,17 +75,13 @@ end

 function compute_computed_field!(comp)
     arch = architecture(comp)
-    launch!(arch, comp.grid, size(comp), _compute!, comp.data, comp.operand, comp.indices)
+    parameters = KernelParameters(size(comp), map(offset_index, comp.indices))
+    launch!(arch, comp.grid, parameters, _compute!, comp.data, comp.operand)
     return comp
 end

 """Compute an `operand` and store in `data`."""
-@kernel function _compute!(data, operand, index_ranges)
+@kernel function _compute!(data, operand)
     i, j, k = @index(Global, NTuple)
-
-    i′ = offset_compute_index(index_ranges[1], i)
-    j′ = offset_compute_index(index_ranges[2], j)
-    k′ = offset_compute_index(index_ranges[3], k)
-
-    @inbounds data[i′, j′, k′] = operand[i′, j′, k′]
+    @inbounds data[i, j, k] = operand[i, j, k]
 end
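
Note on the hunk above: the index translation moves out of the kernel body and into the launch parameters. The following is a minimal plain-Julia sketch of that idea, without KernelAbstractions. The helper `range_offset` and both copy functions are hypothetical names introduced for illustration; they are not Oceananigans' `offset_index` or `KernelParameters`, whose exact behaviour is not shown in this diff.

```julia
# Hypothetical helper: how far a range is shifted relative to 1-based indexing.
# This is an illustration only, not Oceananigans' `offset_index`.
range_offset(r::UnitRange) = first(r) - 1
range_offset(::Colon) = 0

# Before: iterate over 1:size and translate every index inside the loop body.
function copy_translating_indices!(data, operand, index_ranges, sz)
    for k in 1:sz[3], j in 1:sz[2], i in 1:sz[1]
        i′ = i + range_offset(index_ranges[1])
        j′ = j + range_offset(index_ranges[2])
        k′ = k + range_offset(index_ranges[3])
        data[i′, j′, k′] = operand[i′, j′, k′]
    end
    return data
end

# After: shift the iteration space once, so the body uses i, j, k directly.
function copy_shifted_range!(data, operand, sz, offsets)
    for k in (1:sz[3]) .+ offsets[3], j in (1:sz[2]) .+ offsets[2], i in (1:sz[1]) .+ offsets[1]
        data[i, j, k] = operand[i, j, k]
    end
    return data
end

operand = reshape(collect(1.0:216.0), 6, 6, 6)
index_ranges = (2:4, 3:5, Colon())   # a windowed region of a 6×6×6 array
sz = (3, 3, 6)                       # size of that region

a = copy_translating_indices!(zeros(6, 6, 6), operand, index_ranges, sz)
b = copy_shifted_range!(zeros(6, 6, 6), operand, sz, map(range_offset, index_ranges))
```

Both functions write the same window of `operand` into `data`; the second keeps the loop body free of index arithmetic, which mirrors how `_compute!` in the hunk drops its `index_ranges` argument.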
3 changes: 1 addition & 2 deletions src/Advection/Advection.jl
@@ -54,8 +54,7 @@ abstract type AbstractUpwindBiasedAdvectionScheme{B, FT} <: AbstractAdvectionSch
 # Note that it is not possible to compile schemes for `advection_buffer = 41` or higher.
 const advection_buffers = [1, 2, 3, 4, 5, 6]

-@inline boundary_buffer(::AbstractAdvectionScheme{B}) where B = B
-@inline required_halo_size(scheme::AbstractAdvectionScheme{B}) where B = B
+@inline required_halo_size(::AbstractAdvectionScheme{B}) where B = B

 include("centered_advective_fluxes.jl")
 include("upwind_biased_advective_fluxes.jl")
2 changes: 1 addition & 1 deletion src/Advection/flat_advective_fluxes.jl
@@ -3,7 +3,7 @@
 ##### Flat Topologies
 #####

-using Oceananigans.Operators: XFlatGrid, YFlatGrid, ZFlatGrid
+using Oceananigans.Grids: XFlatGrid, YFlatGrid, ZFlatGrid

 for SchemeType in [:CenteredScheme, :UpwindScheme]
     @eval begin
22 changes: 11 additions & 11 deletions src/Advection/reconstruction_coefficients.jl
@@ -123,19 +123,19 @@ Examples
 julia> using Oceananigans.Advection: calc_reconstruction_stencil
 julia> calc_reconstruction_stencil(1, :right, :x)
-:(+(FT(coeff1_right[1]) * ψ[i + 0, j, k]))
+:(+(convert(FT, coeff1_right[1]) * ψ[i + 0, j, k]))
 julia> calc_reconstruction_stencil(1, :left, :x)
-:(+(FT(coeff1_left[1]) * ψ[i + -1, j, k]))
+:(+(convert(FT, coeff1_left[1]) * ψ[i + -1, j, k]))
 julia> calc_reconstruction_stencil(1, :symmetric, :x)
-:(FT(coeff2_symmetric[2]) * ψ[i + -1, j, k] + FT(coeff2_symmetric[1]) * ψ[i + 0, j, k])
+:(convert(FT, coeff2_symmetric[2]) * ψ[i + -1, j, k] + convert(FT, coeff2_symmetric[1]) * ψ[i + 0, j, k])
 julia> calc_reconstruction_stencil(2, :symmetric, :x)
-:(FT(coeff4_symmetric[4]) * ψ[i + -2, j, k] + FT(coeff4_symmetric[3]) * ψ[i + -1, j, k] + FT(coeff4_symmetric[2]) * ψ[i + 0, j, k] + FT(coeff4_symmetric[1]) * ψ[i + 1, j, k])
+:(convert(FT, coeff4_symmetric[4]) * ψ[i + -2, j, k] + convert(FT, coeff4_symmetric[3]) * ψ[i + -1, j, k] + convert(FT, coeff4_symmetric[2]) * ψ[i + 0, j, k] + convert(FT, coeff4_symmetric[1]) * ψ[i + 1, j, k])
 julia> calc_reconstruction_stencil(3, :left, :x)
-:(FT(coeff5_left[5]) * ψ[i + -3, j, k] + FT(coeff5_left[4]) * ψ[i + -2, j, k] + FT(coeff5_left[3]) * ψ[i + -1, j, k] + FT(coeff5_left[2]) * ψ[i + 0, j, k] + FT(coeff5_left[1]) * ψ[i + 1, j, k])
+:(convert(FT, coeff5_left[5]) * ψ[i + -3, j, k] + convert(FT, coeff5_left[4]) * ψ[i + -2, j, k] + convert(FT, coeff5_left[3]) * ψ[i + -1, j, k] + convert(FT, coeff5_left[2]) * ψ[i + 0, j, k] + convert(FT, coeff5_left[1]) * ψ[i + 1, j, k])
 ```
 """
 @inline function calc_reconstruction_stencil(buffer, shift, dir, func::Bool = false)
@@ -154,16 +154,16 @@ julia> calc_reconstruction_stencil(3, :left, :x)
 c = n - buffer - 1
 if func
 stencil_full[idx] = dir == :x ?
-:(FT($coeff[$(order - idx + 1)]) * ψ(i + $c, j, k, grid, args...)) :
+:(convert(FT, $coeff[$(order - idx + 1)]) * ψ(i + $c, j, k, grid, args...)) :
 dir == :y ?
-:(FT($coeff[$(order - idx + 1)]) * ψ(i, j + $c, k, grid, args...)) :
-:(FT($coeff[$(order - idx + 1)]) * ψ(i, j, k + $c, grid, args...))
+:(convert(FT, $coeff[$(order - idx + 1)]) * ψ(i, j + $c, k, grid, args...)) :
+:(convert(FT, $coeff[$(order - idx + 1)]) * ψ(i, j, k + $c, grid, args...))
 else
 stencil_full[idx] = dir == :x ?
-:(FT($coeff[$(order - idx + 1)]) * ψ[i + $c, j, k]) :
+:(convert(FT, $coeff[$(order - idx + 1)]) * ψ[i + $c, j, k]) :
 dir == :y ?
-:(FT($coeff[$(order - idx + 1)]) * ψ[i, j + $c, k]) :
-:(FT($coeff[$(order - idx + 1)]) * ψ[i, j, k + $c])
+:(convert(FT, $coeff[$(order - idx + 1)]) * ψ[i, j + $c, k]) :
+:(convert(FT, $coeff[$(order - idx + 1)]) * ψ[i, j, k + $c])
 end
 end
 return Expr(:call, :+, stencil_full...)
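
Note on the hunk above: the generated stencil is an `Expr` in which every coefficient is wrapped in `convert(FT, ...)` and the terms are spliced into one `+` call. A reduced one-dimensional sketch of that construction follows. The names `sketch_coeffs` and `sketch_interpolate` and the two-point coefficients are assumptions made for illustration; the real coefficient tables (for example `coeff2_symmetric`) are not reproduced here.

```julia
# Hypothetical two-point coefficients, standing in for the real coefficient tables.
sketch_coeffs = (0.5, 0.5)

# One term per stencil point; each coefficient is converted to the float type `FT`.
terms = [:(convert(FT, sketch_coeffs[$n]) * ψ[i + $(n - 2)]) for n in 1:2]

# Splice the terms into a single `+` call, as `Expr(:call, :+, ...)` does in the hunk.
stencil = Expr(:call, :+, terms...)

# Wrap the expression in a function so that `FT`, `ψ`, and `i` are bound when it runs.
@eval function sketch_interpolate(ψ, i, ::Type{FT}) where FT
    return $stencil
end

result = sketch_interpolate(Float32[1, 3], 2, Float32)
```

With `Float32` data and `FT = Float32`, the converted coefficients keep the arithmetic in single precision, so `result` is a `Float32`.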
14 changes: 7 additions & 7 deletions src/Advection/topologically_conditional_interpolation.jl
@@ -26,12 +26,12 @@ const AUGXYZ = AUG{<:Any, <:Bounded, <:Bounded, <:Bounded}
 # Left-biased buffers are smaller by one grid point on the right side; vice versa for right-biased buffers
 # Center interpolation stencil look at i + 1 (i.e., require one less point on the left)

-@inline outside_symmetric_bufferᶠ(i, N, adv) = (i >= boundary_buffer(adv) + 1) & (i <= N + 1 - boundary_buffer(adv))
-@inline outside_symmetric_bufferᶜ(i, N, adv) = (i >= boundary_buffer(adv)) & (i <= N + 1 - boundary_buffer(adv))
-@inline outside_left_biased_bufferᶠ(i, N, adv) = (i >= boundary_buffer(adv) + 1) & (i <= N + 1 - (boundary_buffer(adv) - 1))
-@inline outside_left_biased_bufferᶜ(i, N, adv) = (i >= boundary_buffer(adv)) & (i <= N + 1 - (boundary_buffer(adv) - 1))
-@inline outside_right_biased_bufferᶠ(i, N, adv) = (i >= boundary_buffer(adv)) & (i <= N + 1 - boundary_buffer(adv))
-@inline outside_right_biased_bufferᶜ(i, N, adv) = (i >= boundary_buffer(adv) - 1) & (i <= N + 1 - boundary_buffer(adv))
+@inline outside_symmetric_haloᶠ(i, N, adv) = (i >= required_halo_size(adv) + 1) & (i <= N + 1 - required_halo_size(adv))
+@inline outside_symmetric_haloᶜ(i, N, adv) = (i >= required_halo_size(adv)) & (i <= N + 1 - required_halo_size(adv))
+@inline outside_left_biased_haloᶠ(i, N, adv) = (i >= required_halo_size(adv) + 1) & (i <= N + 1 - (required_halo_size(adv) - 1))
+@inline outside_left_biased_haloᶜ(i, N, adv) = (i >= required_halo_size(adv)) & (i <= N + 1 - (required_halo_size(adv) - 1))
+@inline outside_right_biased_haloᶠ(i, N, adv) = (i >= required_halo_size(adv)) & (i <= N + 1 - required_halo_size(adv))
+@inline outside_right_biased_haloᶜ(i, N, adv) = (i >= required_halo_size(adv) - 1) & (i <= N + 1 - required_halo_size(adv))

 # Separate High order advection from low order advection
 const HOADV = Union{WENO,
@@ -60,7 +60,7 @@ for bias in (:symmetric, :left_biased, :right_biased)
 @eval @inline $alt_interp(i, j, k, grid::$GridType, scheme::LOADV, args...) = $interp(i, j, k, grid, scheme, args...)
 end

-outside_buffer = Symbol(:outside_, bias, :_buffer, loc)
+outside_buffer = Symbol(:outside_, bias, :_halo, loc)

 # Conditional high-order interpolation in Bounded directions
 if ξ == :x
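
Note on the renamed predicates in the first hunk of this file: each returns `true` for indices that lie inside the given bounds. The sketch below evaluates the two symmetric variants on a small example. As an assumption made to keep it standalone, the halo size `H` is passed directly instead of being read from a scheme with `required_halo_size(adv)`; the values `N = 8` and `H = 2` are arbitrary.

```julia
# Same inequalities as the symmetric predicates in the hunk, but taking the
# halo size `H` directly (assumption: the source obtains it from the scheme).
outside_symmetric_haloᶠ(i, N, H) = (i >= H + 1) & (i <= N + 1 - H)
outside_symmetric_haloᶜ(i, N, H) = (i >= H) & (i <= N + 1 - H)

N, H = 8, 2

# Indices in 1:N for which each predicate holds.
face_indices   = filter(i -> outside_symmetric_haloᶠ(i, N, H), 1:N)
center_indices = filter(i -> outside_symmetric_haloᶜ(i, N, H), 1:N)
```

For these values the face predicate holds for `3:7` and the center predicate for `2:7`, which matches the source comment that the center stencil requires one less point on the left.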