Some CATKE performance optimizations #3453

Draft
wants to merge 148 commits into
base: main
Changes from 146 commits
Commits
ee1b34e
full interior map
simone-silvestri Nov 26, 2023
31ba1d6
bugfix
simone-silvestri Nov 26, 2023
1403161
bugfix
simone-silvestri Nov 26, 2023
7d76203
bugfixes
simone-silvestri Nov 26, 2023
b5d6a42
hmmm
simone-silvestri Nov 26, 2023
ee62bec
disambiguate
simone-silvestri Nov 26, 2023
e880b2c
some organizing
simone-silvestri Nov 26, 2023
6f4aaad
hmmm
simone-silvestri Nov 26, 2023
f0b59a3
improve speed
simone-silvestri Nov 27, 2023
5dddbb9
now we get going
simone-silvestri Nov 28, 2023
90b0f7a
check it out
simone-silvestri Nov 28, 2023
6ce5b45
check bathymetry
simone-silvestri Nov 28, 2023
2875984
fixit
simone-silvestri Nov 28, 2023
6cd7444
rmove distributed
simone-silvestri Nov 28, 2023
8319dbc
test it like this
simone-silvestri Dec 1, 2023
b78b042
I hope it works!
simone-silvestri Dec 3, 2023
11e8143
bugfix
simone-silvestri Dec 3, 2023
2a2f772
bugfix
simone-silvestri Dec 3, 2023
d91cd93
bugfix
simone-silvestri Dec 3, 2023
8eaf808
bugfix
simone-silvestri Dec 3, 2023
023e404
couple of bugfixes
simone-silvestri Dec 3, 2023
81e70f6
bugfix
simone-silvestri Dec 3, 2023
7ab646b
bugfixes
simone-silvestri Dec 3, 2023
a6ab840
changes
simone-silvestri Dec 3, 2023
6c96131
try like this
simone-silvestri Dec 4, 2023
8057909
some tests...
simone-silvestri Dec 4, 2023
183ea90
show the coordinate
simone-silvestri Dec 4, 2023
05593e2
bugfix
simone-silvestri Dec 4, 2023
44dbee0
bugfix
simone-silvestri Dec 4, 2023
8c81e15
test this hypothesis
simone-silvestri Dec 5, 2023
3e88fb4
another test
simone-silvestri Dec 5, 2023
d74f5f5
bugfix
simone-silvestri Dec 5, 2023
25a1dbb
other bugfix
simone-silvestri Dec 5, 2023
188eedc
now we'll see...
simone-silvestri Dec 5, 2023
7d418e0
now it will work hopefully
simone-silvestri Dec 5, 2023
83b4d5b
all bugs fixed?
simone-silvestri Dec 5, 2023
7a0df19
bugfix
simone-silvestri Dec 5, 2023
2626579
remove the shows
simone-silvestri Dec 5, 2023
95e90e7
unroll the loop
simone-silvestri Dec 5, 2023
4b1f2cd
fully unrolled
simone-silvestri Dec 5, 2023
20a12d1
split explicit loop unrolling
simone-silvestri Dec 6, 2023
58e7acb
update
simone-silvestri Dec 7, 2023
217e3af
annotations
simone-silvestri Dec 7, 2023
8cf6453
using NVTX
simone-silvestri Dec 7, 2023
3cc1468
add NVTX
simone-silvestri Dec 7, 2023
1c1ff63
bugfix
simone-silvestri Dec 7, 2023
e0bedee
bugfix
simone-silvestri Dec 7, 2023
b2f92dd
utils
simone-silvestri Dec 7, 2023
00458ab
try like this
simone-silvestri Dec 8, 2023
08a86b5
text like this
simone-silvestri Dec 8, 2023
e402f5c
remove reduced fields
simone-silvestri Dec 8, 2023
6cf89bc
small test
simone-silvestri Dec 8, 2023
3769873
small change
simone-silvestri Dec 11, 2023
d60b643
nvtx on fill halos
simone-silvestri Dec 11, 2023
1b0a440
all NVTX
simone-silvestri Dec 11, 2023
6f9d400
fill it all
simone-silvestri Dec 11, 2023
ea5e56b
check it out
simone-silvestri Dec 11, 2023
47dd569
bugfixxed
simone-silvestri Dec 11, 2023
62dad92
bugfixed
simone-silvestri Dec 11, 2023
9d5ada2
bugfix
simone-silvestri Dec 12, 2023
76bfb5e
annotate the convert
simone-silvestri Dec 12, 2023
3f645ce
bugfix
simone-silvestri Dec 12, 2023
324aaef
bugfix
simone-silvestri Dec 12, 2023
67df158
add cudaconvert
simone-silvestri Dec 12, 2023
74d3bad
remove NVTX
simone-silvestri Dec 12, 2023
955d2c1
model grid
simone-silvestri Dec 12, 2023
15f60f7
try like this?
simone-silvestri Dec 13, 2023
3c8e34f
bugfix
simone-silvestri Dec 13, 2023
246c6d9
fix
simone-silvestri Dec 13, 2023
837a119
should work?
simone-silvestri Dec 13, 2023
148a2c8
add here
simone-silvestri Dec 13, 2023
0cf5c77
add here
simone-silvestri Dec 13, 2023
d1f4f83
bugfix
simone-silvestri Dec 13, 2023
41a0857
back to how it was
simone-silvestri Dec 13, 2023
ee97dde
try it like this maybe?
simone-silvestri Dec 13, 2023
6f5e6b7
convert
simone-silvestri Dec 13, 2023
acd1a54
fixxing
simone-silvestri Dec 13, 2023
ca73268
try it now?
simone-silvestri Dec 13, 2023
2d8ae26
bugfix
simone-silvestri Dec 13, 2023
5341b71
add distributed
simone-silvestri Dec 13, 2023
1ce6a5a
bugfix
simone-silvestri Dec 14, 2023
53055d2
allow unrolling
simone-silvestri Dec 14, 2023
4185152
convert in archs
simone-silvestri Dec 14, 2023
0aa5b10
bugfix
simone-silvestri Dec 14, 2023
f282bbe
Merge branch 'main' into ss/no-immersed-cells2
simone-silvestri Dec 14, 2023
7428374
just for testing
simone-silvestri Dec 18, 2023
ec911ce
Merge branch 'ss/no-immersed-cells2' of github.com:CliMA/Oceananigans…
simone-silvestri Dec 18, 2023
b48d00c
removed useless particles
simone-silvestri Dec 18, 2023
4d36cc4
removed bacthed stuff
simone-silvestri Dec 18, 2023
8842d05
tracer advetion type
simone-silvestri Dec 18, 2023
7656643
Merge branch 'main' into ss/no-immersed-cells2
simone-silvestri Dec 18, 2023
3454818
Merge branch 'main' into ss/no-immersed-cells2
navidcy Dec 29, 2023
47ab44b
bugfix
simone-silvestri Jan 8, 2024
881bdb5
bugfix
simone-silvestri Jan 8, 2024
f79a056
other bugfix
simone-silvestri Jan 8, 2024
c3a21a4
other small bugfix
simone-silvestri Jan 9, 2024
782f247
first bugfix
simone-silvestri Jan 9, 2024
7b92c64
correct error
simone-silvestri Jan 9, 2024
5e6dcb9
some bugfixes
simone-silvestri Jan 9, 2024
056def5
bugfix
simone-silvestri Jan 9, 2024
630f0fa
slightly more optim
simone-silvestri Jan 9, 2024
e70a57d
simplifying more
simone-silvestri Jan 9, 2024
c1c3101
all tests should be ok
simone-silvestri Jan 10, 2024
ff66175
try it
simone-silvestri Jan 10, 2024
469224b
correct for last time
simone-silvestri Jan 10, 2024
d09e5fe
try again
simone-silvestri Jan 10, 2024
73f8b09
fixed
simone-silvestri Jan 10, 2024
69b9b98
tests fixxed
simone-silvestri Jan 10, 2024
b42b115
finally tests fixed
simone-silvestri Jan 10, 2024
dcffb79
back to previous dt
simone-silvestri Jan 10, 2024
9801ec0
bugfix
simone-silvestri Jan 12, 2024
c64f404
tests fixed?
simone-silvestri Jan 15, 2024
359a083
ale
simone-silvestri Jan 15, 2024
f34a0e0
Merge remote-tracking branch 'origin/main' into ss/no-immersed-cells2
simone-silvestri Jan 15, 2024
72f286e
Update src/Models/HydrostaticFreeSurfaceModels/update_hydrostatic_fre…
simone-silvestri Jan 16, 2024
46ef24c
Update src/TimeSteppers/quasi_adams_bashforth_2.jl
simone-silvestri Jan 16, 2024
6b74e7a
removed NVTX
simone-silvestri Jan 16, 2024
950606f
Merge branch 'ss/no-immersed-cells2' of github.com:CliMA/Oceananigans…
simone-silvestri Jan 16, 2024
2061300
remove one line
simone-silvestri Jan 16, 2024
f46f7a9
if inside
simone-silvestri Jan 16, 2024
a276111
better comment
simone-silvestri Jan 16, 2024
4fedc37
some docstrings
simone-silvestri Jan 16, 2024
8f342fa
remove NVTX
simone-silvestri Jan 16, 2024
b7c871a
test an hypothesis
simone-silvestri Jan 23, 2024
7ff259d
test it now
simone-silvestri Jan 23, 2024
f3fa448
optimization
simone-silvestri Jan 24, 2024
0f946df
bugfixes
simone-silvestri Jan 24, 2024
27f1a28
bugfix
simone-silvestri Jan 24, 2024
0c18521
bugfixxes
simone-silvestri Jan 24, 2024
e91a0c9
adapt
simone-silvestri Jan 26, 2024
ce9e49e
Merge branch 'ss/optimize-catke' of github.com:CliMA/Oceananigans.jl …
simone-silvestri Jan 26, 2024
5fc071c
clipping to zero
simone-silvestri Feb 1, 2024
3ef7984
shear is at faces in z
simone-silvestri Feb 2, 2024
b7efdd5
code alignment
navidcy Feb 2, 2024
882706f
add capitalization
simone-silvestri Feb 2, 2024
62c5621
Merge branch 'ss/optimize-catke' of github.com:CliMA/Oceananigans.jl …
simone-silvestri Feb 2, 2024
e27728f
Update src/Advection/tracer_advection_operators.jl
simone-silvestri Feb 2, 2024
dacd623
adding a minimum dissipation length scale
simone-silvestri Feb 8, 2024
de87ac4
Merge branch 'ss/optimize-catke' of github.com:CliMA/Oceananigans.jl …
simone-silvestri Feb 8, 2024
f316157
remove zero clipping
simone-silvestri Feb 8, 2024
a394670
conditional for `ϵ == Inf`
simone-silvestri Feb 8, 2024
ac08ab5
bugfix
simone-silvestri Feb 9, 2024
b13c728
should preserve positivity
simone-silvestri Feb 12, 2024
9832f00
better comment
simone-silvestri Feb 12, 2024
93a4932
correct the sign of implicit w'b'
simone-silvestri Feb 12, 2024
3e5a0d9
bugfix
simone-silvestri Feb 12, 2024
5958cb6
back to fully implicit dissipation
simone-silvestri Feb 12, 2024
8ffbff0
Merge remote-tracking branch 'origin/main' into ss/optimize-catke
simone-silvestri Mar 1, 2024
12 changes: 6 additions & 6 deletions Manifest.toml
@@ -1,8 +1,8 @@
# This file is machine-generated - editing it directly is not advised

julia_version = "1.9.3"
julia_version = "1.9.4"
manifest_format = "2.0"
project_hash = "72ed8b1b7715053c6d7b675f75dd867b9f153685"
project_hash = "21eb6b02d2870a916430d805acf3d926ca95d5b2"

[[deps.AbstractFFTs]]
deps = ["LinearAlgebra"]
@@ -420,12 +420,12 @@ uuid = "4af54fe1-eca0-43a8-85a7-787d91b784e3"
[[deps.LibCURL]]
deps = ["LibCURL_jll", "MozillaCACerts_jll"]
uuid = "b27032c2-a3e7-50c8-80cd-2d36dbcbfd21"
version = "0.6.3"
version = "0.6.4"

[[deps.LibCURL_jll]]
deps = ["Artifacts", "LibSSH2_jll", "Libdl", "MbedTLS_jll", "Zlib_jll", "nghttp2_jll"]
uuid = "deac9b47-8bc7-5906-a0fe-35ac56dc84c0"
version = "7.84.0+0"
version = "8.4.0+0"

[[deps.LibGit2]]
deps = ["Base64", "NetworkOptions", "Printf", "SHA"]
@@ -434,7 +434,7 @@ uuid = "76f85450-5226-5b5a-8eaa-529ad045b433"
[[deps.LibSSH2_jll]]
deps = ["Artifacts", "Libdl", "MbedTLS_jll"]
uuid = "29816b5a-b9ab-546f-933c-edad1886dfa8"
version = "1.10.2+0"
version = "1.11.0+1"

[[deps.Libdl]]
uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb"
@@ -988,7 +988,7 @@ version = "5.8.0+0"
[[deps.nghttp2_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "8e850ede-7688-5339-a07c-302acd2aaf8d"
version = "1.48.0+0"
version = "1.52.0+1"

[[deps.p7zip_jll]]
deps = ["Artifacts", "Libdl"]
124 changes: 62 additions & 62 deletions ext/OceananigansEnzymeCoreExt.jl
@@ -9,7 +9,7 @@ EnzymeCore.EnzymeRules.inactive_noinl(::typeof(Oceananigans.Utils.flatten_reduce
EnzymeCore.EnzymeRules.inactive(::typeof(Oceananigans.Grids.total_size), x...) = nothing

@inline batch(::Val{1}, ::Type{T}) where T = T
@inline batch(::Val{N}, ::Type{T}) where {T,N} = NTuple{N,T}
@inline batch(::Val{N}, ::Type{T}) where {N, T} = NTuple{N, T}

function EnzymeCore.EnzymeRules.augmented_primal(config,
func::EnzymeCore.Const{Type{Field}},
@@ -18,31 +18,32 @@ function EnzymeCore.EnzymeRules.augmented_primal(config,
EnzymeCore.Duplicated{<:Tuple}},
grid::EnzymeCore.Const{<:Oceananigans.Grids.AbstractGrid},
T::EnzymeCore.Const{<:DataType}; kw...) where RT
primal = if EnzymeCore.EnzymeRules.needs_primal(config)
func.val(loc.val, grid.val, T.val; kw...)
else
nothing
end

if haskey(kw, :a)
# copy zeroing
kw[:data] = copy(kw[:data])
end

shadow = if EnzymeCore.EnzymeRules.width(config) == 1
func.val(loc.val, grid.val, T.val; kw...)
else
ntuple(Val(EnzymeCore.EnzymeRules.width(config))) do i
Base.@_inline_meta
func.val(loc.val, grid.val, T.val; kw...)
end
end

return EnzymeCore.EnzymeRules.AugmentedReturn{EnzymeCore.EnzymeRules.needs_primal(config) ? RT : Nothing, batch(Val(EnzymeCore.EnzymeRules.width(config)), RT), Nothing}(primal, shadow, nothing)

primal = if EnzymeCore.EnzymeRules.needs_primal(config)
func.val(loc.val, grid.val, T.val; kw...)
else
nothing
end

if haskey(kw, :a)
# copy zeroing
kw[:data] = copy(kw[:data])
end

shadow = if EnzymeCore.EnzymeRules.width(config) == 1
func.val(loc.val, grid.val, T.val; kw...)
else
ntuple(Val(EnzymeCore.EnzymeRules.width(config))) do i
Base.@_inline_meta
func.val(loc.val, grid.val, T.val; kw...)
end
end

return EnzymeCore.EnzymeRules.AugmentedReturn{EnzymeCore.EnzymeRules.needs_primal(config) ? RT : Nothing, batch(Val(EnzymeCore.EnzymeRules.width(config)), RT), Nothing}(primal, shadow, nothing)
end

function EnzymeCore.EnzymeRules.reverse(config::EnzymeCore.EnzymeRules.ConfigWidth{1}, func::EnzymeCore.Const{Type{Field}}, ::RT, tape, loc::Union{EnzymeCore.Const{<:Tuple}, EnzymeCore.Duplicated{<:Tuple}}, grid::EnzymeCore.Const{<:Oceananigans.Grids.AbstractGrid}, T::EnzymeCore.Const{<:DataType}; kw...) where RT
return (nothing, nothing, nothing)
return (nothing, nothing, nothing)
end


@@ -69,68 +70,67 @@ function EnzymeCore.EnzymeRules.augmented_primal(config,
offset = Oceananigans.Utils.offsets(workspec.val)

if !isnothing(only_active_cells)
workgroup, worksize = Oceananigans.Utils.active_cells_work_layout(workgroup, worksize, only_active_cells, grid.val)
workgroup, worksize = Oceananigans.Utils.active_cells_work_layout(workgroup, worksize, only_active_cells, grid.val)
offset = nothing
end

if worksize != 0

# We can only launch offset kernels with Static sizes!!!!
# We can only launch offset kernels with Static sizes!!!!

if isnothing(offset)
loop! = kernel!.val(Oceananigans.Architectures.device(arch.val), workgroup, worksize)
dloop! = (typeof(kernel!) <: EnzymeCore.Const) ? nothing : kernel!.dval(Oceananigans.Architectures.device(arch.val), workgroup, worksize)
else
loop! = kernel!.val(Oceananigans.Architectures.device(arch.val), KernelAbstractions.StaticSize(workgroup), Oceananigans.Utils.OffsetStaticSize(contiguousrange(worksize, offset)))
dloop! = (typeof(kernel!) <: EnzymeCore.Const) ? nothing : kernel!.val(Oceananigans.Architectures.device(arch.val), KernelAbstractions.StaticSize(workgroup), Oceananigans.Utils.OffsetStaticSize(contiguousrange(worksize, offset)))
end
if isnothing(offset)
loop! = kernel!.val(Oceananigans.Architectures.device(arch.val), workgroup, worksize)
dloop! = (typeof(kernel!) <: EnzymeCore.Const) ? nothing : kernel!.dval(Oceananigans.Architectures.device(arch.val), workgroup, worksize)
else
loop! = kernel!.val(Oceananigans.Architectures.device(arch.val), KernelAbstractions.StaticSize(workgroup), Oceananigans.Utils.OffsetStaticSize(contiguousrange(worksize, offset)))
dloop! = (typeof(kernel!) <: EnzymeCore.Const) ? nothing : kernel!.val(Oceananigans.Architectures.device(arch.val), KernelAbstractions.StaticSize(workgroup), Oceananigans.Utils.OffsetStaticSize(contiguousrange(worksize, offset)))
end

@debug "Launching kernel $kernel! with worksize $worksize and offsets $offset from $workspec.val"
@debug "Launching kernel $kernel! with worksize $worksize and offsets $offset from $workspec.val"

duploop = (typeof(kernel!) <: EnzymeCore.Const) ? EnzymeCore.Const(loop!) : EnzymeCore.Duplicated(loop!, dloop!)

duploop = (typeof(kernel!) <: EnzymeCore.Const) ? EnzymeCore.Const(loop!) : EnzymeCore.Duplicated(loop!, dloop!)
config2 = EnzymeCore.EnzymeRules.Config{#=needsprimal=#false, #=needsshadow=#false, #=width=#EnzymeCore.EnzymeRules.width(config), EnzymeCore.EnzymeRules.overwritten(config)[5:end]}()
subtape = EnzymeCore.EnzymeRules.augmented_primal(config2, duploop, EnzymeCore.Const{Nothing}, kernel_args...).tape

config2 = EnzymeCore.EnzymeRules.Config{#=needsprimal=#false, #=needsshadow=#false, #=width=#EnzymeCore.EnzymeRules.width(config), EnzymeCore.EnzymeRules.overwritten(config)[5:end]}()
subtape = EnzymeCore.EnzymeRules.augmented_primal(config2, duploop, EnzymeCore.Const{Nothing}, kernel_args...).tape

tape = (duploop, subtape)
tape = (duploop, subtape)
else
tape = nothing
tape = nothing
end

return EnzymeCore.EnzymeRules.AugmentedReturn{Nothing, Nothing, Any}(nothing, nothing, tape)
end

function EnzymeCore.EnzymeRules.reverse(config::EnzymeCore.EnzymeRules.ConfigWidth{1},
func::EnzymeCore.Const{typeof(Oceananigans.Utils.launch!)},
::Type{EnzymeCore.Const{Nothing}},
tape,
arch,
grid,
workspec,
kernel!,
kernel_args...;
include_right_boundaries = false,
reduced_dimensions = (),
location = nothing,
only_active_cells = nothing,
kwargs...)
func::EnzymeCore.Const{typeof(Oceananigans.Utils.launch!)},
::Type{EnzymeCore.Const{Nothing}},
tape,
arch,
grid,
workspec,
kernel!,
kernel_args...;
include_right_boundaries = false,
reduced_dimensions = (),
location = nothing,
only_active_cells = nothing,
kwargs...)

subrets = if tape !== nothing
duploop, subtape = tape
duploop, subtape = tape

config2 = EnzymeCore.EnzymeRules.Config{#=needsprimal=#false, #=needsshadow=#false, #=width=#EnzymeCore.EnzymeRules.width(config), EnzymeCore.EnzymeRules.overwritten(config)[5:end]}()
config2 = EnzymeCore.EnzymeRules.Config{#=needsprimal=#false, #=needsshadow=#false, #=width=#EnzymeCore.EnzymeRules.width(config), EnzymeCore.EnzymeRules.overwritten(config)[5:end]}()

EnzymeCore.EnzymeRules.reverse(config2, duploop, EnzymeCore.Const{Nothing}, subtape, kernel_args...)
else
ntuple(Val(length(kernel_args))) do _
Base.@_inline_meta
nothing
end
end
EnzymeCore.EnzymeRules.reverse(config2, duploop, EnzymeCore.Const{Nothing}, subtape, kernel_args...)
else
ntuple(Val(length(kernel_args))) do _
Base.@_inline_meta
nothing
end
end

return (nothing, nothing, nothing, nothing, subrets...)

end

end
end # module
34 changes: 34 additions & 0 deletions src/Advection/tracer_advection_operators.jl
@@ -1,6 +1,34 @@
using Oceananigans.Operators: Vᶜᶜᶜ
using Oceananigans.Fields: ZeroField

struct TracerAdvection{N, FT, A, B, C} <: AbstractAdvectionScheme{N, FT}
x :: A
y :: B
z :: C

TracerAdvection{N, FT}(x::A, y::B, z::C) where {N, FT, A, B, C} = new{N, FT, A, B, C}(x, y, z)
end

"""
function TracerAdvection(; x, y, z)

Builds a `TracerAdvection` type with reconstruction schemes in `x`, `y`, and `z`.
"""
function TracerAdvection(; x, y, z)
Nx = required_halo_size(x)
Ny = required_halo_size(y)
Nz = required_halo_size(z)

FT = eltype(x)

return TracerAdvection{max(Nx, Ny, Nz), FT}(x, y, z)
end

Adapt.adapt_structure(to, scheme::TracerAdvection{N, FT}) where {N, FT} =
TracerAdvection{N, FT}(Adapt.adapt(to, scheme.x),
Adapt.adapt(to, scheme.y),
Adapt.adapt(to, scheme.z))
Member

What does this stuff have to do with CATKE performance optimization?

Collaborator Author

That code comes from another PR, #3404, which we have to merge before this one.


@inline _advective_tracer_flux_x(args...) = advective_tracer_flux_x(args...)
@inline _advective_tracer_flux_y(args...) = advective_tracer_flux_y(args...)
@inline _advective_tracer_flux_z(args...) = advective_tracer_flux_z(args...)
@@ -32,3 +60,9 @@ which ends up at the location `ccc`.
δyᵃᶜᵃ(i, j, k, grid, _advective_tracer_flux_y, advection, U.v, c) +
δzᵃᵃᶜ(i, j, k, grid, _advective_tracer_flux_z, advection, U.w, c))
end

@inline function div_Uc(i, j, k, grid, advection::TracerAdvection, U, c)
return 1/Vᶜᶜᶜ(i, j, k, grid) * (δxᶜᵃᵃ(i, j, k, grid, _advective_tracer_flux_x, advection.x, U.u, c) +
δyᵃᶜᵃ(i, j, k, grid, _advective_tracer_flux_y, advection.y, U.v, c) +
δzᵃᵃᶜ(i, j, k, grid, _advective_tracer_flux_z, advection.z, U.w, c))
end
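
The `TracerAdvection` struct added in this file stores one reconstruction scheme per direction and takes its halo parameter `N` from the largest of the three. The following is a minimal, self-contained sketch of that pattern, not the PR's code: `ToyScheme`, `toy_halo_size`, `ToyTracerAdvection`, and `halo` are hypothetical stand-ins (the real code uses Oceananigans' scheme types and `required_halo_size`), and storing the halo as a type parameter of the scheme is an assumption made only for illustration.

```julia
# Hypothetical stand-in for an advection scheme with halo size H and float type FT.
struct ToyScheme{H, FT} end

ToyScheme(H::Int, FT::DataType = Float64) = ToyScheme{H, FT}()

# Stand-in for `required_halo_size` (assumption: the halo is a type parameter).
toy_halo_size(::ToyScheme{H}) where H = H
Base.eltype(::ToyScheme{H, FT}) where {H, FT} = FT

# One scheme per direction; N is the largest halo that any direction needs.
struct ToyTracerAdvection{N, FT, A, B, C}
    x :: A
    y :: B
    z :: C

    ToyTracerAdvection{N, FT}(x::A, y::B, z::C) where {N, FT, A, B, C} =
        new{N, FT, A, B, C}(x, y, z)
end

# Keyword constructor: compute N and FT from the three schemes.
function ToyTracerAdvection(; x, y, z)
    N  = max(toy_halo_size(x), toy_halo_size(y), toy_halo_size(z))
    FT = eltype(x)
    return ToyTracerAdvection{N, FT}(x, y, z)
end

# Read the halo size back from the type parameter.
halo(::ToyTracerAdvection{N}) where N = N

# Wider stencils in the horizontal, a narrower one in the vertical.
advection = ToyTracerAdvection(x = ToyScheme(4), y = ToyScheme(4), z = ToyScheme(2))
```

Keeping `N` in the type lets code that sizes halos dispatch on the wrapper exactly as it would on a single scheme, while `div_Uc` picks `advection.x`, `advection.y`, or `advection.z` per direction.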
44 changes: 20 additions & 24 deletions src/Advection/vector_invariant_advection.jl
@@ -111,9 +111,9 @@ Vector Invariant, Dimension-by-dimension reconstruction
function VectorInvariant(; vorticity_scheme = EnstrophyConserving(),
vorticity_stencil = VelocityStencil(),
vertical_scheme = EnergyConserving(),
kinetic_energy_gradient_scheme = vertical_scheme,
divergence_scheme = vertical_scheme,
upwinding = OnlySelfUpwinding(; cross_scheme = vertical_scheme),
kinetic_energy_gradient_scheme = divergence_scheme,
upwinding = OnlySelfUpwinding(; cross_scheme = divergence_scheme),
multi_dimensional_stencil = false)

N = required_halo_size(vorticity_scheme)
@@ -132,7 +132,6 @@ end
const MultiDimensionalVectorInvariant = VectorInvariant{<:Any, <:Any, true}

# VectorInvariant{N, FT, M, Z (vorticity scheme)
const MultiDimensionalVectorInvariant = VectorInvariant{<:Any, <:Any, true}
const VectorInvariantEnergyConserving = VectorInvariant{<:Any, <:Any, <:Any, <:EnergyConserving}
const VectorInvariantEnstrophyConserving = VectorInvariant{<:Any, <:Any, <:Any, <:EnstrophyConserving}
const VectorInvariantUpwindVorticity = VectorInvariant{<:Any, <:Any, <:Any, <:AbstractUpwindBiasedAdvectionScheme}
@@ -145,10 +144,10 @@ const VectorInvariantKEGradientEnergyConserving = VectorInvariant{<:Any, <:Any,
const VectorInvariantKineticEnergyUpwinding = VectorInvariant{<:Any, <:Any, <:Any, <:Any, <:Any, <:Any, <:AbstractUpwindBiasedAdvectionScheme}


# VectorInvariant{N, FT, M, Z, ZS, V, K, D, U (upwinding)
const VectorInvariantCrossVerticalUpwinding = VectorInvariant{<:Any, <:Any, <:Any, <:Any, <:Any, <:AbstractUpwindBiasedAdvectionScheme, <:Any, <:Any, <:CrossAndSelfUpwinding}
const VectorInvariantSelfVerticalUpwinding = VectorInvariant{<:Any, <:Any, <:Any, <:Any, <:Any, <:AbstractUpwindBiasedAdvectionScheme, <:Any, <:Any, <:OnlySelfUpwinding}
const VectorInvariantVelocityVerticalUpwinding = VectorInvariant{<:Any, <:Any, <:Any, <:Any, <:Any, <:AbstractUpwindBiasedAdvectionScheme, <:Any, <:Any, <:VelocityUpwinding}
# VectorInvariant{N, FT, M, Z, ZS, V, K, D, U (upwinding)
const VectorInvariantCrossVerticalUpwinding = VectorInvariant{<:Any, <:Any, <:Any, <:Any, <:Any, <:Any, <:Any, <:AbstractUpwindBiasedAdvectionScheme, <:CrossAndSelfUpwinding}
const VectorInvariantSelfVerticalUpwinding = VectorInvariant{<:Any, <:Any, <:Any, <:Any, <:Any, <:Any, <:Any, <:AbstractUpwindBiasedAdvectionScheme, <:OnlySelfUpwinding}
const VectorInvariantVelocityVerticalUpwinding = VectorInvariant{<:Any, <:Any, <:Any, <:Any, <:Any, <:Any, <:Any, <:AbstractUpwindBiasedAdvectionScheme, <:VelocityUpwinding}

Base.summary(a::VectorInvariant) = string("Vector Invariant, Dimension-by-dimension reconstruction")
Base.summary(a::MultiDimensionalVectorInvariant) = string("Vector Invariant, Multidimensional reconstruction")
@@ -167,10 +166,7 @@ Base.show(io::IO, a::VectorInvariant{N, FT}) where {N, FT} =
##### Convenience for WENO Vector Invariant
#####

# VectorInvariant{N, FT, M, Z (vorticity scheme), ZS, V (vertical scheme), K (kinetic energy gradient scheme)
const WENOVectorInvariant = VectorInvariant{<:Any, <:Any, <:Any, <:WENO, <:Any, <:WENO, <:WENO}

nothing_to_default(user_value, default) = isnothing(user_value) ? default : user_value
nothing_to_default(user_value; default) = isnothing(user_value) ? default : user_value

"""
function WENOVectorInvariant(; upwinding = nothing,
@@ -189,23 +185,23 @@ function WENOVectorInvariant(; upwinding = nothing,
weno_kw...)

if isnothing(order) # apply global defaults
vorticity_order = nothing_to_default(vorticity_order, default=9)
vertical_order = nothing_to_default(vertical_order, default=5)
divergence_order = nothing_to_default(divergence_order, default=5)
kinetic_energy_gradient_order = nothing_to_default(kinetic_energy_gradient_order, default=5)
vorticity_order = nothing_to_default(vorticity_order, default = 9)
vertical_order = nothing_to_default(vertical_order, default = 5)
divergence_order = nothing_to_default(divergence_order, default = 5)
kinetic_energy_gradient_order = nothing_to_default(kinetic_energy_gradient_order, default = 5)
else # apply user supplied `order` unless overridden by more specific value
vorticity_order = nothing_to_default(vorticity_order, default=order)
vertical_order = nothing_to_default(vertical_order, default=order)
divergence_order = nothing_to_default(divergence_order, default=order)
kinetic_energy_gradient_order = nothing_to_default(kinetic_energy_gradient_order, default=order)
vorticity_order = nothing_to_default(vorticity_order, default = order)
vertical_order = nothing_to_default(vertical_order, default = order)
divergence_order = nothing_to_default(divergence_order, default = order)
kinetic_energy_gradient_order = nothing_to_default(kinetic_energy_gradient_order, default = order)
end

vorticity_scheme = WENO(; order=vorticity_order, weno_kw...)
vertical_scheme = WENO(; order=vertical_order, weno_kw...)
kinetic_energy_gradient_scheme = WENO(; order=kinetic_energy_gradient_order, weno_kw...)
divergence_scheme = WENO(; order=divergence_order, weno_kw...)
vorticity_scheme = WENO(; order = vorticity_order, weno_kw...)
vertical_scheme = WENO(; order = vertical_order, weno_kw...)
kinetic_energy_gradient_scheme = WENO(; order = kinetic_energy_gradient_order, weno_kw...)
divergence_scheme = WENO(; order = divergence_order, weno_kw...)

default_upwinding = OnlySelfUpwinding(cross_scheme=divergence_scheme)
default_upwinding = OnlySelfUpwinding(cross_scheme = divergence_scheme)
upwinding = nothing_to_default(upwinding; default = default_upwinding)

schemes = (vorticity_scheme, vertical_scheme, kinetic_energy_gradient_scheme, divergence_scheme)
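
The order selection in this hunk resolves each order in three steps: a specific keyword wins, otherwise the global `order` applies, otherwise a built-in default is used (9 for vorticity, 5 for the others). The calls pass `default` by keyword, which matches the `; default` form of `nothing_to_default` shown in this diff. A minimal sketch of that resolution follows; `resolve_orders` is a hypothetical helper written for illustration and covers only two of the four orders.

```julia
# Same one-liner as in the diff: fall back to `default` when the user passed `nothing`.
nothing_to_default(user_value; default) = isnothing(user_value) ? default : user_value

# Hypothetical helper mirroring the branch structure of `WENOVectorInvariant`.
function resolve_orders(; order = nothing, vorticity_order = nothing, vertical_order = nothing)
    if isnothing(order) # apply global defaults
        vorticity_order = nothing_to_default(vorticity_order, default = 9)
        vertical_order  = nothing_to_default(vertical_order,  default = 5)
    else # apply `order` unless a more specific value overrides it
        vorticity_order = nothing_to_default(vorticity_order, default = order)
        vertical_order  = nothing_to_default(vertical_order,  default = order)
    end

    return (; vorticity_order, vertical_order)
end

resolve_orders()                               # (vorticity_order = 9, vertical_order = 5)
resolve_orders(order = 7)                      # (vorticity_order = 7, vertical_order = 7)
resolve_orders(order = 7, vertical_order = 3)  # (vorticity_order = 7, vertical_order = 3)
```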
7 changes: 7 additions & 0 deletions src/Architectures.jl
@@ -112,4 +112,11 @@ end
@inline unsafe_free!(a::CuArray) = CUDA.unsafe_free!(a)
@inline unsafe_free!(a) = nothing

# Convert arguments to GPU-compatible types

@inline convert_args(::CPU, args) = args
@inline convert_args(::GPU, args) = CUDA.cudaconvert(args)
@inline convert_args(::GPU, args::Tuple) = map(CUDA.cudaconvert, args)


end # module
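
The `convert_args` methods added above dispatch on the architecture: the CPU method returns its arguments untouched, the GPU method converts a single argument, and the tuple method converts each element. A rough sketch of the same dispatch pattern that runs without a GPU follows. Everything in it is a stand-in: `ToyCPU`, `ToyGPU`, `DeviceReady`, `toy_device_convert`, and `toy_convert_args` are hypothetical names, and `toy_device_convert` merely wraps a value where the real code calls `CUDA.cudaconvert`.

```julia
# Hypothetical architecture types standing in for Oceananigans' `CPU` and `GPU`.
abstract type ToyArchitecture end
struct ToyCPU <: ToyArchitecture end
struct ToyGPU <: ToyArchitecture end

# Stand-in for `CUDA.cudaconvert` (assumption): tag the value as device-ready.
struct DeviceReady{T}
    value :: T
end

toy_device_convert(x) = DeviceReady(x)

# Same three-method layout as `convert_args` in the diff.
@inline toy_convert_args(::ToyCPU, args) = args
@inline toy_convert_args(::ToyGPU, args) = toy_device_convert(args)
@inline toy_convert_args(::ToyGPU, args::Tuple) = map(toy_device_convert, args)

cpu_args = toy_convert_args(ToyCPU(), (1, 2.0)) # returned unchanged
gpu_args = toy_convert_args(ToyGPU(), (1, 2.0)) # each element converted
single   = toy_convert_args(ToyGPU(), 3)        # single argument converted
```

The tuple method is more specific than the generic GPU method, so a tuple of kernel arguments is converted element by element rather than as one object.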
2 changes: 1 addition & 1 deletion src/BoundaryConditions/fill_halo_regions.jl
@@ -50,7 +50,7 @@ function fill_halo_regions!(c::MaybeTupledData, boundary_conditions, indices, lo

arch = architecture(grid)

fill_halos!, bcs = permute_boundary_conditions(boundary_conditions)
fill_halos!, bcs = permute_boundary_conditions(boundary_conditions)
number_of_tasks = length(fill_halos!)

# Fill halo in the three permuted directions (1, 2, and 3), making sure dependencies are fulfilled