
AbstractOperations for diagnostics (and beyond...) #463

Merged: 144 commits into master on Oct 24, 2019

Conversation

@glwagner (Member) commented Oct 12, 2019

Introduction

This PR introduces AbstractOperations for constructing 3D expressions with a friendly syntax (e.g., without writing kernels), which can then be evaluated online during a simulation for diagnostics / output purposes.

This PR introduces four kinds of AbstractOperations:

  1. Derivative (things like ∂x, ∂y)
  2. BinaryOperation (things like a-b, a^b)
  3. UnaryOperation (things like sqrt, sin, cos)
  4. PolynaryOperation (things like a + b + c + d or a * b * c * d)

All of these act on combinations of Field, Function, and Number.

AbstractOperations arise from what appears to be ordinary arithmetic and calculus performed on Fields. Consider this example:

julia> using Oceananigans; using Oceananigans: Face, Cell; using Oceananigans.AbstractOperations

julia> grid = RegularCartesianGrid((16, 16, 16), (16, 16, 16));

julia> a = Field(Cell, Cell, Cell, CPU(), grid);

julia> b = Field(Cell, Cell, Cell, CPU(), grid);

julia> a_times_b = a * b;

julia> typeof(a_times_b).name.wrapper
Oceananigans.AbstractOperations.BinaryOperation

Here, the object a_times_b has the type BinaryOperation. Crucially, a_times_b can be indexed into like an array or field:

julia> a_times_b[8, 8, 8]
0.0

julia> set!(a, rand(size(grid)...));

julia> set!(b, rand(size(grid)...));

julia> a_times_b[8, 8, 8]
0.5342645056606357

Staggered grids, interpolation, and operation "location"

Operations can also be defined between fields at different locations in the staggered grid:

julia> u, v, w = Oceananigans.VelocityFields(CPU(), grid);

julia> u_times_v = u * v;

julia> Oceananigans.location(u_times_v)
(Face, Cell, Cell)

Note that the location of an operation defaults to the location of the first field involved.

This submodule also exports a macro @at for specifying the location of an operation. Therefore we can write

julia> kinetic_energy = @at (Cell, Cell, Cell) (u^2 + v^2 + w^2) / 2;

julia> Oceananigans.location(kinetic_energy)
(Cell, Cell, Cell)

kinetic_energy has a deeply nested structure that is difficult to interrogate. One speculative item to add to this PR is therefore a

  • better show method for Operations, using some kind of tree visualization?

For the purposes of this PR description, we hack together a visualization of the structure of kinetic_energy:

julia> (kinetic_energy.op, typeof(kinetic_energy.a).name.wrapper, kinetic_energy.b)
(/, Oceananigans.AbstractOperations.PolynaryOperation, 2)

We see that kinetic_energy, at the top level, is a BinaryOperation between a PolynaryOperation and an integer (2), involving division / (which is indeed what we wrote).

Delving into the PolynaryOperation one level down,

julia> kinetic_energy.a.op
+ (generic function with 185 methods)

julia> names = [typeof(a).name.wrapper for a in kinetic_energy.a.args]
3-element Array{UnionAll,1}:
 Oceananigans.AbstractOperations.BinaryOperation
 Oceananigans.AbstractOperations.BinaryOperation
 Oceananigans.AbstractOperations.BinaryOperation

we find that it consists of a sum of three BinaryOperations. Finally, we see that each BinaryOperation,

julia> names = [(a.op, typeof(a.a).name.wrapper, a.b) for a in kinetic_energy.a.args]
3-element Array{Tuple{typeof(^),UnionAll,Int64},1}:
 (^, OffsetArrays.OffsetArray, 2)
 (^, OffsetArrays.OffsetArray, 2)
 (^, OffsetArrays.OffsetArray, 2)

involves taking an OffsetArray (each of which holds the underlying data of u, v, w) to the power 2. As before, we index into kinetic_energy the same way we index into other fields to obtain its data:

julia> noise(x, y, z) = rand()

julia> [set!(ϕ, noise) for ϕ in (u, v, w)];

julia> kinetic_energy[8, 8, 8]
0.789860912635921
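
Returning to the show-method bullet above, here is a minimal sketch of such a tree visualization. It uses only the fields exercised in this walkthrough (op, a, b on BinaryOperation; op, args on PolynaryOperation); show_tree is a hypothetical helper, not part of this PR, and the import path is an assumption:

using Oceananigans.AbstractOperations: BinaryOperation, PolynaryOperation

# Leaves (numbers, OffsetArrays, fields): print a one-line summary.
show_tree(x, depth=0) = println("  "^depth, summary(x))

# Binary nodes: print the operator, then recurse into both operands.
function show_tree(b::BinaryOperation, depth=0)
    println("  "^depth, b.op)
    show_tree(b.a, depth + 1)
    show_tree(b.b, depth + 1)
end

# Polynary nodes: print the operator, then recurse into every argument.
function show_tree(p::PolynaryOperation, depth=0)
    println("  "^depth, p.op)
    foreach(arg -> show_tree(arg, depth + 1), p.args)
end

Applied to kinetic_energy, this would print / at the root, a + node beneath it, and three ^ nodes over the OffsetArray leaves: exactly the structure uncovered by hand above.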

Special considerations

There are a few special rules to how operations are handled:

  • A BinaryOperation between two fields at the same location is always performed at their common location;

  • a BinaryOperation between a field and a number always takes place at the location of the field.

These special rules override the specification of operator location via @at. Thus, for example, in the operation

uv = @at (Cell, Cell, Cell) u * v + v^2

the product u*v is computed at the cell center, while v^2 is computed at the v-point (Cell, Face, Cell) and afterwards interpolated to cell centers. This functionality is achieved by endowing BinaryOperation with three interpolation operators: two applied to each field prior to the binary operation, and one applied to the result. Special cases can then be handled by defining operators for cases in which a computation should be performed at the location of one or both of the fields.
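
To illustrate the second rule with the velocity fields defined above (output inferred from the stated rules, not captured from a live session):

julia> Oceananigans.location(v)
(Cell, Face, Cell)

julia> Oceananigans.location(v^2)  # field-number rule: the location of v
(Cell, Face, Cell)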

Furthermore, we do not provide PolynaryOperations with default locations. In other words, a PolynaryOperation arises only when a location is specified, e.g.:

julia> op1 = u + v + w;

julia> op2 = @at (Cell, Cell, Cell) u + v + w;

julia> typeof(op1).name.wrapper
Oceananigans.AbstractOperations.BinaryOperation

julia> typeof(op2).name.wrapper
Oceananigans.AbstractOperations.PolynaryOperation

This also means that all fields in a PolynaryOperation are interpolated prior to computation.

Computations

This PR also adds a type for making "computations" with an AbstractOperation. We call this a Computation. From the tests:

T, S = model.tracers
computation = Computation(S + T, model.pressures.pHY′)
compute!(computation)

which launches a 3D kernel to compute the result of an AbstractOperation at every grid point, storing the result in computation.result (which here has been defined as model.pressures.pHY′).
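
Conceptually, compute! does something like the following CPU sketch (compute_sketch! is illustrative only; the actual implementation launches an architecture-aware 3D kernel):

# Evaluate an AbstractOperation at every interior grid point and store the
# result, relying on the indexability of operations demonstrated earlier.
function compute_sketch!(result, operation, grid)
    for k in 1:grid.Nz, j in 1:grid.Ny, i in 1:grid.Nx
        result[i, j, k] = operation[i, j, k]
    end
    return result
end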

We also extend HorizontalAverage to work with Computation (and add a constructor that builds a Computation from an AbstractOperation and a Model), e.g.

T, S = model.tracers
ST = HorizontalAverage(S + T, model)
computed_profile = ST(model)

Calling run_diagnostic on a HorizontalAverage{<:Computation} first calls compute!(computation), and then computes the horizontal average of computation.result.

Some important details

  • This PR changes the meaning of data, which is used extensively in forming operations. This resolves #454 ("Behavior of data is inconsistent with behavior of Base.parent"). A new function interior serves the original purpose of data.

  • This PR depends on the arbitrary tracers PR. The arbitrary tracers PR should be merged before this one.

  • To permit mixing Functions in with Fields and Numbers in AbstractOperations, we also define a FunctionField type that acts like a field, but invokes a function to compute values at its location under the hood. This type probably needs more tests. We should also investigate whether this type can be used to specify things like background fields. A FunctionField can either be time-dependent and a function of x, y, z, t (achieved by allowing it to possess a reference to clock), or a static function of x, y, z.

  • This PR renames the result of a horizontal average to HorizontalAverage.result to provide a common terminology with Computation, as well as with future reductions along other dimensions (previously, the result of a horizontal average was called profile).

  • This PR adapts Field to work on the GPU (after the hard work of adapting all the AbstractOperations to work on the GPU, this seemed trivial). That this works now should be tested. If it does indeed work and there is no loss of performance, we can eliminate a lot of cruft from our time_step!, and also use fields directly in AbstractOperations (right now we extract the underlying OffsetArray instead).

In summary

Miraculously, tests pass on the GPU. However, this framework is quite general and powerful, so we need to

  • think carefully about the tests we need (and don't need).

And certainly before merging we also need

  • docstrings...

Also, it'd be nice to

  • use AbstractOperation to define useful output in an example.

Some discussion may be warranted about what's exported from AbstractOperations; we almost always want to have Face and Cell, for example. Ultimately, I think it'd be nice to write using Oceananigans.Fields, Oceananigans.AbstractOperations to get what's needed for this purpose. But that's yet another future PR.

I hope this sparks some discussion about the future of the Field abstraction as well. I think it has the potential to be quite powerful.

Resolves #454
Resolves #428

@ali-ramadhan (Member) left a comment:

I don't fully understand or appreciate the AbstractOperations module yet but I'm hoping to play around and experiment with it over the next few weeks and find out what I like and if I find anything that could be improved.

Definitely poised to become a killer feature I think!

More or less happy to merge this as-is because every other PR will introduce merge conflicts at this point, and it would be good to keep things moving. I think the best way to review a feature this big is to just start using it. We should probably tag v0.14.0 once this is merged.

Most of my comments were related to adding more explanation in the form of comments and docstrings so that mere mortals can understand the code.

The PR introduction you wrote up could serve quite well as a first stab at documenting this feature.

The two_dimensional_turbulence.jl example is great at showcasing abstract operations!

Pedantic comment but apparently the "Wikipedia sanctioned" name for n-ary operations might be polyadic, multary, or multiary: https://en.wikipedia.org/wiki/Arity#Other_names

Questions:

  • Some of the abstract operations tests are CPU-only (the non-computation ones?). Would it make sense to run them on the GPU as well?
  • It seems that model.pressures.pHY′ has taken on the role of being the temporary array. Can't think of a great way of doing temporary arrays without using up extra memory but maybe it makes sense to define a function so it's easy to change the temporary array?
temporary_array(model) = model.pressures.pHY′

Some ideas for next steps:

  • Performance benchmarking (will likely be quite extensive to cover many possible use cases).
  • Will probably need quite extensive documentation so that users can make full use of abstract operations.

Note: I made comments along the way with the understanding that I might answer my own question by the time I finish reviewing this PR. I'll make a second pass through my comments and resolve the ones that I ended up answering.

function interpolation_operator(from, to)
    x, y, z = (interpolation_code(X, Y) for (X, Y) in zip(from, to))

    if all(ξ === :a for ξ in (x, y, z))
@ali-ramadhan (Member): Just curious why === instead of just == here?

@glwagner (Member, Author): I guess I figure if === works then it's best because it's more explicit?

@glwagner (Member, Author): Note that === is the fallback for == when == is not implemented (presumably == is mostly needed for numeric comparisons?)

@glwagner (Member, Author): Edit: == is also used for strings.
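
For illustration, standard Julia semantics (independent of this PR):

julia> :a === :a     # Symbols compare by identity
true

julia> 1 == 1.0      # == implements value comparison across numeric types
true

julia> 1 === 1.0     # === distinguishes types and bit patterns
false

julia> "ab" == "ab"  # == compares string contents
true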

@@ -136,11 +136,12 @@ Abstract supertype for fields stored on an architecture `A` and defined on a gri
abstract type AbstractField{A, G} end
@ali-ramadhan (Member): Is AbstractField{A, G} still used or should we just define AbstractLocatedField{X, Y, Z, A, G}?

@glwagner (Member, Author):
This is a design question. At the moment we do not need both because all subtypes of AbstractField are also subtypes of AbstractLocatedField. Thus functionally we could nuke AbstractLocatedField and change the definition of AbstractField to carry around X, Y, Z.

The only concern is that at some point in the future we may actually want the concept of an "unlocated" field; in that case we would regret such a change. Some "fields", like a constant value, may not naturally have a "location". On the other hand, the field arithmetic introduced in this PR requires a field to have a location; thus even a constant field would have to define a nominal location to participate in abstract operations, even though that location would have no effect on operations.

In summary: probably ok to delete AbstractLocatedField. But maybe save for future PR?

@ali-ramadhan (Member): I guess it's only a one-liner so it seems fine to keep it the way it is right now.

I agree maybe there will be a use for it in the future (although that's probably not a good reason for keeping around unused code).

@glwagner (Member, Author):
Such defensive code design is not altogether terrible. It's used in GPUifyLoops for example (the abstract type GPU with one subtype, CUDA).

We do actually utilize AbstractField to make dispatching on A a bit easier, since getting to A in AbstractLocatedField requires wading through X, Y, Z.
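
A sketch of the hierarchy and the dispatch convenience described here (parameter names from the thread; on_architecture is a hypothetical helper, not part of the PR):

abstract type AbstractField{A, G} end
abstract type AbstractLocatedField{X, Y, Z, A, G} <: AbstractField{A, G} end

# Dispatching on the architecture A is direct via AbstractField,
# since the partially applied AbstractField{A} matches any grid type G...
on_architecture(::AbstractField{A}) where A = A

# ...whereas via AbstractLocatedField we must first wade through X, Y, Z:
# on_architecture(::AbstractLocatedField{X, Y, Z, A}) where {X, Y, Z, A} = A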

@glwagner (Member, Author) commented Oct 24, 2019:

> Just curious if DataAPI, Formatting, SortingAlgorithms, StatsBase, and UnicodePlots are required for this PR? They got added to Manifest.toml but no new dependencies were added to Project.toml.

No. I have no idea why that happened. What can we do to remove them?

@glwagner (Member, Author):

> Pedantic comment but apparently the "Wikipedia sanctioned" name for n-ary operations might be polyadic, multary, or multiary: https://en.wikipedia.org/wiki/Arity#Other_names

I didn't spend enough time researching! Ok, let's change polynary to multiary.

@glwagner (Member, Author):

> Some of the abstract operations tests are CPU-only (the non-computation ones?). Would it make sense to run them on the GPU as well?

We can. Such a test may end up running on the CPU via scalar operations though... ?

> It seems that model.pressures.pHY′ has taken on the role of being the temporary array. Can't think of a great way of doing temporary arrays without using up extra memory but maybe it makes sense to define a function so it's easy to change the temporary array?

That's a good idea for sure to permit dispatch on various type parameters of model.

Note that Computation allows the user to specify their own temporary array. model.pressures.pHY′ is used as a default when model is passed to Computation in place of an array or field.

> Performance benchmarking (will likely be quite extensive to cover many possible use cases).

I think just a few will suffice for shallow and deep operations trees, perhaps choosing common use cases to ensure that using abstract operations rather than hard-coded kernels doesn't result in a big performance hit. It will be hard to interpret the results of a benchmark on a deep tree anyways, because we won't have an alternate implementation to compare against. Future performance optimization could use some kind of tree analysis utility + shared memory to accelerate kernels.

@glwagner (Member, Author):

> Will probably need quite extensive documentation so that users can make full use of abstract operations.

Why extensive? I'm just not sure what to write: the rules for how things work are already all there in the docstrings. Maybe examples are what's needed?

@ali-ramadhan (Member) commented Oct 24, 2019:

> No. I have no idea why that happened. What can we do to remove them?

Looks like you already did. I would usually try something like ]update Printf which should remove orphaned packages from Manifest.toml.

> We can. Such a test may end up running on the CPU via scalar operations though... ?

Ah ok, maybe not a great idea then as it would slow testing down. I guess the computations are tested on the GPU which is good enough.

> Note that Computation allows the user to specify their own temporary array. model.pressures.pHY′ is used as a default when model is passed to Computation in place of an array or field.

Ah nice. I guess I was thinking in case model.pressures.pHY′ disappears one day.

> I think just a few will suffice for shallow and deep operations trees, perhaps choosing common use cases to ensure that using abstract operations rather than hard-coded kernels doesn't result in a big performance hit. It will be hard to interpret the results of a benchmark on a deep tree anyways, because we won't have an alternate implementation to compare against. Future performance optimization could use some kind of tree analysis utility + shared memory to accelerate kernels.

Hmmm, I was thinking it would be good to benchmark each operator at least once but I suppose if sin is fast then we can assume cos and tanh will also be fast.

Shallow and deep trees makes sense. True we may not have an alternative implementation but we can compare the deep and shallow tree computations to get an idea. I find comparing the computation time to the time per iteration (~30 ms for 256^3) to be helpful.

> Why extensive? I'm just not sure what to write: the rules for how things work are already all there in the docstrings. Maybe examples are what's needed?

Didn't mean to suggest that we need extensive documentation right away. Having examples of what's possible will be really useful, but we can build up a collection of good examples over time.

PS: Think you missed half my comments as GitHub collapsed and hid them, but it was mostly minor comments anyways.

@ali-ramadhan (Member) commented Oct 24, 2019:

PS: This PR also resolves #12, the oldest open issue! Also works towards #259.

@glwagner (Member, Author):

> Hmmm, I was thinking it would be good to benchmark each operator at least once but I suppose if sin is fast then we can assume cos and tanh will also be fast.

I think the proper benchmark is to compare a unary operation via one function of our choice (say, sin) and a hand-coded benchmark for the same function. We don't benchmark additional functionality of AbstractOperations by testing multiple unary functions; those would just benchmark the performance of the unary function itself, which we don't care about (because we are not going to improve the performance of Base.sin or CUDAnative.sin.)
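
A self-contained toy version of that comparison, with plain arrays standing in for fields (not the actual Oceananigans benchmark; assumes BenchmarkTools is installed):

using BenchmarkTools

N = 16
a = rand(N, N, N)
out = similar(a)

# Hand-coded reference kernel for the unary operation sin.
function hand_coded_sin!(out, a)
    @inbounds for i in eachindex(a)
        out[i] = sin(a[i])
    end
    return out
end

@btime hand_coded_sin!($out, $a);  # the hand-coded standard
@btime $out .= sin.($a);           # the expression-based equivalent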

@glwagner (Member, Author):

> Shallow and deep trees makes sense. True we may not have an alternative implementation but we can compare the deep and shallow tree computations to get an idea.

When I said "we don't have an alternative implementation" I should have actually said that for a "deep" tree, an alternative implementation to an abstract operation would be difficult to code. We definitely do have alternative implementations to abstract operations --- we can simply hand-code kernels as we have done many times in the past. The hand-coded kernel is the standard against which we should compare an abstractly-defined kernel.

@glwagner (Member, Author) commented Oct 24, 2019:

> Didn't mean to suggest that we need extensive documentation right away. Having examples of what's possible will be really useful, but we can build up a collection of good examples over time.

The main confusion about abstract operations is not actually the abstract operations themselves, but the way that Julia parses infix operators. Would it be clearer if I told you that you can construct an operation tree by writing a mess of nested functions like

kinetic_energy = /(+(^(u, 2), ^(v, 2), ^(w, 2)), 2)

This is what happens when you write kinetic_energy = (u^2 + v^2 + w^2) / 2 --- the infix operations +, /, and ^ parse that way.

We can document this of course but this is not our code. This is base julia.

I understand that it can be mysterious if you are not completely comfortable with how infix operators translate to ordinary function calls. Thus we can demonstrate usage through examples? But there's not much documentation to write. We've defined a function +(a, b) that acts on a and b. We've defined ^(u, 2), or ^(a, b). We compose operators by nesting them inside one another (a binary operator can be defined between two fields, or between a field and a number). A unary function has one argument, a binary function has two, and a multiary function has two or more. The user can define an infinite number of their own binary, unary, and multiary functions. If those user-defined functions are also infix operators, they will have the same interesting infix-parsing behavior as many of the operators we pre-define, as the parsing example below shows.
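
For instance, standard Julia parsing (shown outside the package) makes the function-call structure explicit:

julia> Meta.parse("(u^2 + v^2 + w^2) / 2")
:((u ^ 2 + v ^ 2 + w ^ 2) / 2)

julia> dump(Meta.parse("a + b + c"))  # + receives all three arguments at once
Expr
  head: Symbol call
  args: Array{Any}((4,))
    1: Symbol +
    2: Symbol a
    3: Symbol b
    4: Symbol c

The three-argument call to + is precisely the multiary case described above.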

@suyashbire1 (Collaborator):

> Because ζ does not involve interpolation, we can interpret ζ_at_u as interpolating ζ in y from Face to Cell point. So
>
> ζ_at_u[i, j, k] = 0.5 * (ζ[i, j, k] + ζ[i, j+1, k])
>
> So I think this implies that
>
> 0.5 * (ζ_at_u[8, 8, 8] + ζ_at_u[8, 7, 8]) = 0.25 * ζ[8, 7, 8] + 0.5 * ζ[8, 8, 8] + 0.25 * ζ[8, 9, 8]
>
> In other words, I don't think we should expect
>
> 0.5 * (ζ_at_u[8, 8, 8] + ζ_at_u[8, 7, 8]) = ζ[8, 9, 8]
>
> Does my logic work out or am I crazy?

Sorry, that was my bad. This makes sense.

@glwagner glwagner merged commit 20f9d08 into master Oct 24, 2019
@glwagner glwagner deleted the abstract-operations branch October 24, 2019 18:52

# Ordinary functions needed for fields
architecture(::FunctionField) = nothing
data(f::FunctionField) = f
@suyashbire1 (Collaborator): So, are you using data to represent a function's operation tree?

@glwagner (Member, Author): No, this is just dispatch because we need a way to get at field.data. It's similar to how Base.parent works: Base.parent extracts the "parent" array when called on a wrapper around an array, or returns the array itself when called on an unwrapped array.


"Returns a view over the interior points of the `field.data`."
@inline data(f::AbstractField) = view(f.data, 1:f.grid.Nx, 1:f.grid.Ny, 1:f.grid.Nz)
@inline interior(f::Field) = view(f.data, 1:f.grid.Nx, 1:f.grid.Ny, 1:f.grid.Nz)
@suyashbire1 (Collaborator): interior is not an intuitive term for data held by an object. Is this change required because data is used to represent a function's operation tree (in function_fields.jl)?

@glwagner (Member, Author):
It's more mundane than that. The function data now returns field.data for field::Field, and the object itself for an AbstractOperation. We use this because while AbstractOperation can be used on the GPU, a Field cannot and so we need to extract the underlying OffsetArray from it.

@glwagner (Member, Author):
interior returns the grid points in the model "interior" (i.e., excluding halos).

What would be more intuitive?

Hopefully in the future we can use dispatch on Colon (see #458) so that the user can ignore these functions altogether. I think we should strive for syntax that allows the user to think of a Field in the way one would think of a mathematical continuous field, rather than having to interact with the underlying discrete data and such.

@suyashbire1 (Collaborator):

> The function data now returns field.data for field::Field, and the object itself for an AbstractOperation

I would prefer values or data. If data works, why are you using interior? Does data return halos as well? I was just curious.

@glwagner (Member, Author): values is a Julia Base function: for Dict-like objects it returns the values (see also the function keys)...

Maybe interior_values?

For field::Field, data(field) returns field.data, which is an OffsetArray. interior returns a view over the same offset array, which excludes the halo points. The difference only matters for broadcasting an operation over an array. For example, if you have an array of values you would like to set the solution to, then

field.data .= user_array

will fail, unless user_array includes halo points. This is undesirable because halos are arbitrary. Thus they would use

interior(field) .= user_array

which, due to the view, will work when user_array contains only interior values.
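
A toy stand-in for that pattern (array sizes assumed for illustration; not Oceananigans code):

using OffsetArrays

# 4³ interior points plus one halo point on each side, indexed 0:5 per axis.
padded = OffsetArray(zeros(6, 6, 6), 0:5, 0:5, 0:5)
interior_view = view(padded, 1:4, 1:4, 1:4)

user_array = rand(4, 4, 4)
interior_view .= user_array  # fills the interior; halo points stay untouched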

However, I think we should encourage users to use our functions rather than manually setting values, as in:

set!(field, user_array)

For plotting, interior would also prove useful if you don't want to plot halos. Almost always, however, we are plotting slices (not the whole array); in that case dispatch on Colon may suffice, since we can design getindex such that

field[:, 1, :]

returns a slice over interior values, excluding halos.

arcavaliere pushed a commit to arcavaliere/Oceananigans.jl that referenced this pull request Nov 6, 2019
AbstractOperations for diagnostics (and beyond...)

Former-commit-id: 20f9d08
Labels: abstractions 🎨, feature 🌟
Participants: @glwagner, @ali-ramadhan, @suyashbire1