Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Named tuples for parameters #42

Open
glwagner opened this issue Aug 24, 2021 · 9 comments
Open

Named tuples for parameters #42

glwagner opened this issue Aug 24, 2021 · 9 comments

Comments

@glwagner
Copy link
Member

glwagner commented Aug 24, 2021

For applications with large numbers of parameters input into complex models, it's useful (even necessary) to use objects like NamedTuple that associate parameter names with values to hold / transport lists of parameters.

However, as far as I can tell, parameters are constrained to be vectors. Throughout the code in fact, many data structures seem constrained to Array or Vector. Eg, this code does not work:

using Distributions
using LinearAlgebra
using Random

using EnsembleKalmanProcesses.EnsembleKalmanProcessModule
using EnsembleKalmanProcesses.ParameterDistributionStorage
using EnsembleKalmanProcesses.EnsembleKalmanProcessModule: EnsembleKalmanProcess, construct_initial_ensemble

rng_seed = 41
Random.seed!(rng_seed)

# The system
θ★ = (a=1, b=-1)
G₁(θ) = [sqrt((θ.a - θ★.a)^2 +.b - θ★.b)^2)]

# Hyperparameters
N_ensemble = 50
noise_level = 1e-3

unconstrained_prior_distributions = (a=Parameterized(Normal(0, 1)), b=Parameterized(Normal(0, 1)))
constraints = (a=no_constraint(), b=no_constraint())
parameter_names = ("a", "b")

prior = ParameterDistribution(unconstrained_prior_distributions, constraints, parameter_names)

and I get

julia> prior = ParameterDistribution(unconstrained_prior_distributions, constraints, ["a", "b"])
ERROR: MethodError: no method matching ParameterDistribution(::NamedTuple{(:a, :b), Tuple{Parameterized, Parameterized}}, ::NamedTuple{(:a, :b), Tuple{Constraint, Constraint}}, ::Vector{String})

Is it necessary to constrain parameters to type Vector? It seems like it should be possible to generically support any iterable object (or objects that support broadcasting?) In the case that we need to perform LinearAlgebra operations, we can constructor the appropriate matrices or vectors from iterable containers as needed:

julia> parameters = (a=1, b=2)
(a = 1, b = 2)

julia> parameter_vector =for θ in parameters]
2-element Vector{Int64}:
 1
 2

Allocations could even be minimized, though I think this is usually unnecessary because typical applications of EnsembleKalmanProcesses.jl have computations dominated by the cost of evaluating a loss function.

I'm happy to get to work on refactoring the internals to support more general parameter containers if someone can identify the minimal requirements for the parameters object. It's tough to figure out what methods would have to be defined without detailed knowledge of the code because the struct definitions are over-constrained, preventing experimentation.

@odunbar
Copy link
Collaborator

odunbar commented Aug 26, 2021

Internally to the algorithms, as you say we treat parameter instances as columns in a single ensemble matrix. This is because once we first initialize prior mean and covariances, and have an initial ensemble, we work with Gaussian methods (completely determined by this information). Therefore everything is treated as vectors in R^d and actions are matrix-vector type operations. We have src/DataStorage.jl to keep e.g dimensions consistent so these operations succeed. We do require parameter information again, when passing back to the physical model and there we just need to give back the values to the right parameter.
at the moment this is done by assigning the name to the column see the (terribly named) batch function in src/ParameterDistribution.jl

    function batch(pd:ParameterDistribution)
Returns a list of contiguous [collect(1:i), collect(i+1:j),... ] used to split parameter arrays by distribution dimensions

In general applications parameters may be defined as multidimensional too e.g with values a= [0.0], b = [1.0,2.0,4.0,6.0,9.6] (a is defined with a 1D prior distribution, b is defined with a 5D prior distribution). In this case for example with [a,b], the parameter matrices will be 6 by ensemble size, and the batch function will return [collect(1:1), collect(2:6)] on return to slice back into a and b.

I'm sure the user interface could be cleaner as I certainly didnt optimize it. But internally I felt it was quite robust and doesnt require more functionality than Ensemble Kalman methods require (e.g matrices and vectors with consistently enforced dimensions)

@glwagner
Copy link
Member Author

I think what I'm asking is what methods need to be defined for a parameter type. We can relax the type restrictions in the struct definitions, and design an interface based on methods. Then many types could be used by savvy users to store parameters, provided those types define the right methods. Flexible parameter types might be useful for complicated loss functions (such as ours).

@glwagner
Copy link
Member Author

I'll experiment a bit and open a PR that will hopefully illustrate what I'm talking about!

@glwagner
Copy link
Member Author

Just to provide a simple example, broadcasting works for any subtype of AbstractArray provided that it defines a couple methods like size, getindex, setindex!, etc. We don't have to broadcast with literal Array --- we can broadcast with any "array-like" object. Similar flexibility and julian design here would create a powerful and flexible infrasturcture.

@glwagner
Copy link
Member Author

Internally to the algorithms, as you say we treat parameter instances as columns in a single ensemble matrix. This is because once we first initialize prior mean and covariances, and have an initial ensemble, we work with Gaussian methods (completely determined by this information). Therefore everything is treated as vectors in R^d and actions are matrix-vector type operations

I think what I am saying is that there are many julia types that may be used to represent "vectors in R^d". I think it would help to write the software in a way that supports many vector-like types --- not just Vector. Here's a simple example from the julia documentation:

struct SquaresVector <: AbstractArray{Int, 1}
    count::Int
end

Base.size(S::SquaresVector) = (S.count,)
Base.IndexStyle(::Type{<:SquaresVector}) = IndexLinear()
Base.getindex(S::SquaresVector, i::Int) = i*i

Note that SquaresVector is not the type Vector, but nevertheless has properties indistinguishable from a mathematical vector in R^1:

julia> s = SquaresVector(4)
4-element SquaresVector:
  1
  4
  9
 16

julia> s[s .> 8]
2-element Vector{Int64}:
  9
 16

julia> s + s
4-element Vector{Int64}:
  2
  8
 18
 32

julia> sin.(s)
4-element Vector{Float64}:
  0.8414709848078965
 -0.7568024953079282
  0.4121184852417566
 -0.2879033166650653

This shows why we don't want to overrestrict types in struct definitions --- allowing flexible types generates the possibility for powerful behaviors. "Alternative array types" like SquaresVector only need to define the correct methods for the code to work. The methods that need to be defined are sometimes surprisingly few.

Essentially, we almost never should restrict the types of objects, except in very special cases (which probably do not occur in our software).

@ilopezgp
Copy link
Contributor

@glwagner We should move away from deleting the typing in arguments or objects of structs. It adds flexibility at the cost of interpretability of the codebase. What we should do is substitute the types by more general (possibly abstract) types.

@ilopezgp
Copy link
Contributor

I think changing the typing to more abstract types would really benefit the codebase. However, unit testing becomes slightly more difficult (there are a lot of corner cases). I am not familiar with this kind of typing in Julia, but we should implement types similar to those in the collections.abc python package.

@jakebolewski
Copy link
Contributor

jakebolewski commented Jan 23, 2022

You cannot capture what @glwagner is promoting with abstract types because the code used here is a composition of required type traits / interfaces (ex. generic iteration, get/set index, etc.). No one abstract type hierarchy will necessarily capture the composition of traits that the package needs. In general that is why you rely on dynamic polymorphic behavior (ex. no contraint on types) and raise error messages when a particular type instance does not implement some method. This makes the code compositional and easier to extend. Because Julia doesn't have explicit declaration of type traits you rely on implicit runtime behavior (like all dynamic languages), comments, and a good test suite.

@ilopezgp
Copy link
Contributor

ilopezgp commented Jan 23, 2022

I am not sure I understand why types cannot capture everything we need. If you need to say that an argument allows generic iteration, it should be an iterable. If it needs to be used as a dict key, it is a hashable.

An example of how this is possible, while maintaining extreme generality of the inputs in a dynamically typed language is python's xarray, see some of the function handles here, for instance. Typing helps a lot with debugging, which is kind of the weakest point of Julia. It's either that or adding asserts in every function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants