
Commit

update getters for obs and obs_noise_cov
removed Observations Module

format

Redesign of Observation

Observation tests and format

tests for minibatchers

ObservationSeries tested

build=true default for get_obs and get_obs_noise_cov

interface for EKP

add some more convenience functions for ObservationSeries

test no_minibatching setup

updated examples with MB

UKI constructor

remove build-bug where obs_noise_cov append flattens array

typo

format

add vec

typo

added Dict to construct ObservationSeries, and added == operations

added storage of observation inverses
odunbar committed Jun 25, 2024
1 parent fb45f75 commit 586c48a
Showing 23 changed files with 1,087 additions and 329 deletions.
2 changes: 1 addition & 1 deletion docs/src/API/Observations.md
@@ -1,7 +1,7 @@
# Observations

```@meta
CurrentModule = EnsembleKalmanProcesses.Observations
CurrentModule = EnsembleKalmanProcesses
```

```@docs
2 changes: 1 addition & 1 deletion docs/src/examples/Cloudy_example.md
@@ -114,7 +114,7 @@ for i in 1:n_samples
y_t[:, i] = G_t .+ rand(MvNormal(μ, Γy))
end

truth = Observations.Observation(y_t, Γy, data_names)
truth = Observation(y_t, Γy, data_names)
```


2 changes: 1 addition & 1 deletion docs/src/observations.md
@@ -18,6 +18,6 @@ Here is a typical construction of the object:
yt = rand(MvNormal(μ, Γy), 100) # generate 100 samples
name = "zero-mean mvnormal"

true_data = Observations.Observation(yt, Γy, name)
true_data = Observation(yt, Γy, name)
```
Currently, the data is retrieved by accessing the stored variables, e.g. the fifth data sample is given by `truth_data.samples[5]`, or the covariance matrix by `truth_data.cov`.
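Putting this commit's pieces together, here is a minimal sketch of the new construction-and-retrieval path. It assumes the post-commit API shown in the diffs on this page: the `Dict`-based `Observation` constructor (keys `"samples"`, `"covariances"`, `"names"`) and the `get_obs` / `get_obs_noise_cov` getters with their new `build = true` default; the dimensions and the name are illustrative.

```julia
using LinearAlgebra, Statistics, Random
using Distributions
using EnsembleKalmanProcesses

Γy = Diagonal(0.1 * ones(5))
μ = zeros(5)
yt = rand(MvNormal(μ, Γy), 100)   # 100 samples, as in the snippet above

# New-style construction: a Dict with "samples", "covariances", "names"
true_data = Observation(
    Dict("samples" => vec(mean(yt, dims = 2)), "covariances" => Γy, "names" => ["zero-mean mvnormal"]),
)

# Getters replace direct field access (truth_data.samples / truth_data.cov)
y = get_obs(true_data)             # stacked data vector
Γ = get_obs_noise_cov(true_data)   # stacked covariance (build = true default)
```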
19 changes: 4 additions & 15 deletions examples/AerosolActivation/aerosol_activation.jl
@@ -139,40 +139,29 @@ nothing # hide
# calibrate is also used to generate observations.
# We use the total number and mass of activated aerosol particles as
# our observational data.
observation_data_names = ["N_act", "M_act"];
observation_data_names = ["N_act_and_M_act"];
nothing # hide

# We generate artificial truth samples based on the default values
# of parameters we are calibrating.
n_samples = 10
G_t = run_activation_model(molar_mass_true, osmotic_coeff_true)
y_t = zeros(length(G_t), n_samples)

Γy = convert(Array, LinearAlgebra.Diagonal([0.01 * G_t[1], 0.01 * G_t[2]]))
μ = zeros(length(G_t));
nothing # hide

# And add noise to the generated truth sample.
for i in 1:n_samples
y_t[:, i] = G_t .+ rand(Distributions.MvNormal(μ, Γy))
end

truth_array = EKP.Observations.Observation(y_t, Γy, observation_data_names)
nothing # hide
y_t = G_t .+ rand(Distributions.MvNormal(μ, Γy))

# One could try for the truth to be a mean of the generated array.
# Or do the calibration for all individual truth samples and then
# compute the mean of calibrated parameters.
# For now we are just taking one truth array member.
truth_sample = truth_array.samples[1];
observation = EKP.Observation(Dict("samples" => y_t, "covariances" => Γy, "names" => observation_data_names))
nothing # hide

# We use 50 ensemble members and do 10 iterations.
N_ens = 50
N_iter = 10

initial_par = EKP.construct_initial_ensemble(rng, priors, N_ens)
ekiobj = EKP.EnsembleKalmanProcess(initial_par, truth_sample, truth_array.obs_noise_cov, EKP.Inversion())
ekiobj = EKP.EnsembleKalmanProcess(initial_par, observation, EKP.Inversion())
nothing # hide

# Finally, we can run the Ensemble Kalman Process calibration.
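The new single-observation workflow in this example can be condensed end to end. The sketch below is hedged: the toy forward map `G` stands in for `run_activation_model`, and the `constrained_gaussian` priors are illustrative; only the `Observation(Dict(...))` constructor and the `EnsembleKalmanProcess(initial_par, observation, Inversion())` call come from the diff above.

```julia
using LinearAlgebra, Random
using Distributions
using EnsembleKalmanProcesses
using EnsembleKalmanProcesses.ParameterDistributions

rng = Random.MersenneTwister(1)

G(θ) = [sum(θ), prod(θ)]            # toy stand-in for the forward model
θ_true = [2.0, 3.0]
G_t = G(θ_true)
Γy = Diagonal(0.01 .* G_t)
y_t = G_t .+ rand(rng, MvNormal(zeros(2), Γy))

observation = Observation(Dict("samples" => y_t, "covariances" => Γy, "names" => ["G"]))

priors = combine_distributions([
    constrained_gaussian("a", 2.0, 1.0, 0.0, Inf),
    constrained_gaussian("b", 3.0, 1.0, 0.0, Inf),
])

N_ens, N_iter = 50, 10
initial_par = construct_initial_ensemble(rng, priors, N_ens)
ekiobj = EnsembleKalmanProcess(initial_par, observation, Inversion())

for i in 1:N_iter
    params = get_ϕ_final(priors, ekiobj)   # constrained-space parameters
    g = hcat([G(params[:, j]) for j in 1:N_ens]...)
    update_ensemble!(ekiobj, g)
end
```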
1 change: 0 additions & 1 deletion examples/ClimateMachine/init_calibration.jl
@@ -2,7 +2,6 @@ using Distributions
using ArgParse
# Import EnsembleKalmanProcesses modules
using EnsembleKalmanProcesses
using EnsembleKalmanProcesses.Observations
using EnsembleKalmanProcesses.ParameterDistributions
include(joinpath(@__DIR__, "helper_funcs.jl"))

1 change: 0 additions & 1 deletion examples/ClimateMachine/sstep_calibration.jl
@@ -5,7 +5,6 @@ using NCDatasets
using LinearAlgebra
# Import EnsembleKalmanProcesses modules
using EnsembleKalmanProcesses
using EnsembleKalmanProcesses.Observations
using EnsembleKalmanProcesses.ParameterDistributions

include(joinpath(@__DIR__, "helper_funcs.jl"))
14 changes: 3 additions & 11 deletions examples/Cloudy/Cloudy_example_eki.jl
@@ -17,7 +17,6 @@ using Random

# Import Calibrate-Emulate-Sample modules
using EnsembleKalmanProcesses
using EnsembleKalmanProcesses.Observations
using EnsembleKalmanProcesses.ParameterDistributions
using EnsembleKalmanProcesses.DataContainers

@@ -71,7 +70,7 @@ priors = combine_distributions([prior_N0, prior_θ, prior_k])
### Define the data from which we want to learn the parameters
###

data_names = ["M0", "M1", "M2"]
data_names = ["M0_M1_M2"]
moments = [0.0, 1.0, 2.0]
n_moments = length(moments)

@@ -111,8 +110,7 @@ for i in 1:n_samples
y_t[:, i] = G_t .+ rand(MvNormal(μ, Γy))
end

truth = Observations.Observation(y_t, Γy, data_names)
truth_sample = truth.mean
truth = Observation(Dict("samples" => vec(mean(y_t, dims = 2)), "covariances" => Γy, "names" => data_names))


###
@@ -124,13 +122,7 @@ N_iter = 8 # number of EKI iterations
# initial parameters: N_par x N_ens

initial_par = construct_initial_ensemble(rng, priors, N_ens)
ekiobj = EnsembleKalmanProcess(
initial_par,
truth_sample,
truth.obs_noise_cov,
Inversion(),
timestepper = DefaultTimestepper(0.1),
)
ekiobj = EnsembleKalmanProcess(initial_par, truth, Inversion(), timestepper = DefaultTimestepper(0.1))

# Initialize a ParticleDistribution with dummy parameters. The parameters
# will then be set within `run_dyn_model`
8 changes: 3 additions & 5 deletions examples/Cloudy/Cloudy_example_uki.jl
@@ -17,7 +17,6 @@ using Random

# Import Calibrate-Emulate-Sample modules
using EnsembleKalmanProcesses
using EnsembleKalmanProcesses.Observations
using EnsembleKalmanProcesses.ParameterDistributions

# Import the module that runs Cloudy
@@ -72,7 +71,7 @@ priors = combine_distributions([prior_N0, prior_θ, prior_k])
### Define the data from which we want to learn the parameters
###

data_names = ["M0", "M1", "M2"]
data_names = ["M0_M1_M2"]
moments = [0.0, 1.0, 2.0]
n_moments = length(moments)

@@ -111,8 +110,7 @@ for i in 1:n_samples
y_t[:, i] = G_t .+ rand(MvNormal(μ, Γy))
end

truth = Observations.Observation(y_t, Γy, data_names)
truth_sample = truth.mean
truth = Observation(Dict("samples" => vec(mean(y_t, dims = 2)), "covariances" => Γy, "names" => data_names))


###
@@ -130,7 +128,7 @@ N_iter = 50 # number of iterations
update_freq = 1

process = Unscented(mean(priors), cov(priors); α_reg = α_reg, update_freq = update_freq)
ukiobj = EnsembleKalmanProcess(truth_sample, truth.obs_noise_cov, process)
ukiobj = EnsembleKalmanProcess(truth, process)

# Initialize a ParticleDistribution with dummy parameters. The parameters
# will then be set within `run_dyn_model`
8 changes: 3 additions & 5 deletions examples/Lorenz/Lorenz_example.jl
@@ -10,7 +10,6 @@ using JLD2

# CES
using EnsembleKalmanProcesses
using EnsembleKalmanProcesses.Observations
using EnsembleKalmanProcesses.DataContainers
using EnsembleKalmanProcesses.ParameterDistributions

@@ -85,7 +84,7 @@ end
###
### Define the data from which we want to learn the parameters
###
data_names = ["y0", "y1"]
data_names = ["y0_y1"]


###
@@ -178,8 +177,7 @@ end


# Construct observation object
truth = Observations.Observation(yt, Γy, data_names)
truth_sample = truth.mean
truth = Observation(Dict("samples" => vec(mean(yt, dims = 2)), "covariances" => Γy, "names" => data_names))
###
### Calibrate: Ensemble Kalman Inversion
###
@@ -194,7 +192,7 @@ N_iter = 5 # number of EKI iterations
# initial parameters: N_params x N_ens
initial_params = construct_initial_ensemble(rng, priors, N_ens)

ekiobj = EKP.EnsembleKalmanProcess(initial_params, truth_sample, truth.obs_noise_cov, Inversion())
ekiobj = EKP.EnsembleKalmanProcess(initial_params, truth, Inversion())

# EKI iterations
println("EKP inversion error:")
10 changes: 4 additions & 6 deletions examples/Lorenz/Lorenz_example_ukp.jl
@@ -10,9 +10,8 @@ using Random
using JLD2

using Plots
# CES
# EKP
using EnsembleKalmanProcesses
using EnsembleKalmanProcesses.Observations
using EnsembleKalmanProcesses.ParameterDistributions
const EKP = EnsembleKalmanProcesses

@@ -83,7 +82,7 @@ end
###
### Define the data from which we want to learn the parameters
###
data_names = ["y0", "y1"]
data_names = ["y0_y1"]


###
@@ -175,8 +174,7 @@ else
end

# Construct observation object
truth = Observations.Observation(yt, Γy, data_names)
truth_sample = truth.mean
truth = Observation(Dict("samples" => vec(mean(yt, dims = 2)), "covariances" => Γy, "names" => data_names))
###
### Calibrate: Ensemble Kalman Inversion
###
@@ -199,7 +197,7 @@ update_freq = 0
# n_param+2 : sample n_param+2 sigma points
N_ens = 2n_param + 1
process = Unscented(mean(priors), cov(priors); α_reg = α_reg, update_freq = update_freq, sigma_points = "symmetric")
ukiobj = EKP.EnsembleKalmanProcess(truth_sample, truth.obs_noise_cov, process)
ukiobj = EKP.EnsembleKalmanProcess(truth, process)


# UKI iterations
8 changes: 3 additions & 5 deletions examples/Lorenz/distributed_Lorenz_example.jl
@@ -34,7 +34,6 @@ using JLD2

# CES
using EnsembleKalmanProcesses
using EnsembleKalmanProcesses.Observations
using EnsembleKalmanProcesses.DataContainers
using EnsembleKalmanProcesses.ParameterDistributions

@@ -109,7 +108,7 @@ end
###
### Define the data from which we want to learn the parameters
###
data_names = ["y0", "y1"]
data_names = ["y0_y1"]


###
@@ -202,8 +201,7 @@ end


# Construct observation object
truth = Observations.Observation(yt, Γy, data_names)
truth_sample = truth.mean
truth = Observation(Dict("samples" => vec(mean(yt, dims = 2)), "covariances" => Γy, "names" => data_names))
###
### Calibrate: Ensemble Kalman Inversion
###
@@ -218,7 +216,7 @@ N_iter = 5 # number of EKI iterations
# initial parameters: N_params x N_ens
initial_params = construct_initial_ensemble(rng, priors, N_ens)

ekiobj = EKP.EnsembleKalmanProcess(initial_params, truth_sample, truth.obs_noise_cov, Inversion())
ekiobj = EKP.EnsembleKalmanProcess(initial_params, truth, Inversion())

# EKI iterations
println("EKP inversion error:")
15 changes: 9 additions & 6 deletions examples/example_template.jl
@@ -3,9 +3,9 @@

# Import external modules

# Import CES modules
# Import EKP modules
using EnsembleKalmanProcesses
using EnsembleKalmanProcesses.EnsembleKalmanProcessModule
using EnsembleKalmanProcesses.Observations
using EnsembleKalmanProcesses.ParameterDistributionStorage
using EnsembleKalmanProcesses.DataStorage

@@ -29,9 +29,13 @@ noise_covariance = prescribed_observational_noise_covariance
##
## Store the observation and covariance
##
truth_sample = truth_data[1]

truth =
Observation(Dict("samples" => truth_sample, "covariances" => observational_noise_covariance, "names" => data_names))
# note that you can stack data (i.e. different observed variables) conveniently with combine_observations([truth1, truth2])
# if you have a lot of data you can also minibatch it conveniently by using an ObservationSeries - see the docs

truth = Observations.Obs(truth_data, observational_noise_covariance, data_names)
truth_sample = truth.samples[1]

##
## Now define the priors:
@@ -47,8 +51,7 @@ priors = ParameterDistribution(prior_dict)
params = construct_initial_ensemble(priors, N_ens)

# constructor for the EKI object
ekiobj =
EnsembleKalmanProcessModule.EnsembleKalmanProcess(initial_params, truth_sample, truth.obs_noise_cov, Inversion())
ekiobj = EnsembleKalmanProcessModule.EnsembleKalmanProcess(initial_params, truth, Inversion())

#outer EKI loop
for i in 1:N_iterations
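The stacking convenience mentioned in the template comments can be sketched as follows. The data, covariances, and names are made up for illustration; only `Observation(Dict(...))`, `combine_observations`, and the getters are taken from this commit. For many observations over time, the commit also adds a `Dict` constructor and `==` for `ObservationSeries`, which pairs a list of observations with a minibatcher; its exact construction should be checked against the package docs.

```julia
using LinearAlgebra
using EnsembleKalmanProcesses

Γ1 = Diagonal([0.1, 0.1]);  y1 = [1.0, 2.0]   # hypothetical observed variable 1
Γ2 = Diagonal([0.2]);       y2 = [3.0]        # hypothetical observed variable 2

truth1 = Observation(Dict("samples" => y1, "covariances" => Γ1, "names" => ["var1"]))
truth2 = Observation(Dict("samples" => y2, "covariances" => Γ2, "names" => ["var2"]))

# Stack different observed variables into one block-structured observation
stacked = combine_observations([truth1, truth2])

y_stacked = get_obs(stacked)            # length-3 stacked data vector
Γ_stacked = get_obs_noise_cov(stacked)  # 3 × 3 block-diagonal covariance
```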
6 changes: 3 additions & 3 deletions src/EnsembleKalmanInversion.jl
@@ -122,12 +122,12 @@ function update_ensemble!(
fh = ekp.failure_handler

# Scale noise using Δt
scaled_obs_noise_cov = ekp.obs_noise_cov / ekp.Δt[end]
scaled_obs_noise_cov = get_obs_noise_cov(ekp) / ekp.Δt[end]
noise = sqrt(scaled_obs_noise_cov) * rand(ekp.rng, MvNormal(zeros(N_obs), I), ekp.N_ens)

# Add obs_mean (N_obs) to each column of noise (N_obs × N_ens) if
# Add obs (N_obs) to each column of noise (N_obs × N_ens) if
# G is deterministic
y = deterministic_forward_map ? (ekp.obs_mean .+ noise) : (ekp.obs_mean .+ zero(noise))
y = deterministic_forward_map ? (get_obs(ekp) .+ noise) : (get_obs(ekp) .+ zero(noise))

if isnothing(failed_ens)
_, failed_ens = split_indices_by_success(g)
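The perturbed-observation step that these getter changes touch can be illustrated standalone: when the forward map G is deterministic, each ensemble member sees y + η with η ~ N(0, Γ/Δt). This is a plain-Julia sketch of the same computation with toy sizes and no package internals; it mirrors the lines in `update_ensemble!` above.

```julia
using LinearAlgebra, Random
using Distributions

rng = MersenneTwister(0)
N_obs, N_ens, Δt = 3, 5, 0.5
Γ = Diagonal([0.1, 0.2, 0.3])   # stand-in for get_obs_noise_cov(ekp)
y = [1.0, 2.0, 3.0]             # stand-in for get_obs(ekp)

# Scale noise using Δt, then draw one perturbation per ensemble member
scaled_obs_noise_cov = Γ / Δt
noise = sqrt(scaled_obs_noise_cov) * rand(rng, MvNormal(zeros(N_obs), I), N_ens)

# Add y (N_obs) to each column of noise (N_obs × N_ens)
Y = y .+ noise
```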