
EKP for general likelihoods? #262

Closed
sdwfrost opened this issue Mar 17, 2023 · 6 comments

Comments

@sdwfrost

Is it straightforward to implement the algorithm described in https://www.sciencedirect.com/science/article/pii/S0167715222000967?

@odunbar
Collaborator

odunbar commented Mar 17, 2023

I've had some discussion with @eviatarbach about this.

The key point is that, in the notation of the linked paper, they view the forward map H as deterministic.

In our code, the EKP forward map G can be nonlinear and/or non-deterministic (and non-Gaussian). For example, in the paper's notation, you could view our forward map G as an evaluation theta -> p_D(z|H(theta)), where D is some unknown distribution and H is an unknown deterministic map.

If you do so, with perfect observation y = z (i.e. p(y|z) a delta function), then you obtain exactly Algorithm 1 in the paper. (NB: we don't yet have the timestepping implementation for h.)

The only non-default option you therefore require, if you want to do something like this, is the ability to observe the non-deterministic map perfectly.

For this we have a keyword argument deterministic_forward_map::Bool = true in update_ensemble! (at least for the EKI Inversion() update). If

  1. true: the form p(y|G(x)) is assumed Gaussian with the given noise covariance obs_noise_cov;
  2. false: the form p(y|G(x)) is assumed to be a delta function, y = G(x).
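To make the distinction concrete, here is a minimal pure-Python sketch of a single EKI update for a scalar parameter. This is illustrative only, not the EKP implementation: the helper name eki_update, its signature, and the toy problem are all assumptions. The flag mirrors deterministic_forward_map: when true, each ensemble member sees a noise-perturbed copy of the observation; when false, the observation is used unperturbed.

```python
import random
import statistics

def eki_update(thetas, g, y, obs_noise_var, deterministic_forward_map=True, rng=None):
    """One EKI step for a scalar parameter theta and scalar data y.
    deterministic_forward_map=True: Gaussian likelihood, so each member's
    copy of the observation is perturbed with noise of variance obs_noise_var.
    deterministic_forward_map=False: delta likelihood y = G(theta), so the
    observation is used unperturbed."""
    rng = rng or random.Random(0)
    gs = [g(t) for t in thetas]
    m_t = statistics.fmean(thetas)
    m_g = statistics.fmean(gs)
    # Sample (co)variances between parameters and forward-map outputs.
    cov_tg = statistics.fmean((t - m_t) * (gi - m_g) for t, gi in zip(thetas, gs))
    cov_gg = statistics.fmean((gi - m_g) ** 2 for gi in gs)
    gain = cov_tg / (cov_gg + obs_noise_var)
    new_thetas = []
    for t, gi in zip(thetas, gs):
        noise = rng.gauss(0.0, obs_noise_var ** 0.5) if deterministic_forward_map else 0.0
        new_thetas.append(t + gain * (y + noise - gi))
    return new_thetas

# Toy problem: G(theta) = 2*theta, observation y = 4, so the optimum is theta = 2.
rng = random.Random(1)
thetas = [rng.gauss(0.0, 2.0) for _ in range(100)]
for _ in range(10):
    thetas = eki_update(thetas, lambda t: 2.0 * t, 4.0, 1e-4,
                        deterministic_forward_map=False, rng=rng)
```

With deterministic_forward_map=False and a small obs_noise_var kept as algorithmic regularization, the ensemble mean collapses toward theta = 2.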

Quick resolution

You should be able to run the code with the following alteration

  • call update_ensemble!(..., deterministic_forward_map=false).
  • you may need to provide a small observational noise, e.g. obs_noise_cov = 1e-4*I, to EnsembleKalmanProcess(....,Inversion()), viewed as a small algorithmic regularization, though you can try with obs_noise_cov = 0*I.

Let us know how you get on!

Future actions

  • We currently have an open issue on implementing a range of timestepping schemes for EKP, i.e. the correct choice of h in the paper. It's on our radar, and we hope to get to it later in the year.

@sdwfrost
Author

Thanks @odunbar, this is super helpful. I have to deal with count (integer) data (e.g. the number of new infections per day for an infectious disease), which I normally transform via log(x+1). In the standard framework I get biased parameter estimates (the populations are small, so my statistics are far from Gaussian), but as an initial condition for another scheme, your EKP library is super useful.
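For reference, the log(x+1) transform above can be sketched as follows (the function names are illustrative):

```python
import math

def transform(counts):
    """log(x + 1): defined at zero counts and compresses large counts,
    so residuals on the transformed scale are closer to Gaussian."""
    return [math.log1p(c) for c in counts]

def untransform(z):
    """Map transformed values back to the count scale."""
    return [math.expm1(v) for v in z]

daily_cases = [0, 2, 5, 13, 34, 89]
z = transform(daily_cases)
recovered = untransform(z)
```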

@odunbar
Collaborator

odunbar commented Mar 20, 2023

Thanks!

I should caveat that EKP still only has good theory for linear & Gaussian problems.

That is to say, although our parameter distribution setups and algorithms do allow for more interesting priors and likelihoods by pushing nonlinearities/non-Gaussianities into the forward map, this can still yield biases, as EKP still only evolves a Gaussian approximation to perform optimization. In practice, though, it has proven effective in many cases.

PS: Re the count data, I am by no means an expert, but we do have some work in a filtering context, where we converted infection test results into PPV/FOR data for data assimilation (cf. the "Synthetic Data" section). This is a different transformation that puts the data into [0,1], and might be of interest to you, though the boundaries can still prove challenging.
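I can't speak to the paper's exact construction, but as an illustration of how such [0,1] quantities arise, PPV (positive predictive value) and FOR (false omission rate) follow from Bayes' rule given test sensitivity, specificity, and prevalence. This is a standard textbook computation; the helper name is an assumption.

```python
def ppv_for(sensitivity, specificity, prevalence):
    """Positive predictive value P(infected | test positive) and false
    omission rate P(infected | test negative), via Bayes' rule.
    Both quantities lie in [0, 1]."""
    p, s, c = prevalence, sensitivity, specificity
    ppv = s * p / (s * p + (1 - c) * (1 - p))
    fo_rate = (1 - s) * p / ((1 - s) * p + c * (1 - p))
    return ppv, fo_rate
```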

@sdwfrost
Author

Thanks for the feedback. Here is a writeup of my toy model using EKI and EKS. I'm pretty happy with how well it worked out - thanks!

@odunbar
Collaborator

odunbar commented Apr 3, 2023

Awesome! Thanks for the write-up, I'm pleased it worked out for you, it's really great to hear.

If you ever work with a model you can't run >10^4 times (as EKS requires), we also have an ML-based pipeline, CalibrateEmulateSample.jl (CES), currently just for moderate-dimensional parameter spaces, O(10), to accelerate obtaining approximate posterior distributions from the output of EKI. In fact, for CES the posteriors need not rely on the Gaussianity that EKS requires, so they may also prove more flexible in capturing correlations.

[Please let me know if you are interested in trying this out too. It should hopefully just require a short script taking the priors and the EKI ensemble as input.]

@odunbar
Collaborator

odunbar commented Apr 13, 2023

Closing as resolved.

@odunbar odunbar closed this as completed Apr 13, 2023