Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unscented Kalman Inversion #8

Merged
merged 1 commit into from
Apr 2, 2021
Merged

Unscented Kalman Inversion #8

merged 1 commit into from
Apr 2, 2021

Conversation

Zhengyu-Huang
Copy link
Contributor

Implement the unscented Kalman inversion [1] in the EKP library

The method is a general derivative-free optimization approach for inverse problems

  • give both mean and covariance approximation of the posterior distribution with an uninformative prior
  • require 2Nθ + 1 particles in general
  • no empirical variance inflation or early stopping needed
  • no ensemble collapse
  • the 3-sigma confidence interval covers the truth parameters for perfect models

Several examples are presented

  1. Cloudy example
    The convergence of the mean and 3-sigma of the three transformed variables and the true values are

Cloudy

  1. Lorenz96 example
    The convergence of the mean and 3-sigma of the three transformed variables and the true values are

Lorenz96

  1. Huang, Daniel Z., Tapio Schneider, and Andrew M. Stuart. "Unscented kalman inversion." arXiv preprint arXiv:2102.01580 (2021).

@odunbar
Copy link
Collaborator

odunbar commented Mar 9, 2021

This looks great! Thanks Daniel, let's have a chat about incorporating this into the EKP framework. I see 2 parts to it (Please others chime in if you disagree):

  1. We extract from the UnscentedKalmanProcess all the general parameters (e.g the observation covariance), and then we can create the EnsembleKalmanProcess{FT,IT, <:Process} struct as seen in EnsembleKalmanProcesses. We require a new process type, and so the rest of the parameters from UnscentedKalmanProcess should be specific to UKI, (e.g UKI weights), and we put them in here:
struct  Unscented <: Process
    uki_param1::P1
    uki_param2::P2
end
  1. The general EKP functions should be written to be dispatched on the <:Process objects e.g your new update function should be called as
update_ensemble!(ekp::EnsembleKalmanProcess{FT, IT, Unscented},...)

It's fine for you to keep everything in the current file you have. Let's see how easily we can do this!

PS. you call it an optimization method - is it not an approximate UQ method?

@codecov
Copy link

codecov bot commented Mar 13, 2021

Codecov Report

❗ No coverage uploaded for pull request base (main@b12ffb5). Click here to learn what that means.
The diff coverage is 99.20%.

❗ Current head f6483e1 differs from pull request most recent head 16796db. Consider uploading reports for the commit 16796db to get more accurate results
Impacted file tree graph

@@           Coverage Diff           @@
##             main       #8   +/-   ##
=======================================
  Coverage        ?   91.18%           
=======================================
  Files           ?        5           
  Lines           ?      454           
  Branches        ?        0           
=======================================
  Hits            ?      414           
  Misses          ?       40           
  Partials        ?        0           
Impacted Files Coverage Δ
src/EnsembleKalmanProcesses.jl 100.00% <ø> (ø)
src/EnsembleKalmanProcess.jl 93.08% <99.20%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b12ffb5...16796db. Read the comment docs.

@Zhengyu-Huang
Copy link
Contributor Author

This looks great! Thanks Daniel, let's have a chat about incorporating this into the EKP framework. I see 2 parts to it (Please others chime in if you disagree):

  1. We extract from the UnscentedKalmanProcess all the general parameters (e.g the observation covariance), and then we can create the EnsembleKalmanProcess{FT,IT, <:Process} struct as seen in EnsembleKalmanProcesses. We require a new process type, and so the rest of the parameters from UnscentedKalmanProcess should be specific to UKI, (e.g UKI weights), and we put them in here:
struct  Unscented <: Process
    uki_param1::P1
    uki_param2::P2
end
  1. The general EKP functions should be written to be dispatched on the <:Process objects e.g your new update function should be called as
update_ensemble!(ekp::EnsembleKalmanProcess{FT, IT, Unscented},...)

It's fine for you to keep everything in the current file you have. Let's see how easily we can do this!

PS. you call it an optimization method - is it not an approximate UQ method?

@odunbar I have updated with Unscented process, could you have a look?

Copy link
Collaborator

@odunbar odunbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's looking better! We need to update the function so that it is more like
update_ensemble!(uki, u^{n}, G(u^{n})) and the output is u^{n+1}.

I think this is possible by setting ourselves up to calculate

  1. In the constructor, for EKP copy-paste the lines 558-584: and define the init_params = construct_sigma_ensemble(process, u_p_mean, uu_p_cov) (as this doesn't rely upon the model right?)
  2. remove applying G within update_ensemble and instead apply G to u_p in the outer loop (in the example script) and pass in the data matrix "G(u_p)" into update_ensemble
  3. update_ensemble now effectively starts at line 589-610, and then move the block 558-584 to the end of this currently written function so that it ends with creating then new u_p.

If this doesn't make sense I'm happy to meet. But it looks good!

@Zhengyu-Huang
Copy link
Contributor Author

It's looking better! We need to update the function so that it is more like
update_ensemble!(uki, u^{n}, G(u^{n})) and the output is u^{n+1}.

I think this is possible by setting ourselves up to calculate

  1. In the constructor, for EKP copy-paste the lines 558-584: and define the init_params = construct_sigma_ensemble(process, u_p_mean, uu_p_cov) (as this doesn't rely upon the model right?)
  2. remove applying G within update_ensemble and instead apply G to u_p in the outer loop (in the example script) and pass in the data matrix "G(u_p)" into update_ensemble
  3. update_ensemble now effectively starts at line 589-610, and then move the block 558-584 to the end of this currently written function so that it ends with creating then new u_p.

If this doesn't make sense I'm happy to meet. But it looks good!

I feel giving the operator G to the update_ensemble will make the code much clearer.

Is there any difference calling G in/out of the update_ensemble function?

I am thinking about running EKP with other black box code, like fortran GCM, how do you handle it ?
Is the G still a Julia function that calls some fortran code and reads the output ?
As for the case, we can also pass G to unpdate_esemble, right ?

@odunbar
Copy link
Collaborator

odunbar commented Mar 17, 2021

Well in complex applications we may not be able to simply "call G" in the update ensemble, that's the issue. I agree your current implementation is nice and clean but this code is likely going to be hooked to possibly nasty models (e.g the climate model scenario).

The way we handle it with the blackbox code is:

  1. initialize (and output) ensemble parameters i.e generate a collection of parameter values
  2. pass ensemble parameters to black box to generate data
  3. Call update_ensemble on the data to produce a new ensemble of parameters, go back to 2)

END) iterate 2-3 until desired convergence/iterations

step 2) could be literally anything, in the GCM setting it often requires writing parameters into run-files, then setting off ensemble of simulations on HPC that reads these different files (currently based on bash and Slurm scripts), then postprocessing all the data. Working with parameters and data, rather than distributions and functions, is just more flexible here.

step 2) takes (\theta^n_i) -> G(\theta^n_i) and
step 3) takes (G(\theta^n_i) -> \theta^{n+1}_i

@Zhengyu-Huang
Copy link
Contributor Author

Well in complex applications we may not be able to simply "call G" in the update ensemble, that's the issue. I agree your current implementation is nice and clean but this code is likely going to be hooked to possibly nasty models (e.g the climate model scenario).

The way we handle it with the blackbox code is:

  1. initialize (and output) ensemble parameters i.e generate a collection of parameter values
  2. pass ensemble parameters to black box to generate data
  3. Call update_ensemble on the data to produce a new ensemble of parameters, go back to 2)

END) iterate 2-3 until desired convergence/iterations

step 2) could be literally anything, in the GCM setting it often requires writing parameters into run-files, then setting off ensemble of simulations on HPC that reads these different files (currently based on bash and Slurm scripts), then postprocessing all the data. Working with parameters and data, rather than distributions and functions, is just more flexible here.

step 2) takes (\theta^n_i) -> G(\theta^n_i) and
step 3) takes (G(\theta^n_i) -> \theta^{n+1}_i

Ok, I updated it , could you have a look?

Copy link
Collaborator

@odunbar odunbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be the last iteration and then I think we are done. Although we will need some runtests for this.

examples/Cloudy/Cloudy_example_ukp.jl Outdated Show resolved Hide resolved
src/EnsembleKalmanProcess.jl Outdated Show resolved Hide resolved
src/EnsembleKalmanProcess.jl Outdated Show resolved Hide resolved
src/EnsembleKalmanProcess.jl Outdated Show resolved Hide resolved
Copy link
Collaborator

@odunbar odunbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the latest revision!
The source code / examples look good to me. (i have one small comment)

All we need now are the runtests, So we need something to run through your source functions so the codecov is happy. Let me know if you want to chat about how to set these up.

Δt = Array([Δt])

EnsembleKalmanProcess{FT, IT, Unscented}(init_params, obs_mean, obs_noise_cov, N_ens, g,
err, Δt, process)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing indentation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@bielim
Copy link
Contributor

bielim commented Mar 25, 2021

A minor thing: Could the ordering be changed such that the Unscented process and its outer constructor are moved to the other Processes (Inversion, Sampler) at the top of the module? Likewise, the EnsembleKalmanProcess and all its constructors should be grouped together in one place.

@bielim
Copy link
Contributor

bielim commented Mar 25, 2021

In the construction of an EnsembleKalmanProcess of process type Unscented, both the initialization of the Unscented object and the initialization of the EnsembleKalmanProcess (using that Unscented object) take the mean and covariance of the observational noise as inputs, which seems a bit redundant. Also, as far as I can see, the \Delta t argument to EnsembleKalmanProcess remains unused and can thus be confusing. I get that the idea was for EnsembleKalmanProcess to take some general arguments such as observation covariance, as well as the process type, which is a struct holding all the information that is specific to the process (Inversion, Sampler, or Unscented). However, I wonder if it might be easier to get rid of the "general arguments" (since they might lead to redundancy or confusion) and instead include all of them in the process type structs, such that the construction of a new process would be done simply by ek_process = EnsembleKalmanProcess(ek_type_struct::Union{Inversion, Sampler, Unscented}).

@bielim
Copy link
Contributor

bielim commented Mar 25, 2021

Also: I think that we should have a separate "cosmetic" PR that takes care of things like proper doc strings for functions (unless it's totally obvious what they do, e.g., getters and setters), making variable names consistent (e.g., the variables denoting the dimension of the data include "N_y", "data_dim", and "output_dim"), etc., in the entire module. Establishing these conventions will make things easier down the road, as people typically use the existing code base as a starting point/template for developing their contributions.

@Zhengyu-Huang
Copy link
Contributor Author

Also: I think that we should have a separate "cosmetic" PR that takes care of things like proper doc strings for functions (unless it's totally obvious what they do, e.g., getters and setters), making variable names consistent (e.g., the variables denoting the dimension of the data include "N_y", "data_dim", and "output_dim"), etc., in the entire module. Establishing these conventions will make things easier down the road, as people typically use the existing code base as a starting point/template for developing their contributions.

Yes, we should have a docstrings PR in to make these more clear for users

@Zhengyu-Huang
Copy link
Contributor Author

A minor thing: Could the ordering be changed such that the Unscented process and its outer constructor are moved to the other Processes (Inversion, Sampler) at the top of the module? Likewise, the EnsembleKalmanProcess and all its constructors should be grouped together in one place.

Updated, thanks!

@Zhengyu-Huang
Copy link
Contributor Author

In the construction of an EnsembleKalmanProcess of process type Unscented, both the initialization of the Unscented object and the initialization of the EnsembleKalmanProcess (using that Unscented object) take the mean and covariance of the observational noise as inputs, which seems a bit redundant. Also, as far as I can see, the \Delta t argument to EnsembleKalmanProcess remains unused and can thus be confusing. I get that the idea was for EnsembleKalmanProcess to take some general arguments such as observation covariance, as well as the process type, which is a struct holding all the information that is specific to the process (Inversion, Sampler, or Unscented). However, I wonder if it might be easier to get rid of the "general arguments" (since they might lead to redundancy or confusion) and instead include all of them in the process type structs, such that the construction of a new process would be done simply by ek_process = EnsembleKalmanProcess(ek_type_struct::Union{Inversion, Sampler, Unscented}).

I am not sure how to modify this one, I just tried to make unscented Kalman inversion constructor look like the others, and reuse these data structures

@Zhengyu-Huang
Copy link
Contributor Author

Thanks for the latest revision!
The source code / examples look good to me. (i have one small comment)

All we need now are the runtests, So we need something to run through your source functions so the codecov is happy. Let me know if you want to chat about how to set these up.

Updated a calibration test (3) in the runtest

@odunbar
Copy link
Collaborator

odunbar commented Mar 29, 2021

Also: I think that we should have a separate "cosmetic" PR that takes care of things like proper doc strings for functions (unless it's totally obvious what they do, e.g., getters and setters), making variable names consistent (e.g., the variables denoting the dimension of the data include "N_y", "data_dim", and "output_dim"), etc., in the entire module. Establishing these conventions will make things easier down the road, as people typically use the existing code base as a starting point/template for developing their contributions.

@bielim if you could create issue(s) for this? Certainly I am hoping we can do a docs push in the next couple of for EKP and CES which will i hope will help with both writing docstrings and enforcing more consistency as we get more eyes looking over it

@bielim
Copy link
Contributor

bielim commented Mar 30, 2021

Also: I think that we should have a separate "cosmetic" PR that takes care of things like proper doc strings for functions (unless it's totally obvious what they do, e.g., getters and setters), making variable names consistent (e.g., the variables denoting the dimension of the data include "N_y", "data_dim", and "output_dim"), etc., in the entire module. Establishing these conventions will make things easier down the road, as people typically use the existing code base as a starting point/template for developing their contributions.

@bielim if you could create issue(s) for this? Certainly I am hoping we can do a docs push in the next couple of for EKP and CES which will i hope will help with both writing docstrings and enforcing more consistency as we get more eyes looking over it

Done!

@Zhengyu-Huang
Copy link
Contributor Author

Zhengyu-Huang commented Mar 31, 2021

Also: I think that we should have a separate "cosmetic" PR that takes care of things like proper doc strings for functions (unless it's totally obvious what they do, e.g., getters and setters), making variable names consistent (e.g., the variables denoting the dimension of the data include "N_y", "data_dim", and "output_dim"), etc., in the entire module. Establishing these conventions will make things easier down the road, as people typically use the existing code base as a starting point/template for developing their contributions.

@bielim if you could create issue(s) for this? Certainly I am hoping we can do a docs push in the next couple of for EKP and CES which will i hope will help with both writing docstrings and enforcing more consistency as we get more eyes looking over it

Any further change requests? Should I rebase and merge? @bielim @odunbar

@odunbar
Copy link
Collaborator

odunbar commented Apr 2, 2021

LGTM! Thanks

@bielim bielim self-requested a review April 2, 2021 18:42
add Cloudy test
add Lorenz test
@Zhengyu-Huang
Copy link
Contributor Author

bors r+

@bors
Copy link
Contributor

bors bot commented Apr 2, 2021

Build succeeded:

@bors bors bot merged commit 6cb4a23 into main Apr 2, 2021
@bors bors bot deleted the uki branch April 2, 2021 19:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants