-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Complete redesign of "Observations" object enabling introduction of minibatching #384
Conversation
removed Observations Module format Redesign of Observation Observation tests and format tests for minibatchers ObservationSeries tested build=true default for get_obs and get_obs_noise_cov interface for EKP add some more convenience functions for ObservationSeries test no_minibatching setup updated examples with MB UKI constructor remove build-bug where obs_noise_cov append flattens array typo format add vec typo added Dict to construct ObservationSeries, and added == operations added storage of observation inverses
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is excellent, thank you for all the work on this!
I made a few comments in the code. Besides these comments, I was wondering why you removed the multiple samples of y in Cloudy_example_eki.jl
and aerosol_activation.jl
? It seems like it would be useful to have an example with multiple samples, especially now that they can be handled better.
X = FT.((u .- mean(u, dims = 2)) / sqrt(m - 1)) | ||
Y = FT.((g .- mean(g, dims = 2)) / sqrt(m - 1)) | ||
Ω = inv(I + Y' * Γ_inv * Y) | ||
w = FT.(Ω * Y' * Γ_inv * (y .- mean(g, dims = 2))) | ||
tmp = get_buffer(get_process(ekp)) # the buffer stores Y' * Γ_inv of [size(Y,2),size(Y,1)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole section of code is quite difficult to read and verbose. Any way it can be simplified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+ 1 . If it can't be simplified, some additional comments would also be helpful (and commented out code snippets removed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I managed to allocate all the buffers up front, then I could remove the logic for the computation, I think it looks much cleaner now!
X = FT.((u .- mean(u, dims = 2)) / sqrt(m - 1)) | ||
Y = FT.((g .- mean(g, dims = 2)) / sqrt(m - 1)) | ||
Ω = inv(I + Y' * Γ_inv * Y) | ||
w = FT.(Ω * Y' * Γ_inv * (y .- mean(g, dims = 2))) | ||
tmp = get_buffer(get_process(ekp)) # the buffer stores Y' * Γ_inv of [size(Y,2),size(Y,1)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+ 1 . If it can't be simplified, some additional comments would also be helpful (and commented out code snippets removed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing the comments - Looks good to me
@eviatarbach In response to your comment on the examples, before we were not actually using more than one sample even though we gathered many, so what was happening in those examples did not really make sense anyway. My replacement does not effect the functionality. However you are right - we could include the multiple samples in examples in future, but perhaps this can be left to a future PR? |
LGTM! Thank you. |
Purpose
Closes #382
Closes #383
Closes #385
Improves some of the getters in #386
Summarized by the docs here
and the API docs here
Content
Observations
object. And Replaces the old and largely uselessObservation
objectObservation
ObservationSeries
andMinibatcher
object. that can storey
'sGamma
's and minibatching framework to batch up epochs over them.ObservationSeries
rather thany
and\Gamma
separately.y
and blocked-\Gamma
from theObservationSeries
object for the current minibatchupdate_minibatch!(
is called at the end of each EKP step, updating the batch. At the end of the epoch, this also callscreate_new_epoch!(
for the minibatcher to create a new epoch of minibatchesObservation
stores the inverse obs noise cov too.y
andGamma
to EKPObservation
object