Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initial gifs for priors #45

Merged
merged 11 commits into from
Sep 7, 2021
Merged

initial gifs for priors #45

merged 11 commits into from
Sep 7, 2021

Conversation

odunbar
Copy link
Collaborator

@odunbar odunbar commented Aug 27, 2021

Resolves #44

Check out the parameter distributions section in the documentation

TODO:

  • Replace @example with @setup blocks, to avoid documenter bug
  • slower gifs
  • include the building of the prior for each animated example.
  • added transform of normal distribution means
  • added transform of normal distribution +/- sd

@haakon-e
Copy link
Member

Looks mostly good.
I think the gifs can be slower.
You could experiment with adding a vertical line to denote the mean, to visualize the relationship between the mean of the normal variable and the mean of the transformed distribution.
Similarly, +/- 1 or 2sd can be shaded if it's not too hard to implement.

@haakon-e
Copy link
Member

haakon-e commented Sep 2, 2021

@haakon-e
Copy link
Member

haakon-e commented Sep 2, 2021

On this line: https://github.com/CliMA/EnsembleKalmanProcesses.jl/pull/45/files#diff-7ae191bc35679cdac72255325c38d8565dd63e1a5f3e0fc4bd5d1e90b25b0859R79

Could add: "... the physical parameter has the (provided) lower and upper bounds"

Copy link
Member

@haakon-e haakon-e left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a "full" read-through. It looks quite good now.

Would it help new users to be explicit about how the the push-forward should work?
For example, if I want to set up a bounded_below(0.0) constraint, knowing the parameter is O(500), what should I set the mean of the normal to? (My understanding is log(500), but perhaps it would be useful to make this explicit?

Alternatively, would it be useful to add a few lines of code to show how you could plot the resulting distribution for given fixed parameters? That may be useful if I need to set up a specific problem.

docs/src/parameter_distributions.md Outdated Show resolved Hide resolved
docs/src/parameter_distributions.md Show resolved Hide resolved
docs/src/parameter_distributions.md Outdated Show resolved Hide resolved
docs/src/parameter_distributions.md Show resolved Hide resolved
docs/src/parameter_distributions.md Outdated Show resolved Hide resolved
docs/src/parameter_distributions.md Outdated Show resolved Hide resolved
Copy link
Member

@trontrytel trontrytel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is very helpful! It reads well and should provide good example on how to start specifying your priors. Really like the animations

docs/src/parameter_distributions.md Outdated Show resolved Hide resolved
docs/src/parameter_distributions.md Outdated Show resolved Hide resolved
docs/src/parameter_distributions.md Show resolved Hide resolved
Copy link
Contributor

@bielim bielim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this a lot. One thing I was wondering is whether users would appreciate the ability to plot ParameterDistributions, both in unconstrained an in constrained space. E.g., one could provide a function plot_pdf(xarray::Array{<:Real,1}, pd::ParameterDistribution, constrained::Bool=false, kw...) that takes a parameter distribution and plots either the pdf of the unconstrained distribution or that of the constrained distribution. I haven't thought carefully about how best to implement this, but I think that people will often struggle to come up with parameters for the unconstrained prior (e.g., the mean of 0.5 and standard deviation of 1 used in the example), and being given an easy way of seeing what the distribution of the unconstrained pdf gets mapped into when sending it into the constrained space might help with this.
(This would probably be a separate PR.)

docs/src/parameter_distributions.md Outdated Show resolved Hide resolved
docs/src/parameter_distributions.md Outdated Show resolved Hide resolved
@odunbar
Copy link
Collaborator Author

odunbar commented Sep 3, 2021

@bielim RE your point about plotting. I'm a little wary of it just because plotting is quite problem specific, and only clear in small dimensions.

I think it would be worth looking being able to construct a pdf function. But we have to be a little careful as in a case where part or all of your distribution comes from correlated samples, you have to also make a histogram/KDE in those dimensions (which would involve further user choices of bin size etc)

@odunbar
Copy link
Collaborator Author

odunbar commented Sep 6, 2021

@haakon-e @bielim could you add any further comments after my latest changes?

Copy link
Member

@haakon-e haakon-e left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Let's merge.

@bielim
Copy link
Contributor

bielim commented Sep 7, 2021

@bielim RE your point about plotting. I'm a little wary of it just because plotting is quite problem specific, and only clear in small dimensions.

I think it would be worth looking being able to construct a pdf function. But we have to be a little careful as in a case where part or all of your distribution comes from correlated samples, you have to also make a histogram/KDE in those dimensions (which would involve further user choices of bin size etc)

Yep, I agree that it would require a lot of thought and work to implement plotting capability in a reasonably general way. But perfect is the enemy of good, so maybe even if the implementation will be limited to simple cases (e.g., only for distributions from independent samples) it would still be useful. I think that in practice, the choice of priors is one of the most difficult aspects of using EnsembleKalmanProcesses.jl , as it requires one to understand the concept of transformations between constrained and unconstrained spaces (which this PR does a great job of explaining), and also because most scientists working with a model really tend to think about parameter distributions in the constrained model space (rather than in the unconstrained space where the algorithm takes place), since that is what they have some physical insight or intuition about. So it would be nice for them to see what parameter distributions they are feeding into their models. Anyway, as I mentioned in the original comment, I didn't mean to suggest that this functionality should be part of this PR, but I'll open an issue just so the idea doesn't get lost.

@bielim
Copy link
Contributor

bielim commented Sep 7, 2021

And this now looks good to me, too! Bors it in!

Copy link
Member

@trontrytel trontrytel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great!

@odunbar
Copy link
Collaborator Author

odunbar commented Sep 7, 2021

Thanks everyone!

@odunbar
Copy link
Collaborator Author

odunbar commented Sep 7, 2021

bors r+

@bors
Copy link
Contributor

bors bot commented Sep 7, 2021

Build succeeded:

@bors bors bot merged commit 78473b0 into main Sep 7, 2021
@bors bors bot deleted the orad/prior-docs branch September 7, 2021 17:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Visualization of priors in Docs
4 participants