-
-
Notifications
You must be signed in to change notification settings - Fork 263
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add HSGP reference and example nb #647
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
I've started a review, that I'll push directly to your branch when ready |
thanks @AlexAndorra! Also very open to splitting this up -- it is pretty long, sorry. That might make it easier to review. |
Would be cool to have |
Perhaps emphasize in the text what you are looking for in the gram plots (the bright yellow diagonal)? |
Would it be worth having an alias for |
Added this text:
Yah I agree! it'd clean the nb up a bit too. Any thoughts on a name? I really don't like how vague "approx_params" is but couldn't think of anything particularly better.
Definitely, especially now I think it's a good idea that users access it directly sometimes. |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:20Z In #627 I added the BibTex references for the main articles, you you can use them here ;)
Also, it might be nice to add reference links to the classes in thee PyMC docs (also see #627) |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:21Z https://github.com/arviz-devs/preliz is a nice package to do these plots ;) |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:22Z It would be nice to add the true values via ref_val ( and remove overlapping titles XD) AlexAndorra commented on 2024-05-21T15:37:59Z Done |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:23Z What about adding the HDO of the likelihood? In many applications one is interested in both, the GP and the whole fit. The reason I bring this in is because in many PyMC examples this is not done, so this could be a good place to have it ;) bwengals commented on 2024-03-29T00:43:23Z What's HDO stand for (though I think I know what you mean)? Yeah, that's a good idea
juanitorduz commented on 2024-03-29T09:35:50Z HDI 🤪 AlexAndorra commented on 2024-05-21T15:58:16Z Done |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:24Z "Additive processes result in a product of the power spectral densities." I think I know what you mean by this statement because I have read the papers carefully, but maybe you want to explain a bit more? bwengals commented on 2024-03-29T00:47:21Z you're right this isn't well written. How about:
Instead of constructing and then directly adding them, the sum of two HSGPs can be computed more efficiently by first taking the product of their power spectral densities, and then creating a single GP from the combined power spectral density. This reduces the number of unknown parameters because the two GPs can share the same basis set.
Still feels like a mouthful, but lmk what you think. I don't want to explain what a PSD here, or any theory really, so maybe this is a good place to cite something too. juanitorduz commented on 2024-03-29T09:38:37Z I think this version is clearer. You can think about adding this into a little box so that users see it as a side remark
{admonition} Adding HGSPs Instead of constructing and then directly adding them, the sum of two HSGPs can be computed more efficiently by first taking the product of their power spectral densities, and then creating a single GP from the combined power spectral density. This reduces the number of unknown parameters because the two GPs can share the same basis set.
See https://jupyterbook.org/en/stable/content/content-blocks.html AlexAndorra commented on 2024-05-21T16:46:11Z Done |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:25Z Line #18. m_options = [3, 3, 5] Can you add a legend for m? At the moment It is not clear which curve has a larger m. bwengals commented on 2024-03-29T00:50:58Z hmm, maybe I could say something in text? A legend for each panel clutters things quite a bit. The first panel has 3 basis functions, the middle has 3, and the rightmost has 5.
This actually doesn't line up with the text, which just says there are 3 basis functions per panel. I'll change it to 3 for all panels.
|
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:25Z I think (If I have not missed something) this is the first time you talk about "first eigenvector". Maybe is worth commenting above that the basis functions come as eigenvectors of the Dirichle's Laplacian? bwengals commented on 2024-03-29T00:54:20Z Oh good point. What if restrict myself to saying just "basis vector"? I feel like saying eigenvectors of Dirichlet's Laplacian will only be meaningful to some people. And if you wanna actually know what that means the papers are the source for that info.
juanitorduz commented on 2024-03-29T09:39:50Z basis vector seems good! |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:26Z geek nitpick: do not use bwengals commented on 2024-03-29T00:56:23Z I would agree... But, I'm using the notation from the "Practical HSGP" paper, so blame them
juanitorduz commented on 2024-03-29T09:40:37Z ahaha ok XD |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:27Z I get the point but to be honest, to the naked eye all these plots look very similar and are hard to use as a way of validating the approximation. Is there a numerical metric to compare these matrices? maybe a simple matrix norm? AlexAndorra commented on 2024-03-26T15:41:51Z What do you mean by "a lower fidelity HSGP approximation" here? bwengals commented on 2024-03-29T01:03:34Z You're right @juanitorduz, the differences don't jump out, except for maybe the top right. You can also see it in the upper left and bottom right blocks on the bottom row. I could lower m and c a touch to make them worse, or maybe add better explanation in the text?
I agree that matrix norm makes sense, but I don't want to propose any kind of approach here... since I have no idea what norm would be a good cutoff. And I also don't think this is a widely useful approach anyway because maybe K is too big to calculate directly.
The goal was more to merely get a feel for whats going on that makes a bad HSGP approximation bad. And so that would show how someone might start doing some detective work to see if the approximation would be good enough for their particular use case.
@AlexAndorra, I was meaning like, just low fidelty = bad approximation. The GP approximation is too "compressed" in a sense and so it cant reproduce the true GP. juanitorduz commented on 2024-03-29T09:43:23Z I think both suggestions would improve this : - lower m and c a touch to make them worse - Add explanation in the text where to focus concretely AlexAndorra commented on 2024-05-21T19:35:33Z Done |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:28Z Maybe comment a bit the meaning of the two regions? AlexAndorra commented on 2024-03-26T19:44:15Z I think it's just that the test set in the region where x1 > 2 and x1 < 4 bwengals commented on 2024-03-29T01:05:42Z Yup, what Alex said. This is written nowhere though so I'll add some text for this AlexAndorra commented on 2024-05-21T15:30:04Z Done |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:29Z I suggest adding an into sentence as: Here is the model structure. Below we describe its main components ... AlexAndorra commented on 2024-05-21T15:30:15Z Done |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:30Z fix overlapping titles AlexAndorra commented on 2024-03-28T15:14:39Z Done on my version, I'll push it to Bill's branch ASAP bwengals commented on 2024-03-29T01:07:35Z @juanitorduz, how do you do that with arviz? It drives me crazy... is there a trick or is it super manual? juanitorduz commented on 2024-03-29T09:45:47Z Yes: I always use this code to plot traces / rank plots:
AlexAndorra commented on 2024-05-21T15:32:52Z Done |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:31Z Is there any metric to evaluate the fits which is not purely visual? I have not worked much with 2-dim GPs AlexAndorra commented on 2024-03-28T15:15:14Z These plots are absolutely beautiful!! bwengals commented on 2024-03-29T01:09:34Z thanks Alex! the qmc thing is what made them actually look nice. I think I should probably use a diverging colormap that's white at zero...
RE metrics, I dont know of anything specific to GPs. Any lack of fit metric? juanitorduz commented on 2024-03-29T09:47:50Z This was more of a question (outside the scope of the notebook) XD |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:31Z add title anx axis labesl ;) bwengals commented on 2024-03-29T01:11:16Z good call, i'll add these.
Also, this figure is rendering absolutely massive in reviewnb, I hope it doesn't look like this in the docs. Do you know of recommendations for figure sizes in pymc-examples? juanitorduz commented on 2024-03-29T09:49:31Z I usually create figures as
s = pm.draw(pm.Lognormal.dist(mu=0.5, sigma=0.5), draws=10_000) fig, ax = plt.subplots(figsize=(10, 6)) ax.hist(s, np.linspace(0, 5, 100)); juanitorduz commented on 2024-03-29T09:49:58Z This way you can specify the sizes directly AlexAndorra commented on 2024-03-29T14:59:27Z I handled the size issue in the version I'm working on @Bill AlexAndorra commented on 2024-03-29T15:01:02Z That being said, I don't think this particular plot is useful, I'd remove it. Which point are you tring to convey? |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:32Z maybe we use a dicrete color palette as the blues intensity does not have a meaning? bwengals commented on 2024-03-29T01:12:58Z I did originally, but changed it to blues so it was comparable to the kronecker example.
Still makes sense to change it though, but in case you hadnt gotten to the kronecker part yet, what do you think in that context? juanitorduz commented on 2024-03-29T09:51:00Z Right! after the kronecker part this makes sense. No strong opinion here. Both work together. |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:33Z I think you never introduced what a "spectral density" is. so maibe is worth add a comment at the beginning? AlexAndorra commented on 2024-03-29T17:17:11Z +1 |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:34Z why are the pymc spectral densities not vectorized? juanitorduz commented on 2024-03-25T20:55:32Z Can you add a reference to equation (3) in [Practical Hilbert space approximate Bayesian Gaussian processes for probabilistic programming](https://link.springer.com/article/10.1007/s11222-022-10167-2) bwengals commented on 2024-03-29T01:15:35Z In PyMC they are vectorized over input dimension for a single GP with different lengthscales for each dimension. This one is vectorized over lengthscale, but assumes input_dim = 1 and each GP has a different lengthscale.
Tried and couldnt figure out how to do both in the PyMC version... I bet it's possible though juanitorduz commented on 2024-03-29T09:51:42Z ah ok! thanks for the clarification :) AlexAndorra commented on 2024-03-29T17:18:23Z Can you copy the PyMC implementation and spot out the changes you made, to make them more explicit? I think it'd help understanding, and reproducibility for other kernels |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:35Z This is a fantastic example! |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:36Z fix overlaping titles AlexAndorra commented on 2024-03-29T20:25:06Z Done |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:37Z Here the dicrete palette would be easier to read so that we can identify each of the 10 GPs AlexAndorra commented on 2024-03-29T20:55:16Z Done |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-03-25T20:22:38Z I need more time to diggest this example! AlexAndorra commented on 2024-03-31T22:18:48Z Do you think we can / need to detail the idea of Kronecker structure a bit more @Bill? I feel like this version goes a bit too fast
|
The first two examples, for the Basic Usage NB are ready for review @fonnesbeck @juanitorduz @bwengals 🍾 |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-05-27T08:41:52Z In #627 I added some of these references to the AlexAndorra commented on 2024-05-27T15:15:08Z Do I have to do that? I personally really dislike this syntax and try to avoid it as much as possible 🙈 juanitorduz commented on 2024-05-27T16:29:35Z hehe ok! I do not think this is a strong requirement . |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-05-27T08:41:52Z Maybe add a legend to the plot to explain the meaning of the shared region? AlexAndorra commented on 2024-05-27T17:28:23Z Done |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-05-27T08:41:53Z Add comment on the results and introudece the final plot (just to continue with the storyline) AlexAndorra commented on 2024-05-27T17:31:53Z Done |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-05-27T08:41:54Z you could wrapp thin into a AlexAndorra commented on 2024-05-27T17:48:09Z I think I've done it, but please make sure to check, as it's really hard to know what the result will be with that syntax 🙏 juanitorduz commented on 2024-05-27T18:22:55Z It worked! See https://pymcio--647.org.readthedocs.build/projects/examples/en/647/gaussian_processes/HSGP-Basic.html |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-05-27T08:41:54Z Regaring AlexAndorra commented on 2024-05-27T17:53:34Z Done |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-05-27T08:41:55Z AlexAndorra commented on 2024-05-27T17:54:24Z Yes, x1, the one on the y-axis |
View / edit / reply to this conversation on ReviewNB juanitorduz commented on 2024-05-27T08:41:56Z Aagain, wrapping this into a AlexAndorra commented on 2024-05-27T17:55:49Z Mmmmh, here I think it's a bit too big for a tip actually, especially since I'd say it's something we'd like people to read, not ignore as a "nice-to-know" |
@AlexAndorra This looks great! I left some minor suggestions. I think it would be nice to follow the style guide that suggests direct link to the PyMC classes and methods (e.g. {class} |
Do I have to do that? I personally really dislike this syntax and try to avoid it as much as possible 🙈 View entire conversation on ReviewNB |
Thanks @juanitorduz ! On it right now. Question though:
Isn't that what I did already? Note that I'm really not versed into this syntax (I really dislike it, so there is a high chance my question is stupid) |
hehe ok! I do not think this is a strong requirement . View entire conversation on ReviewNB |
I do not think so. For example, you have: The but it would be nice to have The {class} so that the link is clickable so that the user can get into the class definition in the PyMC doc. See for example, the raw file https://raw.githubusercontent.com/pymc-devs/pymc-examples/main/examples/gaussian_processes/GP-Births.myst.md where you can see how the classes are written so that they are clickable as in https://www.pymc.io/projects/examples/en/latest/gaussian_processes/GP-Births.html For more details see https://pymc-data-umbrella.xyz/en/latest/about/contributing_to_documentation/docs_presentation.html#cross-referencing-roles I do not think this is a blocker. But is very useful for reference and tutorial notebooks ;) |
Done View entire conversation on ReviewNB |
Done View entire conversation on ReviewNB |
I think I've done it, but please make sure to check, as it's really hard to know what the result will be with that syntax 🙏 View entire conversation on ReviewNB |
Done View entire conversation on ReviewNB |
Yes, x1, the one on the y-axis View entire conversation on ReviewNB |
Mmmmh, here I think it's a bit too big for a tip actually, especially since I'd say it's something we'd like people to read, not ignore as a "nice-to-know" View entire conversation on ReviewNB |
Thanks a lot for the quick reviews @juanitorduz 🥳 |
The class reference worked! See I think it would be nice to have the same syntax in the intro, as is the first entry point of these classes. Besides this, it looks great! |
Just reran the NB with the latest changes of the PR on the PyMC repo. Actually, let's wait for this PR to be merged before merging that one |
Just merged #7335 @juanitorduz , we can merge this one 🥳 |
Fantastic work @AlexAndorra ! |
HSGP reference and examples
Corresponding to issue #646. The first two examples, for the Basic Usage NB are ready for review 🍾
NB: needs this version of PyMC to run