Rethink our `Prior` class and how to work with priors in general #365

tomicapretto · 2021-07-05T22:23:03Z

At some point, we will have to update how we ask users to specify prior distributions. I think our approach is quite good for the current state of Bambi (basically, for GLMMs), but it won't be satisfactory in the nearby future if we add planned features such as splines.

Priors and current features

I think our system should be better in the two following scenarios (there may be more, these are the ones that come to my mind right now)

Specify the hyperprior for the standard deviation of a group-specific effect.

Right now, you can't modify just the hyperprior, you have to specify each term prior as a whole. For example, if we want to have a HalfStudentT hyperprior with certain parameter values for the sigma parameter we have to rewrite the whole prior:

priors = {
  "x|group": Prior("Normal", mu=0, sigma=Prior("HalfStudentT", nu=4, sigma=2))
}
Model("y ~ x + (x|group)", data, priors=priors)

For this case, brms has set_prior("<prior>", class = "sd", group = "<group>").

Priors for correlation matrices of group-specific effects.

Since #360 one can have correlated priors for group-specific terms. These effects have a multivariate normal prior with the correlation matrix having an LKJ prior. The eta parameter of this LKJ prior can be passed with the priors_cor argument in Model(). However, I think this shouldn't be the way to go in the future. Instead of adding more and more arguments, we should come up with a way to pass all the prior specifications to the priors argument.

Here brms has set_prior("lkj(2)", class = "cor").

Priors and planned features

If we manage to implement things like

Splines
Non-linear effects
Gaussian processes
More?

we will need to have something prepared to make things simpler when it comes to specifying priors specific to these models.

This section of the brms documentation is a good reference I think because it makes you realize how many things need to be taken into account when deciding how we handle priors. I'm not saying we should follow the same approach, but we should take it as a reference.

The text was updated successfully, but these errors were encountered:

aloctavodia · 2021-07-06T11:11:42Z

Following brms API, has at least two advantages. For the users, similar API between packages. For us, we do not have to think a lot about a good API for all those cases. I still would like to better learn about brms API.

An alternative, off the top of my head, is to offer something closer to PyMC3 syntax. So it would be possible to start from a bambi model and progressively turn it into a PyMC3 model, or the other way around.

tomicapretto · 2022-01-10T10:31:23Z

One alternative could use a target argument somewhere

For example:

bmb.Prior("Normal", mu=0, sigma=1, target="mu")

but that mixes the arguments of the prior distribution with the target argument. I'm not sure if that's intuitive and/or safe.
Maybe

priors = {
    "predictor": [
        {
            "prior": bmb.Prior("Normal", mu=0, sigma=1), "target": "mu"
        }, 
        {
            "prior": bmb.Prior("HalfNormal", sigma=1), "target": "sigma"
        }
    ]
}

but it can be too verbose?

Another idea is to match arguments positionally in a list/tuple

priors = {
    "predictor": [[bmb.Prior("Normal", mu=0, sigma=1), "mu"],[bmb.Prior("HalfNormal", sigma=1), "sigma"]]
}

but I still feel it may be confusing.

Either consciously or unconsciously I'm comparing with set_prior from brms in my mind. But our current API always asks users to pass the term (or classes of terms) together with the prior i.e. {"common": bmb.Prior(...)} while set_prior() has a default class value and you optionally pass the name of the terms..

Let's keep thinking about better approaches...

tomicapretto · 2022-01-13T19:51:08Z

We also need to consider multivariate terms, like the ones used in the new categorical family or the ones that will be used in ordinal models.

So far, I think the following would work to specify different means and sds for a multivariate parameter

prior = bmb.Prior("Normal", mu=[0, 0.5, 1], sigma=[0.5, 0.5, 0.5])

But it's not possible to different families as priors. This is because we assign the priors for all the dimensions in the multivariate term as a single PyMC distribution. I don't know if it's going to be possible/straightforward to allow different families for different dimensions.

For an example of what I'm saying see the first model in this article https://solomonkurz.netlify.app/post/2021-12-29-notes-on-the-bayesian-cumulative-probit/

tomicapretto · 2022-03-23T11:21:14Z

I've found brmp (like brms but in Python). It's interesting to review this documentation about Priors https://brmp.readthedocs.io/en/latest/priors.html. It seems to follow the brms way of doing it (as expected)

tomicapretto added the Discussion Issue open for discussion, still not ready for a PR on it. label Sep 17, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rethink our `Prior` class and how to work with priors in general #365

Rethink our `Prior` class and how to work with priors in general #365

tomicapretto commented Jul 5, 2021

aloctavodia commented Jul 6, 2021

tomicapretto commented Jan 10, 2022 •

edited

Loading

tomicapretto commented Jan 13, 2022

tomicapretto commented Mar 23, 2022

Rethink our Prior class and how to work with priors in general #365

Rethink our Prior class and how to work with priors in general #365

Comments

tomicapretto commented Jul 5, 2021

Priors and current features

Priors and planned features

aloctavodia commented Jul 6, 2021

tomicapretto commented Jan 10, 2022 • edited Loading

tomicapretto commented Jan 13, 2022

tomicapretto commented Mar 23, 2022

Rethink our `Prior` class and how to work with priors in general #365

Rethink our `Prior` class and how to work with priors in general #365

tomicapretto commented Jan 10, 2022 •

edited

Loading