
Simple syntax for querying the model and posterior sample #989

Closed
mohamed82008 opened this issue Nov 22, 2019 · 30 comments

@mohamed82008
Member

mohamed82008 commented Nov 22, 2019

In this issue, I propose the creation of a new macro @logprob to query Turing models as well as the sampled posterior. Let's take the following model as an example using the syntax of #965:

@model demo(x, y) = begin
    a ~ Normal()
    b ~ Gamma()
    y .~ Normal.(a .* x, b)
end
model = demo(rand(100), rand(100))
chain = sample(model, NUTS(10, 0.65), 10000)

I propose the following syntax:

  • @logprob a = 1.0, b = 2.0 | model = model -> returns the log prior probability
  • @logprob a = 1.0, b = 2.0, x = rand(100), y = rand(100) | model = model -> returns the log joint probability
  • @logprob 0.2 <= a <= 0.3, 1.0 <= b <= 1.1 | model = model, chain = chain -> returns the log of the fraction of samples in chain satisfying 0.2 <= a <= 0.3 and 1.0 <= b <= 1.1. For discrete distributions, we can also use a = 2 or a == 2, for example. Passing model here can be optional.
  • @logprob x = rand(10), y = rand(10) | model = model, chain = chain -> returns the log likelihood of x = rand(10), y = rand(10) for each sample in chain
  • @logprob x = rand(10), y = rand(10) | model = model, a = 1.0, b = 2.0 -> returns the log likelihood of x = rand(10), y = rand(10) using a = 1.0, b = 2.0

Let me know if you have comments on the syntax or if I missed any use case.
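For concreteness, here is a rough sketch of what the first two queries above would compute for the demo model, written directly against Distributions.jl rather than through the proposed macro (the variable names are illustrative, not part of the proposal):

```julia
using Distributions

# stand-ins for the model's data and the queried parameter values
x, y = rand(100), rand(100)
a, b = 1.0, 2.0

# what `@logprob a = 1.0, b = 2.0 | model = model` would return:
# the log prior density of (a, b) under Normal() and Gamma()
logprior = logpdf(Normal(), a) + logpdf(Gamma(), b)

# what the joint query would return: prior plus the likelihood
# of y under the observation model y .~ Normal.(a .* x, b)
loglik = sum(logpdf.(Normal.(a .* x, b), y))
logjoint = logprior + loglik
```

The macro would assemble exactly these kinds of sums from the `~` statements in the model body.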

@mohamed82008
Member Author

I can also provide a @prob macro that returns the probability instead of the log probability.

@mohamed82008
Member Author

I suggest we also allow = and == to be used interchangeably in the examples above.

@cpfiffer
Member

I like it, I think it'd be a good tool to have. I have no helpful comments, so I guess you're welcome?

@mohamed82008
Member Author

I am working on it.

@xukai92
Member

xukai92 commented Nov 26, 2019

@logprob a = 1.0, b = 2.0, x = rand(100), y = rand(100) | model = model -> returns the log joint probability

model is created by demo(rand(100), rand(100)) in which x and y are already provided. It's unclear to me what the log joint is supposed to return.

@devmotion
Member

devmotion commented Nov 26, 2019

I agree, it feels a bit strange to me. I think it would be more natural to write

@logprob a = 1.0, b = 2.0 | model = demo

for the prior probabilities p(a = 1.0, b = 2.0) and

@logprob a = 1.0, b = 2.0, x = x, y = y | model = demo

for the joint probability p(a = 1.0, b = 2.0, x = x, y = y). I'm not even sure if it should be a macro; couldn't one just define a function logprob that can be evaluated as logprob(demo, (a = 1.0, b = 2.0)) or logprob(demo, (a = 1.0, b = 2.0, x = x, y = y))? Maybe for convenience one could even define demo.logprob, which would allow writing demo.logprob(a = 1.0, b = 2.0).

@mohamed82008
Member Author

mohamed82008 commented Nov 26, 2019

@xukai92 the log joint probability case will replace the data inside model and compute the log joint probability for the given data and parameters. If that's not intuitive enough, I can limit this macro to the likelihood and posterior only. Convenience functions can then be made to calculate the prior and joint probability for NamedTuple parameters, as @devmotion suggested. In the case of the joint probability, the data inside the model can be used.

Actually, for all of them I can just make a function under the hood that can be called instead of the macro. So I have a few questions for you all:

  1. Do you prefer a macro and a function API, a function only API or a macro only API?
  2. Should I include the prior and joint probability syntax in the macro with the model = demo suggestion?
  3. Is conveniently calculating the log joint probability for new data and new parameters useful in practice or not?

@cpfiffer
Member

I would actually prefer to have both. People who do a lot of automated model stuff would prefer a function API, whereas all the folks who are messing around with the REPL prefer the macro API. Might be that one ends up being better than the other, so we could deprecate one if it is just less useful.

@mohamed82008
Member Author

Sounds good.

@xukai92
Member

xukai92 commented Dec 2, 2019

Re 1: I agree with Cameron that we do need both.
Re 2: I feel model = demo is more intuitive than model = model.
Re 3: I think it's useful in general.

@jonasmac16
Contributor

Will this also allow returning the pointwise logp of each data point given a parameter set?

@mohamed82008
Member Author

Yes

@jonasmac16
Contributor

jonasmac16 commented Dec 10, 2019

That's pretty cool. Is this available once the referenced PR is accepted?

@mohamed82008
Member Author

Yup!

@yebai
Member

yebai commented Dec 17, 2019

@logprob 0.2 <= a <= 0.3, 1.0 <= b <= 1.1 | model = model, chain = chain -> returns the log of the fraction of samples in chain satisfying 0.2 <= a <= 0.3 and 1.0 <= b <= 1.1. For discrete distributions, we can also use a = 2 or a == 2, for example. Passing model here can be optional.

I am slightly confused by the above feature. It seems like this should be a filter-style function in the MCMCChains package. Also, what are the potential use cases?

@cpfiffer
Member

Hypothesis testing, for one. If you wanted to have some measure of confidence about whether a parameter is negative and you wanted a quick check:

@logprob a >= 0 | model=model, chain=chain

I think it's a little nicer to work with than a Chains object, and it's very quick to write.

@devmotion
Member

devmotion commented Dec 17, 2019

I am slightly confused by the above feature. It seems like this should be a filter-style function in the MCMCChains package.

Indeed, IMO that feature seems more like one of the other convenience functionalities provided by MCMCChains. It is also not clear to me why specifying model = model would be useful in that example.

The nice thing about using a function instead of a macro would be that one could just dispatch on Chains (or AbstractChains) and implement the chain-based evaluations in MCMCChains. Since there seems to be agreement that a macro is also needed, maybe one solution could be to provide a generic @logprob macro in Turing that parses the input and forwards NamedTuples (or something else?) to the functional variant. The macro feels slightly more restricted, however, since it seems to require some protected parameter names (such as model or chain).

I think it's a little nicer to work with than a Chains object, and it's very quick to write.

To me logprob(x -> x.a >= 0, chain) or

logprob(chain) do sample
    sample.a >= 0
end

doesn't feel too bad, and it is actually more powerful and flexible than a macro.
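A hypothetical sketch of this predicate-based form (the function name and sample representation are illustrative, not the actual MCMCChains API): the estimated probability of an event is simply the fraction of posterior samples satisfying the predicate, so its log is the log of that fraction.

```julia
# Estimate log P(predicate) from posterior samples as the log
# fraction of samples for which the predicate holds.
estimate_logprob(f, samples) = log(count(f, samples) / length(samples))

# samples as NamedTuples, standing in for rows of a chain
samples = [(a = 0.5,), (a = -0.2,), (a = 1.3,), (a = 0.1,)]
estimate_logprob(s -> s.a >= 0, samples)  # log(3/4)
```

The do-block form above would then be the same call with the predicate passed as the first argument.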

@cpfiffer
Member

To me logprob(x -> x.a >= 0, chain) or

logprob(chain) do sample
    sample.a >= 0
end

I like that! I think we could probably stick this on the MCMCChains side.

@mohamed82008
Member Author

I agree that this doesn't belong here. I actually left it out of #997.

@devmotion
Member

You're referring only to the filtering functionality, I guess? As far as I can see, #997 contains the chain-based feature

Query the likelihood given every sample in the chain using prob"y = rand(3) | chain = chain, model = demo"

@mohamed82008
Member Author

@devmotion yes, that feature is actually important; @trappmartin and @sethaxen have uses for it. IMO, this belongs in Turing (and in the future DynamicPPL) because the same VarNames we use in VarInfo are keys in the chain::Chains output by Turing.sample. I make use of that to feed chain[i], for example, into the VarInfo before running the model in the LikelihoodContext. This may not be possible for an arbitrary chain coming from other packages, and going from chain to a NamedTuple or another standard underlying representation is completely non-trivial for nested array types, e.g. when the parameter is an array of matrices. So model = demo, chain = chain makes sense for two reasons:

  1. Both demo and chain are used to compute the likelihoods, and
  2. chain was generated from demo to begin with, so the two are compatible, while other seemingly equivalent models may not work with chain and other seemingly equivalent chains may not work with demo.

@devmotion
Member

  2. chain was generated from demo to begin with, so the two are compatible, while other seemingly equivalent models may not work with chain and other seemingly equivalent chains may not work with demo.

To me that sounds like the syntax should be just y = rand(3) | chain = chain (without model = demo) to avoid any incompatibilities between chain and model, and just use the model saved in chain.info, which was the one used to generate chain.

This may not be possible for any arbitrary chain coming from other packages

Sure, but does this imply that this functionality can't be provided in, e.g., MCMCChains?

@mohamed82008
Member Author

Hmm using the model in chain sounds good, I had missed the fact that we can store these in chain. The only catch is that the chain.info field optionally stores the model in Turing, but that's easily fixable. Thanks!

@mohamed82008
Member Author

Sure, but does this imply that this functionality can't be provided in, e.g., MCMCChains?

Yes, we need access to the VarInfo, which is not seen by MCMCChains; maybe in a future PR we can consider moving things and interfacing.

@mohamed82008
Member Author

So here is a nice middle ground: if model is defined in chain, we use that; otherwise, we check if model was passed on the RHS. We do the same for varinfo. That way, even if the user forgot to save model and VarInfo in chain, we can still save the day. Sounds good?

@mohamed82008
Member Author

Hmm, never mind, that won't be necessary if chain always saves varinfo and model. Any reason not to save them by default in my PR, @cpfiffer?

@mohamed82008
Member Author

Even if it is saved by default, I will leave the backup there.

@cpfiffer
Member

No reason that I can think of. You could probably just set the keyword save_state=true and everything should be good.

@mohamed82008
Member Author

Cool, thanks!

@mohamed82008
Member Author

Closed via #997.
