Description
The proposal is to change the API of documented functions and classes in pyhf.infer so that users can directly access the best-fit model parameters that implicit fits have found, toggled by a keyword flag. Here "implicit fits" means the calls to pyhf.infer.mle.fit and pyhf.infer.mle.fixed_poi_fit in the test statistic implementations (i.e. in pyhf.infer.test_statistics._tmu_like and pyhf.infer.calculators.generate_asimov).
"Model parameters" means both the nuisance parameters (NPs) and, where it is being optimised, the parameter of interest (POI).
Related inconvenience
I had encountered some funny/unexpected p-values coming out of pyhf.infer.hypotest, and I wanted to inspect (and plot predicted physical distributions for) the parameter values that gave rise to them.
This is not possible in the current API, so I re-ran the fits myself to get the parameter values. That is slow on large and complex models, and carries a small risk of divergence creeping in between my user-implemented calls to the fits and the internal ones.
The solution I'd like
The documented test statistic functions in pyhf.infer.test_statistics (qmu, tmu, qmu_tilde, tmu_tilde, q0) should accept
an optional kwarg which causes the function to return a tuple of the test statistic value and the tensor of fitted parameters. (pyhf.infer.test_statistics._tmu_like already has such a kwarg.)
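To make the proposed toggle concrete, here is a minimal sketch of the return pattern, using a stand-in "fit" and the hypothetical kwarg name `return_fitted_pars` (neither the function body nor the kwarg name is the actual pyhf implementation):

```python
# Sketch of the proposed return-toggle pattern for the documented test
# statistics. The body is a stand-in; `return_fitted_pars` is a
# hypothetical kwarg name, not (yet) part of the pyhf API.

def qmu_like(mu, data, return_fitted_pars=False):
    """Toy 'test statistic': squared distance of mu from the data mean."""
    # Stand-in for the implicit fits: pretend the best-fit parameter
    # is simply the mean of the data.
    fitted_pars = sum(data) / len(data)
    ts_value = (mu - fitted_pars) ** 2

    if return_fitted_pars:
        # Opt-in: return a tuple of (test statistic, fitted parameters).
        return ts_value, fitted_pars
    # Default: unchanged return value, so existing callers keep working.
    return ts_value


# Default call is backward compatible ...
ts = qmu_like(1.0, [0.5, 1.5, 2.5])
# ... while opting in also exposes the parameters from the implicit fit.
ts2, pars = qmu_like(1.0, [0.5, 1.5, 2.5], return_fitted_pars=True)
```

Defaulting the kwarg to False keeps the current return type, so the change would be backward compatible for existing callers.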
Similarly, pyhf.infer.calculators.generate_asimov should accept the same kwarg,
toggling the return of the Asimov parameters alongside the Asimov data if True.
The pyhf.infer.calculators.AsymptoticCalculator should have a property holding the collection of fit results used to find the test statistic values and Asimov data (in a dataclass, maybe?). If it is accessed before pyhf.infer.calculators.AsymptoticCalculator.teststatistic has been called, a RuntimeError should be raised (as pyhf.infer.calculators.AsymptoticCalculator.distributions currently does in the same case).
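The guard on such a property could mirror how `distributions` behaves today. A minimal sketch, with all class, method, and attribute names hypothetical rather than the real pyhf code:

```python
# Sketch of a calculator property that raises until `teststatistic`
# has been called, mirroring the behaviour the proposal describes for
# AsymptoticCalculator.distributions. All names here are hypothetical.

class SketchCalculator:
    def __init__(self):
        # Populated as a side effect of computing the test statistic.
        self._fit_results = None

    def teststatistic(self, mu):
        # Stand-in for the real fits: record the "fitted parameters"
        # found while evaluating the test statistic and Asimov data.
        self._fit_results = {"free_fit": [mu, 1.0], "fixed_poi_fit": [mu, 0.9]}
        return mu ** 2

    @property
    def fitted_pars(self):
        # Accessing the fit results before `teststatistic` runs is a
        # user error, so fail loudly rather than return None.
        if self._fit_results is None:
            raise RuntimeError(
                "need to call .teststatistic(mu) before accessing fit results"
            )
        return self._fit_results


calc = SketchCalculator()
# calc.fitted_pars  # would raise RuntimeError at this point
calc.teststatistic(1.0)
print(calc.fitted_pars["free_fit"])
```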
The ToyCalculator can IMO be left unchanged. (If I am not confident in the behavior of one fit, or if re-running 5 fits is too slow for me, I am not going to run 10k fits for 2k toys anyway.)
Higher-level functions using the AsymptoticCalculator (i.e. pyhf.infer.hypotest)
should accept the same kwarg as the test statistic functions above, but instead of a single tensor of parameters
they should return the collection of best-fit-parameter tensors of the underlying calculator, or raise a descriptive error if the ToyCalculator is used.
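The hypotest-level behaviour could simply forward to the calculator and fail descriptively when the calculator does not record fit results. A sketch under the same assumptions as above (all names hypothetical, the real hypotest signature is more involved):

```python
# Sketch of a hypotest-level kwarg forwarding to the calculator.
# `return_fitted_pars` and `fitted_pars` are hypothetical names.

class AsymptoticLike:
    # Stand-in for a calculator that recorded its implicit-fit results.
    fitted_pars = {"free_fit": [1.0, 0.98]}

class ToyLike:
    # Toys would not record per-fit parameters under this proposal.
    pass

def hypotest_like(calculator, return_fitted_pars=False):
    pvalue = 0.05  # stand-in for the real CLs computation
    if not return_fitted_pars:
        return pvalue
    if not hasattr(calculator, "fitted_pars"):
        # Descriptive error instead of a silent None for the toy case.
        raise ValueError(
            "return_fitted_pars is only supported with the asymptotic calculator"
        )
    return pvalue, calculator.fitted_pars
```

As with the test statistics, the default keeps the current return type, so only users who opt in see the tuple.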
Describe alternatives you've considered
Instead of changing a bunch of APIs at once, only the test statistic implementations (or those plus the AsymptoticCalculator) could be changed. I dislike this alternative since it still encourages users to duplicate the higher-level functionality.
Instead of hypotest returning only the best-fit parameters, it could have a kwarg toggling it to return the entire calculator. This gives users even more flexibility, but IMO it is a good thing that hypotest is a limited/"garden-path" way of accessing the functionality provided by the calculators. If I don't need that, I can use the calculator myself (and the docs show how).
Even on simple models where fits converge quickly, many thousands of fits take time. So a keyword-only argument on ToyCalculator.__init__ that toggles saving the fit parameters may still be convenient (supposing one wants to plot spreads of parameters for some reason, or pull out the "one weird fit" after the fact). Not doing so is simpler, though, so I would not do this for now.
Adding the option to access fitted parameters to pyhf.infer.intervals.upperlimit. Here, similar arguments apply as for the ToyCalculator. Further, the returned result is already complicated and conditional, and recreating n hypotest calls for n mu values is very straightforward. So I would leave upperlimit as it is.
Additional context
This issue describes an idea that grew out of this discussion: #1547.
PS: sorry for the length, I played around a little with ways to do this, and wanted to write down my thoughts so I don't forget them when getting started on the PR