MMM Evaluation, Diagnostics and MLFlow Registry #1368
Conversation
Commits:
- …e compliant with standards
- …nclude in calc_metric
- …conform with API guidelines and enable model registering
- …ing an MMMEvaluator class and moving functions there as methods. Breaking out pm.compute_log_likelihood into its own method. Moving over diagnostic functions from MLFlow into this module
- …V), adding remaining methods
- …Adding log_model to autolog, along with log_loocv()
- …metrics as distributions first, before then taking summary stats over the distributions of the metrics
- …ngle-purpose, just logging of LOOCV metrics and no loglikelihood computation
- … to metrics, removing MMMEvaluator and replacing with functions then updating the MLFlow module appropriately
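One of the commits above describes computing each metric as a distribution over posterior draws first, and only then taking summary statistics over that distribution. A minimal numpy sketch of that pattern (the function name and shapes are illustrative, not the PR's actual API):

```python
import numpy as np

def rmse_distribution(y_true: np.ndarray, y_pred_draws: np.ndarray) -> np.ndarray:
    """Compute RMSE per posterior draw, returning a distribution of the metric.

    y_true: shape (n_obs,)
    y_pred_draws: shape (n_draws, n_obs), one predicted series per posterior draw.
    """
    errors = y_pred_draws - y_true          # broadcasts over the draw axis
    return np.sqrt((errors ** 2).mean(axis=1))  # shape (n_draws,)

y_true = np.array([1.0, 2.0, 3.0])
draws = np.array([[1.0, 2.0, 3.0],   # perfect draw -> RMSE 0
                  [2.0, 3.0, 4.0]])  # off by 1 everywhere -> RMSE 1
dist = rmse_distribution(y_true, draws)

# Summary statistics are then taken over the metric distribution:
summary = {"mean": dist.mean(), "median": np.median(dist)}
```

The point of the ordering is that summarizing the metric distribution (rather than computing a metric on a single point estimate) preserves the posterior uncertainty in the reported number.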
WAIC is another good metric to include if the use case involves out-of-sample predictions.
Can we address this in another PR? This one is already months deep.
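For reference, WAIC can be computed directly from the pointwise log-likelihood. A minimal numpy sketch of the standard formula (elpd_waic = lppd − p_waic); in practice ArviZ's az.waic is the robust choice, since it uses logsumexp for numerical stability:

```python
import numpy as np

def elpd_waic(log_lik: np.ndarray) -> float:
    """elpd_WAIC from pointwise log-likelihoods, shape (n_samples, n_obs)."""
    # log pointwise predictive density: log of the mean likelihood per observation
    lppd = np.sum(np.log(np.mean(np.exp(log_lik), axis=0)))
    # effective number of parameters: sample variance of log-lik across draws
    p_waic = np.sum(np.var(log_lik, axis=0, ddof=1))
    return lppd - p_waic
```

With constant log-likelihood across draws the variance term vanishes, so elpd_waic reduces to the summed log-likelihood of a single draw.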
def load_mmm(
    run_id: str,
I'd rather this take model_uri, which is more similar to the sklearn mlflow function: https://mlflow.org/docs/latest/python_api/mlflow.sklearn.html#mlflow.sklearn.load_model
This would allow loading from a run or from a registered model.
If we'd like, we can add helpers to construct the run URI and the registered model URI.
Force-pushed from 23be92f to 9e92f6f:
- …ed in a separate PR
Some minor comments :) I will let @wd60622 review the mlflow components in detail as I have not been using it lately :)
@@ -1065,6 +1065,148 @@ def plot_channel_parameter(self, param_name: str, **plt_kwargs: Any) -> plt.Figu
        )
        return fig

    def plot_prior_vs_posterior(
Happy to take it out, but there are some things I prefer in plot_prior_vs_posterior:
- The option to sort by difference or alphabetically (which I find useful when diagnosing models with many channels)
- A legend giving you the differences in means, rather than having to calculate them and add them to the figure yourself
You think it's better to remove it, though?
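The sorting behaviour described above can be sketched independently of the plotting code: order channels by the absolute shift between prior and posterior means (or alphabetically), and keep the signed difference around for the legend. Function and argument names here are illustrative, not the PR's actual API:

```python
def sort_channels_by_shift(prior_means: dict[str, float],
                           posterior_means: dict[str, float],
                           alphabetical: bool = False) -> list[tuple[str, float]]:
    """Order channels for plotting, paired with their posterior-minus-prior mean shift."""
    diffs = {ch: posterior_means[ch] - prior_means[ch] for ch in prior_means}
    if alphabetical:
        return sorted(diffs.items())
    # Largest absolute shift first: channels the data moved most stand out on top.
    return sorted(diffs.items(), key=lambda kv: abs(kv[1]), reverse=True)

order = sort_channels_by_shift(
    {"tv": 0.5, "radio": 0.5, "search": 0.5},
    {"tv": 0.9, "radio": 0.45, "search": 0.2},
)
# tv shifted by +0.4, search by -0.3, radio by -0.05
```

Returning the signed differences alongside the channel names is what lets the legend report them directly instead of the reader computing them from the figure.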
It seems there is one error with a test:

____________________ ERROR collecting tests/test_mlflow.py _____________________
ImportError while importing test module '/home/runner/work/pymc-marketing/pymc-marketing/tests/test_mlflow.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/opt/hostedtoolcache/Python/3.10.16/x64/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_mlflow.py:29: in <module>
    from pymc_marketing.mlflow import (
E   ImportError: cannot import name 'log_summary_metrics' from 'pymc_marketing.mlflow' (/home/runner/work/pymc-marketing/pymc-marketing/pymc_marketing/mlflow.py)
Codecov Report
Attention: Patch coverage is

Additional details and impacted files:

@@ Coverage Diff @@
##             main    #1368      +/-   ##
==========================================
- Coverage   95.35%   93.86%   -1.49%
==========================================
  Files          47       48       +1
  Lines        4995     5135     +140
==========================================
+ Hits         4763     4820      +57
- Misses        232      315      +83

☔ View full report in Codecov by Sentry.
Thanks for the work on this @louismagowan
Description
Third time's the charm...
This PR is a recreation of this one, which I was told to close and restart due to Git issues on the pymc-marketing repo, and then this one, which was also closed due to some Git issues.
It could be nice to have some standardised model evaluation and diagnostic functions added to pymc-marketing. Ideally they'd be formulated in a way that makes them easy to log in MLflow later on.
It would also be cool to build on top of the MLflow module to create a custom mlflow.pyfunc.PythonModel class that allows users to register their models in the MLflow registry. This would let people serve and maintain their MMMs more easily, and could help with MMM refreshes too.
Standard model metrics could include:
etc.
Diagnostic metrics could include:
Some additional plots (also useful for diagnosing models):
Model Registry / Additional Logging Code:
A wrapper for an MMM model to make it conform to the MLflow API, enabling registering and easier deployment.
Also an option to load models from the registry, or to download the idata from MLflow.
I'll open this as a Draft for now, since I'll need advice on where best to put this code, as well as on the overall design.
Related Issue
Checklist
Modules affected
Type of change
📚 Documentation preview 📚: https://pymc-marketing--1368.org.readthedocs.build/en/1368/