-
Notifications
You must be signed in to change notification settings - Fork 252
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Budget optimizer refactor #1357
Budget optimizer refactor #1357
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
0196b3a
to
6c946f2
Compare
3a348d2
to
10b0783
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1357 +/- ##
==========================================
- Coverage 93.94% 93.80% -0.15%
==========================================
Files 48 48
Lines 5137 5198 +61
==========================================
+ Hits 4826 4876 +50
- Misses 311 322 +11 ☔ View full report in Codecov by Sentry. |
yay! 🚀 Let me know if you need any support! |
10b0783
to
36985c4
Compare
Why does this have a CLV label? |
Seems like from file change bot. Then the force push made any clv files have no changes. I removed the clv label |
36985c4
to
7bed11a
Compare
Fixed the tests, NBs next |
@carlosagostini do I need to worry about other functions in https://github.com/pymc-labs/pymc-marketing/blob/7bed11a61b2310835863a7d092b9b63a2ac29081/pymc_marketing/mmm/utility.py Asking because we are no longer raveling samples / automatically summing budget dims inside the optimization code. Not sure if that could has any overlap with this change. |
7bed11a
to
0e42f11
Compare
742494e
to
23e7ccb
Compare
The notebook job stops at the first failure. Is there a way to have it try all the notebooks like pytest does with tests? I thought I was done when I fixed the last failure but now it went and failed in the next one. |
We will work on it 🙇! In the meantime, i will review this one either tonight or tomorrow 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ricardoV94 ! This looks great! I left some small comments.
I also see the budget notebook is failing maybe due to an error not being caught by the tests?
There is another relevant notebook which uses the budget optimization: https://www.pymc-marketing.io/en/stable/notebooks/mmm/mmm_case_study.html I can work ok it in a different PR :)
Thanks for the changes @ricardoV94 ! I think the only things missing are the notebooks. Do you want to work on that or do you wanna support so that so can focus on doing your magic ;) ) ? |
I'm on it! I will push soon (and revert the merge @cetagostini just did). I appreciate another deep review as I changed a couple things to make it more clean |
Absolutely 💪 ! |
The logic used here to extract the I noticed this was rather slow while checking if the mmm case study notebook is now passing :D |
e57d20f
to
b4c5848
Compare
Hey the push was trying to rebase to my local, hope didn't took away anything important. I was working on solve those issues: I came up with a quick helper function to create allocations and test. Now a few functions receive an array such as def _create_allocation(value, **kwargs):
"""
Create an xarray.DataArray with flexible dimensions and coordinates.
Parameters:
- value (array-like): The data values for the DataArray. Shape must match the dimensions implied by the kwargs.
- **kwargs: Key-value pairs representing dimension names and their corresponding coordinates.
Returns:
- xarray.DataArray: The resulting DataArray with the specified dimensions and values.
Raises:
- ValueError: If the shape of `value` doesn't match the lengths of the specified coordinates.
"""
# Extract the dimensions and coordinates
dims = list(kwargs.keys())
coords = {dim: kwargs[dim] for dim in dims}
# Validate the shape of `value`
expected_shape = tuple(len(coords[dim]) for dim in dims)
if np.shape(value) != expected_shape:
raise ValueError(f"The shape of 'value' {np.shape(value)} does not match the expected shape {expected_shape} based on the provided dimensions.")
# Create the DataArray
data_array = xr.DataArray(value, coords=coords, dims=dims)
return data_array They could do the following and use it, even with models with more dims. # Example 1: Single dimension (channel)
da1 = create_allocation(value=[1.56753944, 1.43246056], channel=["x1", "x2"])
# Example 2: Two dimensions (channel and hierarchy)
da2 = create_allocation(
value=[[1.5, 2.0], [2.5, 3.0]],
channel=["x1", "x2"],
hierarchy=["a", "b"]
) How does that sounds for you? |
b4c5848
to
3a12939
Compare
@cetagostini I did something simple for the dict->DataArray, for if isinstance(allocation_strategy, dict):
# For backward compatibility
allocation_strategy = DataArray(
pd.Series(allocation_strategy), dims=("channel",)
) It seems like you are reinventing a way to define DataArray. Perhaps it's just better to push users to be comfortable with them, since those are the objects we work with outside of PyMC/PyTensor models? |
@cetagostini I'm not against it though. That can possibly be done as a separate PR at this point, I guess? |
I think the approach you added works, and push users to get familiar. No against, I usually try to create helpers anyway, up to the users if want to get familiar and not use it, or use it. But I'm okay to address on maybe in other PR or #1538 |
3a12939
to
dc6a63c
Compare
Tests are passing! |
Wow! That was fast!!! Thanks! |
This PR refactors the budget optimizer to extract the response graph from the MMM model directly. The users don't have to pass the arguments needed to rebuild the response function, which simplifies things quite a lot and makes the optimizer able to handle arbitrarily complex models [citation needed]
There are some lingering issues. When transformations are done out-of model (as are the channel variables), we need to find them from the model. These would ideally be part of the model, so the optimizer doesn't have to worry about this.
Breaking changes:
total_channel_contributions
deterministic inMMM
models #1387TODO:
Closes #1331
📚 Documentation preview 📚: https://pymc-marketing--1357.org.readthedocs.build/en/1357/