Include probability of 0 purchases in time range (T, T+t] #1093

Closed
DylanZammit opened this issue Oct 19, 2024 · 3 comments · Fixed by #1094
Labels: CLV, duplicate (This issue or pull request already exists), enhancement (New feature or request)

Comments

@DylanZammit
Contributor

It is often useful to know the probability that a customer will, or will not, make a purchase within the next t time periods. In some organisations "churn" is defined along the lines of:

> A customer is churned if they do not make a purchase within the next t days/months

To my knowledge, there is currently no functionality that provides this probability, at least not in the BetaGeo model.

Section 5.3 of Hardie's notes gives an even more general expression, which allows us to get the probability of a customer making y purchases within the next t periods.

We derived the expression for the special case when y=0, giving the definition of "churn" described above. The attached PDF shows the algebra, along with a brief explanation of the techniques used to avoid numerical issues.

probability_no_deposits_derivation.pdf
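
For readers without the PDF, here is a minimal sketch of how such a probability can be computed. It is not taken from the attached derivation or from the linked pull request; it follows from the standard BG/NBD assumptions (Poisson purchasing with a gamma(r, alpha) rate, and dropout only immediately after a purchase with a beta(a, b) probability). The function names `prob_alive` and `prob_no_purchase` and the argument names are hypothetical. Under those assumptions a customer who is alive at T makes no purchase in (T, T+t] with probability ((alpha+T)/(alpha+T+t))^(r+x), and a customer who has already dropped out makes none with probability 1.

```python
import numpy as np


def prob_alive(x, t_x, T, r, alpha, a, b):
    """P(alive at T | x, t_x, T) under the BG/NBD assumptions (hypothetical helper).

    x: number of repeat purchases, t_x: time of the last purchase,
    T: length of the observation period.
    """
    x = np.asarray(x, dtype=float)
    t_x = np.asarray(t_x, dtype=float)
    T = np.asarray(T, dtype=float)
    # b + x - 1 is only used when x > 0; substitute 1 elsewhere to avoid log of a non-positive number.
    safe_denom = np.where(x > 0, b + x - 1.0, 1.0)
    # Log of the odds that the customer dropped out right after the last purchase.
    log_odds_dead = (
        np.log(a)
        - np.log(safe_denom)
        + (r + x) * (np.log(alpha + T) - np.log(alpha + t_x))
    )
    # 1 / (1 + exp(z)) computed in log space to avoid overflow.
    p = np.exp(-np.logaddexp(0.0, log_odds_dead))
    # Customers with no repeat purchases are alive with probability 1 in this model.
    return np.where(x > 0, p, 1.0)


def prob_no_purchase(t, x, t_x, T, r, alpha, a, b):
    """P(no purchase in (T, T+t] | x, t_x, T): a sketch, not the library's implementation."""
    x = np.asarray(x, dtype=float)
    T = np.asarray(T, dtype=float)
    # Log-probability that an alive customer makes no purchase in the next t periods.
    log_no_purchase_if_alive = (r + x) * (np.log(alpha + T) - np.log(alpha + T + t))
    # P(at least one purchase) = P(alive) * (1 - exp(log_no_purchase_if_alive)).
    p_purchase = prob_alive(x, t_x, T, r, alpha, a, b) * (-np.expm1(log_no_purchase_if_alive))
    return 1.0 - p_purchase
```

For a customer with no repeat purchases (x = 0) the expression reduces to ((alpha+T)/(alpha+T+t))^r, so the result still varies with t even though the probability of being alive is fixed at 1.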

wd60622 added the enhancement and CLV labels and removed the Needs Triage label on Oct 19, 2024
ColtAllen added the duplicate label on Oct 21, 2024
@ColtAllen
Collaborator

ColtAllen commented Oct 21, 2024

Would the generalized purchase probability expression mentioned in the first comment of #168 meet your needs here?

ColtAllen linked a pull request on Oct 22, 2024 that will close this issue
@DylanZammit
Contributor Author

I don't think so. From what I can understand, the expression you are referring to is not conditional on the purchase history of a specific customer, unlike the expression implemented in this PR. I understand that the ideal scenario is a generalised function that accepts both the number of purchases n and the timeframe t and returns the conditional probability.

If you think it is better to implement the generalised function instead of only the special case when n=0, then I can spend some time trying to derive an optimised expression for this general case. Shall we go this route before making any further changes, rather than restricting ourselves to the special case, or shall we leave generalisation as a possible future implementation?

@ColtAllen
Collaborator

Sorry for my delayed response; I've been traveling.

> From what I can understand, the expression you are referring to is not conditional on the purchase history of a specific customer, unlike the expression implemented in this PR.

This is correct. That particular generalized function is the population expectation, and could also be interpreted as the purchase probability for a new, unobserved customer. However, I'm not sure if this satisfies your specific use case.

> Shall we go this route before making any further changes, rather than restricting ourselves to the special case, or shall we leave generalisation as a possible future implementation?

After giving this some more thought, a major Achilles' heel of BetaGeoModel is that it assumes all non-repeat (i.e., y=0) customers have a probability-alive estimate of 1, and in many cases this is the most common type of customer in the dataset. Permitting purchase probabilities in this special case is a great way to overcome this shortcoming, and a conditional generalized expression is already available for ParetoNBDModel.
