ENH: Add the negative binomial distribution to rand_distr. #1296

WarrenWeckesser · 2023-03-05T07:33:50Z

No description provided.

dhardy

The negative binomial makes no sense for r=0. But we can generalise to p=0, since our output is floating-point which has a representation for infinity?

The code style looks good.

As for the implementation, all I can say is that it correlates with the mentioned reference, 10.1007/978-1-4613-8643-8. I tried comparing with 10.1002/0471715816 (Johnson 2005, Univariate Discrete Distributions), but my understanding of statistics is lacking. Perhaps @saona-raimundo would be willing to take a look?

WarrenWeckesser · 2023-04-24T16:22:41Z

> But we can generalise to p=0, since our output is floating-point which has a representation for infinity?

If p=0, the probability mass function of the distribution would be $p_k = 0$ for all $k$. This is not a valid probability distribution. I don't think returning inf in this case is a meaningful generalization.
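
For reference, a sketch of the argument above. It assumes the parameterization in which $p$ is the success probability and the variate counts failures before the $r$-th success; the thread does not restate the PMF, so this form is an assumption:

$$
p_k = \frac{\Gamma(k + r)}{k!\,\Gamma(r)}\, p^r (1 - p)^k, \qquad k = 0, 1, 2, \dots
$$

With $r > 0$ and $p = 0$, the factor $p^r$ vanishes, so $p_k = 0$ for every $k$ and $\sum_k p_k = 0 \neq 1$.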

saona-raimundo · 2023-04-24T21:23:27Z

Hi!
I am okay with p=0 in this case, especially considering the interpretation of the negative binomial:

> When `r` is an integer, the negative binomial distribution can be interpreted as the distribution of the number of failures in a sequence of Bernoulli trials that continue until `r` successes occur.

Note that Wikipedia accepts p = 0, even if the probability mass function does not make sense in that case.

Regarding the implementation, I confirm it corresponds to the citation.
Testing the shape of the density is hard (see #357 for a discussion), but have you checked that the implementation behaves as expected?

@dhardy, I will check your reference 10.1002/0471715816 to see if there are improved algorithms proposed.

saona-raimundo · 2023-04-25T07:25:50Z

rand_distr/src/negative_binomial.rs

+                // and saved in the NegativeBinomial instance, because it
+                // depends on just the parameters `r` and `p`.  We have to
+                // create a new Poisson instance for each variate generated.
+                Poisson::<F>::new(gamma.sample(rng)).unwrap().sample(rng)


Instead of unwrap, one should take care of the case where gamma.sample(rng) returns a float which should not be accepted.
I suggest introducing a loop which samples until one gets a finite sample.
The Float trait has the method is_finite for this.

Can this happen? The Gamma distribution should return strictly positive values, so Poisson::new should never fail.

Sorry, my point was about handling infinity as the result of simulating the gamma variable.

Nothing ensures that Gamma always samples a positive and finite float:
neither the signature of the sample method nor its documentation guarantees it.

At some point there was a discussion about who should handle infinity coming out of the simulation: the library or the user. I thought the decision was that the user should handle infinite floats, but maybe I am wrong. This is why I suggest handling a possible infinite value here with a loop.

I agree, I need to fix this. If p is extremely small (e.g. 1e-40), then the scale passed to Gamma is huge (1e+40), and with such a scale, Gamma will generate samples that are infinity.

A simple loop would not be safe if we don't have a bound on how frequently infinity is generated.
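
To illustrate the overflow concern, here is a minimal sketch. It assumes the Gamma scale is computed as `(1 - p) / p`, which is the usual choice for the gamma-Poisson mixture and is consistent with the numbers quoted above, but is not confirmed by the diff shown in this thread. The function names are hypothetical.

```rust
/// Hypothetical helper: Gamma scale for the gamma-Poisson mixture, in f32.
/// The expression (1 - p) / p is an assumption about the PR's code.
fn gamma_scale_f32(p: f32) -> f32 {
    (1.0 - p) / p
}

/// The same expression evaluated in f64.
fn gamma_scale_f64(p: f64) -> f64 {
    (1.0 - p) / p
}
```

With `p = 1e-40`, the `f32` scale already overflows to infinity (the largest finite `f32` is about 3.4e38), while the `f64` scale is still finite at about 1e40. In `f64` the same overflow appears once `p` is small enough, for example `1e-310`.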

Yeah, I was not even thinking about extreme values, just the unwrap.
The thing is, if gamma samples infinity, then the Poisson "should" also be infinity.
So, instead of a loop, one should introduce an if that checks whether the gamma sample is infinite.
If it is, return infinity directly; if it is not, sample the Poisson (created with new and unwrap).
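
The check suggested above could look roughly like the following sketch. It is not the code from the PR: `guarded_poisson` is a hypothetical name, and the closure `sample_poisson` stands in for `Poisson::new(lambda).unwrap().sample(rng)` so that the guard can be shown without depending on `rand_distr`.

```rust
/// Hypothetical helper isolating the infinity guard.
/// `lambda` plays the role of the value drawn from the Gamma distribution;
/// `sample_poisson` plays the role of constructing and sampling a Poisson
/// distribution with mean `lambda`.
fn guarded_poisson<F>(lambda: f64, sample_poisson: F) -> f64
where
    F: FnOnce(f64) -> f64,
{
    if lambda.is_infinite() {
        // Treat a Poisson with infinite mean as producing infinity,
        // and skip constructing the Poisson distribution entirely.
        return f64::INFINITY;
    }
    sample_poisson(lambda)
}
```

Because the infinite case returns early, the stand-in sampler is never called with an infinite mean, so no retry loop (and no bound on how often infinity occurs) is needed.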

saona-raimundo · 2023-04-25T07:49:19Z

The reference 10.1002/0471715816 "Univariate Discrete Distributions" is really nice!
From pages 221-222, this is the description for the simulation of the negative binomial random variable.

The negative binomial with an integer parameter k = N can be generated as
the sum of N geometric rv’s. Except for low values of N (say N = 2, 3, 4), this
method cannot be advocated as it requires many uniforms for a single output
negative binomial rv. This argument applies a fortiori to the use of the sum of a
Poisson number of logarithmic rv’s.

The method generally recommended for generating negative binomial rv’s
with changing parameters is to generate Poisson rv’s with random parameters
drawn from a gamma distribution [see, e.g., algorithm NB3 in Fishman (1978)].
For fixed parameters the use of a fast general method, such as indexed table
look-up, alias, or frequency table, is recommended.
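
As an illustration of why the first method in the quoted passage is costly, here is a minimal sketch of the sum-of-geometrics approach. It is not part of the PR; the function name is hypothetical, and the uniforms are passed in as an iterator (assumed to lie strictly in (0, 1)) so that the example does not depend on a random number generator.

```rust
/// Hypothetical sketch: number of failures before `n` successes, built as
/// the sum of `n` geometric variates. Each geometric variate is obtained by
/// inversion from one uniform in (0, 1), so `n` uniforms are consumed per
/// output. Returns `None` if the iterator runs out of uniforms.
fn neg_binomial_via_geometrics<I>(n: u32, p: f64, mut uniforms: I) -> Option<u64>
where
    I: Iterator<Item = f64>,
{
    // ln(1 - p): negative for 0 < p < 1, and -inf for p = 1.
    let log_q = (1.0 - p).ln();
    let mut failures = 0u64;
    for _ in 0..n {
        let u = uniforms.next()?;
        // Inversion: P(K >= k) = (1 - p)^k, so K = floor(ln u / ln(1 - p)).
        failures += (u.ln() / log_q).floor() as u64;
    }
    Some(failures)
}
```

For example, `neg_binomial_via_geometrics(3, 0.5, vec![0.3, 0.9, 0.1].into_iter())` consumes three uniforms to produce one variate, which is the cost the quoted text objects to for larger `N`.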

The reference, Fishman, G. S. (1978), Principles of Discrete Event Simulation, New York: Wiley, is not that easy to find online, but we can assume that the authors refer to the same method as the one implemented in the PR.

If I am not mistaken, rand_distr does not generally implement distributions by table look-ups, so I think the PR is the way to go.

vks · 2023-04-25T11:12:20Z

For the normal distribution, we are using tables (Ziggurat algorithm).

However, I agree that the approach here is fine.

dhardy · 2023-04-25T15:17:47Z

Great, and thanks for the review. Then are we agreed to merge this (once the above is corrected)? I didn't review in detail.

rand_distr/src/negative_binomial.rs

vks

Looks good, thanks! We just need to update the changelog.

dhardy · 2023-05-03T18:13:06Z

Looks ready to merge @WarrenWeckesser?

WarrenWeckesser · 2023-05-03T18:19:15Z

@dhardy, I'm still looking into the issue that @saona-raimundo raised here. I'm checking how extreme values of the parameters might break things.

dhardy · 2024-02-08T10:10:02Z

@WarrenWeckesser are you still working on this?

WarrenWeckesser · 2024-03-08T20:21:48Z

I've been away from this (and most of my other open source work) for much of last year, but I haven't forgotten about it. I have a project that I need to finish up before I can get back to this. That might take a few weeks.

dhardy · 2024-11-12T14:33:03Z

Closing as stale. Feel free to re-open.

ENH: Add the negative binomial distribution.

4ece871

WarrenWeckesser mentioned this pull request Mar 5, 2023

Negative binomial distribution? #1295

Closed

WarrenWeckesser added 2 commits March 5, 2023 11:36

Add value stability tests for NegativeBinomial.

93632f2

Add a comment about the generation of negative binomial variates.

c459991

dhardy reviewed Apr 17, 2023

saona-raimundo reviewed Apr 25, 2023

vks reviewed May 1, 2023

vks approved these changes May 1, 2023

WarrenWeckesser and others added 3 commits May 3, 2023 12:00

Simplify expression that checks for invalid p.

43f3d66

Co-authored-by: Vinzent Steinberg <[email protected]>

Add a comment that a validation expression also catches nan.

cedbd83

rand_distr: Update CHANGELOG.md: new NegativeBinomial distribution.

7643943

WarrenWeckesser mentioned this pull request May 3, 2023

Poisson sample() hangs when lambda is close to max of the float type. #1312

Closed

dhardy approved these changes May 3, 2023

dhardy added the D-work-in-progress Do: draft or unfinished PR label Jul 10, 2024

dhardy added the X-stale Outdated or abandoned work label Nov 12, 2024

dhardy closed this Nov 12, 2024
