This repository has been archived by the owner on Jan 30, 2023. It is now read-only.

How to Specify Negative Binomial Distribution as Duration? #27

Open
yanpanlau opened this issue Nov 7, 2017 · 4 comments

@yanpanlau

Thanks for the nice work.

In the R version of HSMM, we can specify a negative binomial duration distribution like this:

hsmm(sim$obs, od = "norm", rd = "nbinom")

But according to the API, it seems to support only NumPy arrays as input. Is there an easy way to simply import the distribution from scipy.stats?

Thanks

@StellaAthena

It seems to me that the code is structured in a way that fundamentally assumes that durations are discrete. To change this, one would have to change the way durations work to mirror the way emissions work.

@jvkersch
Owner

jvkersch commented Nov 8, 2017

@yanpanlau There's currently no way to supply a scipy distribution directly as input, though I agree there should be one. You can work around this by evaluating the distribution's PMF on a grid of durations and using the resulting array (this is what the R version of the code does internally):

>>> from scipy.stats import nbinom
>>> import numpy as np
>>>
>>> x = np.arange(100)  # or some (large) cutoff
>>> dist = np.vstack([
...     nbinom.pmf(x, 10, 0.5),
...     nbinom.pmf(x, 20, 0.5),
...     nbinom.pmf(x, 40, 0.5)
... ])
>>> dist.shape
(3, 100)
>>> dist /= dist.sum(axis=1, keepdims=True)
>>> dist
array([[  9.76562500e-04,   4.88281250e-03,   1.34277344e-02,
          2.68554687e-02,   4.36401367e-02,   6.10961914e-02,
          7.63702393e-02,   8.72802734e-02,   9.27352905e-02,
(...)
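
The same workaround can be wrapped in a small helper. This is only a sketch and not part of the library: `truncated_pmf_matrix` and its `cutoff` argument are hypothetical names, and it assumes durations are tabulated on 0..cutoff-1 as in the session above. It also returns the probability mass that the cutoff discards, which may help when choosing the cutoff.

```python
import numpy as np
from scipy.stats import nbinom


def truncated_pmf_matrix(params, cutoff=100):
    """Tabulate one negative binomial PMF per state on 0..cutoff-1.

    `params` is a sequence of (n, p) pairs, one per hidden state.
    Returns (pmf, tail): pmf has shape (n_states, cutoff) with rows
    renormalized to sum to 1; tail[i] is the mass cut off for state i.
    """
    x = np.arange(cutoff)
    pmf = np.vstack([nbinom.pmf(x, n, p) for n, p in params])
    # sf(k) = P(X > k), so this is the mass at or beyond the cutoff.
    tail = np.array([nbinom.sf(cutoff - 1, n, p) for n, p in params])
    # Renormalize each row so the truncated PMF is a proper distribution.
    pmf /= pmf.sum(axis=1, keepdims=True)
    return pmf, tail


durations, tail = truncated_pmf_matrix([(10, 0.5), (20, 0.5), (40, 0.5)])
print(durations.shape)
print(tail)
```

If an entry of `tail` is not negligible, the cutoff is probably too small for that state's parameters.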

@StellaAthena Not sure I understand what the link is with discrete versus continuous distributions. Can you elaborate?

@jvkersch
Owner

jvkersch commented Nov 8, 2017

Adding direct support for distributions should be fairly straightforward.
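
One way such support might look, purely as an illustration: the function `as_duration_matrix` below is a hypothetical name, not something the library provides. It assumes the input is either an array of per-state duration probabilities or a list of frozen scipy.stats discrete distributions, and in both cases it returns a row-normalized array.

```python
import numpy as np
from scipy import stats


def as_duration_matrix(durations, cutoff=100):
    """Return an (n_states, n_durations) array of duration probabilities.

    Hypothetical adapter: accepts a 2-D array-like, or a sequence of
    frozen scipy.stats discrete distributions (anything with `.pmf`).
    """
    if all(hasattr(d, "pmf") for d in durations):
        # Tabulate each distribution on 0..cutoff-1.
        x = np.arange(cutoff)
        matrix = np.vstack([d.pmf(x) for d in durations])
    else:
        matrix = np.atleast_2d(np.asarray(durations, dtype=float))
    # Normalize rows; this also rescales user-supplied arrays.
    return matrix / matrix.sum(axis=1, keepdims=True)


mixed = as_duration_matrix([stats.nbinom(10, 0.5), stats.poisson(5)])
from_array = as_duration_matrix(np.array([[1.0, 1.0], [1.0, 3.0]]))
print(mixed.shape, from_array)
```

Dispatching on the presence of a `pmf` method keeps array inputs working unchanged apart from the normalization step, and lets different states use different distribution families.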

@StellaAthena

@jvkersch I had a brain fart and thought that @yanpanlau was asking about using a continuous distribution for the durations.
