
Problem with random switch #1013

Closed
DomagojGalic opened this issue Dec 7, 2019 · 10 comments

Comments

@DomagojGalic

I'm implementing a model from Cameron Davidson-Pilon's book Bayesian Methods for Hackers (link to the notebook).
The model I'm implementing is the one from the example on inferring behaviour from text-message data.

I can't seem to get the correct result.
Here is my code:

@model smscounter(y) = begin
    n = length(y)
    α = 1.0 / mean(y)
    λ1 ~ Exponential(α)
    λ2 ~ Exponential(α)
    τ ~ DiscreteUniform(1, n)
    
    for k in 1:τ
        y[k] ~ Poisson(λ1)
    end
    
    for k in (τ + 1):n
        y[k] ~ Poisson(λ2)
    end
end;

chain = sample(smscounter(data), MH(), 10_000);

Any comment on what I might be doing wrong would be much appreciated.
I'm using Julia 1.1.1 and Turing 0.7.1.

@cpfiffer
Member

cpfiffer commented Dec 7, 2019

Can you tell us what you are getting and what you are expecting to get?

@yebai
Member

yebai commented Dec 7, 2019

τ is a discrete parameter, and MH isn't expected to work with discrete parameters. Maybe try a Gibbs sampler, e.g. PG+MH or PG+HMC.
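For example, something along these lines (a sketch using the compositional-sampler syntax that appears later in this thread; PG updates the discrete τ, MH the continuous rates):

chain = sample(smscounter(data), Gibbs(PG(100, :τ), MH(:λ1, :λ2)), 10_000);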

@DomagojGalic
Author

Can you tell us what you are getting and what you are expecting to get?

I'm getting this:
Summary Statistics
Omitted printing of 1 columns

│ Row │ parameters │ mean    │ std         │ naive_se    │ mcse │ ess     │
│ 1   │ λ1         │ 3.81492 │ 4.88523e-15 │ 4.88523e-17 │ 0.0  │ 40.1606 │
│ 2   │ λ2         │ 4.89517 │ 7.10578e-15 │ 7.10578e-17 │ 0.0  │ 40.1606 │
│ 3   │ τ          │ 40.0    │ 0.0         │ 0.0         │ 0.0  │ NaN     │

Quantiles

│ Row │ parameters │ 2.5%    │ 25.0%   │ 50.0%   │ 75.0%   │ 97.5%   │
│ 1   │ λ1         │ 3.81492 │ 3.81492 │ 3.81492 │ 3.81492 │ 3.81492 │
│ 2   │ λ2         │ 4.89517 │ 4.89517 │ 4.89517 │ 4.89517 │ 4.89517 │
│ 3   │ τ          │ 40.0    │ 40.0    │ 40.0    │ 40.0    │ 40.0    │

This is what I get when implementing the same model in PyMC3:
post.pdf
You may have to squint a little to see the green bars that represent the distribution of τ.

As you can see, the Turing model returns τ as a deterministic value (its standard deviation is zero), and λ1 and λ2 are also essentially deterministic, with means of 3.8 and 4.9 respectively.
The PyMC3 model returns variables with far greater variability, and there λ1 and λ2 have means of about 17 and 23 respectively.

Here is the Python code, just in case:

import numpy as np
import pymc3 as pm

# count_data and n_count_data are defined earlier in the notebook
with pm.Model() as model:
    alpha = 1.0 / count_data.mean() 
    lambda_1 = pm.Exponential("lambda_1", alpha)
    lambda_2 = pm.Exponential("lambda_2", alpha)

    tau = pm.DiscreteUniform("tau", lower=0, upper=n_count_data - 1)

    idx = np.arange(n_count_data)
    lambda_ = pm.math.switch(tau >= idx, lambda_1, lambda_2)
    
with model:
    observation = pm.Poisson("obs", lambda_, observed=count_data)
    step = pm.Metropolis()
    trace = pm.sample(3000, tune=1000, step=step)

@DomagojGalic
Author

τ is a discrete parameter, and MH isn't expected to work with discrete parameters. Maybe try a Gibbs sampler, e.g. PG+MH or PG+HMC.

I tried

chain = sample(smscounter(y), Gibbs(PG(100, :τ), MH(:λ1, :λ2)), 10_000);

and also

chain = sample(smscounter(y), Gibbs(PG(100, :τ), HMC(0.05, 10, :λ1, :λ2)), 10_000);

In both cases I got
Stacktrace in the failed task:

KeyError

and then a mile-long stacktrace.

@cpfiffer
Member

cpfiffer commented Dec 8, 2019

This would be much easier to debug if we had a fully reproducible example (something we can copy-paste) that runs immediately. Can you provide a MWE (minimum working example) that we can work from? We don't have your dataset or any of your setup code prior to the model declaration.

@DomagojGalic
Author

This would be much easier to debug if we had a fully reproducible example (something we can copy-paste) that runs immediately. Can you provide a MWE (minimum working example) that we can work from? We don't have your dataset or any of your setup code prior to the model declaration.

Sure (the dataset is provided with the notebook I linked, but I'll just copy it here, since that's easier than downloading and loading it):

data = TArray(Int64, 74);
y = [13 24 8 24 7 35 14 11 15 11 22 22 11 57 11 19 29 6 19 12 22 12 18 72 32 9 7 13 19 23 27 20 6 17 13 10 14 6 16 15 7 2 15 15 19 70 49 7 53 22 21 31 19 11 18 20 12 35 17 23 17 4 2 31 30 13 27 0 39 37 5 14 13 22];

for i in 1:74
    data[i] = y[i]
end;

@model smscounter(y) = begin
    n = length(y)
    α = 1.0 / mean(y)
    λ1 ~ Exponential(α)
    λ2 ~ Exponential(α)
    τ ~ DiscreteUniform(1, n)
    
    for k in 1:τ
        y[k] ~ Poisson(λ1)
    end
    
    for k in (τ + 1):n
        y[k] ~ Poisson(λ2)
    end
end;

chain1 = sample(smscounter(data), MH(), 10_000);
# chain2 = sample(smscounter(data), Gibbs(PG(100, :τ), MH(:λ1, :λ2)), 10_000);
# chain3 = sample(smscounter(data), Gibbs(PG(100, :τ), HMC(0.05, 10, :λ1, :λ2)), 10_000);

@cpfiffer
Member

cpfiffer commented Dec 9, 2019

The problem here appears to be the use of TArray to wrap your data. TArray is really only for arrays of parameters, not for data. If you delete the TArray stuff and just use the plain vector y with one of the Gibbs samplers, you should be fine.

Modified code with a shortened sample size for illustrative purposes:

using Turing

y = [13 24 8 24 7 35 14 11 15 11 22 22 11 57 11 19 29 6 19 12 22 12 18 72 32 9 7 13 19 23 27 20 6 17 13 10 14 6 16 15 7 2 15 15 19 70 49 7 53 22 21 31 19 11 18 20 12 35 17 23 17 4 2 31 30 13 27 0 39 37 5 14 13 22];

@model smscounter(y) = begin
    n = length(y)
    α = 1.0 / mean(y)
    λ1 ~ Exponential(α)
    λ2 ~ Exponential(α)
    τ ~ DiscreteUniform(1, n)
    
    for k in 1:τ
        y[k] ~ Poisson(λ1)
    end
    
    for k in (τ + 1):n
        y[k] ~ Poisson(λ2)
    end
end;

chain = sample(smscounter(y), Gibbs(PG(100, :τ), MH(:λ1, :λ2)), 1_000);

Output:

Object of type Chains, with data of type 1000×4×1 Array{Union{Missing, Real},3}

Iterations        = 1:1000
Thinning interval = 1
Chains            = 1
Samples per chain = 1000
internals         = lp
parameters        = λ1, λ2, τ

2-element Array{ChainDataFrame,1}

Summary Statistics
Omitted printing of 1 columns
│ Row │ parameters │ mean      │ std       │ naive_se    │ mcse       │ ess     │
│     │ Symbol     │ Float64   │ Float64   │ Float64     │ Float64    │ Any     │
├─────┼────────────┼───────────┼───────────┼─────────────┼────────────┼─────────┤
│ 1   │ λ1         │ 0.0177012 │ 0.0084283 │ 0.000266526 │ 0.00199435 │ 6.21103 │
│ 2   │ λ2         │ 0.324128  │ 0.0586637 │ 0.00185511  │ 0.0185999  │ 6.14019 │
│ 3   │ τ          │ 1.034     │ 0.759883  │ 0.0240296   │ 0.034      │ 6.49671 │

Quantiles

│ Row │ parameters │ 2.5%      │ 25.0%     │ 50.0%     │ 75.0%     │ 97.5%     │
│     │ Symbol     │ Float64   │ Float64   │ Float64   │ Float64   │ Float64   │
├─────┼────────────┼───────────┼───────────┼───────────┼───────────┼───────────┤
│ 1   │ λ1         │ 0.0157069 │ 0.0157069 │ 0.0157069 │ 0.0157069 │ 0.0593485 │
│ 2   │ λ2         │ 0.122249  │ 0.342728  │ 0.342728  │ 0.342728  │ 0.342728  │
│ 3   │ τ          │ 1.0       │ 1.0       │ 1.0       │ 1.0       │ 1.0       │

@DomagojGalic
Author

There's still the problem of wrong results; I tried your solution and got

│ Row │ parameters │ mean    │ std         │ naive_se    │ mcse   │ ess     │ r_hat  │
│ 1   │ λ1         │ 5.49626 │ 3.55289e-15 │ 3.55289e-17 │ 0.0    │ 40.1606 │ 0.9999 │
│ 2   │ λ2         │ 2.97791 │ 3.997e-15   │ 3.997e-17   │ 0.0    │ 40.1606 │ 0.9999 │
│ 3   │ τ          │ 73.9998 │ 0.0141414   │ 0.000141414 │ 0.0002 │ 41.2179 │ 1.0001 │

Quantiles

│ Row │ parameters │ 2.5%    │ 25.0%   │ 50.0%   │ 75.0%   │ 97.5%   │
│ 1   │ λ1         │ 5.49626 │ 5.49626 │ 5.49626 │ 5.49626 │ 5.49626 │
│ 2   │ λ2         │ 2.97791 │ 2.97791 │ 2.97791 │ 2.97791 │ 2.97791 │
│ 3   │ τ          │ 74.0    │ 74.0    │ 74.0    │ 74.0    │ 74.0    │

for

sample(smscounter(y), Gibbs(PG(100, :τ), MH(:λ1, :λ2)), 10_000)

and

Summary Statistics

│ Row │ parameters │ mean    │ std       │ naive_se    │ mcse      │ ess     │ r_hat   │
│     │ Symbol     │ Float64 │ Float64   │ Float64     │ Float64   │ Any     │ Any     │
│ 1   │ λ1         │ 14.7493 │ 2.76869   │ 0.0276869   │ 0.271849  │ 40.1606 │ 1.09066 │
│ 2   │ λ2         │ 0.16302 │ 0.384425  │ 0.00384425  │ 0.0382327 │ 40.1606 │ 1.08598 │
│ 3   │ τ          │ 73.9997 │ 0.0173188 │ 0.000173188 │ 0.0003    │ 41.2179 │ 1.0002  │

Quantiles

│ Row │ parameters │ 2.5%       │ 25.0%     │ 50.0%     │ 75.0%     │ 97.5%   │
│ 1   │ λ1         │ 5.61827    │ 15.1975   │ 15.5239   │ 15.818    │ 16.3676 │
│ 2   │ λ2         │ 0.00163632 │ 0.0159676 │ 0.0395861 │ 0.0838861 │ 1.45111 │
│ 3   │ τ          │ 74.0       │ 74.0      │ 74.0      │ 74.0      │ 74.0    │

for

sample(smscounter(y), Gibbs(PG(100, :τ), HMC(0.05, 10, :λ1, :λ2)), 10_000);

This is quite far from what I'm supposed to get.

@cpfiffer
Member

I think this is because of a difference in how PyMC3 and Distributions.jl parameterize the exponential distribution: PyMC3's Exponential takes a rate, while Distributions.jl's takes a scale (the mean). If you replace alpha with just the mean (and not the inverse mean), you should get something very similar to the output in the book. You can now also use MH if you'd like, which is quick-and-dirty.
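A quick check of the parameterization (a minimal sketch; in Distributions.jl the Exponential argument is the scale, i.e. the mean):

using Distributions

mean(Exponential(2.0))  # 2.0: the argument is the scale (mean), not the rate

The corrected model and samplers: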

using Turing

y = [13 24 8 24 7 35 14 11 15 11 22 22 11 57 11 19 29 6 19 12 22 12 18 72 32 9 7 13 19 23 27 20 6 17 13 10 14 6 16 15 7 2 15 15 19 70 49 7 53 22 21 31 19 11 18 20 12 35 17 23 17 4 2 31 30 13 27 0 39 37 5 14 13 22];

@model smscounter(y) = begin
    n = length(y)
    α = mean(y)
    λ1 ~ Exponential(α)
    λ2 ~ Exponential(α)
    τ ~ DiscreteUniform(1, n)
    
    for k in 1:τ
        y[k] ~ Poisson(λ1)
    end
    
    for k in (τ + 1):n
        y[k] ~ Poisson(λ2)
    end
end;

chain1 = sample(smscounter(y), MH(), 100_000);
chain2 = sample(smscounter(y), Gibbs(PG(100, :τ), HMC(0.05, 10, :λ1, :λ2)), 1_000);

Here's the output using MH:

Object of type Chains, with data of type 100000×4×1 Array{Union{Missing, Real},3}

Iterations        = 1:100000
Thinning interval = 1
Chains            = 1
Samples per chain = 100000
internals         = lp
parameters        = λ1, λ2, τ

2-element Array{ChainDataFrame,1}

Summary Statistics

│ Row │ parameters │ mean    │ std      │ naive_se   │ mcse      │ ess     │ r_hat   │
│     │ Symbol     │ Float64 │ Float64  │ Float64    │ Float64   │ Any     │ Any     │
├─────┼────────────┼─────────┼──────────┼────────────┼───────────┼─────────┼─────────┤
│ 1   │ λ1         │ 18.0277 │ 0.761006 │ 0.00240651 │ 0.023779  │ 401.606 │ 1.13277 │
│ 2   │ λ2         │ 22.6626 │ 2.31952  │ 0.00733498 │ 0.0729886 │ 401.606 │ 1.01702 │
│ 3   │ τ          │ 45.3502 │ 6.71208  │ 0.0212255  │ 0.205265  │ 401.606 │ 1.0116  │

Quantiles

│ Row │ parameters │ 2.5%    │ 25.0%   │ 50.0%   │ 75.0%   │ 97.5%   │
│     │ Symbol     │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │
├─────┼────────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ 1   │ λ1         │ 16.9919 │ 17.1646 │ 17.9925 │ 18.4843 │ 19.9866 │
│ 2   │ λ2         │ 15.2514 │ 22.1434 │ 22.6499 │ 23.7294 │ 26.2475 │
│ 3   │ τ          │ 43.0    │ 45.0    │ 45.0    │ 45.0    │ 70.0    │

Using PG and HMC together:

Object of type Chains, with data of type 1000×4×1 Array{Union{Missing, Real},3}

Iterations        = 1:1000
Thinning interval = 1
Chains            = 1
Samples per chain = 1000
internals         = lp
parameters        = λ1, λ2, τ

2-element Array{ChainDataFrame,1}

Summary Statistics

│ Row │ parameters │ mean    │ std      │ naive_se  │ mcse      │ ess     │ r_hat    │
│     │ Symbol     │ Float64 │ Float64  │ Float64   │ Float64   │ Any     │ Any      │
├─────┼────────────┼─────────┼──────────┼───────────┼───────────┼─────────┼──────────┤
│ 1   │ λ1         │ 17.761  │ 0.812477 │ 0.0256928 │ 0.0159142 │ 1057.14 │ 0.999353 │
│ 2   │ λ2         │ 22.6707 │ 1.16725  │ 0.0369118 │ 0.130854  │ 12.4083 │ 1.02775  │
│ 3   │ τ          │ 43.514  │ 5.0817   │ 0.160697  │ 0.954879  │ 6.13911 │ 1.04248  │

Quantiles

│ Row │ parameters │ 2.5%    │ 25.0%   │ 50.0%   │ 75.0%   │ 97.5%   │
│     │ Symbol     │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │
├─────┼────────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ 1   │ λ1         │ 16.2957 │ 17.2178 │ 17.7442 │ 18.2866 │ 19.3092 │
│ 2   │ λ2         │ 20.6154 │ 22.0664 │ 22.7017 │ 23.3752 │ 24.5441 │
│ 3   │ τ          │ 37.95   │ 44.0    │ 44.0    │ 45.0    │ 45.0    │

@DomagojGalic
Author

Thank you, it never crossed my mind that this could be the problem.
