Problem with random switch #1013
I'm implementing a model from Cameron Davidson-Pilon's book Bayesian Methods for Hackers (link to the notebook). The model I'm implementing is the example on inferring behaviour from text-message data. I can't seem to get the correct result (my original code is reproduced in the MWE below). Any comment on what I might be doing wrong would be much appreciated. I'm using Julia 1.1.1 and Turing 0.7.1.

Comments
Can you tell us what you are getting and what you are expecting to get?
τ is a discrete parameter, and MH isn't expected to work with discrete parameters. Maybe try a Gibbs sampler, e.g. PG+MH or PG+HMC.
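For reference, a minimal sketch of what that composition looks like in Turing (here, model stands for an instantiated Turing model with a discrete τ and continuous λ1 and λ2, like the one discussed below):

# PG (particle Gibbs) updates the discrete switchpoint τ, while MH or
# HMC updates the continuous rates λ1 and λ2 within each Gibbs sweep.
chain = sample(model, Gibbs(PG(100, :τ), MH(:λ1, :λ2)), 10_000)
# or
chain = sample(model, Gibbs(PG(100, :τ), HMC(0.05, 10, :λ1, :λ2)), 10_000)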
I'm getting this:

[Turing summary and quantile tables; the numeric values were lost in extraction]

This is what I get when implementing the same model in PyMC3:

[PyMC3 results; lost in extraction]

As you can see, the Turing model returns τ as a deterministic value (standard deviation is zero), and λ1 and λ2 are also essentially deterministic, with means 3.8 and 4.9 respectively. Here is the Python code just in case:

import numpy as np
import pymc3 as pm

# count_data is the daily SMS count vector from the book;
# n_count_data = len(count_data).
with pm.Model() as model:
    alpha = 1.0 / count_data.mean()
    lambda_1 = pm.Exponential("lambda_1", alpha)  # rate: prior mean = 1/alpha
    lambda_2 = pm.Exponential("lambda_2", alpha)
    tau = pm.DiscreteUniform("tau", lower=0, upper=n_count_data - 1)
    idx = np.arange(n_count_data)
    # lambda_1 up to the switchpoint tau, lambda_2 after it
    lambda_ = pm.math.switch(tau >= idx, lambda_1, lambda_2)

with model:
    observation = pm.Poisson("obs", lambda_, observed=count_data)
    step = pm.Metropolis()
    trace = pm.sample(3000, tune=1000, step=step)
I tried

chain = sample(smscounter(y), Gibbs(PG(100, :τ), MH(:λ1, :λ2)), 10_000);

and also

chain = sample(smscounter(y), Gibbs(PG(100, :τ), HMC(0.05, 10, :λ1, :λ2)), 10_000);

In both cases I got a KeyError and then a mile-long stacktrace.
This would be much easier to debug if we had a fully reproducible example (something we can copy-paste) that runs immediately. Can you provide an MWE (minimum working example) that we can work from? We don't have your dataset or any of your setup code prior to the model declaration.
Sure (the dataset is actually provided with the notebook I linked, but I'm going to just copy it here, since that's easier than downloading and loading it):

using Turing

# Daily SMS counts for 74 days, from the book's dataset.
y = [13 24 8 24 7 35 14 11 15 11 22 22 11 57 11 19 29 6 19 12 22 12 18 72 32 9 7 13 19 23 27 20 6 17 13 10 14 6 16 15 7 2 15 15 19 70 49 7 53 22 21 31 19 11 18 20 12 35 17 23 17 4 2 31 30 13 27 0 39 37 5 14 13 22];

# Copy the data into a TArray.
data = TArray(Int64, 74);
for i in 1:74
    data[i] = y[i]
end;

@model smscounter(y) = begin
    n = length(y)
    α = 1.0 / mean(y)
    λ1 ~ Exponential(α)
    λ2 ~ Exponential(α)
    τ ~ DiscreteUniform(1, n)    # switchpoint
    for k in 1:τ
        y[k] ~ Poisson(λ1)       # rate before the switchpoint
    end
    for k in (τ + 1):n
        y[k] ~ Poisson(λ2)       # rate after the switchpoint
    end
end;

chain1 = sample(smscounter(data), MH(), 10_000);
# These two both fail with a KeyError:
# chain2 = sample(smscounter(data), Gibbs(PG(100, :τ), MH(:λ1, :λ2)), 10_000);
# chain3 = sample(smscounter(data), Gibbs(PG(100, :τ), HMC(0.05, 10, :λ1, :λ2)), 10_000);
The problem here appears to be the use of TArray. Modified code, with a shortened sample size for illustrative purposes:

using Turing
y = [13 24 8 24 7 35 14 11 15 11 22 22 11 57 11 19 29 6 19 12 22 12 18 72 32 9 7 13 19 23 27 20 6 17 13 10 14 6 16 15 7 2 15 15 19 70 49 7 53 22 21 31 19 11 18 20 12 35 17 23 17 4 2 31 30 13 27 0 39 37 5 14 13 22];
@model smscounter(y) = begin
n = length(y)
α = 1.0 / mean(y)
λ1 ~ Exponential(α)
λ2 ~ Exponential(α)
τ ~ DiscreteUniform(1, n)
for k in 1:τ
y[k] ~ Poisson(λ1)
end
for k in (τ + 1):n
y[k] ~ Poisson(λ2)
end
end;
chain = sample(smscounter(y), Gibbs(PG(100, :τ), MH(:λ1, :λ2)), 1_000);

Output:

Object of type Chains, with data of type 1000×4×1 Array{Union{Missing, Real},3}
Iterations = 1:1000
Thinning interval = 1
Chains = 1
Samples per chain = 1000
internals = lp
parameters = λ1, λ2, τ
2-element Array{ChainDataFrame,1}
Summary Statistics
Omitted printing of 1 columns
│ Row │ parameters │ mean │ std │ naive_se │ mcse │ ess │
│ │ Symbol │ Float64 │ Float64 │ Float64 │ Float64 │ Any │
├─────┼────────────┼───────────┼───────────┼─────────────┼────────────┼─────────┤
│ 1 │ λ1 │ 0.0177012 │ 0.0084283 │ 0.000266526 │ 0.00199435 │ 6.21103 │
│ 2 │ λ2 │ 0.324128 │ 0.0586637 │ 0.00185511 │ 0.0185999 │ 6.14019 │
│ 3 │ τ │ 1.034 │ 0.759883 │ 0.0240296 │ 0.034 │ 6.49671 │
Quantiles
│ Row │ parameters │ 2.5% │ 25.0% │ 50.0% │ 75.0% │ 97.5% │
│ │ Symbol │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │
├─────┼────────────┼───────────┼───────────┼───────────┼───────────┼───────────┤
│ 1 │ λ1 │ 0.0157069 │ 0.0157069 │ 0.0157069 │ 0.0157069 │ 0.0593485 │
│ 2 │ λ2 │ 0.122249 │ 0.342728 │ 0.342728 │ 0.342728 │ 0.342728 │
│ 3   │ τ          │ 1.0       │ 1.0       │ 1.0       │ 1.0       │ 1.0       │
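A quick check of the prior scale helps explain these degenerate values (a minimal sketch, using only the y vector defined above):

using Statistics

mean(y)        # ≈ 19.74 messages per day for this dataset
1.0 / mean(y)  # ≈ 0.0507

# Distributions.jl's Exponential(θ) is scale-parameterized with mean θ,
# so α = 1/mean(y) gives λ1 and λ2 priors with mean ≈ 0.05, far below
# the data mean of ≈ 19.7, which pins the sampled rates near zero.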
There's still the problem of wrong results. I tried your solution and get

[summary and quantile tables; numeric values lost in extraction]

for sample(smscounter(y), Gibbs(PG(100, :τ), MH(:λ1, :λ2)), 10_000), and

[summary and quantile tables; numeric values lost in extraction]

for sample(smscounter(y), Gibbs(PG(100, :τ), HMC(0.05, 10, :λ1, :λ2)), 10_000), which is quite far from what I'm supposed to get.
I think this is because of differences in how PyMC3 and Distributions.jl parameterize the exponential distribution -- PyMC3's Exponential takes a rate (prior mean = 1/alpha), while Distributions.jl's takes a scale (prior mean = alpha) -- so if you replace alpha with just the mean (and not the inverse mean), you should get something very similar to the output in the book. With that change you can now also use plain MH() (chain1 below):

using Turing
y = [13 24 8 24 7 35 14 11 15 11 22 22 11 57 11 19 29 6 19 12 22 12 18 72 32 9 7 13 19 23 27 20 6 17 13 10 14 6 16 15 7 2 15 15 19 70 49 7 53 22 21 31 19 11 18 20 12 35 17 23 17 4 2 31 30 13 27 0 39 37 5 14 13 22];
@model smscounter(y) = begin
n = length(y)
α = mean(y)
λ1 ~ Exponential(α)
λ2 ~ Exponential(α)
τ ~ DiscreteUniform(1, n)
for k in 1:τ
y[k] ~ Poisson(λ1)
end
for k in (τ + 1):n
y[k] ~ Poisson(λ2)
end
end;
chain1 = sample(smscounter(y), MH(), 100_000);
chain2 = sample(smscounter(y), Gibbs(PG(100, :τ), HMC(0.05, 10, :λ1, :λ2)), 1_000);

Here's the output using MH():

Object of type Chains, with data of type 100000×4×1 Array{Union{Missing, Real},3}
Iterations = 1:100000
Thinning interval = 1
Chains = 1
Samples per chain = 100000
internals = lp
parameters = λ1, λ2, τ
2-element Array{ChainDataFrame,1}
Summary Statistics
│ Row │ parameters │ mean │ std │ naive_se │ mcse │ ess │ r_hat │
│ │ Symbol │ Float64 │ Float64 │ Float64 │ Float64 │ Any │ Any │
├─────┼────────────┼─────────┼──────────┼────────────┼───────────┼─────────┼─────────┤
│ 1 │ λ1 │ 18.0277 │ 0.761006 │ 0.00240651 │ 0.023779 │ 401.606 │ 1.13277 │
│ 2 │ λ2 │ 22.6626 │ 2.31952 │ 0.00733498 │ 0.0729886 │ 401.606 │ 1.01702 │
│ 3 │ τ │ 45.3502 │ 6.71208 │ 0.0212255 │ 0.205265 │ 401.606 │ 1.0116 │
Quantiles
│ Row │ parameters │ 2.5% │ 25.0% │ 50.0% │ 75.0% │ 97.5% │
│ │ Symbol │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │
├─────┼────────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ 1 │ λ1 │ 16.9919 │ 17.1646 │ 17.9925 │ 18.4843 │ 19.9866 │
│ 2 │ λ2 │ 15.2514 │ 22.1434 │ 22.6499 │ 23.7294 │ 26.2475 │
│ 3   │ τ          │ 43.0    │ 45.0    │ 45.0    │ 45.0    │ 70.0    │

And using Gibbs(PG(100, :τ), HMC(0.05, 10, :λ1, :λ2)):

Object of type Chains, with data of type 1000×4×1 Array{Union{Missing, Real},3}
Iterations = 1:1000
Thinning interval = 1
Chains = 1
Samples per chain = 1000
internals = lp
parameters = λ1, λ2, τ
2-element Array{ChainDataFrame,1}
Summary Statistics
│ Row │ parameters │ mean │ std │ naive_se │ mcse │ ess │ r_hat │
│ │ Symbol │ Float64 │ Float64 │ Float64 │ Float64 │ Any │ Any │
├─────┼────────────┼─────────┼──────────┼───────────┼───────────┼─────────┼──────────┤
│ 1 │ λ1 │ 17.761 │ 0.812477 │ 0.0256928 │ 0.0159142 │ 1057.14 │ 0.999353 │
│ 2 │ λ2 │ 22.6707 │ 1.16725 │ 0.0369118 │ 0.130854 │ 12.4083 │ 1.02775 │
│ 3 │ τ │ 43.514 │ 5.0817 │ 0.160697 │ 0.954879 │ 6.13911 │ 1.04248 │
Quantiles
│ Row │ parameters │ 2.5% │ 25.0% │ 50.0% │ 75.0% │ 97.5% │
│ │ Symbol │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │
├─────┼────────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ 1 │ λ1 │ 16.2957 │ 17.2178 │ 17.7442 │ 18.2866 │ 19.3092 │
│ 2 │ λ2 │ 20.6154 │ 22.0664 │ 22.7017 │ 23.3752 │ 24.5441 │
│ 3   │ τ          │ 37.95   │ 44.0    │ 44.0    │ 45.0    │ 45.0    │
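To make the parameterization difference concrete, here is a minimal sketch (assuming only Distributions.jl and Statistics; the stated PyMC3 behavior is the standard rate convention):

using Distributions, Statistics

θ = 19.74                 # roughly mean(y) for this dataset
d = Exponential(θ)        # Distributions.jl: scale parameterization
mean(d)                   # returns θ itself, 19.74

# PyMC3's pm.Exponential("x", lam) uses the rate convention, mean = 1/lam,
# so the book's alpha = 1/count_data.mean() yields a prior mean equal to
# the data mean. Passing that same alpha as the *scale* in Julia instead
# yields a prior mean of alpha ≈ 0.05, which caused the wrong results.
mean(rand(d, 10^6))       # ≈ 19.74, a Monte Carlo check of the scale convention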
Thank you! It never crossed my mind that this could be the problem.