-
Notifications
You must be signed in to change notification settings - Fork 223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug-fixing in dual averaging for HMC #709
Conversation
Models being used
using Revise, Turing
@model gdemo(x, y) = begin
s ~ InverseGamma(2,3)
m ~ Normal(0,sqrt(s))
x ~ Normal(m, sqrt(s))
y ~ Normal(m, sqrt(s))
end
c5 = sample(gdemo(1.5, 2), HMCDA(2000, 0.65, 1.0))
using StatisticalRethinking
using Turing
Turing.setadbackend(:reverse_diff)
d = CSV.read(joinpath(dirname(Base.pathof(StatisticalRethinking)), "..", "data",
"reedfrogs.csv"), delim=';')
size(d) # Should be 48x5
# Set number of tanks
d[:tank] = 1:size(d,1)
@model m12_2(density, tank, surv) = begin
# Separate priors on α and σ for each tank
σ ~ Truncated(Cauchy(0, 1), 0, Inf)
α ~ Normal(0, 1) #Check stancode this might need to be in the for loop below?
# Number of unique tanks in the data set
N_tank = length(tank)
# Set an TArray for the priors/param
# a_tank = TArray{Any}(undef, N_tank)
a_tank = Vector{Real}(undef, N_tank)
# For each tank [1,..,48] set a prior
for i = 1:N_tank
a_tank[i] ~ Normal(α, σ)
end
for i = 1:N_tank
logitp = a_tank[tank[i]]
surv[i] ~ BinomialLogit(density[i], logitp)
end
end
posterior = sample(m12_2(d[:density], d[:tank], d[:surv]),
Turing.NUTS(4000, 500, 0.95),
# Turing.HMCDA(4000, 500, 0.95, 0.3)
)
describe(posterior) LDA # Simulate data
using Distributions
V = 5 # vocab
K = 3 # topic
M = 25 # doc
alpha = collect(ones(K) / K);
beta = collect(ones(V) / V);
theta = rand(Dirichlet(alpha), M)
phi = rand(Dirichlet(beta), K)
@info "true vals" phi theta sum(phi * theta, dims=1)
avg_doc_length = 50
doc_length = rand(Poisson(avg_doc_length), M)
N = sum(doc_length)
w = Vector{Int}(undef, N)
doc = Vector{Int}(undef, N)
n = 1
for m = 1:M, i=1:doc_length[m]
global n
z = rand(Categorical(theta[:,m]))
w[n] = rand(Categorical(phi[:,z]))
doc[n] = m
n = n + 1
end
# LDA
using Turing
struct CategoricalReal<: Distributions.DiscreteUnivariateDistribution
p
end
function Distributions.logpdf(d::CategoricalReal, k::Int64)
return log(d.p[k])
end
@model lda_collapsed(K, V, M, N, w, doc, beta, alpha) = begin
theta = Matrix{Real}(undef, K, M)
for m = 1:M
theta[:,m] ~ Dirichlet(alpha)
end
phi = Matrix{Real}(undef, V, K)
for k = 1:K
phi[:,k] ~ Dirichlet(beta)
end
word_dist = phi * theta
for n = 1:N
w[n] ~ CategoricalReal(word_dist[:,doc[n]])
end
end
mf = lda_collapsed(K, V, M, N, w, doc, beta, alpha)
c3= sample(mf, Turing.NUTS(2000, 0.65)) |
Update: Running the LDA example will occasionally throw At the moment, we don't handle this kind of errors in Turing, and the default Julia behavior is to stop executing the error function and print the stacktrace. A complete solution requires us to catch this type of exception/error in Turing, and reject the current HMC proposal when it happens. This is likely to require some re-work of the error handling mechanism mentioned in TuringLang/AdvancedHMC.jl#16 julia> chain = sample(mf, HMCDA(2000, 0.65, 1.0))
[ Info: [Turing] looking for good initial eps...
[ Info: [Turing] ϵ = 0.2, accept ratio a = 6.58293101371248e-25
[ Info: [Turing] ϵ = 0.15000000000000002, accept ratio a = 1.0
[ Info: [Turing] ϵ = 0.17500000000000002, accept ratio a = 1.0
[ Info: [Turing] ϵ = 0.1875, accept ratio a = 1.0
[ Info: [Turing] ϵ = 0.19375, accept ratio a = 1.0
[ Info: [Turing] ϵ = 0.19687500000000002, accept ratio a = 1.6572427130567486e-9
[ Info: [Turing] ϵ = 0.1953125, accept ratio a = 0.04197809303831285
[ Info: [Turing] ϵ = 0.19453125, accept ratio a = 1.0
[ Info: [Turing] ϵ = 0.194921875, accept ratio a = 1.0
[ Info: [Turing] ϵ = 0.1951171875, accept ratio a = 0.3421968454472759
[ Info: [Turing] found initial ϵ: 0.1951171875
ERROR: DomainError with -2.220446049250313e-16:
log will only return a complex result if called with a complex argument. Try log(Complex(x)).
Stacktrace: 1.0
[1] throw_complex_domainerror(::Symbol, ::Float64) at ./math.jl:31
[2] log(::Float64) at ./special/log.jl:285
[3] log(::ForwardDiff.Dual{ForwardDiff.Tag{getfield(Turing.Core, Symbol("#f#26")){Turing.Core.VarReplay.VarInfo,Turing.Model{Tuple{:theta,:phi},Tuple{:w},getfield(Main, Symbol("###inner_function#413#22")){Int64,Int64,Int64,Int64,Array{Int64,1},Array{Float64,1},Array{Float64,1}},NamedTuple{(:w,),Tuple{Array{Int64,1}}},NamedTuple{(:w,),Tuple{Symbol}}},Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}},Real},Float64,10}) at /Users/hg344/.julia/packages/ForwardDiff/N0wMF/src/dual.jl:203
[4] logpdf_with_trans(::Dirichlet{Float64}, ::Array{ForwardDiff.Dual{ForwardDiff.Tag{getfield(Turing.Core, Symbol("#f#26")){Turing.Core.VarReplay.VarInfo,Turing.Model{Tuple{:theta,:phi},Tuple{:w},getfield(Main, Symbol("###inner_function#413#22")){Int64,Int64,Int64,Int64,Array{Int64,1},Array{Float64,1},Array{Float64,1}},NamedTuple{(:w,),Tuple{Array{Int64,1}}},NamedTuple{(:w,),Tuple{Symbol}}},Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}},Real},Real,10},1}, ::Bool) at /Users/hg344/.julia/packages/Bijectors/vCpPQ/src/Bijectors.jl:292
[5] assume(::Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}, ::Dirichlet{Float64}, ::Turing.Core.VarReplay.VarName, ::Turing.Core.VarReplay.VarInfo) at /Users/hg344/.julia/dev/Turing/src/inference/hmc.jl:277
[6] macro expansion at /Users/hg344/.julia/dev/Turing/src/core/compiler.jl:102 [inlined]
[7] macro expansion at ./REPL[27]:9 [inlined]
[8] (::getfield(Main, Symbol("###inner_function#413#22")){Int64,Int64,Int64,Int64,Array{Int64,1},Array{Float64,1},Array{Float64,1}})(::Turing.Core.VarReplay.VarInfo, ::Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}, ::Turing.Model{Tuple{:theta,:phi},Tuple{:w},getfield(Main, Symbol("###inner_function#413#22")){Int64,Int64,Int64,Int64,Array{Int64,1},Array{Float64,1},Array{Float64,1}},NamedTuple{(:w,),Tuple{Array{Int64,1}}},NamedTuple{(:w,),Tuple{Symbol}}}) at /Users/hg344/.julia/dev/Turing/src/core/compiler.jl:389
[9] #call#3 at /Users/hg344/.julia/dev/Turing/src/Turing.jl:62 [inlined]
[10] Model at /Users/hg344/.julia/dev/Turing/src/Turing.jl:62 [inlined]
[11] runmodel!(::Turing.Model{Tuple{:theta,:phi},Tuple{:w},getfield(Main, Symbol("###inner_function#413#22")){Int64,Int64,Int64,Int64,Array{Int64,1},Array{Float64,1},Array{Float64,1}},NamedTuple{(:w,),Tuple{Array{Int64,1}}},NamedTuple{(:w,),Tuple{Symbol}}}, ::Turing.Core.VarReplay.VarInfo, ::Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}) at /Users/hg344/.julia/dev/Turing/src/core/VarReplay.jl:110
[12] f at /Users/hg344/.julia/dev/Turing/src/core/ad.jl:109 [inlined]
[13] chunk_mode_gradient!(::Array{Real,1}, ::getfield(Turing.Core, Symbol("#f#26")){Turing.Core.VarReplay.VarInfo,Turing.Model{Tuple{:theta,:phi},Tuple{:w},getfield(Main, Symbol("###inner_function#413#22")){Int64,Int64,Int64,Int64,Array{Int64,1},Array{Float64,1},Array{Float64,1}},NamedTuple{(:w,),Tuple{Array{Int64,1}}},NamedTuple{(:w,),Tuple{Symbol}}},Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}}, ::Array{Real,1}, ::ForwardDiff.GradientConfig{ForwardDiff.Tag{getfield(Turing.Core, Symbol("#f#26")){Turing.Core.VarReplay.VarInfo,Turing.Model{Tuple{:theta,:phi},Tuple{:w},getfield(Main, Symbol("###inner_function#413#22")){Int64,Int64,Int64,Int64,Array{Int64,1},Array{Float64,1},Array{Float64,1}},NamedTuple{(:w,),Tuple{Array{Int64,1}}},NamedTuple{(:w,),Tuple{Symbol}}},Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}},Real},Real,10,Array{ForwardDiff.Dual{ForwardDiff.Tag{getfield(Turing.Core, Symbol("#f#26")){Turing.Core.VarReplay.VarInfo,Turing.Model{Tuple{:theta,:phi},Tuple{:w},getfield(Main, Symbol("###inner_function#413#22")){Int64,Int64,Int64,Int64,Array{Int64,1},Array{Float64,1},Array{Float64,1}},NamedTuple{(:w,),Tuple{Array{Int64,1}}},NamedTuple{(:w,),Tuple{Symbol}}},Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}},Real},Real,10},1}}) at /Users/hg344/.julia/packages/ForwardDiff/N0wMF/src/gradient.jl:140
[14] gradient! at /Users/hg344/.julia/packages/ForwardDiff/N0wMF/src/gradient.jl:37 [inlined]
[15] gradient!(::Array{Real,1}, ::getfield(Turing.Core, Symbol("#f#26")){Turing.Core.VarReplay.VarInfo,Turing.Model{Tuple{:theta,:phi},Tuple{:w},getfield(Main, Symbol("###inner_function#413#22")){Int64,Int64,Int64,Int64,Array{Int64,1},Array{Float64,1},Array{Float64,1}},NamedTuple{(:w,),Tuple{Array{Int64,1}}},NamedTuple{(:w,),Tuple{Symbol}}},Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}}, ::Array{Real,1}, ::ForwardDiff.GradientConfig{ForwardDiff.Tag{getfield(Turing.Core, Symbol("#f#26")){Turing.Core.VarReplay.VarInfo,Turing.Model{Tuple{:theta,:phi},Tuple{:w},getfield(Main, Symbol("###inner_function#413#22")){Int64,Int64,Int64,Int64,Array{Int64,1},Array{Float64,1},Array{Float64,1}},NamedTuple{(:w,),Tuple{Array{Int64,1}}},NamedTuple{(:w,),Tuple{Symbol}}},Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}},Real},Real,10,Array{ForwardDiff.Dual{ForwardDiff.Tag{getfield(Turing.Core, Symbol("#f#26")){Turing.Core.VarReplay.VarInfo,Turing.Model{Tuple{:theta,:phi},Tuple{:w},getfield(Main, Symbol("###inner_function#413#22")){Int64,Int64,Int64,Int64,Array{Int64,1},Array{Float64,1},Array{Float64,1}},NamedTuple{(:w,),Tuple{Array{Int64,1}}},NamedTuple{(:w,),Tuple{Symbol}}},Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}},Real},Real,10},1}}) at /Users/hg344/.julia/packages/ForwardDiff/N0wMF/src/gradient.jl:33
[16] gradient_logp_forward(::Array{Real,1}, ::Turing.Core.VarReplay.VarInfo, ::Turing.Model{Tuple{:theta,:phi},Tuple{:w},getfield(Main, Symbol("###inner_function#413#22")){Int64,Int64,Int64,Int64,Array{Int64,1},Array{Float64,1},Array{Float64,1}},NamedTuple{(:w,),Tuple{Array{Int64,1}}},NamedTuple{(:w,),Tuple{Symbol}}}, ::Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}) at /Users/hg344/.julia/dev/Turing/src/core/ad.jl:116
[17] gradient_logp at /Users/hg344/.julia/dev/Turing/src/core/ad.jl:80 [inlined]
[18] #2 at /Users/hg344/.julia/dev/Turing/src/inference/support/hmc_core.jl:12 [inlined]
[19] #_leapfrog#26(::getfield(Turing.Inference, Symbol("##6#7")){Turing.Core.VarReplay.VarInfo,Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}}, ::getfield(Turing.Inference, Symbol("##8#9")){Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}}, ::Function, ::Array{Real,1}, ::Array{Float64,1}, ::Int64, ::Float64, ::getfield(Turing.Inference, Symbol("##2#3")){Turing.Core.VarReplay.VarInfo,Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}},Turing.Model{Tuple{:theta,:phi},Tuple{:w},getfield(Main, Symbol("###inner_function#413#22")){Int64,Int64,Int64,Int64,Array{Int64,1},Array{Float64,1},Array{Float64,1}},NamedTuple{(:w,),Tuple{Array{Int64,1}}},NamedTuple{(:w,),Tuple{Symbol}}}}) at /Users/hg344/.julia/dev/Turing/src/inference/support/hmc_core.jl:158
[20] (::getfield(Turing.Inference, Symbol("#kw##_leapfrog")))(::NamedTuple{(:rev_func, :log_func),Tuple{getfield(Turing.Inference, Symbol("##6#7")){Turing.Core.VarReplay.VarInfo,Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}},getfield(Turing.Inference, Symbol("##8#9")){Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}}}}, ::typeof(Turing.Inference._leapfrog), ::Array{Real,1}, ::Array{Float64,1}, ::Int64, ::Float64, ::Function) at ./none:0
[21] #_hmc_step#27(::Function, ::Function, ::Function, ::Array{Real,1}, ::Float64, ::getfield(Turing.Inference, Symbol("##4#5")){Turing.Core.VarReplay.VarInfo,Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}},Turing.Model{Tuple{:theta,:phi},Tuple{:w},getfield(Main, Symbol("###inner_function#413#22")){Int64,Int64,Int64,Int64,Array{Int64,1},Array{Float64,1},Array{Float64,1}},NamedTuple{(:w,),Tuple{Array{Int64,1}}},NamedTuple{(:w,),Tuple{Symbol}}}}, ::Function, ::getfield(Turing.Inference, Symbol("##20#21")){Array{Float64,1}}, ::Int64, ::Float64, ::getfield(Turing.Inference, Symbol("##18#19")){Int64,Array{Float64,1}}) at /Users/hg344/.julia/dev/Turing/src/inference/support/hmc_core.jl:195
[22] (::getfield(Turing.Inference, Symbol("#kw##_hmc_step")))(::NamedTuple{(:rev_func, :log_func),Tuple{getfield(Turing.Inference, Symbol("##6#7")){Turing.Core.VarReplay.VarInfo,Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}},getfield(Turing.Inference, Symbol("##8#9")){Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}}}}, ::typeof(Turing.Inference._hmc_step), ::Array{Real,1}, ::Float64, ::Function, ::Function, ::Function, ::Int64, ::Float64, ::getfield(Turing.Inference, Symbol("##18#19")){Int64,Array{Float64,1}}) at ./none:0
[23] #_hmc_step#28 at /Users/hg344/.julia/dev/Turing/src/inference/support/hmc_core.jl:224 [inlined]
[24] #_hmc_step at ./none:0 [inlined]
[25] #hmc_step#30 at /Users/hg344/.julia/dev/Turing/src/inference/hmcda.jl:72 [inlined]
[26] (::getfield(Turing.Inference, Symbol("#kw##hmc_step")))(::NamedTuple{(:rev_func, :log_func),Tuple{getfield(Turing.Inference, Symbol("##6#7")){Turing.Core.VarReplay.VarInfo,Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}},getfield(Turing.Inference, Symbol("##8#9")){Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}}}}, ::typeof(Turing.Inference.hmc_step), ::Array{Real,1}, ::Float64, ::Function, ::Function, ::Function, ::Float64, ::HMCDA{Turing.Core.ForwardDiffAD{40},Any}, ::getfield(Turing.Inference, Symbol("##18#19")){Int64,Array{Float64,1}}) at ./none:0
[27] step(::Turing.Model{Tuple{:theta,:phi},Tuple{:w},getfield(Main, Symbol("###inner_function#413#22")){Int64,Int64,Int64,Int64,Array{Int64,1},Array{Float64,1},Array{Float64,1}},NamedTuple{(:w,),Tuple{Array{Int64,1}}},NamedTuple{(:w,),Tuple{Symbol}}}, ::Turing.Sampler{HMCDA{Turing.Core.ForwardDiffAD{40},Any}}, ::Turing.Core.VarReplay.VarInfo, ::Val{false}) at /Users/hg344/.julia/dev/Turing/src/inference/hmc.jl:236
[28] macro expansion at ./util.jl:213 [inlined]
[29] #sample#37(::Bool, ::Nothing, ::Int64, ::Nothing, ::Function, ::Turing.Model{Tuple{:theta,:phi},Tuple{:w},getfield(Main, Symbol("###inner_function#413#22")){Int64,Int64,Int64,Int64,Array{Int64,1},Array{Float64,1},Array{Float64,1}},NamedTuple{(:w,),Tuple{Array{Int64,1}}},NamedTuple{(:w,),Tuple{Symbol}}}, ::HMCDA{Turing.Core.ForwardDiffAD{40},Any}) at /Users/hg344/.julia/dev/Turing/src/inference/hmc.jl:153
[30] sample(::Turing.Model{Tuple{:theta,:phi},Tuple{:w},getfield(Main, Symbol("###inner_function#413#22")){Int64,Int64,Int64,Int64,Array{Int64,1},Array{Float64,1},Array{Float64,1}},NamedTuple{(:w,),Tuple{Array{Int64,1}}},NamedTuple{(:w,),Tuple{Symbol}}}, ::HMCDA{Turing.Core.ForwardDiffAD{40},Any}) at /Users/hg344/.julia/dev/Turing/src/inference/hmc.jl:108
[31] top-level scope at none:0 |
Ready for a review now. The remaining issue with LDA is not step-size related and better be addressed in a separate PR. |
while (iter_num <= max_num_iters) | ||
θ = θ_prime | ||
delta_H = H0 - h # logp(θ`) - logp(θ) | ||
direction = delta_H > log(a_cross) ? 1 : -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
Looks great! Many thanks for catching the bug. I'm happy to merge this PR! |
The step-size adaption problem is probably resolved in this PR. So far I have been using the
gdemo
example, and them12_2
model fromStatisticalRethinking
for testing. Please take a look and let me know if you have better models for testing.The effective sample size for
m12_2
I'm working on a more robust version of
_find_good_eps
. When this is done, the following line will be changed back toϵ = direction == 1 ? 2 * ϵ : 0.5 * ϵ
Turing.jl/src/inference/support/hmc_core.jl
Line 269 in 7464432
Related: #628