Hvcat stackoverflow #1653

Closed
dleather opened this issue Jul 21, 2024 · 9 comments

@dleather

I'm running Enzyme 0.12.23, EnzymeCore 0.7.7, LinearAlgebra 1.5.0, and SparseArrays 1.10.0 on Julia 1.10.4 on a Windows machine. Enzyme crashes every time I run the code below, whereas Zygote differentiates the same function quickly. I've tried replacing all calls to I() with their explicit matrix form, as well as making Λ non-sparse, and it still crashes.

using Enzyme, SparseArrays, LinearAlgebra

function compute_L1(Σ::Matrix{T}, Λ::AbstractSparseMatrix{T}) where T <: Real
    # LHS coefficient from Proposition 3.2
    #   L₁ = [Σ ⊗ (Iₙ² + Λₙ)] ⊗ [vec(Iₙ) ⊗ Iₙ]
    N = size(Σ, 1)
    return kron(Σ, I + Λ) * kron(vec(I(N)), I(N))
end

Λ =  spzeros(4, 4)
Λ[1, 1] = 1.0
Λ[3, 2] = 1.0
Λ[2, 3] = 1.0
Λ[4, 4] = 1.0

Δt = 0.25

function f(θ)
    σ_z = θ[1]
    θ_z = θ[2]
    Ω = [sqrt(((σ_z^2)/(2.0 * θ_z))*(1-exp(-2*θ_z*Δt))) 0.0; 0.0 0.0]
    Σ = Ω * Ω'
    L1 = compute_L1(Σ, Λ)
    return L1[1]
end

θ = [1.0, 0.5]
dθ = similar(θ)
f(θ)
Enzyme.autodiff(Reverse, f, Active, Duplicated(θ, dθ))
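
A reference gradient for this MWE can be obtained without any AD package, which may help when checking whatever Enzyme eventually returns. The following is a minimal sketch using only the standard library; `central_diff_gradient` is a hypothetical helper written for illustration (it is not part of Enzyme or Zygote), and Λ and Δt are passed as arguments rather than read from globals.

```julia
using LinearAlgebra, SparseArrays

function compute_L1(Σ::Matrix{T}, Λ::AbstractSparseMatrix{T}) where T <: Real
    N = size(Σ, 1)
    return kron(Σ, I + Λ) * kron(vec(I(N)), I(N))
end

# Same computation as the MWE, with Λ and Δt as explicit arguments.
function f(θ, Λ, Δt)
    σ_z = θ[1]
    θ_z = θ[2]
    Ω = zeros(2, 2)
    Ω[1, 1] = sqrt(((σ_z^2) / (2.0 * θ_z)) * (1 - exp(-2 * θ_z * Δt)))
    Σ = Ω * Ω'
    L1 = compute_L1(Σ, Λ)
    return L1[1]
end

# Hypothetical helper: central finite differences, one coordinate at a time.
function central_diff_gradient(g, θ; h = 1e-6)
    grad = similar(θ)
    for i in eachindex(θ)
        step = zeros(length(θ))
        step[i] = h
        grad[i] = (g(θ + step) - g(θ - step)) / (2h)
    end
    return grad
end

# Same nonzero pattern as the MWE: entries (1,1), (3,2), (2,3), (4,4) set to 1.0.
Λ = sparse([1, 3, 2, 4], [1, 2, 3, 4], ones(4), 4, 4)
Δt = 0.25
θ = [1.0, 0.5]

grad_fd = central_diff_gradient(t -> f(t, Λ, Δt), θ)
println(grad_fd)
```

Finite differences are only accurate to a few digits, so this is a sanity check rather than a replacement for the AD result.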
@wsmoses
Member

wsmoses commented Jul 21, 2024

How does Enzyme crash, can you post the log?

@wsmoses
Member

wsmoses commented Jul 21, 2024

Also, FYI: Λ and Δt are non-const globals, so they are type unstable, which will make both your original code and its derivatives slow even when nothing errors. You could pass them in as arguments to f and/or mark the globals as const.
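
To illustrate the point about globals, here is a small sketch that does not need Enzyme. The names `Δt_global`, `Δt_const`, and the `decay_*` functions are made up for this example; `Base.return_types` is used only to inspect what the compiler can infer.

```julia
# Non-const global: its type may change later, so code reading it cannot be inferred.
Δt_global = 0.25
# const global: the type is fixed, so functions that read it can be inferred.
const Δt_const = 0.25

decay_unstable(θ_z) = exp(-2 * θ_z * Δt_global)   # reads the non-const global
decay_const(θ_z) = exp(-2 * θ_z * Δt_const)       # reads the const global
decay_arg(θ_z, Δt) = exp(-2 * θ_z * Δt)           # takes Δt as an argument

inferred_unstable = only(Base.return_types(decay_unstable, (Float64,)))
inferred_const = only(Base.return_types(decay_const, (Float64,)))
inferred_arg = only(Base.return_types(decay_arg, (Float64, Float64)))

println("non-const global: ", inferred_unstable)  # typically Any
println("const global:     ", inferred_const)
println("argument:         ", inferred_arg)
```

Both the const and the argument version let the compiler infer a concrete return type; `@code_warntype` in the REPL shows the same information in more detail.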

@dleather
Author

How do I see the log? The REPL crashes and I only see this message:

The terminal process "C:\Users\davle\AppData\Local\Programs\Julia-1.10.4\bin\julia.exe '-i', '--banner=no', '--project=C:\Users\davle.julia\environments\v1.10', 'c:\Users\davle.vscode\extensions\julialang.language-julia-1.83.2\scripts\terminalserver\terminalserver.jl', '\.\pipe\vsc-jl-repl-7953c62c-ff44-4ad5-9cc4-a14f065b6128', '\.\pipe\vsc-jl-cr-0aea7085-fd31-40a3-81a3-9f8306afd45b', 'USE_REVISE=true', 'USE_PLOTPANE=true', 'USE_PROGRESS=true', 'ENABLE_SHELL_INTEGRATION=true', 'DEBUG_MODE=false'" terminated with exit code: -1073741571.

@wsmoses
Member

wsmoses commented Jul 21, 2024

Running on my linux box -- ah I see you have the type unstable vector constructor issue (x/ref #1134). Julia 1.10 beta3 introduced a change to sparsearrays that causes an infinite recursion here. We should make this handled better, but in the interim this should be resolvable by either making things type stable or not using the array syntactic sugar.

The relevant suggested changes are here:

using Enzyme, SparseArrays, LinearAlgebra

function compute_L1(Σ::Matrix{T}, Λ::AbstractSparseMatrix{T}) where T <: Real
    # LHS coefficient from Proposition 3.2
    #   L₁ = [Σ ⊗ (Iₙ² + Λₙ)] ⊗ [vec(Iₙ) ⊗ Iₙ]
    N = size(Σ, 1)
    return kron(Σ, I + Λ) * kron(vec(I(N)), I(N))
end

Λ =  spzeros(4, 4)
Λ[1, 1] = 1.0
Λ[3, 2] = 1.0
Λ[2, 3] = 1.0
Λ[4, 4] = 1.0

Δt = 0.25

function f(θ, Λ, Δt)
    σ_z = θ[1]
    θ_z = θ[2]
    Ω = zeros(2,2)
    Ω[1,1] = sqrt(((σ_z^2)/(2.0 * θ_z))*(1-exp(-2*θ_z*Δt)))
    Σ = Ω * Ω'
    L1 = compute_L1(Σ, Λ)
    return L1[1]
end

θ = [1.0, 0.5]
dθ = zero(θ)  # reverse mode accumulates into the shadow, so start from zeros
f(θ, Λ, Δt)
Enzyme.autodiff(Reverse, f, Active, Duplicated(θ, dθ), Const(Λ), Const(Δt))

This then hits an unrelated oddity in array push/pop, which I can look at later.
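
For context on the "array syntactic sugar": a matrix literal such as `[a b; c d]` is lowered by Julia to a call to `hvcat`, which is the function that appears in the recursion. The sketch below only shows that the literal, the explicit `hvcat` call, and the `zeros` plus assignment form used in the rewrite above build the same matrix; the value of `σ` is arbitrary and chosen for illustration.

```julia
σ = 0.7  # arbitrary illustrative value

# Matrix literal: lowered to hvcat((2, 2), σ, 0.0, 0.0, 0.0).
Ω_literal = [σ 0.0; 0.0 0.0]

# The explicit call that the literal is sugar for.
Ω_hvcat = hvcat((2, 2), σ, 0.0, 0.0, 0.0)

# The workaround form: allocate, then assign, so hvcat is never called.
Ω_filled = zeros(2, 2)
Ω_filled[1, 1] = σ

println(Ω_literal == Ω_hvcat == Ω_filled)
```

Because the third form never goes through `hvcat`, it sidesteps the problematic call entirely.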

@dleather
Author

Really appreciate you looking into it! It's been running for two hours on my machine once I fixed the type stability.

@wsmoses
Member

wsmoses commented Jul 21, 2024

Oh really, that shouldn't happen (and isn't what I see)?

I get this on the code I pasted above (on my Mac laptop):

julia> Enzyme.autodiff(Reverse, f, Active, Duplicated(θ, dθ), Const(Λ2), Const(Δt))
┌ Warning: TODO reverse jl_array_del_end zero-set used memset rather than runtime type of (true, Vector{Float64}) in {} addrspace(10)* %6
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
┌ Warning: TODO reverse jl_array_del_end zero-set used memset rather than runtime type of (false, nothing) in   %getfield23 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(10)* %getfield_addr22 unordered, align 8, !dbg !263, !tbaa !266, !alias.scope !268, !noalias !269, !nonnull !200, !dereferenceable !233, !align !234
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
┌ Warning: TODO reverse jl_array_del_end zero-set used memset rather than runtime type of (false, nothing) in   %getfield21 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(10)* %getfield_addr20 unordered, align 8, !dbg !486, !tbaa !266, !alias.scope !268, !noalias !269, !nonnull !200, !dereferenceable !233, !align !234
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
┌ Warning: TODO reverse jl_array_del_end zero-set used memset rather than runtime type of (false, nothing) in   %getfield53 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(10)* %getfield_addr52 unordered, align 8, !dbg !464, !tbaa !214, !alias.scope !216, !noalias !265, !nonnull !200, !dereferenceable !231, !align !232
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
┌ Warning: TODO reverse jl_array_del_end zero-set used memset rather than runtime type of (false, nothing) in   %getfield57 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(10)* %getfield_addr56 unordered, align 8, !dbg !313, !tbaa !214, !alias.scope !216, !noalias !265, !nonnull !200, !dereferenceable !231, !align !232
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
┌ Warning: TODO reverse jl_array_del_end zero-set used memset rather than runtime type of (true, Vector{Float64}) in   %getfield3 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %getfield_addr2 unordered, align 8, !dbg !281, !tbaa !205, !alias.scope !214, !noalias !217, !nonnull !200, !dereferenceable !222, !align !223
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
┌ Warning: TODO reverse jl_array_del_end zero-set used memset rather than runtime type of (true, Vector{Float64}) in   %getfield7 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %getfield_addr6 unordered, align 8, !dbg !258, !tbaa !205, !alias.scope !214, !noalias !217, !nonnull !200, !dereferenceable !222, !align !223
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
┌ Warning: TODO reverse jl_array_del_end zero-set used memset rather than runtime type of (false, nothing) in   %getfield71 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(10)* %getfield_addr70 unordered, align 8, !dbg !364, !tbaa !214, !alias.scope !216, !noalias !265, !nonnull !200, !dereferenceable !231, !align !232
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
┌ Warning: TODO reverse jl_array_del_end zero-set used memset rather than runtime type of (false, nothing) in   %getfield76 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(10)* %getfield_addr75 unordered, align 8, !dbg !398, !tbaa !214, !alias.scope !216, !noalias !265, !nonnull !200, !dereferenceable !231, !align !232
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
ERROR: BoundsError: attempt to access 6-element Vector{Int64} at index [0]
Stacktrace:
  [1] _noshapecheck_map
    @ ./essentials.jl:0
  [2] map
    @ /Applications/Julia-1.10.app/Contents/Resources/julia/share/julia/stdlib/v1.10/SparseArrays/src/higherorderfns.jl:1187 [inlined]
  [3] +
    @ /Applications/Julia-1.10.app/Contents/Resources/julia/share/julia/stdlib/v1.10/SparseArrays/src/sparsematrix.jl:2242 [inlined]
  [4] +
    @ /Applications/Julia-1.10.app/Contents/Resources/julia/share/julia/stdlib/v1.10/SparseArrays/src/sparsematrix.jl:4277 [inlined]
  [5] compute_L1
    @ ./REPL[93]:5
  [6] f
    @ ./REPL[119]:7 [inlined]
  [7] f
    @ ./REPL[119]:0 [inlined]
  [8] diffejulia_f_10258_inner_1wrap
    @ ./REPL[119]:0
  [9] macro expansion
    @ ~/git/Enzyme.jl/src/compiler.jl:6633 [inlined]
 [10] enzyme_call
    @ ~/git/Enzyme.jl/src/compiler.jl:6233 [inlined]
 [11] CombinedAdjointThunk
    @ ~/git/Enzyme.jl/src/compiler.jl:6110 [inlined]
 [12] autodiff
    @ ~/git/Enzyme.jl/src/Enzyme.jl:314 [inlined]
 [13] autodiff(::ReverseMode{false, FFIABI, false}, ::typeof(f), ::Type{Active}, ::Duplicated{Vector{Float64}}, ::Const{SparseMatrixCSC{Float64, Int64}}, ::Const{Float64})
    @ Enzyme ~/git/Enzyme.jl/src/Enzyme.jl:326
 [14] top-level scope
    @ REPL[123]:1

@dleather
Author

I'm getting the same error as you with the posted code in maybe 30 seconds. Not sure what is happening with the indexing...

@wsmoses
Member

wsmoses commented Sep 28, 2024

This now hits the hvcat stackoverflow, which I suppose is an improvement.

julia> Enzyme.autodiff(Reverse, f, Active, Duplicated(θ, dθ))
ERROR: StackOverflowError:
Stacktrace:
     [1] hvcat
       @ /Applications/Julia-1.10.app/Contents/Resources/julia/share/julia/stdlib/v1.10/SparseArrays/src/sparsevector.jl:1269 [inlined]
     [2] hvcat
       @ /Applications/Julia-1.10.app/Contents/Resources/julia/share/julia/stdlib/v1.10/SparseArrays/src/sparsevector.jl:0 [inlined]
     [3] augmented_julia_hvcat_3526_inner_1wrap
       @ /Applications/Julia-1.10.app/Contents/Resources/julia/share/julia/stdlib/v1.10/SparseArrays/src/sparsevector.jl:0
     [4] macro expansion
       @ ~/git/Enzyme.jl/src/compiler.jl:9227 [inlined]
     [5] enzyme_call
       @ ~/git/Enzyme.jl/src/compiler.jl:8793 [inlined]
     [6] AugmentedForwardThunk
       @ ~/git/Enzyme.jl/src/compiler.jl:8630 [inlined]
     [7] runtime_generic_augfwd(activity::Type{…}, runtimeActivity::Val{…}, width::Val{…}, ModifiedBetween::Val{…}, RT::Val{…}, f::typeof(hvcat), df::Nothing, primal_1::Tuple{…}, shadow_1_1::Nothing, primal_2::Float64, shadow_2_1::Base.RefValue{…}, primal_3::Float64, shadow_3_1::Base.RefValue{…}, primal_4::Float64, shadow_4_1::Base.RefValue{…}, primal_5::Float64, shadow_5_1::Base.RefValue{…})
       @ Enzyme.Compiler ~/git/Enzyme.jl/src/rules/jitrules.jl:483
--- the last 7 lines are repeated 8029 more times ---
 [56211] hvcat
       @ /Applications/Julia-1.10.app/Contents/Resources/julia/share/julia/stdlib/v1.10/SparseArrays/src/sparsevector.jl:1269 [inlined]
 [56212] hvcat
       @ /Applications/Julia-1.10.app/Contents/Resources/julia/share/julia/stdlib/v1.10/SparseArrays/src/sparsevector.jl:0 [inlined]
 [56213] augmented_julia_hvcat_3313_inner_1wrap
       @ /Applications/Julia-1.10.app/Contents/Resources/julia/share/julia/stdlib/v1.10/SparseArrays/src/sparsevector.jl:0
 [56214] macro expansion
       @ ~/git/Enzyme.jl/src/compiler.jl:9227 [inlined]
 [56215] enzyme_call
       @ ~/git/Enzyme.jl/src/compiler.jl:8793 [inlined]
 [56216] AugmentedForwardThunk
       @ ~/git/Enzyme.jl/src/compiler.jl:8630 [inlined]
 [56217] runtime_generic_augfwd(activity::Type{…}, runtimeActivity::Val{…}, width::Val{…}, ModifiedBetween::Val{…}, RT::Val{…}, f::typeof(hvcat), df::Nothing, primal_1::Tuple{…}, shadow_1_1::Nothing, primal_2::Float64, shadow_2_1::Base.RefValue{…}, primal_3::Float64, shadow_3_1::Nothing, primal_4::Float64, shadow_4_1::Nothing, primal_5::Float64, shadow_5_1::Nothing)
       @ Enzyme.Compiler ~/git/Enzyme.jl/src/rules/jitrules.jl:483
 [56218] f
       @ ./REPL[9]:4 [inlined]
 [56219] augmented_julia_f_1052wrap
       @ ./REPL[9]:0
 [56220] macro expansion
       @ ~/git/Enzyme.jl/src/compiler.jl:9227 [inlined]
 [56221] enzyme_call
       @ ~/git/Enzyme.jl/src/compiler.jl:8793 [inlined]
 [56222] AugmentedForwardThunk
       @ ~/git/Enzyme.jl/src/compiler.jl:8630 [inlined]
 [56223] autodiff
       @ ~/git/Enzyme.jl/src/Enzyme.jl:384 [inlined]
Some type information was truncated. Use `show(err)` to see complete types.

@wsmoses changed the title from "Enzyme Crashing, MWE." to "Hvcat stackoverflow" on Sep 28, 2024
@wsmoses
Member

wsmoses commented Nov 12, 2024

A fix has been made upstream in Julia's SparseArrays: JuliaSparse/SparseArrays.jl#579

It is in the process of being backported to 1.10 and 1.11, so I'm closing this.

@wsmoses closed this as completed on Nov 12, 2024