
Model saved under Flux v0.14.16 does not load on v0.14.17 #2476

Closed · cirobr opened this issue Aug 10, 2024 · 6 comments · Fixed by #2477
@cirobr commented Aug 10, 2024

Cheers,

I have a number of model states saved with JLD2 under Flux v0.14.16. After upgrading to Flux v0.14.17, the following error shows up when trying to load them:

ArgumentError: Tried to load (:pad, :σ, :weight, :bias, :groups, :stride, :dilation) into (:pad, :σ, :weight, :bias, :outpad, :groups, :stride, :dilation) but the structures do not match.

The exact functions for saving and loading are as follows:

function saveModelState(fpfn, model)
    modelcpu    = Flux.cpu(model)
    model_state = Flux.state(modelcpu)
    JLD2.jldsave(fpfn; model_state)
end

function loadModelState!(fpfn, modelcpu)
    model_state = JLD2.load(fpfn, "model_state")
    Flux.loadmodel!(modelcpu, model_state)
end
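
For context, here is a minimal usage sketch of the helpers above (the model and file name are illustrative, not taken from the original report):

using Flux, JLD2

model = Chain(Dense(2 => 4, relu), Dense(4 => 1))   # illustrative model
saveModelState("model_state.jld2", model)           # writes Flux.state(cpu(model)) to disk
loadModelState!("model_state.jld2", model)          # restores the saved parameters in place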

I can make the problem come and go by switching between Flux versions. Please advise how to keep saved models usable as Flux evolves.

Thanks in advance.

@cirobr (Author) commented Aug 10, 2024

Stacktrace follows:

{
"name": "ArgumentError",
"message": "ArgumentError: Tried to load (:pad, :σ, :weight, :bias, :groups, :stride, :dilation) into (:pad, :σ, :weight, :bias, :outpad, :groups, :stride, :dilation) but the structures do not match.",
"stack": "ArgumentError: Tried to load (:pad, :σ, :weight, :bias, :groups, :stride, :dilation) into (:pad, :σ, :weight, :bias, :outpad, :groups, :stride, :dilation) but the structures do not match.

Stacktrace:
[1] loadmodel!(dst::ConvTranspose{2, 4, typeof(identity), Array{Float32, 4}, Vector{Float32}}, src::@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}; filter::Function, cache::Base.IdSet{Any})
@ Flux ~/.julia/packages/Flux/JKfiV/src/loading.jl:95
[2] loadmodel!(dst::Tuple{Conv{2, 4, typeof(leakyrelu), Array{Float32, 4}, Vector{Float32}}, Dropout{Float64, Colon, Random.TaskLocalRNG}, Conv{2, 4, typeof(leakyrelu), Array{Float32, 4}, Vector{Float32}}, ConvTranspose{2, 4, typeof(identity), Array{Float32, 4}, Vector{Float32}}, BatchNorm{typeof(leakyrelu), Vector{Float32}, Float32, Vector{Float32}}}, src::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}; filter::Function, cache::Base.IdSet{Any})
@ Flux ~/.julia/packages/Flux/JKfiV/src/loading.jl:105
[3] loadmodel!(dst::Chain{Tuple{Conv{2, 4, typeof(leakyrelu), Array{Float32, 4}, Vector{Float32}}, Dropout{Float64, Colon, Random.TaskLocalRNG}, Conv{2, 4, typeof(leakyrelu), Array{Float32, 4}, Vector{Float32}}, ConvTranspose{2, 4, typeof(identity), Array{Float32, 4}, Vector{Float32}}, BatchNorm{typeof(leakyrelu), Vector{Float32}, Float32, Vector{Float32}}}}, src::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}; filter::Function, cache::Base.IdSet{Any})
@ Flux ~/.julia/packages/Flux/JKfiV/src/loading.jl:105
[4] loadmodel!(dst::@NamedTuple{e4::Chain{Tuple{ConvTranspose{2, 4, typeof(identity), Array{Float32, 4}, Vector{Float32}}, BatchNorm{typeof(leakyrelu), Vector{Float32}, Float32, Vector{Float32}}}}, e3::Chain{Tuple{Conv{2, 4, typeof(leakyrelu), Array{Float32, 4}, Vector{Float32}}, Dropout{Float64, Colon, Random.TaskLocalRNG}, Conv{2, 4, typeof(leakyrelu), Array{Float32, 4}, Vector{Float32}}, ConvTranspose{2, 4, typeof(identity), Array{Float32, 4}, Vector{Float32}}, BatchNorm{typeof(leakyrelu), Vector{Float32}, Float32, Vector{Float32}}}}, e2::Chain{Tuple{Conv{2, 4, typeof(leakyrelu), Array{Float32, 4}, Vector{Float32}}, Dropout{Float64, Colon, Random.TaskLocalRNG}, Conv{2, 4, typeof(leakyrelu), Array{Float32, 4}, Vector{Float32}}, ConvTranspose{2, 4, typeof(identity), Array{Float32, 4}, Vector{Float32}}, BatchNorm{typeof(leakyrelu), Vector{Float32}, Float32, Vector{Float32}}}}, e1::Chain{Tuple{Conv{2, 4, typeof(leakyrelu), Array{Float32, 4}, Vector{Float32}}, Dropout{Float64, Colon, Random.TaskLocalRNG}, Conv{2, 4, typeof(identity), Array{Float32, 4}, Vector{Float32}}, BatchNorm{typeof(leakyrelu), Vector{Float32}, Float32, Vector{Float32}}}}, e0::Conv{2, 4, typeof(identity), Array{Float32, 4}, Bool}, act::TinyMachines.var"#14#16"}, src::@NamedTuple{e4::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, e3::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, e2::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, e1::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, 
dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, e0::@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Bool, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, act::Tuple{}}; filter::Function, cache::Base.IdSet{Any})
@ Flux ~/.julia/packages/Flux/JKfiV/src/loading.jl:105
[5] loadmodel!(dst::Chain{@NamedTuple{e4::Chain{Tuple{ConvTranspose{2, 4, typeof(identity), Array{Float32, 4}, Vector{Float32}}, BatchNorm{typeof(leakyrelu), Vector{Float32}, Float32, Vector{Float32}}}}, e3::Chain{Tuple{Conv{2, 4, typeof(leakyrelu), Array{Float32, 4}, Vector{Float32}}, Dropout{Float64, Colon, Random.TaskLocalRNG}, Conv{2, 4, typeof(leakyrelu), Array{Float32, 4}, Vector{Float32}}, ConvTranspose{2, 4, typeof(identity), Array{Float32, 4}, Vector{Float32}}, BatchNorm{typeof(leakyrelu), Vector{Float32}, Float32, Vector{Float32}}}}, e2::Chain{Tuple{Conv{2, 4, typeof(leakyrelu), Array{Float32, 4}, Vector{Float32}}, Dropout{Float64, Colon, Random.TaskLocalRNG}, Conv{2, 4, typeof(leakyrelu), Array{Float32, 4}, Vector{Float32}}, ConvTranspose{2, 4, typeof(identity), Array{Float32, 4}, Vector{Float32}}, BatchNorm{typeof(leakyrelu), Vector{Float32}, Float32, Vector{Float32}}}}, e1::Chain{Tuple{Conv{2, 4, typeof(leakyrelu), Array{Float32, 4}, Vector{Float32}}, Dropout{Float64, Colon, Random.TaskLocalRNG}, Conv{2, 4, typeof(identity), Array{Float32, 4}, Vector{Float32}}, BatchNorm{typeof(leakyrelu), Vector{Float32}, Float32, Vector{Float32}}}}, e0::Conv{2, 4, typeof(identity), Array{Float32, 4}, Bool}, act::TinyMachines.var"#14#16"}}, src::@NamedTuple{layers::@NamedTuple{e4::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, e3::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, e2::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, e1::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, 
@NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, e0::@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Bool, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, act::Tuple{}}}; filter::Function, cache::Base.IdSet{Any})
@ Flux ~/.julia/packages/Flux/JKfiV/src/loading.jl:105
[6] loadmodel!(dst::UNet4, src::@NamedTuple{enc::@NamedTuple{layers::@NamedTuple{c1::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, c2::@NamedTuple{layers::Tuple{Tuple{}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, c3::@NamedTuple{layers::Tuple{Tuple{}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, c4::@NamedTuple{layers::Tuple{Tuple{}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}}}, dec::@NamedTuple{layers::@NamedTuple{e4::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, e3::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, 
dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, e2::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, e1::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, e0::@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Bool, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, act::Tuple{}}}, verbose::Bool}; filter::Function, cache::Base.IdSet{Any})
@ Flux ~/.julia/packages/Flux/JKfiV/src/loading.jl:105
[7] loadmodel!(dst::UNet4, src::@NamedTuple{enc::@NamedTuple{layers::@NamedTuple{c1::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, c2::@NamedTuple{layers::Tuple{Tuple{}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, c3::@NamedTuple{layers::Tuple{Tuple{}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, c4::@NamedTuple{layers::Tuple{Tuple{}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}}}, dec::@NamedTuple{layers::@NamedTuple{e4::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, e3::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, 
dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, e2::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, e1::@NamedTuple{layers::Tuple{@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{p::Float64, dims::Tuple{}, active::Nothing, rng::Tuple{}}, @NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Vector{Float32}, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, @NamedTuple{λ::Tuple{}, β::Vector{Float32}, γ::Vector{Float32}, μ::Vector{Float32}, σ²::Vector{Float32}, ϵ::Float32, momentum::Float32, affine::Bool, track_stats::Bool, active::Nothing, chs::Int64}}}, e0::@NamedTuple{σ::Tuple{}, weight::Array{Float32, 4}, bias::Bool, stride::Tuple{Int64, Int64}, pad::NTuple{4, Int64}, dilation::Tuple{Int64, Int64}, groups::Int64}, act::Tuple{}}}, verbose::Bool})
@ Flux ~/.julia/packages/Flux/JKfiV/src/loading.jl:90
[8] loadModelState!(fpfn::String, modelcpu::UNet4)
@ LibML ~/.julia/packages/LibML/CMERB/src/modelstate.jl:11
[9] top-level scope
@ ~/projects/knowledge-distillation/jl_notebook_cell_df34fa98e69747e1a8f8a730347b8e2f_X24sdnNjb2RlLXJlbW90ZQ==.jl:10"
}

@mcabbott (Member) commented:

The struct was changed in #2462 to add a new field. This new field doesn't contain anything that loadmodel! ought to care about, but it does seem to upset its check that the models agree.
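
A quick way to check which layout an installed Flux has (a small sketch; the field name is the one visible in the state dumps below):

using Flux
:outpad in fieldnames(ConvTranspose)   # true on v0.14.17 (field added by #2462), false on v0.14.16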

I suppose that's a kind of deprecation that needs to be taken care of when adding things. The function here is recursive, so it ought to be possible to add a method like loadmodel!(dst::ConvTranspose, src; kw...) which performs the adjustment before proceeding.

CC @guiyrt, any chance you can have a look?

mcabbott added the bug label Aug 10, 2024
@dorn-gerhard commented:

I have the same problem.
Loading a model created with Flux 0.14.16 or earlier breaks with Flux 0.14.17, because ConvTranspose has a new field :outpad:

u = Flux.loadmodel!(empty_model, model_state) 

ERROR: ArgumentError: Tried to load (:pad, :σ, :weight, :bias, :groups, :stride, :dilation) into (:pad, :σ, :weight, :bias, :outpad, :groups, :stride, :dilation) but the structures do not match.

Is there a way to update an old model to contain the new keywords?
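
One possible one-off migration, sketched here for a single ConvTranspose state (the helper and file names are hypothetical; it assumes the layer can be rebuilt in code so its default outpad value is available). For a whole nested model, the dispatch-based loadmodel! method suggested in the next comment handles the recursion instead:

using Flux, JLD2

# hypothetical helper: splice the layer's own :outpad into an old-format state entry
patch_outpad(old::NamedTuple, layer::ConvTranspose) =
    (; old.σ, old.weight, old.bias, old.stride, old.pad, layer.outpad, old.dilation, old.groups)

ct  = ConvTranspose((3, 3), 4 => 2)                                   # freshly built layer (new Flux)
old = JLD2.load("old_state.jld2", "model_state")                      # state saved under Flux ≤ v0.14.16
JLD2.jldsave("new_state.jld2"; model_state = patch_outpad(old, ct))   # re-save in the new layout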

@mcabbott (Member) commented:

I think this is something like the minimal way to trigger the problem:

julia> using Flux

julia> ct = ConvTranspose((1,), 1=>1);

julia> Flux.state(ct)
(σ = (), weight = Float32[0.7935649;;;], bias = Float32[0.0], stride = (1,), pad = (0, 0), outpad = (0,), dilation = (1,), groups = 1)

julia> nt = (σ = (), weight = Float32[1.4811754;;;], bias = Float32[0.0], stride = (1,), pad = (0, 0), dilation = (1,), groups = 1);  # is what Flux 0.14.16 would return

julia> Flux.loadmodel!(ct, nt)
ERROR: ArgumentError: Tried to load (:pad, :σ, :weight, :bias, :groups, :stride, :dilation) into (:pad, :σ, :weight, :bias, :outpad, :groups, :stride, :dilation) but the structures do not match.

The simplest fix is to add an explicit method like so:

julia> function Flux.loadmodel!(dst::ConvTranspose, src::NamedTuple{(:σ, :weight, :bias, :stride, :pad, :dilation, :groups)}; kw...)
           new_src = (; src.σ, src.weight, src.bias, src.stride, src.pad, dst.outpad, src.dilation, src.groups)
           Flux.loadmodel!(dst, new_src; kw...)
       end

julia> Flux.loadmodel!(ct, nt)
ConvTranspose((1,), 1 => 1)  # 2 parameters

julia> Flux.loadmodel!(Chain(ct, ct), (; layers = (nt, nt)))  # loadmodel! is recursive
Chain(
  ConvTranspose((1,), 1 => 1),          # 2 parameters
  ConvTranspose((1,), 1 => 1),          # 2 parameters
) 

Please try this out. If it works OK then I think it could just be added here:

https://github.com/FluxML/Flux.jl/blob/master/src/deprecations.jl

What this doesn't fix is cases like the following; I don't know whether anyone uses loadmodel! this way:

julia> Flux.loadmodel!(Flux.state(ct), nt)
ERROR: ArgumentError: Tried to load (:pad, :σ, :weight, :bias, :groups, :stride, :dilation) into (:pad, :σ, :weight, :bias, :outpad, :groups, :stride, :dilation) but the structures do not match.

@guiyrt (Contributor) commented Aug 12, 2024

I could replicate the problem and was testing out some possible fixes as well. What do you think about allowing the src parameters to be a subset of the dst parameters? That way, future changes that add new struct fields will not break backwards compatibility of loadmodel!. It would still raise the error if fields are removed, but I assume that doesn't happen often (and we could also change loadmodel! to allow that).

Currently, loadmodel! throws an error if the dst and src keys are not exactly the same:

function loadmodel!(dst, src; filter = _ -> true, cache = Base.IdSet())
  ldsts = _filter_children(filter, Functors.children(dst))
  lsrcs = _filter_children(filter, Functors.children(src))
  keys_ldsts = keys(ldsts)
  keys_lsrcs = keys(lsrcs)
  collect(keys_ldsts) == collect(keys_lsrcs) || throw(ArgumentError("Tried to load $(keys_lsrcs) into $(keys_ldsts) but the structures do not match."))
  
  for k in keys_lsrcs
    lsrc, ldst = lsrcs[k], ldsts[k]
    if ldst in cache # we already loaded this parameter before
      _tie_check(ldst, lsrc)
    elseif Functors.isleaf(ldst) # our first time loading this leaf
      push!(cache, ldst)
      loadleaf!(ldst, lsrc)
    else # this isn't a leaf
      loadmodel!(ldst, lsrc; filter, cache)
    end
  end

  return dst
end

My suggested change is to check instead if src keys are a subset of dst keys:

function loadmodel!(dst, src; filter = _ -> true, cache = Base.IdSet())
  ldsts = _filter_children(filter, Functors.children(dst))
  lsrcs = _filter_children(filter, Functors.children(src))
  keys_ldsts = keys(ldsts)
  keys_lsrcs = keys(lsrcs)
  collect(keys_lsrcs) ⊆ collect(keys_ldsts) || throw(ArgumentError("Tried to load $(keys_lsrcs) into $(keys_ldsts) but the structures do not match."))
  
  for k in keys_lsrcs
    lsrc, ldst = lsrcs[k], ldsts[k]
    if ldst in cache # we already loaded this parameter before
      _tie_check(ldst, lsrc)
    elseif Functors.isleaf(ldst) # our first time loading this leaf
      push!(cache, ldst)
      loadleaf!(ldst, lsrc)
    else # this isn't a leaf
      loadmodel!(ldst, lsrc; filter, cache)
    end
  end

  return dst
end
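
To illustrate the proposed rule in isolation (a sketch using the key sets from this issue, not the actual Flux internals):

old_keys = (:σ, :weight, :bias, :stride, :pad, :dilation, :groups)            # state saved by Flux ≤ v0.14.16
new_keys = (:σ, :weight, :bias, :stride, :pad, :outpad, :dilation, :groups)   # ConvTranspose fields on v0.14.17

collect(old_keys) ⊆ collect(new_keys)   # true: loading the old state would now be allowed
collect(new_keys) ⊆ collect(old_keys)   # false: a removed field would still throw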

What are your thoughts on this?

@mcabbott (Member) commented:

It's possible that we should have some more permissive rule like that. It would also allow loading e.g. the state of Embedding into Dense (leaving bias untouched), but not the reverse; do we want that?

julia> Flux.state(Dense(1=>1))
(weight = Float32[0.85921806;;], bias = Float32[0.0], σ = ())

julia> Flux.state(Embedding(1=>1))
(weight = Float32[-1.777719;;],)

julia> Flux.state(Conv((1,), 1=>1))
(σ = (), weight = Float32[-0.11579989;;;], bias = Float32[0.0], stride = (1,), pad = (0, 0), dilation = (1,), groups = 1)

julia> Flux.state(ConvTranspose((1,), 1=>1))
(σ = (), weight = Float32[1.0422757;;;], bias = Float32[0.0], stride = (1,), pad = (0, 0), outpad = (0,), dilation = (1,), groups = 1)

My narrow proposal doesn't entirely avoid this question, since it would let you load the state of Conv into ConvTranspose... just as you can presently load it into CrossCor, since its fields agree with Conv's.
