Support applying plans by column? (Or is it really 300x slower??) #216
Internally, FFTW doesn't actually have an in-place DCT algorithm IIRC, so it allocates a temporary buffer each time you execute the plan, in order to perform the transform out-of-place and copy back the result. With the matrix plan it does this only once, but with your column-by-column approach it happens on every call. Of course, you could use an out-of-place plan and allocate the buffers yourself.
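A minimal sketch of that last suggestion (my own illustration, not code from this thread): build an out-of-place plan and preallocate the output buffer yourself, so repeated executions don't allocate at all.

```julia
using FFTW, LinearAlgebra

n = 10
# plan_r2r (without the `!`) gives an out-of-place plan
F = FFTW.plan_r2r(Vector{Float64}(undef, n), FFTW.REDFT00)

x = randn(n)
y = similar(x)   # preallocated output buffer, reused across calls
mul!(y, F, x)    # writes the transform of x into y; no temporary buffer needed
```

Since the plan is out-of-place, `mul!(y, F, x)` never has to copy results back into `x`, which is exactly what the hidden temporary in the in-place case was for.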
TFW you've been using in-place transforms to be "non-allocating" when you should have been using out-of-place transforms 🤦♂️ Can confirm it's a more reasonable 3x slowdown (due to better multithreading??)
julia> n = 10; F = FFTW.plan_r2r(Vector{Float64}(undef,n), FFTW.REDFT00, 1);
julia> function plmul!(Y, F, X)
@inbounds for j in 1:size(X,2)
mul!(view(Y,:,j), F, view(X,:,j))
end
Y
end
plmul! (generic function with 1 method)
julia> Y = randn(n,1000); X = randn(n,1000); @btime plmul!(Y, F, copy(X));
277.529 μs (2 allocations: 78.20 KiB)
Any thoughts on whether …
It's not threading by default. Probably the slowdown is because FFTW's in-place matrix plan only allocates a buffer big enough to hold a single column, and re-uses it for all of the columns?
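For comparison, a matrix plan transforms every column in a single call, amortizing any internal buffering over the whole array. A self-contained sketch (my own, mirroring the setup used earlier in the thread):

```julia
using FFTW

n, m = 10, 1000
X = randn(n, m)

# Plan the REDFT00 transform along dimension 1 (i.e. down each column)
Fmat = FFTW.plan_r2r(Matrix{Float64}(undef, n, m), FFTW.REDFT00, 1)

Y = Fmat * X   # one call transforms all 1000 columns
```

Each column of `Y` matches what a vector plan would produce column-by-column, but FFTW gets to schedule the whole batch at once.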
What I was thinking is not actually possible in practice since the memory alignment may be off: julia> mul!(view(randn(5,2),:,2), FFTW.plan_r2r(Vector{Float64}(undef,5), FFTW.REDFT10), view(randn(5,2),:,2))
ERROR: ArgumentError: FFTW plan applied to array with wrong memory alignment
Stacktrace:
[1] assert_applicable(p::FFTW.r2rFFTWPlan{Float64, (5,), false, 1, UnitRange{Int64}}, X::SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true})
@ FFTW ~/.julia/packages/FFTW/pHa9y/src/fft.jl:430
[2] assert_applicable(p::FFTW.r2rFFTWPlan{Float64, (5,), false, 1, UnitRange{Int64}}, X::SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}, Y::SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true})
@ FFTW ~/.julia/packages/FFTW/pHa9y/src/fft.jl:435
[3] mul!(y::SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}, p::FFTW.r2rFFTWPlan{Float64, (5,), false, 1, UnitRange{Int64}}, x::SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true})
@ FFTW ~/.julia/packages/FFTW/pHa9y/src/fft.jl:876
[4] top-level scope
@ REPL[13]:1
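One possible way around that alignment restriction (an assumption worth verifying, since it gives up FFTW's SIMD-alignment optimizations) is to create the plan with the FFTW.UNALIGNED flag, which should make it applicable to views at arbitrary offsets:

```julia
using FFTW, LinearAlgebra

n = 5
# UNALIGNED tells FFTW not to assume any particular memory alignment,
# at some potential cost in transform speed
F = FFTW.plan_r2r(Vector{Float64}(undef, n), FFTW.REDFT10; flags=FFTW.UNALIGNED)

A = randn(n, 2)
B = similar(A)
mul!(view(B, :, 2), F, view(A, :, 2))  # column views no longer trip the alignment check
```

The column views are still contiguous (stride 1), so they match the vector plan's layout; only their base-address alignment differs, which the UNALIGNED flag is meant to tolerate.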
I was surprised to learn two things:
NaN (possibly being special-cased):
So originally I was going to propose adding support for applying a vector plan to a matrix, but now I'm worried it's too slow. Though the scale of the slowness seems unbelievable, so perhaps I've just done something wrong.