Notes on 21 Oct. 2022

Agenda

Discuss the addition of batched_mul! for >3D arrays in NNlib.jl (link to issue)

Minutes

Video link: https://youtu.be/ByjLYB6_9ng

Discussion on batched_mul!:

NeuralAttentionLib.jl already has a >3D implementation
- Gradients may be slow
- Tried this with ViTs in Metalhead, CPU implementation was much slower
- Probably need to copy NNlib.jl CPU implementation
Expanding NNlib.jl/NNlibCUDA.jl implementations to >3D will require looping over gemm! or CUBLAS.gemm_strided_batched!
Abhirath will tackle an initial NNlib.jl PR and post on Zulip with issues
We can support non-trailing batch dimensions with a keyword argument and a reshape pre-processing step

Brian submitted a benchmarking suite PR to NNlib.jl

This is ready, Kyle will review

Community questions about Diffractor.jl and shared parameters in Optimisers.jl

Diffractor.jl status is unknown
Optimisers.jl supports shared parameters in v0.2.10