- Discuss the addition of
batched_mul!
for >3D arrays in NNlib.jl (link to issue)
Video link: https://youtu.be/ByjLYB6_9ng
Discussion on batched_mul!
:
- NeuralAttentionLib.jl already has a >3D implementation
- Gradients may be slow
- Tried this with ViTs in Metalhead, CPU implementation was much slower
- Probably need to copy NNlib.jl CPU implementation
- Expanding NNlib.jl/NNlibCUDA.jl implementations to >3D will require looping over
gemm!
orCUBLAS.gemm_strided_batched!
- Abhirath will tackle an initial NNlib.jl PR and post on Zulip with issues
- We can support non-trailing batch dimensions with a keyword argument and a
reshape
pre-processing step
Brian submitted a benchmarking suite PR to NNlib.jl
- This is ready, Kyle will review
Community questions about Diffractor.jl and shared parameters in Optimisers.jl
- Diffractor.jl status is unknown
- Optimisers.jl supports shared parameters in v0.2.10