Improve `cat` rules #660

mcabbott · 2022-08-15T05:12:45Z

This should allow FluxML/Zygote.jl#1277, by using `@allowscalar` in front of indexing.

Also inserts `Base.require_one_based_indexing` as a guard against upgrades to Base or OffsetArrays.
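
For readers unfamiliar with the guard, here is a minimal sketch of what `Base.require_one_based_indexing` does. The helper `leading_block` is hypothetical and is not the code in this PR; `Base.IdentityUnitRange(0:2)` is used only as a dependency-free stand-in for an offset array, on the assumption that its axes start at 0.

```julia
# Hypothetical helper (not from this PR): take the first `n` entries of `dY`,
# assuming its indices start at 1.
function leading_block(dY::AbstractVector, n::Integer)
    # Throws an ArgumentError if any axis of `dY` does not start at 1,
    # instead of silently slicing the wrong elements.
    Base.require_one_based_indexing(dY)
    return dY[1:n]
end

# An ordinary Vector passes the guard.
first_two = leading_block([10, 20, 30], 2)

# Stand-in for an offset array: a Base range whose own axes start at 0.
guard_fired = try
    leading_block(Base.IdentityUnitRange(0:2), 2)
    false
catch err
    err isa ArgumentError
end
```

The point of the guard is that code written with `1:n` style ranges fails loudly on arrays with other axes rather than returning the wrong slice.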

mzgubic

Thanks for doing this. My (limited) understanding of what is going on here is:

Scalar indexing operations are very slow on the GPU. The design decision of some GPU package is that rather than being silently slow, the code breaks. Scalar indexing must be turned on explicitly for each operation, which is what @allowscalar does.

We are ok with doing that here because it won't happen many times: it will happen N times, where N is the number of array arguments in the Xcat call. Usually we want to avoid doing this M times, where M is the number of elements in an array.

I'm curious: what is the usual approach to dealing with this? Rewriting the code in a more efficient way, so that it does whole-array operations rather than scalar indexing? And if that can't be done, as in this case, just eat the cost with @allowscalar?

Approving subject to not being totally off the mark in the above :)

mzgubic · 2022-08-15T09:04:54Z

src/rulesets/Base/array.jl

-                sum(view(dY, ind...))
-            end
+            dX = @allowscalar dY[ind...]
+            # Here InplaceableThunk breaks @inferred, removed for now


Is this still relevant? I guess it broke inference because of the if statement above?

Indeed, have now restored InplaceableThunk

mcabbott · 2022-08-15T13:27:44Z

> it will happen N times, where N is the number of array arguments in the Xcat call. Usually we want to avoid doing this M times, where M is the number of elements in an array.

Yes. In my rough understanding, any cat gradient will involve N trips to the GPU. Hopefully N is small. The error is to warn us about code which accidentally makes M trips.

The scalar case for cat is pretty weird, as you have to get lucky on the forward pass. IDK if this is something anyone really does, or just something Chris once wanted and got included into Zygote's tests. (Exactly one test, of course.)

julia> using JLArrays

julia> vcat(jl([1,2]),3)
3-element JLArray{Int64, 1}:
 1
 2
 3

julia> vcat(0,jl([1,2]),3)
┌ Warning: Performing scalar indexing on task Task (runnable) @0x000000010ccb0010.
│ Invocation of getindex resulted in scalar indexing of a GPU array.
│ This is typically caused by calling an iterating implementation of a method.
│ Such implementations *do not* execute on the GPU, but very slowly on the CPU,
│ and therefore are only permitted from the REPL for prototyping purposes.
│ If you did intend to index this array, annotate the caller with @allowscalar.
└ @ GPUArraysCore ~/.julia/packages/GPUArraysCore/ZBmfM/src/GPUArraysCore.jl:90
4-element Vector{Int64}:
 0
 1
 2
 3
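
The N-versus-M point can be sketched as follows. This is an illustration under stated assumptions, not the rule merged in this PR: `vcat_pullback_sketch` is a hypothetical helper, it assumes a one-based vector `dY`, and it uses JLArrays as a CPU stand-in for a real GPU array type. With `allowscalar(false)` set, an unannotated scalar read such as `dY[3]` should error, while the annotated one is permitted.

```julia
using GPUArraysCore: @allowscalar, allowscalar
using JLArrays: jl

# Treat accidental scalar indexing as an error rather than a one-time warning.
allowscalar(false)

# Hypothetical helper: split the gradient `dY` of `vcat(xs...)` into one
# piece per argument. Assumes `dY` is a one-based vector.
function vcat_pullback_sketch(dY::AbstractVector, xs...)
    pieces = Any[]
    stop = 0
    for x in xs
        start = stop + 1
        stop += length(x)          # length of a Number is 1
        if x isa Number
            # One scalar read per scalar argument: at most N reads in total,
            # never one read per element of `dY`.
            push!(pieces, @allowscalar dY[start])
        else
            # Range indexing is a whole-array operation, not scalar indexing.
            push!(pieces, dY[start:stop])
        end
    end
    return pieces
end

dY = jl([10.0, 20.0, 30.0])
dx, dc = vcat_pullback_sketch(dY, jl([1.0, 2.0]), 3.0)
```

Here `dx` stays a `JLArray` and only `dc`, the gradient for the scalar argument, is fetched with an explicitly allowed scalar read.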

mcabbott added 2 commits August 14, 2022 22:10

use allowscalar in cat rules

2f41ca3

use require_one_based_indexing

6436015

mzgubic approved these changes Aug 15, 2022

restore InplaceableThunk

7237dc0

mcabbott merged commit 63cc4e0 into JuliaDiff:main Aug 15, 2022

mcabbott deleted the cat2 branch August 15, 2022 16:02
