avoid intermediate map allocations in multi-arg mapreduce #55301

mbauman · 2024-07-29T17:02:21Z

Now that the mapreduce infrastructure (mostly) supports broadcasted objects, we can use a lazy `Broadcasted` on array-likes instead of an intermediate `map`. This exercises the internals more thoroughly, identifies a few more places where we need to use `::AbstractArrayOrBroadcasted`, and requires an offset-axis bugfix.
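As a rough illustration of the idea (a sketch only, not the code in this PR), the snippet below compares reducing over an eagerly allocated `map` result with reducing over a lazy `Broadcasted` object. The variable names `x`, `y`, `eager`, and `lazy` are made up for the example; `broadcasted` and `instantiate` are the existing `Base.Broadcast` functions.

```julia
using Base.Broadcast: broadcasted, instantiate

x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]

# Eager: map allocates a temporary array holding x .* y before reducing.
eager = reduce(+, map(*, x, y))

# Lazy: the Broadcasted object computes each product on demand while reducing,
# so no intermediate array is materialized.
lazy = reduce(+, instantiate(broadcasted(*, x, y)))
```

Both values should agree; the difference is only in whether the temporary array is created.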

Fixes #38558

Local benchmarks:

julia> using BenchmarkTools

julia> function f(x, y)
           return @inbounds mapreduce(==,+,x, y)
       end
f (generic function with 1 method)

julia> function f2(x, y)
           total = 0
           @inbounds for i in 1:length(x)
               total += x[i] == y[i]
           end
           return total
       end
f2 (generic function with 1 method)

julia> x = randn(10240); y = similar(x);

julia> @btime f2(x,y)
  1.988 μs (0 allocations: 0 bytes)
0

julia> @btime f(x,y)
  2.023 μs (0 allocations: 0 bytes)
0

Alternative to: #41001 (cc @mcabbott)

Fixes #53417

More benchmarking from #53417:

julia> x = randn((512,512));

julia> y = randn((512, 512));

julia> g(x) = x^2
g (generic function with 1 method)

julia> @time mapreduce(g,+,x)
  0.031427 seconds (90.73 k allocations: 4.607 MiB, 24.14% gc time, 99.23% compilation time)
262233.95257231314

julia> @time mapreduce(g,+,x)
  0.000064 seconds (1 allocation: 16 bytes)
262233.95257231314

julia> f(x,y) = x * y
f (generic function with 1 method)

julia> mapreduce(f, +, x, y);

julia> @time mapreduce(f, +, x, y);
  0.000746 seconds (3 allocations: 80 bytes)

N5N3

The change to mapreduce is a breaking one, since map(f, c...) imposes no length constraint when the inputs are one-dimensional:

julia> mapreduce(+, +, 1:3, 1:2) == mapreduce(+, +, 1:2, 1:2)
true

The rest looks good, though; see also #41054.
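To make the length behaviour above concrete, here is a small sketch, assuming a Julia version in which `map` truncates one-dimensional inputs (per the discussion in this thread, 1.5 and later). It contrasts `map` with broadcasting, which is strict about axes. The names `short` and `broadcast_failed` are illustrative only.

```julia
# One-dimensional inputs of different lengths: map stops at the shorter one.
short = map(+, 1:3, 1:2)

# Broadcasting the same operation over mismatched vectors throws instead.
broadcast_failed = try
    broadcast(+, [1, 2, 3], [1, 2])
    false
catch err
    err isa DimensionMismatch
end
```

This is why swapping `map` for a `Broadcasted` changes which inputs are accepted, not just how much is allocated.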

base/broadcast.jl

mbauman · 2024-07-29T19:07:44Z

mapreduce(+, +, 1:3, 1:2) == mapreduce(+, +, 1:2, 1:2)

😭

Ooof, this is why we can't have nice things. Pretty wildly, this behavior doesn't look to be tested. I could've sworn we had an absurdly long megathread on this (mis)feature, but I can't find it at the moment. It is documented in the generic map...

> For multiple collection arguments, apply f elementwise, and stop when any of them is exhausted.

But for the specific method I'm changing here, there's a slightly different spec:

> When acting on multi-dimensional arrays of the same ndims, they must all have the same axes, and the answer will too.

I suppose the easy answer is to just fall back to the allocating map for the different-ndims case.

mcabbott · 2024-07-29T19:12:36Z

I think I added the documentation at some point to match the reality... which IIRC was added basically by accident in 1.5, and had no tests for ages. On 1.0 map gave an error instead:

julia> VERSION
v"1.0.5"

julia> map(+, 1:2, 1:3)
ERROR: DimensionMismatch("dimensions must match")

mbauman · 2024-07-29T19:23:29Z

Yeah, but the iterator version has done it "forever":

julia> VERSION
v"1.0.5"

julia> map(+, Iterators.drop(1:5, 1), Iterators.drop(1:5, 2))
3-element Array{Int64,1}:
 5
 7
 9

It's somewhat related to #46707, but I still can't find the discussion I was thinking of.

this now matches the documentation, but still is not *exactly* the previous behavior

mbauman · 2024-07-30T13:36:00Z

So this is still slightly more restrictive than what we had before. Let's see what Nanosoldier says:

@nanosoldier runtests()

KristofferC · 2024-07-30T13:43:55Z

@nanosoldier runtests()

N5N3 · 2024-07-29T23:42:04Z

base/reducedim.jl

-    reduce(op, map(f, A, B...); kw...)
+function mapreduce(f, op, A::AbstractArrayOrBroadcasted, B::AbstractArrayOrBroadcasted...; kwargs...)
+    Adims = ndims(A)
+    if any(b->Adims != ndims(b), B)


Suggested change

-    if any(b->Adims != ndims(b), B)
+    if Adims != 1 && any(b->Adims != ndims(b), B)

N5N3 · 2024-07-30T14:42:20Z

base/reducedim.jl

+        return reduce(op, map(f, A, B...); kwargs...)
+    end
+    Aax = axes(A)
+    all(b->Aax==axes(b), B) || throw(ArgumentError("all arguments must have the same axes"))


Perhaps DimensionMismatch is better than ArgumentError here?
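The control flow under review can be sketched in a self-contained form as follows. This is an illustration under assumptions, not the PR's method: `lazy_mapreduce` is a hypothetical name, it is restricted to `AbstractArray` arguments, and it throws `DimensionMismatch` as suggested in the comment above rather than the `ArgumentError` in the diff.

```julia
using Base.Broadcast: broadcasted, instantiate

# Hypothetical helper mirroring the structure of the diff; not part of Base.
function lazy_mapreduce(f, op, A::AbstractArray, Bs::AbstractArray...; kwargs...)
    # Different dimensionalities: fall back to the allocating map.
    if any(B -> ndims(B) != ndims(A), Bs)
        return reduce(op, map(f, A, Bs...); kwargs...)
    end
    # Same dimensionality: require identical axes before going lazy.
    if !all(B -> axes(B) == axes(A), Bs)
        throw(DimensionMismatch("all arguments must have the same axes"))
    end
    # Reduce over a lazy Broadcasted instead of an intermediate array.
    return reduce(op, instantiate(broadcasted(f, A, Bs...)); kwargs...)
end

total = lazy_mapreduce(*, +, [1, 2, 3], [4, 5, 6])
```

Note that this sketch rejects one-dimensional inputs of different lengths, so it is stricter than the truncating `map` behaviour discussed earlier in the thread.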

N5N3 · 2024-07-30T16:12:53Z

base/reducedim.jl

@@ -87,7 +94,7 @@ end

 # initialization when computing minima and maxima requires a little care
 for (f1, f2, initval, typeextreme) in ((:min, :max, :Inf, :typemax), (:max, :min, :(-Inf), :typemin))
-    @eval function reducedim_init(f, op::typeof($f1), A::AbstractArray, region)
+    @eval function reducedim_init(f, op::typeof($f1), A::AbstractArrayOrBroadcasted, region)


This change seems incomplete, as the view call on L106 has no Broadcasted support.
We might need:

reduced_view(A::AbstractArray, ri) = view(A, ri...)
reduced_view(bc::Broadcasted, ri) = Broadcasted(bc.style, bc.f, bc.args, ri)

nanosoldier · 2024-07-31T15:09:53Z

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

mbauman · 2024-07-31T22:13:59Z

I'll put this on hold for now — #55318 will unblock the hard part here.

mbauman added the fold sum, maximum, reduce, foldl, etc. label Jul 29, 2024

mbauman requested review from nsajko and N5N3 July 29, 2024 17:02

restore support for args of different dimensionalities

07f320a

mbauman marked this pull request as draft July 31, 2024 22:09
