Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rules for sortslices, unique #546

Merged
merged 6 commits into from
Nov 30, 2021
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
55 changes: 55 additions & 0 deletions src/rulesets/Base/sort.jl
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
#####
##### `sort`
#####

function rrule(::typeof(partialsort), xs::AbstractVector, k::Union{Integer,OrdinalRange}; kwargs...)
inds = partialsortperm(xs, k; kwargs...)
ys = xs[inds]
Expand Down Expand Up @@ -33,3 +37,54 @@ function rrule(::typeof(sort), xs::AbstractVector; kwargs...)
end
return ys, sort_pullback
end

#####
##### `sortslices`
#####

function rrule(::typeof(sortslices), x::AbstractArray{<:Number}; dims::Integer, kw...)
mcabbott marked this conversation as resolved.
Show resolved Hide resolved
p = sortperm(collect(eachslice(x; dims=dims)); kw...)
inds = ntuple(d -> d == dims ? p : (:), ndims(x))
function sortslices_pullback(dy)
# No actual need to zero this, and if you didn't, then you could widen eltype
# Also, you could use similar(dy) here not x, same size?
dx = _zerolike_writeat(x, unthunk(dy), (), inds...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the unthunk?
If so, should we push it down inside the _zerolike_writeat ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that ideally, _zerolike_writeat should be upgraded to return an InplaceThunk. And eventually it should be called grad_getindex or something, too.

I'm not sure whether it should handle un-thunking. I guess it wouldn't hurt to add a method. But since most rules at present call unthunk explicitly, maybe it's clearer to call it here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguably we shouldn't be unthunking if the destination that we are writing into can accept Any.
(but practically that case doesn't really matter since performance is already shot. And likely Zygote will hate that)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this is far from being a high-performance function!

If you don't take the shortcut above, then not all entries were unique, and thus _zerolike_writeat has to copy dy into dx at some nontrivial indices. So it has to slice up dy, I don't think it can write just one thunk anywhere.

return (NoTangent(), ProjectTo(x)(dx))
end
return x[inds...], sortslices_pullback
end

#####
##### `unique`
#####

function rrule(::typeof(unique), x::AbstractArray{<:Number}; dims=:)
axes_x = axes(x)
y = unique(x; dims=dims) # accepts only dims=: or dims::Integer
function unique_pullback(dy_raw)
dy = unthunk(dy_raw)
if length(x) == length(y)
# Short-circuit for the case of all unique, since `mask` is fairly expensive:
dx = reshape(dy, axes_x)
return (NoTangent(), ProjectTo(x)(dx))
end
if dims isa Colon
mcabbott marked this conversation as resolved.
Show resolved Hide resolved
xs, ys = vec(x), y
else
xs, ys = collect(eachslice(x; dims=dims)), collect(eachslice(y; dims=dims))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an issue open on BlueStyle to remcomment against this
JuliaDiff/BlueStyle#80

If we are going to do this then how do you feel about:

Suggested change
xs, ys = collect(eachslice(x; dims=dims)), collect(eachslice(y; dims=dims))
xs, ys = collect.(eachslice((x, y); dims=dims))

Copy link
Member Author

@mcabbott mcabbott Nov 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could avoid this for style. But I think the broadcast is confusing, and perhaps you do too, because I also think it's missing an easy-to-miss dot:

julia> x, y = rand(2,3), rand(2,3);

julia> collect.(eachslice((x, y); dims=1))
ERROR: MethodError: no method matching eachslice(::Tuple{Matrix{Float64}, Matrix{Float64}}; dims=1)

julia> collect.(eachslice.((x, y); dims=1))
(SubArray{Float64, 1, Matrix{Float64}, Tuple{Int64, Base.Slice{Base.OneTo{Int64}}}, true}[[0.5119304534786525, 0.6182654562598278, 0.16701230957752622], [0.7959010118362386, 0.9477852004109513, 0.3864

end
mask = isequal.(permutedims(ys), xs) # unique([0.0, -0.0, NaN, NaN])
mask .= (mask .== cumsum(mask, dims=1) .== true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the .== true for handling missing ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's that false == 0 satisfies the first ==.

This is a hacky way of writing findfirst(randn(3,3) .> 0; dims=1) as that doesn't exist. I feel like there ought to be a cleverer way like accumulate(xor, mask; dims) or something, but I didn't see it yet.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it wants a comment:

Suggested change
mask .= (mask .== cumsum(mask, dims=1) .== true)
mask .= (mask .== cumsum(mask, dims=1) .== true) # this implements findfirst(mask; dims=1)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we open an issue on JuliaLang/julia and link it here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess another way to write this is map(findfirst, eachcol(mask)), since we have many slices already.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am lazy to wait for another round of CI, so I think I call it good enough for now.

keep = map(I -> I[1], findall(mask))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
keep = map(I -> I[1], findall(mask))
keep = map(first, findall(mask))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this will work:

julia> map(first, findall(randn(3,3) .> 0))
ERROR: iteration is deliberately unsupported for CartesianIndex. Use `I` rather than `I...`, or use `Tuple(I)...`

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, I hate it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sort-of understand why you can't splat or broadcast it, but it's pretty weird that you can still index it.

if dims isa Colon
# The function `_zerolike_writeat` is defined near `maximum`, allows
# second derivatives. Should perhaps eventually be shared with `getindex`.
mcabbott marked this conversation as resolved.
Show resolved Hide resolved
dx = reshape(_zerolike_writeat(vec(x), vec(dy), (), keep), axes_x)
else
inds = ntuple(d -> d==dims ? keep : (:), length(axes_x))
dx = _zerolike_writeat(x, dy, (), inds...)
end
return (NoTangent(), ProjectTo(x)(dx))
end
return y, unique_pullback
end
29 changes: 29 additions & 0 deletions test/rulesets/Base/sort.jl
Original file line number Diff line number Diff line change
Expand Up @@ -12,4 +12,33 @@

test_rrule(partialsort, a, 4, fkwargs=(;rev=true))
end

@testset "sortslices" begin
test_rrule(sortslices, rand(3,4); fkwargs=(; dims=2))
test_rrule(sortslices, rand(5,4); fkwargs=(; dims=1, rev=true, by=last))
test_rrule(sortslices, rand(3,4,5); fkwargs=(; dims=3, by=sum), check_inferred=false)

@test_throws Exception sortslices(Diagonal(1:3), dims=1)
end

@testset "unique" begin
# Trivial case, all unique:
test_rrule(unique, rand(5))
test_rrule(unique, rand(3,4))
test_rrule(unique, rand(3,4); fkwargs=(; dims=2))

# Not all unique:
@test rrule(unique, [1,1,2,3])[1] == [1,2,3]
@test rrule(unique, [1,1,2,3])[2]([10,20,30]) == (NoTangent(), [10, 0, 20, 30])

@test rrule(unique, [1 2; 1 4])[1] == [1,2,4]
@test rrule(unique, [1 2; 1 4])[2]([10,20,30]) == (NoTangent(), [10 20; 0 30])

@test rrule(unique, [1 2 1 2; 1 2 1 4], dims=2)[1] == [1 2 2; 1 2 4]
@test rrule(unique, [1 2 1 2; 1 2 1 4], dims=2)[2]([10 20 30; 40 50 60])[2] == [10 20 0 30; 40 50 0 60]

@test rrule(unique, Diagonal([1,2,3]))[1] == [1,0,2,3]
@test rrule(unique, Diagonal([1,2,3]))[2]([10 20 30 40])[2] == [10.0 0.0 0.0; 0.0 30.0 0.0; 0.0 0.0 40.0]
@test rrule(unique, Diagonal([1,2,3]))[2]([10 20 30 40])[2] isa Diagonal
end
end