Support WrapperGPUArray nd indexing by fusing vectorized fallback. #512
Looks like
julia> a = oneArray(rand(Int, 1, 1)) .|> identity
1×1 oneArray{Int64, 2, oneAPI.oneL0.DeviceBuffer}:
 2601239328758681190
while this is bad
julia> a = oneArray(rand(Int128, 1, 1)) .|> identity
InvalidBitWidth: Invalid bit width in input: 128
As for the test failure, I tried to avoid the |
src/host/indexing.jl
## Vectorized index overloading for `WrappedGPUArray`
# We overloading `getindex` by dispatch the copy part to our implement.
Can you add a comment why these lower-level overloads are required?
What changed causing this Int128 code path to be hit now? Unless the test explicitly creates an |
It comes from
julia> a = Base.MultiplicativeInverses.SignedMultiplicativeInverse(1)
Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}(1, 1, 0, 0x00)
julia> @code_typed divrem(1, a)
CodeInfo(
1 ─ %1 = Base.getfield(b, :multiplier)::Int64
│ %2 = Core.sext_int(Core.Int128, a)::Int128 # here
│ %3 = Base.sext_int(Int128, %1)::Int128 # here
│ %4 = Base.mul_int(%2, %3)::Int128 # here
│ %5 = Base.lshr_int(%4, 0x0000000000000040)::Int128 # here
│ %6 = Base.trunc_int(Int64, %5)::Int64
│ %7 = Base.getfield(b, :addmul)::Int8
│ %8 = Base.sext_int(Int64, %7)::Int64
│ %9 = Base.mul_int(a, %8)::Int64
│ %10 = Base.add_int(%6, %9)::Int64
│ %11 = Base.getfield(b, :divisor)::Int64
│ %12 = Base.flipsign_int(%11, %11)::Int64
│ %13 = (%12 === 1)::Bool
│ %14 = Base.getfield(b, :divisor)::Int64
│ %15 = Base.mul_int(a, %14)::Int64
│ %16 = Base.slt_int(%10, 0)::Bool
│ %17 = Base.getfield(b, :shift)::UInt8
│ %18 = Base.ashr_int(%10, %17)::Int64
│ %19 = Core.zext_int(Core.Int64, %16)::Int64
│ %20 = Core.and_int(%19, 1)::Int64
│ %21 = Base.add_int(%20, %18)::Int64
│ %22 = Core.ifelse(%13, %15, %21)::Int64
│ %23 = Base.getfield(b, :divisor)::Int64
│ %24 = Base.mul_int(%22, %23)::Int64
│ %25 = Base.sub_int(a, %24)::Int64
│ %26 = Core.tuple(%22, %25)::Tuple{Int64, Int64}
└── return %26
) => Tuple{Int64, Int64}
I guess this means |
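For readers following the typed IR above, here is a minimal sketch of the quotient computation written by hand, to show where the `Int128` values appear. The function name `div_by_inverse` is hypothetical and only for illustration; the field names (`multiplier`, `addmul`, `shift`, `divisor`) are the ones that show up in the `getfield` calls in the IR. It only covers `Int64` and is not meant as a replacement for anything in Base.

```julia
using Base.MultiplicativeInverses: SignedMultiplicativeInverse

# Hypothetical helper mirroring the quotient part of the typed IR, Int64 only.
function div_by_inverse(a::Int64, b::SignedMultiplicativeInverse{Int64})
    # Widen both operands to Int128, multiply, and keep the high 64 bits.
    # These are the lines marked "# here" in the IR (%2 through %5).
    hi = ((widen(a) * widen(b.multiplier)) >>> 64) % Int64
    # Correction term (%7 through %10); Int64 arithmetic wraps on overflow.
    x = hi + a * b.addmul
    # Divisors of magnitude 1 are special-cased (%12, %13, %15, %22).
    abs(b.divisor) == 1 && return a * b.divisor
    # Arithmetic shift plus a sign correction (%16 through %21).
    return (signbit(x) ? 1 : 0) + (x >> b.shift)
end

# Usage: the result should agree with ordinary integer division.
div_by_inverse(Int64(7), SignedMultiplicativeInverse(Int64(3))) == div(7, 3)
```

The only 128-bit work is the widening multiply on the first line of the body, so a backend that rejects 128-bit integers would presumably fail on that step alone.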
Oh, it's only triggering in the newly added test. Yeah, it's probably fine to mark those as |
A quick attempt to extend JuliaLang/julia#52626.
Overloading internal functions seems bad, but this looks like the simplest way.
I did a quick code search on JuliaHub, and the results look quite clean. (Hopefully there is no ambiguity risk.)
@maleadt Since this PR fuses the current fallback, IIUC your concern about the various lower-level routines should be resolved?