-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clean up implementation of checked integer arithmetic #14362
Clean up implementation of checked integer arithmetic #14362
Conversation
eschnett
commented
Dec 11, 2015
- Move checked_* implementation in Base into a module Checked in a new file checked.jl
- Provide missing checked_* functions for BigInt (BigInt always checks)
- Provide missing type promotions for checked_* functions
- Move all checked_* tests into their on file test/checked.jl
- Clean up handling LLVM codegen bugs
- Introduce new function jl_get_LLVM_VERSION which works already early during bootstrap
- Rename intrinsics to have an "_int" suffix, like all other integer intrinsics
- In coreimg.jl, emulate checked arithmetic by regular (unchecked) arithmetic
@@ -0,0 +1,224 @@ | |||
# This file is a part of Julia. License is MIT: http://julialang.org/license | |||
|
|||
# Checked integer arithmetic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs to be added to choosetests.jl or it won't run
eb0f021
to
56bf256
Compare
Tests are now running, moved LLVM-specific function to |
@@ -35,6 +35,18 @@ static void jl_init_intrinsic_functions_codegen(Module *m) | |||
#undef ALIAS | |||
} | |||
|
|||
extern "C" { | |||
JL_DLLEXPORT uint32_t jl_get_LLVM_VERSION(void); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't really need the braces for a single function
4 space indent in the function body
Corrected indentation. |
# typealias BrokenUnsignedIntMul UnsignedInt | ||
|
||
# A tight `widen` | ||
nextwide(::Type{Int8}) = Int16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the advantage of nextwide
over widen
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
widen
jumps right away to Int64
, which, if you are going to be allocating an array, would be very wasteful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But why is it useful in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's useful for two reasons:
- 32-bit arithmetic may be cheaper than 64-bit arithmetic, in particular for multiplication and division.
- If (my dream!) the compiler auto-vectorizes the code, then smaller types allow packing more operations into a vector. 64-bit integer arithmetic is very difficult to vectorize on x86, 32-bit integer arithmetic is much easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point, but if there are genuine gains to be had (some benchmarks would be useful), then we should modify widen
and widemul
more generally. Just having it here is somewhat odd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See http://www.agner.org/optimize/instruction_tables.pdf for raw benchmarks for various instructions on Intel architecture. I'm looking e.g. at page 188 for the Haswell microarchitecture. There we find for idiv
(signed integer division):
- 32 bit: 22-29 cycles latency, 8-11 cycles inverse throughput (i.e. cycles/instruction)
- 64 bit: 39-103 cycles, 24-81 cycles inverse throughput
I've ran a "real" benchmark as well, and can roughly confirm these numbers. On my system, I find - 32 bit: about 8 cycles/instruction
- 64 bit: about 26 cycles/instruction
Should we change the definition of widen
, or should we introduce nextwide
into Base? Should widen
be system-dependent, or only distinguish between 32-bit and 64-bit word sizes?
According to Agner's table, there is no real benefit to go below 32 bits on a modern 64-bit Intel architecture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a good argument to change the behaviour of widen
(I don't think we want 2 functions which do basically the same thing), but I think that should be a separate pull request.
(In this case, the widening is only used for multiplication, and the performance stats are the same for 32/64bit).
I'm now using |
box(T, checked_umul_int(unbox(T,x), unbox(T,y))) | ||
end | ||
function checked_mul{T<:BrokenSignedIntMul}(x::T, y::T) | ||
r = widen(x) * widen(y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mean to be pedantic, but you can use widemul
for this.
@jakebolewski No, this is independent. |
LGTM as well. |
- Move checked_* implementation in Base into a module Checked in a new file checked.jl - Provide missing checked_* functions for BigInt (BigInt always checks) - Provide missing type promotions for checked_* functions - Move all checked_* tests into their on file test/checked.jl - Clean up handling LLVM codegen bugs - Introduce new function jl_get_LLVM_VERSION which works already early during bootstrap - Rename intrinsics to have an "_int" suffix, like all other integer intrinsics - In coreimg.jl, emulate checked arithmetic by regular (unchecked) arithmetic
feb085d
to
8d7387d
Compare
Rebased. |
Clean up implementation of checked integer arithmetic
div(x::Int128, y::Int128) = Int128(div(BigInt(x),BigInt(y))) | ||
div(x::UInt128, y::UInt128) = UInt128(div(BigInt(x),BigInt(y))) | ||
function div(x::Int128, y::Int128) | ||
try |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using try/catch here is going to be really slow. Would be better to use a test and branch for the one invalid case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #14436.