Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up implementation of checked integer arithmetic #14362

Merged
merged 1 commit into from
Dec 17, 2015

Conversation

eschnett
Copy link
Contributor

  • Move checked_* implementation in Base into a module Checked in a new file checked.jl
  • Provide missing checked_* functions for BigInt (BigInt always checks)
  • Provide missing type promotions for checked_* functions
  • Move all checked_* tests into their on file test/checked.jl
  • Clean up handling LLVM codegen bugs
  • Introduce new function jl_get_LLVM_VERSION which works already early during bootstrap
  • Rename intrinsics to have an "_int" suffix, like all other integer intrinsics
  • In coreimg.jl, emulate checked arithmetic by regular (unchecked) arithmetic

@@ -0,0 +1,224 @@
# This file is a part of Julia. License is MIT: http://julialang.org/license

# Checked integer arithmetic
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs to be added to choosetests.jl or it won't run

@eschnett eschnett force-pushed the eschnett/cleanup-checked-2 branch 4 times, most recently from eb0f021 to 56bf256 Compare December 11, 2015 05:21
@eschnett
Copy link
Contributor Author

Tests are now running, moved LLVM-specific function to intrinsics.cpp.

@@ -35,6 +35,18 @@ static void jl_init_intrinsic_functions_codegen(Module *m)
#undef ALIAS
}

extern "C" {
JL_DLLEXPORT uint32_t jl_get_LLVM_VERSION(void);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't really need the braces for a single function

4 space indent in the function body

@eschnett
Copy link
Contributor Author

Corrected indentation.

# typealias BrokenUnsignedIntMul UnsignedInt

# A tight `widen`
nextwide(::Type{Int8}) = Int16
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the advantage of nextwide over widen?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

widen jumps right away to Int64, which, if you are going to be allocating an array, would be very wasteful.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But why is it useful in this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's useful for two reasons:

  • 32-bit arithmetic may be cheaper than 64-bit arithmetic, in particular for multiplication and division.
  • If (my dream!) the compiler auto-vectorizes the code, then smaller types allow packing more operations into a vector. 64-bit integer arithmetic is very difficult to vectorize on x86, 32-bit integer arithmetic is much easier.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point, but if there are genuine gains to be had (some benchmarks would be useful), then we should modify widen and widemul more generally. Just having it here is somewhat odd.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See http://www.agner.org/optimize/instruction_tables.pdf for raw benchmarks for various instructions on Intel architecture. I'm looking e.g. at page 188 for the Haswell microarchitecture. There we find for idiv (signed integer division):

  • 32 bit: 22-29 cycles latency, 8-11 cycles inverse throughput (i.e. cycles/instruction)
  • 64 bit: 39-103 cycles, 24-81 cycles inverse throughput
    I've ran a "real" benchmark as well, and can roughly confirm these numbers. On my system, I find
  • 32 bit: about 8 cycles/instruction
  • 64 bit: about 26 cycles/instruction

Should we change the definition of widen, or should we introduce nextwide into Base? Should widen be system-dependent, or only distinguish between 32-bit and 64-bit word sizes?

According to Agner's table, there is no real benefit to go below 32 bits on a modern 64-bit Intel architecture.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good argument to change the behaviour of widen (I don't think we want 2 functions which do basically the same thing), but I think that should be a separate pull request.

(In this case, the widening is only used for multiplication, and the performance stats are the same for 32/64bit).

@eschnett
Copy link
Contributor Author

I'm now using widen, and also r %T for performance.

box(T, checked_umul_int(unbox(T,x), unbox(T,y)))
end
function checked_mul{T<:BrokenSignedIntMul}(x::T, y::T)
r = widen(x) * widen(y)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mean to be pedantic, but you can use widemul for this.

@jakebolewski
Copy link
Member

LGTM. @tkelman the "checked" tests are included in choosetests.jl. Is this waiting on #14414?

@eschnett
Copy link
Contributor Author

@jakebolewski No, this is independent.

@simonbyrne
Copy link
Contributor

LGTM as well.

- Move checked_* implementation in Base into a module Checked in a new file checked.jl
- Provide missing checked_* functions for BigInt (BigInt always checks)
- Provide missing type promotions for checked_* functions
- Move all checked_* tests into their on file test/checked.jl

- Clean up handling LLVM codegen bugs
- Introduce new function jl_get_LLVM_VERSION which works already early during bootstrap
- Rename intrinsics to have an "_int" suffix, like all other integer intrinsics
- In coreimg.jl, emulate checked arithmetic by regular (unchecked) arithmetic
@eschnett eschnett force-pushed the eschnett/cleanup-checked-2 branch from feb085d to 8d7387d Compare December 17, 2015 18:12
@eschnett
Copy link
Contributor Author

Rebased.

simonbyrne added a commit that referenced this pull request Dec 17, 2015
Clean up implementation of checked integer arithmetic
@simonbyrne simonbyrne merged commit ec609cc into JuliaLang:master Dec 17, 2015
@eschnett eschnett deleted the eschnett/cleanup-checked-2 branch December 17, 2015 20:13
div(x::Int128, y::Int128) = Int128(div(BigInt(x),BigInt(y)))
div(x::UInt128, y::UInt128) = UInt128(div(BigInt(x),BigInt(y)))
function div(x::Int128, y::Int128)
try
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using try/catch here is going to be really slow. Would be better to use a test and branch for the one invalid case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #14436.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants