`^` inlining perf regression #17759

cortner · 2016-08-02T12:58:27Z

This was briefly discussed on users.

function test1(N)
    r = 0.234; s = 0.0
    for n = 1:N
        s += r^3 + r^5
    end
    return s
end

function test2(N, f1)
    r = 0.234; s = 0.0
    for n = 1:N
        s += f1(r)
    end
    return s
end

g1(r) = r^3 + r^5
test1(10)
test2(10, g1)
println("Test1: hard-coded functions")
@time test1(1_000_000)
@time test1(1_000_000)
@time test1(1_000_000)
println("Test2: pass functions")
@time test2(1_000_000, g1)
@time test2(1_000_000, g1)
@time test2(1_000_000, g1)

# OUTPUT: yesterday's nightly and master compiled from source are similar
# Test1: hard-coded functions
#   0.086683 seconds (4.00 M allocations: 61.043 MB, 50.75% gc time)
#   0.142487 seconds (4.00 M allocations: 61.035 MB, 76.91% gc time)
#   0.025388 seconds (4.00 M allocations: 61.035 MB, 4.28% gc time)
# Test2: pass functions
#   0.000912 seconds (5 allocations: 176 bytes)
#   0.000860 seconds (5 allocations: 176 bytes)
#   0.000846 seconds (5 allocations: 176 bytes)

cortner · 2016-08-02T12:59:58Z

this does not occur with g1(r) = r^2+r^4

mauro3 · 2016-08-02T13:26:27Z

This seems interesting too:

function test0(N)
    r = 0.234
    s = 0.0
    for n = 1:N
        s += r^3 # any other number but 3 is fast (well the few I tested)
    end
    s
end
#test0(BigInt(10)) # <- uncommenting this leads to fast execution
test0(10)
@time test0(1_000_000)
@time test0(1_000_000)
@time test0(1_000_000)

This is on 0.5-RC0 and today's master (works fine on 0.4). Also when including above file from the REPL, the second time around it executes fast. So, it seems only the first compiled method of test0 is slow, subsequent ones are fast. Last, LLVM code looks fine only the native code is off.

tkelman · 2016-08-02T13:42:12Z

likely an llvm issue?

KristofferC · 2016-08-02T13:43:47Z

Maybe but there are calls to jl_gc_pool_alloc and jl_invoke in the native code for the bad case.

vtjnash · 2016-08-03T12:10:00Z

I see why this is broken. Simple repro:

julia> f(x) = x ^ 3

julia> @code_typed f(1.0)
LambdaInfo for f(::Float64)
:(begin 
        return $(Expr(:invoke, LambdaInfo for *(::Float64, ::Float64, ::Float64), :(Base.*), :(x), :(x), :(x)))
    end::Float64)

julia> @code_llvm f(1.0)
# very long

# correct versions:
julia> @code_typed f(1.0)
LambdaInfo for f(::Float64)
:(begin 
        return (Base.box)(Base.Float64,(Base.mul_float)((Base.box)(Base.Float64,(Base.mul_float)(x,x)),x))
    end::Float64)

julia> @code_llvm f(1.0)
define double @julia_f_68734(double) {
top:
  %1 = fmul double %0, %0
  %2 = fmul double %1, %0
  ret double %2
}

andyferris · 2016-08-05T00:19:32Z

I've been seeing a few examples of $(Expr(:invoke, ...)) where I was expecting an inlined function. I'll see if I can dig up any other (minimal) examples, if you think it helps? (Or is this expected behaviour somehow?)

required for the inliner to work, since all of the logic for computing edges is supposed to be handle by the inference step fix #17759

required for the inliner to work, since all of the logic for computing edges is supposed to be handle by the inference step fix #17759 (cherry picked from commit 19b1276) ref #17949

required for the inliner to work, since all of the logic for computing edges is supposed to be handle by the inference step fix JuliaLang#17759

JeffBezanson added the performance Must go faster label Aug 2, 2016

tkelman mentioned this issue Aug 2, 2016

make "dot" operations (.+ etc) fusing broadcasts #17623

Merged

10 tasks

vtjnash changed the title ~~Strange performance regression with ^~~ ^ inlining perf regression Aug 3, 2016

vtjnash self-assigned this Aug 3, 2016

vtjnash added the regression Regression in behavior compared to a previous version label Aug 3, 2016

vtjnash added this to the 0.5.x milestone Aug 3, 2016

vtjnash added a commit that referenced this issue Aug 10, 2016

add model of special case ^ inliner to inference edges

cdbc9cf

required for the inliner to work, since all of the logic for computing edges is supposed to be handle by the inference step fix #17759

vtjnash closed this as completed in 19b1276 Aug 12, 2016

tkelman added the potential benchmark Could make a good benchmark in BaseBenchmarks label Aug 12, 2016

ranjanan mentioned this issue Aug 12, 2016

Add test for PR #17949 #17986

Closed

mfasi pushed a commit to mfasi/julia that referenced this issue Sep 5, 2016

add model of special case ^ inliner to inference edges

671b6f1

required for the inliner to work, since all of the logic for computing edges is supposed to be handle by the inference step fix JuliaLang#17759

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`^` inlining perf regression #17759

`^` inlining perf regression #17759

cortner commented Aug 2, 2016

cortner commented Aug 2, 2016

mauro3 commented Aug 2, 2016

tkelman commented Aug 2, 2016

KristofferC commented Aug 2, 2016

vtjnash commented Aug 3, 2016

andyferris commented Aug 5, 2016

^ inlining perf regression #17759

^ inlining perf regression #17759

Comments

cortner commented Aug 2, 2016

cortner commented Aug 2, 2016

mauro3 commented Aug 2, 2016

tkelman commented Aug 2, 2016

KristofferC commented Aug 2, 2016

vtjnash commented Aug 3, 2016

andyferris commented Aug 5, 2016

`^` inlining perf regression #17759

`^` inlining perf regression #17759