Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

^ inlining perf regression #17759

Closed
cortner opened this issue Aug 2, 2016 · 6 comments
Closed

^ inlining perf regression #17759

cortner opened this issue Aug 2, 2016 · 6 comments
Assignees
Labels
performance Must go faster potential benchmark Could make a good benchmark in BaseBenchmarks regression Regression in behavior compared to a previous version
Milestone

Comments

@cortner
Copy link

cortner commented Aug 2, 2016

This was briefly discussed on users.

function test1(N)
    r = 0.234; s = 0.0
    for n = 1:N
        s += r^3 + r^5
    end
    return s
end

function test2(N, f1)
    r = 0.234; s = 0.0
    for n = 1:N
        s += f1(r)
    end
    return s
end

g1(r) = r^3 + r^5
test1(10)
test2(10, g1)
println("Test1: hard-coded functions")
@time test1(1_000_000)
@time test1(1_000_000)
@time test1(1_000_000)
println("Test2: pass functions")
@time test2(1_000_000, g1)
@time test2(1_000_000, g1)
@time test2(1_000_000, g1)

# OUTPUT: yesterday's nightly and master compiled from source are similar
# Test1: hard-coded functions
#   0.086683 seconds (4.00 M allocations: 61.043 MB, 50.75% gc time)
#   0.142487 seconds (4.00 M allocations: 61.035 MB, 76.91% gc time)
#   0.025388 seconds (4.00 M allocations: 61.035 MB, 4.28% gc time)
# Test2: pass functions
#   0.000912 seconds (5 allocations: 176 bytes)
#   0.000860 seconds (5 allocations: 176 bytes)
#   0.000846 seconds (5 allocations: 176 bytes)
@cortner
Copy link
Author

cortner commented Aug 2, 2016

this does not occur with g1(r) = r^2+r^4

@mauro3
Copy link
Contributor

mauro3 commented Aug 2, 2016

This seems interesting too:

function test0(N)
    r = 0.234
    s = 0.0
    for n = 1:N
        s += r^3 # any other number but 3 is fast (well the few I tested)
    end
    s
end
#test0(BigInt(10)) # <- uncommenting this leads to fast execution
test0(10)
@time test0(1_000_000)
@time test0(1_000_000)
@time test0(1_000_000)

This is on 0.5-RC0 and today's master (works fine on 0.4). Also when including above file from the REPL, the second time around it executes fast. So, it seems only the first compiled method of test0 is slow, subsequent ones are fast. Last, LLVM code looks fine only the native code is off.

@tkelman
Copy link
Contributor

tkelman commented Aug 2, 2016

likely an llvm issue?

@KristofferC
Copy link
Member

Maybe but there are calls to jl_gc_pool_alloc and jl_invoke in the native code for the bad case.

@JeffBezanson JeffBezanson added the performance Must go faster label Aug 2, 2016
@vtjnash vtjnash changed the title Strange performance regression with ^ ^ inlining perf regression Aug 3, 2016
@vtjnash
Copy link
Member

vtjnash commented Aug 3, 2016

I see why this is broken. Simple repro:

julia> f(x) = x ^ 3

julia> @code_typed f(1.0)
LambdaInfo for f(::Float64)
:(begin 
        return $(Expr(:invoke, LambdaInfo for *(::Float64, ::Float64, ::Float64), :(Base.*), :(x), :(x), :(x)))
    end::Float64)

julia> @code_llvm f(1.0)
# very long

# correct versions:
julia> @code_typed f(1.0)
LambdaInfo for f(::Float64)
:(begin 
        return (Base.box)(Base.Float64,(Base.mul_float)((Base.box)(Base.Float64,(Base.mul_float)(x,x)),x))
    end::Float64)

julia> @code_llvm f(1.0)
define double @julia_f_68734(double) {
top:
  %1 = fmul double %0, %0
  %2 = fmul double %1, %0
  ret double %2
}

@vtjnash vtjnash self-assigned this Aug 3, 2016
@vtjnash vtjnash added the regression Regression in behavior compared to a previous version label Aug 3, 2016
@vtjnash vtjnash added this to the 0.5.x milestone Aug 3, 2016
@andyferris
Copy link
Member

I've been seeing a few examples of $(Expr(:invoke, ...)) where I was expecting an inlined function. I'll see if I can dig up any other (minimal) examples, if you think it helps? (Or is this expected behaviour somehow?)

vtjnash added a commit that referenced this issue Aug 10, 2016
required for the inliner to work, since all of the logic for computing edges is supposed to be handle by the inference step

fix #17759
@tkelman tkelman added the potential benchmark Could make a good benchmark in BaseBenchmarks label Aug 12, 2016
tkelman pushed a commit that referenced this issue Aug 20, 2016
required for the inliner to work, since all of the logic for computing edges is supposed to be handle by the inference step

fix #17759

(cherry picked from commit 19b1276)
ref #17949
mfasi pushed a commit to mfasi/julia that referenced this issue Sep 5, 2016
required for the inliner to work, since all of the logic for computing edges is supposed to be handle by the inference step

fix JuliaLang#17759
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster potential benchmark Could make a good benchmark in BaseBenchmarks regression Regression in behavior compared to a previous version
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants