Upgrade to LLVM v8.0.1 #32712
Welp, at least the whitespace check passed.
|
Thanks Elliot!
Aren't we pulling that from |
I fixed the source for |
We aren't building WASM at the moment (a holdover from when LLVMBuilder was used for |
The wasm code generator is unlikely to be mature enough for our use until at least LLVM 9.0 (maybe 10.0). However, we do still link the LLVM support libraries, so we need those patches even absent the wasm target. |
AnalyzeGC now fails in interesting ways: it finds one unrooted thing and one null pointer dereference (locally it found two unrooted things). @Keno can you take a look at whether those are genuine or false positives? |
It'd be nice to get in the address space patches for WebAssembly from #32734. |
I'd also like to see the WASM target included. |
I'd love to hear more about what's broken and what improvements are coming. For playing with basic static compilation, what's in v8 may be sufficient. That said, waiting for v9 is fine (it'll be here pretty soon). Note that a source compilation of master Julia has the address-space patches and the WebAssembly target for v8. |
There were many bugs between LLVM 8 and LLVM 9. LLVM seems fairly stable, but needs at least https://reviews.llvm.org/D65463 and https://reviews.llvm.org/D65470 in addition when fed with the LLVM IR that julia generates. |
The win32 is readily reproducible:
which is outside a mapped region, hence the segfault. @vtjnash / @Keno does this trigger any memories? --edit:
which seems to be relocation type --- edit 2:
|
We also need to add the patches we figured out when fixing BinaryBuilder.
Otherwise people who build from source on Mingw (e.g. me) will encounter them again |
Force-pushed from cdce970 to ec36991.
Okay now both Win32 and Win64 fail with:
|
@vchuravy asked me to leave this here:
|
Yay! (We might want to consider going straight to LLVM 9 though, which was just released today) |
I was talking to Keno about that; he seemed happier to be on a |
I think something is going wrong with |
Force-pushed from 8447b9d to faaaaad.
Huh, in the rebase I just did, a bunch of patch commits disappeared, but I suppose that is because the patches have already been merged into |
In particular I am after decent debug-information for GPU kernels and better profiling.
Jup, x-ref #33018
I would hope not, did the BB apply |
Looks like it did to me. |
Oh, huh, the Windows buildbots made it past bootstrap this time. I don't know what that other error message I was seeing was; let's blame buildbot. |
@Keno can you take another look at analyzegc? |
Fixes some missing roots identified by the analysis pass, and clarifies other code to avoid false-positive errors.
Co-authored-by: Keno Fischer <[email protected]>
anyone know why it can't find the c++ headers? if not, I'll just put that commit on a new PR |
Probably better to do that, since it is unrelated to this PR. |
@nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan |
Yay! Onwards to LLVM 9. |
A tangible benefit* to LLVM 9 is that while the LLVM 8 documentation mentions expandload and compressstore, both of these intrinsics caused a crash on a Haswell cluster with the message
Yet these functions work on a build with LLVM 9.
julia> versioninfo()
Julia Version 1.4.0-DEV.513
Commit 8f7855a* (2019-11-21 01:58 UTC)
Platform Info:
OS: Linux (x86_64-redhat-linux)
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-9.0.0 (ORCJIT, haswell)
Environment:
JULIA_NUM_THREADS = 24
julia> @time using SIMDPirates
1.459145 seconds (1.64 M allocations: 88.784 MiB, 1.63% gc time)
julia> x = ntuple(Val(4)) do i Core.VecElement(randn()) end
(VecElement{Float64}(0.15844536654536248), VecElement{Float64}(1.3029855351761224), VecElement{Float64}(-0.5349914246588564), VecElement{Float64}(0.4026832110654877))
julia> y = collect(1.0:99.0);
julia> SIMDPirates.expandload!(Vec{4,Float64}, pointer(y), UInt8(5))
(VecElement{Float64}(1.0), VecElement{Float64}(0.0), VecElement{Float64}(2.0), VecElement{Float64}(0.0))
julia> SIMDPirates.compressstore!(pointer(y), x, UInt8(5))
julia> y'
1×99 LinearAlgebra.Adjoint{Float64,Array{Float64,1}}:
0.158445 -0.534991 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0 15.0 16.0 17.0 18.0 19.0 20.0 21.0 22.0 23.0 24.0 25.0 26.0 27.0 28.0 29.0 30.0 31.0 32.0 33.0 34.0 35.0 36.0 37.0 38.0 39.0 40.0 … 60.0 61.0 62.0 63.0 64.0 65.0 66.0 67.0 68.0 69.0 70.0 71.0 72.0 73.0 74.0 75.0 76.0 77.0 78.0 79.0 80.0 81.0 82.0 83.0 84.0 85.0 86.0 87.0 88.0 89.0 90.0 91.0 92.0 93.0 94.0 95.0 96.0 97.0 98.0 99.0
julia> @code_native debuginfo=:none SIMDPirates.expandload!(Vec{4,Float64}, pointer(y), UInt8(5))
.text
vmovd %edx, %xmm0
andl $1, %edx
vmovd %edx, %xmm2
movabsq $.rodata.cst16, %rax
vmovdqa (%rax), %xmm1
vpbroadcastd %xmm0, %xmm0
vpand %xmm1, %xmm0, %xmm0
vpcmpeqd %xmm1, %xmm0, %xmm0
vpsrld $31, %xmm0, %xmm1
vpextrb $0, %xmm2, %eax
testb %al, %al
je L73
vmovq (%rsi), %xmm0 # xmm0 = mem[0],zero
addq $8, %rsi
vpextrb $4, %xmm1, %eax
cmpb $1, %al
jne L101
jmp L87
L73:
vpxor %xmm0, %xmm0, %xmm0
vpextrb $4, %xmm1, %eax
cmpb $1, %al
jne L101
L87:
vmovhps (%rsi), %xmm0, %xmm2 # xmm2 = xmm0[0,1],mem[0,1]
vpblendd $15, %ymm2, %ymm0, %ymm0 # ymm0 = ymm2[0,1,2,3],ymm0[4,5,6,7]
addq $8, %rsi
L101:
vpextrb $8, %xmm1, %eax
cmpb $1, %al
je L122
vpextrb $12, %xmm1, %eax
cmpb $1, %al
je L152
L121:
retq
L122:
vextracti128 $1, %ymm0, %xmm2
vmovlps (%rsi), %xmm2, %xmm2 # xmm2 = mem[0,1],xmm2[2,3]
vinserti128 $1, %xmm2, %ymm0, %ymm0
addq $8, %rsi
vpextrb $12, %xmm1, %eax
cmpb $1, %al
jne L121
L152:
vextracti128 $1, %ymm0, %xmm1
vmovhps (%rsi), %xmm1, %xmm1 # xmm1 = xmm1[0,1],mem[0,1]
vinserti128 $1, %xmm1, %ymm0, %ymm0
retq
nopl (%rax)
julia> @code_native debuginfo=:none SIMDPirates.compressstore!(pointer(y), x, UInt8(5))
.text
vmovd %esi, %xmm1
andl $1, %esi
vmovd %esi, %xmm2
movabsq $.rodata.cst16, %rax
vmovdqa (%rax), %xmm3
vpbroadcastd %xmm1, %xmm1
vpand %xmm3, %xmm1, %xmm1
vpcmpeqd %xmm3, %xmm1, %xmm1
vpsrld $31, %xmm1, %xmm1
vpextrb $0, %xmm2, %eax
testb %al, %al
jne L93
vpextrb $4, %xmm1, %eax
cmpb $1, %al
je L111
L63:
vpextrb $8, %xmm1, %eax
vextractf128 $1, %ymm0, %xmm0
cmpb $1, %al
je L135
L79:
vpextrb $12, %xmm1, %eax
cmpb $1, %al
je L153
L89:
vzeroupper
retq
L93:
vmovlps %xmm0, (%rdi)
addq $8, %rdi
vpextrb $4, %xmm1, %eax
cmpb $1, %al
jne L63
L111:
vmovhps %xmm0, (%rdi)
addq $8, %rdi
vpextrb $8, %xmm1, %eax
vextractf128 $1, %ymm0, %xmm0
cmpb $1, %al
jne L79
L135:
vmovlps %xmm0, (%rdi)
addq $8, %rdi
vpextrb $12, %xmm1, %eax
cmpb $1, %al
jne L89
L153:
vmovhps %xmm0, (%rdi)
vzeroupper
retq
nopw %cs:(%rax,%rax)
nopl (%rax,%rax)
Haswell obviously isn't one of the targets that support efficient expand loads or compress stores, but it's nice to have things work.
*Impacting approximately 0% of users. |
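For anyone unfamiliar with the two intrinsics, the REPL results in the comment above (mask `UInt8(5)`, so lanes 0 and 2 are active) are consistent with the scalar reference below. This is only a sketch of the semantics in plain Julia: `expandload_ref` and `compressstore_ref!` are hypothetical names, not part of SIMDPirates or LLVM, and filling inactive lanes with zero is an assumption chosen to match the output shown.

```julia
# Hypothetical scalar reference for a masked expand-load.
# Bit `lane` of `mask` (0-based) marks an active lane. Active lanes read
# consecutive elements of `src`; inactive lanes are zero (assumed fill value).
function expandload_ref(src::AbstractVector{T}, mask::Integer, n::Integer) where {T}
    out = zeros(T, n)
    next = firstindex(src)
    for lane in 0:n-1
        if ((mask >> lane) & 1) == 1
            out[lane + 1] = src[next]
            next += 1
        end
    end
    return out
end

# Hypothetical scalar reference for a masked compress-store.
# Values in active lanes are written to consecutive slots of `dst`;
# the rest of `dst` is left untouched.
function compressstore_ref!(dst::AbstractVector, values, mask::Integer)
    next = firstindex(dst)
    for (i, v) in enumerate(values)
        if ((mask >> (i - 1)) & 1) == 1
            dst[next] = v
            next += 1
        end
    end
    return dst
end

y = collect(1.0:99.0)
loaded = expandload_ref(y, UInt8(5), 4)   # lanes 0 and 2 take y[1] and y[2]

x = (10.0, 20.0, 30.0, 40.0)
compressstore_ref!(y, x, UInt8(5))        # x[1] and x[3] land in y[1] and y[2]
```

With mask 5 the load fills lanes 0 and 2 from the first two elements, and the store overwrites only the first two elements of `y`, which is the same pattern as in the `expandload!` and `compressstore!` output above.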
This was only for LLVM 8. For LLVM 9, you want #33916. |
Upgrade to LLVM v8.0.1, with BinaryBuilder tarballs to match.
edit: closes #31921