Upgrade to LLVM v8.0.1 #32712

staticfloat · 2019-07-27T20:15:33Z

Upgrade to LLVM v8.0.1, with BinaryBuilder tarballs to match.

edit: closes #31921

staticfloat · 2019-07-27T20:30:34Z

Welp, at least the whitespace check passed.

We need to update the analyzegc pass to look at the right header location, as well as change some of the source.
The LLVM build system seems to only install libLLVM.so, while libjulia has linked against libLLVM-8.so. Interesting.
Win32 is segfaulting during bootstrap.

vchuravy · 2019-07-29T13:58:33Z

Thanks Elliot!

> The LLVM build system seems to only install libLLVM.so, while libjulia has linked against libLLVM-8.so. Interesting.

Aren't we pulling that from llvm-config?

vchuravy · 2019-07-29T15:54:46Z

I fixed the source for GCChecker just now.

vchuravy · 2019-07-29T16:02:48Z

@Keno added three patches for wasm that we will need to pull as well.

julia/deps/llvm.mk, lines 442 to 444 in 442d159:

    $(eval $(call LLVM_PATCH,llvm-6.0-D63688-wasm-isLocal))
    $(eval $(call LLVM_PATCH,llvm-6.0-D64032-cmake-cross))
    $(eval $(call LLVM_PATCH,llvm-6.0-D64225-cmake-cross2))

staticfloat · 2019-07-29T17:30:22Z

We aren't building WASM at the moment (a holdover from when LLVMBuilder was used for LLVM.jl and not for base Julia, and having WASM enabled caused compatibility issues), but we could. Should I rebuild tarballs with WASM enabled?

Keno · 2019-07-29T17:42:39Z

The wasm code generator is unlikely to be mature enough for our use until at least LLVM 9.0 (maybe 10.0). However, we do still link the LLVM support libraries, so we need those patches even absent the wasm target.

vchuravy · 2019-07-29T18:06:46Z

AnalyzeGC now fails in interesting ways: it finds one unrooted thing and one null pointer dereference (locally it found two unrooted things). @Keno, can you take a look at whether those are genuine or false positives?

tshort · 2019-08-02T17:30:02Z

It'd be nice to get in the address space patches for WebAssembly from #32734.

tshort · 2019-08-02T17:33:09Z

I'd also like to see the WASM target included.

staticfloat · 2019-08-02T17:35:55Z

When talking about the WASM backend with @Keno and @vchuravy, I was under the impression that it's pretty broken (for our purposes) until at least LLVM v9+. Why do you want the WASM backend for LLVM v8?

tshort · 2019-08-02T19:39:38Z

I'd love to hear more about what's broken and what improvements are coming. For playing with basic static compilation, what's in v8 may be sufficient. That said, waiting for v9 is fine (it'll be here pretty soon).

Note that a source compilation of master Julia has the address-space patches and the WebAssembly target for v8.

Keno · 2019-08-02T19:44:33Z

Many bugs were fixed between LLVM 8 and LLVM 9. LLVM 9 seems fairly stable, but it needs at least https://reviews.llvm.org/D65463 and https://reviews.llvm.org/D65470 in addition when fed the LLVM IR that Julia generates.

vchuravy · 2019-08-14T11:23:24Z

The Win32 segfault is readily reproducible:

Thread 1 received signal SIGSEGV, Segmentation fault.
0x09a50013 in japi1_top-level scope_0 ()
(gdb) bt
#0  0x09a50013 in japi1_top-level scope_0 ()
#1  0x6ca69e79 in jl_fptr_args (f=0x0, args=0x0, nargs=0, m=0x7e26590) at /home/User/julia/src/gf.c:1809
#2  0x6ca6abba in _jl_invoke (F=0x0, args=0x0, nargs=0, mfunc=0x7e36650, world=1) at /home/User/julia/src/gf.c:2049
#3  0x6ca6ac45 in jl_invoke (F=0x0, args=0x0, nargs=0, mfunc=0x7e36650) at /home/User/julia/src/gf.c:2056
#4  0x6caa6973 in jl_toplevel_eval_flex (m=0x7e40010, e=0x7e31d30, fast=1, expanded=1) at /home/User/julia/src/toplevel.c:808
#5  0x6ca7474d in jl_parse_eval_all (fname=0x6cd79bc3 <szclass_table+1411> "boot.jl", content=0x0, contentlen=0, inmodule=0x7e40010) at /home/User/julia/src/ast.c:873
#6  0x6caa6dcd in jl_load (module=0x7e40010, fname=0x6cd79bc3 <szclass_table+1411> "boot.jl") at /home/User/julia/src/toplevel.c:878
#7  0x6ca871fb in _julia_init (rel=JL_IMAGE_JULIA_HOME) at /home/User/julia/src/init.c:785
#8  0x6ca8835e in julia_init__threading (rel=JL_IMAGE_JULIA_HOME) at /home/User/julia/src/task.c:229
#9  0x00401e97 in wmain (argc=1, argv=0x6367028, envp=0x6376f70) at /home/User/julia/ui/repl.c:211
#10 0x0040139d in __tmainCRTStartup () at /usr/src/debug/mingw64-i686-runtime-6.0.0-1/crt/crtexe.c:334
#11 0x76290419 in KERNEL32!BaseThreadInitThunk () from /cygdrive/c/Windows/System32/KERNEL32.DLL
#12 0x770a662d in ntdll!RtlGetAppContainerNamedObjectPath () from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#13 0x770a65fd in ntdll!RtlGetAppContainerNamedObjectPath () from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#14 0x00000000 in ?? ()

(gdb) disassemble
Dump of assembler code for function japi1_top-level scope_0:
   0x09a50000 <+0>:     push   %ebp
   0x09a50001 <+1>:     mov    %esp,%ebp
   0x09a50003 <+3>:     push   %ebx
   0x09a50004 <+4>:     push   %edi
   0x09a50005 <+5>:     push   %esi
   0x09a50006 <+6>:     and    $0xfffffff0,%esp
   0x09a50009 <+9>:     sub    $0x70,%esp
   0x09a5000c <+12>:    mov    0xc(%ebp),%eax
   0x09a5000f <+15>:    mov    0x38(%esp),%ecx
=> 0x09a50013 <+19>:    mov    0x68cc5004(%ecx),%edx
   0x09a50019 <+25>:    mov    (%edx),%edx
   0x09a5001b <+27>:    xor    %ebp,%edx
   0x09a5001d <+29>:    mov    %edx,0x68(%esp)
   0x09a50021 <+33>:    xorps  %xmm0,%xmm0
   0x09a50024 <+36>:    movaps %xmm0,0x40(%esp)
   0x09a50029 <+41>:    movl   $0x0,0x50(%esp)
   0x09a50031 <+49>:    mov    %eax,0x3c(%esp)
   0x09a50035 <+53>:    mov    $0x6cabb170,%eax

(gdb) info registers
eax            0x0      0
ecx            0xcbf820 13367328
edx            0x0      0
ebx            0x8ebde6f        149675631
esp            0xcbf7c0 0xcbf7c0
ebp            0xcbf848 0xcbf848
esi            0x76c9d8e        124558734
edi            0x358    856
eip            0x9a50013        0x9a50013 <japi1_top-level scope_0+19>
eflags         0x10206  [ PF IF RF ]
cs             0x23     35
ss             0x2b     43
ds             0x2b     43
es             0x2b     43
fs             0x53     83
gs             0x2b     43

        0x68cc5000 - 0x68cc53b4 is .bss in /home/User/win32/usr/bin/libssp-0.dll
(gdb) p/x ($ecx +  0x68cc5004)
$22 = 0x69984824

which is outside a mapped region, hence the segfault.
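The address arithmetic from the gdb session can be rechecked with a few lines of Python. This is only an illustrative sketch, not part of the PR: the register value, displacement, and .bss range are copied from the output above, and the helper names `effective_address` and `in_range` are made up for this example.

```python
# Values copied from the gdb session above.
ECX = 0x00CBF820            # %ecx at the faulting instruction
DISPLACEMENT = 0x68CC5004   # displacement in `mov 0x68cc5004(%ecx),%edx`
BSS_START, BSS_END = 0x68CC5000, 0x68CC53B4  # .bss of libssp-0.dll


def effective_address(base: int, disp: int) -> int:
    """Effective address of `disp(%base)` on a 32-bit target (wraps at 2**32)."""
    return (base + disp) & 0xFFFFFFFF


def in_range(addr: int, start: int, end: int) -> bool:
    """True if addr lies in the half-open range [start, end)."""
    return start <= addr < end


fault_addr = effective_address(ECX, DISPLACEMENT)
print(hex(fault_addr))                             # 0x69984824, matching gdb's $22
print(in_range(DISPLACEMENT, BSS_START, BSS_END))  # True: the displacement alone is in .bss
print(in_range(fault_addr, BSS_START, BSS_END))    # False: displacement + %ecx is not
```

The displacement by itself falls inside the .bss of libssp-0.dll, while the sum with %ecx does not. One possible reading, offered here as an assumption rather than a diagnosis, is that an absolute address was written where the generated code expected an offset relative to %ecx.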

@vtjnash / @Keno, does this trigger any memories?

--edit:
Ahah! With LLVM assertions

    JULIA /home/User/win32/usr/lib/julia/corecompiler.ji
Relocation type not implemented yet!
UNREACHABLE executed at /workspace/srcdir/llvm-8.0.1.src/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:359!
make[1]: *** [/home/User/julia/sysimage.mk:60: /home/User/win32/usr/lib/julia/corecompiler.ji] Error 3

which seems to be relocation type R_386_GOT32

--- edit 2:
with JULIA_LLVM_ARGS=-debug-only="dyld"

Parse symbols:
emitSection SectionID: 0 Name: .text obj addr: 0ab5fa10 new addr: 0b050000 DataSize: 637 StubBufSize: 0 Allocate: 637
        Type: 4 Name: japi1_top-level scope_0 SID: 0 Offset: 00000000 flags: 66
Parse relocations:
        SectionID: 0
                RelType: 3 Addend: 0 TargetName: __stack_chk_guard
                SectionID: 0 Offset: 21
                RelType: 3 Addend: 0 TargetName: jl_world_counter
                SectionID: 0 Offset: 91
                RelType: 3 Addend: 0 TargetName: jl_global#1
                SectionID: 0 Offset: 102
                RelType: 2 Addend: 0 TargetName: jl_copy_ast
                SectionID: 0 Offset: 129
                RelType: 3 Addend: 0 TargetName: jl_sym#meta3
                SectionID: 0 Offset: 152
                RelType: 3 Addend: 0 TargetName: jl_sym#nospecialize4
                SectionID: 0 Offset: 160
                RelType: 3 Addend: 0 TargetName: jl_sym#x5
                SectionID: 0 Offset: 168
                RelType: 2 Addend: 0 TargetName: jl_f__expr
                SectionID: 0 Offset: 221
                RelType: 3 Addend: 0 TargetName: jl_global#6
                SectionID: 0 Offset: 244
                RelType: 2 Addend: 0 TargetName: jl_copy_ast
                SectionID: 0 Offset: 263
                RelType: 3 Addend: 0 TargetName: jl_sym#block7
                SectionID: 0 Offset: 286
                RelType: 3 Addend: 0 TargetName: jl_global#8
                SectionID: 0 Offset: 294
                RelType: 3 Addend: 0 TargetName: jl_global#9
                SectionID: 0 Offset: 302
                RelType: 2 Addend: 0 TargetName: jl_f__expr
                SectionID: 0 Offset: 363
                RelType: 3 Addend: 0 TargetName: jl_sym#=10
                SectionID: 0 Offset: 386
                RelType: 2 Addend: 0 TargetName: jl_f__expr
                SectionID: 0 Offset: 435
                RelType: 3 Addend: 0 TargetName: jl_global#11
                SectionID: 0 Offset: 464
                RelType: 2 Addend: 0 TargetName: jl_f__expr
                SectionID: 0 Offset: 509
                RelType: 3 Addend: 0 TargetName: jl_global#13
                SectionID: 0 Offset: 532
                RelType: 3 Addend: 0 TargetName: jlplt_jl_toplevel_eval_in_15_got
                SectionID: 0 Offset: 540
                RelType: 3 Addend: 0 TargetName: jl_global#16
                SectionID: 0 Offset: 565
                RelType: 3 Addend: 0 TargetName: __stack_chk_guard
                SectionID: 0 Offset: 596
                RelType: 2 Addend: 0 TargetName: __stack_chk_fail
                SectionID: 0 Offset: 621
emitSection SectionID: 1 Name: .eh_frame obj addr: 0ab5fc90 new addr: 0b250000 DataSize: 60 StubBufSize: 0 Allocate: 60
        SectionID: 1
                RelType: 2 Addend: 0 TargetName:
                This is section symbol
                SectionID: 1 Offset: 32
Reassigning address for section 1 (.eh_frame): 0x000000000b250000 -> 0x000000000b150000
Reassigning address for section 0 (.text): 0x000000000b050000 -> 0x000000000af50000
----- Contents of section .text before relocations -----
0x000000000af50000: 55 89 e5 53 57 56 83 e4 f0 83 ec 70 8b 45 0c 8b
0x000000000af50010: 4c 24 38 8b 91 00 00 00 00 8b 12 31 ea 89 54 24
0x000000000af50020: 68 0f 57 c0 0f 29 44 24 40 c7 44 24 50 00 00 00
0x000000000af50030: 00 89 44 24 3c b8 70 b1 ab 6c ff d0 89 c1 c7 44
0x000000000af50040: 24 40 06 00 00 00 8b 10 89 54 24 44 8d 54 24 40
0x000000000af50050: 89 10 8b 50 04 8b 74 24 38 8b be 00 00 00 00 8b
0x000000000af50060: 1f 89 58 04 8b 9e 00 00 00 00 8b 1b 89 e6 89 1e
0x000000000af50070: 89 44 24 34 89 4c 24 30 89 54 24 2c 89 7c 24 28
0x000000000af50080: e8 fc ff ff ff 8b 4c 24 28 8b 11 8b 74 24 34 89
0x000000000af50090: 56 04 8b 54 24 38 8b ba 00 00 00 00 8b 3f 8b 9a
0x000000000af500a0: 00 00 00 00 8b 1b 8b 8a 00 00 00 00 8b 09 89 44
0x000000000af500b0: 24 50 89 7c 24 54 89 5c 24 58 89 4c 24 5c 89 e1
0x000000000af500c0: 8d 7c 24 54 89 79 04 c7 41 08 03 00 00 00 c7 01
0x000000000af500d0: 00 00 00 00 89 44 24 24 89 7c 24 20 e8 fc ff ff
0x000000000af500e0: ff 8b 4c 24 28 8b 11 8b 74 24 34 89 56 04 8b 54
0x000000000af500f0: 24 38 8b ba 00 00 00 00 8b 3f 89 44 24 4c 89 e3
0x000000000af50100: 89 3b 89 44 24 1c e8 fc ff ff ff 8b 4c 24 28 8b
0x000000000af50110: 11 8b 74 24 34 89 56 04 8b 54 24 38 8b ba 00 00
0x000000000af50120: 00 00 8b 1f 8b 8a 00 00 00 00 8b 09 8b 92 00 00
0x000000000af50130: 00 00 8b 12 89 44 24 48 89 5c 24 54 89 4c 24 58
0x000000000af50140: 8b 4c 24 1c 89 4c 24 5c 89 54 24 60 89 44 24 64
0x000000000af50150: 89 e0 8b 54 24 20 89 50 04 c7 40 08 05 00 00 00
0x000000000af50160: c7 00 00 00 00 00 89 7c 24 18 e8 fc ff ff ff 8b
0x000000000af50170: 4c 24 28 8b 11 8b 74 24 34 89 56 04 8b 54 24 38
0x000000000af50180: 8b ba 00 00 00 00 8b 3f 89 44 24 48 89 7c 24 54
0x000000000af50190: 8b 7c 24 24 89 7c 24 58 89 44 24 5c 89 e0 8b 5c
0x000000000af501a0: 24 20 89 58 04 c7 40 08 03 00 00 00 c7 00 00 00
0x000000000af501b0: 00 00 e8 fc ff ff ff 8b 4c 24 28 8b 11 8b 74 24
0x000000000af501c0: 34 89 56 04 8b 54 24 18 8b 3a 8b 5c 24 38 8b 8b
0x000000000af501d0: 00 00 00 00 8b 09 89 44 24 48 89 7c 24 54 89 4c
0x000000000af501e0: 24 58 89 44 24 5c 89 e0 8b 4c 24 20 89 48 04 c7
0x000000000af501f0: 40 08 03 00 00 00 c7 00 00 00 00 00 e8 fc ff ff
0x000000000af50200: ff 8b 4c 24 28 8b 11 8b 74 24 34 89 56 04 8b 54
0x000000000af50210: 24 38 8b ba 00 00 00 00 8b 3f 8b 9a 00 00 00 00
0x000000000af50220: 8b 1b 89 44 24 48 89 e1 89 41 04 89 39 ff d3 8b
0x000000000af50230: 4c 24 38 8b 91 00 00 00 00 8b 12 8b 74 24 30 8b
0x000000000af50240: 7c 24 2c 89 7e 04 8b 7c 24 44 89 3e 8b 7c 24 68
0x000000000af50250: 31 ef 8b 99 00 00 00 00 8b 1b 29 fb 89 44 24 14
0x000000000af50260: 89 54 24 10 89 5c 24 0c 75 02 eb 05 e8 fc ff ff
0x000000000af50270: ff 8b 44 24 10 8d 65 f4 5e 5f 5b 5d c3
----- Contents of section .eh_frame before relocations -----
0x000000000b150000: 14 00 00 00 00 00 00 00 01 7a 52 00 01 7c 08 01
0x000000000b150010: 1b 0c 04 04 88 01 00 00 1c 00 00 00 1c 00 00 00
0x000000000b150020: 00 00 00 00 7d 02 00 00 00 41 0e 08 85 02 42 0d
0x000000000b150030: 05 49 86 05 87 04 83 03 00 00 00 00
Resolving relocations Name: jl_global#8 0x8d08e60
Relocation type not implemented yet!
UNREACHABLE executed at /home/User/julia/deps/srccache/llvm-8.0.1/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:359!

RelType: 3 is R_386_GOT32
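To see at a glance which relocation types the object contains, the `RelType` lines of the dyld debug log can be tallied. The following is a rough Python sketch, not something used in this PR: the sample lines are copied from the log above, the numeric-to-name table covers only the first few i386 ELF relocation types, and `parse_relocations` is a hypothetical helper.

```python
import re
from collections import Counter

# First few i386 ELF relocation type numbers and their names.
R_386_NAMES = {
    0: "R_386_NONE",
    1: "R_386_32",
    2: "R_386_PC32",
    3: "R_386_GOT32",
    4: "R_386_PLT32",
}

# A few lines copied from the log above.
LOG = """\
RelType: 3 Addend: 0 TargetName: __stack_chk_guard
RelType: 3 Addend: 0 TargetName: jl_world_counter
RelType: 2 Addend: 0 TargetName: jl_copy_ast
RelType: 2 Addend: 0 TargetName: __stack_chk_fail
"""

RELOC_RE = re.compile(r"RelType: (\d+) Addend: (-?\d+) TargetName: ?(.*)")


def parse_relocations(log: str):
    """Return (type_name, target) pairs for every relocation line in log."""
    pairs = []
    for line in log.splitlines():
        m = RELOC_RE.search(line)
        if m:
            rel_type = int(m.group(1))
            name = R_386_NAMES.get(rel_type, f"unknown({rel_type})")
            pairs.append((name, m.group(3).strip()))
    return pairs


relocs = parse_relocations(LOG)
counts = Counter(name for name, _ in relocs)
for name, n in sorted(counts.items()):
    print(name, n)
```

Run over the full log, a tally like this would show that the data references (`jl_global#N`, `jl_sym#...`, `__stack_chk_guard`) use type 3, while the calls use type 2.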

vchuravy · 2019-08-14T19:59:02Z

We also need to add the patches we figured out when fixing BinaryBuilder.

Otherwise, people who build from source on MinGW (e.g. me) will run into the same problems again.

vchuravy · 2019-08-23T19:14:46Z

Okay now both Win32 and Win64 fail with:

ERROR: could not load library "C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\lib\julia\sys.dll"
%1 is not a valid Win32 application.

gflags -i julia.exe +sls
cdb julia.exe
> g
0b6c:05cc @ 12495953 - LdrpProcessWork - ERROR: Unable to load DLL: "C:\cygwin\home\vchuravy\julia\usr\lib\julia\sys.dll", Parent Module: "(null)", Status: 0xc000007b

Status 0xc000007b is STATUS_INVALID_IMAGE_FORMAT
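For reference, an NTSTATUS value such as the one above can be split into its bit fields by hand. The snippet below is a minimal Python sketch assuming the standard NTSTATUS layout (2-bit severity, customer bit, reserved bit, 12-bit facility, 16-bit code); `decode_ntstatus` is a hypothetical helper, not a Windows or Julia API.

```python
def decode_ntstatus(status: int) -> dict:
    """Split a 32-bit NTSTATUS value into its bit fields."""
    severity_names = ["SUCCESS", "INFORMATIONAL", "WARNING", "ERROR"]
    return {
        "severity": severity_names[(status >> 30) & 0x3],  # bits 31-30
        "customer": bool((status >> 29) & 0x1),            # bit 29
        "facility": (status >> 16) & 0xFFF,                # bits 27-16
        "code": status & 0xFFFF,                           # bits 15-0
    }


fields = decode_ntstatus(0xC000007B)
print(fields)
```

For 0xc000007b this gives error severity, facility 0, and code 0x7b, so it is a plain system error and not a facility-specific one.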

Keno · 2019-09-11T21:55:46Z

@vchuravy asked me to leave this here:

diff --git a/src/jltypes.c b/src/jltypes.c
index 66aeeae..4728e63 100644
--- a/src/jltypes.c
+++ b/src/jltypes.c
@@ -1043,7 +1043,7 @@ static void check_datatype_parameters(jl_typename_t *tn, jl_value_t **params, si
 arraylist_t partial_inst;
 int inside_typedef = 0;

-static jl_value_t *extract_wrapper(jl_value_t *t)
+static jl_value_t *extract_wrapper(jl_value_t *t JL_PROPAGATES_ROOT)
 {
     t = jl_unwrap_unionall(t);
     if (jl_is_datatype(t))

vchuravy · 2019-09-20T00:11:44Z

Yay! (We might want to consider going straight to LLVM 9 though, which was just released today)

staticfloat · 2019-09-20T02:00:56Z

I was talking to Keno about that; he seemed happier to be on an X.Y.1 release than an X.Y.0 release: less chance of bugs, and there are not that many things in LLVM 9 that we're interested in. Talking to Tim though, I hear there are GPU goodies that we might be interested in.

staticfloat · 2019-09-20T02:02:45Z

I think something is going wrong with sys.dll on Windows?

staticfloat · 2019-09-20T02:06:20Z

Huh, in the rebase I just did, a bunch of patch commits disappeared, but I suppose that is because the patches have already been merged into master?

vchuravy · 2019-09-20T03:13:23Z

> Talking to Tim though, I hear there are GPU goodies that we might be interested in.

In particular I am after decent debug-information for GPU kernels and better profiling.

> Huh, in the rebase I just did, a bunch of patch commits disappeared, but I suppose that is because the patches have already been merged into master?

Yup, x-ref #33018

> I think something is going wrong with sys.dll on windows?

I would hope not, did the BB apply llvm7-revert-D44485?

staticfloat · 2019-09-20T03:18:06Z

> I would hope not, did the BB apply llvm7-revert-D44485?

Applying patch /workspace/srcdir/llvm_patches/0016-llvm7-revert-D44485.patch
patching file lib/MC/WinCOFFObjectWriter.cpp
Hunk #1 succeeded at 681 (offset -9 lines).

Looks like it did to me.

staticfloat · 2019-09-20T03:19:35Z

Oh, huh, the Windows buildbots made it past bootstrap this time. I don't know what that other error message I was seeing was; let's blame buildbot.

vchuravy · 2019-09-20T13:23:25Z

@Keno can you take another look at analyzegc?

Fixes some missing roots identified by the analysis pass, and clarifies other code to avoid false-positive errors.

Co-authored-by: Keno Fischer <[email protected]>

vtjnash · 2019-11-19T04:42:52Z

anyone know why it can't find the c++ headers? if not, I'll just put that commit on a new PR

vchuravy · 2019-11-19T15:57:21Z

> anyone know why it can't find the c++ headers? if not, I'll just put that commit on a new PR

Probably better to do that, since it is unrelated to this PR.

KristofferC · 2019-11-20T10:20:56Z

@nanosoldier runbenchmarks(ALL, vs = ":master")

nanosoldier · 2019-11-20T17:46:00Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

vchuravy · 2019-11-20T18:47:05Z

Yay! Onwards to LLVM 9.

chriselrod · 2019-11-21T23:21:26Z

A tangible benefit* of LLVM 9: the LLVM 8 documentation mentions expandload and compressstore, but both of these intrinsics caused a crash on a Haswell cluster with the message

LLVM ERROR: Cannot select: 0x2a43b50: v4f64,ch = masked_load<(load 32 from %ir.ptr.i)> 0x26e3c78, 0x2fb7538, 0x2a3fcf8, 0x2ed9e40, /home/c285497/.julia/dev/SIMDPirates/src/memory.jl:823 @[ /home/c285497/.julia/dev/SIMDPirates/src/memory.jl:802 ]

Yet these functions work on a build with LLVM 9.

julia> versioninfo()
Julia Version 1.4.0-DEV.513
Commit 8f7855a* (2019-11-21 01:58 UTC)
Platform Info:
  OS: Linux (x86_64-redhat-linux)
  CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.0 (ORCJIT, haswell)
Environment:
  JULIA_NUM_THREADS = 24

julia> @time using SIMDPirates
  1.459145 seconds (1.64 M allocations: 88.784 MiB, 1.63% gc time)

julia> x = ntuple(Val(4)) do i Core.VecElement(randn()) end
(VecElement{Float64}(0.15844536654536248), VecElement{Float64}(1.3029855351761224), VecElement{Float64}(-0.5349914246588564), VecElement{Float64}(0.4026832110654877))

julia> y = collect(1.0:99.0);

julia> SIMDPirates.expandload!(Vec{4,Float64}, pointer(y), UInt8(5))
(VecElement{Float64}(1.0), VecElement{Float64}(0.0), VecElement{Float64}(2.0), VecElement{Float64}(0.0))

julia> SIMDPirates.compressstore!(pointer(y), x, UInt8(5))

julia> y'
1×99 LinearAlgebra.Adjoint{Float64,Array{Float64,1}}:
 0.158445  -0.534991  3.0  4.0  5.0  6.0  7.0  8.0  9.0  10.0  11.0  12.0  13.0  14.0  15.0  16.0  17.0  18.0  19.0  20.0  21.0  22.0  23.0  24.0  25.0  26.0  27.0  28.0  29.0  30.0  31.0  32.0  33.0  34.0  35.0  36.0  37.0  38.0  39.0  40.0  …  60.0  61.0  62.0  63.0  64.0  65.0  66.0  67.0  68.0  69.0  70.0  71.0  72.0  73.0  74.0  75.0  76.0  77.0  78.0  79.0  80.0  81.0  82.0  83.0  84.0  85.0  86.0  87.0  88.0  89.0  90.0  91.0  92.0  93.0  94.0  95.0  96.0  97.0  98.0  99.0

julia> @code_native debuginfo=:none SIMDPirates.expandload!(Vec{4,Float64}, pointer(y), UInt8(5))
        .text
        vmovd   %edx, %xmm0
        andl    $1, %edx
        vmovd   %edx, %xmm2
        movabsq $.rodata.cst16, %rax
        vmovdqa (%rax), %xmm1
        vpbroadcastd    %xmm0, %xmm0
        vpand   %xmm1, %xmm0, %xmm0
        vpcmpeqd        %xmm1, %xmm0, %xmm0
        vpsrld  $31, %xmm0, %xmm1
        vpextrb $0, %xmm2, %eax
        testb   %al, %al
        je      L73
        vmovq   (%rsi), %xmm0           # xmm0 = mem[0],zero
        addq    $8, %rsi
        vpextrb $4, %xmm1, %eax
        cmpb    $1, %al
        jne     L101
        jmp     L87
L73:
        vpxor   %xmm0, %xmm0, %xmm0
        vpextrb $4, %xmm1, %eax
        cmpb    $1, %al
        jne     L101
L87:
        vmovhps (%rsi), %xmm0, %xmm2    # xmm2 = xmm0[0,1],mem[0,1]
        vpblendd        $15, %ymm2, %ymm0, %ymm0 # ymm0 = ymm2[0,1,2,3],ymm0[4,5,6,7]
        addq    $8, %rsi
L101:
        vpextrb $8, %xmm1, %eax
        cmpb    $1, %al
        je      L122
        vpextrb $12, %xmm1, %eax
        cmpb    $1, %al
        je      L152
L121:
        retq
L122:
        vextracti128    $1, %ymm0, %xmm2
        vmovlps (%rsi), %xmm2, %xmm2    # xmm2 = mem[0,1],xmm2[2,3]
        vinserti128     $1, %xmm2, %ymm0, %ymm0
        addq    $8, %rsi
        vpextrb $12, %xmm1, %eax
        cmpb    $1, %al
        jne     L121
L152:
        vextracti128    $1, %ymm0, %xmm1
        vmovhps (%rsi), %xmm1, %xmm1    # xmm1 = xmm1[0,1],mem[0,1]
        vinserti128     $1, %xmm1, %ymm0, %ymm0
        retq
        nopl    (%rax)

julia> @code_native debuginfo=:none SIMDPirates.compressstore!(pointer(y), x, UInt8(5))
        .text
        vmovd   %esi, %xmm1
        andl    $1, %esi
        vmovd   %esi, %xmm2
        movabsq $.rodata.cst16, %rax
        vmovdqa (%rax), %xmm3
        vpbroadcastd    %xmm1, %xmm1
        vpand   %xmm3, %xmm1, %xmm1
        vpcmpeqd        %xmm3, %xmm1, %xmm1
        vpsrld  $31, %xmm1, %xmm1
        vpextrb $0, %xmm2, %eax
        testb   %al, %al
        jne     L93
        vpextrb $4, %xmm1, %eax
        cmpb    $1, %al
        je      L111
L63:
        vpextrb $8, %xmm1, %eax
        vextractf128    $1, %ymm0, %xmm0
        cmpb    $1, %al
        je      L135
L79:
        vpextrb $12, %xmm1, %eax
        cmpb    $1, %al
        je      L153
L89:
        vzeroupper
        retq
L93:
        vmovlps %xmm0, (%rdi)
        addq    $8, %rdi
        vpextrb $4, %xmm1, %eax
        cmpb    $1, %al
        jne     L63
L111:
        vmovhps %xmm0, (%rdi)
        addq    $8, %rdi
        vpextrb $8, %xmm1, %eax
        vextractf128    $1, %ymm0, %xmm0
        cmpb    $1, %al
        jne     L79
L135:
        vmovlps %xmm0, (%rdi)
        addq    $8, %rdi
        vpextrb $12, %xmm1, %eax
        cmpb    $1, %al
        jne     L89
L153:
        vmovhps %xmm0, (%rdi)
        vzeroupper
        retq
        nopw    %cs:(%rax,%rax)
        nopl    (%rax,%rax)

Haswell obviously isn't one of the targets that support efficient expand loads or compress stores, but it's nice to have things work.

*Impacting approximately 0% of users.
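For readers unfamiliar with these two operations, the REPL session above can be mimicked with a scalar model. This is a sketch in plain Python of the semantics as they appear in that session, not SIMDPirates' or LLVM's implementation. The function names mirror the intrinsics, the `fill` parameter is an assumption standing in for the passthru operand of the LLVM intrinsic, and the values of `x` are rounded stand-ins for the random vector above.

```python
def expandload(memory, offset, mask, width, fill=0.0):
    """Read consecutive elements from memory into the lanes whose mask bit is set."""
    lanes = []
    for lane in range(width):
        if (mask >> lane) & 1:
            lanes.append(memory[offset])
            offset += 1          # advance only when a lane was actually loaded
        else:
            lanes.append(fill)   # the session above shows 0.0 in masked-off lanes
    return lanes


def compressstore(memory, offset, vector, mask):
    """Write the lanes whose mask bit is set to consecutive memory slots."""
    for lane, value in enumerate(vector):
        if (mask >> lane) & 1:
            memory[offset] = value
            offset += 1
    return memory


y = [float(i) for i in range(1, 100)]
print(expandload(y, 0, 0b0101, 4))   # [1.0, 0.0, 2.0, 0.0]

x = [0.158, 1.303, -0.535, 0.403]    # rounded stand-ins for the random x above
compressstore(y, 0, x, 0b0101)
print(y[:4])                         # [0.158, -0.535, 3.0, 4.0]
```

With mask `UInt8(5)` (binary 0101), lanes 0 and 2 are active, which is why the expand load returns the first two elements of `y` in those lanes and the compress store overwrites only `y[1]` and `y[2]` in the Julia session.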

vtjnash · 2019-11-21T23:30:07Z

This was only for LLVM 8. For LLVM 9, you want #33916.

staticfloat requested a review from vchuravy July 27, 2019 20:15

vchuravy force-pushed the sf/llvm8 branch from 1428ee0 to 37c3552 Compare July 29, 2019 15:57

staticfloat force-pushed the sf/llvm8 branch from be4c7f2 to 07d7272 Compare July 30, 2019 16:20

tshort mentioned this pull request Aug 3, 2019

Another try tshort/ExportWebAssembly.jl#13

Open

vchuravy force-pushed the sf/llvm8 branch from d54ad13 to b2b54fc Compare August 3, 2019 20:59

vchuravy mentioned this pull request Aug 7, 2019

Add a build option to rename all LLVM symbols #12644

Closed

yuyichao mentioned this pull request Aug 8, 2019

julia-git 长期打包错误 archlinuxcn/repo#1273

Closed

vchuravy force-pushed the sf/llvm8 branch 2 times, most recently from cdce970 to ec36991 Compare August 20, 2019 09:25

vchuravy mentioned this pull request Aug 22, 2019

add llvm 9 support files (includes llvm 8) #33018

Merged

staticfloat force-pushed the sf/llvm8 branch from 8447b9d to faaaaad Compare September 20, 2019 02:05

vtjnash force-pushed the sf/llvm8 branch from faaaaad to 498a0fb Compare November 18, 2019 23:46

vtjnash and others added 3 commits November 18, 2019 20:28

fixes for gc rooting

95b6e34

Fixes some missing roots identified by the analysis pass, and clarifies other code to avoid false-positive errors.

annotate extract_wrapper for analyzegc

b6c5699

Co-authored-by: Keno Fischer <[email protected]>

Upgrade to LLVM v8.0.1

cd92f60

vtjnash force-pushed the sf/llvm8 branch from d3aa4e3 to 0a93dd8 Compare November 19, 2019 01:28

vtjnash force-pushed the sf/llvm8 branch from 0a93dd8 to cd92f60 Compare November 19, 2019 18:12

vchuravy approved these changes Nov 19, 2019


vtjnash merged commit 10463bb into master Nov 20, 2019

vtjnash deleted the sf/llvm8 branch November 20, 2019 18:07

o314 mentioned this pull request Dec 30, 2019

Tasking for Emscripten/Wasm target #32532

Merged

maleadt mentioned this pull request Jan 14, 2020

WIP: Backports for 1.4-RC1 #34238

Merged

28 tasks

tkf mentioned this pull request Apr 6, 2020

Add expandload and compressstore eschnett/SIMD.jl#66

Merged
