Skip to content

Commit

Permalink
[SYCL] Always do internalization in sycl-post-link (#14976)
Browse files Browse the repository at this point in the history
We already had `Internalize` pass being launched as part of a module
cleanup phase in `sycl-post-link`, but it was only invoked when shared
libraries/dynamic linking is enabled.

However, there are other features that need this internalization and one
(and the only, at least for now) is virtual functions. When virtual
functions are used in a program it could be necessary to dynamically
link several device images containing SYCL kernels together.

The problem with that is that there are ITT instrumentation functions
added to modules that have external linkage - they cause multiple
definitions error when we try to link two kernels together, because both
of them are instrumented.

This problem is solved by running `Internalize` pass uncoditionally, but
the criteria of what we can internalize depends on whether we enable
shared libraries/dynamic linking support:
- in regular flow, we can internalize any symbol that is not considered
to be an entry point, because all device images are self-contained and
there are no dependencies between them.
- when dynamic linking is enabled, we should not internalize functions
that can be imported/exported, even if they are not considered as module
entry points.

Tests were updated where necessary to expect or ignore linkage changes
of some functions. InvokeSIMD pass had to be updated: ESIMD is handled
through two-level device code split, meaning two cleanup phases where
each of them could mistakenly drop some compiler-generated functions
unless they are properly marked in LLVM IR.
  • Loading branch information
AlexeySachkov authored Aug 10, 2024
1 parent 3b91b0b commit 2137ff0
Show file tree
Hide file tree
Showing 22 changed files with 305 additions and 87 deletions.
5 changes: 5 additions & 0 deletions llvm/lib/SYCLLowerIR/LowerInvokeSimd.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -415,6 +415,11 @@ bool processInvokeSimdCall(CallInst *InvokeSimd,
CallInst *TheTformedCall = cast<CallInst>(VMap[TheCall]);
TheTformedCall->setCalledFunction(SimdF);
fixFunctionName(NewHelper);
// When we will do ESIMD split, that helper will be moved into ESIMD module
// where it has no uses. To prevent it being internalized and killed by DCE
// during post-split cleanup, we need to add this attribtue and set proper
// linkage.
NewHelper->addFnAttr("referenced-indirectly");
}

// 3. Clone and transform __builtin_invoke_simd call:
Expand Down
44 changes: 37 additions & 7 deletions llvm/lib/SYCLLowerIR/ModuleSplitter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -660,11 +660,36 @@ void ModuleDesc::restoreLinkageOfDirectInvokeSimdTargets() {
}
}

// Predicate for Internalize pass.
bool mustPreserveGV(const GlobalValue &GV) {
if (const Function *F = dyn_cast<Function>(&GV))
if (!canBeImportedFunction(*F))
return false;
// Predicate for Internalize pass. The pass is very aggressive and essentially
// tries to internalize absolutely everything. This function serves as "input
// from a linker" that tells the pass what must be preserved in order to make
// the transformation safe.
static bool mustPreserveGV(const GlobalValue &GV) {
if (const Function *F = dyn_cast<Function>(&GV)) {
// When dynamic linking is supported, we internalize everything that can
// not be imported which also means that there is no point of having it
// visible outside of the current module.
if (SupportDynamicLinking)
return canBeImportedFunction(*F);

// Otherwise, we are being even more aggressive: SYCL modules are expected
// to be self-contained, meaning that they have no external dependencies.
// Therefore, we can internalize every function that is not an entry point.
// One exception here is virtual functions: when they are in use, modules
// are not self-contained anymore and some device images has to be linked
// at runtime to resolve all symbols.
// Functions marked with referenced-indirectly attribute is another
// exception: that attribute was originally introduced for function pointers
// and even though its main usage was deprecated and dropped, it is still
// used in invoke_simd (but that use needs to be revisited).
return F->hasFnAttribute("sycl-entry-point") ||
F->hasFnAttribute("indirectly-callable") ||
F->hasFnAttribute("referenced-indirectly");
}

// Otherwise, we don't have enough information about a global and picking a
// safe side saying that all other globals must be preserved (we should have
// cleaned up unused globals during dependency graph analysis already).
return true;
}

Expand All @@ -687,12 +712,17 @@ void ModuleDesc::cleanup() {
F.setLinkage(GlobalValue::LinkageTypes::ExternalLinkage);
}

// Callback for internalize can't be a lambda with captures, so we propagate
// necessary information through the module itself.
if (!SupportDynamicLinking)
for (Function *F : EntryPoints.Functions)
F->addFnAttr("sycl-entry-point");

ModuleAnalysisManager MAM;
MAM.registerPass([&] { return PassInstrumentationAnalysis(); });
ModulePassManager MPM;
// Do cleanup.
if (SupportDynamicLinking)
MPM.addPass(InternalizePass(mustPreserveGV));
MPM.addPass(InternalizePass(mustPreserveGV));
MPM.addPass(GlobalDCEPass()); // Delete unreachable globals.
MPM.addPass(StripDeadDebugInfoPass()); // Remove dead debug info.
MPM.addPass(StripDeadPrototypesPass()); // Remove dead func decls.
Expand Down
10 changes: 7 additions & 3 deletions llvm/test/SYCLLowerIR/ESIMD/lower_invoke_simd.ll
Original file line number Diff line number Diff line change
Expand Up @@ -74,18 +74,18 @@ define linkonce_odr dso_local x86_regcallcc <16 x float> @SIMD_CALL_HELPER(ptr n

;---- Check that original SIMD_CALL_HELPER retained, because there are
;---- invoke_simd calls where simd target can't be inferred.
; CHECK: define {{.*}} <16 x float> @SIMD_CALL_HELPER(ptr {{.*}}%{{.*}}, <16 x float> %{{.*}}) #[[HELPER_ATTRS:[0-9]+]] !sycl_explicit_simd !0 !intel_reqd_sub_group_size !1
; CHECK: define weak_odr {{.*}} <16 x float> @SIMD_CALL_HELPER(ptr {{.*}}%{{.*}}, <16 x float> %{{.*}}) #[[HELPER_ATTRS:[0-9]+]] !sycl_explicit_simd !0 !intel_reqd_sub_group_size !1
; CHECK: %{{.*}} = call x86_regcallcc <16 x float> %{{.*}}(<16 x float> %{{.*}})
; CHECK: }

;---- Optimized version for the SIMD_CALLEE call
; CHECK: define {{.*}} <16 x float> @[[NAME1]](<16 x float> %{{.*}}) #[[HELPER_ATTRS]]
; CHECK: define weak_odr {{.*}} <16 x float> @[[NAME1]](<16 x float> %{{.*}}) #[[HELPER_ATTRS1:[0-9]+]]
; Verify that indirect call is converted to direct
; CHECK: %{{.*}} = call x86_regcallcc <16 x float> @SIMD_CALLEE(<16 x float> %{{.*}})
; CHECK: }

;---- Optimized version for the ANOTHER_SIMD_CALLEE call
; CHECK: define {{.*}} <16 x float> @[[NAME2]](<16 x float> %{{.*}}) #[[HELPER_ATTRS]]
; CHECK: define weak_odr {{.*}} <16 x float> @[[NAME2]](<16 x float> %{{.*}}) #[[HELPER_ATTRS1]]
; Verify that indirect call is converted to direct
; CHECK: %{{.*}} = call x86_regcallcc <16 x float> @ANOTHER_SIMD_CALLEE(<16 x float> %{{.*}})
; CHECK: }
Expand All @@ -95,6 +95,10 @@ declare dso_local x86_regcallcc noundef float @_Z33__regcall3____builtin_invoke_
; Check that VCStackCall attribute is added to the invoke_simd target functions:
attributes #0 = { "sycl-module-id"="invoke_simd.cpp" }
; CHECK: attributes #[[HELPER_ATTRS]] = { "VCStackCall" "sycl-module-id"="invoke_simd.cpp" }
; If we transformed the helper, then it should receive "referenced-indirectly"
; attribute so it is not dropped after Internalize + DCE in post-split module
; cleanup
; CHECK: attributes #[[HELPER_ATTRS1]] = { "VCStackCall" "referenced-indirectly" "sycl-module-id"="invoke_simd.cpp" }

!0 = !{}
!1 = !{i32 16}
Original file line number Diff line number Diff line change
Expand Up @@ -34,8 +34,8 @@ entry:
ret void
}

; CHECK-TU1: define dso_local spir_func void @{{.*}}foo{{.*}}()
; CHECK-TU0-NOT: define dso_local spir_func void @{{.*}}foo{{.*}}()
; CHECK-TU1: define {{.*}} spir_func void @{{.*}}foo{{.*}}()
; CHECK-TU0-NOT: define {{.*}} spir_func void @{{.*}}foo{{.*}}()

; CHECK-TU1: call spir_func i32 @{{.*}}bar{{.*}}(i32 1)

Expand Down Expand Up @@ -73,8 +73,8 @@ entry:
ret void
}

; CHECK-TU1: define dso_local spir_func void @{{.*}}foo1{{.*}}()
; CHECK-TU0-NOT: define dso_local spir_func void @{{.*}}foo1{{.*}}()
; CHECK-TU1: define {{.*}} spir_func void @{{.*}}foo1{{.*}}()
; CHECK-TU0-NOT: define {{.*}} spir_func void @{{.*}}foo1{{.*}}()

; Function Attrs: nounwind
define dso_local spir_func void @_Z4foo1v() {
Expand All @@ -97,8 +97,8 @@ entry:
ret void
}

; CHECK-TU1-NOT: define dso_local spir_func void @{{.*}}foo2{{.*}}()
; CHECK-TU0: define dso_local spir_func void @{{.*}}foo2{{.*}}()
; CHECK-TU1-NOT: define {{.*}} spir_func void @{{.*}}foo2{{.*}}()
; CHECK-TU0: define {{.*}} spir_func void @{{.*}}foo2{{.*}}()

; Function Attrs: nounwind
define dso_local spir_func void @_Z4foo2v() {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -38,8 +38,8 @@ entry:
ret void
}

; CHECK-TU1: define dso_local spir_func void @{{.*}}foo{{.*}}()
; CHECK-TU0-NOT: define dso_local spir_func void @{{.*}}foo{{.*}}()
; CHECK-TU1: define {{.*}} spir_func void @{{.*}}foo{{.*}}()
; CHECK-TU0-NOT: define {{.*}} spir_func void @{{.*}}foo{{.*}}()

; CHECK-TU1: call spir_func i32 @{{.*}}bar{{.*}}(i32 1)

Expand Down Expand Up @@ -77,8 +77,8 @@ entry:
ret void
}

; CHECK-TU1: define dso_local spir_func void @{{.*}}foo1{{.*}}()
; CHECK-TU0-NOT: define dso_local spir_func void @{{.*}}foo1{{.*}}()
; CHECK-TU1: define {{.*}} spir_func void @{{.*}}foo1{{.*}}()
; CHECK-TU0-NOT: define {{.*}} spir_func void @{{.*}}foo1{{.*}}()

; Function Attrs: nounwind
define dso_local spir_func void @_Z4foo1v() {
Expand All @@ -101,8 +101,8 @@ entry:
ret void
}

; CHECK-TU1-NOT: define dso_local spir_func void @{{.*}}foo2{{.*}}()
; CHECK-TU0: define dso_local spir_func void @{{.*}}foo2{{.*}}()
; CHECK-TU1-NOT: define {{.*}} spir_func void @{{.*}}foo2{{.*}}()
; CHECK-TU0: define {{.*}} spir_func void @{{.*}}foo2{{.*}}()

; Function Attrs: nounwind
define dso_local spir_func void @_Z4foo2v() {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -33,13 +33,13 @@
;
; CHECK-TU0-IR: @_ZL2GV = internal addrspace(1) constant
; CHECK-TU0-IR: define dso_local spir_kernel void @_ZTSZ4mainE11TU1_kernel0
; CHECK-TU0-IR: define dso_local spir_func i32 @_Z4foo1v
; CHECK-TU0-IR: define {{.*}} spir_func i32 @_Z4foo1v
; CHECK-TU0-IR: define dso_local spir_kernel void @_ZTSZ4mainE11TU1_kernel1
; CHECK-TU0-IR: define dso_local spir_func void @_Z4foo2v
; CHECK-TU0-IR: define {{.*}} spir_func void @_Z4foo2v
;
; CHECK-TU1-IR: define dso_local spir_kernel void @_ZTSZ4mainE10TU0_kernel
; CHECK-TU1-IR: define dso_local spir_func void @_Z3foov
; CHECK-TU1-IR: define dso_local spir_func i32 @_Z4foo3v
; CHECK-TU1-IR: define {{.*}} spir_func void @_Z3foov
; CHECK-TU1-IR: define {{.*}} spir_func i32 @_Z4foo3v

target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024"
target triple = "spir64-unknown-linux"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@
; CHECK-IR0: define dso_local spir_kernel void @kernel2
;
; CHECK-IR1: @_Z2f1iTable = weak global ptr @_Z2f1i
; CHECK-IR1: define dso_local spir_func i32 @_Z2f1i
; CHECK-IR1: define {{.*}} i32 @_Z2f1i
; CHECK-IR1: define weak_odr dso_local spir_kernel void @kernel1

target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-n8:16:32:64"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -33,8 +33,8 @@ entry:
ret void
}

; CHECK-TU1: define dso_local spir_func void @{{.*}}foo{{.*}}()
; CHECK-TU0-NOT: define dso_local spir_func void @{{.*}}foo{{.*}}()
; CHECK-TU1: define {{.*}} spir_func void @{{.*}}foo{{.*}}()
; CHECK-TU0-NOT: define {{.*}} spir_func void @{{.*}}foo{{.*}}()

; CHECK-TU1: call spir_func i32 @{{.*}}bar{{.*}}(i32 1)

Expand Down Expand Up @@ -72,8 +72,8 @@ entry:
ret void
}

; CHECK-TU1: define dso_local spir_func void @{{.*}}foo1{{.*}}()
; CHECK-TU0-NOT: define dso_local spir_func void @{{.*}}foo1{{.*}}()
; CHECK-TU1: define {{.*}} spir_func void @{{.*}}foo1{{.*}}()
; CHECK-TU0-NOT: define {{.*}} spir_func void @{{.*}}foo1{{.*}}()

; Function Attrs: nounwind
define dso_local spir_func void @_Z4foo1v() {
Expand All @@ -96,8 +96,8 @@ entry:
ret void
}

; CHECK-TU1-NOT: define dso_local spir_func void @{{.*}}foo2{{.*}}()
; CHECK-TU0: define dso_local spir_func void @{{.*}}foo2{{.*}}()
; CHECK-TU1-NOT: define {{.*}} spir_func void @{{.*}}foo2{{.*}}()
; CHECK-TU0: define {{.*}} spir_func void @{{.*}}foo2{{.*}}()

; Function Attrs: nounwind
define dso_local spir_func void @_Z4foo2v() {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -72,12 +72,12 @@
; CHECK0-DAG: define spir_func void @BAZ

; CHECK1-DAG: define spir_kernel void @kernel_B
; CHECK1-DAG: define spir_func i32 @foo
; CHECK1-DAG: define {{.*}}spir_func i32 @foo
; CHECK1-DAG: define spir_func i32 @bar
; CHECK1-DAG: define spir_func void @BAZ

; CHECK2-DAG: define spir_kernel void @kernel_A
; CHECK2-DAG: define spir_func void @baz
; CHECK2-DAG: define {{.*}}spir_func void @baz

target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-n8:16:32:64"
target triple = "spir64-unknown-unknown"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -39,9 +39,9 @@ entry:
ret void
}

; CHECK-MODULE2: define dso_local spir_func void @{{.*}}foo{{.*}}()
; CHECK-MODULE1-NOT: define dso_local spir_func void @{{.*}}foo{{.*}}()
; CHECK-MODULE0-NOT: define dso_local spir_func void @{{.*}}foo{{.*}}()
; CHECK-MODULE2: define {{.*}} spir_func void @{{.*}}foo{{.*}}()
; CHECK-MODULE1-NOT: define {{.*}} spir_func void @{{.*}}foo{{.*}}()
; CHECK-MODULE0-NOT: define {{.*}} spir_func void @{{.*}}foo{{.*}}()

; CHECK-MODULE2: call spir_func i32 @{{.*}}bar{{.*}}(i32 1)

Expand Down Expand Up @@ -82,9 +82,9 @@ entry:
ret void
}

; CHECK-MODULE2-NOT: define dso_local spir_func void @{{.*}}foo1{{.*}}()
; CHECK-MODULE1: define dso_local spir_func void @{{.*}}foo1{{.*}}()
; CHECK-MODULE0-NOT: define dso_local spir_func void @{{.*}}foo1{{.*}}()
; CHECK-MODULE2-NOT: define {{.*}} spir_func void @{{.*}}foo1{{.*}}()
; CHECK-MODULE1: define {{.*}} spir_func void @{{.*}}foo1{{.*}}()
; CHECK-MODULE0-NOT: define {{.*}} spir_func void @{{.*}}foo1{{.*}}()

; Function Attrs: nounwind
define dso_local spir_func void @_Z4foo1v() {
Expand All @@ -109,9 +109,9 @@ entry:
ret void
}

; CHECK-MODULE2-NOT: define dso_local spir_func void @{{.*}}foo2{{.*}}()
; CHECK-MODULE1-NOT: define dso_local spir_func void @{{.*}}foo2{{.*}}()
; CHECK-MODULE0: define dso_local spir_func void @{{.*}}foo2{{.*}}()
; CHECK-MODULE2-NOT: define {{.*}} spir_func void @{{.*}}foo2{{.*}}()
; CHECK-MODULE1-NOT: define {{.*}} spir_func void @{{.*}}foo2{{.*}}()
; CHECK-MODULE0: define {{.*}} spir_func void @{{.*}}foo2{{.*}}()

; Function Attrs: nounwind
define dso_local spir_func void @_Z4foo2v() {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,7 @@
;
; @kernel1 uses @foo and therefore @foo should be present in the same module as
; @kernel1 as well
; CHECK-M1-IR-DAG: define spir_func void @foo
; CHECK-M1-IR-DAG: define {{.*}}spir_func void @foo
; CHECK-M1-IR-DAG: define spir_kernel void @kernel1


Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -29,36 +29,36 @@ $bar = comdat any

; Function Attrs: mustprogress norecurse nounwind
define linkonce_odr dso_local spir_func void @foo() unnamed_addr #0 comdat align 2 {
; CHECK-A0: define linkonce_odr dso_local spir_func void @foo
; CHECK-A1: define linkonce_odr dso_local spir_func void @foo
; CHECK-B0: define linkonce_odr dso_local spir_func void @foo
; CHECK-B1: define linkonce_odr dso_local spir_func void @foo
; CHECK-C0: define linkonce_odr dso_local spir_func void @foo
; CHECK-C1: define linkonce_odr dso_local spir_func void @foo
; CHECK-A0: define {{.*}} spir_func void @foo
; CHECK-A1: define {{.*}} spir_func void @foo
; CHECK-B0: define {{.*}} spir_func void @foo
; CHECK-B1: define {{.*}} spir_func void @foo
; CHECK-C0: define {{.*}} spir_func void @foo
; CHECK-C1: define {{.*}} spir_func void @foo
ret void
}

; -- Also check that function called from an addr-taken function is also added
; to every split module.
; Function Attrs: mustprogress norecurse nounwind
define weak dso_local spir_func void @baz() #3 {
; CHECK-A0: define weak dso_local spir_func void @baz
; CHECK-A1: define weak dso_local spir_func void @baz
; CHECK-B0: define weak dso_local spir_func void @baz
; CHECK-B1: define weak dso_local spir_func void @baz
; CHECK-C0: define weak dso_local spir_func void @baz
; CHECK-C1: define weak dso_local spir_func void @baz
; CHECK-A0: define {{.*}} spir_func void @baz
; CHECK-A1: define {{.*}} spir_func void @baz
; CHECK-B0: define {{.*}} spir_func void @baz
; CHECK-B1: define {{.*}} spir_func void @baz
; CHECK-C0: define {{.*}} spir_func void @baz
; CHECK-C1: define {{.*}} spir_func void @baz
ret void
}

; Function Attrs: mustprogress norecurse nounwind
define linkonce_odr dso_local spir_func void @bar() unnamed_addr #1 comdat align 2 {
; CHECK-A0: define linkonce_odr dso_local spir_func void @bar
; CHECK-A1: define linkonce_odr dso_local spir_func void @bar
; CHECK-B0: define linkonce_odr dso_local spir_func void @bar
; CHECK-B1: define linkonce_odr dso_local spir_func void @bar
; CHECK-C0: define linkonce_odr dso_local spir_func void @bar
; CHECK-C1: define linkonce_odr dso_local spir_func void @bar
; CHECK-A0: define {{.*}} spir_func void @bar
; CHECK-A1: define {{.*}} spir_func void @bar
; CHECK-B0: define {{.*}} spir_func void @bar
; CHECK-B1: define {{.*}} spir_func void @bar
; CHECK-C0: define {{.*}} spir_func void @bar
; CHECK-C1: define {{.*}} spir_func void @bar
call void @baz()
ret void
}
Expand Down
Loading

0 comments on commit 2137ff0

Please sign in to comment.