-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CodeGen][StaticDataPartitioning]Place local-linkage global variables in hot or unlikely prefixed sections based on profile information #125756
base: main
Are you sure you want to change the base?
Conversation
…es based on profile information
@llvm/pr-subscribers-llvm-analysis @llvm/pr-subscribers-llvm-ir Author: Mingming Liu (mingmingl-llvm) ChangesIn this PR, static-data-splitter pass finds out the local-linkage global variables in { A follow-up item is to analyze global variable initializers and count for access from data.
Some stats of static-data-splitter with this patch:
Patch is 28.13 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/125756.diff 14 Files Affected:
diff --git a/llvm/include/llvm/IR/Function.h b/llvm/include/llvm/IR/Function.h
index fcd5396ccfdbc87..29041688124bc29 100644
--- a/llvm/include/llvm/IR/Function.h
+++ b/llvm/include/llvm/IR/Function.h
@@ -346,12 +346,6 @@ class LLVM_ABI Function : public GlobalObject, public ilist_node<Function> {
/// sample PGO, to enable the same inlines as the profiled optimized binary.
DenseSet<GlobalValue::GUID> getImportGUIDs() const;
- /// Set the section prefix for this function.
- void setSectionPrefix(StringRef Prefix);
-
- /// Get the section prefix for this function.
- std::optional<StringRef> getSectionPrefix() const;
-
/// hasGC/getGC/setGC/clearGC - The name of the garbage collection algorithm
/// to use during code generation.
bool hasGC() const {
diff --git a/llvm/include/llvm/IR/GlobalObject.h b/llvm/include/llvm/IR/GlobalObject.h
index 08edc13d81f880a..bb50c39813e1407 100644
--- a/llvm/include/llvm/IR/GlobalObject.h
+++ b/llvm/include/llvm/IR/GlobalObject.h
@@ -124,6 +124,17 @@ class GlobalObject : public GlobalValue {
/// appropriate default object file section.
void setSection(StringRef S);
+ /// Set the section prefix for this global object.
+ void setSectionPrefix(StringRef Prefix);
+
+ /// Update the section prefix, unless the existing prefix is the same as
+ /// `KeepPrefix`.
+ void updateSectionPrefix(StringRef Prefix,
+ std::optional<StringRef> KeepPrefix = std::nullopt);
+
+ /// Get the section prefix for this global object.
+ std::optional<StringRef> getSectionPrefix() const;
+
bool hasComdat() const { return getComdat() != nullptr; }
const Comdat *getComdat() const { return ObjComdat; }
Comdat *getComdat() { return ObjComdat; }
diff --git a/llvm/include/llvm/IR/MDBuilder.h b/llvm/include/llvm/IR/MDBuilder.h
index e02ec8f5a3d8bb1..ce4e1da656049d1 100644
--- a/llvm/include/llvm/IR/MDBuilder.h
+++ b/llvm/include/llvm/IR/MDBuilder.h
@@ -89,8 +89,8 @@ class MDBuilder {
MDNode *createFunctionEntryCount(uint64_t Count, bool Synthetic,
const DenseSet<GlobalValue::GUID> *Imports);
- /// Return metadata containing the section prefix for a function.
- MDNode *createFunctionSectionPrefix(StringRef Prefix);
+ /// Return metadata containing the section prefix for a global object.
+ MDNode *createGlobalObjectSectionPrefix(StringRef Prefix);
/// Return metadata containing the pseudo probe descriptor for a function.
MDNode *createPseudoProbeDesc(uint64_t GUID, uint64_t Hash, StringRef FName);
diff --git a/llvm/lib/CodeGen/StaticDataSplitter.cpp b/llvm/lib/CodeGen/StaticDataSplitter.cpp
index e5bf0a5a3a255f6..f09e3b41e0723e6 100644
--- a/llvm/lib/CodeGen/StaticDataSplitter.cpp
+++ b/llvm/lib/CodeGen/StaticDataSplitter.cpp
@@ -9,13 +9,13 @@
// The pass uses branch profile data to assign hotness based section qualifiers
// for the following types of static data:
// - Jump tables
+// - Module-internal global variables
// - Constant pools (TODO)
-// - Other module-internal data (TODO)
//
// For the original RFC of this pass please see
// https://discourse.llvm.org/t/rfc-profile-guided-static-data-partitioning/83744
-#include "llvm/ADT/ScopeExit.h"
+#include "llvm/ADT/APInt.h"
#include "llvm/ADT/Statistic.h"
#include "llvm/Analysis/ProfileSummaryInfo.h"
#include "llvm/CodeGen/MBFIWrapper.h"
@@ -27,9 +27,12 @@
#include "llvm/CodeGen/MachineFunctionPass.h"
#include "llvm/CodeGen/MachineJumpTableInfo.h"
#include "llvm/CodeGen/Passes.h"
+#include "llvm/IR/GlobalVariable.h"
+#include "llvm/IR/Module.h"
#include "llvm/InitializePasses.h"
#include "llvm/Pass.h"
#include "llvm/Support/CommandLine.h"
+#include "llvm/Target/TargetLoweringObjectFile.h"
using namespace llvm;
@@ -46,12 +49,27 @@ class StaticDataSplitter : public MachineFunctionPass {
const MachineBlockFrequencyInfo *MBFI = nullptr;
const ProfileSummaryInfo *PSI = nullptr;
- // Returns true iff any jump table is hot-cold categorized.
- bool splitJumpTables(MachineFunction &MF);
+ void updateStats(bool ProfileAvailable, const MachineJumpTableInfo *MJTI);
+ void updateJumpTableStats(bool ProfileAvailable,
+ const MachineJumpTableInfo &MJTI);
- // Same as above but works on functions with profile information.
- bool splitJumpTablesWithProfiles(const MachineFunction &MF,
- MachineJumpTableInfo &MJTI);
+ // Use profiles to partition static data.
+ bool partitionStaticDataWithProfiles(MachineFunction &MF);
+
+ // If the global value is a local linkage global variable, return it.
+ // Otherwise, return nullptr.
+ const GlobalVariable *getLocalLinkageGlobalVariable(const GlobalValue *GV);
+
+ // Returns true if the global variable is in one of {.rodata, .bss, .data,
+ // .data.rel.ro} sections
+ bool inStaticDataSection(const GlobalVariable *GV, const TargetMachine &TM);
+
+ // Iterate all global variables in the module and update the section prefix
+ // of the module-internal data.
+ void updateGlobalVariableSectionPrefix(MachineFunction &MF);
+
+ // Accummulated data profile count across machine functions in the module.
+ DenseMap<const GlobalVariable *, APInt> DataProfileCounts;
public:
static char ID;
@@ -77,13 +95,24 @@ bool StaticDataSplitter::runOnMachineFunction(MachineFunction &MF) {
MBFI = &getAnalysis<MachineBlockFrequencyInfoWrapperPass>().getMBFI();
PSI = &getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI();
- return splitJumpTables(MF);
+ const bool ProfileAvailable = PSI && PSI->hasProfileSummary() && MBFI &&
+ MF.getFunction().hasProfileData();
+ bool Changed = false;
+
+ if (ProfileAvailable)
+ Changed |= partitionStaticDataWithProfiles(MF);
+
+ updateGlobalVariableSectionPrefix(MF);
+ updateStats(ProfileAvailable, MF.getJumpTableInfo());
+ return Changed;
}
-bool StaticDataSplitter::splitJumpTablesWithProfiles(
- const MachineFunction &MF, MachineJumpTableInfo &MJTI) {
+bool StaticDataSplitter::partitionStaticDataWithProfiles(MachineFunction &MF) {
int NumChangedJumpTables = 0;
+ const TargetMachine &TM = MF.getTarget();
+ MachineJumpTableInfo *MJTI = MF.getJumpTableInfo();
+
// Jump table could be used by either terminating instructions or
// non-terminating ones, so we walk all instructions and use
// `MachineOperand::isJTI()` to identify jump table operands.
@@ -92,63 +121,131 @@ bool StaticDataSplitter::splitJumpTablesWithProfiles(
for (const auto &MBB : MF) {
for (const MachineInstr &I : MBB) {
for (const MachineOperand &Op : I.operands()) {
- if (!Op.isJTI())
- continue;
- const int JTI = Op.getIndex();
- // This is not a source block of jump table.
- if (JTI == -1)
+ std::optional<uint64_t> Count = std::nullopt;
+ if (!Op.isJTI() && !Op.isGlobal())
continue;
- auto Hotness = MachineFunctionDataHotness::Hot;
+ Count = MBFI->getBlockProfileCount(&MBB);
+
+ if (Op.isJTI()) {
+ assert(MJTI != nullptr && "Jump table info is not available.");
+ const int JTI = Op.getIndex();
+ // This is not a source block of jump table.
+ if (JTI == -1)
+ continue;
+
+ auto Hotness = MachineFunctionDataHotness::Hot;
+
+ // Hotness is based on source basic block hotness.
+ // TODO: PSI APIs are about instruction hotness. Introduce API for
+ // data access hotness.
+ if (Count && PSI->isColdCount(*Count))
+ Hotness = MachineFunctionDataHotness::Cold;
- // Hotness is based on source basic block hotness.
- // TODO: PSI APIs are about instruction hotness. Introduce API for data
- // access hotness.
- if (PSI->isColdBlock(&MBB, MBFI))
- Hotness = MachineFunctionDataHotness::Cold;
+ if (MJTI->updateJumpTableEntryHotness(JTI, Hotness))
+ ++NumChangedJumpTables;
+ } else if (Op.isGlobal()) {
+ // Find global variables with local linkage
+ const GlobalVariable *GV =
+ getLocalLinkageGlobalVariable(Op.getGlobal());
+ if (!GV || !inStaticDataSection(GV, TM))
+ continue;
- if (MJTI.updateJumpTableEntryHotness(JTI, Hotness))
- ++NumChangedJumpTables;
+ // Acccumulate data profile count across machine function
+ // instructions.
+ // TODO: Analyze global variable's initializers.
+ if (Count) {
+ auto [It, Inserted] =
+ DataProfileCounts.try_emplace(GV, APInt(128, 0));
+ It->second += *Count;
+ }
+ }
}
}
}
return NumChangedJumpTables > 0;
}
-bool StaticDataSplitter::splitJumpTables(MachineFunction &MF) {
- MachineJumpTableInfo *MJTI = MF.getJumpTableInfo();
- if (!MJTI || MJTI->getJumpTables().empty())
- return false;
-
- const bool ProfileAvailable = PSI && PSI->hasProfileSummary() && MBFI &&
- MF.getFunction().hasProfileData();
- auto statOnExit = llvm::make_scope_exit([&] {
- if (!AreStatisticsEnabled())
- return;
+void StaticDataSplitter::updateJumpTableStats(
+ bool ProfileAvailable, const MachineJumpTableInfo &MJTI) {
+ if (!ProfileAvailable) {
+ NumUnknownJumpTables += MJTI.getJumpTables().size();
+ return;
+ }
- if (!ProfileAvailable) {
- NumUnknownJumpTables += MJTI->getJumpTables().size();
- return;
+ for (size_t JTI = 0; JTI < MJTI.getJumpTables().size(); JTI++) {
+ auto Hotness = MJTI.getJumpTables()[JTI].Hotness;
+ if (Hotness == MachineFunctionDataHotness::Hot) {
+ ++NumHotJumpTables;
+ } else {
+ assert(Hotness == MachineFunctionDataHotness::Cold &&
+ "A jump table is either hot or cold when profile information is "
+ "available.");
+ ++NumColdJumpTables;
}
+ }
+}
- for (size_t JTI = 0; JTI < MJTI->getJumpTables().size(); JTI++) {
- auto Hotness = MJTI->getJumpTables()[JTI].Hotness;
- if (Hotness == MachineFunctionDataHotness::Hot) {
- ++NumHotJumpTables;
- } else {
- assert(Hotness == MachineFunctionDataHotness::Cold &&
- "A jump table is either hot or cold when profile information is "
- "available.");
- ++NumColdJumpTables;
- }
- }
- });
+void StaticDataSplitter::updateStats(bool ProfileAvailable,
+ const MachineJumpTableInfo *MJTI) {
+ if (!AreStatisticsEnabled())
+ return;
- // Place jump tables according to block hotness if function has profile data.
- if (ProfileAvailable)
- return splitJumpTablesWithProfiles(MF, *MJTI);
+ if (MJTI)
+ updateJumpTableStats(ProfileAvailable, *MJTI);
+}
- return true;
+const GlobalVariable *
+StaticDataSplitter::getLocalLinkageGlobalVariable(const GlobalValue *GV) {
+ if (!GV || GV->isDeclarationForLinker())
+ return nullptr;
+
+ return GV->hasLocalLinkage() ? dyn_cast<GlobalVariable>(GV) : nullptr;
+}
+
+bool StaticDataSplitter::inStaticDataSection(const GlobalVariable *GV,
+ const TargetMachine &TM) {
+ assert(GV && "Caller guaranteed");
+
+ // Skip LLVM reserved symbols.
+ if (GV->getName().starts_with("llvm."))
+ return false;
+
+ SectionKind Kind = TargetLoweringObjectFile::getKindForGlobal(GV, TM);
+ return Kind.isData() || Kind.isReadOnly() || Kind.isReadOnlyWithRel() ||
+ Kind.isBSS();
+}
+
+void StaticDataSplitter::updateGlobalVariableSectionPrefix(
+ MachineFunction &MF) {
+ for (GlobalVariable &GV : MF.getFunction().getParent()->globals()) {
+ if (GV.isDeclarationForLinker())
+ continue;
+ // DataProfileCounts accumulates data profile count across all machine
+ // function instructions, and it can't model the indirect accesses through
+ // other global variables' initializers.
+ // TODO: Analyze the users of module-internal global variables and see
+ // through the users' initializers. Do not place a global variable into
+ // unlikely section if any of its users are potentially hot.
+ auto Iter = DataProfileCounts.find(&GV);
+ if (Iter == DataProfileCounts.end())
+ continue;
+
+ // StaticDataSplitter is made a machine function pass rather than a module
+ // pass because (Lazy)MachineBlockFrequencyInfo is a machine-function
+ // analysis pass and cannot be used for a legacy module pass.
+ // As a result, we use `DataProfileCounts` to accumulate data
+ // profile count across machine functions and update global variable section
+ // prefix once per machine function.
+ // FIXME: Make StaticDataSplitter a module pass under new pass manager
+ // framework, and set global variable section prefix once per module after
+ // analyzing all machine functions.
+ if (PSI->isColdCount(Iter->second.getZExtValue())) {
+ GV.updateSectionPrefix("unlikely", std::make_optional(StringRef("hot")));
+ } else if (PSI->isHotCount(Iter->second.getZExtValue())) {
+ GV.updateSectionPrefix("hot");
+ }
+ }
}
char StaticDataSplitter::ID = 0;
diff --git a/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp b/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
index 3c2c7c8c9fed69a..d20ab29cc197974 100644
--- a/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
+++ b/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
@@ -670,6 +670,7 @@ getELFSectionNameForGlobal(const GlobalObject *GO, SectionKind Kind,
}
bool HasPrefix = false;
+
if (const auto *F = dyn_cast<Function>(GO)) {
// Jump table hotness takes precedence over its enclosing function's hotness
// if it's known. The function's section prefix is used if jump table entry
@@ -687,6 +688,11 @@ getELFSectionNameForGlobal(const GlobalObject *GO, SectionKind Kind,
raw_svector_ostream(Name) << '.' << *Prefix;
HasPrefix = true;
}
+ } else if (const auto *GV = dyn_cast<GlobalVariable>(GO)) {
+ if (std::optional<StringRef> Prefix = GV->getSectionPrefix()) {
+ raw_svector_ostream(Name) << '.' << *Prefix;
+ HasPrefix = true;
+ }
}
if (UniqueSectionName) {
diff --git a/llvm/lib/IR/Function.cpp b/llvm/lib/IR/Function.cpp
index e6f0d64d071ba67..5666f0a53866fda 100644
--- a/llvm/lib/IR/Function.cpp
+++ b/llvm/lib/IR/Function.cpp
@@ -1164,22 +1164,6 @@ DenseSet<GlobalValue::GUID> Function::getImportGUIDs() const {
return R;
}
-void Function::setSectionPrefix(StringRef Prefix) {
- MDBuilder MDB(getContext());
- setMetadata(LLVMContext::MD_section_prefix,
- MDB.createFunctionSectionPrefix(Prefix));
-}
-
-std::optional<StringRef> Function::getSectionPrefix() const {
- if (MDNode *MD = getMetadata(LLVMContext::MD_section_prefix)) {
- assert(cast<MDString>(MD->getOperand(0))->getString() ==
- "function_section_prefix" &&
- "Metadata not match");
- return cast<MDString>(MD->getOperand(1))->getString();
- }
- return std::nullopt;
-}
-
bool Function::nullPointerIsDefined() const {
return hasFnAttribute(Attribute::NullPointerIsValid);
}
diff --git a/llvm/lib/IR/Globals.cpp b/llvm/lib/IR/Globals.cpp
index db5e1cb57b1bab8..884089262e4659d 100644
--- a/llvm/lib/IR/Globals.cpp
+++ b/llvm/lib/IR/Globals.cpp
@@ -18,6 +18,7 @@
#include "llvm/IR/GlobalAlias.h"
#include "llvm/IR/GlobalValue.h"
#include "llvm/IR/GlobalVariable.h"
+#include "llvm/IR/MDBuilder.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/Error.h"
#include "llvm/Support/ErrorHandling.h"
@@ -286,6 +287,35 @@ void GlobalObject::setSection(StringRef S) {
setGlobalObjectFlag(HasSectionHashEntryBit, !S.empty());
}
+void GlobalObject::setSectionPrefix(StringRef Prefix) {
+ MDBuilder MDB(getContext());
+ setMetadata(LLVMContext::MD_section_prefix,
+ MDB.createGlobalObjectSectionPrefix(Prefix));
+}
+
+void GlobalObject::updateSectionPrefix(StringRef Prefix,
+ std::optional<StringRef> KeepPrefix) {
+ auto SectionPrefix = getSectionPrefix();
+ if (SectionPrefix && (*SectionPrefix == Prefix ||
+ (KeepPrefix && *SectionPrefix == *KeepPrefix)))
+ return;
+
+ setSectionPrefix(Prefix);
+ return;
+}
+
+std::optional<StringRef> GlobalObject::getSectionPrefix() const {
+ if (MDNode *MD = getMetadata(LLVMContext::MD_section_prefix)) {
+ [[maybe_unused]] StringRef MDName =
+ cast<MDString>(MD->getOperand(0))->getString();
+ assert((MDName == "section_prefix" ||
+ (isa<Function>(this) && MDName == "function_section_prefix")) &&
+ "Metadata not match");
+ return cast<MDString>(MD->getOperand(1))->getString();
+ }
+ return std::nullopt;
+}
+
bool GlobalValue::isNobuiltinFnDef() const {
const Function *F = dyn_cast<Function>(this);
if (!F || F->empty())
diff --git a/llvm/lib/IR/MDBuilder.cpp b/llvm/lib/IR/MDBuilder.cpp
index 26c8ab9fc36c850..b6aa8844a7eafa7 100644
--- a/llvm/lib/IR/MDBuilder.cpp
+++ b/llvm/lib/IR/MDBuilder.cpp
@@ -87,9 +87,9 @@ MDNode *MDBuilder::createFunctionEntryCount(
return MDNode::get(Context, Ops);
}
-MDNode *MDBuilder::createFunctionSectionPrefix(StringRef Prefix) {
- return MDNode::get(
- Context, {createString("function_section_prefix"), createString(Prefix)});
+MDNode *MDBuilder::createGlobalObjectSectionPrefix(StringRef Prefix) {
+ return MDNode::get(Context,
+ {createString("section_prefix"), createString(Prefix)});
}
MDNode *MDBuilder::createRange(const APInt &Lo, const APInt &Hi) {
diff --git a/llvm/test/CodeGen/X86/data-section-prefix.ll b/llvm/test/CodeGen/X86/data-section-prefix.ll
new file mode 100644
index 000000000000000..4812fc70758fbce
--- /dev/null
+++ b/llvm/test/CodeGen/X86/data-section-prefix.ll
@@ -0,0 +1,27 @@
+; RUN: llc -mtriple x86_64-linux-gnu -data-sections %s -o - | FileCheck %s --check-prefix=ELF
+; RUN: llc -mtriple x86_64-linux-gnu -unique-section-names=0 -data-sections %s -o - | FileCheck %s --check-prefix=ELF-NOUNIQ
+
+; RUN: llc -mtriple x86_64-windows-msvc -data-sections %s -o - | FileCheck %s --check-prefix=COFF-MSVC
+
+; ELF: .section .data.hot.foo,
+; ELF: .section .data.bar,
+; ELF: .section .bss.unlikely.baz,
+; ELF: .section .bss.quz,
+
+; ELF-NOUNIQ: .section .data.hot.,"aw",@progbits,unique,1
+; ELF-NOUNIQ: .section .data,"aw",@progbits,unique,2
+; ELF-NOUNIQ: .section .bss.unlikely.,"aw",@nobits,unique,3
+; ELF-NOUNIQ: .section .bss,"aw",@nobits,unique,4
+
+; COFF-MSVC: .section .data,"dw",one_only,foo
+; COFF-MSVC: .section .data,"dw",one_only,bar
+; COFF-MSVC: .section .bss,"bw",one_only,baz
+; COFF-MSVC: .section .bss,"bw",one_only,quz
+
+@foo = global i32 1, !section_prefix !0
+@bar = global i32 2
+@baz = global i32 0, !section_prefix !1
+@quz = global i32 0
+
+!0 = !{!"section_prefix", !"hot"}
+!1 = !{!"section_prefix", !"unlikely"}
diff --git a/llvm/test/CodeGen/X86/global-variable-partition.ll b/llvm/test/CodeGen/X86/global-variable-partition.ll
new file mode 100644
index 000000000000000..bb77f3362406bdc
--- /dev/null
+++ b/llvm/test/CodeGen/X86/global-variable-partition.ll
@@ -0,0 +1,165 @@
+
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu -enable-split-machine-functions \
+; RUN: -partition-static-data-sections=true -data-sections=true \
+; RUN: -unique-section-names=true -relocation-model=pic \
+; RUN: %s -o - 2>&1 | FileCheck %s --check-prefixes=SYM,DATA
+
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu -enable-split-machine-functions \
+; RUN: -partition-static-data-sections=true -data-sections=true \
+; RUN: -unique-section-names=false -relocation-model=pic \
+; RUN: %s -o - 2>&1 | FileCheck %s --check-prefixes=UNIQ,DATA
+
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu -enable-split-machine-functions \
+; RUN: -partition-static-data-sections=true -data-sections=false \
+; RUN: -unique-section-names=false -relocation-model=pic \
+; RUN: %s -o - 2>&1 | FileCheck %s --check-prefixes=AGG,DATA
+
+; SYM: .section .rodata.str1.1.hot.
+; UNIQ: .section .rodata.str1.1.hot.,"aMS",@progbits,1
+; AGG: .section .rodata.str1.1.hot
+; DATA: .L.str
+; DATA: "hot\t"
+; DATA: .L.str.1
+; DATA: "%d\t%d\t%d\n"
+
+
+; SYM: .sect...
[truncated]
|
cc @Colibrow |
This is a split of #125756
Is there a reason for using |
std::optional<uint64_t> Count = std::nullopt; | ||
if (!Op.isJTI() && !Op.isGlobal()) | ||
continue; | ||
|
||
auto Hotness = MachineFunctionDataHotness::Hot; | ||
Count = MBFI->getBlockProfileCount(&MBB); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::optional<uint64_t> Count = std::nullopt; | |
if (!Op.isJTI() && !Op.isGlobal()) | |
continue; | |
auto Hotness = MachineFunctionDataHotness::Hot; | |
Count = MBFI->getBlockProfileCount(&MBB); | |
if (!Op.isJTI() && !Op.isGlobal()) | |
continue; | |
std::optional<uint64_t> Count = MBFI->getBlockProfileCount(&MBB); |
is that OK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes the change is ok.
I kept getting "Applying suggestions on deleted lines is currently not supported." error when trying to commit the suggestion. I'll make the change in the local client..
@foo = global i32 1, !section_prefix !0 | ||
@bar = global i32 2 | ||
@baz = global i32 0, !section_prefix !1 | ||
@quz = global i32 0 | ||
|
||
!0 = !{!"section_prefix", !"hot"} | ||
!1 = !{!"section_prefix", !"unlikely"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if data_section_prefix would make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function and global variables are derived classes of GlobalObject, and currently function section prefix metadata has function_section_prefix
.
To me, data_section_prefix
carries the semantic that a global variable/object is a piece of data, which may become inaccurate. Alternative to setting global object's section prefix, I can keep function's methods as they are in TOT, and implements {set,get,update}SectionPrefix
for GlobalVariables (not its superclass GlobalObject) and name it global_variable_section_prefix
. How does that sound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. I get it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason for using .unlikely as a prefix rather than .cold? .cold seems more obvious as an opposite of .hot and we use it elsewhere in LLVM, most notably hot-cold splitting.
@petrhosek Thanks for the question. unlikely
is used to follow the conventions of functions in the codegenprepare pass. I'm open to name changes.
If identical section names (as string literals) are not de-duplicated in the relocatable object files (e.g., when compiled with -data-sections=true -unique-section-names=false
), cold
will give smaller section names than unlikely
per data section where this matters.
std::optional<uint64_t> Count = std::nullopt; | ||
if (!Op.isJTI() && !Op.isGlobal()) | ||
continue; | ||
|
||
auto Hotness = MachineFunctionDataHotness::Hot; | ||
Count = MBFI->getBlockProfileCount(&MBB); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes the change is ok.
I kept getting "Applying suggestions on deleted lines is currently not supported." error when trying to commit the suggestion. I'll make the change in the local client..
@foo = global i32 1, !section_prefix !0 | ||
@bar = global i32 2 | ||
@baz = global i32 0, !section_prefix !1 | ||
@quz = global i32 0 | ||
|
||
!0 = !{!"section_prefix", !"hot"} | ||
!1 = !{!"section_prefix", !"unlikely"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function and global variables are derived classes of GlobalObject, and currently function section prefix metadata has function_section_prefix
.
To me, data_section_prefix
carries the semantic that a global variable/object is a piece of data, which may become inaccurate. Alternative to setting global object's section prefix, I can keep function's methods as they are in TOT, and implements {set,get,update}SectionPrefix
for GlobalVariables (not its superclass GlobalObject) and name it global_variable_section_prefix
. How does that sound?
I believe the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Just some minor comments.
assert(GV && "Caller guaranteed"); | ||
|
||
// Skip LLVM reserved symbols. | ||
if (GV->getName().starts_with("llvm.")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you please explain why need to skip LLVM reserved symbols?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. The globals with llvm.
as a name prefix are usually handled specially, by many (middle-end and back-end) passes (e.g., AsmPrinter emits special LLVM values specially), and they are skipped in this pass mostly out of conservativeness.
I moved this check before calling inStaticDataSection
helper though for readability, and added a comment around the check. What do you think about this?
// framework, and set global variable section prefix once per module after | ||
// analyzing all machine functions. | ||
if (PSI->isColdCount(Iter->second.getZExtValue())) { | ||
Changed |= GV.updateSectionPrefix("unlikely", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not prefix "split" instead of "unlikely" to align with "MachineFunctionSplitter"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that functions have .hot
or .unlikely
prefixes when they are placed as a whole (in the codegenprepare pass), and have .split
as a prefix when basic blocks of a function are further categorized into hot and non-hot ones (in the MFS pass you mentioned).
Here the granularity is a piece of global variable, and .split
would be a good name if a global variable itself is split into hot and cold ones. For instance, there should be hot and cold functions in a vtable, but it'd take substantial work to implement an ABI to support the split vtable entries.
} | ||
} | ||
} | ||
} | ||
return NumChangedJumpTables > 0; | ||
} | ||
|
||
const GlobalVariable * | ||
StaticDataSplitter::getLocalLinkageGlobalVariable(const GlobalValue *GV) { | ||
if (!GV || GV->isDeclarationForLinker()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can "GV->hasLocalLinkage()" cover "GV->isDeclarationForLinker()"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. I haven't thought about it before.
It turns out IR verifier requires a declaration to have one of 'valid' decl linkages, and valid decl linkages must be at least external according to isValidDeclarationLinkage.
I removed GV->isDeclarationForLinker()
here and added a comment.
// framework, and set global variable section prefix once per module after | ||
// analyzing all machine functions. | ||
if (PSI->isColdCount(Iter->second.getZExtValue())) { | ||
Changed |= GV.updateSectionPrefix("unlikely", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updateSectionPrefix returns void so is this UB? Regardless, I think we can drop the updateSectionPrefix helper in #125757.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
const GlobalVariable *getLocalLinkageGlobalVariable(const GlobalValue *GV); | ||
|
||
// Returns true if the global variable is in one of {.rodata, .bss, .data, | ||
// .data.rel.ro} sections |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: period at the end like the other comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
llvm/include/llvm/IR/GlobalObject.h
Outdated
|
||
/// Update the section prefix, unless the existing prefix is the same as | ||
/// `KeepPrefix`. | ||
bool updateSectionPrefix(StringRef Prefix, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't forget to drop this after merging #125757.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
It occurred to me that counters will increase monotonically (before it becomes max), so I asserted a global variable should not have .hot
prefix under if (PSI->isColdCount(Iter->second)) {
around StaticDataSplitter.cpp L 224.
bool updateGlobalVariableSectionPrefix(MachineFunction &MF); | ||
|
||
// Accummulated data profile count across machine functions in the module. | ||
DenseMap<const GlobalVariable *, APInt> DataProfileCounts; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a strong reason to use APInt as the value? When passed to the isColdCount
API we extract the value as a uint64_t anyway (with an assertion). I think if you are concerned about overflows then we can use AddOverflow
from Support/MathExtras.h
on L163 and add an assert. Additionally, there we could use a saturating count using getInstrMaxCountValue
as the limit if we do have cases which overflow. Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a good idea to make counters uint64_t
and clamp overflows as max. Done with saturating add and clamping to getInstrMaxCountValue
. Also removed #include "llvm/ADT/APInt.h"
Hotness = MachineFunctionDataHotness::Cold; | ||
|
||
if (MJTI->updateJumpTableEntryHotness(JTI, Hotness)) | ||
++NumChangedJumpTables; | ||
} else if (Op.isGlobal()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to check Op.isGlobal()
again? I think we can only get past L128 if Op is JTI or global and we check JTI on L133.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not need for now (i.e., before Op.isCPI
is used to handle constant pools) as you point out, so I removed it.
…lObjects, the base class of {Function, GlobalVariable, IFunc} (#125757) This is a split of llvm/llvm-project#125756
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the reviews!
const GlobalVariable *getLocalLinkageGlobalVariable(const GlobalValue *GV); | ||
|
||
// Returns true if the global variable is in one of {.rodata, .bss, .data, | ||
// .data.rel.ro} sections |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
llvm/include/llvm/IR/GlobalObject.h
Outdated
|
||
/// Update the section prefix, unless the existing prefix is the same as | ||
/// `KeepPrefix`. | ||
bool updateSectionPrefix(StringRef Prefix, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
It occurred to me that counters will increase monotonically (before it becomes max), so I asserted a global variable should not have .hot
prefix under if (PSI->isColdCount(Iter->second)) {
around StaticDataSplitter.cpp L 224.
bool updateGlobalVariableSectionPrefix(MachineFunction &MF); | ||
|
||
// Accummulated data profile count across machine functions in the module. | ||
DenseMap<const GlobalVariable *, APInt> DataProfileCounts; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a good idea to make counters uint64_t
and clamp overflows as max. Done with saturating add and clamping to getInstrMaxCountValue
. Also removed #include "llvm/ADT/APInt.h"
Hotness = MachineFunctionDataHotness::Cold; | ||
|
||
if (MJTI->updateJumpTableEntryHotness(JTI, Hotness)) | ||
++NumChangedJumpTables; | ||
} else if (Op.isGlobal()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not need for now (i.e., before Op.isCPI
is used to handle constant pools) as you point out, so I removed it.
} | ||
} | ||
} | ||
} | ||
return NumChangedJumpTables > 0; | ||
} | ||
|
||
const GlobalVariable * | ||
StaticDataSplitter::getLocalLinkageGlobalVariable(const GlobalValue *GV) { | ||
if (!GV || GV->isDeclarationForLinker()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. I haven't thought about it before.
It turns out IR verifier requires a declaration to have one of 'valid' decl linkages, and valid decl linkages must be at least external according to isValidDeclarationLinkage.
I removed GV->isDeclarationForLinker()
here and added a comment.
assert(GV && "Caller guaranteed"); | ||
|
||
// Skip LLVM reserved symbols. | ||
if (GV->getName().starts_with("llvm.")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. The globals with llvm.
as a name prefix are usually handled specially, by many (middle-end and back-end) passes (e.g., AsmPrinter emits special LLVM values specially), and they are skipped in this pass mostly out of conservativeness.
I moved this check before calling inStaticDataSection
helper though for readability, and added a comment around the check. What do you think about this?
// framework, and set global variable section prefix once per module after | ||
// analyzing all machine functions. | ||
if (PSI->isColdCount(Iter->second.getZExtValue())) { | ||
Changed |= GV.updateSectionPrefix("unlikely", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
// framework, and set global variable section prefix once per module after | ||
// analyzing all machine functions. | ||
if (PSI->isColdCount(Iter->second.getZExtValue())) { | ||
Changed |= GV.updateSectionPrefix("unlikely", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that functions have .hot
or .unlikely
prefixes when they are placed as a whole (in the codegenprepare pass), and have .split
as a prefix when basic blocks of a function are further categorized into hot and non-hot ones (in the MFS pass you mentioned).
Here the granularity is a piece of global variable, and .split
would be a good name if a global variable itself is split into hot and cold ones. For instance, there should be hot and cold functions in a vtable, but it'd take substantial work to implement an ABI to support the split vtable entries.
This is a split of llvm#125756
…he base class of {Function, GlobalVariable, IFunc} (llvm#125757) This is a split of llvm#125756
* In StaticDataProfileInfo.h/cpp, add an immutable pass to keep track of constants and their profile information across functions in a module. * Add a module pass, StaticDataAnnotator, to set global variable's section prefix based on module-wide hotness.
In this PR, static-data-splitter pass finds out the local-linkage global variables in {
.rodata
,.data.rel.ro
,bss
,.data
} sections by analyzing machine instruction operands, and aggregates their accesses from code across functions.A follow-up item is to analyze global variable initializers and count for access from data.
bss2
anddata3
inllvm/test/CodeGen/X86/global-variable-partition.ll
.Some stats of static-data-splitter with this patch:
#perf-sample-in-hot-prefixed <data> section / #perf-sample in <data.*> section
for each section.MEM_INST_RETIRED.ALL_LOADS:u:pinned:precise=2
events at a high frequency (perf -c 2251
) for 30 seconds. The profiled binary is built as non-PIE sodata.rel.ro
coverage data is not available.<data>
section size percentage is defined asunlikely <data> section size / the sum size of <data>.* sections
for each<data>
section