Add ECAL portable data formats, collections and conditions for alpaka #42930
type ecal |
enable gpu |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-42930/37074
A new Pull Request was created by @thomreis (Thomas Reis) for master. It involves the following packages:
@mandrenguyen, @civanch, @jfernan2, @cmsbuild, @mdhildreth, @francescobrivio, @saumyaphor4252, @perrotta, @consuegs can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
please test |
assign heterogeneous
<use name="HeterogeneousCore/CUDACore"/> | ||
<use name="HeterogeneousCore/CUDAUtilities"/> | ||
<use name="boost"/> | ||
<use name="boost_serialization"/> | ||
<use name="rootmath"/> | ||
<use name="clhep"/> | ||
<use name="cuda"/> | ||
<use name="HeterogeneousCore/AlpakaCore"/> |
This dependence should not be needed in data format package.
<use name="HeterogeneousCore/CUDACore"/> | ||
<use name="HeterogeneousCore/CUDAUtilities"/> | ||
<use name="boost"/> | ||
<use name="boost_serialization"/> | ||
<use name="rootmath"/> | ||
<use name="clhep"/> | ||
<use name="cuda"/> | ||
<use name="HeterogeneousCore/AlpakaCore"/> | ||
<use name="HeterogeneousCore/AlpakaInterface"/> | ||
<flags ALPAKA_BACKENDS="1"/> |
This should be
<flags ALPAKA_BACKENDS="1"/> | |
<flags ALPAKA_BACKENDS="cuda rocm"/> |
or (in a recent IB)
<flags ALPAKA_BACKENDS="1"/> | |
<flags ALPAKA_BACKENDS="!serial"/> |
Rebased to yesterday's IB and implemented the second option.
DataFormats/EcalDigi/BuildFile.xml
Outdated
<use name="FWCore/MessageLogger"/> | ||
<use name="FWCore/Utilities"/> | ||
<use name="HeterogeneousCore/AlpakaInterface"/> | ||
<flags ALPAKA_BACKENDS="cuda rocm"/> |
In a recent IB could be also
<flags ALPAKA_BACKENDS="cuda rocm"/> | |
<flags ALPAKA_BACKENDS="!serial"/> |
GENERATE_SOA_LAYOUT(EcalDigiPhase2SoALayout, | ||
SOA_COLUMN(uint32_t, id), | ||
SOA_COLUMN(DataArrayPhase2Struct, data), |
The sampleSize seems to be 16 (phase1) or 10 (phase2), so this column has 16-or-10 2-byte sub-elements per SoA element (i.e. 32-or-20 bytes).
How is this data structure accessed by the consuming code? Does the consuming code use one GPU thread per element of this SoA? If yes, I wonder whether it would be more performant to split this column into sampleSize arrays with Eigen vectors. In principle that should yield a memory access pattern that is more coalesced than this one.
(As far as I'm concerned this optimization can be done later; it would anyway be better to study the impact on performance than to rely on anything I wrote above off the top of my head.)
(This pattern repeats in other SoAs as well; I'm not commenting on them in order to have one place to discuss it.)
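To make the coalescing argument concrete, here is a minimal sketch in plain C++ (no alpaka, no SoA macros) that compares where two adjacent threads would read the same sample under each layout. The names kSamples, indexArrayColumn and indexSplitColumns are hypothetical and not part of the PR, the sample count of 10 is only illustrative, and SoA alignment padding is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative sample count only (assumption); the real value comes from the ECAL data format.
constexpr std::size_t kSamples = 10;

// Layout A: one SoA column whose element is an array of kSamples samples.
// Sample s of channel ch lives at ch * kSamples + s (in units of uint16_t).
inline std::size_t indexArrayColumn(std::size_t channel, std::size_t sample) {
  assert(sample < kSamples);
  return channel * kSamples + sample;
}

// Layout B: the samples are split into kSamples separate columns,
// each holding one entry per channel.
inline std::size_t indexSplitColumns(std::size_t channel, std::size_t sample, std::size_t nChannels) {
  assert(sample < kSamples && channel < nChannels);
  return sample * nChannels + channel;
}

// Distance in bytes between the addresses read by two adjacent threads
// (channel 0 and channel 1) when both read sample 0.
inline std::size_t strideBytesArrayColumn() {
  return (indexArrayColumn(1, 0) - indexArrayColumn(0, 0)) * sizeof(std::uint16_t);
}

inline std::size_t strideBytesSplitColumns(std::size_t nChannels) {
  return (indexSplitColumns(1, 0, nChannels) - indexSplitColumns(0, 0, nChannels)) * sizeof(std::uint16_t);
}
```

With one thread per channel, layout A makes neighbouring threads read addresses kSamples * 2 bytes apart, while layout B makes them read consecutive 2-byte values. Whether that difference matters in practice would still have to be measured.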
It should be the other way around, in fact: 10 elements for Phase 1 and 16 for Phase 2.
At the moment one thread uses all 10 or 16 elements to calculate the weighted sum in the Phase 2 case.
I am not sure how the fitting in the multifit algorithm works, but I think it also uses all samples to perform the fit in one thread.
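The access pattern described here, with one thread consuming all samples of one channel, can be sketched on the host as follows. This is only an illustration under stated assumptions: kNSamples, weightedSum and amplitudesPerChannel are hypothetical names, the weights are passed in as a plain vector, and the loop over channels stands in for what would be one GPU thread per channel. It is not the reconstruction code of the planned follow-up PRs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// 16 samples per channel, as stated above for Phase 2.
constexpr std::size_t kNSamples = 16;
using Samples = std::array<std::uint16_t, kNSamples>;

// Weighted sum over all samples of a single channel.
inline float weightedSum(const Samples& samples, const std::vector<float>& weights) {
  assert(weights.size() == kNSamples);
  float sum = 0.f;
  for (std::size_t s = 0; s < kNSamples; ++s) {
    sum += weights[s] * static_cast<float>(samples[s]);
  }
  return sum;
}

// One loop iteration per channel; on a GPU each iteration would be one thread
// that reads the whole sample array of its channel.
inline std::vector<float> amplitudesPerChannel(const std::vector<Samples>& data,
                                               const std::vector<float>& weights) {
  std::vector<float> out(data.size());
  for (std::size_t ch = 0; ch < data.size(); ++ch) {
    out[ch] = weightedSum(data[ch], weights);
  }
  return out;
}
```

Because every thread walks through its own contiguous block of samples, neighbouring threads touch addresses a full sample array apart at each step, which is the situation the coalescing question above refers to.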
targetClass="ecal::DigiHostCollection" | ||
version="[1-]" | ||
source="ecal::EcalDigiSoA layout_;" | ||
target="buffer_" |
This should be
target="buffer_" | |
target="buffer_,layout_,view_" |
targetClass="ecal::DigiPhase2HostCollection" | ||
version="[1-]" | ||
source="ecal::EcalDigiPhase2SoA layout_;" | ||
target="buffer_" |
This should be
target="buffer_" | |
target="buffer_,layout_,view_" |
targetClass="ecal::UncalibratedRecHitHostCollection" | ||
version="[1-]" | ||
source="ecal::EcalUncalibratedRecHitSoA layout_;" | ||
target="buffer_" |
This should be
target="buffer_" | |
target="buffer_,layout_,view_" |
-1
Failed Tests: HeaderConsistency UnitTests RelVals-GPU
Unit Tests
I found 1 errors in the following unit tests:
---> test CondToolsLHCInfoNewPopConTest had ERRORS
RelVals-GPU
Comparison Summary
These failures should be fixed in the next IB that includes #42916
a9d393c to 5756e0f
+1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a3482e/35386/summary.html
Comparison Summary
GPU Comparison Summary
I was replying to the other Andrea, not to your review.
Is this not the case here? The array sizes are defined in the conditions format SoA definitions and will be used in the other packages to do operations with the conditions. For example in the ESProducer inside the |
+heterogeneous |
+1 |
Hi @cms-sw/simulation-l2 @cms-sw/reconstruction-l2 do you have any more comments on this PR? |
+1 |
Hi @cms-sw/reconstruction-l2 do you have any further comments? |
+1 |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @antoniovilela, @rappoccio (and backports should be raised in the release meeting by the corresponding L2) |
+1 |
PR description:
This PR is the first in a planned series of three to migrate the ECAL local reconstruction for Run 3 and Phase 2 to alpaka.
It adds the conditions formats and the SoA data formats, each together with its portable collections.
Conditions:
SoA data formats:
All developments of @valsdav, @Jakub-Gajownik, and @thomreis have been squashed into one commit for clarity.
Since this PR only adds collections for future use in the coming commits, no changes are expected from it.
PRs 2 and 3 will be made to add the Phase 2 local reconstruction code and multifit algorithm migrations, respectively.
PR validation:
Passes GPU WFs 12434.512 and 24834.612