-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dead modules #319
Dead modules #319
Conversation
link to cmssw PR: trackreco/cmssw#24 |
As far as I can tell, the -7 hits should be properly accounted for in STD as well. I did not test this with specific track friends, but the relevant code is here: https://github.com/trackreco/mkFit/blob/devel/mkFit/HitStructures.h#L608 It's hard to tell if this PR makes the difference between STD and CE worse. I know differences were introduced in the charge flip PR, which I'm revisiting and trying to debug now. |
deadvectors[33].push_back({1.29496,1.3754,88.9013,109.39}); | ||
deadvectors[33].push_back({1.83039,2.09648,32.5881,41.4833}); | ||
deadvectors[33].push_back({-1.34772,-1.23286,50.5025,62.0376}); | ||
deadvectors[33].push_back({3.14024,-3.02807,50.5027,62.0379}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this will cause trouble.
Should we handle it in producer (create two phi regions on each side of the divide) or in the SuckInDeads()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is that will cause troubles?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{3.14024,-3.02807,50.5027,62.0379} ... wrap around pi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This actually needs to be fixed before merge. Giuseppe, I'd propose to do it in the generator, then it will be automatically fixed for cmssw as well, right?
Something like:
if (phi_low > phi_high) { // Need more sanity checks?
emit_for(phi_low, Config::PI, ...);
emit_for(-Config::PI, phi_high, ....);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not wrapping around pi=3.14159265359. I am not sure this is an issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{3.14024,-3.02807,...}
translates to bin numbers ~127 and ~0, respectively. The fill loop ends up with (i=127; i<0; ++i)
:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a fix that handles this in LayerOfHits::SuckInDeads(). You decide if you want to keep it like this or correct it in the producer -- probably it depends on how this is handled in CMSSW.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks! Yes having it here seems safe.
@@ -341,6 +341,10 @@ void test_standard() | |||
|
|||
StdSeq::LoadHits(ev, eoh); | |||
|
|||
std::vector<DeadVec> deadvectors(ev.layerHits_.size()); | |||
#include "deadmodules.h" | |||
StdSeq::LoadDeads(eoh, deadvectors); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could in principle call this just once where EventOfHits objects are created, one for each event of flight.
[ or even share vecvecPhiBinDead_t m_phi_bin_deads
via a const ptr ... this has the added benefit that if the ptr is null (not set), we know dead regions are not being used ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to say, Slava says pixel dead regions change more frequently so for those we still need a per instance / per event update -- but this probably only applies to CMSSW implementation.
deadvectors[70].push_back({1.79387,1.90873,72.787,91.2269}); | ||
deadvectors[70].push_back({-2.8063,-2.69143,72.7778,91.2158}); | ||
deadvectors[70].push_back({2.39452,2.47495,88.9094,109.398}); | ||
deadvectors[70].push_back({-1.53249,-1.45203,88.8906,109.379}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting that there are no dead regions for layer 71.
@@ -48,6 +48,10 @@ typedef std::array<PhiBinInfo_t, Config::m_nphi> vecPhiBinInfo_t; | |||
|
|||
typedef std::vector<vecPhiBinInfo_t> vecvecPhiBinInfo_t; | |||
|
|||
typedef std::array<bool, Config::m_nphi> vecPhiBinDead_t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Config::m_nphi = 128, 6.28/128 = .04906
Should we consider trying to increase phi-bins to 256 to reduce false identification of bad hits? IIRC, Slava said there is little speed change if one varies this variable. It was 1024 before Slava's last changes here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at a radius of 1.1 m (roughly the last layer, TOB) 128 bins corresponds to 5.4 cm.
TOB modules are 9 cm. So, a phi bin is close to a half of the module or less.
TIB modules are 6 cm (the last TIB radius is 50 cm; a phi bin size at this radius is 2.5 cm).
BPIX is 1.6 cm (the last layer is 16 cm; a bin size is 0.8 cm).
So, unless this technology is going to appropriately pick up fractions of modules we will just be repeating the values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the overall structure of the bins is already in place, repeating values means just a bool and is thus a very small effect. So it's good that the bin size is smaller than the module size, I was more concerned that we could over-assign invalid hits. To me this means that the dead modules do not require an increase in phi bins, but if we choose to do so that is not going to create problems either
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
repeating values means just a bool and is thus a very small effect
Just to note that std::array<bool, N>
takes one byte per element (in case with you were after "bit per element" with "just a bool"). Nevertheless a byte per element could be ok storage-wise (128 bytes per layer if I guess correctly?), and avoids bit shifts (although their real cost in modern hardware is unclear to me).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I assumed one bit. Thanks for clarifying. In any case it's still a minimal footprint (if we were to create a dedicated bin structure it would necessarily be larger, or even if we were to add many dummy hits as in the first implementation by Allie)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually even more pronounced for q-bins. We have both a) too large q bins (at least in some layers), and b) earch an extra q-bin up/down. So we effectively pick up dead regions way too often.
Maybe we could restrict this matching at least in q, unless the track passes relatively close to the edge of the bin.
We should re-check q-bins as well. @mmasciov haven't I tried to stick it on you a couple of months back? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it's 128 bytes * N_q_bins. We should switch to bits ... that part of code is heavily L1 limited.
@@ -411,6 +411,12 @@ void MkFinder::SelectHitIndices(const LayerOfHits &layer_of_hits, | |||
{ | |||
const int pb = pi & L.m_phi_mask; | |||
|
|||
if (L.m_phi_bin_deads[qi][pb] == true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is checked for ALL phi/q bins within (potentially large?) search windows. This means, if we find a hit, that the missed-hit candidate will be generated as being in-gap -- i.e., without the missing hit penalty.
Should we only set this if no hits are found? Or if the found hit is more than some dphi-sigma away?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this is conservative. But I think CMSSW does the same. If there is a dead region within the search window you have no guarantee that the track did not pass through it - even if you find a hit it could be an accidental match. This does not mean that we cannot be more creative than CMSSW, but I think this is a reasonable starting point.
Account for dead strip modules in mkFit. A bool for each phi-q bin is stores the information about dead modules and is checked in SelectHitIndices. If there is a dead module in the bin, an invalid hit with with idx=-7 is created.
For standalone, the list is currently taken from Run3 release (/data2/slava77/samples/CMSSW_11_2_0-orig/2021/11834.0_TTbar_14TeV+2021/AVE_50_BX01_25ns/step2_sorted.root) and stored in deadmodules.h.
This is probably working only for CloneEngine for now. In Standard, the -7 hits are added but I am not sure they are accounted for properly downstream (e.g. in the score).
For CMSSW integrated version, the list is taken at run time from the GlobalTag (a corresponding PR to trackreco/CMSSW will be made shortly).
Validation plots.
Standalone: https://cerati.web.cern.ch/cerati/dead-modules/
CMSSW: http://uaf-10.t2.ucsd.edu/~cerati/deadmodules/