Minor fixes to variables for lepton MVA #45860

Merged
merged 1 commit into from
Sep 4, 2024

namapane
Contributor

@namapane namapane commented Sep 2, 2024

PR description:

This PR is a follow-up to #45754, moved to 14_2_X as requested.

We recently realized that the inputs for the muonPROMPTMVA (and likewise for electronPROMPTMVA) are almost, but not fully recoverable from nanoAODs. This means that it is not possible to check data/MC agreement for input variables from central productions, nor to test new trainings.
This can be fixed easily and cheaply with two small changes (using muons for illustration, same applies to electrons):

  • add 1 float variable Muon_jetDF, corresponding to the LepGood_jetDF MVA input
  • fix an inconsistency in the definition of Muon_jetRelIso with respect to the variable it is intended to correspond to, LepGood_jetPtRatio. With the current definition, jetRelIso = (1/ptRatio-1) if a jet is present, pfRelIso04_all otherwise. Unfortunately, the actual MVA input variable is defined slightly differently, with a max(ptRatio, 1.5) applied when a jet is associated. Since it is not possible to tell unambiguously from the stored value whether a jet was associated, recovering the exact definition of LepGood_jetPtRatio that was used for the MVA is tricky.

Our proposal is to store jetRelIso=(1/ptRatio-1) but with a default of -1 if no jet is matched. This has some advantages:

  • much easier to recover the MVA input variable correctly
  • marks the case where no jet is found unambiguously (-1)
  • does not mix ptRatio with pfRelIso04_all, which is already available in its own variable. This improves clarity and also saves some disk space, as -1 compresses better
  • if anybody needs Muon_jetRelIso as defined now, it can still be computed easily (just pick pfRelIso04_all when no jet is present)
  • the same considerations apply to the corresponding Electron variables. In this case, Electron_pfRelIso04_all is also added, since this variable, used in Electron_jetRelIso, was not yet present.
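
To make the two definitions concrete, here is a minimal Python sketch of the jetRelIso value before and after this PR, and of how the old value and ptRatio could be recovered offline. It only illustrates the logic described above and is not the CMSSW implementation. The function names are hypothetical, and a missing jet is represented here by `None`.

```python
from typing import Optional

NO_JET = -1.0  # default proposed in this PR when no jet is matched


def jet_rel_iso_old(pt_ratio: Optional[float], pf_rel_iso04_all: float) -> float:
    """Definition before this PR: falls back to pfRelIso04_all without a jet."""
    if pt_ratio is None:
        return pf_rel_iso04_all
    return 1.0 / pt_ratio - 1.0


def jet_rel_iso_new(pt_ratio: Optional[float]) -> float:
    """Proposed definition: -1 flags the no-jet case unambiguously."""
    if pt_ratio is None:
        return NO_JET
    return 1.0 / pt_ratio - 1.0


def recover_old_definition(jet_rel_iso: float, pf_rel_iso04_all: float) -> float:
    """Rebuild the previous jetRelIso from the new one plus pfRelIso04_all."""
    return pf_rel_iso04_all if jet_rel_iso == NO_JET else jet_rel_iso


def recover_pt_ratio(jet_rel_iso: float) -> Optional[float]:
    """Invert jetRelIso = 1/ptRatio - 1; returns None when no jet was matched."""
    if jet_rel_iso == NO_JET:
        return None
    return 1.0 / (1.0 + jet_rel_iso)


if __name__ == "__main__":
    with_jet = jet_rel_iso_new(0.5)      # lepton carries half of the jet pt
    without_jet = jet_rel_iso_new(None)  # no jet matched
    print(with_jet, recover_pt_ratio(with_jet))
    print(without_jet, recover_old_definition(without_jet, 0.12))
```

Since 1/ptRatio - 1 can only equal -1 for an infinite ptRatio, the sentinel cannot collide with a value computed from a real jet, which is what makes the inversion unambiguous.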

PR validation:

  • tested by processing a DYJets sample and checking distributions and sizes with inspectNanoFile.py [size report]
  • The addition of Muon_jetDF and Electron_jetDF costs 2.2 b/item each
  • Muon_jetPtRatio and Electron_jetPtRatio take less space (-0.2 and -0.1 b/item respectively), because we now store -1 when no associated jet is found
  • Electron_pfRelIso04_all costs 2.5 b/item
  • The overall effect of this proposed fix is +2.0 b/muon and +4.6 b/electron.
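
As a cross-check of the totals quoted above, the per-branch numbers can be summed directly. This is only an arithmetic sketch: the dictionary keys are descriptive labels chosen for illustration, and the values are the b/item figures from the list above.

```python
# Per-branch size changes in bytes per item, taken from the validation notes.
muon_changes = {
    "jetDF (new branch)": 2.2,
    "jet-related branch now storing -1 without a jet": -0.2,
}
electron_changes = {
    "jetDF (new branch)": 2.2,
    "jet-related branch now storing -1 without a jet": -0.1,
    "pfRelIso04_all (new branch)": 2.5,
}


def net_change(changes: dict) -> float:
    """Sum the per-branch changes, rounded to the precision quoted in the PR."""
    return round(sum(changes.values()), 1)


if __name__ == "__main__":
    print("muon:", net_change(muon_changes), "b/item")
    print("electron:", net_change(electron_changes), "b/item")
```

The sums come out to 2.0 b/muon and 4.6 b/electron, matching the overall effect stated in the last bullet.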

@cmsbuild
Contributor

cmsbuild commented Sep 2, 2024

cms-bot internal usage

@cmsbuild
Contributor

cmsbuild commented Sep 2, 2024

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-45860/41599

@cmsbuild
Contributor

cmsbuild commented Sep 2, 2024

A new Pull Request was created by @namapane for master.

It involves the following packages:

  • PhysicsTools/NanoAOD (xpog)

@cmsbuild, @ftorrresd, @hqucms, @vlimant can you please review it and eventually sign? Thanks.
@AnnikaStein, @gpetruc this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@hqucms
Contributor

hqucms commented Sep 2, 2024

enable nano

@hqucms
Contributor

hqucms commented Sep 2, 2024

please test

@cmsbuild
Contributor

cmsbuild commented Sep 2, 2024

+1

Size: This PR adds an extra 40KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-5af2d8/41227/summary.html
COMMIT: 71143ae
CMSSW: CMSSW_14_2_X_2024-09-02-1100/el8_amd64_gcc12
Additional Tests: NANO
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/45860/41227/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 1 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 276 differences found in the comparisons
  • DQMHistoTests: Total files compared: 44
  • DQMHistoTests: Total histograms compared: 3328276
  • DQMHistoTests: Total failures: 4622
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3323634
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 9.165999999999997 KiB( 43 files compared)
  • DQMHistoSizes: changed ( 11634.0,... ): 0.709 KiB Physics/NanoAODDQM
  • DQMHistoSizes: changed ( 13234.0,... ): 0.467 KiB Physics/NanoAODDQM
  • Checked 193 log files, 163 edm output root files, 44 DQM output files
  • TriggerResults: no differences found

NANO Comparison Summary

Summary:

  • You potentially added 998 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 40 differences found in the comparisons
  • DQMHistoTests: Total files compared: 21
  • DQMHistoTests: Total histograms compared: 54907
  • DQMHistoTests: Total failures: 47
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 54860
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 8.823999999999998 KiB( 20 files compared)
  • DQMHistoSizes: changed ( 2500.001,... ): 1.113 KiB Physics/NanoAODDQM
  • DQMHistoSizes: changed ( 2500.011,... ): 0.733 KiB Physics/NanoAODDQM
  • DQMHistoSizes: changed ( 2500.101,... ): 0.709 KiB Physics/NanoAODDQM
  • DQMHistoSizes: changed ( 2500.111,... ): 0.467 KiB Physics/NanoAODDQM
  • Checked 102 log files, 58 edm output root files, 21 DQM output files
  • TriggerResults: no differences found

Nano size comparison Summary:

| Sample | kb/ev | ref kb/ev | diff kb/ev | ev/s/thd | ref ev/s/thd | diff rate | mem/thd | ref mem/thd |
|---|---|---|---|---|---|---|---|---|
| 2500.001 | 2.798 | 2.788 | 0.009 ( +0.3% ) | 3.34 | 3.35 | -0.4% | 6.039 | 6.115 |
| 2500.002 | 2.908 | 2.901 | 0.007 ( +0.3% ) | 2.98 | 3.00 | -0.5% | 6.378 | 6.403 |
| 2500.003 | 2.856 | 2.848 | 0.008 ( +0.3% ) | 3.08 | 3.11 | -0.7% | 6.359 | 6.360 |
| 2500.011 | 1.458 | 1.450 | 0.008 ( +0.6% ) | 5.79 | 5.84 | -0.8% | 2.401 | 2.403 |
| 2500.012 | 1.916 | 1.909 | 0.007 ( +0.4% ) | 3.16 | 3.18 | -0.6% | 2.595 | 2.590 |
| 2500.013 | 1.773 | 1.765 | 0.008 ( +0.5% ) | 4.63 | 4.60 | +0.6% | 2.502 | 2.484 |
| 2500.021 | 0.022 | 0.022 | 0.000 ( +0.0% ) | 0.98 | 0.99 | -0.8% | 2.229 | 2.208 |
| 2500.022 | 0.022 | 0.022 | 0.000 ( +0.0% ) | 0.94 | 0.95 | -0.5% | 2.231 | 2.206 |
| 2500.023 | 0.022 | 0.022 | 0.000 ( +0.0% ) | 0.95 | 0.94 | +0.6% | 2.209 | 2.154 |
| 2500.024 | 0.022 | 0.022 | 0.000 ( +0.0% ) | 0.71 | 0.72 | -2.4% | 2.347 | 2.313 |
| 2500.031 | 0.035 | 0.035 | 0.000 ( +0.0% ) | 0.89 | 0.87 | +2.3% | 2.382 | 2.306 |
| 2500.032 | 0.036 | 0.036 | 0.000 ( +0.0% ) | 0.91 | 0.91 | +0.3% | 2.348 | 2.257 |
| 2500.033 | 0.037 | 0.037 | 0.000 ( +0.0% ) | 0.80 | 0.81 | -0.8% | 2.368 | 2.354 |
| 2500.034 | 0.036 | 0.036 | 0.000 ( +0.0% ) | 0.84 | 0.83 | +0.9% | 2.358 | 2.330 |
| 2500.101 | 2.654 | 2.646 | 0.008 ( +0.3% ) | 9.07 | 8.99 | +0.9% | 6.891 | 6.891 |
| 2500.111 | 1.336 | 1.330 | 0.006 ( +0.5% ) | 20.16 | 20.89 | -3.5% | 2.215 | 2.218 |
| 2500.112 | 1.743 | 1.735 | 0.008 ( +0.5% ) | 15.53 | 13.68 | +13.5% | 2.164 | 2.122 |
| 2500.131 | 5.194 | 5.194 | 0.000 ( +0.0% ) | 16.10 | 15.87 | +1.4% | 1.545 | 1.534 |
| 2500.201 | 2.485 | 2.478 | 0.007 ( +0.3% ) | 7.65 | 7.67 | -0.3% | 5.580 | 6.127 |
| 2500.211 | 1.599 | 1.592 | 0.007 ( +0.4% ) | 17.65 | 17.80 | -0.8% | 2.094 | 2.105 |
| 2500.212 | 2.041 | 2.033 | 0.008 ( +0.4% ) | 13.23 | 14.12 | -6.2% | 2.133 | 2.121 |
| 2500.221 | 2.005 | 2.006 | -0.000 ( -0.0% ) | 7.95 | 7.67 | +3.6% | 2.301 | 2.236 |
| 2500.222 | 3.215 | 3.206 | 0.009 ( +0.3% ) | 7.04 | 7.47 | -5.7% | 2.338 | 2.294 |
| 2500.223 | 8.896 | 8.888 | 0.008 ( +0.1% ) | 2.82 | 2.80 | +0.8% | 2.355 | 2.307 |
| 2500.224 | 5.522 | 5.515 | 0.008 ( +0.1% ) | 1.11 | 1.10 | +1.2% | 2.435 | 2.386 |
| 2500.225 | 5.541 | 5.533 | 0.008 ( +0.1% ) | 1.04 | 1.02 | +1.1% | 2.274 | 2.407 |
| 2500.226 | 2.981 | 2.973 | 0.008 ( +0.3% ) | 7.55 | 7.62 | -0.9% | 2.301 | 2.284 |
| 2500.227 | 8.972 | 8.972 | 0.000 ( +0.0% ) | 10.34 | 10.32 | +0.2% | 1.413 | 1.365 |
| 2500.231 | 1.407 | 1.407 | -0.000 ( -0.0% ) | 13.94 | 13.46 | +3.6% | 1.948 | 1.957 |
| 2500.232 | 2.246 | 2.237 | 0.009 ( +0.4% ) | 13.34 | 13.81 | -3.4% | 2.029 | 2.023 |
| 2500.233 | 4.678 | 4.670 | 0.008 ( +0.2% ) | 4.87 | 4.94 | -1.4% | 2.049 | 2.019 |
| 2500.234 | 3.315 | 3.307 | 0.008 ( +0.2% ) | 1.51 | 1.52 | -0.7% | 1.799 | 2.054 |
| 2500.235 | 3.325 | 3.317 | 0.008 ( +0.2% ) | 1.41 | 1.39 | +1.3% | 1.825 | 2.084 |
| 2500.236 | 2.093 | 2.085 | 0.008 ( +0.4% ) | 13.74 | 13.99 | -1.8% | 2.056 | 2.022 |
| 2500.237 | 7.977 | 7.977 | 0.000 ( +0.0% ) | 14.01 | 14.67 | -4.5% | 1.415 | 1.398 |
| 2500.241 | 9.405 | 9.405 | 0.000 ( +0.0% ) | 4.04 | 3.80 | +6.2% | 1.764 | 1.760 |
| 2500.242 | 10.331 | 10.331 | 0.000 ( +0.0% ) | 0.93 | 0.91 | +1.8% | 1.724 | 1.720 |
| 2500.243 | 2.712 | 2.712 | 0.000 ( +0.0% ) | 8.49 | 8.62 | -1.5% | 1.061 | 1.064 |
| 2500.244 | 485.976 | 485.976 | 0.000 ( +0.0% ) | 0.58 | 0.57 | +0.6% | 1.682 | 1.601 |
| 2500.245 | 823.224 | 823.224 | 0.000 ( +0.0% ) | 0.76 | 0.76 | +0.5% | 1.568 | 1.581 |
| 2500.901 | 1.749 | 1.749 | 0.000 ( +0.0% ) | 22.05 | 21.40 | +3.0% | 1.824 | 1.794 |
| 2500.902 | 1.598 | 1.598 | 0.000 ( +0.0% ) | 21.91 | 22.25 | -1.5% | 1.755 | 1.756 |
| 2500.911 | 13.931 | 13.931 | 0.000 ( +0.0% ) | 3.35 | 3.23 | +3.8% | 1.080 | 1.082 |
| 2500.912 | 0.240 | 0.150 | 0.090 ( +59.9% ) | 1.20 | 1.39 | -13.7% | 0.968 | 0.969 |
| 2500.913 | 0.110 | 0.110 | 0.000 ( +0.0% ) | 1.13 | 1.21 | -6.2% | 0.968 | 0.975 |
2500.913 0.110 0.110 0.000 ( +0.0% ) 1.13 1.21 -6.2% 0.968 0.975

@RSalvatico
Contributor

RSalvatico commented Sep 3, 2024

The proposed modifications to the electron collection look good to me, thank you @namapane .

@cms-sw/egamma-pog-l2

@RSalvatico
Contributor

type egamma

@cmsbuild cmsbuild added the egamma label Sep 3, 2024
@hqucms
Contributor

hqucms commented Sep 3, 2024

+1

@cmsbuild
Contributor

cmsbuild commented Sep 3, 2024

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @rappoccio, @mandrenguyen, @sextonkennedy, @antoniovilela (and backports should be raised in the release meeting by the corresponding L2)

@hqucms
Contributor

hqucms commented Sep 3, 2024

type muon

@cmsbuild cmsbuild added the muon label Sep 3, 2024
@mandrenguyen
Contributor

+1

@cmsbuild cmsbuild merged commit bc07c52 into cms-sw:master Sep 4, 2024
13 checks passed
@namapane
Contributor Author

Hi @hqucms, all,
I forgot to ask, should we backport this to 14_1_X and/or elsewhere? I am a bit lost with the plans, but I want to be sure that this development is consistently included in future central nanoAOD reprocessings.

@hqucms
Contributor

hqucms commented Sep 12, 2024

Hi @namapane -- the next nanoAOD campaign will be based on CMSSW_15_X so there is no need to backport this.

@ftorrresd ftorrresd mentioned this pull request Oct 16, 2024