Minor fixes to variables for lepton MVA #45860

namapane · 2024-09-02T13:38:48Z

PR description:

This PR is a follow up of #45754, moved to 14_2_X as requested.

We recently realized that the inputs for the muonPROMPTMVA (and likewise for the electronPROMPTMVA) are almost, but not fully, recoverable from nanoAODs. This means that it is not possible to check data/MC agreement for the input variables using central productions, nor to test new trainings.
This can be fixed easily and cheaply with two small changes (using muons for illustration; the same applies to electrons):

- Add one float variable, Muon_jetDF, corresponding to the LepGood_jetDF MVA input.
- Fix an inconsistency in the definition of Muon_jetRelIso with respect to the variable it is intended to correspond to, LepGood_jetPtRatio. With the current definition, jetRelIso = (1/ptRatio-1) if a jet is present, and pfRelIso04_all otherwise. Unfortunately, the actual MVA input variable is defined in a slightly different way, with a max(ptRatio, 1.5) applied in the case a jet is associated. Since it is not possible to determine unambiguously from the stored value whether a jet was associated, recovering the exact definition of LepGood_jetPtRatio that was used for the MVA is tricky.

Our proposal is to store jetRelIso = (1/ptRatio-1), but with a default of -1 if no jet is matched. This has some advantages:

- It is much easier to recover the MVA input variable correctly.
- It marks the case where no jet is found unambiguously (-1).
- It does not mix ptRatio with pfRelIso04_all, which is already available in its own variable. This improves clarity and also saves some disk space, as -1 gets compressed better.
- If anybody needs Muon_jetRelIso as defined now, it can still be computed easily (just pick the isolation when no jet is present).

The same considerations apply to the corresponding Electron variables. In this case, Electron_pfRelIso04_all is also added, since this variable, used in Electron_jetRelIso, was not yet present.
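
For illustration only, the proposed convention and the two recoveries mentioned above could be sketched as follows. This is not the code in muons_cff.py or electrons_cff.py: the function names are hypothetical, ptRatio is taken as an already-computed input (None when no jet is matched), and the additional clipping applied on the MVA side is deliberately left out.

```python
from typing import Optional

NO_JET = -1.0  # sentinel proposed in this PR for "no jet matched"


def jet_rel_iso_new(pt_ratio: Optional[float]) -> float:
    """Proposed definition: 1/ptRatio - 1 if a jet is matched, -1 otherwise."""
    if pt_ratio is None:
        return NO_JET
    return 1.0 / pt_ratio - 1.0


def jet_rel_iso_old(jet_rel_iso: float, pf_rel_iso04_all: float) -> float:
    """Recover the previous definition: use pfRelIso04_all when no jet is matched."""
    return pf_rel_iso04_all if jet_rel_iso == NO_JET else jet_rel_iso


def pt_ratio_from(jet_rel_iso: float) -> Optional[float]:
    """Invert jetRelIso = 1/ptRatio - 1; returns None when no jet is matched."""
    if jet_rel_iso == NO_JET:
        return None
    return 1.0 / (1.0 + jet_rel_iso)
```

With the sentinel in place, the no-jet case can be tested directly instead of being inferred from a value that coincides with pfRelIso04_all.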

PR validation:

Tested by processing a DYJets sample and checking distributions and sizes with inspectNanoFile.py [size report]:
- The addition of Muon_jetDF and Electron_jetDF costs 2.2 b/item each.
- Muon_jetPtRatio and Electron_jetPtRatio take less space (-0.2 and -0.1 b/item respectively), because we now store -1 when no associated jet is found.
- Electron_pfRelIso04_all costs 2.5 b/item.

The overall effect of this proposed fix is +2.0 b/muon and +4.6 b/electron.
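
As a quick cross-check, the quoted totals follow from summing the per-branch changes listed above. The snippet below is only a bookkeeping sketch; the dictionary layout is an assumption made for illustration and is not output of inspectNanoFile.py.

```python
# Per-branch size changes quoted in the PR validation, in b/item.
muon_delta = {
    "Muon_jetDF": 2.2,
    "Muon_jetPtRatio": -0.2,
}
electron_delta = {
    "Electron_jetDF": 2.2,
    "Electron_jetPtRatio": -0.1,
    "Electron_pfRelIso04_all": 2.5,
}

# Round to one decimal, matching the precision of the quoted numbers.
muon_total = round(sum(muon_delta.values()), 1)
electron_total = round(sum(electron_delta.values()), 1)

print(f"muon: {muon_total:+.1f} b/item, electron: {electron_total:+.1f} b/item")
```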

cmsbuild · 2024-09-02T13:39:14Z

cms-bot internal usage

cmsbuild · 2024-09-02T13:40:32Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-45860/41599

There are other open Pull requests which might conflict with changes you have proposed:
- File PhysicsTools/NanoAOD/python/electrons_cff.py modified in PR(s): Minor fixes to variables for lepton MVA #45754
- File PhysicsTools/NanoAOD/python/muons_cff.py modified in PR(s): Updates and Bug Fixes for NanoAOD related to muon #45635, Minor fixes to variables for lepton MVA #45754
- File PhysicsTools/NanoAOD/python/nanoDQM_cfi.py modified in PR(s): [BTV, NanoAOD] Add UParT discriminants for strange-jet tagging #45684, Minor fixes to variables for lepton MVA #45754

cmsbuild · 2024-09-02T13:40:58Z

A new Pull Request was created by @namapane for master.

It involves the following packages:

PhysicsTools/NanoAOD (xpog)

@cmsbuild, @ftorrresd, @hqucms, @vlimant can you please review it and eventually sign? Thanks.
@AnnikaStein, @gpetruc this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

hqucms · 2024-09-02T13:54:31Z

enable nano

hqucms · 2024-09-02T13:54:38Z

please test

cmsbuild · 2024-09-02T17:40:54Z

+1

Size: This PR adds an extra 40KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-5af2d8/41227/summary.html
COMMIT: 71143ae
CMSSW: CMSSW_14_2_X_2024-09-02-1100/el8_amd64_gcc12
Additional Tests: NANO
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/45860/41227/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

You potentially added 1 lines to the logs
ROOTFileChecks: Some differences in event products or their sizes found
Reco comparison results: 276 differences found in the comparisons
DQMHistoTests: Total files compared: 44
DQMHistoTests: Total histograms compared: 3328276
DQMHistoTests: Total failures: 4622
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 3323634
DQMHistoTests: Total skipped: 20
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 9.165999999999997 KiB( 43 files compared)
DQMHistoSizes: changed ( 11634.0,... ): 0.709 KiB Physics/NanoAODDQM
DQMHistoSizes: changed ( 13234.0,... ): 0.467 KiB Physics/NanoAODDQM
Checked 193 log files, 163 edm output root files, 44 DQM output files
TriggerResults: no differences found

NANO Comparison Summary

Summary:

You potentially added 998 lines to the logs
ROOTFileChecks: Some differences in event products or their sizes found
Reco comparison results: 40 differences found in the comparisons
DQMHistoTests: Total files compared: 21
DQMHistoTests: Total histograms compared: 54907
DQMHistoTests: Total failures: 47
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 54860
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 8.823999999999998 KiB( 20 files compared)
DQMHistoSizes: changed ( 2500.001,... ): 1.113 KiB Physics/NanoAODDQM
DQMHistoSizes: changed ( 2500.011,... ): 0.733 KiB Physics/NanoAODDQM
DQMHistoSizes: changed ( 2500.101,... ): 0.709 KiB Physics/NanoAODDQM
DQMHistoSizes: changed ( 2500.111,... ): 0.467 KiB Physics/NanoAODDQM
Checked 102 log files, 58 edm output root files, 21 DQM output files
TriggerResults: no differences found

Nano size comparison Summary:

Sample	kb/ev	ref kb/ev	diff kb/ev	ev/s/thd	ref ev/s/thd	diff rate	mem/thd	ref mem/thd
2500.001	2.798	2.788	0.009 ( +0.3% )	3.34	3.35	-0.4%	6.039	6.115
2500.002	2.908	2.901	0.007 ( +0.3% )	2.98	3.00	-0.5%	6.378	6.403
2500.003	2.856	2.848	0.008 ( +0.3% )	3.08	3.11	-0.7%	6.359	6.360
2500.011	1.458	1.450	0.008 ( +0.6% )	5.79	5.84	-0.8%	2.401	2.403
2500.012	1.916	1.909	0.007 ( +0.4% )	3.16	3.18	-0.6%	2.595	2.590
2500.013	1.773	1.765	0.008 ( +0.5% )	4.63	4.60	+0.6%	2.502	2.484
2500.021	0.022	0.022	0.000 ( +0.0% )	0.98	0.99	-0.8%	2.229	2.208
2500.022	0.022	0.022	0.000 ( +0.0% )	0.94	0.95	-0.5%	2.231	2.206
2500.023	0.022	0.022	0.000 ( +0.0% )	0.95	0.94	+0.6%	2.209	2.154
2500.024	0.022	0.022	0.000 ( +0.0% )	0.71	0.72	-2.4%	2.347	2.313
2500.031	0.035	0.035	0.000 ( +0.0% )	0.89	0.87	+2.3%	2.382	2.306
2500.032	0.036	0.036	0.000 ( +0.0% )	0.91	0.91	+0.3%	2.348	2.257
2500.033	0.037	0.037	0.000 ( +0.0% )	0.80	0.81	-0.8%	2.368	2.354
2500.034	0.036	0.036	0.000 ( +0.0% )	0.84	0.83	+0.9%	2.358	2.330
2500.101	2.654	2.646	0.008 ( +0.3% )	9.07	8.99	+0.9%	6.891	6.891
2500.111	1.336	1.330	0.006 ( +0.5% )	20.16	20.89	-3.5%	2.215	2.218
2500.112	1.743	1.735	0.008 ( +0.5% )	15.53	13.68	+13.5%	2.164	2.122
2500.131	5.194	5.194	0.000 ( +0.0% )	16.10	15.87	+1.4%	1.545	1.534
2500.201	2.485	2.478	0.007 ( +0.3% )	7.65	7.67	-0.3%	5.580	6.127
2500.211	1.599	1.592	0.007 ( +0.4% )	17.65	17.80	-0.8%	2.094	2.105
2500.212	2.041	2.033	0.008 ( +0.4% )	13.23	14.12	-6.2%	2.133	2.121
2500.221	2.005	2.006	-0.000 ( -0.0% )	7.95	7.67	+3.6%	2.301	2.236
2500.222	3.215	3.206	0.009 ( +0.3% )	7.04	7.47	-5.7%	2.338	2.294
2500.223	8.896	8.888	0.008 ( +0.1% )	2.82	2.80	+0.8%	2.355	2.307
2500.224	5.522	5.515	0.008 ( +0.1% )	1.11	1.10	+1.2%	2.435	2.386
2500.225	5.541	5.533	0.008 ( +0.1% )	1.04	1.02	+1.1%	2.274	2.407
2500.226	2.981	2.973	0.008 ( +0.3% )	7.55	7.62	-0.9%	2.301	2.284
2500.227	8.972	8.972	0.000 ( +0.0% )	10.34	10.32	+0.2%	1.413	1.365
2500.231	1.407	1.407	-0.000 ( -0.0% )	13.94	13.46	+3.6%	1.948	1.957
2500.232	2.246	2.237	0.009 ( +0.4% )	13.34	13.81	-3.4%	2.029	2.023
2500.233	4.678	4.670	0.008 ( +0.2% )	4.87	4.94	-1.4%	2.049	2.019
2500.234	3.315	3.307	0.008 ( +0.2% )	1.51	1.52	-0.7%	1.799	2.054
2500.235	3.325	3.317	0.008 ( +0.2% )	1.41	1.39	+1.3%	1.825	2.084
2500.236	2.093	2.085	0.008 ( +0.4% )	13.74	13.99	-1.8%	2.056	2.022
2500.237	7.977	7.977	0.000 ( +0.0% )	14.01	14.67	-4.5%	1.415	1.398
2500.241	9.405	9.405	0.000 ( +0.0% )	4.04	3.80	+6.2%	1.764	1.760
2500.242	10.331	10.331	0.000 ( +0.0% )	0.93	0.91	+1.8%	1.724	1.720
2500.243	2.712	2.712	0.000 ( +0.0% )	8.49	8.62	-1.5%	1.061	1.064
2500.244	485.976	485.976	0.000 ( +0.0% )	0.58	0.57	+0.6%	1.682	1.601
2500.245	823.224	823.224	0.000 ( +0.0% )	0.76	0.76	+0.5%	1.568	1.581
2500.901	1.749	1.749	0.000 ( +0.0% )	22.05	21.40	+3.0%	1.824	1.794
2500.902	1.598	1.598	0.000 ( +0.0% )	21.91	22.25	-1.5%	1.755	1.756
2500.911	13.931	13.931	0.000 ( +0.0% )	3.35	3.23	+3.8%	1.080	1.082
2500.912	0.240	0.150	0.090 ( +59.9% )	1.20	1.39	-13.7%	0.968	0.969
2500.913	0.110	0.110	0.000 ( +0.0% )	1.13	1.21	-6.2%	0.968	0.975

RSalvatico · 2024-09-03T11:20:33Z

The proposed modifications to the electron collection look good to me, thank you @namapane .

@cms-sw/egamma-pog-l2

RSalvatico · 2024-09-03T11:20:49Z

type egamma

hqucms · 2024-09-03T19:25:49Z

+1

cmsbuild · 2024-09-03T19:26:13Z

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @rappoccio, @mandrenguyen, @sextonkennedy, @antoniovilela (and backports should be raised in the release meeting by the corresponding L2)

hqucms · 2024-09-03T19:26:26Z

type muon

mandrenguyen · 2024-09-04T12:05:56Z

+1

namapane · 2024-09-12T10:51:25Z

Hi @hqucms, all,
I forgot to ask, should we backport this to 14_1_X and/or elsewhere? I am a bit lost with the plans, but I want to be sure that this development is consistently included in future central nanoAOD reprocessings.

hqucms · 2024-09-12T14:01:44Z

Hi @namapane -- the next nanoAOD campaign will be based on CMSSW_15_X so there is no need to backport this.

Fix variables used in the lepton prompt MVA

71143ae

cmsbuild added this to the CMSSW_14_2_X milestone Sep 2, 2024

cmsbuild added pending-signatures tests-pending orp-pending code-checks-pending xpog-pending labels Sep 2, 2024

cmsbuild added code-checks-approved and removed code-checks-pending labels Sep 2, 2024

namapane mentioned this pull request Sep 2, 2024

Minor fixes to variables for lepton MVA #45754

Closed

cmsbuild added tests-started and removed tests-pending labels Sep 2, 2024

cmsbuild added tests-approved and removed tests-started labels Sep 2, 2024

cmsbuild added the egamma label Sep 3, 2024

cmsbuild added fully-signed xpog-approved and removed pending-signatures xpog-pending labels Sep 3, 2024

cmsbuild added the muon label Sep 3, 2024

cmsbuild added orp-approved and removed orp-pending labels Sep 4, 2024

cmsbuild merged commit bc07c52 into cms-sw:master Sep 4, 2024
13 checks passed

This was referenced Sep 4, 2024

Issue7955 type first spec 2 cms-sw/root#208

Closed

[Do not merge] Testing TF 2.15 GPU cms-sw/cmsdist#9394

Closed

ftorrresd mentioned this pull request Oct 16, 2024

VXBS For High Pt #46287

Merged
