
backport PRs to CMSSW 8.0.x to improve HLT performance #15151

Closed
8 of 9 tasks
fwyzard opened this issue Jul 8, 2016 · 24 comments

@fwyzard
Contributor

fwyzard commented Jul 8, 2016

Backport these PRs to CMSSW 8.0.x

Maybe backport these as well?

@cmsbuild
Contributor

cmsbuild commented Jul 8, 2016

A new issue was created by @fwyzard (Andrea Bocci).

@davidlange6, @smuzaffar, @Degano, @davidlt, @Dr15Jones can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here: #13029

@slava77
Contributor

slava77 commented Jul 8, 2016

PFblock updates are #14138 -> #14441 -> #14754 -> #14844 (the last 3 are bug fixes for crashes).
Please add them to the list (or add notes to the list, or at least don't forget this comment).

@slava77
Contributor

slava77 commented Jul 8, 2016

A significant fraction of the PRs gives bit-wise different results, for a variety of reasons.
These nominally require validation in this release as well.
Reliance on 81X validation is possible, but it is weakened because in 81X the changes apply both to MC and data and may have been validated as a result of a consistent change.

Given the way we use 80X, I think it's best to factor out the backports that preserve results bit-wise and integrate them as they trickle in.
Everything else would need to come almost in one go, to allow a collective comparison.

Things may get easier if we split off a release specific to HLT, but that's not nice for many reasons.

@makortel
Contributor

Backport of #14179 is in #15226.

@makortel
Contributor

Backport of #14048 is in #15231.

@makortel
Contributor

makortel commented Jul 20, 2016

Backport of #13959 is in #15240. It also includes one commit from #13831.

@slava77
#14247 also depends on the same commit. I guess the options are

#14070 depends on #13959. I guess the options are

What would be your preference?

@davidlange6
Contributor

The routine validation done in CMS is not MC to data, but rather MC to MC and data to data. So I'm not sure I follow this argument, or what additional information would be learned from a detailed 80X comparison. But yes, anything that can be factored out into a bit-wise-identical PR is best. Meanwhile, it might be wise to go back and look at the validation checks already done in 81X.


@slava77
Contributor

slava77 commented Jul 20, 2016

Hi Matti

On 7/20/16 12:32 AM, Matti Kortelainen wrote:

> Backport of #13959 is in #15240. It also includes one commit from #13831.
>
> @slava77 #14247 also depends on the same commit. I guess the options are

I prefer to do this in stages if possible and reasonable for PRs on different topics.
If the change in #15240 (not #13959 from 81X) is ~trivial (no regressions), we should go this way.

> #14070 depends on #13959. I guess the options are

I think it's better to combine in this case: both are rather technical and are on the same topic as a large part of the RefCore changes in what is now #15240.

Slava

@slava77
Contributor

slava77 commented Jul 20, 2016

On 7/20/16 12:36 AM, David Lange wrote:

> The routine validation done in CMS is not MC to data, but rather MC to MC and data to data.

My point was that if the same physics feature (a change in resolution, efficiency, etc.) appears both in MC and in data in a given relval campaign, the release is validated.
The scope of the backport, however, is to bring in only features that do not change physics.

There is only information about the presence or absence of changes, sometimes at the jenkins level of 10 or at most 100 events compared.
Some are tested with more events (up to 1K on particle guns).
This is good enough for pre-integration testing.

The absence of changes is somewhat credible for bit-wise agreement.
However, when there are changes, you wouldn't really be able to tell whether some corner of phase space is affected.

Also, production-level release validation is available only with high-stat relvals, which are typically made at the end of the release cycle (not the ~10K events of RelVals).
So, a claim that a feature is validated in a pre-release is only good enough for the scope of a development release.
There are past examples of significant issues missed even in the production-level validation.

A few examples: PF tails from bad tracks, vertex sorting issues visible only with rather special PU samples, tracking isolation issues in high-pt electron samples.

> So I'm not sure I follow this argument or what additional information would be learned from a 80x detailed comparison.

The purpose of the 80X comparisons is to show no physics-level changes with good enough credibility for a production release.

@davidlange6
Contributor

On Jul 20, 2016, at 2:05 PM, Slava Krutelyov <[email protected]> wrote:

> My point was that if the same physics feature (change in resolution, efficiency etc) appears both in MC and in data in a given relval campaign, the release is validated.

This is not generally the method applied in CMS, no. MC is checked against release N-1. If changes are observed they are reported and hopefully understood (in the case of tracking, they certainly get understood).

Anyway, certainly if some higher-stats tests are needed, let's get those planned so things are ready to go (because online won't want to wait for them to be done if they take very long).

@slava77
Contributor

slava77 commented Jul 20, 2016

On 7/20/16 5:15 AM, David Lange wrote:

> This is not generally the method applied in CMS, no. MC is checked against release N-1. If changes are observed they are reported and hopefully understood (in the case of tracking, they certainly get understood).

Are you saying that there have been no changes in tracking performance since April?
That is the only simple way to qualify the backports from 81X to 80X as validated for the purpose of integration back into 80X under the policy of no physics changes in an active production release.

If we are giving up on the policy of no physics changes in a production release, then the way forward for the backports discussed in this issue is clear.

@davidlange6
Contributor

On Jul 20, 2016, at 2:55 PM, Slava Krutelyov <[email protected]> wrote:

> Are you saying that there have been no changes in tracking performance since April? That is the only simple way to qualify the backports from 81X to 80X as validated for the purpose of integration back into 80X under the policy of no physics changes in an active production release.

I don't follow; we look at the relevant pre-release(s), not months and months of changes at once. All that work is done already (at the 10k-event level, of course, and possibly mixed with other developments that I've forgotten about).

> If we are giving up on the policy of no physics changes in a production release, then the way forward for the backports discussed in this issue is clear.

Clearly we are giving that up for these backports, yes. That said, whether the changes are meaningful (e.g. noticeable) or not should be known before deploying.

@slava77
Contributor

slava77 commented Jul 20, 2016

On 7/20/16 5:15 AM, David Lange wrote:

> This is not generally the method applied in CMS, no. MC is checked against release N-1. If changes are observed they are reported and hopefully understood (in the case of tracking, they certainly get understood).

BTW, most of the changes in this backport thread are not in tracking.

@makortel
Contributor

@slava77

> I prefer to do this in stages if possible and reasonable for PRs on different topics.
> If the change in #15240 (not #13959 from 81X) is ~trivial (no regressions), we should go this way.

Ok, will do. (And thanks for pointing out my mix-up of 81X and 80X PRs; I have corrected them in my comment.) Actually, #14247 shows extremely tiny differences in two plots for workflows 25202.0 and 136.731 (plots from ttbar):

[two comparison plots showing the tiny differences]

While the differences are extremely tiny, seeing them in two workflows makes me a bit uncomfortable about claiming "no differences".

> #14070 depends on #13959. I guess the options are
>
> I think it's better to combine in this case: both are rather technical and are on the same topic as a large part of the RefCore changes in what is now #15240.

Ok. I'm now testing #14070 on top of #15240, and if it shows no difference wrt. 8_0_14 (as one would expect), I'll combine them.

@slava77
Contributor

slava77 commented Jul 20, 2016

On 7/20/16 12:51 PM, Matti Kortelainen wrote:

> Actually #14247 shows extremely tiny differences in two plots for 25202.0 and 136.731 (plots from ttbar)

Looking at the PR code: there is a change from double to float in the math.
Clearly, this will not give bit-wise identity.
If you feel like making an incremental diff with the doubles put back, it would give a more pedantic confirmation of the expected behavior.
I would not insist on this, though, considering that the differences in the plots are so small.

@fwyzard
Contributor Author

fwyzard commented Jul 21, 2016

@cmsbuild assign HLT

@makortel
Contributor

@slava77

> Looking at the PR code: there is a change from double to float in the math.
> Clearly, this will not give bit-wise identity.
> If you feel like making an incremental diff with the doubles put back, it would give a more pedantic confirmation of the expected behavior.
> I would not insist on this, though, considering that the differences in the plots are so small.

I've investigated a bit (too much?), and it is not (just) double->float. I suspect moving away from trigonometry in 94053a0 (of #14247) would be the cause. Testing on the parent commit shows no difference wrt. the baseline; the commit itself does show a difference, but reverting the commit at the end still shows the difference. If we can indeed live with the differences, I'd say hunting down the causes is not worth the effort.

I'll wait for #15240 to be merged first anyway. Or would it be of interest to test it (especially timing) earlier?

@slava77
Contributor

slava77 commented Jul 26, 2016

On 7/26/16 6:25 AM, Matti Kortelainen wrote:

> I've investigated a bit (too much?), and it is not (just) double->float. I suspect moving away from trigonometry in 94053a0 (of #14247) would be the cause. If we can indeed live with the differences, I'd say hunting down the causes is not worth the effort.

Let's see if the timing gains are significant.
A gain below 0.5%, I think, is going to invalidate the backport.

@makortel
Contributor

@slava77 Ok, I filed a PR #15288 to enable testing.

@fwyzard
Contributor Author

fwyzard commented Jul 28, 2016

Performance and validation results of the various backports and tests, run over 10k events:

[summary table]

A green background indicates no observed differences, while a yellow background indicates differences smaller than 1‰.

@fwyzard
Contributor Author

fwyzard commented Aug 2, 2016

Here are the results of the various backports, compared with CMSSW_8_0_16.

Comparison of the average event-processing time over 10k events with pileup ~40, averaged over 56 jobs:

[plot: absolute timing comparison]

Relative comparison with respect to 8.0.16:

[plot: relative timing comparison]

@slava77
Contributor

slava77 commented Aug 2, 2016

On 8/2/16 2:54 PM, Andrea Bocci wrote:

> Here are the results of the various backports, compared with CMSSW_8_0_16.

Should we drop #15279 and #15288?

  • #15279 leads to changes in reco products. It has a 0.5% improvement based on the plots.
  • #15288 shows no differences in the jenkins tests. It has a 0.4% improvement based on the plots.

Since #15288 doesn't introduce differences, it would be somewhat harmless to still backport it.
For #15279, I think more strongly that it should be dropped.

Andrea, do you agree?

@fwyzard
Contributor Author

fwyzard commented Aug 3, 2016

Hi Slava,
I have compared "all backports" and "all backports except #15279", and the difference in performance is small, less than 1%:

[plots: absolute and relative timing comparison]

On one hand, the performance increase is indeed quite small.
On the other, its impact on the HLT results is small as well.

If the impact on RECO is non-negligible, I agree we can leave it out.

@fwyzard
Contributor Author

fwyzard commented Aug 26, 2016

#15279 was deemed to give too small a gain to be included.
All other PRs have been merged.

@fwyzard fwyzard closed this as completed Aug 26, 2016