[WIP] possible 10x speed improvement #231
Conversation
@kratsg can you confirm that this is the scale of the MBJ workspace (30 channels, 7 samples per channel, 5 modifiers per sample, 1 bin per sample)?
@lukasheinrich This is fantastic news! Looking forward to checking this out more tonight.
I have some code changes locally to do something like this, but I never got it to work/pass tests -- the math confused me a little bit for combining everything. I'll need some time to look through this and see if I understand it.
@jpivarski yes, this was my first instinct. However, in pyhf we want to support multiple tensor backends such as TensorFlow and PyTorch that all more or less implement the numpy tensor ops. This way we can make easy use of hardware acceleration etc. Would this also be a goal of awkward-array, or would you want to keep this numpy-only?
@lukasheinrich The base awkward-array package will support any library that has a Numpy interface. All access to Numpy goes through a single point, so as long as the library either implements the Numpy API or can be made to do so (e.g. with shims translating every TensorFlow function into its Numpy equivalent), awkward-array supports it. This is an assumption that's broken by the optimizers (like in that last plot); it's why we always have to have a basic version that only asks for the arrays to quack like Numpy.
@lukasheinrich TensorFlow eager tensors have a .numpy() method.
@jpivarski yeah, we are more or less trying to have these shims here: https://github.com/diana-hep/pyhf/tree/master/pyhf/tensor, and there is https://github.com/tensorly/tensorly, which seems to try to do something similar.
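(For readers of this thread: a minimal sketch of the shim pattern being discussed, assuming nothing about pyhf's actual tensorlib interface -- the class and method names below are invented for illustration.)

import numpy as np
import tensorflow as tf  # only needed for the TensorFlow shim

class NumpyBackend:
    # Thin wrapper exposing a small common API directly on top of NumPy.
    def astensor(self, data):
        return np.asarray(data, dtype=float)
    def sum(self, tensor, axis=None):
        return np.sum(tensor, axis=axis)

class TensorFlowBackend:
    # Same API, but every call is routed to the TensorFlow equivalent.
    def astensor(self, data):
        return tf.constant(data, dtype=tf.float64)
    def sum(self, tensor, axis=None):
        return tf.reduce_sum(tensor, axis=axis)

# Code written against the shim API runs unchanged on either backend:
for backend in (NumpyBackend(), TensorFlowBackend()):
    cube = backend.astensor([[1.0, 2.0], [3.0, 4.0]])
    print(backend.sum(cube, axis=0))

The point of the indirection is that the vectorized likelihood code only ever talks to the shim, so swapping NumPy for TensorFlow or PyTorch does not require touching the model code.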
Isn't it a numpy array under the hood - by view?
print(type(tf.Session().run(tf.constant([1,2,3]))))
If so, that would be good. I think that ML frameworks like PyTorch and TensorFlow implement their own array classes so that they can freely move them from CPU to GPU and/or ignore the distinction between eager and lazy evaluation. Surely they all have methods to move them to the CPU and eagerly evaluate, the corner of these four options where Numpy lives.
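(A small illustrative snippet, not from this PR, of those "move to the CPU and evaluate eagerly" escape hatches; it assumes TensorFlow in eager mode and a recent PyTorch.)

import numpy as np
import tensorflow as tf
import torch

# TensorFlow (eager mode): .numpy() hands back the underlying ndarray
np_from_tf = tf.constant([1.0, 2.0, 3.0]).numpy()

# TensorFlow (TF1-style graph mode): Session.run() also returns a NumPy array
# np_from_tf = tf.Session().run(tf.constant([1.0, 2.0, 3.0]))

# PyTorch: move the tensor to host memory first, then .numpy() views the same buffer
np_from_pt = torch.tensor([1.0, 2.0, 3.0]).cpu().numpy()

assert isinstance(np_from_tf, np.ndarray) and isinstance(np_from_pt, np.ndarray)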
Hrmm, this is generating a cube and vectorizing that portion of it -- but we'd still want a step before this that deals with the meta-modifiers first, because that reduces the dimensionality of the cube we need at the end. No? In most cases, the largest dimensionality is the number of modifiers (unless we're CMS).
@kratsg the meta-modifiers touch a different portion of the code (the computation of the constraint term in the pdf). Actually I started with this as well, and it's still in this PR but commented out: https://github.com/diana-hep/pyhf/pull/231/files#diff-0e8e9106451dbaea56a5ff43a27335edR287 -- but I didn't really see any improvement.
Damn. Separate idea: I should update the … See diana-hep/pyhf@5d7e4a8 as an example.
>>> spec = { ... }
>>> model = pyhf.Model(spec)
>>> model.channels
['firstchannel']
>>> model.modifiers
['mu', 'stat_firstchannel']
>>> model.samples
['mu', 'bkg1', 'bkg2', 'bkg3']
I'll rebase after #236.
Force-pushed from 9f0e61c to edf0cc9.
So, good and bad news: these are the profiles of the MBJ execution (prof2_fields.txt). Note that both have roughly the same top-level line (which is the bad news); most time is spent in the interpolation (as we knew). But the new cube version should allow us to more easily vectorize that computation. Here are the 1574 interpolations, but the number of cubes that are actually computed is only 89, so if we can vectorize the interpolation such that it interpolates multiple slices of the cube at once, we can improve.
I.e. instead of this Python loop https://github.com/diana-hep/pyhf/pull/231/files#diff-0e8e9106451dbaea56a5ff43a27335edR148 we would want something tensor-native.
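(To make the idea concrete, here is a hedged NumPy sketch of interpolating all slices at once via broadcasting; the function name, array shapes, and the piecewise-linear scheme are illustrative assumptions, not the code in this PR.)

import numpy as np

def interpolate_linear_batched(histogramssets, alphasets):
    # histogramssets: (n_sets, 3, n_bins) stacks of (down, nominal, up) histograms
    # alphasets:      (n_sets,) nuisance parameter values
    # returns:        (n_sets, n_bins) interpolated histograms, with no Python loop
    down, nom, up = (histogramssets[:, i, :] for i in range(3))
    delta_up = up - nom         # slope used when alpha > 0
    delta_dn = nom - down       # slope used when alpha < 0
    alpha = alphasets[:, None]  # broadcast the parameter across bins
    return nom + np.where(alpha > 0, alpha * delta_up, alpha * delta_dn)

# e.g. 89 independent slices interpolated in one vectorized call
hists = np.random.uniform(0.5, 1.5, size=(89, 3, 2))
print(interpolate_linear_batched(hists, np.random.normal(size=89)).shape)  # (89, 2)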
Force-pushed from 311d664 to 1aec654.
Looking at this comment, https://github.com/diana-hep/pyhf/pull/231#issuecomment-418817390 -- I tried to reimplement the same thing (using the …).
@kratsg did you push your code to a separate branch? Is there a diff between the first commit in this branch and yours? Maybe we can figure out what the bottleneck is.
I can push the notebook and the utility function into this branch if you're ok with it. I already rebased your branch.
@kratsg do you still see the ~1k s on the MBJ example after the rebase? I'm fine with working together on this branch, but we should make sure we don't regress.
Ok, I'll re-run it on the full MBJ with the default run. |
New interpolation codes will be added in #251. |
Closing in favor of #285.
Description
The hardest bottleneck is that we compute the expected Poisson rate for each sample in each channel separately (these can have many bins and that computation is vectorized, but especially in SUSY analyses it's often only 1 bin anyway, so that doesn't help much).
The solution is to vectorize the computation across channels and samples. But the problem is that the number of samples in each channel is not the same (somewhat similar to @jpivarski's awkward-arrays).
But we can still vectorize this computation by adding some padding and constructing a cube of shape (nchannels, nsamples, nbins), where nsamples and nbins are the maximum number of samples and bins observed in the spec (a NumPy sketch of this padding scheme follows the list below). Then the approach is to:
1. have each modifier create a "factor field" of the same shape as the cube (for histosys we need to have a special case)
2. combine the factor fields with the nominal cube and sum over the samples axis, giving a (nchannels, nbins) array
3. .ravel() that array, so the expected_actualdata now has shape (nchannels*nbins,)
4. …
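As a rough illustration of steps 1-3 above (the spec dict, channel/sample names, and numbers below are invented for this sketch; it is not the code in this PR):

import numpy as np

# Hypothetical ragged spec: per channel, the nominal histogram of each sample
nominal = {
    'SR':  {'signal': [5.0], 'ttbar': [12.0], 'zjets': [7.0]},
    'CRt': {'ttbar': [40.0]},
}

nchannels = len(nominal)
nsamples = max(len(samples) for samples in nominal.values())
nbins = max(len(hist) for samples in nominal.values() for hist in samples.values())

# Pad with zeros so every (channel, sample) slot has the same shape
cube = np.zeros((nchannels, nsamples, nbins))
for i, samples in enumerate(nominal.values()):
    for j, hist in enumerate(samples.values()):
        cube[i, j, :len(hist)] = hist

# A modifier contributes a multiplicative "factor field" of the same shape;
# padded slots keep a factor of 1 and stay zero after the multiplication.
factor_field = np.ones_like(cube)
factor_field[0, 0, :] = 1.5  # e.g. a normfactor mu acting only on the signal sample

expected = (cube * factor_field).sum(axis=1)  # sum over samples -> (nchannels, nbins)
expected_actualdata = expected.ravel()        # -> (nchannels*nbins,)
print(expected_actualdata)                    # 26.5 = 5*1.5 + 12 + 7, and 40.0 for the padded channel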
Some simple benchmarking shows some promise.
@kratsg @matthewfeickert this is not yet passing all tests, and step 4 is missing. Right now I have only benchmarked this on a test case where no padding is needed.