Avoid dynamic creation of ArrayMethods classes and new wrapjaggedmethods decorator #48

guitargeek · 2019-04-11T15:29:20Z

Hi, I suggest here a fix for the issue I explained in scikit-hep/awkward-0.x#120.

The problem is the dynamic creation of classes in the _unwrapped_jagged function, which causes classes which you would expect to have the same type to actually not the same type.

I suggest two solution approaches which both work for my use case.

create the new types ass class members (not instance members) of the classes that use them
create the new types outside of the class, just in the scope of the file

I applied solution 2) to the from_ptetaphim function. Creating the types outside the class actually has one more cool benefit: you can use the types with a new decorator I called for now wrapjaggedmethod, which takes care of the _normalize_arrays, _unwrapped_jagged and re-wrapping into a jagged array for you.

I hope these Ideas could be useful to reduce code duplication as well. What to you think? Sorry again for initially implementing a lousy unit test which didn't catch this before...

jpivarski

This is a very good point—dynamically creating the classes makes them fail the type(self) == type(other) check in concatenation. Your scikit-hep/awkward-0.x#120 PR won't be testable until this is fully deployed, so I added a new version number to work through that.

However, the same issue is going to affect jagged arrays of Lorentz vectors from uproot, which are also based on a dynamically created type and doesn't go through this code. (A) Maybe it should go through this mechanism, but that would be adding a special case to uproot: it's different because uproot doesn't know about TLorentzVectors; it constructs them according to a recipe derived from streamers. That generality is a good thing. (B) Maybe the test in concatenate can be loosened to a general awkward.util.equivalent_class function that only ensures that two classes have the same set of __bases__.

If concatenate is generalized like that, then this fix wouldn't be strictly necessary. However, I see it as an improvement even if it's not motivated by the concatenate issue.

nsmith- · 2019-04-11T21:34:32Z

If I understand correctly, this wrapjagged is exactly what I was looking for in scikit-hep/coffea#60 (comment)
This is a procedure we use often in that package, it would be good for the facility to be available in awkward itself.

jpivarski · 2019-04-11T21:39:08Z

I was just about to give awkward a new semimajor version, which is a thing that would be required if we want to introduce a cross-repo dependency like this.

The whole file uproot_methods/wrapjagged.py could be moved into awkward/util.py. Do you want me to do that?

guitargeek · 2019-04-11T21:46:29Z

To me that sounds like a good idea! I would also suggest to decorate all the other from_xyz methods with wrapjaggedmethod, to close the circle. I would volunteer to do that quickly.

The name wrapjaggedmethod is probably not optimal by the way, because it suggest it can only be with classmethods and not with regular functions.

jpivarski · 2019-04-11T21:50:56Z

@guitargeek Could you create a PR of how you'd like it to look in awkward and then in uproot-methods? I understand that the uproot-methods PR won't succeed until the awkward PR is deployed, at least to a prerelease. I'm going to start a prerelease cycle for all three to get the machinery started.

If it's a large project, please let me know and we'll put it off for later. I want to have a pip-installable version for a talk on Wednesday. Thanks!

nsmith- · 2019-04-11T21:51:31Z

This makes implementing a few non-ufunc numpy ops trivial, so would be very useful I think!
Perhaps the broader idea is a decorator that wraps a function that normally accepts numpy arrays and allows it to be used with awkward (i.e. not just jagged) arrays?

jpivarski · 2019-04-11T21:57:41Z

@nsmith- If you mean "any function," then that's too broad. I don't see a way to do it in general.

The important thing right now—at the threshold of awkward version 0.9.0, is to get in the API dependencies between awkward and uproot-methods to get the functionality above to work. We need general names, because we can't change names until the next semimajor version, but we can leave holes for additional functionality. For instance, the wrapper might only work on JaggedArrays right now, but we name it something general and let it raise NotImplementedError on non-JaggedArray cases. As long as the non-JaggedArray cases can be implemented in only one repo—awkward and not uproot-methods, for instance—then we can do that at any time and it wouldn't require a new version. It would be a future awkward feature, not a combined awkward/uproot-methods feature, like this one.

guitargeek · 2019-04-11T21:59:17Z

Alright, is it ok if I quickly come up with two PRs each in awkward and uproot-methods and then we can continue discussing? For now I will rename the decorator to just awkwardfunction I think, but maybe you will find a better name.

jpivarski · 2019-04-11T22:05:38Z

Yes, focus on the two PRs—no new capabilities, just organizing this capability better. The name will have to explain what it does in contrast to other awkward.util functions, "awkwardfunction" doesn't tell me anything. It would still be correct to call this "wrap" and "unwrap", right? How about unwrap_awkward and wrap_awkward?

I used the word "normalize" in a CS way while that function was private (with an underscore). If it's public and physicists will see it, there's a chance of confusion (making the norm 1). Maybe it should be called uniformly_awkward? It makes all the given arrays have the same structure.

guitargeek · 2019-04-11T22:10:52Z

Haha right awkwarfunction is a terribly generic name!

Not all functions have to be public, the underscore could still exist for everything else than the decorator.
It could be:
_normalize_arrays -> _wrap_awkward
_unwrap_jagged -> _unwrap_awkward
wrapjaggedmethod -> ?? (the decorator)

edit: got the mapping wrong and corrected. Anyway I will not worry about the names now, I cant seem to come up with something good.

jpivarski · 2019-04-11T22:15:34Z

If uproot-methods is going to call it, the methods must be public. (Underscore means we're free to change it, but if another library, like uproot-methods, depends on it, then changing it would break that without us knowing. That happened once—it's a disaster I'd like to avoid.)

I'm confused by your mapping. Why does "unwrap" become "wrap"? That's the opposite. What _normalize_arrays did wasn't wrapping or unwrapping—it was getting all the arrays in a set to the same level of jaggedness, even if some were scalars or flat arrays. (E.g. allows us to set jagged arrays on pt, eta, phi but a scalar on mass.)

jpivarski · 2019-04-11T22:51:28Z

scikit-hep/awkward-0.x#122

guitargeek added 2 commits April 11, 2019 16:53

technically fixed dynamic mixin issue

2dc8876

new wrapjaggedmethod decorator

9623870

guitargeek mentioned this pull request Apr 11, 2019

More general unit test for Lorentz vector array concatenation scikit-hep/awkward-0.x#120

Merged

[skip ci] bump version number for deployment

422f98a

jpivarski approved these changes Apr 11, 2019

View reviewed changes

jpivarski merged commit 21e7d59 into scikit-hep:master Apr 11, 2019

guitargeek deleted the concatenate_mixin_bug branch April 11, 2019 18:04

guitargeek mentioned this pull request Apr 11, 2019

Receive wrapjagged functionality from uproot-methods scikit-hep/awkward-0.x#122

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Avoid dynamic creation of ArrayMethods classes and new wrapjaggedmethods decorator #48

Avoid dynamic creation of ArrayMethods classes and new wrapjaggedmethods decorator #48

guitargeek commented Apr 11, 2019

jpivarski left a comment

nsmith- commented Apr 11, 2019

jpivarski commented Apr 11, 2019

guitargeek commented Apr 11, 2019

jpivarski commented Apr 11, 2019

nsmith- commented Apr 11, 2019

jpivarski commented Apr 11, 2019

guitargeek commented Apr 11, 2019

jpivarski commented Apr 11, 2019

guitargeek commented Apr 11, 2019 •

edited

Loading

jpivarski commented Apr 11, 2019

jpivarski commented Apr 11, 2019

Avoid dynamic creation of ArrayMethods classes and new wrapjaggedmethods decorator #48

Avoid dynamic creation of ArrayMethods classes and new wrapjaggedmethods decorator #48

Conversation

guitargeek commented Apr 11, 2019

jpivarski left a comment

Choose a reason for hiding this comment

nsmith- commented Apr 11, 2019

jpivarski commented Apr 11, 2019

guitargeek commented Apr 11, 2019

jpivarski commented Apr 11, 2019

nsmith- commented Apr 11, 2019

jpivarski commented Apr 11, 2019

guitargeek commented Apr 11, 2019

jpivarski commented Apr 11, 2019

guitargeek commented Apr 11, 2019 • edited Loading

jpivarski commented Apr 11, 2019

jpivarski commented Apr 11, 2019

guitargeek commented Apr 11, 2019 •

edited

Loading