Reduce Circular Imports with pandas.core.reshape.concat #29133
if isinstance(sample, Series):
# TODO: Should this really require a class import?
"""
if isinstance(sample, ABCSeries):
This is maybe a relic of the old Panel days, but as of now it seems a little overkill to import the generic class definition just to resolve "index" and "columns" to their respective axis numbers. For now I just did this in the function body, directly below the commented-out code, but this probably belongs somewhere else as a utility function?
my intuition is that you're right here. A lot of _axis_reversed stuff feels leftover
@@ -7134,7 +7133,6 @@ def _join_compat(
self, other, on=None, how="left", lsuffix="", rsuffix="", sort=False
):
from pandas.core.reshape.merge import merge
Can the merge import also be moved up?
Was hoping so, but that doesn't appear to be the case. I think it's intertwined with some of the Categorical stuff, so it might make sense if / when I can resolve that space.
Great.
Categorical is used in a lot of places where I think/hope a less high-powered tool could be used, and doing so could simplify the dependency structure of the code base quite a bit.
pending green, LGTM
I don't recall anyone trying it before.
pandas/core/reshape/concat.py
axis = DataFrame._get_axis_number(axis)
else:
axis = sample._get_axis_number(axis)
"""
if not isinstance(axis, int):
axis = {"index": 0, "columns": 1}[axis]
looks like you need the alias "rows" here
Ha yep - nice little surprise there. Actually taking a look at the axis handling atm; might clean that up and do it as a precursor.
Going to sit on this PR for a bit. Somewhat of a rabbit hole, but I think it will make sense to get traction on #29140 and simplify the axis handling instead of doing this as a one-off, especially since there's not much of a rush here.
Is fixing this more complicated than just adding the "rows" alias to that dict? Runtime imports of concat are a non-trivial pain point for me when trying to reason about code, so I'd like to see this go through.
No that's it. I was hoping to do a more general utility function for axis validation (see #29140) but I can just add this here as well if you find it helpful. Not the worst cruft...
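For reference, the kind of general axis-resolution utility discussed in this thread could look roughly like the sketch below. The helper name `resolve_axis_number` and the `_AXIS_ALIASES` table are hypothetical, made up for illustration; this is not the pandas implementation and not what #29140 proposes, just the dict lookup from the diff with the missing "rows" alias added and a clearer error.

```python
from typing import Union

# Hypothetical alias table (not pandas API). "rows" is included because
# the review above pointed out that the inline dict was missing it.
_AXIS_ALIASES = {"index": 0, "rows": 0, "columns": 1}


def resolve_axis_number(axis: Union[int, str]) -> int:
    """Map an axis label or number to 0 or 1 (illustrative helper only)."""
    if isinstance(axis, str):
        try:
            return _AXIS_ALIASES[axis]
        except KeyError:
            raise ValueError(f"No axis named {axis!r}") from None
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(axis, int) and not isinstance(axis, bool) and axis in (0, 1):
        return axis
    raise ValueError(f"No axis named {axis!r}")
```

Because a helper like this needs no DataFrame or Series import, it could live in a module with no dependency on the class definitions, which is the point of the change under review.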
FYI not sure if anyone has seen this but here are the official Python docs on dealing with circular imports: We tend to violate Guido's recommendations a lot (ex: doing Something to consider...
@WillAyd can you re-push. For some reason re-starting azure builds doesn't always take
needs rebase. the azure failure hopefully unrelated
Yeah, still looking at this locally but trying to get a more comprehensive fix in, which could include Categorical. The problem is that this module can spur a lot of imports from the index area, which import from pandas.core.base, which import from pandas.core.indexes ... so a lot of the circular imports, I think, can be traced back to that. Might have a few precursor PRs to make this a full-fledged change
that would be really nice if you can pull it off
I'm looking at some related refactoring*, wondering what you have in mind for Categorical, in particular if * Roughly, trying to get core.algorithms, core.nanops, and maybe core.missing to not depend on Index/Series et al.
Nothing concrete. Categorical might even be a red herring and the import of ExtensionArray from I'm looking at this off and on, so if it's something you are fully interested in I certainly don't mind if you are motivated to take over some aspects of this
Closing to clear queue; will pick back up when I get a little more free time |
While working on #29124 I found a lot of general uses for this, but had to keep importing in the function body to prevent circular imports. By switching the isinstance checks to using ABC classes and moving the import of Index objects those circular imports can mostly be done away with
The only thing left that was causing circular imports was in the Categorical space. Can try to figure that out later, though cc @TomAugspurger in case that's a path that has already been gone down
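To illustrate why switching the isinstance checks to ABC classes helps with circular imports, here is a minimal, self-contained sketch of the pattern. All names in it (`_TypCheckMeta`, `ABCSeriesLike`, the toy `Series` and `DataFrame` classes) are stand-ins invented for this example; it imitates the idea behind the ABC classes in pandas.core.dtypes.generic rather than reproducing their code.

```python
class _TypCheckMeta(type):
    """Metaclass whose isinstance check looks at a marker attribute
    on the object instead of walking the real class hierarchy."""

    def __instancecheck__(cls, inst) -> bool:
        return getattr(inst, "_typ", None) in cls._allowed


class ABCSeriesLike(metaclass=_TypCheckMeta):
    # Hypothetical stand-in for an ABC such as ABCSeries.
    _allowed = ("series",)


# Toy concrete classes. In a real code base these would live in other
# modules, and the checking module would never need to import them.
class Series:
    _typ = "series"


class DataFrame:
    _typ = "dataframe"


def is_series(obj) -> bool:
    """Check for a Series-like object without importing the Series class."""
    return isinstance(obj, ABCSeriesLike)
```

The checking code depends only on the lightweight ABC module, so the concrete class module can import it freely without creating a cycle.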