Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ArrayManager] TST: run (+fix/skip) pandas/tests/frame/indexing tests #40323

Merged
merged 11 commits into from
Mar 12, 2021

Conversation

jorisvandenbossche
Copy link
Member

@jorisvandenbossche jorisvandenbossche commented Mar 9, 2021

xref #39146

Extracting part of the test changes of #39578. This PR is doing pandas/tests/frame/indexing. The pandas/tests/indexing tests are handled in #40323.

@jorisvandenbossche jorisvandenbossche added Indexing Related to indexing on series/frames, not to indexes themselves Internals Related to non-user accessible pandas implementation labels Mar 9, 2021
@jorisvandenbossche jorisvandenbossche changed the title [ArrayManager] TST: pass (fix/skip) pandas/tests/frame/indexing tests [ArrayManager] TST: run (+fix/skip) pandas/tests/frame/indexing tests Mar 9, 2021
def test_xs_droplevel_false_view(self):
# GH#37832
df = DataFrame([[1, 2, 3]], columns=Index(["a", "b", "c"]))
result = df.xs("a", axis=1, drop_level=False)
df.values[0, 0] = 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this version feels much clearer as to whats going on

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a suggestion for how to then do it? (we can't rely on .values to mutate)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess what we really care about here is np.shares_memory(result._values, df.iloc[:, 0])

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the np.shares_memory check. And turned out that also with ArrayManager this actually gives a view. But it's the iloc modifying the parent that doesn't result in modifying the child result, when using ArrayManager, but also when using a mixed DataFrame. So added that as an explicit test case as well.

@@ -382,6 +388,9 @@ def test_setitem_frame_duplicate_columns(self):
columns=cols,
dtype="object",
)
if using_array_manager:
expected["B"] = expected["B"].astype("int64")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment as to why the difference?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, added a comment (for "C" it's expected, since it's a __setitem__, for "B" it's actually a TODO to have .loc[: "col"] not overwrite)

@@ -296,6 +298,10 @@ def test_setitem_dt64tz(self, timezone_frame):
tm.assert_series_equal(df["D"], Series(idx, name="D"))
del df["D"]

if using_array_manager:
# TODO(ArrayManager) rewrite test for ArrayManager
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xfail/skip?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first part of the test is still useful to run, so I would rather not skip it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you skip/xfail at this point, then the part up to here will still run

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the end was easy to rewrite using .arrays so it works for both

@@ -1617,6 +1617,15 @@ def get_slice(self, slobj: slice, axis: int = 0) -> SingleBlockManager:
def index(self) -> Index:
return self.axes[0]

@property
def array(self) -> ArrayLike:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could put this on SingleDataManager and return self.arrays[0]?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, moved it to the base class (note this will require some type ignoring, because .arrays is not exactly the same for both classes)

@jreback jreback added this to the 1.3 milestone Mar 11, 2021
@jorisvandenbossche
Copy link
Member Author

@jbrockmendel ready to merge here?

@jbrockmendel
Copy link
Member

LGTM cc @jreback

@jreback jreback merged commit c331ba8 into pandas-dev:master Mar 12, 2021
@jreback
Copy link
Contributor

jreback commented Mar 12, 2021

thanks @jorisvandenbossche

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Indexing Related to indexing on series/frames, not to indexes themselves Internals Related to non-user accessible pandas implementation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants