TST: add test case for user-defined function taking correct path in groupby transform #29631

Merged on Nov 20, 2019 (7 commits)
9 changes: 8 additions & 1 deletion pandas/core/groupby/generic.py
@@ -1401,7 +1401,14 @@ def _choose_path(self, fast_path: Callable, slow_path: Callable, group: DataFram
         res = slow_path(group)

         # if we make it here, test if we can use the fast path
-        res_fast = fast_path(group)
+        try:
+            res_fast = fast_path(group)
+        except AssertionError:
+            raise
+        except Exception:
+            # GH#29631 For user-defined function, we cant predict what may be
+            # raised; see test_transform.test_transform_fastpath_raises
+            return path, res

         # verify fast path does not change columns (and names), otherwise
         # its results cannot be joined with those of the slow path
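
The control flow introduced by this hunk can be illustrated without pandas. The following is a minimal sketch, not the library's implementation: `choose_path`, `slow`, `fast_ok`, and `fast_broken` are hypothetical names, and a plain equality check stands in for the column and value verification that the real `_choose_path` performs after this point.

```python
from typing import Any, Callable, List, Tuple

Path = Callable[[List[int]], Any]


def choose_path(fast_path: Path, slow_path: Path, group: List[int]) -> Tuple[Path, Any]:
    """Return (path, result), switching to fast_path only when it is safe."""
    path = slow_path
    res = slow_path(group)  # the slow path always supplies the reference result

    try:
        res_fast = fast_path(group)
    except AssertionError:
        # Assertion failures signal an internal bug, so they are not swallowed.
        raise
    except Exception:
        # A user-defined function may raise anything on the fast path;
        # fall back to the slow path and its already computed result.
        return path, res

    # Simplified stand-in for the real verification step.
    if res_fast == res:
        path = fast_path
    return path, res


def slow(values: List[int]) -> List[int]:
    return [v * 2 for v in values]


def fast_ok(values: List[int]) -> List[int]:
    return [v + v for v in values]


def fast_broken(values: List[int]) -> List[int]:
    raise ValueError("Must specify axis")


fallback_path, fallback_result = choose_path(fast_broken, slow, [1, -1])
chosen_path, chosen_result = choose_path(fast_ok, slow, [1, -1])
```

In this sketch the broken fast path leaves `slow` selected, while the working one is promoted because its result matches the reference.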
26 changes: 26 additions & 0 deletions pandas/tests/groupby/test_transform.py
@@ -1073,3 +1073,29 @@ def test_transform_lambda_with_datetimetz():
         name="time",
     )
     tm.assert_series_equal(result, expected)
+
+
+def test_transform_fastpath_raises():
+    # GH#29631 case where fastpath defined in groupby.generic _choose_path
+    # raises, but slow_path does not
+
+    df = pd.DataFrame({"A": [1, 1, 2, 2], "B": [1, -1, 1, 2]})
+    gb = df.groupby("A")
+
+    def replace(g):
+        mask = g < 0
+        return g.where(mask, g[~mask].mean())
jorisvandenbossche (Member) commented on Nov 19, 2019:

I think it would be good to use a simpler function in this test; otherwise somebody looking at it later will have the same "what the heck is this UDF doing" reaction we just had. I think it only needs to be a function that accepts a Series and returns a Series of the same shape.

Member Author replied:

updated


+    # Check that the fastpath raises, see _transform_general
+    obj = gb._obj_with_exclusions
+    gen = gb.grouper.get_iterator(obj, axis=gb.axis)
+    fast_path, slow_path = gb._define_paths(replace)
+    _, group = next(gen)
+
+    with pytest.raises(ValueError, match="Must specify axis"):
+        fast_path(group)
+
+    result = gb.transform(replace)
+
+    expected = pd.DataFrame([1, -1, 1.5, 1.5], columns=["B"])
+    tm.assert_frame_equal(result, expected)
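
For readers who want to reproduce the fallback with a more transparent UDF, along the lines the reviewer suggested, the sketch below may help. It is an illustration only and not necessarily the function that was committed: `double_columns` is a hypothetical name, and the example assumes a pandas version that includes this change. The function deliberately fails when handed a whole DataFrame (the fast path) but works column by column on Series (the slow path).

```python
import pandas as pd


def double_columns(grp):
    # Hypothetical UDF: reject the DataFrame input used by the fast path,
    # but handle the per-column Series input used by the slow path.
    if grp.ndim == 2:
        raise NotImplementedError("fast path not supported by this function")
    return grp * 2


df = pd.DataFrame({"A": [1, 1, 2, 2], "B": [1, -1, 1, 2]})

# transform() should fall back to the slow path instead of propagating the error.
result = df.groupby("A").transform(double_columns)
```

Because the exception is caught inside `_choose_path`, the call completes and `result` holds the doubled values of column `B`.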