Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ERR: raise if values passed to Categorical is a DataFrame #18207

Closed
wants to merge 3 commits into from

Conversation

jbrockmendel
Copy link
Member

@jbrockmendel jbrockmendel commented Nov 10, 2017

2nd of 3 to address #17112. This raises instead of making the following mistake:

>>> df = pd.DataFrame(np.random.randn(3,2), columns=['A', 'B'])
>>> pd.Categorical(df)
[A, B]
Categories (2, object): [A, B]

@@ -308,6 +308,9 @@ def __init__(self, values, categories=None, ordered=None, dtype=None,
elif isinstance(values, (ABCIndexClass, ABCSeries)):
# we'll do inference later
pass
elif isinstance(values, ABCDataFrame):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of this specific tests, rather check the ndim of the passed values, it must be 1 (or None)

e.g.
getattr(values, 'ndim', None) in [1, None]

@@ -308,6 +308,9 @@ def __init__(self, values, categories=None, ordered=None, dtype=None,
elif isinstance(values, (ABCIndexClass, ABCSeries)):
# we'll do inference later
pass
elif getattr(values, 'ndim', 0) > 1:
raise TypeError("> 1 ndim Categorical are not "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a NotImplementedError instead of a TypeError? Similar to how the existing error with this message on line 350 is a NotImplementedError?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call.

@codecov
Copy link

codecov bot commented Nov 10, 2017

Codecov Report

Merging #18207 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18207      +/-   ##
==========================================
+ Coverage    91.4%    91.4%   +<.01%     
==========================================
  Files         163      163              
  Lines       50064    50066       +2     
==========================================
+ Hits        45759    45763       +4     
+ Misses       4305     4303       -2
Flag Coverage Δ
#multiple 89.21% <100%> (+0.02%) ⬆️
#single 40.36% <100%> (-0.06%) ⬇️
Impacted Files Coverage Δ
pandas/core/categorical.py 95.67% <100%> (-0.13%) ⬇️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.8% <0%> (-0.1%) ⬇️
pandas/plotting/_converter.py 65.2% <0%> (+1.81%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ca737ac...b0da084. Read the comment docs.

1 similar comment
@codecov
Copy link

codecov bot commented Nov 10, 2017

Codecov Report

Merging #18207 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18207      +/-   ##
==========================================
+ Coverage    91.4%    91.4%   +<.01%     
==========================================
  Files         163      163              
  Lines       50064    50066       +2     
==========================================
+ Hits        45759    45763       +4     
+ Misses       4305     4303       -2
Flag Coverage Δ
#multiple 89.21% <100%> (+0.02%) ⬆️
#single 40.36% <100%> (-0.06%) ⬇️
Impacted Files Coverage Δ
pandas/core/categorical.py 95.67% <100%> (-0.13%) ⬇️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.8% <0%> (-0.1%) ⬇️
pandas/plotting/_converter.py 65.2% <0%> (+1.81%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ca737ac...b0da084. Read the comment docs.

@codecov
Copy link

codecov bot commented Nov 10, 2017

Codecov Report

Merging #18207 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18207      +/-   ##
==========================================
+ Coverage    91.4%    91.4%   +<.01%     
==========================================
  Files         163      163              
  Lines       50064    50066       +2     
==========================================
+ Hits        45759    45763       +4     
+ Misses       4305     4303       -2
Flag Coverage Δ
#multiple 89.21% <100%> (+0.02%) ⬆️
#single 40.36% <100%> (-0.06%) ⬇️
Impacted Files Coverage Δ
pandas/core/categorical.py 95.67% <100%> (-0.13%) ⬇️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.8% <0%> (-0.1%) ⬇️
pandas/plotting/_converter.py 65.2% <0%> (+1.81%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ca737ac...b0da084. Read the comment docs.

@gfyoung gfyoung added Error Reporting Incorrect or improved errors from pandas Categorical Categorical Data Type labels Nov 10, 2017
def test_constructor_dataframe_error(self):
# GH#17112
df = pd.DataFrame(np.random.randn(3, 2), columns=['A', 'B'])
with pytest.raises(NotImplementedError):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where checking the error message is helpful. AFAICT, your tests don't cover both places where a NotImplementedError can be raised.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, will do.

@@ -2331,6 +2334,8 @@ def _factorize_from_iterable(values):
categories=values.categories,
ordered=values.ordered)
codes = values.codes
elif getattr(values, 'ndim', 0) > 1:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't get hit in the tests

def test_constructor_dataframe_error(self):
# GH#17112
df = pd.DataFrame(np.random.randn(3, 2), columns=['A', 'B'])
with pytest.raises(NotImplementedError):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree here

@jreback jreback changed the title raise if values passed to Categorical is a DataFrame ERR: raise if values passed to Categorical is a DataFrame Nov 10, 2017
@@ -173,6 +173,12 @@ def test_constructor_unsortable(self):
pytest.raises(
TypeError, lambda: Categorical(arr, ordered=True))

def test_constructor_dataframe_error(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

paramatrize with a ndarray > 1 dim

@jbrockmendel
Copy link
Member Author

Don't want to deal with rebase right now. Will return to this bug later. Closing.

@jbrockmendel jbrockmendel deleted the period_bugs2 branch April 5, 2020 17:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Categorical Categorical Data Type Error Reporting Incorrect or improved errors from pandas
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants