BUG: pandas 1.3.0 short series of type pandas.UInt32Dtype() fails to calculate 0.75 quantile #42642

Closed
anarkiwi opened this issue Jul 21, 2021 · 2 comments
Labels
Duplicate Report Duplicate issue or pull request

Comments


anarkiwi commented Jul 21, 2021

NumPy uint32 is fine, but pd.UInt32Dtype() is not, on the same series:

>>> df = pd.DataFrame([{'x': 60}, {'x': 70}, {'x': 1514}, {'x': 1514}, {'x': 1514}, {'x': 1019}, {'x': 60}, {'x': 77}, {'x': 60}, {'x': 60}, {'x': 60}], dtype=np.uint32)
>>> df['x'].quantile(0.75)
1266.5
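
The value 1266.5 can be checked by hand, assuming the default linear interpolation: with 11 sorted values, the 0.75 quantile sits at position (11 - 1) * 0.75 = 7.5, halfway between the sorted values at indices 7 and 8 (1019 and 1514). A minimal sketch of that arithmetic follows; the helper name `linear_quantile` is made up for illustration and is not a pandas or NumPy function.

```python
import math


def linear_quantile(values, q):
    """Quantile using linear interpolation between the two nearest ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q  # fractional rank; 7.5 for this data
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


xs = [60, 70, 1514, 1514, 1514, 1019, 60, 77, 60, 60, 60]
result = linear_quantile(xs, 0.75)
print(result)  # 1266.5
```

The result is not an integer, which matters for the nullable-integer case below.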

Now with the pandas nullable dtype:

>>> df = pd.DataFrame([{'x': 60}, {'x': 70}, {'x': 1514}, {'x': 1514}, {'x': 1514}, {'x': 1019}, {'x': 60}, {'x': 77}, {'x': 60}, {'x': 60}, {'x': 60}], dtype=pd.UInt32Dtype())
>>> df['x'].quantile(0.75)
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/pandas/core/arrays/integer.py", line 126, in safe_cast
    return values.astype(dtype, casting="safe", copy=copy)
TypeError: Cannot cast array data from dtype('O') to dtype('uint32') according to the rule 'safe'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.9/site-packages/pandas/core/series.py", line 2447, in quantile
    result = df.quantile(q=q, interpolation=interpolation, numeric_only=False)
  File "/usr/local/lib/python3.9/site-packages/pandas/core/frame.py", line 10290, in quantile
    res = self.quantile(
  File "/usr/local/lib/python3.9/site-packages/pandas/core/frame.py", line 10309, in quantile
    res = data._mgr.quantile(qs=q, axis=1, interpolation=interpolation)
  File "/usr/local/lib/python3.9/site-packages/pandas/core/internals/managers.py", line 1338, in quantile
    blocks = [
  File "/usr/local/lib/python3.9/site-packages/pandas/core/internals/managers.py", line 1339, in <listcomp>
    blk.quantile(axis=axis, qs=qs, interpolation=interpolation)
  File "/usr/local/lib/python3.9/site-packages/pandas/core/internals/blocks.py", line 1324, in quantile
    result = quantile_compat(self.values, np.asarray(qs._values), interpolation)
  File "/usr/local/lib/python3.9/site-packages/pandas/core/array_algos/quantile.py", line 46, in quantile_compat
    out = _quantile_ea_fallback(values, qs, interpolation)
  File "/usr/local/lib/python3.9/site-packages/pandas/core/array_algos/quantile.py", line 184, in _quantile_ea_fallback
    out = type(values)._from_sequence(res, dtype=values.dtype)
  File "/usr/local/lib/python3.9/site-packages/pandas/core/arrays/integer.py", line 323, in _from_sequence
    values, mask = coerce_to_array(scalars, dtype=dtype, copy=copy)
  File "/usr/local/lib/python3.9/site-packages/pandas/core/arrays/integer.py", line 231, in coerce_to_array
    values = safe_cast(values, dtype, copy=False)
  File "/usr/local/lib/python3.9/site-packages/pandas/core/arrays/integer.py", line 133, in safe_cast
    raise TypeError(
TypeError: cannot safely cast non-equivalent object to uint32
root@572eb14c4d4f:/tmp# pip3 show numpy
Name: numpy
Version: 1.21.0
Summary: NumPy is the fundamental package for array computing with Python.
Home-page: https://www.numpy.org
Author: Travis E. Oliphant et al.
Author-email: None
License: BSD
Location: /usr/local/lib/python3.9/site-packages
Requires: 
Required-by: scipy, scikit-learn, pandas
root@572eb14c4d4f:/tmp# pip3 show pandas
Name: pandas
Version: 1.3.0
Summary: Powerful data structures for data analysis, time series, and statistics
Home-page: https://pandas.pydata.org
Author: The Pandas Development Team
Author-email: [email protected]
License: BSD-3-Clause
Location: /usr/local/lib/python3.9/site-packages
Requires: pytz, numpy, python-dateutil
root@572eb14c4d4f:/tmp# python3 --version
Python 3.9.6
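
Reading the traceback, the failure appears to happen when the fractional result (1266.5) is cast back to the original `UInt32` dtype inside `_quantile_ea_fallback`. One possible workaround, offered here as a sketch rather than something proposed in this thread, is to convert the column to plain float64 before calling `quantile`. This assumes that turning any missing values into NaN is acceptable for the data at hand.

```python
import pandas as pd

xs = [60, 70, 1514, 1514, 1514, 1019, 60, 77, 60, 60, 60]
s = pd.Series(xs, dtype=pd.UInt32Dtype(), name="x")

# A fractional value cannot be stored in a nullable unsigned integer array;
# this is the kind of cast the traceback ends in.
try:
    pd.array([1266.5], dtype=pd.UInt32Dtype())
    cast_failed = False
except TypeError:
    cast_failed = True

# Workaround sketch: compute the quantile on float64 data instead.
q75 = s.astype("float64").quantile(0.75)
print(cast_failed, q75)
```

Casting to float64 sidesteps the extension-array code path entirely, at the cost of losing the nullable integer dtype for that one computation.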
@anarkiwi added the labels Bug and Needs Triage (issue that has not been reviewed by a pandas team member) on Jul 21, 2021
Member

phofl commented Jul 21, 2021

This looks like a duplicate of #42626?

@phofl added the labels Duplicate Report (duplicate issue or pull request), ExtensionArray (extending pandas with custom dtypes or arrays), and Closing Candidate (may be closeable, needs more eyeballs), and removed the label Needs Triage (issue that has not been reviewed by a pandas team member) on Jul 21, 2021
simonjayhawkins (Member) commented, quoting phofl:

> This looks like a duplicate of #42626?

closing

@simonjayhawkins removed the labels Closing Candidate (may be closeable, needs more eyeballs), Bug, and ExtensionArray (extending pandas with custom dtypes or arrays) on Jul 22, 2021
3 participants