Fix comparison error for pandas dataframe dtype #1054
This pull request has been linked to Shortcut Story #17286: TypeError: data type 'ascii' not understood.
The dtype passed in is likely to be a numpy.dtype when it comes directly from pandas, so we need to check its type before comparing it to the string "ascii".
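As a rough illustration of the kind of guard described above (not necessarily the exact change in this PR), the comparison can be made conditional on the value actually being a string. The helper name `is_ascii_dtype` is hypothetical and used only for this sketch.

```python
import numpy as np


def is_ascii_dtype(dtype):
    """Return True only when dtype is the literal string "ascii".

    Hypothetical helper for illustration; not part of the TileDB-Py API.
    Checking isinstance first means a numpy.dtype is never compared to
    the string "ascii", which is the comparison that raises TypeError
    in the NumPy 1.20 traceback in this thread.
    """
    return isinstance(dtype, str) and dtype == "ascii"


# A plain string marker is recognized.
print(is_ascii_dtype("ascii"))            # True
# A numpy.dtype short-circuits on the isinstance check and is never
# compared to the string.
print(is_ascii_dtype(np.dtype("int64")))  # False
```

Because `and` short-circuits, the string comparison only runs for `str` inputs, so the result does not depend on how a given NumPy version handles `dtype == "ascii"`.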
I can confirm this fixes a test that fails under NumPy 1.20 but passes under NumPy 1.22 (the latest release). We'll be updating CI so that it tests against multiple versions of NumPy.
(tiledb-clean) vivian@mangonada:~/TileDB-Py$ pip list | grep numpy
numpy 1.20.0
(tiledb-clean) vivian@mangonada:~/TileDB-Py$ git branch --show-current
dev
(tiledb-clean) vivian@mangonada:~/TileDB-Py$ pytest -k "test_sparse_index_dtypes[i8]"
================================================ test session starts ================================================
platform linux -- Python 3.9.12, pytest-7.1.2, pluggy-1.0.0
rootdir: /home/vivian/TileDB-Py, configfile: pyproject.toml, testpaths: tiledb/tests
plugins: hypothesis-6.45.1
collected 450 items / 449 deselected / 1 skipped / 1 selected
tiledb/tests/test_libtiledb.py F [100%]
===================================================== FAILURES ======================================================
___________________________________ TestSparseArray.test_sparse_index_dtypes[i8] ____________________________________
self = <tiledb.tests.test_libtiledb.TestSparseArray object at 0x7ff47b0f9340>, dtype = 'i8'
    @pytest.mark.skipif(not has_pandas(), reason="pandas not installed")
    @pytest.mark.parametrize("dtype", INTEGER_DTYPES)
    def test_sparse_index_dtypes(self, dtype):
        path = self.path()
        data = np.arange(0, 3).astype(dtype)
>       schema = schema_from_dict(attrs={"attr": data}, dims={"d0": data})
tiledb/tests/test_libtiledb.py:2248:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tiledb/util.py:37: in schema_from_dict
    return _sparse_schema_from_dict(attrs, dims)
tiledb/util.py:8: in _sparse_schema_from_dict
    attr_infos = {k: ColumnInfo.from_values(v) for k, v in input_attrs.items()}
tiledb/util.py:8: in <dictcomp>
    attr_infos = {k: ColumnInfo.from_values(v) for k, v in input_attrs.items()}
tiledb/dataframe_.py:109: in from_values
    return cls.from_dtype(array_like.dtype, varlen_types)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
cls = <class 'tiledb.dataframe_.ColumnInfo'>, dtype = dtype('int64'), varlen_types = ()
    @classmethod
    def from_dtype(cls, dtype, varlen_types=()):
        from pandas.api import types as pd_types
>       if dtype == "ascii":
E       TypeError: data type 'ascii' not understood
tiledb/dataframe_.py:115: TypeError
============================================== short test summary info ==============================================
FAILED tiledb/tests/test_libtiledb.py::TestSparseArray::test_sparse_index_dtypes[i8] - TypeError: data type 'ascii...
=================================== 1 failed, 1 skipped, 449 deselected in 0.93s ====================================
(tiledb-clean) vivian@mangonada:~/TileDB-Py$ git branch --show-current
sethshelnutt/sc-17286/typeerror-data-type-ascii-not-understood
(tiledb-clean) vivian@mangonada:~/TileDB-Py$ pip list | grep numpy
numpy 1.20.0
(tiledb-clean) vivian@mangonada:~/TileDB-Py$ git branch --show-current
sethshelnutt/sc-17286/typeerror-data-type-ascii-not-understood
(tiledb-clean) vivian@mangonada:~/TileDB-Py$ pytest -k "test_sparse_index_dtypes[i8]"
================================================ test session starts ================================================
platform linux -- Python 3.9.12, pytest-7.1.2, pluggy-1.0.0
rootdir: /home/vivian/TileDB-Py, configfile: pyproject.toml, testpaths: tiledb/tests
plugins: hypothesis-6.45.1
collected 449 items / 448 deselected / 1 skipped / 1 selected
tiledb/tests/test_libtiledb.py . [100%]
=================================== 1 passed, 1 skipped, 448 deselected in 0.80s ====================================
This pull request has been linked to Shortcut Story #16509: Test failure against numpy 1.20.