"TypeError: unorderable types" in Python3 when column for MultiIndex contains tuple and int #15457

Closed
toobaz opened this issue Feb 19, 2017 · 9 comments · Fixed by #22072
Labels: Categorical (Categorical Data Type), Error Reporting (Incorrect or improved errors from pandas), MultiIndex
@toobaz
Member

toobaz commented Feb 19, 2017

Code Sample, a copy-pastable example if possible

In [2]: df = pd.DataFrame([[2, 1], [4, (1,2)]]).set_index([0, 1])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/home/pietro/nobackup/repo/pandas/pandas/core/categorical.py in __init__(self, values, categories, ordered, name, fastpath)
    288             try:
--> 289                 codes, categories = factorize(values, sort=True)
    290             except TypeError:

/home/pietro/nobackup/repo/pandas/pandas/core/algorithms.py in factorize(values, sort, order, na_sentinel, size_hint)
    360         uniques, labels = safe_sort(uniques, labels, na_sentinel=na_sentinel,
--> 361                                     assume_unique=True)
    362 

/home/pietro/nobackup/repo/pandas/pandas/core/algorithms.py in safe_sort(values, labels, na_sentinel, assume_unique)
    258         # unorderable in py3 if mixed str/int
--> 259         ordered = sort_mixed(values)
    260     else:

/home/pietro/nobackup/repo/pandas/pandas/core/algorithms.py in sort_mixed(values)
    251                            dtype=bool)
--> 252         nums = np.sort(values[~str_pos])
    253         strs = np.sort(values[str_pos])

/usr/lib/python3/dist-packages/numpy/core/fromnumeric.py in sort(a, axis, kind, order)
    821         a = asanyarray(a).copy(order="K")
--> 822     a.sort(axis=axis, kind=kind, order=order)
    823     return a

TypeError: unorderable types: tuple() > int()

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-2-b560307721b0> in <module>()
----> 1 df = pd.DataFrame([[2, 1], [4, (1,2)]]).set_index([0, 1])

/home/pietro/nobackup/repo/pandas/pandas/core/frame.py in set_index(self, keys, drop, append, inplace, verify_integrity)
   2907             arrays.append(level)
   2908 
-> 2909         index = MultiIndex.from_arrays(arrays, names=names)
   2910 
   2911         if verify_integrity and not index.is_unique:

/home/pietro/nobackup/repo/pandas/pandas/indexes/multi.py in from_arrays(cls, arrays, sortorder, names)
   1085         from pandas.core.categorical import _factorize_from_iterables
   1086 
-> 1087         labels, levels = _factorize_from_iterables(arrays)
   1088         if names is None:
   1089             names = [getattr(arr, "name", None) for arr in arrays]

/home/pietro/nobackup/repo/pandas/pandas/core/categorical.py in _factorize_from_iterables(iterables)
   2082         # For consistency, it should return a list of 2 lists.
   2083         return [[], []]
-> 2084     return map(list, lzip(*[_factorize_from_iterable(it) for it in iterables]))

/home/pietro/nobackup/repo/pandas/pandas/core/categorical.py in <listcomp>(.0)
   2082         # For consistency, it should return a list of 2 lists.
   2083         return [[], []]
-> 2084     return map(list, lzip(*[_factorize_from_iterable(it) for it in iterables]))

/home/pietro/nobackup/repo/pandas/pandas/core/categorical.py in _factorize_from_iterable(values)
   2054         codes = values.codes
   2055     else:
-> 2056         cat = Categorical(values, ordered=True)
   2057         categories = cat.categories
   2058         codes = cat.codes

/home/pietro/nobackup/repo/pandas/pandas/core/categorical.py in __init__(self, values, categories, ordered, name, fastpath)
    293                     # raise, as we don't have a sortable data structure and so
    294                     # the user should give us one by specifying categories
--> 295                     raise TypeError("'values' is not ordered, please "
    296                                     "explicitly specify the categories order "
    297                                     "by passing in a categories argument.")

TypeError: 'values' is not ordered, please explicitly specify the categories order by passing in a categories argument.

Problem description

I would understand the "unorderable types" error: after all, I am asking for different types in the same index level, pandas may want to compare them, and it cannot (this error obviously only affects Python 3). But then,

In [3]: pd.DataFrame([[2, 'a'], [4, (1,2)]]).set_index([0, 1])
Out[3]: 
Empty DataFrame
Columns: []
Index: [(2, a), (4, (1, 2))]

works fine (even though 'a' > (1,2) raises TypeError), and even allows me to sort the resulting index! Moreover, I understand this should be supported.
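
The asymmetry between the two calls can be traced to the `sort_mixed` frames in the traceback above, which sort strings and non-strings as two separate groups. The following is a minimal pure-Python sketch of that idea, not the pandas implementation; `sort_mixed_sketch` and `try_sort` are hypothetical names introduced here for illustration.

```python
def sort_mixed_sketch(values):
    """Sort non-strings and strings as two separate groups (illustrative only)."""
    non_strs = sorted(v for v in values if not isinstance(v, str))
    strs = sorted(v for v in values if isinstance(v, str))
    return non_strs + strs


def try_sort(values):
    """Return the sorted list, or the TypeError message if sorting fails."""
    try:
        return sort_mixed_sketch(values)
    except TypeError as exc:
        return "TypeError: {}".format(exc)


# 'a' and (1, 2) land in different groups, so they are never compared.
print(try_sort(["a", (1, 2)]))

# 1 and (1, 2) are both non-strings, so Python 3 has to compare int with tuple.
print(try_sort([1, (1, 2)]))
```

Under this assumption, the str/tuple level sorts only because each group holds a single element, while the int/tuple level forces a cross-type comparison and fails.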

Expected Output

Empty DataFrame
Columns: []
Index: [(2, 1), (4, (1, 2))]

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.7.0-1-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.utf8
LOCALE: it_IT.UTF-8

pandas: 0.19.0+473.gf65a641
pytest: 3.0.6
pip: 8.1.2
setuptools: 28.0.0
Cython: 0.23.4
numpy: 1.12.0
scipy: 0.18.1
xarray: None
IPython: 5.1.0.dev
sphinx: 1.4.8
patsy: 0.3.0-dev
dateutil: 2.5.3
pytz: 2015.7
blosc: None
bottleneck: 1.2.0
tables: 3.2.2
numexpr: 2.6.0
feather: None
matplotlib: 2.0.0rc2
openpyxl: 2.3.0
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: 0.999
httplib2: 0.9.1
apiclient: 1.5.2
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.8
s3fs: None
pandas_datareader: 0.2.1

@jreback
Contributor

jreback commented Feb 20, 2017

I suppose this could give a better error message. But this creates a very odd MultiIndex, with embedded tuples in a single level. This should not work.

@jreback jreback added the Error Reporting (Incorrect or improved errors from pandas), MultiIndex, and Difficulty Intermediate labels Feb 20, 2017
@jreback jreback modified the milestones: Someday, Next Major Release Feb 20, 2017
@jolespin

jolespin commented Jun 2, 2018

I have a similar error. For some reason I can't create a pd.Series object from this dictionary. It worked with another, very similar dictionary whose keys had only 2 tuples instead of 3.

pd.__version__
'0.23.0'


from collections import OrderedDict  # needed for the OrderedDict literal below

import numpy as np
import pandas as pd
from numpy import array

param_index = OrderedDict([((('criterion', 'gini'), ('max_features', 'log2'), ('min_samples_leaf', 1)), array([  0,  40,  80, 120, 160, 200])), ((('criterion', 'gini'), ('max_features', 'log2'), ('min_samples_leaf', 2)), array([  1,  41,  81, 121, 161, 201])), ((('criterion', 'gini'), ('max_features', 'log2'), ('min_samples_leaf', 3)), array([  2,  42,  82, 122, 162, 202])), ((('criterion', 'gini'), ('max_features', 'log2'), ('min_samples_leaf', 5)), array([  3,  43,  83, 123, 163, 203])), ((('criterion', 'gini'), ('max_features', 'log2'), ('min_samples_leaf', 8)), array([  4,  44,  84, 124, 164, 204])), ((('criterion', 'gini'), ('max_features', 'sqrt'), ('min_samples_leaf', 1)), array([  5,  45,  85, 125, 165, 205])), ((('criterion', 'gini'), ('max_features', 'sqrt'), ('min_samples_leaf', 2)), array([  6,  46,  86, 126, 166, 206])), ((('criterion', 'gini'), ('max_features', 'sqrt'), ('min_samples_leaf', 3)), array([  7,  47,  87, 127, 167, 207])), ((('criterion', 'gini'), ('max_features', 'sqrt'), ('min_samples_leaf', 5)), array([  8,  48,  88, 128, 168, 208])), ((('criterion', 'gini'), ('max_features', 'sqrt'), ('min_samples_leaf', 8)), array([  9,  49,  89, 129, 169, 209])), ((('criterion', 'gini'), ('max_features', None), ('min_samples_leaf', 1)), array([ 10,  50,  90, 130, 170, 210])), ((('criterion', 'gini'), ('max_features', None), ('min_samples_leaf', 2)), array([ 11,  51,  91, 131, 171, 211])), ((('criterion', 'gini'), ('max_features', None), ('min_samples_leaf', 3)), array([ 12,  52,  92, 132, 172, 212])), ((('criterion', 'gini'), ('max_features', None), ('min_samples_leaf', 5)), array([ 13,  53,  93, 133, 173, 213])), ((('criterion', 'gini'), ('max_features', None), ('min_samples_leaf', 8)), array([ 14,  54,  94, 134, 174, 214])), ((('criterion', 'gini'), ('max_features', 0.382), ('min_samples_leaf', 1)), array([ 15,  55,  95, 135, 175, 215])), ((('criterion', 'gini'), ('max_features', 0.382), ('min_samples_leaf', 2)), array([ 16,  56,  96, 136, 176, 
216])), ((('criterion', 'gini'), ('max_features', 0.382), ('min_samples_leaf', 3)), array([ 17,  57,  97, 137, 177, 217])), ((('criterion', 'gini'), ('max_features', 0.382), ('min_samples_leaf', 5)), array([ 18,  58,  98, 138, 178, 218])), ((('criterion', 'gini'), ('max_features', 0.382), ('min_samples_leaf', 8)), array([ 19,  59,  99, 139, 179, 219])), ((('criterion', 'entropy'), ('max_features', 'log2'), ('min_samples_leaf', 1)), array([ 20,  60, 100, 140, 180, 220])), ((('criterion', 'entropy'), ('max_features', 'log2'), ('min_samples_leaf', 2)), array([ 21,  61, 101, 141, 181, 221])), ((('criterion', 'entropy'), ('max_features', 'log2'), ('min_samples_leaf', 3)), array([ 22,  62, 102, 142, 182, 222])), ((('criterion', 'entropy'), ('max_features', 'log2'), ('min_samples_leaf', 5)), array([ 23,  63, 103, 143, 183, 223])), ((('criterion', 'entropy'), ('max_features', 'log2'), ('min_samples_leaf', 8)), array([ 24,  64, 104, 144, 184, 224])), ((('criterion', 'entropy'), ('max_features', 'sqrt'), ('min_samples_leaf', 1)), array([ 25,  65, 105, 145, 185, 225])), ((('criterion', 'entropy'), ('max_features', 'sqrt'), ('min_samples_leaf', 2)), array([ 26,  66, 106, 146, 186, 226])), ((('criterion', 'entropy'), ('max_features', 'sqrt'), ('min_samples_leaf', 3)), array([ 27,  67, 107, 147, 187, 227])), ((('criterion', 'entropy'), ('max_features', 'sqrt'), ('min_samples_leaf', 5)), array([ 28,  68, 108, 148, 188, 228])), ((('criterion', 'entropy'), ('max_features', 'sqrt'), ('min_samples_leaf', 8)), array([ 29,  69, 109, 149, 189, 229])), ((('criterion', 'entropy'), ('max_features', None), ('min_samples_leaf', 1)), array([ 30,  70, 110, 150, 190, 230])), ((('criterion', 'entropy'), ('max_features', None), ('min_samples_leaf', 2)), array([ 31,  71, 111, 151, 191, 231])), ((('criterion', 'entropy'), ('max_features', None), ('min_samples_leaf', 3)), array([ 32,  72, 112, 152, 192, 232])), ((('criterion', 'entropy'), ('max_features', None), ('min_samples_leaf', 5)), array([ 33, 
 73, 113, 153, 193, 233])), ((('criterion', 'entropy'), ('max_features', None), ('min_samples_leaf', 8)), array([ 34,  74, 114, 154, 194, 234])), ((('criterion', 'entropy'), ('max_features', 0.382), ('min_samples_leaf', 1)), array([ 35,  75, 115, 155, 195, 235])), ((('criterion', 'entropy'), ('max_features', 0.382), ('min_samples_leaf', 2)), array([ 36,  76, 116, 156, 196, 236])), ((('criterion', 'entropy'), ('max_features', 0.382), ('min_samples_leaf', 3)), array([ 37,  77, 117, 157, 197, 237])), ((('criterion', 'entropy'), ('max_features', 0.382), ('min_samples_leaf', 5)), array([ 38,  78, 118, 158, 198, 238])), ((('criterion', 'entropy'), ('max_features', 0.382), ('min_samples_leaf', 8)), array([ 39,  79, 119, 159, 199, 239]))])


pd.Series(list(param_index.values()), index=param_index.keys())


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/anaconda/envs/python3/lib/python3.6/site-packages/pandas/core/algorithms.py in factorize(values, sort, order, na_sentinel, size_hint)
    634         try:
--> 635             order = uniques.argsort()
    636             order2 = order.argsort()

TypeError: '<' not supported between instances of 'NoneType' and 'str'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
~/anaconda/envs/python3/lib/python3.6/site-packages/pandas/core/sorting.py in safe_sort(values, labels, na_sentinel, assume_unique)
    445         try:
--> 446             sorter = values.argsort()
    447             ordered = values.take(sorter)

TypeError: '<' not supported between instances of 'NoneType' and 'str'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
~/anaconda/envs/python3/lib/python3.6/site-packages/pandas/core/arrays/categorical.py in __init__(self, values, categories, ordered, dtype, fastpath)
    344             try:
--> 345                 codes, categories = factorize(values, sort=True)
    346             except TypeError:

~/anaconda/envs/python3/lib/python3.6/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    176                     kwargs[new_arg_name] = new_arg_value
--> 177             return func(*args, **kwargs)
    178         return wrapper

~/anaconda/envs/python3/lib/python3.6/site-packages/pandas/core/algorithms.py in factorize(values, sort, order, na_sentinel, size_hint)
    642                                         na_sentinel=na_sentinel,
--> 643                                         assume_unique=True)
    644 

~/anaconda/envs/python3/lib/python3.6/site-packages/pandas/core/sorting.py in safe_sort(values, labels, na_sentinel, assume_unique)
    449             # try this anyway
--> 450             ordered = sort_mixed(values)
    451 

~/anaconda/envs/python3/lib/python3.6/site-packages/pandas/core/sorting.py in sort_mixed(values)
    435                            dtype=bool)
--> 436         nums = np.sort(values[~str_pos])
    437         strs = np.sort(values[str_pos])

~/anaconda/envs/python3/lib/python3.6/site-packages/numpy/core/fromnumeric.py in sort(a, axis, kind, order)
    846         a = asanyarray(a).copy(order="K")
--> 847     a.sort(axis=axis, kind=kind, order=order)
    848     return a

TypeError: '<' not supported between instances of 'NoneType' and 'str'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-72-1f46691300f8> in <module>()
      1 param_index = OrderedDict([((('criterion', 'gini'), ('max_features', 'log2'), ('min_samples_leaf', 1)), array([  0,  40,  80, 120, 160, 200])), ((('criterion', 'gini'), ('max_features', 'log2'), ('min_samples_leaf', 2)), array([  1,  41,  81, 121, 161, 201])), ((('criterion', 'gini'), ('max_features', 'log2'), ('min_samples_leaf', 3)), array([  2,  42,  82, 122, 162, 202])), ((('criterion', 'gini'), ('max_features', 'log2'), ('min_samples_leaf', 5)), array([  3,  43,  83, 123, 163, 203])), ((('criterion', 'gini'), ('max_features', 'log2'), ('min_samples_leaf', 8)), array([  4,  44,  84, 124, 164, 204])), ((('criterion', 'gini'), ('max_features', 'sqrt'), ('min_samples_leaf', 1)), array([  5,  45,  85, 125, 165, 205])), ((('criterion', 'gini'), ('max_features', 'sqrt'), ('min_samples_leaf', 2)), array([  6,  46,  86, 126, 166, 206])), ((('criterion', 'gini'), ('max_features', 'sqrt'), ('min_samples_leaf', 3)), array([  7,  47,  87, 127, 167, 207])), ((('criterion', 'gini'), ('max_features', 'sqrt'), ('min_samples_leaf', 5)), array([  8,  48,  88, 128, 168, 208])), ((('criterion', 'gini'), ('max_features', 'sqrt'), ('min_samples_leaf', 8)), array([  9,  49,  89, 129, 169, 209])), ((('criterion', 'gini'), ('max_features', None), ('min_samples_leaf', 1)), array([ 10,  50,  90, 130, 170, 210])), ((('criterion', 'gini'), ('max_features', None), ('min_samples_leaf', 2)), array([ 11,  51,  91, 131, 171, 211])), ((('criterion', 'gini'), ('max_features', None), ('min_samples_leaf', 3)), array([ 12,  52,  92, 132, 172, 212])), ((('criterion', 'gini'), ('max_features', None), ('min_samples_leaf', 5)), array([ 13,  53,  93, 133, 173, 213])), ((('criterion', 'gini'), ('max_features', None), ('min_samples_leaf', 8)), array([ 14,  54,  94, 134, 174, 214])), ((('criterion', 'gini'), ('max_features', 0.382), ('min_samples_leaf', 1)), array([ 15,  55,  95, 135, 175, 215])), ((('criterion', 'gini'), ('max_features', 0.382), ('min_samples_leaf', 2)), array([ 16,  56,  96, 136, 
176, 216])), ((('criterion', 'gini'), ('max_features', 0.382), ('min_samples_leaf', 3)), array([ 17,  57,  97, 137, 177, 217])), ((('criterion', 'gini'), ('max_features', 0.382), ('min_samples_leaf', 5)), array([ 18,  58,  98, 138, 178, 218])), ((('criterion', 'gini'), ('max_features', 0.382), ('min_samples_leaf', 8)), array([ 19,  59,  99, 139, 179, 219])), ((('criterion', 'entropy'), ('max_features', 'log2'), ('min_samples_leaf', 1)), array([ 20,  60, 100, 140, 180, 220])), ((('criterion', 'entropy'), ('max_features', 'log2'), ('min_samples_leaf', 2)), array([ 21,  61, 101, 141, 181, 221])), ((('criterion', 'entropy'), ('max_features', 'log2'), ('min_samples_leaf', 3)), array([ 22,  62, 102, 142, 182, 222])), ((('criterion', 'entropy'), ('max_features', 'log2'), ('min_samples_leaf', 5)), array([ 23,  63, 103, 143, 183, 223])), ((('criterion', 'entropy'), ('max_features', 'log2'), ('min_samples_leaf', 8)), array([ 24,  64, 104, 144, 184, 224])), ((('criterion', 'entropy'), ('max_features', 'sqrt'), ('min_samples_leaf', 1)), array([ 25,  65, 105, 145, 185, 225])), ((('criterion', 'entropy'), ('max_features', 'sqrt'), ('min_samples_leaf', 2)), array([ 26,  66, 106, 146, 186, 226])), ((('criterion', 'entropy'), ('max_features', 'sqrt'), ('min_samples_leaf', 3)), array([ 27,  67, 107, 147, 187, 227])), ((('criterion', 'entropy'), ('max_features', 'sqrt'), ('min_samples_leaf', 5)), array([ 28,  68, 108, 148, 188, 228])), ((('criterion', 'entropy'), ('max_features', 'sqrt'), ('min_samples_leaf', 8)), array([ 29,  69, 109, 149, 189, 229])), ((('criterion', 'entropy'), ('max_features', None), ('min_samples_leaf', 1)), array([ 30,  70, 110, 150, 190, 230])), ((('criterion', 'entropy'), ('max_features', None), ('min_samples_leaf', 2)), array([ 31,  71, 111, 151, 191, 231])), ((('criterion', 'entropy'), ('max_features', None), ('min_samples_leaf', 3)), array([ 32,  72, 112, 152, 192, 232])), ((('criterion', 'entropy'), ('max_features', None), ('min_samples_leaf', 5)), 
array([ 33,  73, 113, 153, 193, 233])), ((('criterion', 'entropy'), ('max_features', None), ('min_samples_leaf', 8)), array([ 34,  74, 114, 154, 194, 234])), ((('criterion', 'entropy'), ('max_features', 0.382), ('min_samples_leaf', 1)), array([ 35,  75, 115, 155, 195, 235])), ((('criterion', 'entropy'), ('max_features', 0.382), ('min_samples_leaf', 2)), array([ 36,  76, 116, 156, 196, 236])), ((('criterion', 'entropy'), ('max_features', 0.382), ('min_samples_leaf', 3)), array([ 37,  77, 117, 157, 197, 237])), ((('criterion', 'entropy'), ('max_features', 0.382), ('min_samples_leaf', 5)), array([ 38,  78, 118, 158, 198, 238])), ((('criterion', 'entropy'), ('max_features', 0.382), ('min_samples_leaf', 8)), array([ 39,  79, 119, 159, 199, 239]))])
----> 2 pd.Series(list(param_index.values()), index=param_index.keys())

~/anaconda/envs/python3/lib/python3.6/site-packages/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    180 
    181             if index is not None:
--> 182                 index = _ensure_index(index)
    183 
    184             if data is None:

~/anaconda/envs/python3/lib/python3.6/site-packages/pandas/core/indexes/base.py in _ensure_index(index_like, copy)
   4955             index_like = copy(index_like)
   4956 
-> 4957     return Index(index_like)
   4958 
   4959 

~/anaconda/envs/python3/lib/python3.6/site-packages/pandas/core/indexes/base.py in __new__(cls, data, dtype, copy, name, fastpath, tupleize_cols, **kwargs)
    433                     from .multi import MultiIndex
    434                     return MultiIndex.from_tuples(
--> 435                         data, names=name or kwargs.get('names'))
    436             # other iterable of some kind
    437             subarr = com._asarray_tuplesafe(data, dtype=object)

~/anaconda/envs/python3/lib/python3.6/site-packages/pandas/core/indexes/multi.py in from_tuples(cls, tuples, sortorder, names)
   1354             arrays = lzip(*tuples)
   1355 
-> 1356         return MultiIndex.from_arrays(arrays, sortorder=sortorder, names=names)
   1357 
   1358     @classmethod

~/anaconda/envs/python3/lib/python3.6/site-packages/pandas/core/indexes/multi.py in from_arrays(cls, arrays, sortorder, names)
   1298         from pandas.core.arrays.categorical import _factorize_from_iterables
   1299 
-> 1300         labels, levels = _factorize_from_iterables(arrays)
   1301         if names is None:
   1302             names = [getattr(arr, "name", None) for arr in arrays]

~/anaconda/envs/python3/lib/python3.6/site-packages/pandas/core/arrays/categorical.py in _factorize_from_iterables(iterables)
   2541         # For consistency, it should return a list of 2 lists.
   2542         return [[], []]
-> 2543     return map(list, lzip(*[_factorize_from_iterable(it) for it in iterables]))

~/anaconda/envs/python3/lib/python3.6/site-packages/pandas/core/arrays/categorical.py in <listcomp>(.0)
   2541         # For consistency, it should return a list of 2 lists.
   2542         return [[], []]
-> 2543     return map(list, lzip(*[_factorize_from_iterable(it) for it in iterables]))

~/anaconda/envs/python3/lib/python3.6/site-packages/pandas/core/arrays/categorical.py in _factorize_from_iterable(values)
   2513         codes = values.codes
   2514     else:
-> 2515         cat = Categorical(values, ordered=True)
   2516         categories = cat.categories
   2517         codes = cat.codes

~/anaconda/envs/python3/lib/python3.6/site-packages/pandas/core/arrays/categorical.py in __init__(self, values, categories, ordered, dtype, fastpath)
    349                     # raise, as we don't have a sortable data structure and so
    350                     # the user should give us one by specifying categories
--> 351                     raise TypeError("'values' is not ordered, please "
    352                                     "explicitly specify the categories order "
    353                                     "by passing in a categories argument.")

TypeError: 'values' is not ordered, please explicitly specify the categories order by passing in a categories argument.    

chrisz from stackoverflow mentioned I should link my SO question for reference so the issues are back-linked:

https://stackoverflow.com/questions/50662091/cannot-create-pd-series-from-dictionary-typeerror-values-is-not-ordered
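
The innermost `TypeError` in the traceback above (`'<' not supported between instances of 'NoneType' and 'str'`) appears consistent with the `max_features` level of the keys, which mixes `'log2'`, `'sqrt'`, `None` and `0.382`. A minimal sketch in plain Python, independent of pandas; `can_sort` is a hypothetical helper, and the stringified variant is only one possible workaround, not a recommendation from this thread:

```python
def can_sort(values):
    """Return True if Python 3 can order the values, False on TypeError."""
    try:
        sorted(values)
        return True
    except TypeError:
        return False


# One "level" of the index, as it would look after zipping the key tuples.
level = [("max_features", "log2"), ("max_features", None), ("max_features", 0.382)]

# Tuples compare element by element, so this ends up comparing None with 'log2'.
print(can_sort(level))

# Assumed workaround: make the second element a string so the level is homogeneous.
homogeneous = [(name, str(value)) for name, value in level]
print(can_sort(homogeneous))
```

Stringifying changes the values stored in the index, so it only fits cases where the original types do not need to be recovered later.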

@dickreuter
Contributor

What's the solution to this? Do I have to downgrade to 0.22?

@toobaz
Member Author

toobaz commented Jul 26, 2018

> What's the solution to this? Do I have to downgrade to 0.22?

I guess that won't help - if you look in the bug description, you will see it dates well before 0.22.

@jolespin

jolespin commented Jul 26, 2018

Downgrading to 0.22 solved my problem. I ran the same code and it worked.

@dickreuter
Contributor

dickreuter commented Jul 26, 2018

What about the latest version of pandas? Would that fix the problem? Is there no workaround? By the way, in my case there are neither tuples nor ints involved: just a normal dataframe with strings as column names and some columns containing objects (not sure this matters). And you're right: downgrading to 0.22 does not solve the problem. Not being able to set multiple indices seems like something that should be classified as a major issue.

@toobaz
Member Author

toobaz commented Jul 27, 2018

@jolespin your issue is different (as the issue reported here is not a regression), so I opened a new one: #22077

@toobaz
Member Author

toobaz commented Jul 27, 2018

This is related to MultiIndex construction relying on Categorical:

In [2]: pd.Categorical(np.array([1, 'a'], dtype='object'), ordered=True)
Out[2]: 
[1, a]
Categories (2, object): [1 < a]

In [3]: pd.Categorical(np.array([1, (1, 2)], dtype='object'), ordered=True)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/nobackup/repo/pandas/pandas/core/algorithms.py in factorize(values, sort, order, na_sentinel, size_hint)
    634         try:
--> 635             order = uniques.argsort()
    636             order2 = order.argsort()

TypeError: unorderable types: tuple() < int()

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
~/nobackup/repo/pandas/pandas/core/arrays/categorical.py in __init__(self, values, categories, ordered, dtype, fastpath)
    397             try:
--> 398                 codes, categories = factorize(values, sort=True)
    399             except TypeError:

~/nobackup/repo/pandas/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    177                     kwargs[new_arg_name] = new_arg_value
--> 178             return func(*args, **kwargs)
    179         return wrapper

~/nobackup/repo/pandas/pandas/core/algorithms.py in factorize(values, sort, order, na_sentinel, size_hint)
    642                                         na_sentinel=na_sentinel,
--> 643                                         assume_unique=True)
    644 

~/nobackup/repo/pandas/pandas/core/sorting.py in safe_sort(values, labels, na_sentinel, assume_unique)
    447         # unorderable in py3 if mixed str/int
--> 448         ordered = sort_mixed(values)
    449     else:

~/nobackup/repo/pandas/pandas/core/sorting.py in sort_mixed(values)
    440                            dtype=bool)
--> 441         nums = np.sort(values[~str_pos])
    442         strs = np.sort(values[str_pos])

~/.local/lib/python3.5/site-packages/numpy/core/fromnumeric.py in sort(a, axis, kind, order)
    846         a = asanyarray(a).copy(order="K")
--> 847     a.sort(axis=axis, kind=kind, order=order)
    848     return a

TypeError: unorderable types: tuple() < int()

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-3-02fe967cd88d> in <module>()
----> 1 pd.Categorical(np.array([1, (1, 2)], dtype='object'), ordered=True)

~/nobackup/repo/pandas/pandas/core/arrays/categorical.py in __init__(self, values, categories, ordered, dtype, fastpath)
    402                     # raise, as we don't have a sortable data structure and so
    403                     # the user should give us one by specifying categories
--> 404                     raise TypeError("'values' is not ordered, please "
    405                                     "explicitly specify the categories order "
    406                                     "by passing in a categories argument.")

TypeError: 'values' is not ordered, please explicitly specify the categories order by passing in a categories argument.

In [4]: pd.Categorical(np.array([1, (1, 2)], dtype='object'), ordered=False)
Out[4]: 
[1, (1, 2)]
Categories (2, object): [1, (1, 2)]

Changing title accordingly.
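
The `ordered=False` call presumably succeeds because assigning codes only needs hashing and equality, not ordering. A minimal sketch of factorizing by first appearance, written in plain Python; `factorize_unsorted` is a hypothetical name, and this is not how pandas implements `Categorical` internally:

```python
def factorize_unsorted(values):
    """Assign integer codes in order of first appearance, without sorting."""
    positions = {}  # maps each distinct value to its code
    uniques = []
    codes = []
    for value in values:
        if value not in positions:
            positions[value] = len(uniques)
            uniques.append(value)
        codes.append(positions[value])
    return codes, uniques


codes, uniques = factorize_unsorted([1, (1, 2), 1])
print(codes)    # [0, 1, 0]
print(uniques)  # [1, (1, 2)]
```

No `<` comparison is ever made here, which is why mixing an int and a tuple is harmless as long as the categories do not have to be ordered.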

@toobaz toobaz changed the title from '"TypeError: unorderable types" in Python3 when column for MultiIndex contains tuple and int' to '"TypeError: unorderable types" in Python3 when initializing Categorical with array values including tuples' Jul 27, 2018
@toobaz toobaz added the Categorical (Categorical Data Type) label Jul 27, 2018
@toobaz toobaz changed the title from '"TypeError: unorderable types" in Python3 when initializing Categorical with array values including tuples' to '"TypeError: unorderable types" in Python3 when column for MultiIndex contains tuple and int to "TypeError: unorderable types"' Jul 27, 2018
@toobaz
Member Author

toobaz commented Jul 27, 2018

Sorry for the mess: I restored the original title and opened #22080 for the Categorical problem, as this MultiIndex problem admits a simpler fix (#22072).

@jreback jreback modified the milestones: Contributions Welcome, 0.24.0 Jul 28, 2018
@toobaz toobaz changed the title from '"TypeError: unorderable types" in Python3 when column for MultiIndex contains tuple and int to "TypeError: unorderable types"' to '"TypeError: unorderable types" in Python3 when column for MultiIndex contains tuple and int' Aug 8, 2018
toobaz pushed a commit that referenced this issue Aug 12, 2018
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this issue Oct 1, 2018