Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API: scope of the public types module #16042

Closed
jorisvandenbossche opened this issue Apr 18, 2017 · 7 comments · Fixed by #16163
Closed

API: scope of the public types module #16042

jorisvandenbossche opened this issue Apr 18, 2017 · 7 comments · Fixed by #16163
Milestone

Comments

@jorisvandenbossche
Copy link
Member

jorisvandenbossche commented Apr 18, 2017

Currently, the public pandas.api.types modules holds the following functions:

In [11]: [f for f in dir(pd.api.types) if not f.startswith('_')]
Out[11]: 
['infer_dtype',
 'is_any_int_dtype',
 'is_bool',
 'is_bool_dtype',
 'is_categorical',
 'is_categorical_dtype',
 'is_complex',
 'is_complex_dtype',
 'is_datetime64_any_dtype',
 'is_datetime64_dtype',
 'is_datetime64_ns_dtype',
 'is_datetime64tz_dtype',
 'is_datetimetz',
 'is_dict_like',
 'is_dtype_equal',
 'is_extension_type',
 'is_file_like',
 'is_float',
 'is_float_dtype',
 'is_floating_dtype',
 'is_hashable',
 'is_int64_dtype',
 'is_integer',
 'is_integer_dtype',
 'is_interval',
 'is_interval_dtype',
 'is_iterator',
 'is_list_like',
 'is_named_tuple',
 'is_number',
 'is_numeric_dtype',
 'is_object_dtype',
 'is_period',
 'is_period_dtype',
 'is_re',
 'is_re_compilable',
 'is_scalar',
 'is_sequence',
 'is_signed_integer_dtype',
 'is_sparse',
 'is_string_dtype',
 'is_timedelta64_dtype',
 'is_timedelta64_ns_dtype',
 'is_unsigned_integer_dtype',
 'pandas_dtype',
 'union_categoricals']

Two questions I would like to discuss a bit:

  • Do we need to expose all of them?
    Putting the functions here, means keeping them stable. So we could also limit it to the more pandas specific ones. For example, is_re, is_dict_like, is_iterator, etc are general utility functions not actually related to pandas specifics. Of course it can be handy for other projects that they can use them instead of implementing it themselves, but it increases the number of functions we have to keep stable.

- Do we need to expose more of the pandas extension types API? (xref #16099)

From comment of @wesm (#15541 (comment)):

I've been running into these internal API / external API issues lately (e.g. needing to use DatetimeTZDtype in an external library), so it might be worth documenting the pandas < 0.19 way to get some of these APIs

It is how the dtypes will be implemented in pandas 1.x, so shouldn't we just expose this officially and put it in types? pyarrow uses the datetime tz one, and in #16015 we will probably also like to add CategoricalDtype
Not sure if the others (interval, period) are also needed.

@jreback
Copy link
Contributor

jreback commented Apr 18, 2017

I think exposing DatetimeTZDtype and CategoricalDtype are fine here as well.

@jreback
Copy link
Contributor

jreback commented Apr 18, 2017

I don't think its a bit deal to expose all (or almost all) of our introspection functions.

@jreback jreback added this to the 0.20.0 milestone Apr 23, 2017
jreback added a commit that referenced this issue Apr 24, 2017
* API: expose dtypes in pandas.api.types

xref #16015
xref apache/arrow#585
xref #16042
xref #15541 (comment)
@jreback
Copy link
Contributor

jreback commented Apr 27, 2017

@jorisvandenbossche

so we should finish documenting these is_* functions before adding them to api.rst.

I am not against leaving all of these. sure some are trivial but it does offer 1-stop shopping.

@jorisvandenbossche
Copy link
Member Author

so we should finish documenting these is_* functions before adding them to api.rst.

yes, the fact that they were not yet documented was the reason I didn't add them yet in the other PR.

I am not against leaving all of these. sure some are trivial but it does offer 1-stop shopping.

yes, in general OK as well.
I quickly went through them, and noted:

  • is_any_int_dtype is IMO a bit confusing name (it is also only used at two places in the whole pandas codebase, so not sure it is worth exposing here)
  • is_float_dtype / is_floating_dtype distinction is also not very clear. And did a search and apparently we do not use is_floating_dtype internally, so would remove that
  • the distinction between is_* and is_*dtype is not for all of them consistent. For some, the is_* version only accepts scalars, for others it also return True for a array/series of such elements.

@jreback
Copy link
Contributor

jreback commented Apr 27, 2017

ok I think ok then to remove (i'll depecate) some of these.

jreback added a commit to jreback/pandas that referenced this issue Apr 27, 2017
jreback added a commit to jreback/pandas that referenced this issue Apr 27, 2017
jreback added a commit that referenced this issue Apr 28, 2017
…i.types (#16163)

* DEPR: deprecate is_any_int_dtype and is_floating_dtype from pandas.api.types

closes #16042

* is_ docs
@jorisvandenbossche
Copy link
Member Author

Another unclear one (to me) is is_sequence. What is the difference with is_list_like ? (I can see it from the implementation: the ones checks whether the object has a length and the other not, but in practice?)
In any case, if we expose this publicly, the difference should at least be clear from the docstring. Or we could also remove it from api.types, if there is no clear usecase for users.
Internally, we also don't use is_sequence that much (only 8 times in the entire code base I think).

@jreback
Copy link
Contributor

jreback commented May 2, 2017

yeah is_sequence is also 'internal'. I'll remove it as well.

jreback added a commit to jreback/pandas that referenced this issue May 2, 2017
pcluo pushed a commit to pcluo/pandas that referenced this issue May 22, 2017
pcluo pushed a commit to pcluo/pandas that referenced this issue May 22, 2017
…i.types (pandas-dev#16163)

* DEPR: deprecate is_any_int_dtype and is_floating_dtype from pandas.api.types

closes pandas-dev#16042

* is_ docs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants