API: integer Extension Array #20700

jreback · 2018-04-15T00:43:14Z

One could easily imagine an ExtensionArray backed by a numpy array of the appropriate dtype plus a bitmask, in order to fully support integer NA across the board. I don't think this would be too hard. As a bonus, it would be zero-copy compatible with the pyarrow implementation (for the future).
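To make the values-plus-mask idea concrete, here is a minimal sketch. It is not the proposed implementation: the class name `MaskedIntArray` and its methods are hypothetical, and the standard-library `array` module stands in for a numpy array so the example has no dependencies. A list of booleans plays the role of the bitmask.

```python
from array import array


class MaskedIntArray:
    """Hypothetical sketch: integer values plus a parallel missing-value mask."""

    def __init__(self, values, typecode="b"):
        # typecode "b" is a signed 8-bit integer in the stdlib array module.
        # Missing slots store a placeholder 0; the mask says which slots to ignore.
        self._data = array(typecode, [0 if v is None else v for v in values])
        # True marks a missing (NA) slot.
        self._mask = [v is None for v in values]

    def __len__(self):
        return len(self._data)

    def __getitem__(self, i):
        return None if self._mask[i] else self._data[i]

    def __add__(self, other):
        # NA propagates: the result is missing wherever either operand is missing.
        # Assumes both operands have the same length.
        out = []
        for i in range(len(self)):
            a, b = self[i], other[i]
            out.append(None if a is None or b is None else a + b)
        return MaskedIntArray(out, self._data.typecode)

    def to_list(self):
        return [self[i] for i in range(len(self))]


left = MaskedIntArray([1, 2, 3, None])
right = MaskedIntArray([10, None, 30, 40])
total = left + right
print(total.to_list())  # [11, None, 33, None]
```

The point of the design is that the integer buffer never has to be converted to float to represent a missing value; the mask carries that information separately.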

Integer (ENH: Integer NA Extension Array #21160)
UnsignedInteger (ENH: Integer NA Extension Array #21160)
~~Boolean~~ (API: boolean Extension Array #21778)

Making these the actual default (e.g. when integers are inferred with or without nulls) might be non-trivial, but let's implement first. These would give rise to a hierarchy of dtypes, e.g. IntegerDtype, Int8Dtype.
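A dtype hierarchy along these lines could be sketched as follows. Only the names `IntegerDtype` and `Int8Dtype` come from the comment above; `UInt8Dtype`, `Int64Dtype`, and the `bits`, `signed`, `name`, and `bounds` attributes are assumptions made for illustration, not the API that was eventually merged.

```python
class IntegerDtype:
    """Hypothetical base class shared by all nullable integer dtypes."""

    bits = None    # width in bits, set by each subclass
    signed = True  # unsigned subclasses override this

    @property
    def name(self):
        prefix = "Int" if self.signed else "UInt"
        return f"{prefix}{self.bits}"

    @property
    def bounds(self):
        # Inclusive (min, max) range representable at this width.
        if self.signed:
            return -(2 ** (self.bits - 1)), 2 ** (self.bits - 1) - 1
        return 0, 2 ** self.bits - 1


class Int8Dtype(IntegerDtype):
    bits = 8


class UInt8Dtype(IntegerDtype):
    bits = 8
    signed = False


class Int64Dtype(IntegerDtype):
    bits = 64


for dtype in (Int8Dtype(), UInt8Dtype(), Int64Dtype()):
    print(dtype.name, dtype.bounds)
```

Keeping the shared behaviour on the base class means code can test `isinstance(dtype, IntegerDtype)` once instead of enumerating every width.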

jreback · 2018-04-15T00:43:27Z

cc @jorisvandenbossche @TomAugspurger @wesm @cpcloud

jreback · 2018-05-13T22:44:21Z

Here is a fully functional (extension-wise) integer NA: https://github.com/jreback/pandas/tree/intna
It doesn't break anything and coexists with the existing dtypes.

I have enabled inference to accept the new types with a Registry, e.g.

In [1]: pd.Series([1,2,3, np.nan], dtype='Int8')
Out[1]: 
0      1
1      2
2      3
3    NaN
dtype: Int8

so construction is pretty flexible now.
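The branch itself is not reproduced here, so the following is only a guess at the general shape of a string-to-dtype registry. The `Registry` class, its `register` and `find` methods, and the stub dtype classes are all hypothetical.

```python
class Registry:
    """Hypothetical registry mapping dtype strings to dtype classes."""

    def __init__(self):
        self._dtypes = {}

    def register(self, dtype_cls):
        self._dtypes[dtype_cls.name] = dtype_cls
        return dtype_cls  # returning the class lets this be used as a decorator

    def find(self, name):
        # Return an instance for a registered name, else None so the caller
        # can fall back to whatever lookup it used before.
        dtype_cls = self._dtypes.get(name)
        return dtype_cls() if dtype_cls is not None else None


registry = Registry()


@registry.register
class Int8Dtype:
    name = "Int8"


@registry.register
class UInt8Dtype:
    name = "UInt8"


found = registry.find("Int8")    # an Int8Dtype instance
missing = registry.find("int8")  # None: the lowercase name is not registered here
print(type(found).__name__, missing)
```

With a lookup like this, a constructor that receives `dtype='Int8'` can consult the registry first and only fall through to its normal dtype handling when nothing is found.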

next up is ops

cc @TomAugspurger @jorisvandenbossche

jorisvandenbossche · 2018-05-14T06:51:16Z

Cool!

Is your intention to do a PR to add this to pandas, or to have it as a separate package for now?

jreback · 2018-05-14T12:24:00Z

Still needs quite a bit more tests / work (arithmetic ops are done, but comparison ops and more indexing tests are still needed).

But I think directly in pandas. Note that this does not actually switch the base inference (e.g. [1, 2, 3] still resolves to int64); we can do that at a later point. I suspect we will have to change quite a lot of tests, as we assume float conversions in a myriad of ways.

jreback added this to the 0.24.0 milestone Apr 15, 2018

jreback added a commit to jreback/pandas that referenced this issue May 13, 2018

ENH: add integer-na support via an ExtensionArray

3b75e85

closes pandas-dev#20700

jreback added a commit to jreback/pandas that referenced this issue May 14, 2018

ENH: add integer-na support via an ExtensionArray

6fc19f9

closes pandas-dev#20700

jreback added a commit to jreback/pandas that referenced this issue May 21, 2018

ENH: add integer-na support via an ExtensionArray

0758f1d

closes pandas-dev#20700

jreback mentioned this issue May 22, 2018

ENH: Integer NA Extension Array #21160

Merged

jreback added a commit to jreback/pandas that referenced this issue May 23, 2018

ENH: add integer-na support via an ExtensionArray

2e30a9c

closes pandas-dev#20700

jreback added a commit to jreback/pandas that referenced this issue May 24, 2018

ENH: add integer-na support via an ExtensionArray

3995d44

closes pandas-dev#20700

jreback added a commit to jreback/pandas that referenced this issue May 24, 2018

ENH: add integer-na support via an ExtensionArray

886fdc7

closes pandas-dev#20700

jreback added a commit to jreback/pandas that referenced this issue May 24, 2018

ENH: add integer-na support via an ExtensionArray

97b01e4

closes pandas-dev#20700

jreback added a commit to jreback/pandas that referenced this issue May 24, 2018

ENH: add integer-na support via an ExtensionArray

9f1179b

closes pandas-dev#20700

jreback added a commit to jreback/pandas that referenced this issue May 25, 2018

ENH: add integer-na support via an ExtensionArray

2e08b47

closes pandas-dev#20700

jreback added a commit to jreback/pandas that referenced this issue May 25, 2018

ENH: add integer-na support via an ExtensionArray

2a29ab2

closes pandas-dev#20700

jreback added a commit to jreback/pandas that referenced this issue May 29, 2018

ENH: add integer-na support via an ExtensionArray

accbdcc

closes pandas-dev#20700

jreback added a commit to jreback/pandas that referenced this issue Jul 4, 2018

ENH: add integer-na support via an ExtensionArray

0193c91

closes pandas-dev#20700 closes pandas-dev#20747

jreback added a commit to jreback/pandas that referenced this issue Jul 4, 2018

ENH: add integer-na support via an ExtensionArray

c73fb2d

closes pandas-dev#20700 closes pandas-dev#20747

jreback added a commit to jreback/pandas that referenced this issue Jul 5, 2018

ENH: add integer-na support via an ExtensionArray

74a8cac

closes pandas-dev#20700 closes pandas-dev#20747

jreback mentioned this issue Jul 6, 2018

API: boolean Extension Array #21778

Closed

jreback added a commit to jreback/pandas that referenced this issue Jul 7, 2018

ENH: add integer-na support via an ExtensionArray

5b31778

closes pandas-dev#20700 closes pandas-dev#20747

jreback added a commit to jreback/pandas that referenced this issue Jul 7, 2018

ENH: add integer-na support via an ExtensionArray

930e99d

closes pandas-dev#20700 closes pandas-dev#20747

jreback added a commit to jreback/pandas that referenced this issue Jul 7, 2018

ENH: add integer-na support via an ExtensionArray

a67afb9

closes pandas-dev#20700 closes pandas-dev#20747

jreback added a commit to jreback/pandas that referenced this issue Jul 8, 2018

ENH: add integer-na support via an ExtensionArray

9bdb7e1

closes pandas-dev#20700 closes pandas-dev#20747

jreback added a commit to jreback/pandas that referenced this issue Jul 8, 2018

ENH: add integer-na support via an ExtensionArray

f1a590b

closes pandas-dev#20700 closes pandas-dev#20747

jreback added a commit to jreback/pandas that referenced this issue Jul 8, 2018

ENH: add integer-na support via an ExtensionArray

aee2914

closes pandas-dev#20700 closes pandas-dev#20747

jreback added a commit to jreback/pandas that referenced this issue Jul 10, 2018

ENH: add integer-na support via an ExtensionArray

0b4bcca

closes pandas-dev#20700 closes pandas-dev#20747

jreback added a commit to jreback/pandas that referenced this issue Jul 11, 2018

ENH: add integer-na support via an ExtensionArray

3c160f5

closes pandas-dev#20700 closes pandas-dev#20747

jreback added a commit to jreback/pandas that referenced this issue Jul 12, 2018

ENH: add integer-na support via an ExtensionArray

738fc0b

closes pandas-dev#20700 closes pandas-dev#20747

jreback added a commit to jreback/pandas that referenced this issue Jul 16, 2018

ENH: add integer-na support via an ExtensionArray

4586245

closes pandas-dev#20700 closes pandas-dev#20747

jreback closed this as completed in #21160 Jul 20, 2018

jreback added a commit that referenced this issue Jul 20, 2018

ENH: Integer NA Extension Array (#21160)

8fd8d0d

* ENH: add integer-na support via an ExtensionArray closes #20700 closes #20747

Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this issue Oct 1, 2018

ENH: Integer NA Extension Array (pandas-dev#21160)

8008e49

* ENH: add integer-na support via an ExtensionArray closes pandas-dev#20700 closes pandas-dev#20747
