REF: Make PeriodArray an ExtensionArray #22862

Merged
merged 149 commits into from
Oct 25, 2018
eaadcbc
WIP: PeriodArray
TomAugspurger Sep 26, 2018
a05928a
WIP
TomAugspurger Sep 27, 2018
3c0d9ee
Just moves
TomAugspurger Sep 27, 2018
63fc3fa
PeriodArray.shift definition
TomAugspurger Sep 27, 2018
7d5d71c
_data type
TomAugspurger Sep 27, 2018
e5caac6
clean
TomAugspurger Sep 27, 2018
c194407
accessor wip
TomAugspurger Sep 27, 2018
eb4506b
some more wip
TomAugspurger Sep 27, 2018
1b9fd7a
tshift, shift
TomAugspurger Sep 28, 2018
0fa0ed1
Arithmetic
TomAugspurger Sep 28, 2018
3247ea8
repr changes
TomAugspurger Sep 28, 2018
c162cdd
wip
TomAugspurger Sep 28, 2018
611d378
freq setter
TomAugspurger Sep 28, 2018
fb2ff82
Added disabled ops
TomAugspurger Sep 28, 2018
25a380f
copy
TomAugspurger Sep 28, 2018
1b2c4ec
Support concat
TomAugspurger Sep 28, 2018
d04293e
object ctor
TomAugspurger Sep 28, 2018
eacad39
Updates
TomAugspurger Sep 28, 2018
70cd3b8
lint
TomAugspurger Sep 28, 2018
9b22889
lint
TomAugspurger Sep 28, 2018
87d289a
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 1, 2018
6369c7f
wip
TomAugspurger Oct 1, 2018
01551f0
more wip
TomAugspurger Oct 1, 2018
0437940
array-setitem
TomAugspurger Oct 1, 2018
42ab137
wip
TomAugspurger Oct 1, 2018
298390f
wip
TomAugspurger Oct 1, 2018
23e5cfc
Use ._tshift internally for datetimelike ops
TomAugspurger Oct 2, 2018
9d17fd2
deep
TomAugspurger Oct 2, 2018
959cd72
Squashed commit of the following:
TomAugspurger Oct 2, 2018
b66f617
Squashed commit of the following:
TomAugspurger Oct 2, 2018
5669675
fixup
TomAugspurger Oct 2, 2018
2c0311c
The rest of the EA tests
TomAugspurger Oct 2, 2018
012be1c
docs
TomAugspurger Oct 2, 2018
c3a96d0
Merge remote-tracking branch 'upstream/master' into datetimelike-tshift
TomAugspurger Oct 3, 2018
67faabc
rename to time_shift
TomAugspurger Oct 3, 2018
ff7c06c
Squashed commit of the following:
TomAugspurger Oct 3, 2018
c2d57bd
Squashed commit of the following:
TomAugspurger Oct 3, 2018
fbde770
Squashed commit of the following:
TomAugspurger Oct 3, 2018
1c4bbe7
Squashed commit of the following:
TomAugspurger Oct 3, 2018
b395c90
fixed merge conflict
TomAugspurger Oct 3, 2018
d68a5c5
Handle divmod test
TomAugspurger Oct 3, 2018
0c7b704
extension tests passing
TomAugspurger Oct 3, 2018
d26d3d2
Squashed commit of the following:
TomAugspurger Oct 4, 2018
e4babea
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 4, 2018
7f6c144
merge conflict
TomAugspurger Oct 4, 2018
b4aa4ca
wip
TomAugspurger Oct 4, 2018
6a70131
indexes passing
TomAugspurger Oct 4, 2018
9aa077c
op names
TomAugspurger Oct 4, 2018
411738c
extension, arrays passing
TomAugspurger Oct 4, 2018
8e0fb69
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 9, 2018
6d98e85
fixup
TomAugspurger Oct 9, 2018
6d9e150
lint
TomAugspurger Oct 9, 2018
4899479
Fixed to_timestamp
TomAugspurger Oct 9, 2018
634def1
Same error message for index, series
TomAugspurger Oct 9, 2018
1f18452
Fix freq handling in to_timestamp
TomAugspurger Oct 9, 2018
2f92b22
dtype update
TomAugspurger Oct 9, 2018
23f232c
accept kwargs
TomAugspurger Oct 9, 2018
dd3b8cd
fixups
TomAugspurger Oct 9, 2018
1a7c360
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 9, 2018
87ecb64
updates
TomAugspurger Oct 9, 2018
0bde329
explicit
TomAugspurger Oct 9, 2018
2d85a82
add to assert
TomAugspurger Oct 9, 2018
438e6b5
wip period_array
TomAugspurger Oct 10, 2018
a9456fd
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 10, 2018
ac05365
wip period_array
TomAugspurger Oct 10, 2018
36ed547
order
TomAugspurger Oct 10, 2018
4652ca7
sort order
TomAugspurger Oct 10, 2018
a047a1b
test for hashing
TomAugspurger Oct 10, 2018
a4a30d7
update
TomAugspurger Oct 10, 2018
1441ae6
lint
TomAugspurger Oct 10, 2018
8003808
boxing
TomAugspurger Oct 10, 2018
5f43753
fix fixtures
TomAugspurger Oct 10, 2018
1c13d0f
infer
TomAugspurger Oct 10, 2018
bae6b3d
Remove seemingly unreachable code
TomAugspurger Oct 10, 2018
f422cf0
lint
TomAugspurger Oct 10, 2018
0229d74
wip
TomAugspurger Oct 12, 2018
aa40cf4
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 12, 2018
29085e1
Updates for master
TomAugspurger Oct 12, 2018
00ffddf
simplify
TomAugspurger Oct 12, 2018
e81fa9c
wip
TomAugspurger Oct 12, 2018
0c8925f
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 15, 2018
96204a1
remove view
TomAugspurger Oct 15, 2018
82930f7
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 17, 2018
8d24582
simplify
TomAugspurger Oct 17, 2018
1fc7744
lint
TomAugspurger Oct 17, 2018
6cd428c
Removed add_comparison_methods
TomAugspurger Oct 17, 2018
21693e0
xfail op
TomAugspurger Oct 17, 2018
b65ffad
remove some
TomAugspurger Oct 17, 2018
1f438e3
constructors
TomAugspurger Oct 17, 2018
f3928fb
Constructor cleanup
TomAugspurger Oct 17, 2018
089f8ab
misc fixups
TomAugspurger Oct 17, 2018
700650a
more xfails
TomAugspurger Oct 17, 2018
452c229
typo
TomAugspurger Oct 17, 2018
e3e0e57
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 18, 2018
78751c2
Added asi8
TomAugspurger Oct 18, 2018
203d561
Allow setting nan
TomAugspurger Oct 18, 2018
eb1c67d
revert breaking docs
TomAugspurger Oct 18, 2018
e08aa79
Override _add_sub_int_array
TomAugspurger Oct 18, 2018
c1ee04b
lint
TomAugspurger Oct 18, 2018
827e563
Update PeriodIndex._simple_new
TomAugspurger Oct 18, 2018
ca4a7fd
Clean up uses of .values, ._values, ._ndarray_values, ._data
TomAugspurger Oct 18, 2018
ed185c0
one more values
TomAugspurger Oct 18, 2018
b3407ac
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 18, 2018
a4011eb
remove xfails
TomAugspurger Oct 18, 2018
fc1ca3c
Fixed freq handling in _shallow_copy with a freq
TomAugspurger Oct 18, 2018
1b1841f
test updates
TomAugspurger Oct 18, 2018
b3b315a
API: Keep PeriodIndex.values an ndarray
TomAugspurger Oct 18, 2018
3ab4176
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 18, 2018
8102475
BUG: Raise for non-equal freq in take
TomAugspurger Oct 18, 2018
8c329eb
Punt on DataFrame.replace specializing
TomAugspurger Oct 18, 2018
78d4960
lint
TomAugspurger Oct 18, 2018
4e3d914
fixed xfail message
TomAugspurger Oct 18, 2018
5e4aaa7
TST: _from_datetime64
TomAugspurger Oct 19, 2018
7f77563
Fixups
TomAugspurger Oct 19, 2018
f88d6f7
escape
TomAugspurger Oct 19, 2018
7aa78ba
dtype
TomAugspurger Oct 19, 2018
2d737f8
revert and unxfail values
TomAugspurger Oct 19, 2018
833899a
error catching
TomAugspurger Oct 19, 2018
236b49c
isort
TomAugspurger Oct 19, 2018
8230347
Avoid PeriodArray.values
TomAugspurger Oct 19, 2018
bf33a57
clarify _box_func usage
TomAugspurger Oct 19, 2018
738acfe
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 19, 2018
032ec02
TST: unxfail ops tests
TomAugspurger Oct 19, 2018
77e389a
Avoid use of .values
jorisvandenbossche Oct 19, 2018
61031d7
__setitem__ type
TomAugspurger Oct 19, 2018
a094b3d
Misc cleanups
TomAugspurger Oct 19, 2018
ace4856
lint
TomAugspurger Oct 19, 2018
fc6a1c7
API: remove ordinal from period_array
TomAugspurger Oct 19, 2018
900afcf
catch exception
TomAugspurger Oct 19, 2018
0baa3e9
misc cleanup
TomAugspurger Oct 19, 2018
f95106e
Handle astype integer size
TomAugspurger Oct 19, 2018
e57e24a
Bump test coverage
TomAugspurger Oct 19, 2018
ce1c970
remove partial test
TomAugspurger Oct 19, 2018
a7e1216
close bracket
TomAugspurger Oct 19, 2018
2548d6a
change the test
TomAugspurger Oct 19, 2018
02e3863
isort
TomAugspurger Oct 19, 2018
1997cff
consistent _data
TomAugspurger Oct 19, 2018
af2d1de
lint
TomAugspurger Oct 19, 2018
64f5778
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 20, 2018
4151510
ndarray_values -> asi8
TomAugspurger Oct 20, 2018
ac9bd41
colocate ops
TomAugspurger Oct 20, 2018
5462bd7
refactor PeriodIndex.item
TomAugspurger Oct 20, 2018
c1c6428
return NotImplemented for Series / Index
TomAugspurger Oct 20, 2018
7ab2736
remove xpass
TomAugspurger Oct 20, 2018
bd6f966
release note
TomAugspurger Oct 22, 2018
8068daf
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 23, 2018
5691506
types, use data
TomAugspurger Oct 23, 2018
575d61a
remove ufunc xpass
TomAugspurger Oct 24, 2018
4065bdb
Merge remote-tracking branch 'upstream/master' into ea-period
TomAugspurger Oct 25, 2018
2 changes: 1 addition & 1 deletion pandas/core/arrays/__init__.py
@@ -4,7 +4,7 @@
from .categorical import Categorical # noqa
from .datetimes import DatetimeArrayMixin # noqa
from .interval import IntervalArray # noqa
from .period import PeriodArrayMixin # noqa
from .period import PeriodArray # noqa
from .timedeltas import TimedeltaArrayMixin # noqa
from .integer import ( # noqa
IntegerArray, integer_array)
2 changes: 1 addition & 1 deletion pandas/core/arrays/base.py
@@ -118,7 +118,7 @@ def _from_factorized(cls, values, original):
Parameters
----------
values : ndarray
An integer ndarray with the factorized values.
An ndarray with the unique factorized values.
original : ExtensionArray
The original ExtensionArray that factorize was called on.
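
The docstring correction above says that `_from_factorized` receives the unique values, not the integer codes. A minimal pure-Python sketch of that split, assuming hashable values; `toy_factorize` is a hypothetical helper for illustration, not the pandas implementation:

```python
def toy_factorize(values):
    """Split values into integer codes and first-seen unique values."""
    codes = []
    uniques = []
    position = {}  # maps each unique value to its index in `uniques`
    for value in values:
        if value not in position:
            position[value] = len(uniques)
            uniques.append(value)
        codes.append(position[value])
    return codes, uniques


# Ordinals with a repeat: the codes index into the uniques.
codes, uniques = toy_factorize([30, 31, 30])
```

In this sketch it is `uniques` (here `[30, 31]`) that would be handed back to the array type to rebuild an array of the original dtype, while `codes` stays a plain integer sequence.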

12 changes: 6 additions & 6 deletions pandas/core/arrays/datetimelike.py
@@ -11,7 +11,9 @@
from pandas._libs.tslibs.period import (
Period, DIFFERENT_FREQ_INDEX, IncompatibleFrequency)

from pandas.errors import NullFrequencyError, PerformanceWarning
from pandas.errors import (
NullFrequencyError, PerformanceWarning, AbstractMethodError
)
from pandas import compat

from pandas.tseries import frequencies
@@ -76,12 +78,10 @@ class AttributesMixin(object):
@property
def _attributes(self):
# Inheriting subclass should implement _attributes as a list of strings
from pandas.errors import AbstractMethodError
raise AbstractMethodError(self)

@classmethod
def _simple_new(cls, values, **kwargs):
from pandas.errors import AbstractMethodError
raise AbstractMethodError(cls)

def _get_attributes_dict(self):
@@ -118,7 +118,7 @@ def _box_func(self):
"""
box function to get object from internal representation
"""
raise com.AbstractMethodError(self)
raise AbstractMethodError(self)

def _box_values(self, values):
"""
@@ -351,13 +351,13 @@ def _add_datelike(self, other):
typ=type(other).__name__))

def _sub_datelike(self, other):
raise com.AbstractMethodError(self)
raise AbstractMethodError(self)

def _sub_period(self, other):
return NotImplemented

def _add_offset(self, offset):
raise com.AbstractMethodError(self)
raise AbstractMethodError(self)

def _add_delta(self, other):
return NotImplemented
240 changes: 216 additions & 24 deletions pandas/core/arrays/period.py
@@ -17,15 +17,21 @@
from pandas.util._decorators import cache_readonly
Contributor

Can you run isort (if you have not done so already) and remove this file from the non-checking list?

Contributor Author

Going to wait on that since there are other outstanding PRs touching imports.


from pandas.core.dtypes.common import (
is_integer_dtype, is_float_dtype, is_period_dtype)
is_integer_dtype, is_float_dtype, is_period_dtype,
is_float, is_integer, pandas_dtype, is_scalar,
is_datetime64_dtype,
ensure_object
)
from pandas.core.dtypes.dtypes import PeriodDtype
from pandas.core.dtypes.generic import ABCSeries
from pandas.core.dtypes.generic import ABCSeries, ABCIndex

import pandas.core.common as com

from pandas.tseries import frequencies
from pandas.tseries.frequencies import get_freq_code as _gfc
from pandas.tseries.offsets import Tick, DateOffset

from pandas.core.arrays import ExtensionArray
from pandas.core.arrays.datetimelike import DatetimeLikeArrayMixin


@@ -49,13 +55,16 @@ def _period_array_cmp(cls, op):

def wrapper(self, other):
op = getattr(self._ndarray_values, opname)
if isinstance(other, (ABCSeries, ABCIndex)):
other = other.values
Member

I thought we now tested that this was indeed returning NotImplemented?

Contributor Author

That's in #23155 (not merged yet, but I think ready to go).

We'll need to re-implement / adjust the test, since it uses EA.__add__(Series[EA]), which isn't defined for PeriodArray. We can do __sub__.

Contributor Author

I had to add an xfail for PeriodArray == Series[PeriodArray]. We don't return NotImplemented for comparison yet. But I think changing ops is out of scope for this PR.


if isinstance(other, Period):
if other.freq != self.freq:
msg = DIFFERENT_FREQ_INDEX.format(self.freqstr, other.freqstr)
raise IncompatibleFrequency(msg)

result = op(other.ordinal)
elif isinstance(other, PeriodArrayMixin):
elif isinstance(other, PeriodArray):
if other.freq != self.freq:
msg = DIFFERENT_FREQ_INDEX.format(self.freqstr, other.freqstr)
raise IncompatibleFrequency(msg)
@@ -70,6 +79,9 @@ def wrapper(self, other):
elif other is NaT:
result = np.empty(len(self._ndarray_values), dtype=bool)
result.fill(nat_result)
elif isinstance(other, (list, np.ndarray)):
# XXX: is this correct?
Member

No, but this is broken in the status quo, #21793

return NotImplemented
else:
other = Period(other, freq=self.freq)
result = op(other.ordinal)
@@ -82,7 +94,186 @@ def wrapper(self, other):
return compat.set_function_name(wrapper, opname, cls)
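
The review thread on this wrapper discusses returning `NotImplemented` when the other operand is a Series or Index, so that the container can take over the operation. The sketch below shows that general Python protocol only; `ToyArray` and `ToySeries` are hypothetical stand-ins, not pandas classes, and this is not the behavior the PR implements for comparisons (the thread notes that is left for later).

```python
class ToyArray:
    """Hypothetical stand-in for an extension array."""

    def __init__(self, ordinals):
        self.ordinals = list(ordinals)

    def __eq__(self, other):
        if isinstance(other, ToySeries):
            # Defer: Python will try the reflected ToySeries.__eq__ next.
            return NotImplemented
        if isinstance(other, ToyArray):
            return [a == b for a, b in zip(self.ordinals, other.ordinals)]
        return NotImplemented


class ToySeries:
    """Hypothetical stand-in for a Series wrapping a ToyArray."""

    def __init__(self, array):
        self.array = array

    def __eq__(self, other):
        if isinstance(other, ToyArray):
            # Unwrap, compare array to array, then re-wrap the result.
            return ToySeries(self.array == other)
        return NotImplemented


arr = ToyArray([1, 2, 3])
ser = ToySeries(ToyArray([1, 0, 3]))
result = arr == ser  # falls through to ToySeries.__eq__
```

Because the array declines the operation, the result type is decided by the container, which is the point of the pattern.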


class PeriodArrayMixin(DatetimeLikeArrayMixin):
class PeriodArray(DatetimeLikeArrayMixin, ExtensionArray):
"""
Pandas ExtensionArray for Period data.

There are two components to a PeriodArray

- ordinals
- freq

The values are physically stored as an ndarray of integers. These are
called "ordinals" and represent some kind of offset from a base.

The `freq` indicates the span covered by each element of the array.
All elements in the PeriodArray have the same `freq`.
"""
_attributes = ["freq"]
# --------------------------------------------------------------------
# Constructors

def __new__(cls, data=None, ordinal=None, freq=None, start=None, end=None,
periods=None, tz=None, dtype=None, copy=False,
Member

tz very likely doesn't belong

Member

Part of why the constructors for the existing mixins are bare-bones is because there are comments in the Index subclasses suggesting things like start/end should be taken out of them.

Member

As I mentioned on the mailing list, I would rather go for a very simple constructor (basically what _simple_new is now below). I don't think our array classes should have a __new__.
If it is too much work for this PR to refactor this __new__ method, we could also leave it for now as is but under another name (eg _complex_new :-))

Member

And I think we could then combine _simple_new and _from_ordinals?

Member

To maybe clarify what I mean, a small change to IntervalArray: https://github.com/pandas-dev/pandas/compare/master...jorisvandenbossche:intervalarray?expand=1 (needs better naming of course): remove fastpath (should be done anyway as it is not used), and use __init__ instead of __new__. Of course, the functionality of passing a single array-like (of scalars or of interval dtype) that is now in the __new__ could still be kept in the __init__ if we find that important.

Contributor Author

No preference on doing it as part of this PR or another. The diff is already impossible to read, so I may as well try doing it here.

**fields):
from pandas import PeriodIndex, DatetimeIndex, Int64Index

# copy-paste from PeriodIndex.__new__ with slight adjustments.
#
# - removed all uses of name
valid_field_set = {'year', 'month', 'day', 'quarter',
'hour', 'minute', 'second'}

if not set(fields).issubset(valid_field_set):
raise TypeError('__new__() got an unexpected keyword argument {}'.
format(list(set(fields) - valid_field_set)[0]))

if periods is not None:
if is_float(periods):
periods = int(periods)
elif not is_integer(periods):
msg = 'periods must be a number, got {periods}'
raise TypeError(msg.format(periods=periods))

if dtype is not None:
dtype = pandas_dtype(dtype)
if not is_period_dtype(dtype):
raise ValueError('dtype must be PeriodDtype')
if freq is None:
freq = dtype.freq
elif freq != dtype.freq:
msg = 'specified freq and dtype are different'
raise IncompatibleFrequency(msg)

# coerce freq to freq object, otherwise it can be coerced elementwise
# which is slow
if freq:
freq = Period._maybe_convert_freq(freq)

if data is None:
if ordinal is not None:
data = np.asarray(ordinal, dtype=np.int64)
else:
data, freq = cls._generate_range(start, end, periods,
freq, fields)
return cls._from_ordinals(data, freq=freq)

if isinstance(data, (PeriodArray, PeriodIndex)):
if freq is None or freq == data.freq: # no freq change
freq = data.freq
data = data._ndarray_values
else:
base1, _ = _gfc(data.freq)
base2, _ = _gfc(freq)
data = libperiod.period_asfreq_arr(data._ndarray_values,
base1, base2, 1)
return cls._simple_new(data, freq=freq)

# not array / index
if not isinstance(data, (np.ndarray, PeriodIndex,
DatetimeIndex, Int64Index)):
if is_scalar(data) or isinstance(data, Period):
# XXX
cls._scalar_data_error(data)

# other iterable of some kind
if not isinstance(data, (list, tuple)):
data = list(data)

data = np.asarray(data)

# datetime other than period
if is_datetime64_dtype(data.dtype):
data = dt64arr_to_periodarr(data, freq, tz)
return cls._from_ordinals(data, freq=freq)

# check not floats
if lib.infer_dtype(data) == 'floating' and len(data) > 0:
raise TypeError("PeriodIndex does not allow "
"floating point in construction")

# anything else, likely an array of strings or periods
data = ensure_object(data)
freq = freq or libperiod.extract_freq(data)
data = libperiod.extract_ordinals(data, freq)
return cls._from_ordinals(data, freq=freq)
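
The review thread on this constructor argues for a very simple `__init__` that only accepts already-validated ordinals, with parsing moved elsewhere. A minimal sketch of that split follows. `ToyPeriodArray` and `toy_period_array` are hypothetical names, and the annual mapping (ordinal = year - 1970) is an assumption made for illustration, not code from this PR.

```python
class ToyPeriodArray:
    """Hypothetical array whose constructor does no parsing at all."""

    def __init__(self, ordinals, freq):
        if not all(isinstance(x, int) for x in ordinals):
            raise TypeError("ordinals must be integers")
        self._data = list(ordinals)
        self.freq = freq


def toy_period_array(years, freq="A"):
    """Hypothetical factory: converts year-like inputs to ordinals.

    Assumes an annual frequency where ordinal 0 corresponds to 1970.
    """
    ordinals = [int(year) - 1970 for year in years]
    return ToyPeriodArray(ordinals, freq)


arr = toy_period_array(["2000", "2001"])
```

Keeping inference in a separate function means the class constructor stays cheap and predictable, which is the trade-off the reviewers describe.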

@property
def asi8(self):
return self._data.view("i8")
Member

Is this needed?

Member

It is also duplicated a bit below


@classmethod
def _from_sequence(cls, scalars, dtype=None, copy=False):
return cls(scalars, dtype=dtype, copy=copy)

@classmethod
def _from_factorized(cls, values, original):
return cls(values, dtype=original.dtype)

def __repr__(self):
return '<pandas PeriodArray>\n{}\nLength: {}, dtype: {}'.format(
[str(s) for s in self],
len(self),
self.dtype
)

def __len__(self):
return len(self._data)

def isna(self):
return self._data == iNaT

def take(self, indices, allow_fill=False, fill_value=None):
from pandas.core.algorithms import take

if fill_value is None:
fill_value = iNaT
elif isinstance(fill_value, Period):
fill_value = fill_value.ordinal
elif fill_value is NaT:
fill_value = iNaT
Member

possibly self._fill_value or self._na_value or something? Those attributes exist in a few places but I don't think get used often.

elif fill_value != self.dtype.na_value:
raise ValueError("Expected a Period.")

new_values = take(self._data,
indices,
allow_fill=allow_fill,
fill_value=fill_value)

return self._from_ordinals(new_values, self.freq)
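
The `take` method above maps a missing `fill_value` to the integer sentinel before delegating to the shared `take` routine. A pure-Python sketch of the fill semantics, assuming `-1` marks a missing position when `allow_fill=True`; `toy_take` is a hypothetical helper, not the pandas implementation:

```python
INAT = -2**63  # minimum int64, the sentinel pandas uses for NaT ordinals


def toy_take(ordinals, indices, allow_fill=False, fill_value=None):
    """Hypothetical pure-Python take over a list of ordinals."""
    if fill_value is None:
        fill_value = INAT
    result = []
    for i in indices:
        if allow_fill:
            if i < -1:
                raise ValueError("indices below -1 are invalid when allow_fill=True")
            # -1 marks a missing position, so emit the fill value.
            result.append(fill_value if i == -1 else ordinals[i])
        else:
            # Plain positional take: negative indices count from the end.
            result.append(ordinals[i])
    return result


filled = toy_take([10, 20, 30], [0, -1], allow_fill=True)
wrapped = toy_take([10, 20, 30], [0, -1])
```

The same index `-1` therefore means "missing" in one mode and "last element" in the other, which is why the fill value has to be normalized first.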

@property
def nbytes(self):
return self._data.nbytes

def copy(self, deep=False):
return self._from_ordinals(self._data.copy(), freq=self.freq)
Member

Should the deep arg be passed through to self._data.copy()?

Contributor Author

Yes, thanks.
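
To illustrate the question in this thread, here is a minimal sketch of a `copy` that honors `deep`. `ToyArray` is a hypothetical class backed by a plain list, so this only shows the intended semantics, not how the PR resolves it for ndarray-backed data.

```python
class ToyArray:
    """Hypothetical array wrapper backed by a Python list."""

    def __init__(self, data):
        self._data = data

    def copy(self, deep=False):
        # A deep copy duplicates the buffer; a shallow copy shares it.
        data = list(self._data) if deep else self._data
        return ToyArray(data)


original = ToyArray([1, 2, 3])
shallow = original.copy()
deep = original.copy(deep=True)

# Mutating the original is visible through the shallow copy only.
original._data[0] = 99
```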


@classmethod
def _concat_same_type(cls, to_concat):
freq = {x.freq for x in to_concat}
assert len(freq) == 1
freq = list(freq)[0]
values = np.concatenate([x._data for x in to_concat])
return cls._from_ordinals(values, freq=freq)

def value_counts(self, dropna=False):
from pandas.core.algorithms import value_counts
from pandas.core.indexes.period import PeriodIndex

if dropna:
values = self[~self.isna()]._data
else:
values = self._data

result = value_counts(values)
index = PeriodIndex._from_ordinals(result.index,
name=result.index.name,
freq=self.freq)
return type(result)(result.values,
index=index,
name=result.name)

@property
def _box_func(self):
return lambda x: Period._from_ordinal(ordinal=x, freq=self.freq)
@@ -114,21 +305,6 @@ def freq(self, value):
FutureWarning, stacklevel=2)
self._freq = value

# --------------------------------------------------------------------
# Constructors

_attributes = ["freq"]

def __new__(cls, values, freq=None, **kwargs):
if is_period_dtype(values):
# PeriodArray, PeriodIndex
if freq is not None and values.freq != freq:
raise IncompatibleFrequency(freq, values.freq)
freq = values.freq
values = values.asi8

return cls._simple_new(values, freq, **kwargs)

@classmethod
def _simple_new(cls, values, freq=None, **kwargs):
"""
@@ -264,7 +440,7 @@ def asfreq(self, freq=None, how='E'):
if self.hasnans:
new_data[self._isnan] = iNaT

return self._simple_new(new_data, self.name, freq=freq)
return self._simple_new(new_data, freq=freq)
Member

Good catch, maybe better to use shallow_copy?


# ------------------------------------------------------------------
# Arithmetic Methods
@@ -319,7 +495,7 @@ def _add_delta(self, other):
ordinal_delta = self._maybe_convert_timedelta(other)
return self.shift(ordinal_delta)

def shift(self, n):
def shift(self, periods=1):
"""
Specialized shift which produces an Period Array/Index

@@ -332,7 +508,8 @@ def shift(self, n):
-------
shifted : Period Array/Index
"""
values = self._ndarray_values + n * self.freq.n
# TODO: ensure we match EA semantics, not PeriodIndex
values = self._ndarray_values + periods * self.freq.n
if self.hasnans:
values[self._isnan] = iNaT
return self._shallow_copy(values=values)
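
The `shift` in this hunk adds `periods * self.freq.n` to every ordinal and then restores the missing-value sentinel, so it moves each value in time rather than moving positions in the array. A pure-Python sketch of that arithmetic; `toy_shift` and its `freq_n` parameter are hypothetical names standing in for the method and `self.freq.n`:

```python
INAT = -2**63  # minimum int64, the sentinel pandas uses for NaT ordinals


def toy_shift(ordinals, periods=1, freq_n=1):
    """Shift each ordinal in time, leaving missing entries untouched."""
    return [
        INAT if value == INAT else value + periods * freq_n
        for value in ordinals
    ]


shifted = toy_shift([0, INAT, 5], periods=2)
```

The length and order of the array are unchanged, which is the difference from a positional shift that the TODO comment in the diff alludes to.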
@@ -384,9 +561,15 @@ def _maybe_convert_timedelta(self, other):
raise IncompatibleFrequency(msg.format(cls=type(self).__name__,
freqstr=self.freqstr))

@classmethod
def _scalar_data_error(cls, data):
raise TypeError('{0}(...) must be called with a collection of some '
'kind, {1} was passed'.format(cls.__name__,
repr(data)))


PeriodArrayMixin._add_comparison_ops()
PeriodArrayMixin._add_datetimelike_methods()
PeriodArray._add_comparison_ops()
PeriodArray._add_datetimelike_methods()


# -------------------------------------------------------------------
@@ -486,3 +669,12 @@ def _make_field_arrays(*fields):
else np.repeat(x, length) for x in fields]

return arrays


def dt64arr_to_periodarr(data, freq, tz):
if data.dtype != np.dtype('M8[ns]'):
raise ValueError('Wrong dtype: %s' % data.dtype)

freq = Period._maybe_convert_freq(freq)
base, mult = _gfc(freq)
return libperiod.dt64arr_to_periodarr(data.view('i8'), base, tz)
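
The `PeriodArray` docstring in this diff describes the stored values as integer "ordinals" that represent an offset from a base, with `freq` giving the span of each element. A small sketch of that idea for a monthly frequency, assuming ordinal 0 corresponds to January 1970; the helper names are hypothetical and this is not the `libperiod` implementation:

```python
def month_to_ordinal(year, month):
    """Number of whole months between January 1970 and the given month."""
    return (year - 1970) * 12 + (month - 1)


def ordinal_to_month(ordinal):
    """Inverse of month_to_ordinal; also handles negative ordinals."""
    years, month0 = divmod(ordinal, 12)
    return 1970 + years, month0 + 1


ordinal = month_to_ordinal(2018, 10)
roundtrip = ordinal_to_month(ordinal)
before_base = ordinal_to_month(-1)
```

Storing only the integer and a single shared `freq` is what lets the array keep its data in one integer buffer.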