DISC: Period.le #17692

jbrockmendel · 2017-09-27T06:32:27Z

There are a bunch of issues outstanding that relate to Period ops and comparison. AFAICT making a decision about the comparison issue will snowball into resolving (some of) the ops issues.

#5202 ENH: Period ops NaT & timedelta ops
#10798 Date / Datetime in Period Index
#6779 Adding Period and Offset not implemented
#13077 ENH/API: Decide what to return Period - Period subtraction
#17112 MultiIndex - Comparison with Mixed Frequencies (and other FUBAR)

Right now two Period objects are comparable iff they equal freq attributes. AFAICT this is to avoid guessing in cases where the "correct" answer is ambiguous. But there are some other cases with an obviously correct answer. Namely, if the per1.end_time < per2.start_time, then it should be the case that per1 < per2 unambiguously. This intuition also extends to datetime and Timestamp objects that do not lie between per.start_time and per.end_time.

For cases with overlap there are a couple of reasonable approaches. My preferred approach is lexicographic: first compare per1.start_time with per2.start_time. If they are equal, then compare the end_times. Then we treat datetime and Timestamp objects as analogous to zero-duration periods.

Thoughts?

The text was updated successfully, but these errors were encountered:

shoyer · 2017-09-27T08:39:59Z

Another point of reference is Interval, which uses lexicographic ordering on (start, stop, closed):

pandas/pandas/_libs/interval.pyx

Lines 91 to 94 in d3be81a

    
           if _interval_like(other): 
        
               self_tuple = (self.left, self.right, self.closed) 
        
               other_tuple = (other.left, other.right, other.closed) 
        
               return PyObject_RichCompare(self_tuple, other_tuple, op)

So if we care about ordering periods with different frequencies, this would be a reasonable choice. On the other hand, if we don't have use cases for this, from a usability perspective it is probably better to stick with refusing to order potentially overlapping periods.

Then we treat datetime and Timestamp objects as analogous to zero-duration periods.

So period < timestamp would effectively compare period.start_time < timestamp? I suppose we could argue that this is consistent, but here I really don't think it's a good idea. It's better to force users to be explicit about what they want rather than allow them to make mistaken assumptions (period.end_time < timestamp is at least as plausible).

jreback · 2017-09-27T11:52:05Z

yeah in an ideal work we would actually use an Interval to back a Period. Yes we should clarify the semantics here.

jbrockmendel · 2017-09-27T15:07:37Z

So period < timestamp would effectively compare period.start_time < timestamp? I suppose we could argue that this is consistent, but here I really don't think it's a good idea

Yah, it's sub-optimal. The question becomes if we want to implement the unambiguous cases and continue raising for ambiguous ones vs leave existing behavior. Even for cases with overlap __eq__ and __ne__ should usually be unambiguous.

On the other hand, if we don't have use cases for this, from a usability perspective it is probably better to stick with refusing to order potentially overlapping periods.

From #17112, set_index uses comparisons under the hood:

index = pd.Index(['PCE']*2, name='Variable')
data = [
	pd.Period('2018Q1'),
	pd.Period('2021', freq='A'),
	]
ser = pd.Series(data, index=index, name='Period')
ser.to_frame().set_index('Period', append=True)

In py2 raises NotImplementedError: > 1 ndim Categorical are not supported at this time and in py3 raises SystemError: <built-in function isinstance> returned a result with an error set

jbrockmendel · 2017-10-01T20:41:56Z

Doesn't look like there's a lot of enthusiasm for this. One more shot with a less ambitious fix: just define __ne__ and __eq__ to evaluate unequal instead of raising.

jbrockmendel · 2017-11-21T03:43:17Z

Lack of consensus, closing.

jreback added API Design Period Period data type labels Sep 27, 2017

jbrockmendel closed this as completed Nov 21, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DISC: Period.le #17692

DISC: Period.le #17692

jbrockmendel commented Sep 27, 2017

shoyer commented Sep 27, 2017

jreback commented Sep 27, 2017

jbrockmendel commented Sep 27, 2017

jbrockmendel commented Oct 1, 2017

jbrockmendel commented Nov 21, 2017

DISC: Period.__le__ #17692

DISC: Period.__le__ #17692

Comments

jbrockmendel commented Sep 27, 2017

shoyer commented Sep 27, 2017

jreback commented Sep 27, 2017

jbrockmendel commented Sep 27, 2017

jbrockmendel commented Oct 1, 2017

jbrockmendel commented Nov 21, 2017

DISC: Period.le #17692

DISC: Period.le #17692