fixed timedelta issue in longest_contiguous_slice #725
Codecov Report
@@           Coverage Diff           @@
##           master     #725   +/-   ##
=======================================
  Coverage   90.80%   90.80%
=======================================
  Files          67       67
  Lines        6740     6740
=======================================
  Hits         6120     6120
  Misses        620      620
darts/timeseries.py
@@ -1468,7 +1468,7 @@ def longest_contiguous_slice(self, max_gap_size: int = 0) -> 'TimeSeries':
        max_slice_start = None
        max_slice_end = None
        for index, row in relevant_gaps.iterrows():
            size = row['gap_start'] - curr_slice_start - self._freq
Does changing the order of the operands make a difference here?
Yes.
row['gap_start'] - curr_slice_start
gives a time delta in days, from which we cannot subtract the frequency.
row['gap_start'] - self._freq
gives a timestamp, from which we can subtract curr_slice_start.
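To illustrate the point about result types, here is a minimal sketch using plain pandas objects. The dates and the standalone variable names are hypothetical stand-ins for the values in the diff, and `pd.offsets.MonthBegin()` is assumed as the offset behind a month-start frequency.

```python
import pandas as pd

# Hypothetical values standing in for the variables in the diff.
freq = pd.offsets.MonthBegin()  # assumed offset for a month-start ('MS') series
curr_slice_start = pd.Timestamp("2021-01-01")
gap_start = pd.Timestamp("2021-04-01")

# Timestamp - Timestamp yields a Timedelta: a plain duration
# that no longer knows which calendar dates it spans.
elapsed = gap_start - curr_slice_start

# Timestamp - DateOffset yields a Timestamp, so the calendar context is kept.
last_valid = gap_start - freq

# Subtracting the slice start afterwards is again Timestamp - Timestamp.
size = last_valid - curr_slice_start

print(type(elapsed).__name__, type(last_valid).__name__, type(size).__name__)
```

Only the second ordering is exercised here; the first one stops at `elapsed`, which is the value the thread says a month-start frequency cannot be subtracted from.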
Could we change our API to avoid relying on the order of the operands? Because this seems a bit fragile... I would expect a subtraction to always return the same type.
This is how pandas handles the subtraction:
pd.Timestamp - pd.Timestamp -> pd.Timedelta
pd.Timestamp - pd.DateOffset (freq) -> pd.Timestamp
As a Timedelta doesn't carry information about the exact dates, we cannot subtract irregular frequencies (such as month start 'MS') from it, which is why the original operand order fails.
For regular frequencies (such as 'D'), the order doesn't matter for the result.
I don't believe it is possible to account for this in our API (as we deal with dates) but maybe I'm missing something.
Maybe we could make it more obvious with:
curr_slice_end = row['gap_start'] - self._freq
size = curr_slice_end - curr_slice_start
?
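A rough sketch of the two-step form proposed above, pulled out into a standalone function so it can be tried outside of darts. The helper name `slice_size` and the sample dates are hypothetical and not part of the darts code base; only the two subtractions come from the suggestion.

```python
import pandas as pd


def slice_size(gap_start: pd.Timestamp,
               curr_slice_start: pd.Timestamp,
               freq: pd.DateOffset) -> pd.Timedelta:
    """Hypothetical helper: duration between a slice's first and last timestamp."""
    # Step back one period from the gap start while we still have a Timestamp,
    # so calendar-based offsets such as month start can be applied.
    curr_slice_end = gap_start - freq
    # Timestamp - Timestamp: the slice duration as a Timedelta.
    return curr_slice_end - curr_slice_start


# Same helper, one calendar-based frequency and one fixed-length frequency.
monthly = slice_size(pd.Timestamp("2021-04-01"),
                     pd.Timestamp("2021-01-01"),
                     pd.offsets.MonthBegin())
daily = slice_size(pd.Timestamp("2021-01-11"),
                   pd.Timestamp("2021-01-01"),
                   pd.offsets.Day())
print(monthly, daily)
```

Naming the intermediate `curr_slice_end` makes it visible that the offset is applied to a timestamp, which is the property the operand order depends on.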
+1 with your solution (and adding a comment) @dennisbader
…t8co/darts into fix/longest_contiguous_slice
* fixed timedelta issue in longest_contiguous_slice
* made slice size calculation more comprehensible
Fixes #716
Summary
TimeSeries.longest_contiguous_slice()