Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fixed timedelta issue in longest_contiguous_slice #725

Merged
merged 7 commits into from
Jan 14, 2022

Conversation

dennisbader
Copy link
Collaborator

Fixes #716

Summary

  • fixed timedelta issue in TimeSeries.longest_contiguous_slice()

@codecov-commenter
Copy link

codecov-commenter commented Jan 8, 2022

Codecov Report

Merging #725 (050f4d5) into master (3882a0c) will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #725   +/-   ##
=======================================
  Coverage   90.80%   90.80%           
=======================================
  Files          67       67           
  Lines        6740     6740           
=======================================
  Hits         6120     6120           
  Misses        620      620           
Impacted Files Coverage Δ
darts/timeseries.py 86.03% <100.00%> (+0.01%) ⬆️
darts/utils/likelihood_models.py 96.04% <0.00%> (-0.01%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3882a0c...050f4d5. Read the comment docs.

@@ -1468,7 +1468,7 @@ def longest_contiguous_slice(self, max_gap_size: int = 0) -> 'TimeSeries':
max_slice_start = None
max_slice_end = None
for index, row in relevant_gaps.iterrows():
size = row['gap_start'] - curr_slice_start - self._freq
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does changing the order of the operands make a difference here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes,

  • row['gap_start'] - curr_slice_start gives a time delta in days from which we cannot subtract the frequency.
  • row['gap_start'] - self._freq gives a time stamp from which we can subtract curr_slice_start

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we change our API to avoid relying on the order of the operands? Because this seems a bit fragile... I would expect a subtraction to always return the same type.

Copy link
Collaborator Author

@dennisbader dennisbader Jan 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how pandas handles the subtraction

  • pd.Timestamp - pd.Timestamp -> pd.TimeDelta
  • pd.Timestamp - pandas DateOffset (freq) -> pd.Timestamp

As TimeDelta doesn't carry information about the exact dates, we cannot subtract irregular frequencies (such as month start 'MS', etc.) which is why the original operand order fails.

For regular frequencies (such as 'D') the order doesn't matter for the results.

I don't believe it is possible to account for this in our API (as we deal with dates) but maybe I'm missing something.

Maybe we could make it more obvious with:

curr_slice_end = row['gap_start'] - self._freq
size = curr_slice_end - curr_slice_start

?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 with your solution (and adding a comment) @dennisbader

@dennisbader dennisbader requested a review from brunnedu as a code owner January 13, 2022 21:29
@dennisbader dennisbader merged commit adde52a into master Jan 14, 2022
@dennisbader dennisbader deleted the fix/longest_contiguous_slice branch January 14, 2022 09:28
dennisbader added a commit that referenced this pull request Jan 15, 2022
* fixed timedelta issue in longest_contiguous_slice

* made slice size calculation more comprehensible
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] longest_contiguous_slice throws TypeError if max_gap_size is greater or equal than 'gap_size'
4 participants