resample gives AmbiguousTimeError when index ends on a DST boundary #10117
I saw that this bug is still open so I thought I should re-test Pandas 0.16.2 to see if the bug had been fixed along the way. Unfortunately this bug still exists. I have included the new traceback below (because it has different line numbers to the traceback above):
In [79]: import pandas as pd
In [80]: idx = pd.date_range("2014-10-25 22:00:00", "2014-10-26 00:30:00",freq="30T",
tz="Europe/London")
In [81]: series = pd.Series(np.random.randn(len(idx)), index=idx)
In [82]: series
Out[82]:
2014-10-25 22:00:00+01:00 -1.315553
2014-10-25 22:30:00+01:00 0.294073
2014-10-25 23:00:00+01:00 0.067067
2014-10-25 23:30:00+01:00 0.710251
2014-10-26 00:00:00+01:00 -0.192490
2014-10-26 00:30:00+01:00 0.661763
Freq: 30T, dtype: float64
In [83]: series.resample('30T')
---------------------------------------------------------------------------
AmbiguousTimeError Traceback (most recent call last)
<ipython-input-83-bb9e86068ce1> in <module>()
----> 1 series.resample('30T')
/usr/local/lib/python2.7/dist-packages/pandas/core/generic.pyc in resample(self, rule, how, axis, fill_method, closed, label, convention, kind, loffset, limit, base)
3264 fill_method=fill_method, convention=convention,
3265 limit=limit, base=base)
-> 3266 return sampler.resample(self).__finalize__(self)
3267
3268 def first(self, offset):
/usr/local/lib/python2.7/dist-packages/pandas/tseries/resample.pyc in resample(self, obj)
80
81 if isinstance(ax, DatetimeIndex):
---> 82 rs = self._resample_timestamps()
83 elif isinstance(ax, PeriodIndex):
84 offset = to_offset(self.freq)
/usr/local/lib/python2.7/dist-packages/pandas/tseries/resample.pyc in _resample_timestamps(self, kind)
270 axlabels = self.ax
271
--> 272 self._get_binner_for_resample(kind=kind)
273 grouper = self.grouper
274 binner = self.binner
/usr/local/lib/python2.7/dist-packages/pandas/tseries/resample.pyc in _get_binner_for_resample(self, kind)
118 kind = self.kind
119 if kind is None or kind == 'timestamp':
--> 120 self.binner, bins, binlabels = self._get_time_bins(ax)
121 elif kind == 'timedelta':
122 self.binner, bins, binlabels = self._get_time_delta_bins(ax)
/usr/local/lib/python2.7/dist-packages/pandas/tseries/resample.pyc in _get_time_bins(self, ax)
159 first, last = ax.min(), ax.max()
160 first, last = _get_range_edges(first, last, self.freq, closed=self.closed,
--> 161 base=self.base)
162 tz = ax.tz
163 binner = labels = DatetimeIndex(freq=self.freq,
/usr/local/lib/python2.7/dist-packages/pandas/tseries/resample.pyc in _get_range_edges(first, last, offset, closed, base)
389 if (is_day and day_nanos % offset.nanos == 0) or not is_day:
390 return _adjust_dates_anchored(first, last, offset,
--> 391 closed=closed, base=base)
392
393 if not isinstance(offset, Tick): # and first.time() != last.time():
/usr/local/lib/python2.7/dist-packages/pandas/tseries/resample.pyc in _adjust_dates_anchored(first, last, offset, closed, base)
456
457 return (Timestamp(fresult).tz_localize(first_tzinfo),
--> 458 Timestamp(lresult).tz_localize(first_tzinfo))
459
460
pandas/tslib.pyx in pandas.tslib.Timestamp.tz_localize (pandas/tslib.c:10551)()
pandas/tslib.pyx in pandas.tslib.tz_localize_to_utc (pandas/tslib.c:50619)()
AmbiguousTimeError: Cannot infer dst time from Timestamp('2014-10-26 01:00:00'),
try using the 'ambiguous' argument
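For anyone reading the error message above: 01:00 on 2014-10-26 happens twice in Europe/London, once in BST (+01:00) and once in GMT (+00:00), which is why localizing it without a hint is ambiguous. A minimal sketch of what the `ambiguous` argument does, assuming a recent pandas where `Timestamp.tz_localize` accepts a boolean for it (this only illustrates the argument; it is not a fix for `resample` itself):

```python
import pandas as pd

# The wall-clock time from the error message, with no timezone attached.
naive = pd.Timestamp("2014-10-26 01:00:00")

# ambiguous=True picks the DST reading (BST, UTC+01:00),
# ambiguous=False picks the standard-time reading (GMT, UTC+00:00).
as_bst = naive.tz_localize("Europe/London", ambiguous=True)
as_gmt = naive.tz_localize("Europe/London", ambiguous=False)

print(as_bst)  # 2014-10-26 01:00:00+01:00
print(as_gmt)  # 2014-10-26 01:00:00+00:00

# Same wall-clock time, but the two instants are one hour apart.
print(as_gmt - as_bst)
```

The traceback suggests `_adjust_dates_anchored` localizes the bin edge without passing any such hint, so the default behaviour (raise) is what surfaces here.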
If it were fixed, it would be closed. Pull requests are welcome.
cc @rockg
This also happens when the index starts on a DST boundary:
df = pd.DataFrame(columns=['a', 'b', 'c'], index=pd.date_range('2014-03-09 03:00', '2015-03-09 03:00', freq='H', tz='America/Chicago')).assign(a=np.random.rand(), b=np.random.rand(), c=np.random.rand())
df.resample('H', label='right', closed='right').sum()
results in:
Here's the bug:
This is my hacky work-around:
The bug disappears if the end date of the index is beyond the DST boundary:
All is fine if the start date is on a DST boundary but the end date is beyond the boundary:
(As always, I must say a huge THANK YOU to everyone working on Pandas; it really is a great bit of software)
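One general way to sidestep DST ambiguity while resampling is to do the binning in UTC, which has no DST transitions, and convert back afterwards. The sketch below is an assumption on my part and not necessarily the work-around mentioned above; `resample_via_utc` is a hypothetical helper name, and it uses the newer `.resample(...).mean()` API and the `"30min"` alias rather than the 0.16-era call.

```python
import pandas as pd


def resample_via_utc(series, rule):
    """Resample a tz-aware series by binning in UTC (hypothetical helper).

    UTC has no DST, so the bin edges can never land on an ambiguous
    or nonexistent local time. The result is converted back to the
    original timezone.
    """
    original_tz = series.index.tz
    in_utc = series.tz_convert("UTC")
    return in_utc.resample(rule).mean().tz_convert(original_tz)


# Same index as in the report: it ends right before the DST boundary.
idx = pd.date_range("2014-10-25 22:00:00", "2014-10-26 00:30:00",
                    freq="30min", tz="Europe/London")
series = pd.Series(range(len(idx)), index=idx, dtype="float64")

result = resample_via_utc(series, "30min")
print(result)
```

Note that the bins are aligned to UTC rather than to local time. For sub-hourly rules in a zone with whole-hour offsets that makes no difference, but daily bins would start at UTC midnight instead of local midnight.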
Possibly related: