BUG: resample by BusinessHour raises ValueError #12351

Open
scari opened this issue Feb 16, 2016 · 8 comments
Labels
Bug Frequency DateOffsets Resample resample method

Comments

@scari
Contributor

scari commented Feb 16, 2016

I dug into this a bit and found that resampling by BusinessHour ends up in the branch below, since the offset is neither a string nor a Tick:
https://github.com/pydata/pandas/blob/master/pandas/tseries/resample.py#L999

    if not isinstance(offset, Tick):  # and first.time() != last.time():
        # hack!
        first = first.normalize()
        last = last.normalize()

The normalize() calls in _get_range_edges cut the time parts off first and last, and that causes the exception.
If I change the condition to the following, the resample runs without error:

    if not isinstance(offset, Tick) and not isinstance(offset, BusinessHour):
        # hack!
        first = first.normalize()
        last = last.normalize()

If this is the wrong approach, please let me know. I'll make a PR with a test.
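To make the effect of normalize concrete, here is a minimal sketch using the endpoints from the reproducible example. The helper upper_edge is hypothetical and not part of pandas. It assumes that _get_range_edges obtains the upper bin edge by stepping one offset past last, which is not shown in the snippet quoted above, so treat it as an illustration of the reasoning rather than the actual implementation.

```python
import pandas as pd

# Endpoints taken from the reproducible example in this issue.
first = pd.Timestamp("2016-02-11 00:00:00")
last = pd.Timestamp("2016-02-12 23:00:00")
offset = pd.offsets.BusinessHour()  # default business hours, 09:00 to 17:00


def upper_edge(last_ts, off, normalize):
    """Hypothetical helper (assumption): optionally normalize the last
    timestamp, then step one offset forward to get the upper bin edge."""
    if normalize:
        # normalize() resets the time of day to midnight
        last_ts = last_ts.normalize()
    return last_ts + off


edge_normalized = upper_edge(last, offset, normalize=True)
edge_raw = upper_edge(last, offset, normalize=False)

print("edge with normalize:   ", edge_normalized)
print("edge without normalize:", edge_raw)
# With normalize, the edge lands before the last data point,
# so that point has no bin to fall into.
print("normalized edge covers last value:", edge_normalized >= last)
print("raw edge covers last value:       ", edge_raw >= last)
```

If the assumption holds, the normalized edge falls on the morning of 2016-02-12, while the data runs until 23:00 that day. That would be consistent with the "Values falls after last bin" error in the traceback.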

Reproducible code

In [4]: import pandas as pd

In [5]: df = pd.DataFrame(index=pd.date_range(
                                         start='2016-02-11 00:00:00',
                                         end='2016-02-12 23:00:00',
                                         freq='H'))

In [6]: df.resample('BH').count()

Traceback

In [6]: df.resample('BH').count()
> /Users/scari/codes/scari/pandas/pandas/tseries/resample.py(988)_get_range_edges()
-> if isinstance(offset, compat.string_types):
(Pdb) c
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-0ce6b0e15edc> in <module>()
----> 1 df.resample('BH').count()

/Users/scari/codes/scari/pandas/pandas/tseries/resample.py in f(self, _method)
    457
    458     def f(self, _method=method):
--> 459         return self._groupby_and_aggregate(None, _method)
    460     f.__doc__ = getattr(GroupBy, method).__doc__
    461     setattr(Resampler, method, f)

/Users/scari/codes/scari/pandas/pandas/tseries/resample.py in _groupby_and_aggregate(self, grouper, how, *args, **kwargs)
    339
    340         if grouper is None:
--> 341             self._set_binner()
    342             grouper = self.grouper
    343

/Users/scari/codes/scari/pandas/pandas/tseries/resample.py in _set_binner(self)
    193
    194         if self.binner is None:
--> 195             self.binner, self.grouper = self._get_binner()
    196
    197     def _get_binner(self):

/Users/scari/codes/scari/pandas/pandas/tseries/resample.py in _get_binner(self)
    201         """
    202
--> 203         binner, bins, binlabels = self._get_binner_for_time()
    204         bin_grouper = BinGrouper(bins, binlabels)
    205         return binner, bin_grouper

/Users/scari/codes/scari/pandas/pandas/tseries/resample.py in _get_binner_for_time(self)
    476         if self.kind == 'period':
    477             return self.groupby._get_time_period_bins(self.ax)
--> 478         return self.groupby._get_time_bins(self.ax)
    479
    480     def _downsample(self, how, **kwargs):

/Users/scari/codes/scari/pandas/pandas/tseries/resample.py in _get_time_bins(self, ax)
    879         # general version, knowing nothing about relative frequencies
    880         bins = lib.generate_bins_dt64(
--> 881             ax_values, bin_edges, self.closed, hasnans=ax.hasnans)
    882
    883         if self.closed == 'right':

/Users/scari/codes/scari/pandas/pandas/lib.pyx in pandas.lib.generate_bins_dt64 (pandas/lib.c:20883)()
   1160
   1161     if values[lenidx-1] > binner[lenbin-1]:
-> 1162         raise ValueError("Values falls after last bin")
   1163
   1164     bins = np.empty(lenbin - 1, dtype=np.int64)

ValueError: Values falls after last bin
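The guard that raises in the last frame compares the final value against the final bin edge. Below is a simplified pure-Python stand-in for that check, not the Cython code itself. The function name check_last_bin and the short_edges list are made up for illustration, and the last edge of 2016-02-12 10:00 is an assumed value, not one taken from the traceback.

```python
from datetime import datetime, timedelta


def check_last_bin(values, bin_edges):
    """Simplified stand-in (assumption) for the guard shown in the
    traceback: the last value must not lie after the last bin edge."""
    if values[-1] > bin_edges[-1]:
        raise ValueError("Values falls after last bin")


# Hourly values matching the reproducible example:
# 2016-02-11 00:00 through 2016-02-12 23:00 (48 points).
start = datetime(2016, 2, 11, 0, 0)
values = [start + timedelta(hours=h) for h in range(48)]

# Illustrative bin edges whose last edge ends before the data does.
short_edges = [datetime(2016, 2, 11, 9, 0), datetime(2016, 2, 12, 10, 0)]

try:
    check_last_bin(values, short_edges)
    outcome = "ok"
except ValueError as exc:
    outcome = str(exc)

print(outcome)
```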

INSTALLED VERSIONS

commit: 8f1a318
python: 3.4.4.final.0
python-bits: 64
OS: Darwin
OS-release: 15.0.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: ko_KR.UTF-8
LANG: ko_KR.UTF-8

pandas: 0.18.0rc1+8.g8f1a318.dirty
nose: 1.3.7
pip: 7.1.2
setuptools: 19.2
Cython: 0.23.4
numpy: 1.10.2
scipy: 0.16.0
statsmodels: 0.6.1
xarray: None
IPython: 4.0.0
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.4
matplotlib: 1.4.3
openpyxl: 2.2.6
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.7.7
lxml: 3.4.4
bs4: 4.4.1
html5lib: None
httplib2: 0.9.1
apiclient: None
sqlalchemy: 1.0.9
pymysql: None
psycopg2: None
jinja2: 2.8

@jreback
Contributor

jreback commented Feb 16, 2016

Hmm, I'm not really sure that BusinessHour has a lot of tests.

@jreback jreback added this to the 0.18.1 milestone Feb 16, 2016
@scari
Contributor Author

scari commented Feb 17, 2016

I found a glitch in business hour resampling while writing a test for this. I will open another issue when I finish investigating.

@scari
Contributor Author

scari commented May 22, 2017

Working on this at PyCon.

scari pushed a commit to scari/pandas that referenced this issue May 23, 2017
scari pushed a commit to scari/pandas that referenced this issue May 23, 2017
scari pushed a commit to scari/pandas that referenced this issue May 23, 2017
scari pushed a commit to scari/pandas that referenced this issue May 23, 2017
scari pushed a commit to scari/pandas that referenced this issue May 24, 2017
scari pushed a commit to scari/pandas that referenced this issue May 24, 2017
@romulomadu-zz

Hi guys, has this problem been solved? I'm facing the same issue here.
Thanks

@scari
Contributor Author

scari commented Oct 11, 2017

@romulomadu I'm working on it.
#16447

@romulomadu-zz

romulomadu-zz commented Oct 11, 2017

Thanks @scari, good luck!

@tylerwmarrs

Is there any update on this issue? I ran into it and would like to know when the fix will be released.

@jreback
Contributor

jreback commented Jan 5, 2019

@tylerwmarrs you are welcome to contribute a patch

7 participants