Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Cleanup backend for Offsets and Period #5148

Closed
wants to merge 4 commits into from

Conversation

cancan101
Copy link
Contributor

This PR is a work in progress

Fixes #5306 - Removing unused imports
Fixes #5082 - Added inferred_freq_offset
Fixes #5028 - fixed issues with _offset_map
Fixes #4878 - freq parameter for Period init supports full set of Offsets.
Fixes #5418 - raise DateParseError rather than incorrect KeyError for invalid frequencies

@@ -73,7 +73,7 @@ def get_freq(freq):
return freq


def get_freq_code(freqstr):
def get_freq_code(freqstr, as_periodstr=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs to default to do what it did before. [because I think this is actually a public method]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jtratner Am I missing something here? The change should by default not change behavior.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you updte docstring for parameters and returns

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cancan101 please make sure to update docstring here

@jtratner
Copy link
Contributor

jtratner commented Oct 8, 2013

I know it seems weird, but feels like it would be much easier to just have DateOffset subclass str.

@cancan101
Copy link
Contributor Author

@jtratner While there may be benefits to having DateOffset subclass str, I don't think we can rely on the rulecode being == corresponding DateOffset even in the case where DateOffset subclass str.

@jtratner
Copy link
Contributor

jtratner commented Oct 8, 2013

no, it's not for eq, you'd override that anyways, it's more to not have to create 2 more properties (freq_obj, inferred_freq_offset) and a bunch of corresponding methods

@cancan101
Copy link
Contributor Author

That might work as long as you made sure to return the str value was always the canonical string.

For example you never got back 'Q' but rather instead got 'Q-MAR'.

@jtratner
Copy link
Contributor

jtratner commented Oct 8, 2013

whatever you're going to have for freq_obj, clearly you could just shift that to be returned by freq, and whatever freqstr you'd have on freq_obj would be what you'd want. I don't really think a str subclass would work, it would just be a very elegant solution if it did.

I'm assuming you're going to change freq and inferred_freq to be just:

def freq(self):
     return getattr(self.freq_obj, 'freqstr', None)
def inferred_freq(self):
    return getattr(self.inferred_freq_offset, 'freqstr', None)

@cancan101
Copy link
Contributor Author

Currently freq and freqstr are written:

    # alias to offset
    @property
    def freq(self):
        return self.offset

    @property
    def freqstr(self):
        return self.offset.freqstr

I wrote this, but eventially this could be made more elegeant:

    @cache_readonly
    def inferred_freq_offset(self):
        try:
            return get_offset(self.inferred_freq)
        except ValueError:
            return None

@cancan101
Copy link
Contributor Author

@jtratner I am getting stuck now with the following:

creating a period requires a period "code" or "base" (an integer). This integer code is used by tslib.period_ordinal to convert the date to an ordinal. Unfortunately, get_period_ordinal is in cython which makes working with it difficult. What do you think of having some way of checking the DateOffset for an instance method "get_period_ordinal" before calling to cython? That way new DateOffset that want to handle their own conversion to periods can do so on the python class rather than having to add a new "period code" and then to add cython code?

The issue what I am trying to solve is taking data collected in 52-53 week fiscal years and up sampling to weeks. By design this should be a very clean upsampling (the quarters/years always end on the same day of the week).

@jtratner
Copy link
Contributor

@cancan101 what specifically are you having trouble with in regards to get_period_ordinal. Usually functions are written in cython for performance, so I'd prefer not to add a hasattr check anywhere.

@cancan101
Copy link
Contributor Author

I am still hacking around. Any hasattr will be done outside of any loop.
i.e. if it does end up needed, it will be called once per numpy array (not
per element)

On Wed, Oct 23, 2013 at 8:57 PM, Jeff Tratner [email protected]:

@cancan101 https://github.com/cancan101 what specifically are you
having trouble with in regards to get_period_ordinal. Usually functions
are written in cython for performance, so I'd prefer not to add a hasattrcheck anywhere.


Reply to this email directly or view it on GitHubhttps://github.com//pull/5148#issuecomment-26959204
.

@jtratner
Copy link
Contributor

@cancan101 okay, well that was my quick response to your question. 😄 Ping me when you have something you'd like to ask about. To be clear: we've already merged the 52-53 week offset, so why wasn't this an issue in that PR when it is an issue now?

@cancan101
Copy link
Contributor Author

There is no issue with the FY5253 et al.

There are a number of pandas offsets which are not fully handled as periods. This DOES include the new 5253 offsets, but it also includes BQuarter (see #5091) and some others.

I am trying to clean some of that up.

@jtratner
Copy link
Contributor

@cancan101 okay - that's totally fine.

@jtratner
Copy link
Contributor

Can you move the location of the 52-53 week classes in the offsets.py file? When you rebased they ended up inserted between QuarterOffset and the various offsets that subclass QuarterOffset, makes it harder to read.

@cancan101
Copy link
Contributor Author

Sure. I will tack on to this pr.
On Oct 23, 2013 9:50 PM, "Jeff Tratner" [email protected] wrote:

Can you move the location of the 52-53 week classes in the offsets.py
file? When you rebased they ended up inserted between QuarterOffset and
the various offsets that subclass QuarterOffset, makes it harder to read.


Reply to this email directly or view it on GitHubhttps://github.com//pull/5148#issuecomment-26961152
.

@jtratner
Copy link
Contributor

great

@cancan101
Copy link
Contributor Author

@jtratner

Do you think that if:
pd.tseries.frequencies.get_offset(X) == pd.tseries.frequencies.get_offset(Y)
then
repr(pd.tseries.frequencies.get_offset(X))==repr(pd.tseries.frequencies.get_offset(Y))
?

@jtratner
Copy link
Contributor

Nope, not going to be that way because of the special-case backwards-comptible reprs. Not a big deal.

@cancan101
Copy link
Contributor Author

I actually think it can be. Or in many more cases it can be with a very small change.

@cancan101
Copy link
Contributor Author

At least for:

    'QS': 'QS-JAN',
    'BQ': 'BQ-DEC',
    'BQS': 'BQS-JAN',

@cancan101
Copy link
Contributor Author

@jtratner If you have time, could you take a look? I have a first pass at making it easier to add new periods / correctly support existing periods without having to get into cython code.

@@ -1391,6 +1391,13 @@ def inferred_freq(self):
except ValueError:
return None

@cache_readonly
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you sure this can be cached like this? Is it possible that it will become inferred later on?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inferred_freq property is also marked as cache_readonly so I don't think so.

@jreback jreback modified the milestones: 0.14.1, 0.14.0 Apr 27, 2014
@cancan101
Copy link
Contributor Author

This is now passing tests. LMK what else needs to be done to get this merged.

@nehalecky
Copy link
Contributor

Was really hoping to see this get in 0.14, oh well. :(

Does it look likely to make it for 0.14.1? I want to integrate it in dev efforts asap to clean up our attempts for more sanity from the Offsets and Periods. Thanks @cancan101, this is an epic one.

@jreback
Copy link
Contributor

jreback commented May 19, 2014

well, this needs a rebase to even see it this works

@jreback
Copy link
Contributor

jreback commented May 30, 2014

@cancan101 alex if you can rebase this on master I think can do this for 0.14.1

@jreback jreback modified the milestones: 0.15.0, 0.14.1 Jun 10, 2014
@nehalecky
Copy link
Contributor

Hi all. Any progress? I guess a no go for 0.14.1 but still would be nice go get into master when possible after so we can get to know it. I was super excited about this one. 😃

@cancan101
Copy link
Contributor Author

I have this rebased and passing tests. Should I update release notes for 0.14.1? Or just wait for 0.15?

@jreback
Copy link
Contributor

jreback commented Jun 30, 2014

0.15.0

@jreback
Copy link
Contributor

jreback commented Jul 28, 2014

closing. this would need a major rebase to be accepted. pls submit pr's for single inssues at a time. much easier to review / integrate. thanks.

@jreback jreback closed this Jul 28, 2014
@cancan101
Copy link
Contributor Author

I think actually have done the rebase .
On Jul 28, 2014 10:59 AM, "jreback" [email protected] wrote:

Closed #5148 #5148.


Reply to this email directly or view it on GitHub
#5148 (comment).

@cancan101
Copy link
Contributor Author

It's on a separate branch.
On Jul 28, 2014 11:00 AM, [email protected] wrote:

I think actually have done the rebase .
On Jul 28, 2014 10:59 AM, "jreback" [email protected] wrote:

Closed #5148 #5148.


Reply to this email directly or view it on GitHub
#5148 (comment).

@jreback
Copy link
Contributor

jreback commented Jul 28, 2014

needs to target separate issues. linking them just makes this too complicated.

@sinhrks sinhrks mentioned this pull request Aug 7, 2014
11 tasks
@cancan101
Copy link
Contributor Author

Are there any parts of this PR that still make sense to try to get into master?

@jreback
Copy link
Contributor

jreback commented Oct 23, 2014

Fixes #5306 - Removing unused imports
Fixes #5082 - Added inferred_freq_offset
Fixes #5028 - fixed issues with _offset_map
Fixes #4878 - freq parameter for Period init supports full set of Offsets.
Fixes #5418 - raise DateParseError rather than incorrect KeyError for invalid frequencies

@jreback
Copy link
Contributor

jreback commented Oct 23, 2014

These would need to be done as separata PRs because
#5306 (can be done in 0.15.1)
#5028, #5418, #5028 also ok in 0.15.1

but
#5082 is the prob issue here and is an API change (or maybe not, but hard to tell).

so love to have these worked on (and the code as changed a bit since ths was originally posted)

@cancan101
Copy link
Contributor Author

I did want to re-bump this PR / content, but I don't personally have time right now to tackle these.

@jreback
Copy link
Contributor

jreback commented Oct 23, 2014

ok these issues r open and referenced

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Frequency DateOffsets
Projects
None yet
4 participants