-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Day of week effect #89
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅ Additional details and impacted files📢 Thoughts on this report? Let us know! |
dd0a2da
to
753367f
Compare
753367f
to
b5a75b7
Compare
Gratia development version works fine, but the current CRAN version has a bug in its factor handling. When passing (known) factor levels, it errors out when constructing the dataset in `data_slic()`. This error wasn't showing up in the tests because I had installed the development version from Github. But it was getting caught on CI and in the vignette because those were installing the latest CRAN version. I had to implement an imperfect workaround here. It's messy because each individual factor level is "far" from the data in mgcv's definition -- at least far enough to throw a warning. It would be pretty labor intensive to reconstruct the associated levels because we explicitly set them to zero with `exclude`. I don't want to do all the work of reconstructing the matching levels because to then immediately toss them. So instead, I've suppressed the warning -- an imperfect but "good enoough" solution...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks correct to me though is convoluted as you say. I am not sure there is much of a way around that.
I have flagged some things for new issues and a few points of complexity we are starting to see from the vector based interface (IMO). The only actionable thing to check is whether you can get away with passing reference_date
as the default to day_of_week
and so avoid having an argument with multiple supported types.
Co-authored-by: Sam Abbott <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good Zack! I left a few fussy comments and I lost steam on the tests, but overall I think this is ready.
Implements a day of week effect for the model and handling for day of week in predictions. The model will automatically implement a day of week by default, but also allows for a custom vector of levels to be passed to the model or for the day of week estimation to be turned off. The predict method will automatically handle day of week effects where possible and, when not possible, throws informative error messages with changes needed and why.
Note that this implementation took some custom handling logic and fiddling with some internal gratia errors. I think I tested that all pretty carefully and documented it thoroughly, but it took a couple of tries. It wasn't quite as easy I was originally hoping last week.
Closes #87