Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a day-of-week effect to simulated dataset #91

Merged
merged 6 commits into from
Dec 2, 2024

Conversation

zsusswein
Copy link
Collaborator

@zsusswein zsusswein commented Nov 29, 2024

The current timeseries doesn't have a day of week effect, which makes it hard to test how the day of week effect implementation is performing.
This modifies the simulated timeseries to add a large-ish day of week reporting effect.

In implementing this change, I converted the timeseries over to using a data-raw script for reproducibility.
I moved the original Gostic "Practical considerations" dataset into an RDS object in data-raw.
This change makes the dataset tracked and available for re-generation of the package dataset, but doesn't expose it to or ship it with the package.
It's not 100% reproducible because I don't implement the git clone of the "Practical considerations" repo, but I think it gets us a fair but further than before.

I also clean up the package-provided stochastic_sir_rt datset.
I rename the obs_incidence col to obs_cases, add a obs_cases_no_dow for testing, and rename date to reference_date in anticipation of a future report_date.

Note that this change causes the non-day-of-week model performance to decrease in the vignette.
I'm expecting this change will be temporary until we can implement #89, which this PR unblocks.

Closes #90

Copy link

codecov bot commented Nov 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files

📢 Thoughts on this report? Let us know!

@zsusswein zsusswein marked this pull request as draft November 29, 2024 17:16
@zsusswein zsusswein marked this pull request as ready for review November 29, 2024 17:35
Copy link
Collaborator

@kgostic kgostic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks reasonable but please double check the infections vs. cases language and conceptual choices!

@zsusswein
Copy link
Collaborator Author

Hm yeah I'll take another pass at the docs.

For the simulated data, we've been entirely ignoring infection to case observation delays for now. See, for example, the predict method in the vignette: https://cdcgov.github.io/cfa-gam-rt/articles/RtGam.html

Although not explored vs simulated data, we currently implement just a mean-shift as a correction. I have it on the roadmap to explore that more vs simulated data in #42 and we're discussing potential different approaches for the model in #88.

@kgostic
Copy link
Collaborator

kgostic commented Nov 29, 2024 via email

@seabbs seabbs enabled auto-merge (squash) December 1, 2024 19:45
Copy link
Contributor

@seabbs seabbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once the suggestions from @kgostic are actioned.

The current timeseries doesn't have a day of week effect, which makes it hard to test how the day of week effect implementation is performing.
This modifies the simulated timeseries to add a large-ish day of week reporting effect.

In implementing this change, I converted the timeseries over to using a `data-raw` script for reproducibility.
I moved the original Gostic "Practical considerations" dataset into an RDS object in `data-raw`.
This change makes the dataset tracked and available for re-generation of the package dataset, but doesn't expose it to or ship it with the package.
It's not 100% reproducible because I don't implement the git clone of the "Practical considerations" repo, but I think it gets us a fair but further than before.

I also clean up the package-provided `stochastic_sir_rt` datset.
I rename the `obs_incidence` col to `obs_cases`, add a `obs_cases_no_dow` for testing, and rename `date` to `reference_date` in anticipation of a future `report_date`.

Closes #90
Was silently producing and broadcasting day of week vector of length 13
out to 299. This bug was leading to a phasing of the day-of-week effect
and making it impossible to fit a meaningful effect.
@zsusswein zsusswein force-pushed the add-day-of-week-to-sim-data branch from 63bde98 to 064b4e5 Compare December 2, 2024 16:15
@seabbs seabbs merged commit 5d7f019 into main Dec 2, 2024
10 checks passed
@seabbs seabbs deleted the add-day-of-week-to-sim-data branch December 2, 2024 19:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Simulated day-of-week effect
3 participants