Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add simulated data from Gostic, 2020 for benchmarking #37

Merged
merged 11 commits into from
Sep 12, 2024

Conversation

zsusswein
Copy link
Collaborator

@zsusswein zsusswein commented Sep 6, 2024

This commit re-uses the data processing and documentation from CDCgov/cfa-epinow2-pipeline#17. That repo is open source and public domain by nature of being USG property. I think it's convenient to re-use, but @athowes and @kgostic there's a little text in there that you both suggested, so I'd appreciate if you could give your permission for the re-use here! I've made sure to credit you both in the commit as it's partially your writing.

It's probably best practice to make the data prep and processing here fully reproducible in data-raw/ but it's quite a pain to do, with a mixture of shells cripting and R scripting needed. I skipped it out of convenience, but if you have a really strong feeling that it's required @seabbs, let me know and we can revisit. I did it for the generation interval PMF but not the dataset recreation.

Closes #9

Co-authored-by: Adam Howes [email protected]
Co-authored-by: Katie Gostic [email protected]

This commit re-uses the data processing and documentation from CDCgov/cfa-epinow2-pipeline#17. That repo is open source and public domain by nature of being USG property. I think it's convenient to re-use, but @athowes and @kgostic there's a little text in there that you both suggested, so I'd appreciate if you could give your permission for the re-use here! I've made sure to credit you both in the commit as it's partially your writing.

It's probably best practice to make the data prep and processing here fully reproducible in `data-raw/` but it's quite a pain to do, with a mixture of shells cripting and R scripting needed. I skipped it out of convenience, but if you have a really strong feeling that it's required @seabbs, let me know and we can revisit.

Closes #9

Co-authored-by: Adam Howes <[email protected]>
Co-authored-by: Katie Gostic <[email protected]>
Copy link

codecov bot commented Sep 6, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files

📢 Thoughts on this report? Let us know!

@zsusswein zsusswein marked this pull request as ready for review September 6, 2024 21:54
Copy link
Collaborator

@seabbs seabbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its reasonable to skip the toy rt data generating scripts but ideally I would like to the exact code that produced it if its available (i.e the CFA code?). I also think that the code for the pmf should be very minimal so maybe include.

@zsusswein
Copy link
Collaborator Author

ideally I would like to the exact code that produced it

I'll do it with primarycensoreddist. How stable is that API?

Copy link
Collaborator

@athowes athowes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Sam about might as well document going from the data to the PMF in data-raw if that isn't already something you've got. Also perhaps move to primarycensordist

@seabbs
Copy link
Collaborator

seabbs commented Sep 10, 2024

I'll do it with primarycensoreddist. How stable is that API?

For the R code its stable unless I receive any feedback with strong views

@zsusswein zsusswein requested review from seabbs and athowes September 11, 2024 02:45
Copy link
Collaborator

@seabbs seabbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Remaining conversation points are really questions as to whether features here should actually be part of primarycensorreddist (i.e do they have a wider application).

Copy link
Collaborator

@kgostic kgostic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really am squirming seeing my name all over this and would like it if you could call the dataset something like "sudden-Rt-change" but this isn't the hill I'm going to die on if it makes a bunch of work for you.

Agree too that it would be ideal to be able to re-create the dataset and think that's a lift we could tackle in the future, but think it's ok to remain an issue in the backlog for now!

@zsusswein zsusswein merged commit bd6d033 into main Sep 12, 2024
10 checks passed
@zsusswein zsusswein deleted the 09-simulated-data branch September 12, 2024 15:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Simulated data from Gostic, 2020
4 participants