Adding data guide. #268
Comments
I'll put my hand up for some of this. I think I can do:
Thinking about this, it almost feels like something which might draft better with a wiki approach rather than concurrent PRs on a branch. One suggestion I would make is to work through a flowchart/decision tree of the data available and then model the new code not on
This approach is more likely to give people a framework with the parts they need already in place than a very generic framework with many methods that may not be necessary for most use cases. It may be useful to fill out the taxonomy above for general reference too. What I don't think I can do, apart from asking more questions, is advise on testing or on debugging (two different things). Debugging the R6 code has been a learning experience; I'm not sure I'm doing it the best way possible, but perhaps someone else has some tips we could share.
That would be amazing, Richard. I really like the flowchart idea; that seems like a much nicer way of handling it. We could use GitHub wikis for both this and the contributing guide (as suggested by @kathsherratt)? That might make changing them easier and more natural. Totally agree on the testing/debugging issues. I also don't think I have found the optimal workflow. Perhaps if we all chip in with suggestions we can agree on some good working practices?
My personal workflow has been to implement each method in turn, load the class and then use
Good - I am starting with a wiki formulation in https://github.com/epiforecasts/covidregionaldata/wiki/Adding-Data-(wiki-draft). I've drawn in from @seabbs blob and will also pull in some of the text I'd written for a vignette on this topic for 0.8.3. Currently I'm littering it with headers as a way of outlining - please feel free to edit this text too, including adding comments/questions/suggestions. I would propose to draft on the wiki for a few days and then transfer it into a PR (I think it's useful to have this documentation in the package), but we may choose to leave it on the wiki.
Just skimmed through where this is at. Looks really great currently. Really like the Q and A design. I think I am quite in favour of leaving it in the wiki (so that is a 100% course change from me) but very open to alternative views on that.
Thanks for the kind comments. Can someone who is familiar with the ECDC / WHO / JRC code sketch out or write a version of what could be said about doing this sort of a system? I've looked a bit at these Classes but really am not familiar and don't know what counts as a good example and what counts as an edge case which is functional but to be avoided. I think I know what I want to write more in the cleaning single levels and cleaning multiple levels, I think I've got a sense of what needs to go in the incremental and testing sections. I have not written about only using One other thing I might suggest is that if people know of open data sources which we don't yet include, we could start a "wish list" of these which could be added. It could be faster than me going looking for new data sources, but would have to come with the clear statement that listing something doesn't mean it's automatically, or even ever, going to be added. (There may be complications or limitations which aren't apparent, or which the nominator doesn't consider important.) |
@joseph-palmer as you are handling adding the JHU and Google data, sketching out some docs for this kind of class might be a good intro? Maybe we should leave documenting the I thought we might wishlist datasets as a discussion section? Agree we should post to this and definitely stress that it is very much not a given anything will be implemented based on it being listed. https://github.com/epiforecasts/covidregionaldata/discussions/categories/new-datasets
I've put a page up at https://github.com/epiforecasts/covidregionaldata/wiki/Adding-a-national-data-source and will keep going over it to make sure it's clear. I will update it as I add in the JHU and Google data, as I'm sure that will raise steps I've missed. Could even do a walk-through of creating it, which might be nice.
I'll look at this again in more detail today/tomorrow, but are we happy with how the package links to these docs at the moment?
I think it’s fine. I just looked and I cannot find how the package links to these, but I think that’s probably ok. I have to imagine that anyone who wants to add a data set will at least visit the GitHub page and may find their way to the wiki. I note (and this is something which I guess may go into the pkgdown.yml) a typo in the following line:
classes. Let’s also put in a sentence there saying that individual data classes may have params which are documented in their data class but can be passed to
This looks like it's finished, so closing.
The adding data guide needs to be expanded. The current version is here: https://github.com/epiforecasts/covidregionaldata/blob/master/.github/ADDING-DATA.md
It should include:
self
in top level method calls where possible).