Proposed future workflow #161
I think the current functionality is a lot cleaner and will be easier for new users coming to the package. It looks like at least one of the functions in the second code chunk is already exported from {socialmixr}, therefore I have two questions:
Thanks, looks like some motivation/context was missing from the original post. The
One related question is whether we want to be able to address #143 - at the moment the
Re (1) yes we could and perhaps it's the way forward, but from a maintenance point of view it would help to be a bit more prescriptive on the workflow.
I would change
to
On CRAN {o2geosocial} and {finalsize} are reverse (Suggests:) dependencies. Not sure how we feel about potential breaking changes. If long term the idea is to move to {contactmatrix} then I think we should maintain
IMO treating
Section comments (below) from
Potential mappings are...
as_contact_survey() {
  ## === check and clean survey
  ## === check if specific countries are requested (if a survey contains data from multiple countries)
  ## === merge participants and contacts into a single data table
}
filter() {
  ## === check if any filters have been requested
}
process_age() {
  ## === age processing: deal with ranges and missing data
  ## === adjust age.group.brakes to the lower and upper ages in the survey
  ## === process contact age ranges / missing ages
}
as_contact_survey() {
  ## === check if specific countries are requested (if a survey contains data from multiple countries)
}
Thanks for the additional context. The misunderstanding/misinterpretation comes from the number of arguments in
I agree that the proposed refactor is the right development direction, as the single point of user interaction (
The plan is to use the
What I meant was that the risk comes from doing lots of processing inside the function that isn't really exposed to the user (e.g. there is currently no way for the user to see the result of interpolating within age ranges, or of pulling in and manipulating demographic data - they only see the end result in the matrix). In that sense I see a risk of misinterpretation due to lack of visibility of what happens inside the
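To illustrate what "visible" age processing could look like, here is a minimal base R sketch. It is not socialmixr code: `impute_age_from_range()` is a hypothetical helper, the column names are assumed for illustration, and using the midpoint of the reported range is just one possible rule. The point is only that the step returns a table the user can inspect, including a flag for which ages were estimated.

```r
# Hypothetical helper, not part of socialmixr: fill in a missing exact age
# from a reported age range, using the midpoint of that range.
# Column names (cnt_age_exact, cnt_age_est_min, cnt_age_est_max) are assumed.
impute_age_from_range <- function(contacts) {
  # Rows with no exact age but with both ends of a range available
  needs_estimate <- is.na(contacts$cnt_age_exact) &
    !is.na(contacts$cnt_age_est_min) &
    !is.na(contacts$cnt_age_est_max)

  contacts$cnt_age <- contacts$cnt_age_exact
  contacts$cnt_age[needs_estimate] <-
    (contacts$cnt_age_est_min[needs_estimate] +
       contacts$cnt_age_est_max[needs_estimate]) / 2

  # Keep a record of which ages were estimated so the user can check them
  contacts$age_imputed <- needs_estimate
  contacts
}

toy_contacts <- data.frame(
  cnt_age_exact = c(12, NA, NA),
  cnt_age_est_min = c(NA, 20, NA),
  cnt_age_est_max = c(NA, 30, NA)
)

imputed <- impute_age_from_range(toy_contacts)
print(imputed)
```

Because the result is an ordinary data frame, the user can look at `imputed$age_imputed` before any matrix is computed, rather than only seeing the final matrix.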
Agree it can potentially be a bit confusing to have the demography generated by a
Am I correct that the below line basically defines the raw POLYMOD data object with some aggregation?
uk_contacts <- polymod |>
  filter(country == "United Kingdom") |>
  process_age(age_limits = c(0, 1, 5, 15))
So perhaps clearer to have
uk_data <- polymod |>
  filter(country == "United Kingdom") |>
  process_age(age_limits = c(0, 1, 5, 15))
Then can extract contacts and/or demography as user needs, like you suggest?
The demographic data isn't actually stored inside the polymod object. When calling
uk_pop <- uk_contacts |>
  country_population()
the
The proposed workflow feels less "magic" but I think it's a good thing. There is a lot going on in socialmixr's code and in particular in
The tradeoff, which is still definitely worth it in my opinion, is that users will have to write a couple more lines of code and spend more time understanding what is happening behind the scenes.
Another benefit, not yet mentioned in this thread, is that this would resolve the case of arguments accepting multiple types (e.g.,
Beyond the general agreement in terms of direction, I think this also raises a lot of practical questions:
Great discussion! While breaking the functionality into steps could improve user understanding, it is currently very convenient for users to obtain a contact matrix from a survey dataset without having to deal with normalization, demographic adjustments, filtering, and other complexities. From an end-user perspective, I would therefore advocate for retaining the current contact_matrix(...) interface.
However, the current implementation, with most functionality directly coded inside a single function, presents challenges for further development. For users looking to extend the code, this design results in a steep learning curve. I would recommend reorganizing the contact_matrix() function into an "interface function" that delegates tasks to multiple sub-functions (potentially without including all features). Additionally, it would be beneficial to provide documentation guiding users through the stepwise process for more in-depth analyses or for utilizing all features when needed. This approach would enhance usability for both end-users and developers looking to extend or adapt the functionality.
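A rough sketch of the "interface function" idea, in base R only. Everything here is hypothetical and deliberately simplified: `filter_country()`, `assign_age_groups()`, `count_contacts()` and `toy_contact_matrix()` are made-up names, the toy survey is a single flat table, and the result is a plain table of contact counts rather than a properly normalised contact matrix. It is only meant to show how one convenient call can delegate to small steps that can also be run one at a time.

```r
# Hypothetical step functions; none of these are socialmixr's actual API.

# Step 1: keep only rows from the requested country
filter_country <- function(survey, country) {
  survey[survey$country == country, , drop = FALSE]
}

# Step 2: map participant and contact ages to age-group indices
assign_age_groups <- function(survey, age_limits) {
  # findInterval() returns the index of the lower limit each age falls above
  survey$part_group <- findInterval(survey$part_age, age_limits)
  survey$cnt_group <- findInterval(survey$cnt_age, age_limits)
  survey
}

# Step 3: cross-tabulate contacts by participant and contact age group
count_contacts <- function(survey, age_limits) {
  groups <- seq_along(age_limits)
  table(
    participant = factor(survey$part_group, levels = groups),
    contact = factor(survey$cnt_group, levels = groups)
  )
}

# Interface function: one call for convenience, delegating to the steps above
toy_contact_matrix <- function(survey, country, age_limits) {
  filtered <- filter_country(survey, country)
  grouped <- assign_age_groups(filtered, age_limits)
  count_contacts(grouped, age_limits)
}

# Toy data: one row per reported contact (made-up values)
toy_survey <- data.frame(
  country = c("United Kingdom", "United Kingdom", "Italy", "United Kingdom"),
  part_age = c(3, 20, 40, 0),
  cnt_age = c(30, 2, 10, 0)
)
age_limits <- c(0, 1, 5, 15)

# Convenient single call
m <- toy_contact_matrix(toy_survey, "United Kingdom", age_limits)

# The same result step by step, with every intermediate object inspectable
uk <- filter_country(toy_survey, "United Kingdom")
uk_grouped <- assign_age_groups(uk, age_limits)
m_stepwise <- count_contacts(uk_grouped, age_limits)

print(m)
```

With this layout the end-user interface stays a single call, while anyone who wants to check an intermediate result (for example `uk_grouped`) can run the steps individually.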
Thanks for all the comments. Synthesising the comments I would suggest to:
In the process of addressing #131, workflows for estimating contact matrices will change. The proposal is to move some of the functionality hidden inside the contact_matrix() function into separate functions. This means that instead of (currently)
One would do something like
It's quite a lot more typing, but hopefully more explicit and with less black-box processing, and thus less likely to lead to misinterpretation / misunderstanding.