-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal (and bugfix) to implement value-based conditional requirement #1511
Proposal (and bugfix) to implement value-based conditional requirement #1511
Conversation
…509-una-validation-missing-uniform-search-radius
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@phargogh I really like where this is going! Our existing validation doesn't yet evaluate conditional requirement of CSV columns or vector fields, so I think that would need to be added. I think we just left it out of the original ARGS_SPEC
design to keep things simple, but it would be a great improvement to have.
I found two other models where I left in a placeholder conditional requrirement for CSV columns. It would be great to have those working too:
invest/src/natcap/invest/forest_carbon_edge_effect.py
Lines 83 to 106 in 9e50106
"c_below": { | |
"type": "number", | |
"units": u.metric_ton/u.hectare, | |
"required": "pools_to_calculate == 'all'", | |
"about": gettext( | |
"Carbon density value for the belowground carbon " | |
"pool. Required if calculating all pools.") | |
}, | |
"c_soil": { | |
"type": "number", | |
"units": u.metric_ton/u.hectare, | |
"required": "pools_to_calculate == 'all'", | |
"about": gettext( | |
"Carbon density value for the soil carbon pool. " | |
"Required if calculating all pools.") | |
}, | |
"c_dead": { | |
"type": "number", | |
"units": u.metric_ton/u.hectare, | |
"required": "pools_to_calculate == 'all'", | |
"about": gettext( | |
"Carbon density value for the dead matter carbon " | |
"pool. Required if calculating all pools.") | |
}, |
invest/src/natcap/invest/urban_cooling_model.py
Lines 72 to 96 in 9e50106
"shade": { | |
"type": "ratio", | |
"required": "cc_method == factors", | |
"about": gettext( | |
"The proportion of area in this LULC class that is " | |
"covered by tree canopy at least 2 meters high. " | |
"Required if the 'factors' option is selected for " | |
"the Cooling Capacity Calculation Method.")}, | |
"albedo": { | |
"type": "ratio", | |
"required": "cc_method == factors", | |
"about": gettext( | |
"The proportion of solar radiation that is directly " | |
"reflected by this LULC class. Required if the " | |
"'factors' option is selected for the Cooling " | |
"Capacity Calculation Method.")}, | |
"building_intensity": { | |
"type": "ratio", | |
"required": "cc_method == intensity", | |
"about": gettext( | |
"The ratio of building floor area to footprint " | |
"area, with all values in this column normalized " | |
"between 0 and 1. Required if the 'intensity' option " | |
"is selected for the Cooling Capacity Calculation " | |
"Method.")} |
It's interesting that we've done this before in a limited way with boolean inputs by lumping in their true/false value with "input sufficiency". Which has worked well, but would it make more sense to explicitly check the value of the input like you propose here? For instance,
"required": "do_valuation.value == True"
vs"required": "do_valuation.value"
vs"required": "do_valuation"
?
Conditional requirement now evaluates the truthiness of the expression based on the value of the parameter. If the parameter is missing, that parameter's value is Falsy. RE:natcap#1509
The latest update to using truthiness in conditional requirement expressions handles all of these cases. RE:natcap#1509
This ended up not being needed if we just work directly with truthiness. RE:natcap#1509
…ting tests to write. RE:natcap#1509
…s' of github.com:phargogh/invest into bugfix/1509-una-validation-missing-uniform-search-radius
Thanks so much for taking a look at this @emlys ! I made a few changes that ultimately resulted in a larger refactor that really simplified Expressions Now Use Arg ValuesSymbols in expressions that represent args keys will now contain the user-provided value of the arg. If the key is not present in Even though we have been assuming so far that keys in expressions represent sufficiency (whether a key has a non-empty value in args), sufficiency is really just the truthy evaluation of the arg value ... so why not just use the arg value itself? There was only one place in all of InVEST where the expression needed to be changed (in UCM) because of this change in meaning. This also gets around your question about the value or sufficiency of checkboxes. By switching to using the values of the checkboxes, we're being more explicit about just using the value of the input (whatever type that is), and allowing the programmer to decide how to handle its type in an expression that evaluates to a Expression Results are Cast to
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @phargogh! I really like how this simplifies validation. I had some questions but nothing major.
src/natcap/invest/validation.py
Outdated
insufficient_keys = set() | ||
for key in spec.keys(): | ||
try: | ||
sufficient_inputs[key] = bool(args[key]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would an arg value of 0
be labeled insufficient here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This specific line has been removed in a prior edit, but this is a case that the authors of expressions will need to be aware of. One cost of this proposed simplification is that it is possible for there to be a falsey value (like 0
) which was previously acknowledged as "sufficient" to now be effectively insufficient. This can be mitigated by better, context-aware expressions in cases where this may be an issue, so I think we'll largely be OK. In practice, I don't think there are many places where we are likely to encounter a value of 0
.
sufficient_keys = set(args.keys()).difference(insufficient_keys) | ||
for key in sufficient_keys.difference(excluded_keys): | ||
insufficient_keys = (missing_keys | keys_with_no_value) | ||
for key in set(args.keys()) - insufficient_keys: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm understanding, there is a subtle change here: before, we removed excluded_keys
from the set to validate. Now, any keys that have a value will be validated, even if they're not required. I'm not sure what the correct answer is, but it seems worth noting the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this exclusion was definitely something I needed to address, and the fix now includes a better variable name for excluded_keys
to clarify how they were being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@phargogh might there still be a difference if a non-required key has a real value? Suppose you have a valuation_table
that's required only if do_valuation
is True. Then if we provide args where do_valuation
is False and valuation_table
is a valid path, should we still validate valuation_table
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question, and thank you for thinking critically about this!
This specific snippet we're discussing here isn't updating for some reason, but there is now logic to cover the case where a non-required parameter has a truthy value. So yes, in the latest revision (at least as of rev 617e76d) I would expect the validation you describe to take place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah okay, thanks for clarifying. I think that validation would not take place before, because excluded_keys
were removed from the set to validate:
invest/src/natcap/invest/validation.py
Lines 1003 to 1004 in 9a70648
sufficient_keys = set(args.keys()).difference(insufficient_keys) | |
for key in sufficient_keys.difference(excluded_keys): |
Is that a change worth noting in history?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! I see what you mean. Sorry for my confusion about this! Yes, this is indeed a change in behavior that I think is for the best and I have noted the changes in HISTORY
. Once we figure out the code-signing issue, I'll share a dev build with Stacie to have her try it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@phargogh I just had a couple more comments. The test suite also failed - not sure if it's related
Set names are now clearer, and type-specific validation will no longer take place if the value is falsey. RE:natcap#1509
Thanks @emlys ! Yes, the test failures were definitely related and I believe I have addressed everything. Let me know what you think! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks @phargogh!
This proposal implements value-based conditional requirement expressions in validation.
The use case for this is that sometimes (like in UNA, see #1509 ), validation is conditional not only on whether a parameter is provided and has a value (is "sufficient"), but sometimes validation depends on the value of another key. In the case of UNA, several different inputs are required depending on the value of
search_radius_mode
. Ifsearch_radius_mode
represents uniform search radii, thenargs['search_radius']
is required. Ifsearch_radius_mode
represents search radii by population group, thenargs['population_gorup_radii_table']
is required.An alternate approach to this could have been to just write out the required logic in UNA's
validate
function, but this feature also seemed useful and perhaps worth expanding. I would be happy to replace this implementation with such a model-specific implementation if that is preferred.Implementation-wise, I thought it might be convenient to be able to use a dot-notation in the MODEL_SPEC expression to represent the argument's value. So for the above example, one might use
search_radius_mode.value == "{RADIUS_OPT_UNIFORM}"
. Behind the scenes, validation (as proposed) will tokenize the expression and rewritesearch_radius_mode.value
into a new local variable name. This allows backwards compatibility with existing expressions throughout InVEST, while supporting the proposed dot-notation.Critical feedback is very welcome!
Fixes #1509
Fixes #1165
Checklist
- [ ] Updated the user's guide (if needed)