-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validate spatial files referenced in a CSV column #1504
Conversation
@@ -3449,36 +3447,6 @@ def logger_callback(proportion_complete): | |||
return logger_callback | |||
|
|||
|
|||
def _validate_habitat_table_paths(habitat_table_path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now redundant to what's done in validation by default.
@@ -1044,23 +1044,13 @@ def _validate_threat_path(threat_path, lulc_key): | |||
""" | |||
# Checking threat path exists to control custom error messages | |||
# for user readability. | |||
try: | |||
threat_gis_type = pygeoprocessing.get_gis_type(threat_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No longer need to check that the file can be opened by GDAL. Validation already does that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this model there are three columns expecting raster paths. Only the "cur_path" column is required. The "base_path", and "fut_path" columns are conditionally required. If the user has an empty "base_path" column with the conditional unmet (the column is not required but is present in the table), I think the model says that it's okay for that column to be empty.
Does that still work here with validation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I see that this check is still left in for the model to look for.
So I guess my question is, does validate check conditional spatial files referenced in a CSV?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I see that in validation for this an empty entry is returned with no error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, empty entries are passed through and don't raise any validation warning. Note that this applies even when the column is required - NA values don't fail validation because in some cases we do need to allow NA values.
@@ -1548,21 +1548,6 @@ def test_clip_project_vector_on_invalid_geometry(self): | |||
n_actual_features = target_layer.GetFeatureCount() | |||
self.assertTrue(n_actual_features == n_features - 1) | |||
|
|||
def test_invalid_habitat_table_paths(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now covered by regular validation and validation tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @emlys for this. I had some small comments / suggestions / questions. Looking at the PR checklist reminder, I was also curious if you tested this in the UI at all?
@@ -1044,23 +1044,13 @@ def _validate_threat_path(threat_path, lulc_key): | |||
""" | |||
# Checking threat path exists to control custom error messages | |||
# for user readability. | |||
try: | |||
threat_gis_type = pygeoprocessing.get_gis_type(threat_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this model there are three columns expecting raster paths. Only the "cur_path" column is required. The "base_path", and "fut_path" columns are conditionally required. If the user has an empty "base_path" column with the conditional unmet (the column is not required but is present in the table), I think the model says that it's okay for that column to be empty.
Does that still work here with validation?
@@ -1044,23 +1044,13 @@ def _validate_threat_path(threat_path, lulc_key): | |||
""" | |||
# Checking threat path exists to control custom error messages | |||
# for user readability. | |||
try: | |||
threat_gis_type = pygeoprocessing.get_gis_type(threat_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I see that this check is still left in for the model to look for.
So I guess my question is, does validate check conditional spatial files referenced in a CSV?
@@ -80,7 +80,7 @@ | |||
"values besides 0 or 1 will be treated as 0.") | |||
}}, | |||
"fields": {}, | |||
"geometries": spec_utils.POLYGONS, | |||
"geometries": spec_utils.ALL_GEOMS, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is true that multiple geometry types are allowed for this, should the "about" section below be updated as well? It currently states:
For vectors, a polygon indicates...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! I had to update this line because one of our tests used a non-polygon geometry, which would now fail validation.
@@ -1044,23 +1044,13 @@ def _validate_threat_path(threat_path, lulc_key): | |||
""" | |||
# Checking threat path exists to control custom error messages | |||
# for user readability. | |||
try: | |||
threat_gis_type = pygeoprocessing.get_gis_type(threat_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I see that in validation for this an empty entry is returned with no error.
tests/test_validation.py
Outdated
@@ -1352,6 +1342,43 @@ def test_rows(self): | |||
self.assertEqual(df['row2'][1], 3) | |||
self.assertEqual(df['row2'].dtype, float) | |||
|
|||
def test_recursive_path_validation(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there also be a test case for vector paths?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! I added a couple more tests
Co-authored-by: Doug <[email protected]>
Co-authored-by: Doug <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Emily! I think this is looking good!
Hey @emlys, there appears to be a Windows Workbench test that fails consistently. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just removing approval, so we can check out that test failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just tried for the first time, "dismiss review".
@dcdenu4 , I'm not sure what was happening with that test, but it's passing now, so it must've just been flaky. I'll go ahead and merge if you approve! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Description
Fixes #327
Currently limited to validating rasters and vectors as the issue specified, but it would be simple to expand to all types of columns.
Checklist