-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhanced testing for Seasonal Water Yield (SWY) to test intermediate rasters #1744
Conversation
…t_local_recharge_undefined_nodata
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @claire-simpson , I had a few minor suggestions. We can talk them over of course.
os.path.join(self.workspace_dir, 'target_aet_path.tif'), | ||
os.path.join(self.workspace_dir, 'target_precip_path.tif')) | ||
except Exception as e: | ||
assert False, f"calculate_local_recharge raised an exception: {e}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're only concerned about calculate_local_recharge
not raising an exception, then if we did get an exception we will want to make sure we see the full traceback. The simplest way to accomplish that might be to remove the try/except
altogether.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep I was struggling with how to assert that there were no errors and landed on this unsatisfying solution.. but I think your note below makes the most sense to improve this test and remove redundancy!
except Exception as e: | ||
assert False, f"calculate_local_recharge raised an exception: {e}" | ||
|
||
def test_calculate_local_recharge(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the setup in this test is almost identical to test_local_recharge_undefined_nodata
do you think it would make sense to combine them? Basically taking the assertions you added in this test and applying them to the pre-existing test that had no assertions?
assert numpy.allclose(b, expected_b, equal_nan=True), \ | ||
f"Baseflow raster values do not match. Expected: {expected_b}, Got: {b}" | ||
assert numpy.allclose(b_sum, expected_b_sum, equal_nan=True), \ | ||
f"b_sum raster values do not match. Expected: {expected_b_sum}, Got: {b_sum}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of the builtin assert
, unittest
and numpy.testing
both provide lots of really convenient assertion methods. These are nice because their names tend to be very descriptive in what they're asserting, and they handle the string formatting when the assertion fails. For example, in this case we could use numpy.testing.assert_allclose
.
And in other, simpler cases, we tend to use one of these: https://docs.python.org/3/library/unittest.html#assert-methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! This is super helpful. I have been seeing different types of assertions and wasn't sure which was best
def test_calculate_curve_number_raster(self): | ||
"""test `_calculate_curve_number_raster`""" | ||
from natcap.invest.seasonal_water_yield import seasonal_water_yield | ||
import pandas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we move the pandas
import to the top of the module? I think it's only important to import the modules we are testing inside the scope of the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes definitely, I noticed another test importing pandas in this way so I followed suit, but I can change both
soil_array = numpy.zeros((3, 3), dtype=numpy.int32) | ||
for i, row in enumerate(soil_array): | ||
row[:] = i % soil_groups + 1 | ||
make_raster_from_array(soil_array, soil_group_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed this module has make_soil_raster
and make_lulc_raster
and make_biophysical_csv
already defined. Does it make sense to use those functions to create the input data we need here? It may not, but I'm curious what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I noticed these as well! I chose not to use them as the make_soil_raster
and make_lulc_raster
both create 100x100 pixel rasters, which seemed a bit unnecessarily large to me as then I'd also be hard-coding a 100x100 pixel raster target.
The issue I had with make_biophysical_csv
is that the column names are uppercase vs. the _calculate_curve_number_raster
function expected lowercase. I could alternatively change the make_biophysical_csv
function output to be lowercase, however the function as a whole does accept uppercase column names (as shown in the sample data) so I elected not to change the make_biophysical_csv
function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, that all makes sense to me, thanks!
seasonal_water_yield._calculate_curve_number_raster( | ||
lulc_raster_path, soil_group_path, biophysical_df, cn_path) | ||
|
||
cn = pygeoprocessing.raster_to_numpy_array(cn_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sometimes it can be nice to call this variable actual_cn
so it's easy to see that it is the data we are testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, thanks!
soil_array = numpy.zeros((3, 3), dtype=numpy.int32) | ||
for i, row in enumerate(soil_array): | ||
row[:] = i % soil_groups + 1 | ||
make_raster_from_array(soil_array, soil_group_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, that all makes sense to me, thanks!
Description
Improved testing for SWY:
route_baseflow_sum
,calculate_local_recharge
, and_calculate_curve_number_raster
)test_local_recharge_undefined_nodata
Fixes #1549
Checklist