
Mobt 812 vera threshold interpolation #2079

Open: wants to merge 17 commits into master from mobt_812_vera_threshold_interpolation

Conversation

@lambert-p (Contributor) commented Jan 21, 2025

Addresses https://github.com/metoppv/mo-blue-team/issues/812

Test data:
metoppv/improver_test_data#73

Description:
Adds a plugin and CLI for the threshold interpolation step, along with the necessary tests.

Testing:

  • Ran tests and they passed OK
  • Added new tests for the new feature(s)


codecov bot commented Jan 22, 2025

Codecov Report

Attention: Patch coverage is 98.30508% with 1 line in your changes missing coverage. Please review.

Project coverage is 98.42%. Comparing base (84a8944) to head (c74eb6a).
Report is 74 commits behind head on master.

Files with missing lines                      | Patch % | Lines
improver/utilities/threshold_interpolation.py | 98.30%  | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2079      +/-   ##
==========================================
+ Coverage   98.39%   98.42%   +0.02%     
==========================================
  Files         124      136      +12     
  Lines       12212    13439    +1227     
==========================================
+ Hits        12016    13227    +1211     
- Misses        196      212      +16     


@Kat-90 (Contributor) commented Feb 3, 2025

A couple of minor thoughts I've had over the last week since this was put into review. These may be useful to the reviewer:

  • This branch needs rebasing so that the tests pass.
  • Should we move the realization collapse to the beginning of the process function? That way we only interpolate once, rather than over each realization that we are going to collapse anyway.
  • Should we make the realization collapse an argument? For our case we always want this coordinate collapsed, but since we've made this generic I wonder whether there is a case where it shouldn't be collapsed. This is an option in the threshold step.

@bayliffe (Contributor) left a comment

Thanks for writing this. I've given this a first pass and there is some work to do. If you have any questions, please ask me.

Comment on lines 182 to 119
if thresholds is None:
thresholds = forecast_at_thresholds.coord(threshold_coord).points
warnings.warn(
f"No thresholds provided, using existing thresholds. Thresholds being used are: {list(thresholds)}"
)

What is the point of this? Why not just return the cube unchanged as it already contains all of the required thresholds?

More broadly, why is thresholds an optional argument rather than a mandatory one? The only outcome without providing this argument is that you get your cube back unchanged, so why would anyone call it in that way?

return result


def Threshold_interpolation(

This name is written in an odd style: it is not quite the CamelCase (first letter of each word capitalised) that we use for plugins, and not quite a function-style name with all lowercase letters.

Unless there is a good reason, this code ought to be written as a plugin:

class ThresholdInterpolation(BasePlugin):
    def __init__(thresholds):
        etc.
    def _interpolate_thresholds():
        etc.
    def _create_cube_with_thresholds():
        etc.
    def process(cube):
        do stuff.

This keeps all the related code together within a class. The underscores preceding the method names indicate that they are private methods, i.e. they are specific to this class and should not be called for other purposes. The use of a class allows the sharing of variables using self. More importantly, we can create an instance of the class, e.g.

plugin = ThresholdInterpolation(thresholds=[10, 50, 90])
result = plugin.process(cube)

The value of this (though not really in this case) is that if there is some complexity in configuring the plugin, e.g. converting the threshold values, checking them, etc., all of that can be done just once when we create the plugin instance. We can then call it over and over again without that overhead.
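
To make the configure-once / call-many pattern concrete, here is a minimal, self-contained sketch. It uses bare numpy arrays rather than iris cubes and np.interp for the linear interpolation, so it is purely illustrative and not the implementation this PR should use; the class and argument names are placeholders.

import numpy as np


class ThresholdInterpolationSketch:
    """Illustrative stand-in for a BasePlugin-style class (not the real plugin)."""

    def __init__(self, thresholds):
        # Any validation / type conversion happens once, at construction time.
        self.thresholds = np.array(thresholds, dtype=np.float32)

    def process(self, source_thresholds, probabilities):
        # probabilities has shape (n_source_thresholds, n_points); interpolate
        # each point's values onto the target thresholds, column by column.
        result = np.empty(
            (len(self.thresholds), probabilities.shape[1]), dtype=np.float32
        )
        for index in range(probabilities.shape[1]):
            result[:, index] = np.interp(
                self.thresholds, source_thresholds, probabilities[:, index]
            )
        return result


# Configure once, then call repeatedly on different inputs.
plugin = ThresholdInterpolationSketch(thresholds=[200.0, 500.0, 800.0])
probs = np.array([[1.0, 0.9], [0.4, 0.3]], dtype=np.float32)  # (thresholds, points)
print(plugin.process(np.array([200.0, 800.0]), probs))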

forecast_at_thresholds,
method="mean",
)
forecast_at_thresholds = collapsed_forecast_at_thresholds

@Kat-90 asked about whether realization collapse should occur before the interpolation. This is a good question, and the answer is yes: the operations commute, so we can save some effort by collapsing first. An example with some numbers:

Process each realization then collapse

Realization | Thresh 1 (200) | Thresh k (500) | Thresh 2 (800)
R0          | 0.6            | 0.4            | 0.2
R1          | 1.0            | 0.8            | 0.6
R2          | 0.8            | 0.6            | 0.4
Mean        | 0.8            | 0.6            | 0.4

Collapse then process the result

Realization | Thresh 1 (200) | Thresh 2 (800)
R0          | 0.6            | 0.2
R1          | 1.0            | 0.6
R2          | 0.8            | 0.4
Mean        | 0.8            | 0.4

thresh_k = 500: value = [(500 - 200) / (800 - 200)] * (0.4 - 0.8) + 0.8 = 0.6

This is a very simple example, but the result is general for our simple linear interpolation (we're basically just averaging numbers).
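
A quick numerical check of this, using plain numpy and the numbers from the tables above (illustrative only, not part of the plugin code):

import numpy as np

source_thresholds = np.array([200.0, 800.0])
target_threshold = 500.0
# Rows are realizations R0..R2, columns are thresholds 200 and 800.
probs = np.array([[0.6, 0.2], [1.0, 0.6], [0.8, 0.4]])

# Interpolate each realization to 500, then take the realization mean.
interp_then_mean = np.mean(
    [np.interp(target_threshold, source_thresholds, row) for row in probs]
)

# Take the realization mean first, then interpolate to 500.
mean_then_interp = np.interp(target_threshold, source_thresholds, probs.mean(axis=0))

print(interp_then_mean, mean_then_interp)  # both give 0.6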

def process(
forecast_at_thresholds: cli.inputcube,
*,
thresholds: cli.comma_separated_list = None,

See comments in plugin file about this not being optional.

thresholds = ["cat", "dog", "elephant"]
error_msg = "could not convert string to float"
with pytest.raises(ValueError, match=error_msg):
Threshold_interpolation(input_cube, thresholds)

This is not required unless the plugin is doing something to handle this exception itself. Here we are essentially testing a numpy exception and that's (hopefully) covered by numpy's unit tests.

coord.name() for coord in realization_cube.coords(dim_coords=True)
]
expected_dim_coords.remove("realization")
assert dim_coords, expected_dim_coords

Again, this is testing the collapse_realization function, which already has unit tests covering this, so this is not required.

realization_cube.remove_coord("visibility_in_air")
error_msg = "No threshold coord found"
with pytest.raises(CoordinateNotFoundError, match=error_msg):
Threshold_interpolation(realization_cube, thresholds)

This is testing the find_threshold_coordinate function and is not required.

warning_msg = "No thresholds provided, using existing thresholds."

with pytest.warns(UserWarning, match=warning_msg):
Threshold_interpolation(input_cube)

Hopefully this goes away if you get rid of the option to not provide thresholds.

Testing that a Cube is returned when inputting a masked cube.
"""
thresholds = [100, 150, 200, 250, 300]
result = Threshold_interpolation(masked_cube_same, thresholds)

You should test here that the mask in is the same as the mask out.
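
For example, something along these lines would do it. This is only a sketch: it follows the call style of the quoted test and assumes the masked_cube_same fixture applies the same spatial mask to every input threshold slice (as its name suggests) and that the threshold dimension is the leading one.

import numpy as np


def test_masked_cube_mask_preserved(masked_cube_same):
    """Sketch: check that the input mask is carried through to the output."""
    thresholds = [100, 150, 200, 250, 300]
    result = Threshold_interpolation(masked_cube_same, thresholds)
    # Take the mask of one input threshold slice as the expected mask;
    # every output threshold slice should carry the same mask.
    expected_mask = np.ma.getmaskarray(masked_cube_same.data)[0]
    for output_slice_mask in np.ma.getmaskarray(result.data):
        np.testing.assert_array_equal(output_slice_mask, expected_mask)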

for point in self.thresholds:
cube = template_cube.copy()
coord = iris.coords.DimCoord(
np.array([point], dtype="float32"), units=cube.units

Suggested change
- np.array([point], dtype="float32"), units=cube.units
+ np.array([point], dtype="float32"), units=threshold_units

@lambert-p force-pushed the mobt_812_vera_threshold_interpolation branch from 70c4922 to e08049d on February 19, 2025 at 13:52
@bayliffe (Contributor) left a comment

Thanks for all the work, Phoebe. I have a load more comments, but we are getting there.

"""Test that the plugin returns an Iris.cube.Cube with suitable units."""
thresholds = [100, 150, 200, 250, 300]
result = ThresholdInterpolation(thresholds)(input_cube)
assert result, Cube

This line still doesn't do anything. It's still the same as assert 1, 2

To check the type of something we use isinstance. So your test should be:

assert isinstance(result, Cube)


Still not fixed.

Comment on lines 114 to 120
def test_masked_cube(masked_cube):
"""
Testing that a Cube is returned when inputting a masked cube.
"""
thresholds = [100, 150, 200, 250, 300]
result = ThresholdInterpolation(thresholds)(masked_cube)
assert isinstance(result, Cube)

Pytest allows us to parameterize our tests, so we don't have to write the same test twice; we can just vary the inputs. Unfortunately, parameterising over fixtures is about the least friendly possible introduction to this, but oh well.

If you replace your test_basic with what follows and remove the test_masked_cube test, you will have exactly the same test coverage. The test below loops over the fixtures given in the brackets of the parametrize statement, namely ["input_cube", "masked_cube"]. Annoyingly, you reference these as string names, which I hate.

@pytest.mark.parametrize("input", ["input_cube", "masked_cube"])
def test_cube_returned(request, input):
    """Test that the plugin returns an Iris.cube.Cube with suitable units."""
    cube = request.getfixturevalue(input)
    thresholds = [100, 150, 200, 250, 300]
    result = ThresholdInterpolation(thresholds)(cube)
    assert isinstance(result, Cube)
    assert result.units == cube.units

To describe:
@pytest.mark.parametrize("input", ["input_cube", "masked_cube"])
This tells pytest we are looping over parameters. The variable name is "input", i.e. that is what gets set to each value in turn, like i in for i in [1, 2, 3]. Here the values are the names of the fixtures that we want to pass in, which return our different cubes: the normal one and the masked one.

def test_cube_returned(request, input):
We pass in the "input" variable. As we are looping over fixtures referenced by their string names, we also have to pass in the request magic argument that is part of pytest. This can be used within the test to call the fixture and get the thing it returns, which in this case is one of our input cubes.

cube = request.getfixturevalue(input)
This is us doing exactly that, calling the fixture to get the cube it returns, either a normal cube or one containing masked data.

The rest of the test is as before but using the cube variable.

@bayliffe (Contributor) left a comment

Thanks Phoebe. Very little left to do.

Note that your acceptance tests are broken. The precipitation diagnostics you've used for the masked data tests are thresholded as greater_than the threshold values. Now that we've removed the hard-coding of the spp__relative_to_threshold coordinate attribute, these are being created with the correct attribute when the tests run, but they do not agree with the KGOs, which have the wrong attribute. The KGOs need to be recreated.

@@ -118,11 +113,9 @@ def _interpolate_thresholds(
forecast_at_interpolated_thresholds
)

# Reshape forecast_at_percentiles, so the percentiles dimension is
# first, and any other dimension coordinates follow.

You've deleted this comment rather than rewriting it to explain what is actually happening here. Can you write it again, but corrected to describe what this function is doing to your data?

Cube expected to contain a threshold coordinate.

Returns:
Cube:
Cube with forecast values at the desired set of thresholds.
The threshold coordinate is always the zeroth dimension.
"""
self.threshold_coord = find_threshold_coordinate(forecast_at_thresholds).name()
self.threshold_coord = find_threshold_coordinate(forecast_at_thresholds)

Line 196/197 below:
Code coverage is suggesting you don't have test coverage of this. You'll need a unit test with a cube that has a realization dimension.
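
A sketch of such a test, reusing the realization_cube fixture and the threshold values from the tests quoted elsewhere in this review (the test name and exact assertion are only suggestions):

def test_realization_collapse(realization_cube):
    """Sketch: exercise the branch that collapses the realization coordinate."""
    thresholds = [100, 150, 200, 250, 300]
    result = ThresholdInterpolation(thresholds)(realization_cube)
    # The realization coordinate should have been collapsed away.
    dim_coords = [coord.name() for coord in result.coords(dim_coords=True)]
    assert "realization" not in dim_coords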

"""Test that the plugin returns an Iris.cube.Cube with suitable units."""
thresholds = [100, 150, 200, 250, 300]
result = ThresholdInterpolation(thresholds)(input_cube)
assert result, Cube

Still not fixed.

@lambert-p (Contributor, Author) commented

Hi Ben, thank you. I have recreated the KGOs and the acceptance tests are running now.
