Added Support for Median #907
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅
Additional details and impacted files:
@@ Coverage Diff @@
## main #907 +/- ##
==========================================
+ Coverage 81.89% 81.90% +0.01%
==========================================
Files 182 182
Lines 47778 47799 +21
Branches 8597 8599 +2
==========================================
+ Hits 39126 39150 +24
+ Misses 6487 6485 -2
+ Partials 2165 2164 -1
Hi @ricardoV94, is there something left here?
Some comments/questions
pytensor/tensor/math.py
Outdated
elif isinstance(axis, int | np.integer):
    axis = [axis]
elif isinstance(axis, np.ndarray) and axis.ndim == 0:
    axis = [int(axis)]
else:
    axis = [int(a) for a in axis]
Can we use normalize_axis_tuple for these 3 cases like we do elsewhere?
Sure. It won't work in the axis=None case, right? Stuff like var and mean use this pattern too, so should we open an issue to replace those?
Yeah, we should use it everywhere, and no, I don't think it handles axis=None, but we can double-check.
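For reference, a quick check of normalize_axis_tuple's behavior (a sketch using NumPy's helper; the import path moved to a public location in NumPy 2.0, so both locations are tried here):

```python
try:  # public location since NumPy 2.0
    from numpy.lib.array_utils import normalize_axis_tuple
except ImportError:  # private location in older NumPy versions
    from numpy.core.numeric import normalize_axis_tuple

# A plain int, a negative int, and a tuple all normalize the same way:
print(normalize_axis_tuple(1, 3))        # (1,)
print(normalize_axis_tuple(-1, 3))       # (2,)
print(normalize_axis_tuple((0, -1), 3))  # (0, 2)

# axis=None is NOT handled -- it raises, so it still needs its own branch:
try:
    normalize_axis_tuple(None, 3)
except TypeError:
    print("axis=None needs special-casing")
```

So the three isinstance branches collapse into one call, with axis=None handled separately beforehand.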
pytensor/tensor/math.py
Outdated
indices1 = expand_dims(full_like(sorted_input.take(0, axis=axis), k - 1), axis)
indices2 = expand_dims(full_like(sorted_input.take(0, axis=axis), k), axis)
What is happening here? Can't quite follow why we need all this
This was a tricky part. The problem started when indices was used in take_along_axis. We need to ensure that the shape of the indices tensor (indices1) matches the shape required for broadcasting during the selection of elements with take_along_axis (i.e. the shape of sorted_input). The full_like call essentially builds an array filled with k or k - 1 (the central element indices) with the same shape as the tensor with that axis dropped (computed via sorted_input.take(0, axis=axis)).
I think this will be clearer with a concrete NumPy example (using axis=1 throughout):

import numpy as np

sorted_input = np.array([[1, 3, 5],
                         [2, 4, 6],
                         [7, 8, 9]])

k = sorted_input.shape[1] // 2  # middle index for axis 1; here k = 3 // 2 = 1

first_elements = sorted_input.take(0, axis=1)
# first_elements = [1, 2, 7]

full_tensor = np.full_like(first_elements, k - 1)  # k - 1 = 0
# full_tensor = [0, 0, 0]
indices1 = np.expand_dims(full_tensor, axis=1)
# indices1 = [[0],
#             [0],
#             [0]]
ans1 = np.take_along_axis(sorted_input, indices1, axis=1)
# ans1 = [[1],
#         [2],
#         [7]]

full_tensor = np.full_like(first_elements, k)  # k = 1
# full_tensor = [1, 1, 1]
indices2 = np.expand_dims(full_tensor, axis=1)
# indices2 = [[1],
#             [1],
#             [1]]
ans2 = np.take_along_axis(sorted_input, indices2, axis=1)
# ans2 = [[3],
#         [4],
#         [8]]

And hence the median is calculated from ans1 and ans2.
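To spell out that last step, here is a minimal NumPy sketch of the even/odd median logic (hedged: the PR operates on pytensor tensors symbolically, this just mirrors the idea with a hypothetical helper name):

```python
import numpy as np

def median_along_axis(x, axis):
    """Median via sort + central-element selection (illustrative only)."""
    x_sorted = np.sort(x, axis=axis)
    n = x.shape[axis]
    k = n // 2
    if n % 2 == 1:  # odd length: the single middle element
        return np.take(x_sorted, k, axis=axis)
    # even length: average the two central elements (indices k - 1 and k)
    lo = np.take(x_sorted, k - 1, axis=axis)
    hi = np.take(x_sorted, k, axis=axis)
    return (lo + hi) / 2

a = np.array([[1, 3, 5], [2, 4, 6], [7, 8, 9]])
print(median_along_axis(a, axis=1))  # [3 4 8], matching np.median(a, axis=1)
```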
Okay, so take_along_axis is not what we want, since we don't need advanced (array) indexing. indices is a scalar and always the same. We can do sorted_inputs[:, :, ..., :, k] to get the value we want, where the : are empty slices before the axis we want to index. I thought take_along_axis would do this, but apparently it's only for advanced (list-of-numbers) indexing.
For what basic vs advanced indexing means check out the numpy docs: https://numpy.org/doc/stable/user/basics.indexing.html
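A quick NumPy illustration of the distinction (a sketch; see the linked docs for the full rules):

```python
import numpy as np

x = np.arange(12).reshape(3, 4)

# Basic indexing: scalars and slices return a *view*, no copy is made
col = x[:, 1]                       # the same scalar index for every row
print(np.shares_memory(col, x))     # True: a view into x

# Advanced indexing: integer-array indices return a *copy*
picked = x[np.array([0, 2]), np.array([1, 3])]  # elements (0,1) and (2,3)
print(np.shares_memory(picked, x))  # False: a new array

# With a constant index k, take_along_axis is overkill -- basic indexing
# gives the same result:
k = 1
via_take = np.take_along_axis(x, np.full((3, 1), k), axis=1).squeeze(1)
assert np.array_equal(x[..., k], via_take)
```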
pytensor/tensor/math.py
Outdated
indices1 = expand_dims(full_like(sorted_input.take(0, axis=axis), k - 1), axis)
indices2 = expand_dims(full_like(sorted_input.take(0, axis=axis), k), axis)
ans1 = take_along_axis(sorted_input, indices1, axis=axis)
We should check whether take_along_axis does basic indexing in this case; it shouldn't require advanced indexing. We just need sorted_input[..., k] and sorted_input[..., k + 1], right?
Can you elaborate on basic indexing and advanced indexing?
More comments
Hi @ricardoV94, is there something left here?
Co-authored-by: Ricardo Vieira <[email protected]>
k_values = x_sorted[..., k]
km1_values = x_sorted[..., k - 1]
I simplified the indexing; we can use basic indexing instead of take_along_axis.
Looks great! I did not know we could use simple indexing so conveniently.
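The simplified version boils down to this pattern: after sorting and putting the reduced axis last, both central elements come from plain `[..., k]` basic indexing, no take_along_axis needed (a NumPy sketch of the idea, shown here for an even-length axis):

```python
import numpy as np

x = np.array([[1, 3, 5, 7], [2, 4, 6, 8]])
x_sorted = np.sort(x, axis=-1)
k = x.shape[-1] // 2              # 2

k_values = x_sorted[..., k]       # [5 6]
km1_values = x_sorted[..., k - 1] # [3 4]

# even-length axis: the median averages the two central values
print((k_values + km1_values) / 2)  # [4. 5.]
assert np.array_equal((k_values + km1_values) / 2, np.median(x, axis=-1))
```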
# Put axis at the end and unravel them
x_raveled = x.transpose(*non_axis, *axis)
if len(axis) > 1:
    x_raveled = x_raveled.reshape((*non_axis_shape, -1))
Added a small optimization to avoid reshaping when not needed
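The transpose-then-ravel step (and the skip-the-reshape optimization for a single reduced axis) can be sketched in NumPy like so; the helper name is hypothetical:

```python
import numpy as np

def move_and_ravel(x, axis):
    """Move the reduced axes to the end, raveling them only when needed."""
    axis = tuple(a % x.ndim for a in axis)  # normalize negative axes
    non_axis = tuple(i for i in range(x.ndim) if i not in axis)
    non_axis_shape = tuple(x.shape[i] for i in non_axis)
    x_raveled = x.transpose(*non_axis, *axis)
    if len(axis) > 1:  # a single trailing axis needs no reshape
        x_raveled = x_raveled.reshape((*non_axis_shape, -1))
    return x_raveled

x = np.zeros((2, 3, 4, 5))
print(move_and_ravel(x, (1, 3)).shape)  # (2, 4, 15): axes 1 and 3 raveled
print(move_and_ravel(x, (2,)).shape)    # (2, 3, 5, 4): transpose only
```

Skipping the reshape when only one axis is reduced avoids an unnecessary copy/graph node, which is the small optimization mentioned above.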
Description
Added support and a test for median without creating a separate op for median, as discussed in #53. Hence, our implementation of median is standalone and not dependent on numpy.

Related Issue
Checklist
Type of change