Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Box blur fast filter that could approximate gaussian filter #223

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

light-le
Copy link

@light-le light-le commented Jan 16, 2025

solve #168. The algorithm was derived from this blog post

Copy link
Member

@johnnv1 johnnv1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add benchmarks to it as well?

@light-le
Copy link
Author

You mean in crates/kornia-imgproc/benches/bench_filters.rs ? Sure ok

crates/kornia-imgproc/src/filter/kernels.rs Show resolved Hide resolved
crates/kornia-imgproc/src/filter/kernels.rs Show resolved Hide resolved
@@ -92,4 +122,11 @@ mod tests {
assert_eq!(k, expected[i]);
}
}

#[test]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add also some test that it’s not only ones ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I'll add a few more tests

/// * `src` - The source image with shape (H, W, C).
/// * `dst` - The destination image with shape (H, W, C).
/// * `kernel_size` - The size of the kernel (kernel_x, kernel_y).
/// * `sigma` - The sigma of the gaussian kernel.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specify that it should be x-y ordered

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right

mod tests {
use super::*;

#[test]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would some simple numbers test too similar to the other functions to verify that’s doing the right thing

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I added 2 tests here: test_box_blur_fast() and test_gaussian_blur(). Both has the same input (0..25) to show that the outputs are not that much different.

I did attempt to use the same input (all 0.0, 9.0 in the middle) as test_fast_horizontal_filter(), if that's how you mean by this comment. The result was a little disappointing as there's a big difference between the outputs of the 2 methods. I figured it's because the test input was odd. It might be fitting for test_fast_horizontal_filter() but not for these. Therefore I went with something more randomized.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should be different? The test you describe should give you a box of ones, right ?

@@ -88,6 +88,72 @@ pub fn separable_filter<const C: usize>(
Ok(())
}


/// Apply a fast filter horizontally, take advantage of property where all
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Split header docs. Usually there’s a single line explaining in short the purpose of the function followed by end of line then you can add some clarification, formulation of anything needed

/// * `src` - The source image with shape (H, W, C).
/// * `dst_transposed` - The destination image with shape (W, H, C).
/// * `half_kernel_x_size` - Half of the kernel at weight 1. The total size would be 2*this+1
pub fn fast_horizontal_filter<const C: usize>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub fn fast_horizontal_filter<const C: usize>(
fn fast_horizontal_filter<const C: usize>(

I wouldn’t make public to users, this is more a utility function

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I change it to pub(crate) because just removing pub would not expose it to ops.rs/box_blur_fast()

/// * `half_kernel_x_size` - Half of the kernel at weight 1. The total size would be 2*this+1
pub fn fast_horizontal_filter<const C: usize>(
src: &Image<f32, C>,
dst_transposed: &mut Image<f32, C>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency just dst ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I was signaling that the result would be transposed. But it could be placed in the docstring.

row_acc[ch] += src_data[kernel_pix_offset];
}
leftmost_pixel[ch] = *source_pixel;
rightmost_pixel[ch] = src_data[pix_offset+((src.cols()-1)*C)];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(src.cols()-1)*C) could be computed outside before the loops


if c == 0 {
row_acc[ch] = *source_pixel * (half_kernel_x_size+1) as f32;
let mut kernel_pix_offset = pix_offset;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering wether this offset could be precomputed beforehand as you are computing several times in the top level function

@edgarriba
Copy link
Member

@johnnv1 any idea why python tests are failing (I believe it’s unrelated to this PR). Shouldn’t we be using the new just commands in https://github.com/kornia/kornia-rs/blob/main/.github/workflows/python_test.yml#L40

@johnnv1
Copy link
Member

johnnv1 commented Jan 19, 2025

@johnnv1 any idea why python tests are failing (I believe it’s unrelated to this PR). Shouldn’t we be using the new just commands in https://github.com/kornia/kornia-rs/blob/main/.github/workflows/python_test.yml#L40

yeah, seems unrelated, but should be working

@johnnv1 johnnv1 closed this Jan 21, 2025
@johnnv1 johnnv1 reopened this Jan 21, 2025
Copy link
Member

@edgarriba edgarriba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand this benchmark and report the numbers so that we know wether this method is really making what’s expected ?

https://github.com/kornia/kornia-rs/blob/main/crates/kornia-imgproc/benches/bench_filters.rs

I highly suggest once you have the benchmark setup that you play around with it and try to do micro optimisations like reusing as much as possible pre-computed variables as I suggested in the review to see how affects in the benchmarks.


let mut input_img = src.clone();
for (half_kernel_x_size, half_kernel_y_size) in half_kernel_x_sizes.iter().zip(half_kernel_y_sizes.iter()) {
let mut transposed = Image::<f32, C>::from_size_val(transposed_size, 0.0)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can allocate once outside the loop

mod tests {
use super::*;

#[test]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should be different? The test you describe should give you a box of ones, right ?

@edgarriba edgarriba linked an issue Jan 26, 2025 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement fast-box-blur
3 participants