Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DAPHNE-#399] Aggregation kernels can return different value types #402

Merged

Conversation

aristotelis96
Copy link
Collaborator

AggAll.h, AggCol.h and AggRow.h used to return the same value type as the argument. In case of some aggregations (i.e. mean, stddev) that did not make much sense for integer arguments where the result will probably be floating point value. With this PR:

  • Extended these template kernels to support different return types
  • Added some test cases.

Note that these kernels still support aggregation of type DenseMatrix<int> -> int, but we end up with losing precision (in case of mean). The floating-point result is not rounded to the closest integer, but it is converted. We can support rounding if we want to. But imo if the user needs that, then it makes more sense to me, to use a rounding function on the floating type result.

Closes #399.

@pdamme pdamme self-requested a review July 27, 2022 16:24
Copy link
Collaborator

@pdamme pdamme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @aristotelis96, thanks for this contribution. Decoupling the result and argument value types of aggregation is very important, not only for integer to floating-point (MEAN, ...), but also for integer to wider integer (SUM, ...).

In general, your changes look good to me. However, may I kindly ask you to improve two aspects:

  1. I think func should have template parameters <VTRes, VTRes, VTArg> (instead of <VTRes, VTArg, VTArg>) everywhere, since agg is of type VTRes. In the current version, agg would be casted back to VTArg (e.g., from double to int64_t for MEAN, or from int64_t to int8_t for SUM).

  2. The test cases should systematically test different combinations of VTRes and VTArg. From the aggregation functions we already support, this applies to SUM, MEAN, STDDEV. (For MIN/MAX we can assume VTRes is the same as VTArg; for IDXMIN/IDXMAX we can assume VTRes is always size_t).) I see that you tried this for MEAN with double result value type in AggColTest.cpp and AggRowTets.cpp, but weren't sure how to do it right. By TEMPLATE_PRODUCT_TEST_CAST, catch2 gives us a way to instantiate all combinations of a data type and value type (as the argument type), but not to further use combinations. I see the following options:

    • Currently, we derive the result value type from the argument value type. Instead, we could hard-code the result value type and copy the entire test case (for SUM, MEAN, and STDDEV) multiple times with different result value types.
    • To reduce code duplication, instead of copying the test cases, we could make them macros with the result value type as a macro parameter. Then we would use that macro multiple times with different result value types.
    • We could have one TEMPLATE_PRODUCT_TEST_CASE per aggregation function, but enumerate different result value types internally (somehow like you tried it already). However, this would lead to code duplication, too. But maybe it can be done better with some clever template magic.
    • Or some other solution, I'd be quite open here... :)

@aristotelis96
Copy link
Collaborator Author

Thanks @pdamme for reviewing this PR.

  1. Thanks for catching this. This is now fixed. However, in case of std::min and std::max (EwBinarySca.h) we can't support different argument value types (e.g. comparing double with int). I'm not sure if this is correct, but I changed the MAKE_EW_BINARY_SCA macro, specifing the template parameter of std::min/max to double, since this would probably work in all cases (see EwBinarySca.h changes). Let me know if I am missing something here.
  2. I agree we should test different combinations of VTRes and VTArg and, yes, I wasn't really sure how to do it right. Using macros seems like a good idea, but I have no strong opinion on this one. 😄 I've changed each test case to use macros for different result types. Let me know what you think.

@aristotelis96 aristotelis96 requested a review from pdamme July 28, 2022 17:35
Copy link
Collaborator

@pdamme pdamme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for these updates, @aristotelis96. It looks very good to me in general. I just polished it a bit and also introduced tests with various result value types for SUM.

However, there is one problem which is not directly related to this feature: The template instantiation of getEwBinaryScaFuncPtr in src/runtime/local/kernels/EwBinarySca.h causes compiler errors (in case of min/max) and warnings (in case of comparisons of signed/unsigned integer types) once we instantiate any binary function for heterogeneous argument types. You worked around this for min/max by hard-coding a double return type, but acutally, min/max should return the input type (thereby also guaranteeing that both input types are the same, for aggregations). Currently, I don't have a clear solution for this, need to think about it myself...

@pdamme
Copy link
Collaborator

pdamme commented Aug 16, 2022

An update on this: @akroviakov smartly solved a similar problem in his PR #418. In a nutshell, if constexpr can be used to (not) compile certain branches of the code depending on the (combination of) template arguments.

For instance, we could solve the problem with min/max roughly as follows:

if constexpr(std::is_same_v<VTRes, VTArg>) {
    // check if op code is min/max, if so instantiate
    // this will only be compiled if VTRes and VTArg are the same, so no compiler error if they aren't
}

Of course, then we are limited to homogeneous input/output value types for min/max, but that's perfectly okay. And it will not prevent heterogeneous types for the other aggregation functions.

@aristotelis96
Copy link
Collaborator Author

Thanks @pdamme and @akroviakov for the great suggestion! One other possible solution I was thinking was changing EwBinarySca.h kernel by replacing the std::min/max() functions with lhs < rhs ? lhs : rhs (and > respectively), when calling MAKE_EW_BINARY_SCA macro.

@aristotelis96 aristotelis96 force-pushed the 399-support-for-different-return-types branch from 842e312 to f527193 Compare April 25, 2023 09:07
@aristotelis96
Copy link
Collaborator Author

Thanks @pdamme for your feedback. I've updated EwBinarySca.h kernel using constexpr to instantiate the kernel for min/max only for homogeneous value types. Let me know if this is what you had in mind. I've also:

  • Fixed a small issue with a test case.
  • Rebased to main.

@aristotelis96 aristotelis96 requested a review from pdamme April 25, 2023 09:18
@pdamme
Copy link
Collaborator

pdamme commented Apr 25, 2023

Thanks @aristotelis96, I will have a look at your changes within the next few days.

aristotelis96 and others added 6 commits May 29, 2023 13:23
…ypes

- AggAll, AggCol and AggRow used to return the same value type as the argument. In case of some aggregations (i.e. mean, stddev) that did not make sense for integer arguments.
- Extended these template kernels to support different return types
- Added some test cases.
- Fixed template arguments when calling EwBinarySca kernel.

- Updated test cases for AggAll, AggRow and AggCol with macros in order to support combinations of different argument and result value types.
- SUM test cases also test different result types (e.g., sometimes the sum should be a wider integer type).
- Renamed macro parameter from ResultType to VTRes (shorter and more precise: *value* type).
- Added a few comments.
- Some more minor things.
- In case of different value types for min/max, throw a runtime error.
- Updated a testcase.
@aristotelis96 aristotelis96 force-pushed the 399-support-for-different-return-types branch from f527193 to 2c4a9ea Compare May 29, 2023 10:31
Copy link
Collaborator

@pdamme pdamme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for this contribution, @aristotelis96. And sorry that it took so long.

In fact, different value types for the argument and result of aggregations have been supported for some while on main at the DaphneDSL level. This became possible a while ago through a combination of type inference (e.g., the result value type of mean on a si64 matrix is inferred to be f64) and type promotion of the arguments (e.g., the arguments of an aggregation are casted to the value type of the result before the aggregation). Nevertheless, this cast happens at the granularity of an entire matrix, thereby creating an additional intermediate (which is usually small inside a vectorized pipeline).

Taking inspiration from this approach (as well as for consistency with it), I slightly adapted the code of this PR now. Back then, one of the difficulties was the potentially mismatching types of the left and right inputs of the elementwise binary function applied internally. Back then, we had to introduce a workaround for min and max. I have simplified this now by always casting the argument values to the result value type, and using an elementwise binary function with VTRes consistently for the result and both inputs. This is analogous to the procedure described above. Consistently, I have undone the work-around in EwBinarySca.h.

With that, it is ready for merging, from my point of view. Note that the kernels will currently not be used by DaphneDSL scripts, because we still have the type promotion mechanism, which casts the arguments to the result value type, in place for AllAggOp, ColAggOp, and RowAggOp. We would need to simply remove the trait CastArgsToResType from them and make sure that the kernels are instantiated for different argument/result value types. Nevertheless, I wouldn't do that right now, because the current approach on main works and I don't want to risk introducing new bugs shortly before the v0.2 release. For instance, the CUDA kernels for aggregation do not support different result/argument value types yet. So let's keep this for later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Aggregation kernels: Support for different return types
2 participants