[DAPHNE-#399] Aggregation kernels can return different value types #402
Hi @aristotelis96, thanks for this contribution. Decoupling the result and argument value types of aggregation is very important, not only for integer to floating-point (`MEAN`, ...), but also for integer to wider integer (`SUM`, ...).
In general, your changes look good to me. However, may I kindly ask you to improve two aspects:
- I think `func` should have template parameters `<VTRes, VTRes, VTArg>` (instead of `<VTRes, VTArg, VTArg>`) everywhere, since `agg` is of type `VTRes`. In the current version, `agg` would be cast back to `VTArg` (e.g., from `double` to `int64_t` for `MEAN`, or from `int64_t` to `int8_t` for `SUM`); see the sketch after this list.
- The test cases should systematically test different combinations of `VTRes` and `VTArg`. From the aggregation functions we already support, this applies to `SUM`, `MEAN`, and `STDDEV`. (For `MIN`/`MAX` we can assume `VTRes` is the same as `VTArg`; for `IDXMIN`/`IDXMAX` we can assume `VTRes` is always `size_t`.) I see that you tried this for `MEAN` with `double` result value type in `AggColTest.cpp` and `AggRowTest.cpp`, but weren't sure how to do it right. By `TEMPLATE_PRODUCT_TEST_CASE`, catch2 gives us a way to instantiate all combinations of a data type and value type (as the argument type), but not to further use combinations. I see the following options:
  - Currently, we derive the result value type from the argument value type. Instead, we could hard-code the result value type and copy the entire test case (for `SUM`, `MEAN`, and `STDDEV`) multiple times with different result value types.
  - To reduce code duplication, instead of copying the test cases, we could make them macros with the result value type as a macro parameter. Then we would use that macro multiple times with different result value types.
  - We could have one `TEMPLATE_PRODUCT_TEST_CASE` per aggregation function, but enumerate different result value types internally (somehow like you tried it already). However, this would lead to code duplication, too. But maybe it can be done better with some clever template magic.
  - Or some other solution, I'd be quite open here... :)
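To make the first point concrete, here is a minimal sketch of the issue (illustrative only; `EwBinaryScaFuncPtr` and `aggAll` are simplified stand-ins, not the actual DAPHNE kernel code): since the accumulator `agg` has type `VTRes`, the binary function's result and left input must both be `VTRes`, otherwise `agg` is narrowed on every iteration.

```cpp
#include <cstddef>
#include <cstdint>
#include <iostream>

// Hypothetical alias for an elementwise binary scalar function:
// result type, lhs type, rhs type.
template<typename VTRes, typename VTLhs, typename VTRhs>
using EwBinaryScaFuncPtr = VTRes (*)(VTLhs, VTRhs);

// Core loop of a full aggregation. With <VTRes, VTRes, VTArg>, the
// accumulator is never cast back to the (possibly narrower) VTArg.
template<typename VTRes, typename VTArg>
VTRes aggAll(EwBinaryScaFuncPtr<VTRes, VTRes, VTArg> func,
             const VTArg * values, size_t num) {
    VTRes agg = static_cast<VTRes>(values[0]);
    for(size_t i = 1; i < num; i++)
        agg = func(agg, values[i]);
    return agg;
}

int main() {
    // SUM of int8_t values into a wider int64_t accumulator: with
    // <VTRes, VTArg, VTArg>, agg would be truncated to int8_t and overflow.
    int8_t values[] = {100, 100, 100};
    int64_t sum = aggAll<int64_t, int8_t>(
        [](int64_t lhs, int8_t rhs) -> int64_t { return lhs + rhs; },
        values, 3);
    std::cout << sum << std::endl; // 300, which does not fit into int8_t
    return 0;
}
```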
Thanks @pdamme for reviewing this PR.
Thanks for these updates, @aristotelis96. It looks very good to me in general. I just polished it a bit and also introduced tests with various result value types for `SUM`.
However, there is one problem which is not directly related to this feature: the template instantiation of `getEwBinaryScaFuncPtr` in `src/runtime/local/kernels/EwBinarySca.h` causes compiler errors (in case of `min`/`max`) and warnings (in case of comparisons of signed/unsigned integer types) once we instantiate any binary function for heterogeneous argument types. You worked around this for `min`/`max` by hard-coding a `double` return type, but actually, `min`/`max` should return the input type (thereby also guaranteeing that both input types are the same, for aggregations). Currently, I don't have a clear solution for this, need to think about it myself...
An update on this: @akroviakov smartly solved a similar problem in his PR #418. In a nutshell, we could solve the problem with min/max roughly as follows:

```cpp
if constexpr(std::is_same_v<VTRes, VTArg>) {
    // Check if the op code is min/max; if so, instantiate.
    // This will only be compiled if VTRes and VTArg are the same,
    // so no compiler error if they aren't.
}
```

Of course, then we are limited to homogeneous input/output value types for min/max, but that's perfectly okay. And it will not prevent heterogeneous types for the other aggregation functions.
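For concreteness, here is a self-contained sketch of that guard (the op-code enum, function name, and scaffolding are simplified stand-ins for the actual code in `EwBinarySca.h`):

```cpp
#include <algorithm>
#include <stdexcept>
#include <type_traits>

enum class BinaryOpCode { ADD, MIN, MAX };

template<typename VTRes, typename VTLhs, typename VTRhs>
using EwBinaryScaFuncPtr = VTRes (*)(VTLhs, VTRhs);

template<typename VTRes, typename VTLhs, typename VTRhs>
EwBinaryScaFuncPtr<VTRes, VTLhs, VTRhs> getEwBinaryScaFuncPtrSketch(BinaryOpCode opCode) {
    if(opCode == BinaryOpCode::ADD)
        return [](VTLhs lhs, VTRhs rhs) -> VTRes {
            return static_cast<VTRes>(lhs) + static_cast<VTRes>(rhs);
        };
    // The min/max cases are only compiled for homogeneous value types, so
    // std::min/std::max never see mixed (e.g., signed/unsigned) arguments.
    if constexpr(std::is_same_v<VTRes, VTLhs> && std::is_same_v<VTLhs, VTRhs>) {
        if(opCode == BinaryOpCode::MIN)
            return [](VTLhs lhs, VTRhs rhs) -> VTRes { return std::min(lhs, rhs); };
        if(opCode == BinaryOpCode::MAX)
            return [](VTLhs lhs, VTRhs rhs) -> VTRes { return std::max(lhs, rhs); };
    }
    throw std::runtime_error("unsupported op code / value type combination");
}
```

With heterogeneous value types, requesting `MIN`/`MAX` falls through to the runtime error, which matches the behavior this PR introduces for that case.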
Thanks @pdamme and @akroviakov for the great suggestion! One other possible solution I was thinking of was changing …
Force-pushed from 842e312 to f527193.
Thanks @pdamme for your feedback. I've updated the PR accordingly.
Thanks @aristotelis96, I will have a look at your changes within the next few days.
…ypes
- AggAll, AggCol, and AggRow used to return the same value type as the argument. In case of some aggregations (e.g., mean, stddev), that did not make sense for integer arguments.
- Extended these template kernels to support different return types.
- Added some test cases.

- Fixed template arguments when calling the EwBinarySca kernel.
- Updated test cases for AggAll, AggRow, and AggCol with macros in order to support combinations of different argument and result value types.

- SUM test cases also test different result types (e.g., sometimes the sum should be a wider integer type).
- Renamed macro parameter from ResultType to VTRes (shorter and more precise: *value* type).
- Added a few comments.
- Some more minor things.

- In case of different value types for min/max, throw a runtime error.
- Updated a test case.
Force-pushed from f527193 to 2c4a9ea.
Thanks again for this contribution, @aristotelis96. And sorry that it took so long.
In fact, different value types for the argument and result of aggregations have been supported for a while on main at the DaphneDSL level. This became possible through a combination of type inference (e.g., the result value type of `mean` on a `si64` matrix is inferred to be `f64`) and type promotion of the arguments (e.g., the arguments of an aggregation are cast to the value type of the result before the aggregation). Nevertheless, this cast happens at the granularity of an entire matrix, thereby creating an additional intermediate (which is usually small inside a vectorized pipeline).
Taking inspiration from this approach (as well as for consistency with it), I slightly adapted the code of this PR. Back then, one of the difficulties was the potentially mismatching types of the left and right inputs of the elementwise binary function applied internally, which required a workaround for `min` and `max`. I have simplified this now by always casting the argument values to the result value type and using an elementwise binary function with `VTRes` consistently for the result and both inputs. This is analogous to the procedure described above. Consistently, I have undone the workaround in `EwBinarySca.h`.
With that, it is ready for merging, from my point of view. Note that the kernels will currently not be used by DaphneDSL scripts, because we still have the type promotion mechanism, which casts the arguments to the result value type, in place for `AllAggOp`, `ColAggOp`, and `RowAggOp`. We would need to simply remove the trait `CastArgsToResType` from them and make sure that the kernels are instantiated for different argument/result value types. Nevertheless, I wouldn't do that right now, because the current approach on main works and I don't want to risk introducing new bugs shortly before the v0.2 release. For instance, the CUDA kernels for aggregation do not support different result/argument value types yet. So let's keep this for later.
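Roughly, the adapted aggregation core then looks like the following sketch (illustrative only, not the actual `AggAll.h` code): the cast to `VTRes` happens per element inside the loop, so, unlike the DaphneDSL-level type promotion, no cast intermediate matrix is materialized, and the binary function is homogeneous in `VTRes`.

```cpp
#include <cstddef>

// The binary function now uses VTRes for the result and both inputs.
template<typename VTRes>
using EwBinFn = VTRes (*)(VTRes, VTRes);

template<typename VTRes, typename VTArg>
VTRes aggAll(EwBinFn<VTRes> func, const VTArg * values, size_t num) {
    VTRes agg = static_cast<VTRes>(values[0]);
    for(size_t i = 1; i < num; i++)
        // Per-element cast: no copy of the whole argument matrix is created.
        agg = func(agg, static_cast<VTRes>(values[i]));
    return agg;
}
```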
`AggAll.h`, `AggCol.h`, and `AggRow.h` used to return the same value type as the argument. In case of some aggregations (e.g., mean, stddev), that did not make much sense for integer arguments, where the result will probably be a floating-point value. With this PR, these template kernels are extended to support a result value type that differs from the argument value type, and test cases are added.
Note that these kernels still support aggregations of type `DenseMatrix<int> -> int`, but we end up losing precision (in the case of `mean`): the floating-point result is not rounded to the closest integer, it is simply converted (truncated). We could support rounding if we want to, but imo, if the user needs that, it makes more sense to apply a rounding function to the floating-point result.
Closes #399.
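A tiny illustration of that precision loss (plain C++ arithmetic, not kernel code):

```cpp
#include <cstdint>
#include <iostream>

int main() {
    // Mean of {1, 2, 2}: 5/3 = 1.666... as a double.
    int64_t sum = 1 + 2 + 2;
    double meanF64 = static_cast<double>(sum) / 3;   // 1.666...
    int64_t meanI64 = static_cast<int64_t>(meanF64); // 1: converted (truncated), not rounded to 2
    std::cout << meanF64 << " vs. " << meanI64 << std::endl;
    return 0;
}
```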