-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sptrsv perf_test enhancements #605
sptrsv perf_test enhancements #605
Conversation
- If KOKKOSKERNELS_ENABLE_TPL_SUPERLU is enabled, sptrsv_supernode must declare read_supernodal_graphL for sptrsv_superlu.
- Include KokkosKernels_config.h in superlu and choldmod perf test. This is needed for defines that conditionally enable the test at compile time. - Improve command line argument parsing for superlu perf test.
@@ -796,8 +796,8 @@ int main(int argc, char **argv) { | |||
} | |||
|
|||
{ | |||
using scalar_t = double; | |||
//using scalar_t = Kokkos::complex<double>; | |||
//using scalar_t = double; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we pick this based on what scalars are defined ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will modify this perf_test to exercise both the double and complex types.
EDIT: We should be able to pick this based on the input files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should also be what is enabled in ETI at compile time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it! Thanks!
#if defined(KOKKOSKERNELS_ENABLE_TPL_LAPACKE) && defined(KOKKOSKERNELS_ENABLE_TPL_CBLAS) | ||
#if (defined(KOKKOSKERNELS_ENABLE_TPL_LAPACKE) && \ | ||
defined(KOKKOSKERNELS_ENABLE_TPL_CBLAS)) \ | ||
|| defined(KOKKOSKERNELS_ENABLE_TPL_SUPERLU) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this right to include cblas and lapacke when SuperLU is defined ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think it is the correct place, I think these includes should be hidden within KokkosBlas
and I intend to do this while resolving #589. These small changes are just to fix compilation errors. || defined(KOKKOSKERNELS_ENABLE_TPL_SUPERLU)
makes the definition of read_supernodal_graphL
available for sptrsv_superlu
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for looking at this. By the way, this file uses CBLAS/LAPACKE, but not SuperLU (in fact, it is also used with sptrsv_cholmod). Does CBLAS/LAPACKE gets enabled, if SuperLU get enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem :)
The problem here is that sptrsv_superlu needs the definition of read_supernodal_graphL
-- otherwise, kokkos-kernels will not compile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR feedback. This is my mistake. It looks like I had a bug with my build makefile generation scripts where CBLAS and LAPACKE were not enabled but SUPERLU was.
I now see that the perf tests pull in KokkosKernels_config.h
via KokkosSparse_sptrsv.hpp->KokkosSparse_sptrsv_symbolic_spec.hpp
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, you are trying to build it without CBLAS/LAPACKE enabled?
@iyamazaki: Yes, I was. This compilation issue arises when --with-tpls=cholmod,superlu
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused now. What is the goal here ? If you enable superlu, cholmod then cblas/lapacke are required. Your changes will get rid of cblas/lapacke dependency, but we are not there yet. Correct ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, I don't think we can compile the code without enabling CBLAS/LAPACKE, right now..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@srajama1: I was confused as well -- if we use --with-tpls=cholmod,superlu
, one way to get things compiling is by adding || defined(KOKKOSKERNELS_ENABLE_TPL_SUPERLU)
here to fix a seemingly unrelated compilation error. If we use --with-tpls=cholmod,superlu,cblas,blas,lapacke
everything works as expected without adding || defined(KOKKOSKERNELS_ENABLE_TPL_SUPERLU)
here.
What is the goal here ?
The goal of this PR is to learn the kokkos-kernels PR guidelines and fix the build for the superlu and cholmod perf_tests and in the process.
If you enable superlu, cholmod then cblas/lapacke are required.
OK, is the correct place to handle this dependency resolution in the new cmake build system (which announces whether a TPL is available) and the ETI specialization layer (which determines if a full specialization is available)?
Your changes will get rid of cblas/lapacke dependency, but we are not there yet. Correct ?
I think that's right assuming I am starting to understand the ETI system. I plan on addressing this dependency in a separate PR -- I will update the description in #589 to see if we are on the same page.
These questions were really helpful, thank you.
- Select scalar_t based on compile time eti-values.
This reverts commit 25a640f.
The spot-check passed on ride: nohup ../scripts/test_all_sandia --kokkoskernels-path=$KOKKOSKERNELS_PATH --kokkos-path=$KOKKOS_PATH --arch="Power8,Pascal60" --spot-check &
Running on machine: white
Going to test compilers: gcc/6.4.0 gcc/7.2.0 gcc/7.4.0 ibm/16.1.0 cuda/9.2.88 cuda/10.0.130
Testing compiler gcc/6.4.0
Testing compiler gcc/7.2.0
Starting job gcc-7.2.0-OpenMP-release
Starting job gcc-6.4.0-OpenMP_Serial-release
PASSED gcc-7.2.0-OpenMP-release
Starting job gcc-7.2.0-Serial-release
PASSED gcc-6.4.0-OpenMP_Serial-release
Testing compiler gcc/7.4.0
Starting job gcc-7.2.0-OpenMP_Serial-release
PASSED gcc-7.2.0-Serial-release
Testing compiler ibm/16.1.0
Starting job gcc-7.4.0-OpenMP-release
PASSED gcc-7.4.0-OpenMP-release
Testing compiler cuda/9.2.88
Starting job ibm-16.1.0-Serial-release
PASSED gcc-7.2.0-OpenMP_Serial-release
PASSED ibm-16.1.0-Serial-release
Testing compiler cuda/10.0.130
Starting job cuda-9.2.88-Cuda_OpenMP-release
PASSED cuda-9.2.88-Cuda_OpenMP-release
Starting job cuda-10.0.130-Cuda_Serial-release
PASSED cuda-10.0.130-Cuda_Serial-release
#######################################################
PASSED TESTS
#######################################################
cuda-10.0.130-Cuda_Serial-release build_time=1248 run_time=853
cuda-9.2.88-Cuda_OpenMP-release build_time=1318 run_time=663
gcc-6.4.0-OpenMP_Serial-release build_time=649 run_time=841
gcc-7.2.0-OpenMP-release build_time=478 run_time=301
gcc-7.2.0-OpenMP_Serial-release build_time=603 run_time=1080
gcc-7.2.0-Serial-release build_time=257 run_time=462
gcc-7.4.0-OpenMP-release build_time=402 run_time=304
ibm-16.1.0-Serial-release build_time=1484 run_time=779
#######################################################
FAILED TESTS
####################################################### |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, this looks simple enough to go in. I would like this for float and complex float as well, but that can come in another PR.
If @iyamazaki is fine, I will push this PR in.
@e10harvey Can you start a spot check on a non-GPU machine as well ? |
@srajama1, no problem. I started another with [email protected] and kokkos-kernels@topic/sptrsv_build_patches on a sems machine. I won't have access to other non-GPU machines listed in |
Got it ! Thanks ! |
@@ -100,7 +102,7 @@ void print_factor_superlu(int n, SuperMatrix *L, SuperMatrix *U, int *perm_r, in | |||
} | |||
printf( "];\n" ); | |||
|
|||
#if 0 | |||
#if defined(KOKKOSKERNELS_ENABLE_TPL_SUPERLU) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm planning to remove these debugging functions in the next PR, but we may get compilation errors/warnings with %e in fprintf using complex scalar type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't encounter any when testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These "spot-check" are not testing these codes, are they?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@e10harvey Bold of you to assume that tests actually enable complex scalar types ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These "spot-check" are not testing these codes, are they?
Correct. The spot-check is not testing these diffs.
@e10harvey Bold of you to assume that tests actually enable complex scalar types ;-)
I ran the cholmod and superlu perf tests manually for both double and complex :-)
We can either close or merge this PR -- it's really just an onboarding task for myself and the only real improvement is the super LU perf test argument processing.
@lucbv Thank you ! Merging it. |
I merged this when I was trying to merge a PR by @lucbv by switching to the wrong window. I think it is still reasonable to merge this. @e10harvey @iyamazaki : Do you agree or should I revert it ? I would have guarded the fprintf with complex being defined, but I think you can submit another quick PR for it, right ? |
I agree but I defer to @iyamazaki. @iyamazaki, shall I submit a PR for the fprintf guarding or are you still planning to remove these in your next PR? |
Yes. I've already removed it. I am not quite done cleaning it up, but I am thinking to do a PR today after resolving conflicts. Thank you again for looking at it!! |
scalar_t
based on configure value of KOKKOSKERNELS_SCALARS.Related to #589.
Spot-check run in progress.