Kokkos + KokkosKernels Promotion To Version 2.6.00 #2351
Thanks @ndellingwood for providing most of these
KokkosGraph_GraphColor.hpp renamed to KokkosGraph_graph_color.hpp
Changes to be committed:
modified: MueLu_AggregationPhase1Algorithm_kokkos_def.hpp
atomic_increment has better performance, and using it avoids the cast that atomic_fetch_add requires
Static members in CellTools (structs with View members) required a finalize hook so that they are destructed properly, preventing errors of the following type: cudaDeviceSynchronize() error(cudaErrorCudartUnloading)
Cast '1' increment to proper type in Kokkos::atomic_fetch_add; replace with Kokkos::atomic_increment when possible.
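To illustrate why the literal `1` needs a cast, here is a minimal sketch that uses only the C++ standard library. `sketch_fetch_add` and `sketch_increment` are hypothetical stand-ins, not the Kokkos functions. They assume the destination and the increment share a single template parameter, which is the situation the commit message implies.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in: destination and increment share one template
// parameter T, so both arguments must agree on T.
template <class T>
T sketch_fetch_add(std::atomic<T>* dest, T val) {
  return dest->fetch_add(val);
}

// Hypothetical stand-in for an increment that takes no value argument,
// so there is nothing to cast at the call site.
template <class T>
void sketch_increment(std::atomic<T>* dest) {
  dest->fetch_add(T(1));
}

// Bumps a std::size_t counter twice per iteration, once with each style,
// and returns the final value.
std::size_t count_events(std::size_t n) {
  std::atomic<std::size_t> counter{0};
  for (std::size_t i = 0; i < n; ++i) {
    // sketch_fetch_add(&counter, 1);  // does not compile: T is deduced
    //                                 // as std::size_t and as int
    sketch_fetch_add(&counter, static_cast<std::size_t>(1));
    sketch_increment(&counter);
  }
  return counter.load();
}
```

In this sketch the argument-free form removes the type mismatch at the call site. The performance benefit mentioned in the commit message is a property of Kokkos itself and is not demonstrated here.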
…into kokkos-develop
Added to fix issues exposed by Panzer unit tests during Trilinos integration testing with Cuda backend.
Necessary to fix a Cuda error of type cudaDeviceSynchronize() error(cudaErrorCudartUnloading), caused by a View wrapped in an RCP having its destructor called after Kokkos::finalize.
Changes to be committed:
modified: ../../../panzer/adapters-stk/test/gather_scatter_evaluators/scatter_field_evaluator.cpp
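The lifetime problem behind both the CellTools and the Panzer fixes can be sketched with a mock runtime. Everything in the `mock` namespace is a hypothetical stand-in written for this illustration, and `std::shared_ptr` plays the role of the RCP. The sketch assumes only what the commit messages state: a resource released after finalize is an error, and the fix is to release it earlier, either through a finalize hook or by resetting the owning pointer.

```cpp
#include <functional>
#include <memory>
#include <utility>
#include <vector>

namespace mock {  // hypothetical stand-in for a runtime with init/finalize

struct State {
  bool alive = false;
  int late_releases = 0;  // handles destroyed after finalize()
  std::vector<std::function<void()>> hooks;
};

inline State& state() {
  static State s;
  return s;
}

// A resource that must be released while the runtime is still alive.
struct Handle {
  ~Handle() {
    if (!state().alive) ++state().late_releases;
  }
};

inline void initialize() { state().alive = true; }

inline void push_finalize_hook(std::function<void()> f) {
  state().hooks.push_back(std::move(f));
}

inline void finalize() {
  // Run hooks in reverse registration order before shutting down.
  for (auto it = state().hooks.rbegin(); it != state().hooks.rend(); ++it) {
    (*it)();
  }
  state().hooks.clear();
  state().alive = false;
}

}  // namespace mock

// Runs one init/finalize cycle and returns how many handles were
// destroyed after finalize().
int run_cycle(bool release_early) {
  mock::state().late_releases = 0;
  mock::initialize();
  auto cached = std::make_shared<mock::Handle>();   // plays the static member
  auto wrapped = std::make_shared<mock::Handle>();  // plays the wrapped View
  if (release_early) {
    mock::push_finalize_hook([&cached] { cached.reset(); });
    wrapped.reset();  // analogous to nulling the pointer before finalize
  }
  mock::finalize();
  cached.reset();   // no-op if the hook already released it
  wrapped.reset();  // no-op if it was reset before finalize
  return mock::state().late_releases;
}
```

Calling `run_cycle(false)` destroys both handles after the mock finalize, while `run_cycle(true)` releases both in time. In real code the late destruction would surface as the Cuda error quoted above rather than as a counter.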
…53ad40
From repository at [email protected]:kokkos/kokkos.git
At commit:
commit e01945d0947f47e579468f325e4d97446453ad40
Merge: 62e760f d1ba7d7
Author: Nathan Ellingwood <[email protected]>
Date: Wed Mar 7 16:10:52 2018 -0700
Merge branch 'develop' for 2.6.00
Part of Kokkos C++ Performance Portability Programming EcoSystem 2.6
…d27a6a56b98114
From repository at [email protected]:kokkos/kokkos-kernels.git
At commit:
commit 6e8e97a977564673bdadc15085d27a6a56b98114
Merge: 00b1648 f81778c
Author: Nathan Ellingwood <[email protected]>
Date: Wed Mar 7 16:29:28 2018 -0700
Merge branch 'develop' for 2.6.00
Part of Kokkos C++ Performance Portability Programming EcoSystem 2.6
Will post shortly.
Testing Results Shepard Serial
Errors and test failures are a strict subset of the ones encountered on the develop branch of Trilinos (without the Kokkos or KokkosKernels snapshots) with the same configuration.
Build Errors: None
General Test Results: Two failures that also failed with the existing Trilinos develop branch
Test Failures:
Testing Results Shepard Pthreads
Errors and test failures are a strict subset of the ones encountered on the develop branch of Trilinos (without the Kokkos or KokkosKernels snapshots) with the same configuration.
Build Errors: A massive number of 'undefined reference' errors on both the develop and kokkos-develop Trilinos branches. The kokkos-develop branch makes it farther through the build and exposes the following MueLu error, which is also seen on the develop and kokkos-develop branches of Trilinos with Cuda:
General Test Results: Many tests not run. No difference between the kokkos-develop and develop branches of Trilinos.
Test Failures
Testing Results White OpenMP
Errors and test failures are a strict subset of the ones encountered on the develop branch of Trilinos (without the Kokkos or KokkosKernels snapshots) with the same configuration.
Build Errors: None
General Test Results:
Test Failures: One additional failure on the develop branch not occurring on the kokkos-develop branch of Trilinos.
Testing Results White Cuda
Errors and test failures are a strict subset of the ones encountered on the develop branch of Trilinos (without the Kokkos or KokkosKernels snapshots) with the same configuration.
Build Errors: No new build errors were introduced by the integration testing. The following errors occur with both the develop and kokkos-develop branches of Trilinos: ROL, MueLu
General Test Results:
Test Failures:One additional test failed on kokkos-develop branch during this testing,
@crtrott Summary of build and test output posted.
OK, considering Jim's post, we can go ahead and merge! The Teuchos::null thing in Panzer can be done later.
Merged!!
Nathan, do you have the merge for Kokkos and KokkosKernels ready?
I.e., is everything in develop, and is it just a straightforward merge at this point?
@crtrott I have the develop and master branches ready (on kokkos-dev) that we prepped for the snapshots into Trilinos. I don't have permission to push them. How do we proceed?
OK, I'll open this up now so you can push, and then I'll close it again.
OK, Kokkos is open for you to push. Don't screw it up ;-)
Kokkos is done!
OK, opening up kokkos-kernels now.
Try now.
@crtrott I still need permission for the master branch on KokkosKernels; develop is done...
Try again
@crtrott develop and master branches updated in both Kokkos and KokkosKernels!
Branches are closed again.
Thanks, Nathan, for all the work on this! Good job.
Thanks for all the help!!
YAY
For what it's worth, the checkin script run also passed.
@trilinos/kokkos
@trilinos/kokkos-kernels
Description
This merges Kokkos and KokkosKernels versions 2.6.00 into Trilinos.
This adds support for Volta GPUs and improves the performance of all Cuda kernel launches, of View shallow copies, and of atomics with the Serial backend.
This adds enhancements for KokkosKernels Batched BLAS and performance improvements for Spgemm hashing.
Kokkos ChangeLog
2.6.00 (2018-03-07)
Full Changelog
Part of the Kokkos C++ Performance Portability Programming EcoSystem 2.6
Implemented enhancements:
Fixed bugs:
KokkosKernels ChangeLog
2.6.00 (2018-03-07)
Full Changelog
Implemented enhancements:
Fixed bugs:
How Has This Been Tested?
This set of changes has been tested on Shepard and White with Intel 17, GCC 5.3, NVCC 8.
Configurations according to kokkos-kernels/scripts/trilinos-integration test scripts.
A detailed list of test failures is provided below. No additional test failures compared to the current
trilinos/develop branch were observed. Both individual packages are passing their comprehensive
nightly test suites. This includes testing of more than 200 configurations spanning 25+ compiler versions and 9 hardware platforms (Intel Skylake, Intel SandyBridge, Intel Haswell, Intel Haswell + NVIDIA K40, Intel KNL, ARM, Power8, Power8 + K80, Power8 + P100).