tpetra vector creation creates 6 modified flags causing a ton of launch overhead #6158
Comments
The modified flags are in Kokkos::DualView. There are three now, but when applications set KOKKOS_ENABLE_DEPRECATED_CODE=OFF, there is one. getVectorNonConst(0) creates a new Tpetra::MultiVector containing the 0th vector of the "this" multivector (not a zero-length view). New flags are created when subviews of the DualView are taken; the subview code is in Kokkos (Kokkos_DualView.hpp). Subviews of both the host and device memory are also created in the DualView subview constructor. Subviews of the import and export buffers are also created on host and device. In Tpetra, perhaps we can add logic to test the number of vectors in a MultiVector, just reusing the views (new = old) instead of subviewing (new = subview(old, ...)) when the multivector has only one vector. @mhoemmen, what do you think?
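To make the "new = old" versus "new = subview(old, ...)" idea concrete, here is a minimal standalone sketch. It does not use Kokkos or Tpetra: `ToyDualView`, `makeDualView`, `subview`, `getColumn`, and `flagAllocations` are hypothetical names invented for illustration, and the toy `subview` only mimics the behavior described in the comment above (shared data, freshly allocated flags). The point is just that the single-vector shortcut avoids the extra flags allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a dual view. NOT the Kokkos::DualView API.
struct ToyDualView {
  std::shared_ptr<std::vector<double>> data;  // storage shared between views
  std::shared_ptr<int> flags;                 // stands in for the modified-flags view
  std::size_t numRows = 0;
  std::size_t firstCol = 0;                   // offset of this view's first column
  std::size_t numCols = 0;
};

// Running count of flag objects allocated so far (illustration only).
inline int& flagAllocations() {
  static int count = 0;
  return count;
}

// Allocate storage and one flags object for a numRows x numCols view.
inline ToyDualView makeDualView(std::size_t numRows, std::size_t numCols) {
  ToyDualView v;
  v.data = std::make_shared<std::vector<double>>(numRows * numCols, 0.0);
  v.flags = std::make_shared<int>(0);
  ++flagAllocations();
  v.numRows = numRows;
  v.numCols = numCols;
  return v;
}

// Assumed behavior, following the comment above: a subview shares the data
// but allocates a new flags object.
inline ToyDualView subview(const ToyDualView& old, std::size_t col) {
  assert(col < old.numCols);
  ToyDualView v = old;                          // shares data
  v.flags = std::make_shared<int>(*old.flags);  // new flags allocation
  ++flagAllocations();
  v.firstCol = old.firstCol + col;
  v.numCols = 1;
  return v;
}

// Proposed shortcut: with exactly one column, reuse the view (new = old)
// instead of subviewing (new = subview(old, ...)).
inline ToyDualView getColumn(const ToyDualView& old, std::size_t col) {
  assert(col < old.numCols);
  if (old.numCols == 1) {
    return old;  // no new flags allocated
  }
  return subview(old, col);
}
```

Calling `getColumn` on a one-column view leaves `flagAllocations()` unchanged and returns a view whose `flags` pointer equals the original's; on a multi-column view it falls back to `subview` and allocates one more flags object. Whether sharing flags like this is safe in the real DualView depends on its sync/modify semantics, which this toy does not model.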
Hi @kddevin ! I'm looking at the implementations of
Notice how the
@crtrott Is there launch overhead in copying the modify flags' view?
tpetra: when looking at trilinos#6158, I saw that, when creating a Vector from a MultiVector, we created subviews for comm buffers, but did not store them. This commit stores them. It also offsets the buffers by the vector j requested from the MultiVector.
tpetra: In trilinos#6598, @mhoemmen recommended this change of offset
With Kokkos' deprecated code removed, the number of flags is reduced.
Bug Report
@trilinos/tpetra
Description
I've figured out that this call here
creates 6 views of modified flags
It should be 1. Fixes in the non-deprecated branch make it 2,
Currently a zero-length view takes 60 us to create, and Ifpack2 makes 1000s of them per solve... That is free money.
There are 4500 calls to this in Trilinos, so that is a lot of places where we could reduce the launch overhead of initializing these flags.
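For a rough sense of scale, the numbers above can be plugged into a back-of-the-envelope estimate. This is only a sketch: it assumes each flags view costs about the 60 us quoted for a zero-length view, and it takes 1000 vector creations per solve as a stand-in for "1000s". The function name `flagOverheadMicros` is made up for illustration.

```cpp
#include <cassert>

// Estimated flag-creation overhead per solve, in microseconds.
// All three inputs are assumptions supplied by the caller, not measurements.
constexpr double flagOverheadMicros(int flagViewsPerCreation,
                                    double microsPerView,
                                    long creationsPerSolve) {
  return flagViewsPerCreation * microsPerView * creationsPerSolve;
}
```

Under those assumptions, `flagOverheadMicros(6, 60.0, 1000)` gives 360000 us (about 0.36 s per solve), versus 60000 us if only one flags view were created per vector.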
Steps to Reproduce