Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tpetra vector creation creates 6 modified flags causing a ton of launch overhead #6158

Closed
bathmatt opened this issue Oct 25, 2019 · 4 comments
Assignees
Labels
pkg: Tpetra type: bug The primary issue is a bug in Trilinos code or tests

Comments

@bathmatt
Copy link
Contributor

Bug Report

@trilinos/tpetra

Description

I've figured out that this call here

RCP<vector_type> B_vec = B.getVectorNonConst (0);

creates 6 views of modified flags

|   |   |   |   |   |   |   |   |   |   |   |   |   compute: 5.93946 - 99.1017% [2148]
|   |   |   |   |   |   |   |   |   |   |   |   |   |   fused: 5.93083 - 99.8546% [2148]
|   |   |   |   |   |   |   |   |   |   |   |   |   |   |   Kokkos View Allocation - DualView::modified_flags: 0.00229072 - 0.0386239% [3866\

It should be 1. Fixes in the non-deprecated branch make it 2,

Currently a 0 length view takes 60us to create and ifpack2 makes 1000s of them per solve... That is free money

There are 4500 calls in trilinos to this, that is a lot of places to reduce launch overhead of initializing these.

Steps to Reproduce

  1. SHA1: [insert here]
  2. Configure script: [attach here]
  3. Configure log: [attach here]
  4. Build log: [attach here]
  5. Input deck: [attach here]
  6. Do this.
  7. Do that.
  8. Shake fist angrily at computer.
  9. Run log: [attach here]
@bathmatt bathmatt added the type: bug The primary issue is a bug in Trilinos code or tests label Oct 25, 2019
@kddevin
Copy link
Contributor

kddevin commented Oct 25, 2019

The modified flags are in Kokkos::DualView. There are three now, but when applications set KOKKOS_ENABLE_DEPRECATED_CODE=OFF, there is one.

getVectorNonConst(0) creates a new Tpetra::MultiVector containing the 0th vector of the "this" multivector (not a zero-length view). New flags are created when subviews of the DualView are taken; the subview code is in Kokkos (Kokkos_DualView.hpp). Subviews of both the host and device memory are also created in the DualView subview constructor. Subviews of the import and export buffers are also created on host and device.

In Tpetra, perhaps we can add logic to test the number of vectors in a MultiVector, just reusing the views (new = old) instead of subviewing (new = subview(old, ...)) when the multivector has only one vector. @mhoemmen, what do you think?

@cgcgcg

@mhoemmen
Copy link
Contributor

Hi @kddevin ! I'm looking at the implementations of getVectorNonConst and the DualView subview constructor, and I'm not seeing where the subviews create new flags. (This matters because people don't always realize that subviews of a DualView share flags with the parent.)

Notice how the Kokkos::DualView subview constructor invokes the flags' copy constructor(s). (Whether one or two copy constructors are invoked depends on whether deprecated code is enabled, but it's the copy constructor each time.) The copy constructor does not allocate; it just shares the allocation of its input, like the std::shared_ptr copy constructor.

@kddevin
Copy link
Contributor

kddevin commented Oct 29, 2019

@crtrott Is there launch overhead in copying the modify flags' view?
There will be two copies (after the user can run with KOKKOS_ENABLE_DEPRECATED_CODE=OFF) because the newly created Vector has (one) a subview of the DualView and (two) a view of the original DualView. The modify flags are touched for each. But isn't each touch inexpensive? Thanks!

kddevin added a commit that referenced this issue Dec 17, 2019
a MultiVector, we created subviews for comm buffers, but did not store them.
This commit stores them.  It also offsets the buffers by the vector j requested
from the MultiVector.
jmgate pushed a commit to tcad-charon/Trilinos that referenced this issue Jan 18, 2020
…s:develop' (d17489d).

* trilinos-develop:
  SEACAS: go back to lib:fmt 6.0.0 until fix issue on vortex xl/cuda build
  Disable Teko_testdriver_tpetra_MPI_4 in all atdm 'waterman' builds (trilinos#6463)
  zoltan2:  add missing include file for non-ETI builds
  Tpetra: Missed ifdef guard
  zoltan2:  name change to prevent shadow warnings
  zoltan2:  Change logic for determining gno types to use in tests Simplified now that Trilinos builds only one gno_t
  Tpetra: More stacked timer fixes
  Tpetra: Fix overflow for TpetraCore_MatrixMatrix_UnitTests
  tpetra:  removed unused field from FixedHashTable This removes some warnings about calling host functions from host device functions trilinos#5698; E.g., warning: calling a __host__ function("std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string") from a __host__ __device__ function("Tpetra::Details::LocalMap<int, long long,  ::Kokkos::Device< ::Kokkos::Serial,  ::Kokkos::HostSpace> > ::~LocalMap [subobject]") is not allowed
  tpetra:  when looking at trilinos#6158, I saw that, when creating a Vector from a MultiVector, we created subviews for comm buffers, but did not store them. This commit stores them.  It also offsets the buffers by the vector j requested from the MultiVector.
jmgate pushed a commit to tcad-charon/Trilinos that referenced this issue Jan 19, 2020
…s:develop' (d17489d).

* trilinos-develop:
  SEACAS: go back to lib:fmt 6.0.0 until fix issue on vortex xl/cuda build
  Disable Teko_testdriver_tpetra_MPI_4 in all atdm 'waterman' builds (trilinos#6463)
  zoltan2:  add missing include file for non-ETI builds
  Tpetra: Missed ifdef guard
  zoltan2:  name change to prevent shadow warnings
  zoltan2:  Change logic for determining gno types to use in tests Simplified now that Trilinos builds only one gno_t
  Tpetra: More stacked timer fixes
  Tpetra: Fix overflow for TpetraCore_MatrixMatrix_UnitTests
  tpetra:  removed unused field from FixedHashTable This removes some warnings about calling host functions from host device functions trilinos#5698; E.g., warning: calling a __host__ function("std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string") from a __host__ __device__ function("Tpetra::Details::LocalMap<int, long long,  ::Kokkos::Device< ::Kokkos::Serial,  ::Kokkos::HostSpace> > ::~LocalMap [subobject]") is not allowed
  tpetra:  when looking at trilinos#6158, I saw that, when creating a Vector from a MultiVector, we created subviews for comm buffers, but did not store them. This commit stores them.  It also offsets the buffers by the vector j requested from the MultiVector.
jmgate pushed a commit to tcad-charon/Trilinos that referenced this issue Jan 20, 2020
…s:develop' (d17489d).

* trilinos-develop:
  tpetra:  In trilinos#6598, @mhoemmen recommended this change of offset
  SEACAS: go back to lib:fmt 6.0.0 until fix issue on vortex xl/cuda build
  Disable Teko_testdriver_tpetra_MPI_4 in all atdm 'waterman' builds (trilinos#6463)
  zoltan2:  add missing include file for non-ETI builds
  Tpetra: Missed ifdef guard
  zoltan2:  name change to prevent shadow warnings
  zoltan2:  Change logic for determining gno types to use in tests Simplified now that Trilinos builds only one gno_t
  Tpetra: More stacked timer fixes
  Tpetra: Fix overflow for TpetraCore_MatrixMatrix_UnitTests
  tpetra:  removed unused field from FixedHashTable This removes some warnings about calling host functions from host device functions trilinos#5698; E.g., warning: calling a __host__ function("std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string") from a __host__ __device__ function("Tpetra::Details::LocalMap<int, long long,  ::Kokkos::Device< ::Kokkos::Serial,  ::Kokkos::HostSpace> > ::~LocalMap [subobject]") is not allowed
  tpetra:  when looking at trilinos#6158, I saw that, when creating a Vector from a MultiVector, we created subviews for comm buffers, but did not store them. This commit stores them.  It also offsets the buffers by the vector j requested from the MultiVector.
jmgate pushed a commit to tcad-charon/Trilinos that referenced this issue Jan 21, 2020
…s:develop' (d17489d).

* trilinos-develop:
  tpetra:  In trilinos#6598, @mhoemmen recommended this change of offset
  Tpetra::CrsMatrix: Add Kokkos kernel labels; expose debug code
  Tpetra::CrsMatrix: Remove values2D_
  Tpetra::CrsGraph: Remove gblInds2D_
  Tpetra::CrsGraph: Remove lclInds2D_
  Tpetra::CrsMatrix: Remove unused method allocateValues2D
  Tpetra: Use verbosePrintCountThreshold in copyOffsets
  Tpetra::Details::Behavior: Add longRowMinNumEntries
  Tpetra::Details::Behavior: Factor out size_t reading
  SEACAS: go back to lib:fmt 6.0.0 until fix issue on vortex xl/cuda build
  Disable Teko_testdriver_tpetra_MPI_4 in all atdm 'waterman' builds (trilinos#6463)
  zoltan2:  add missing include file for non-ETI builds
  Tpetra: Missed ifdef guard
  zoltan2:  name change to prevent shadow warnings
  zoltan2:  Change logic for determining gno types to use in tests Simplified now that Trilinos builds only one gno_t
  Tpetra: More stacked timer fixes
  Tpetra: Fix overflow for TpetraCore_MatrixMatrix_UnitTests
  tpetra:  removed unused field from FixedHashTable This removes some warnings about calling host functions from host device functions trilinos#5698; E.g., warning: calling a __host__ function("std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string") from a __host__ __device__ function("Tpetra::Details::LocalMap<int, long long,  ::Kokkos::Device< ::Kokkos::Serial,  ::Kokkos::HostSpace> > ::~LocalMap [subobject]") is not allowed
  Framework: updating the autotester env to remove dependency on atdm-env
  tpetra:  when looking at trilinos#6158, I saw that, when creating a Vector from a MultiVector, we created subviews for comm buffers, but did not store them. This commit stores them.  It also offsets the buffers by the vector j requested from the MultiVector.
@kddevin
Copy link
Contributor

kddevin commented Feb 25, 2020

With Kokkos' deprecated code removed, the number of flags is reduced.
I will close this issue; if the problem persists, please re-open.

@kddevin kddevin closed this as completed Feb 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pkg: Tpetra type: bug The primary issue is a bug in Trilinos code or tests
Projects
None yet
Development

No branches or pull requests

3 participants