MueLu: build error on Ride #5962

Closed
ikalash opened this issue Sep 23, 2019 · 6 comments
Labels
client: Albany (Issue impacting the Albany project); pkg: MueLu; type: bug (The primary issue is a bug in Trilinos code or tests)

Comments

@ikalash
Contributor

ikalash commented Sep 23, 2019

The Albany nightlies last night failed due to MueLu. Here is the error - seems like there is a missing overloaded implementation:

[ 91%] Building CXX object packages/muelu/src/CMakeFiles/muelu_lgn.dir/Utils/ExplicitInstantiation/MueLu_LocalLexicographicIndexManager.cpp.o
/home/projects/albany/ride/repos/Trilinos/packages/muelu/src/Graph/UncoupledAggregation/MueLu_AggregationPhase2bAlgorithm_kokkos_def.hpp(144): error: no instance of overloaded function "Kokkos::View<DataType, Properties...>::operator() [with DataType=signed int *, Properties=<Kokkos::CudaUVMSpace::memory_space>]" matches the argument list
            argument types are: (signed int, int)
            object type is: const Kokkos::View<signed int *, Kokkos::CudaUVMSpace::memory_space>
          detected during instantiation of "void MueLu::AggregationPhase2bAlgorithm_kokkos<LocalOrdinal, GlobalOrdinal, Node>::BuildAggregatesRandom(const Teuchos::ParameterList &, const MueLu::AggregationPhase2bAlgorithm_kokkos<LocalOrdinal, GlobalOrdinal, Node>::LWGraph_kokkos &, MueLu::AggregationPhase2bAlgorithm_kokkos<LocalOrdinal, GlobalOrdinal, Node>::Aggregates_kokkos &, Kokkos::View<unsigned int *, MueLu::LWGraph_kokkos<LocalOrdinal, GlobalOrdinal, Node>::memory_space> &, MueLu::AggregationPhase2bAlgorithm_kokkos<LocalOrdinal, GlobalOrdinal, Node>::LO &) const [with LocalOrdinal=int, GlobalOrdinal=longlong, Node=Kokkos_Compat_KokkosCudaWrapperNode]"
(80): here

Unfortunately our dashboard is down right now, so I cannot point you to that.

@trilinos/muelu

@ikalash ikalash added the type: bug, pkg: MueLu, and client: Albany labels Sep 23, 2019
@lucbv
Contributor

lucbv commented Sep 23, 2019

@ikalash this is related to work I am doing now.
Could you add a config script for reproducibility?

@ikalash
Contributor Author

ikalash commented Sep 23, 2019

@lucbv : oh ok, great. You will find our config options here: https://github.com/SNLComputation/Albany/blob/master/doc/dashboards/ride.sandia.gov/ctest_nightly.cmake.frag#L153-L295 (unfortunately it's part of a cdash script, so you'd have to parse it a little bit to convert it to a configure script that can be executed).

lucbv added a commit to lucbv/Trilinos that referenced this issue Sep 23, 2019
There is a bad access to a Kokkos::View data in AggregationPhase2bAlgorithm_kokkos. This should be fixed with these changes.
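
For readers unfamiliar with this class of error, here is a minimal sketch of what the compiler message above describes: a rank-1 view (`signed int *`) indexed with two arguments `(signed int, int)`, for which no `operator()` overload exists. `Rank1View` and `assign_all` are hypothetical stand-ins written for illustration only. They are not Kokkos or MueLu code, and the sketch does not claim to reproduce the change made in the referenced commit. It assumes C++17 for `std::is_invocable`.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a rank-1 view; NOT the Kokkos implementation.
template <class T>
class Rank1View {
 public:
  explicit Rank1View(std::size_t n) : data_(n) {}

  // A rank-1 view is indexed with exactly one integral argument.
  template <class I, class = std::enable_if_t<std::is_integral<I>::value>>
  T& operator()(I i) {
    assert(static_cast<std::size_t>(i) < data_.size());  // bounds check
    return data_[static_cast<std::size_t>(i)];
  }

  std::size_t extent() const { return data_.size(); }

 private:
  std::vector<T> data_;
};

// Compile-time checks: one index is accepted, two indices are rejected.
static_assert(std::is_invocable<Rank1View<int>&, int>::value,
              "one index matches operator()");
static_assert(!std::is_invocable<Rank1View<int>&, int, int>::value,
              "two indices match no overload, as in the reported error");

// Hypothetical helper: write the same value into every entry of the view.
inline void assign_all(Rank1View<int>& ids, int value) {
  for (std::size_t i = 0; i < ids.extent(); ++i) {
    // ids(i, 0) = value;  // would not compile: rank-1 view, two indices
    ids(i) = value;        // one index for a rank-1 view
  }
}
```

Constructing `Rank1View<int> ids(4)` and calling `assign_all(ids, 7)` sets all four entries to 7. Uncommenting the two-index line produces a "no matching function" diagnostic analogous to the one reported in this issue.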
@lucbv
Contributor

lucbv commented Sep 23, 2019

@ikalash
PR #5969 should fix your problem; at least, I was able to fix a problem reported by @bartgol with the same class, AggregationPhase2bAlgorithm_kokkos, using his CMake configuration.
I will do a build with the configuration you sent above and check that it works fine, although I probably won't have time to build on white until tomorrow morning.

@ikalash
Contributor Author

ikalash commented Sep 24, 2019

@lucbv: thanks! It may be easiest to just push the fix, then let the CDash tell us if it fixed the issue or not, given how long the GPU builds typically take.

trilinos-autotester added a commit that referenced this issue Sep 24, 2019
Automatically Merged using Trilinos Pull Request AutoTester
PR Title: MueLu: tentative fix for issue #5962
PR Author: lucbv
jmgate pushed a commit to tcad-charon/Trilinos that referenced this issue Sep 24, 2019
…s:develop' (832f686).

* trilinos-develop:
  MueLu: tentative fix for issue trilinos#5962
  MueLu: fixing bug with kokkos refactor in FactoryManager, see issue trilinos#5961
  MueLu: adding unit-test for factoryManager
  mods to fix a bug in the find region code and to add more flexibility to the options that we can run and the output that we can dump.
  MueLu: Fix issue in nullspace fix
  Testing: Tpetra: enable ETI explicitly
  TrilinosCouplings: fix build errors w/o deprecated
  MueLu: remove regionMG coordinate debug output
  Sidafe's Branch Squashed in Place
@lucbv
Contributor

lucbv commented Sep 24, 2019

@ikalash sounds good to me. The patch was merged yesterday, so there is a chance it has already been tested overnight. Let me know if you see any improvement.

@ikalash
Contributor Author

ikalash commented Sep 24, 2019

Seems our ride nightly is back to normal - thanks for fixing it!

@ikalash ikalash closed this as completed Sep 24, 2019