Post-push CI build showing build errors where other builds are not #7195

Closed
bartlettroscoe opened this issue Apr 17, 2020 · 5 comments

@bartlettroscoe
Member

Looking at the CI build shown in this query:

one can see that the Kokkos 3.1 promotion from PR #7172 yesterday broke the CI build shown here. The first build errors were in the KokkosKernels package showing:

In file included from /scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernelssparse/impl/KokkosSparse_spgemm_jacobi_spec.hpp:55:0,
                 from packages/kokkos-kernelsimpl/generated_specializations_cpp/spgemm_jacobi/Sparse_spgemm_jacobi_eti_DOUBLE_ORDINAL_INT_OFFSET_INT_LAYOUTLEFT_EXECSPACE_OPENMP_MEMSPACE_HOSTSPACE_MEMSPACE_HOSTSPACE.cpp:47:
/scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernels/src/sparse/impl/KokkosSparse_spgemm_jacobi_denseacc_impl.hpp: In member function ‘size_t KokkosSparse::Impl::KokkosSPGEMM<HandleType, a_row_view_t_, a_lno_nnz_view_t_, a_scalar_nnz_view_t_, b_lno_row_view_t_, b_lno_nnz_view_t_, b_scalar_nnz_view_t_>::JacobiSpGEMMDenseAcc<a_row_view_t, a_nnz_view_t, a_scalar_view_t, b_row_view_t, b_nnz_view_t, b_scalar_view_t, c_row_view_t, c_nnz_view_t, c_scalar_view_t, dinv_view_t, mpool_type>::get_thread_id(size_t) const’:
/scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernelssparse/impl/KokkosSparse_spgemm_jacobi_denseacc_impl.hpp:120:11: error: ‘impl_hardware_thread_id’ is not a member of ‘Kokkos::OpenMP’
    return Kokkos::OpenMP::impl_hardware_thread_id();
           ^
In file included from /scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernelssparse/impl/KokkosSparse_spgemm_jacobi_spec.hpp:56:0,
                 from packages/kokkos-kernelsimpl/generated_specializations_cpp/spgemm_jacobi/Sparse_spgemm_jacobi_eti_DOUBLE_ORDINAL_INT_OFFSET_INT_LAYOUTLEFT_EXECSPACE_OPENMP_MEMSPACE_HOSTSPACE_MEMSPACE_HOSTSPACE.cpp:47:
/scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernels/src/sparse/impl/KokkosSparse_spgemm_jacobi_sparseacc_impl.hpp: In member function ‘size_t KokkosSparse::Impl::KokkosSPGEMM<HandleType, a_row_view_t_, a_lno_nnz_view_t_, a_scalar_nnz_view_t_, b_lno_row_view_t_, b_lno_nnz_view_t_, b_scalar_nnz_view_t_>::JacobiSpGEMMSparseAcc<a_row_view_t, a_nnz_view_t, a_scalar_view_t, b_row_view_t, b_nnz_view_t, b_scalar_view_t, c_row_view_t, c_nnz_view_t, c_scalar_view_t, dinv_view_t, pool_memory_type>::get_thread_id(size_t) const’:
/scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernelssparse/impl/KokkosSparse_spgemm_jacobi_sparseacc_impl.hpp:209:11: error: ‘impl_hardware_thread_id’ is not a member of ‘Kokkos::OpenMP’
    return Kokkos::OpenMP::impl_hardware_thread_id();
           ^
/scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernels/src/sparse/impl/KokkosSparse_spgemm_jacobi_sparseacc_impl.hpp: In member function ‘size_t KokkosSparse::Impl::KokkosSPGEMM<HandleType, a_row_view_t_, a_lno_nnz_view_t_, a_scalar_nnz_view_t_, b_lno_row_view_t_, b_lno_nnz_view_t_, b_scalar_nnz_view_t_>::JacobiSpGEMMSparseAcc<a_row_view_t, a_nnz_view_t, a_scalar_view_t, b_row_view_t, b_nnz_view_t, b_scalar_view_t, c_row_view_t, c_nnz_view_t, c_scalar_view_t, dinv_view_t, pool_memory_type>::get_thread_id(size_t) const [with a_row_view_t = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; a_nnz_view_t = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; a_scalar_view_t = Kokkos::View<const double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_row_view_t = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_nnz_view_t = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_scalar_view_t = Kokkos::View<const double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; c_row_view_t = Kokkos::View<int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; c_nnz_view_t = Kokkos::View<int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; c_scalar_view_t = Kokkos::View<double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; dinv_view_t = Kokkos::View<const double**, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; pool_memory_type = 
KokkosKernels::Impl::UniformMemoryPool<Kokkos::HostSpace, int>; HandleType = KokkosKernels::Experimental::KokkosKernelsHandle<const int, const int, const double, Kokkos::OpenMP, Kokkos::HostSpace, Kokkos::HostSpace>; a_row_view_t_ = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; a_lno_nnz_view_t_ = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; a_scalar_nnz_view_t_ = Kokkos::View<const double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_lno_row_view_t_ = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_lno_nnz_view_t_ = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_scalar_nnz_view_t_ = Kokkos::View<const double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; size_t = long unsigned int]’:
/scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernelssparse/impl/KokkosSparse_spgemm_jacobi_sparseacc_impl.hpp:224:7: warning: control reaches end of non-void function [-Wreturn-type]
       }
       ^
In file included from /scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernelssparse/impl/KokkosSparse_spgemm_jacobi_spec.hpp:55:0,
                 from packages/kokkos-kernelsimpl/generated_specializations_cpp/spgemm_jacobi/Sparse_spgemm_jacobi_eti_DOUBLE_ORDINAL_INT_OFFSET_INT_LAYOUTLEFT_EXECSPACE_OPENMP_MEMSPACE_HOSTSPACE_MEMSPACE_HOSTSPACE.cpp:47:
/scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernels/src/sparse/impl/KokkosSparse_spgemm_jacobi_denseacc_impl.hpp: In member function ‘size_t KokkosSparse::Impl::KokkosSPGEMM<HandleType, a_row_view_t_, a_lno_nnz_view_t_, a_scalar_nnz_view_t_, b_lno_row_view_t_, b_lno_nnz_view_t_, b_scalar_nnz_view_t_>::JacobiSpGEMMDenseAcc<a_row_view_t, a_nnz_view_t, a_scalar_view_t, b_row_view_t, b_nnz_view_t, b_scalar_view_t, c_row_view_t, c_nnz_view_t, c_scalar_view_t, dinv_view_t, mpool_type>::get_thread_id(size_t) const [with a_row_view_t = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; a_nnz_view_t = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; a_scalar_view_t = Kokkos::View<const double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_row_view_t = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_nnz_view_t = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_scalar_view_t = Kokkos::View<const double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; c_row_view_t = Kokkos::View<int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; c_nnz_view_t = Kokkos::View<int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; c_scalar_view_t = Kokkos::View<double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; dinv_view_t = Kokkos::View<const double**, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; mpool_type = 
KokkosKernels::Impl::UniformMemoryPool<Kokkos::HostSpace, double>; HandleType = KokkosKernels::Experimental::KokkosKernelsHandle<const int, const int, const double, Kokkos::OpenMP, Kokkos::HostSpace, Kokkos::HostSpace>; a_row_view_t_ = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; a_lno_nnz_view_t_ = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; a_scalar_nnz_view_t_ = Kokkos::View<const double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_lno_row_view_t_ = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_lno_nnz_view_t_ = Kokkos::View<const int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; b_scalar_nnz_view_t_ = Kokkos::View<const double*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >; size_t = long unsigned int]’:
/scratch/rabartl/Trilinos.base/SEMSCIBuild/Trilinos/packages/kokkos-kernelssparse/impl/KokkosSparse_spgemm_jacobi_denseacc_impl.hpp:131:7: warning: control reaches end of non-void function [-Wreturn-type]
       }
       ^

We are not seeing that build error in any of the Trilinos PR builds yet as shown here:

and it is not being seen in any of the ATDM Trilinos builds today, as shown here:

@bartlettroscoe added the labels "type: bug" (the primary issue is a bug in Trilinos code or tests) and "ATDM DevOps" (issues that will be worked by the Coordinated ATDM DevOps teams) on Apr 17, 2020
@bartlettroscoe
Member Author

bartlettroscoe commented Apr 17, 2020

I deleted the build directory on the machine ceerws1113 and restarted the CI server from scratch with:

nohup \
env \
  TRILINOS_CI_DO_INITIAL_REBUILD=1 \
  CTEST_BUILD_FLAGS="-j8 -k 999999" \
  CTEST_PARALLEL_LEVEL=8 \
./Trilinos/cmake/ctest/drivers/sems_ci/trilinos_ci_sever.sh \
  &> trilinos_ci_server.out &

Let's see what happens with this.
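
For anyone repeating this, the "delete the build directory" step can be sketched as a small guarded helper. This is only an illustration: the function name `reset_build_dir` and the directory passed to it are hypothetical and are not part of the Trilinos CI scripts.

```shell
# Hypothetical helper (not part of Trilinos): wipe a stale build
# directory and recreate it empty so the next CI iteration starts clean.
reset_build_dir() {
  local build_dir="$1"
  # Refuse obviously dangerous targets before running rm -rf.
  if [ -z "$build_dir" ] || [ "$build_dir" = "/" ]; then
    echo "refusing to remove '$build_dir'" >&2
    return 1
  fi
  rm -rf -- "$build_dir"
  mkdir -p -- "$build_dir"
}
```

It would be called with the CI build directory on the machine (for example `reset_build_dir "$HOME/some/BUILD"`, where the path is a placeholder) before restarting the server with the command above.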

@bartlettroscoe
Member Author

What is concerning is that I have not been getting emails for all of those broken iterations. I only got the first email for the first error iteration. Not good.

@csiefer2
Member

FYI - I have seen some "need to blitz the build directory" issues with the Kokkos upgrade.

@bartlettroscoe
Member Author

> FYI - I have seen some "need to blitz the build directory" issues with the Kokkos upgrade.

@csiefer2, we have seen this before as reported in #6855. I need to add that solid reproducer for that case.

And indeed, a build from scratch for the CI build seems to have fixed the problem as shown in:

We really need to get someone to figure out why Kokkos is not rebuilding correctly after their big CMake refactor.
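
Until that is tracked down, one rough way to check for leftovers is to see which files in the source tree and in the build tree mention the symbol from the error message. This is a sketch under an assumption the thread does not confirm, namely that the failure comes from stale Kokkos headers or generated files left in the build tree. The function name `files_mentioning` is hypothetical.

```shell
# Hypothetical diagnostic (an assumption, not an established procedure):
# list the files under a directory that mention a given symbol.
files_mentioning() {
  local symbol="$1" dir="$2"
  # -r: recurse, -l: print file names only, -F: fixed string, not a regex.
  # "|| true" keeps the function from failing when nothing matches.
  grep -rlF -- "$symbol" "$dir" 2>/dev/null || true
}
```

Running it once on the Kokkos sources and once on the build directory (for example `files_mentioning impl_hardware_thread_id <dir>`, with `<dir>` filled in for each tree) and comparing the two lists might show whether the build tree still holds copies that predate the update.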

@bartlettroscoe
Member Author

I will go ahead and close this issue since it looks like the problem is resolved (until the next time Kokkos is updated or they fix #6855).
