GCC 4.9.3-SERIAL fails in nightly clean test #3773
Comments
@william76 there hasn't been anything from kokkos-kernels pushed into Trilinos since Oct 11; see SHA 549ca9b. Have you enabled different test options that weren't being used before? Is Pthreads enabled even though this is referencing a serial build? |
@ndellingwood I'm not 100% sure. I was out of town for a wedding since last Wednesday, and we just got back home late last night. I was having a look at the testing and noticed the test in the Clean track failing. It's odd then that it was passing until the 25th and then started failing every night. Perhaps something on the ASCIC build farm got changed?
It's definitely possible that the "SERIAL" build in the Clean track is pulling in parallel stuff; that test was put together a while back and I've never really dug into what it's doing. The variant of the serial test in the PR testing is new and it's passing the latest SHA1 just fine. I put some effort into the new serial test to keep parallel TPLs from loading and to turn off all the parallel stuff I could find.
@jwillenbring Now that we have the dev->master PR set up and running, what do we want to do with the tests in the "Clean" track? Should we deprecate them in favor of supporting the PR versions? We should discuss this in our stand-up on Wednesday. |
Looks like somebody turned off the Pthread TPL, or isn't linking with that library when they should. Some OpenMP implementations want that. |
I am going to assume this is a @trilinos/framework issue. |
I looked more closely at the dates and times of these tests. This error would have shown up on the 25th of October, not the 24th, since the tests that started failing kicked off at 10pm on the 25th. Here are the PRs that were merged on 10/25. I don't see anything from @trilinos/framework that would have changed the build system. It looks like @jhux2 changed a couple of CMake settings in #3732 and #3736, but at first glance the changes don't look like they'd be causing this error.
@srajama1 We spoke about this in the Framework meeting this morning, and @prwolfe said that they're seeing this error now as well on their Sierra testing. Can @trilinos/kokkos have a look at the output from testing and the PRs that were merged in on 10/25 to see if anything pushed in that day looks like it could be the culprit?
@bartlettroscoe Any suggestions on something in TriBITS that could be looked into to figure out why @trilinos/kokkos-kernels is trying to link pthreads in a serial build?
@prwolfe Can you add any links to information on this you're seeing on the Sierra side? Any dashboard failures you're seeing? If you're in the ceelan area, can you ask them if anything changed on the configuration of the systems recently?
@mhoemmen Yeah, the odd thing to me is that this error just popped up out of the blue. Since the gtest stuff is bundled inside @trilinos/kokkos, my initial thought would be to look there or into Kokkos-Kernels for an update to Trilinos... but it looks like there weren't any updates on the 25th. So the next culprit I'd think of is that something changed on the ASCIC nodes that this test runs on, or something in the framework... but nothing from framework was merged that day. I'll poke around at the build for that test and see if anything jumps out at me... |
I am seeing timeouts for 2-4 of these tests nightly. Note that we do have pthreads off; I'm not sure why, and I will go review that ticket. |
@william76, for the build |
@william76 can you try adding this line:
to the following location in kokkos-kernels: |
@bartlettroscoe Thanks for checking that and providing the links!
@jhux2 Thanks for having a look.
@ndellingwood Thanks, I'll test that change out and see if that disables the flag... from what I see,
Trilinos/packages/kokkos/tpls/gtest/gtest/gtest.h, lines 583 to 590 in d193a2d:
#if GTEST_HAS_PTHREAD
// gtest-port.h guarantees to #include <pthread.h> when GTEST_HAS_PTHREAD is
// true.
# include <pthread.h> // NOLINT
// For timespec and nanosleep, used below.
# include <time.h> // NOLINT
#endif
I'll test this guard out and report back. |
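To illustrate how a guard like the one quoted from gtest.h keeps pthread symbols out of a serial build, here is a minimal sketch. It is not Trilinos or gtest code, and the exact line suggested earlier in the thread is not preserved here, so this only shows the mechanism. The function name `tls_backend` is hypothetical, and the sketch assumes the build defines `GTEST_HAS_PTHREAD` to 0 (for example through a `-DGTEST_HAS_PTHREAD=0` compile definition) for the serial configuration.

```cpp
// Hypothetical stand-in for the guard pattern quoted from gtest.h.
// Not Trilinos or gtest code; names below are for illustration only.
#include <cassert>
#include <string>

// Assumption: a serial build passes -DGTEST_HAS_PTHREAD=0.
// If nothing is passed, this sketch defaults to 0 so it never needs libpthread.
#ifndef GTEST_HAS_PTHREAD
# define GTEST_HAS_PTHREAD 0
#endif

#if GTEST_HAS_PTHREAD
# include <pthread.h>  // only pulled in when the pthread path is enabled
#endif

// Reports which thread-local-storage path was compiled in.
std::string tls_backend() {
#if GTEST_HAS_PTHREAD
  // This branch references pthread_key_create, so the link line
  // must provide the pthread library (e.g. via -pthread).
  pthread_key_t key;
  if (pthread_key_create(&key, nullptr) != 0) {
    return "pthread (key creation failed)";
  }
  pthread_key_delete(key);
  return "pthread";
#else
  // With the macro set to 0, no pthread symbol is referenced at all,
  // so an undefined reference to pthread_key_create cannot occur.
  return "none";
#endif
}
```

With the macro at 0, the preprocessor removes every pthread call before the compiler sees it, which is why a guard of this kind can resolve a link error without changing the link line. Whether that is the right fix for this build, as opposed to re-enabling the Pthread TPL, is a separate question.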
I modified the |
@william76 Your PR testing passed. Do you want to push this change in? |
@trilinos/kokkos-kernels
An error showed up in the gcc 4.9.3 SERIAL build on 10/25, which means the issue likely was introduced sometime on 10/24.
The errors I'm seeing are:
I was out of town for the past week... @jwillenbring do you know if anything @trilinos/framework related changed on our configurations that would cause an undefined reference to things like pthread_key_create?
@trilinos/kokkos-kernels was anything merged into Trilinos that is using any new features from pthreads that might require a newer version of pthreads on the testing machines?
I also found this StackOverflow issue which was related to someone linking libgtest incorrectly... were any changes that might relate to gtest stuff in KokkosKernels pushed into Trilinos on 10/24?