Tpetra: Fix #5279 (add test for checking a pointer to see if it's CUDA-accessible) #5280
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Build Information
Test Name: Trilinos_pullrequest_gcc_4.8.4
Test Name: Trilinos_pullrequest_intel_17.0.1
Test Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL
Test Name: Trilinos_pullrequest_gcc_7.2.0
Test Name: Trilinos_pullrequest_cuda_9.2
Pull Request Author: mhoemmen |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED
Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.
Pull Request Auto Testing has FAILED
Build Information
Test Name: Trilinos_pullrequest_gcc_4.8.4
Test Name: Trilinos_pullrequest_intel_17.0.1
Test Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL
Test Name: Trilinos_pullrequest_gcc_7.2.0
Test Name: Trilinos_pullrequest_cuda_9.2
Console Output (last 100 lines): Trilinos_pullrequest_gcc_4.8.4 # 3699
Console Output (last 100 lines): Trilinos_pullrequest_intel_17.0.1 # 3514
Console Output (last 100 lines): Trilinos_pullrequest_gcc_4.9.3_SERIAL # 1960
Console Output (last 100 lines): Trilinos_pullrequest_gcc_7.2.0 # 1735
Console Output (last 100 lines): Trilinos_pullrequest_cuda_9.2 # 1364 |
Thanks @mhoemmen!
@mhoemmen, do you need some flag to turn this on? |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Build Information
Test Name: Trilinos_pullrequest_gcc_4.8.4
Test Name: Trilinos_pullrequest_intel_17.0.1
Test Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL
Test Name: Trilinos_pullrequest_gcc_7.2.0
Test Name: Trilinos_pullrequest_cuda_9.2
Pull Request Author: mhoemmen |
@bathmatt wrote:
This is only the first step. Once I get this PR through, I will make Tpetra classes like Map, MultiVector, CrsGraph, and CrsMatrix use this function. Their constructors will use the |
@trilinos/framework Something appears to be wrong with tests. |
OK, so there is no reason to test this yet? It won't show anything? |
@bathmatt wrote:
That's right, this first part is just the function for testing itself, not the use of that function in Tpetra objects. |
I am looking at it now. |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED
Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.
Pull Request Auto Testing has FAILED
Build Information
Test Name: Trilinos_pullrequest_gcc_4.8.4
Test Name: Trilinos_pullrequest_intel_17.0.1
Test Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL
Test Name: Trilinos_pullrequest_gcc_7.2.0
Test Name: Trilinos_pullrequest_cuda_9.2
Console Output (last 100 lines): Trilinos_pullrequest_gcc_4.8.4 # 3700
Console Output (last 100 lines): Trilinos_pullrequest_intel_17.0.1 # 3515
Console Output (last 100 lines): Trilinos_pullrequest_gcc_4.9.3_SERIAL # 1961
Console Output (last 100 lines): Trilinos_pullrequest_gcc_7.2.0 # 1736
Console Output (last 100 lines): Trilinos_pullrequest_cuda_9.2 # 1365 |
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Build Information
Test Name: Trilinos_pullrequest_gcc_4.8.4
Test Name: Trilinos_pullrequest_intel_17.0.1
Test Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL
Test Name: Trilinos_pullrequest_gcc_7.2.0
Test Name: Trilinos_pullrequest_cuda_9.2
Pull Request Author: mhoemmen |
I think I know what's going on here. In CUDA 9.2, the only way This suggests I can fix this issue by calling |
I was able to replicate the test failure in a debug CUDA build. When I added the |
@trilinos/tpetra My fix for trilinos#5279 uses the error code returned by cudaPointerGetAttributes to identify an unregistered host pointer. It turns out that this error code sticks around and causes a spurious failure later. I fixed this by calling cudaGetLastError after cudaPointerGetAttributes; that resets the "last error" to cudaSuccess.
I just pushed the fix. Let's try this again. |
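A minimal sketch of the pattern described in the comment above: probe a pointer with cudaPointerGetAttributes, then call cudaGetLastError so that an expected failure does not linger as the "last error". This is not the code from the PR. The function name isCudaAccessible, the macro SKETCH_USE_CUDA_RUNTIME, and the stand-in definitions in the #else branch are hypothetical, added only so that the sketch compiles and runs without a GPU. It also assumes the CUDA 9.2 behavior discussed in this thread, where an unregistered host pointer makes cudaPointerGetAttributes return an error. Newer CUDA releases may instead return cudaSuccess and report such pointers through the attributes, so an error-code check alone would likely not suffice there.

```cpp
#include <cassert>

#ifdef SKETCH_USE_CUDA_RUNTIME  // hypothetical macro: define it to build against the real CUDA runtime
#include <cuda_runtime.h>
#else
// Stand-ins (assumptions, NOT the CUDA API) that mimic the sticky
// "last error" behavior described in the thread.
enum cudaError_t { cudaSuccess = 0, cudaErrorInvalidValue = 1 };
struct cudaPointerAttributes {};
static cudaError_t g_lastError = cudaSuccess;
static int g_knownToCuda = 0;  // pretend only this address is known to CUDA

static cudaError_t cudaPointerGetAttributes(cudaPointerAttributes*, const void* ptr) {
  if (ptr == &g_knownToCuda) return cudaSuccess;
  g_lastError = cudaErrorInvalidValue;  // the error is recorded, not just returned
  return cudaErrorInvalidValue;
}

static cudaError_t cudaGetLastError() {
  const cudaError_t err = g_lastError;
  g_lastError = cudaSuccess;  // reading the last error also resets it
  return err;
}
#endif

// Hypothetical helper: true if the query succeeds, false if it reports an error.
static bool isCudaAccessible(const void* ptr) {
  cudaPointerAttributes attr;
  const cudaError_t err = cudaPointerGetAttributes(&attr, ptr);
  // An unregistered host pointer is an expected outcome here, so clear the
  // recorded error; otherwise a later, unrelated check would see it.
  (void) cudaGetLastError();
  return err == cudaSuccess;
}

// Usage: after probing a plain host pointer, no stale error remains.
static void exampleUsage() {
  int hostValue = 0;
  const bool accessible = isCudaAccessible(&hostValue);
  (void) accessible;
  assert(cudaGetLastError() == cudaSuccess);
}
```

Without the cudaGetLastError call inside the helper, the next piece of code that checks the last error would pick up the failure left behind by the probe, which matches the spurious failure described above.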
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Build Information
Test Name: Trilinos_pullrequest_gcc_4.8.4
Test Name: Trilinos_pullrequest_intel_17.0.1
Test Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL
Test Name: Trilinos_pullrequest_gcc_7.2.0
Test Name: Trilinos_pullrequest_cuda_9.2
Pull Request Author: mhoemmen |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED
Pull Request Auto Testing has PASSED
Build Information
Test Name: Trilinos_pullrequest_gcc_4.8.4
Test Name: Trilinos_pullrequest_intel_17.0.1
Test Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL
Test Name: Trilinos_pullrequest_gcc_7.2.0
Test Name: Trilinos_pullrequest_cuda_9.2 |
Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |
All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur... |
IT PASSED YAY!!! |
Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ tjfulle ]! |
Status Flag 'Pull Request AutoTester' - Pull Request will be Automerged |
Merge on Pull Request# 5280: IS A SUCCESS - Pull Request successfully merged |
Thanks @tjfulle ! :-D |
…s:develop' (d1a6821).
* trilinos-develop: (41 commits)
Tpetra: Fix CUDA run-time error in PR trilinos#5280
Tpetra: Allow CrsMatrix with StaticProfile to resize during import/export (trilinos#5268)
Xpetra: ETI TpetraBlockCrsMatrix bug fixes trilinos#4 (compiles)
Tpetra: Fix Intel build errors in PR trilinos#5280
Tpetra: Use trilinos#5279 fix to check Map array/View input
Tpetra: Use trilinos#5279 fix to check MultiVector (Dual)View input
Xpetra: ETI TpetraBlockCrsMatrix bug fixes trilinos#3
Tpetra: Rename headers in fix for trilinos#5279
Tpetra: Fix trilinos#5279
Xpetra: ETI TpetraBlockCrsMatrix bug fixes trilinos#2
Tpetra: Add inMemorySpace & memorySpaceName (part of trilinos#5279)
Panzer: Generalized logic to use DirichletResidual_FaceBasis for all HDiv DOFs
Cmake/Ctest: Fixing Avatar Issues
MueLu: Updates for Avatar-as-external-package
MueLu: Updates for Avatar-as-external-package
Cmake: Adding back Avatar TPL Support
CMake: Add support for Check TPL
Ctest: More lighsaber mods
Ctest: More lightsaber fixes
Ctest: Adding lightsaber experimentals
...
@trilinos/tpetra @bathmatt @vbrunini
Description
Add a Tpetra implementation detail function to test whether a given nonnull pointer is CUDA-accessible. This is a partial fix for #5279. A full fix would involve finding all the Tpetra object constructors that take Kokkos::View or Kokkos::DualView from the user, and using this function in those constructors. I'll open a separate issue for that.
Motivation and Context
Help debug run-time errors in Panzer: see discussion here.
@vbrunini suggested that Kokkos::View's constructor (the overloads that take a raw pointer from the user) could use this function, at least in debug mode. @crtrott, what do you think?
Related Issues
How Has This Been Tested?
CUDA 9.2 build with GCC 7.2.0 (x86) and OpenMPI 1.10.1.