Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KokkosKernels: improve CRS sorting performance #13401

Merged
merged 1 commit into from
Aug 30, 2024

Conversation

brian-kelley
Copy link
Contributor

@brian-kelley brian-kelley commented Aug 27, 2024

The changes exactly match this KokkosKernels PR (its description has some performance measurements): kokkos/kokkos-kernels/pull/2293. The highlight is that for Sierra TF's abnormal_energy matrix, this improves performance of sort_crs_matrix by 6.3x on V100.

@trilinos/kokkos-kernels

Stakeholder Feedback

sort_crs_matrix being very slow for irregular matrices was directly reported by @vbrunini for Sierra TF. But this will give an uplift for other applications running on GPUs with either very regular, or very irregular matrices (see the KK PR for details).

Testing

All sort_crs_ functions are tested in KokkosKernels sparse unit tests.

The changes exactly match github.com/kokkos/kokkos-kernels/pull/2293.
Improves performance of sort_crs_matrix by up to 6.3x.
@brian-kelley brian-kelley added impacting: performance pkg: KokkosKernels client: Sierra All issues that primarily impacts SNL Sierra codes labels Aug 27, 2024
@brian-kelley brian-kelley self-assigned this Aug 27, 2024
@brian-kelley brian-kelley requested a review from a team as a code owner August 27, 2024 18:12
@trilinos-autotester
Copy link
Contributor

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection is Not Necessary for this Pull Request.

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING (click to expand)

Build Information

Test Name: PR_gcc-openmpi-openmp

  • Build Num: 424
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-openmp_release-debug_static_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS impacting: performance;pkg: KokkosKernels;client: Sierra
PULLREQUESTNUM 13401
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 44ef23d
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c6525b

Build Information

Test Name: PR_gcc

  • Build Num: 474
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_no-mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS impacting: performance;pkg: KokkosKernels;client: Sierra
PULLREQUESTNUM 13401
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 44ef23d
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c6525b

Build Information

Test Name: PR_gcc-openmpi_debug

  • Build Num: 475
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-serial_debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS impacting: performance;pkg: KokkosKernels;client: Sierra
PULLREQUESTNUM 13401
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 44ef23d
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c6525b

Build Information

Test Name: PR_clang

  • Build Num: 473
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-clang-11.0.1-openmpi-4.0.5-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS impacting: performance;pkg: KokkosKernels;client: Sierra
PULLREQUESTNUM 13401
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 44ef23d
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c6525b

Build Information

Test Name: Trilinos_PR_python3

  • Build Num: 4432
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_aue-gnu-12.1.0-anaconda3-serial_debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_no-mpi_no-pt_no-rdc_no-uvm_deprecated-on_pr-framework
PR_LABELS impacting: performance;pkg: KokkosKernels;client: Sierra
PULLREQUESTNUM 13401
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 44ef23d
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c6525b

Build Information

Test Name: PR_cuda

  • Build Num: 472
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-cuda-11.4.2-gnu-10.1.0-openmpi-4.1.6_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS impacting: performance;pkg: KokkosKernels;client: Sierra
PULLREQUESTNUM 13401
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8-gpu
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 44ef23d
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c6525b

Build Information

Test Name: PR_intel

  • Build Num: 393
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-intel-2021.3-sems-openmpi-4.1.6_release-debug_shared_no-kokkos-arch_no-asan_no-complex_fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS impacting: performance;pkg: KokkosKernels;client: Sierra
PULLREQUESTNUM 13401
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 44ef23d
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c6525b

Build Information

Test Name: PR_cuda-uvm

  • Build Num: 472
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-cuda-11.4.2-gnu-10.1.0-openmpi-4.1.6_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_uvm_deprecated-on_no-package-enables
PR_LABELS impacting: performance;pkg: KokkosKernels;client: Sierra
PULLREQUESTNUM 13401
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 44ef23d
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c6525b

Using Repos:

Repo: TRILINOS (brian-kelley/Trilinos)
  • Branch: PatchKK2293
  • SHA: 44ef23d
  • Mode: TEST_REPO

Pull Request Author: brian-kelley

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED (click to expand)

Build Information

Test Name: PR_gcc-openmpi-openmp

  • Build Num: 424
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-openmp_release-debug_static_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS impacting: performance;pkg: KokkosKernels;client: Sierra
PULLREQUESTNUM 13401
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 44ef23d
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c6525b

Build Information

Test Name: PR_gcc

  • Build Num: 474
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_no-mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS impacting: performance;pkg: KokkosKernels;client: Sierra
PULLREQUESTNUM 13401
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 44ef23d
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c6525b

Build Information

Test Name: PR_gcc-openmpi_debug

  • Build Num: 475
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-serial_debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS impacting: performance;pkg: KokkosKernels;client: Sierra
PULLREQUESTNUM 13401
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 44ef23d
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c6525b

Build Information

Test Name: PR_clang

  • Build Num: 473
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-clang-11.0.1-openmpi-4.0.5-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS impacting: performance;pkg: KokkosKernels;client: Sierra
PULLREQUESTNUM 13401
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 44ef23d
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c6525b

Build Information

Test Name: Trilinos_PR_python3

  • Build Num: 4432
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_aue-gnu-12.1.0-anaconda3-serial_debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_no-mpi_no-pt_no-rdc_no-uvm_deprecated-on_pr-framework
PR_LABELS impacting: performance;pkg: KokkosKernels;client: Sierra
PULLREQUESTNUM 13401
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 44ef23d
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c6525b

Build Information

Test Name: PR_cuda

  • Build Num: 472
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-cuda-11.4.2-gnu-10.1.0-openmpi-4.1.6_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS impacting: performance;pkg: KokkosKernels;client: Sierra
PULLREQUESTNUM 13401
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8-gpu
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 44ef23d
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c6525b

Build Information

Test Name: PR_intel

  • Build Num: 393
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-intel-2021.3-sems-openmpi-4.1.6_release-debug_shared_no-kokkos-arch_no-asan_no-complex_fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS impacting: performance;pkg: KokkosKernels;client: Sierra
PULLREQUESTNUM 13401
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 44ef23d
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c6525b

Build Information

Test Name: PR_cuda-uvm

  • Build Num: 472
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-cuda-11.4.2-gnu-10.1.0-openmpi-4.1.6_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_uvm_deprecated-on_no-package-enables
PR_LABELS impacting: performance;pkg: KokkosKernels;client: Sierra
PULLREQUESTNUM 13401
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 44ef23d
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c6525b


CDash Test Results for PR# 13401.

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
WARNING: NO REVIEWERS HAVE BEEN REQUESTED FOR THIS PULL REQUEST!

@trilinos-autotester
Copy link
Contributor

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

@brian-kelley brian-kelley requested a review from lucbv August 28, 2024 16:06
@trilinos-autotester
Copy link
Contributor

Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
NO REVIEWS HAVE BEEN PERFORMED ON THIS PULL REQUEST!

@trilinos-autotester
Copy link
Contributor

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

1 similar comment
@trilinos-autotester
Copy link
Contributor

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

@brian-kelley
Copy link
Contributor Author

Thanks @lucbv. It looks like all the AT2 builds are failing for reasons not related to my code, but #13414 should fix those and then I'll retest.

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ lucbv ]!

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pull Request AutoTester' - AutoMerge IS ENABLED, but the Label AT: AUTOMERGE is not set. Either set Label AT: AUTOMERGE or manually merge the PR...

@brian-kelley brian-kelley merged commit f424b7a into trilinos:develop Aug 30, 2024
13 of 18 checks passed
@brian-kelley brian-kelley deleted the PatchKK2293 branch August 30, 2024 05:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
client: Sierra All issues that primarily impacts SNL Sierra codes impacting: performance pkg: KokkosKernels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants