Dist1 PPS improvements #577

Closed (3 commits)

brian-kelley (Contributor) commented Jan 23, 2020

Use a single scan pass to build the worklist, rather than a scan followed by a separate for loop. A temporary integer view to hold insertion indices is no longer needed, since the scan can do the insertion itself in its "final" sweep.
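
For readers unfamiliar with the pattern, here is a minimal sketch of the idea in plain C++. It is not the code from this PR: `BuildWorklist`, `serial_scan`, and `build_worklist` are hypothetical names, the "color 0 means uncolored" convention is an assumption, and the two serial loops only stand in for the sweeps a parallel scan would make. The point is that the functor writes each selected index directly into the worklist once the exclusive prefix sum is known, so no side array of insertion indices is needed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical scan functor: 'update' carries the number of selected
// entries before index i (an exclusive prefix sum).
struct BuildWorklist {
  const std::vector<int>& colors;  // assumption: 0 means "still uncolored"
  std::vector<int>& worklist;      // output, sized for the worst case

  void operator()(int i, int& update, bool final_pass) const {
    if (colors[i] == 0) {
      // Only the final sweep sees the true prefix sum, so only it writes.
      if (final_pass) worklist[update] = i;
      ++update;
    }
  }
};

// Serial stand-in for a two-sweep scan: a counting sweep, then a final sweep.
template <class Functor>
int serial_scan(int n, const Functor& f) {
  int total = 0;
  for (int i = 0; i < n; ++i) f(i, total, false);
  int update = 0;
  for (int i = 0; i < n; ++i) f(i, update, true);
  assert(update == total);  // both sweeps must agree on the count
  return update;
}

// Returns the indices of all uncolored vertices, in increasing order.
std::vector<int> build_worklist(const std::vector<int>& colors) {
  std::vector<int> worklist(colors.size());
  BuildWorklist f{colors, worklist};
  int count = serial_scan(static_cast<int>(colors.size()), f);
  worklist.resize(count);
  return worklist;
}
```

The older approach would have stored each `update` value in a temporary array during the scan and then run a second loop to do the writes; folding the write into the final sweep removes both the array and the extra loop.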

Minor cleanup: rename "_conflictlist" to "_conflict_scheme" to make it clearer what it is. Use the enum values that the Handle declares instead of 0, 1, 2 (which correspond one-to-one with the underlying enum values anyway).

Bowman spot checks:
#######################################################
PASSED TESTS
#######################################################
intel-16.4.258-Pthread-release build_time=726 run_time=1034
intel-16.4.258-Pthread_Serial-release build_time=1041 run_time=2000
intel-16.4.258-Serial-release build_time=703 run_time=948
intel-17.2.174-OpenMP-release build_time=878 run_time=568
intel-17.2.174-OpenMP_Serial-release build_time=1212 run_time=1468
intel-17.2.174-Pthread-release build_time=801 run_time=902
intel-17.2.174-Pthread_Serial-release build_time=1117 run_time=1819
intel-17.2.174-Serial-release build_time=786 run_time=874

RIDE:
#######################################################
PASSED TESTS
#######################################################
cuda-9.2.88-Cuda_OpenMP-release build_time=498 run_time=418
cuda-9.2.88-Cuda_Serial-release build_time=561 run_time=525
gcc-6.4.0-OpenMP_Serial-release build_time=194 run_time=396
gcc-7.2.0-OpenMP-release build_time=123 run_time=128
gcc-7.2.0-OpenMP_Serial-release build_time=223 run_time=358
gcc-7.2.0-Serial-release build_time=111 run_time=227

instead of magic numbers 0,1,2
PPS worklist construction (VB and EB) now happens in a single scan,
without using a temporary array to store indices. Faster and saves
memory.