This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

warning: loop not unrolled in dispatch_radix_sort.cuh #246

Closed
andrewcorrigan opened this issue Dec 6, 2020 · 7 comments

Comments

@andrewcorrigan
Contributor

The latest version of CUB used by Thrust is triggering warnings in Clang:

/thrust/cub/device/dispatch/dispatch_radix_sort.cuh:521:1: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
DeviceRadixSortHistogramKernel
^
/thrust/cub/device/dispatch/dispatch_radix_sort.cuh:539:1: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
DeviceRadixSortOnesweepKernel
@andrewcorrigan
Contributor Author

I don't understand this code well enough to know whether this is innocuous and can be safely ignored, or whether it needs a proper fix. For now, I have suppressed the warning in my local copy. If you're interested in a PR like the one below, please let me know and I'll submit it. Thanks!

#include <thrust/system/cuda/detail/core/triple_chevron_launch.h>

#if defined(__clang__)
#  pragma clang diagnostic push
#  pragma clang diagnostic ignored "-Wpass-failed"
#endif

/// Optional outer namespace(s)
CUB_NS_PREFIX

// ... kernel definitions elided ...

CUB_NS_POSTFIX  // Optional outer namespace(s)


#if defined(__clang__)
#  pragma clang diagnostic pop
#endif

@alliepiper
Collaborator

The unrolls are purely for performance on old hardware/compilers; these warnings can be safely ignored.

This is related to NVIDIA/cccl#754, which points out that many of these explicit unroll directives actually hurt performance in some instances, and certainly aren't worth the extra compile-time cost. I'm hoping to address that in the 1.13 / 1.14 timeframe, which will likely resolve this issue as well.

@andrewcorrigan
Contributor Author

Thank you for the explanation. Since this can be safely ignored, I'm going to add the suppression code I posted above to my current PR, in case you're interested in merging that.

@canonizer
Contributor

@andrewcorrigan Do we know for which particular loop the warning is generated? There are no loops in the kernel function code itself; all loops are in the agent functions.

I'm asking because this warning is generated for the code I've written.

@andrewcorrigan
Contributor Author

Sorry. I'm just a naive Thrust user. All I know is what is in the warning messages posted at the top. I'm not at all familiar with the internals.

@canonizer
Contributor

What compiler are you using that produces these warnings?

@andrewcorrigan
Contributor Author

I'm using Clang 11. The warnings appear to be independent of CUDA toolkit version.
