This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Remove some problematic inline annotations. #259

Merged

alliepiper
Collaborator

These functions started producing invalid results in CUDA 11 under
certain circumstances (see issue NVIDIA/thrust#1371), and removing
these hints fixes the issue.

cc: @ArEsKay3
Fixes NVIDIA/thrust#1371

Collaborator

@brycelelbach brycelelbach left a comment

There's no reason for this to have been __forceinline__ in the first place (and generally __forceinline__ is just bad).

@brycelelbach
Collaborator

This may have had something to do with the use of a static variable internally.
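To illustrate the pattern under discussion, here is a minimal sketch of a function that caches a result in a function-local static and carries no forced-inline hint. It is not the code changed in this PR: every name (`EXAMPLE_HOST`, `example_query_count`, `example_expensive_query`, `example_cached_version`) and the value 800 are hypothetical, and the sketch assumes a host-only function so that it also builds with a plain C++ compiler.

```cpp
#include <atomic>

// Hypothetical portability macro: expands to __host__ under nvcc and to
// nothing under a plain C++ compiler, so the sketch builds either way.
#if defined(__CUDACC__)
#  define EXAMPLE_HOST __host__
#else
#  define EXAMPLE_HOST
#endif

// Counts how often the expensive query actually runs (illustration only).
inline std::atomic<int>& example_query_count() {
    static std::atomic<int> count{0};
    return count;
}

// Stand-in for an expensive runtime query; the returned value is made up.
inline int example_expensive_query() {
    ++example_query_count();
    return 800;
}

// No __forceinline__ here. The function-local static acts as a cache that is
// initialized on the first call, and the call is not performance critical,
// so the inlining decision is left to the compiler.
EXAMPLE_HOST inline int example_cached_version() {
    static const int cached = example_expensive_query();
    return cached;
}
```

Calling `example_cached_version()` repeatedly returns the same value while `example_expensive_query()` runs only once, which is the caching behavior the static variable is there to provide.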

These functions started producing invalid results in CUDA 11 under
certain circumstances (see issue NVIDIA/thrust#1371), and removing
these hints fixes the issue.

NVIDIA#260 reported that other functions in this file were also
causing the same issue.

These methods are not perf critical -- they don't need to be inlined.
@alliepiper alliepiper force-pushed the bug/inline_regression/gh.thrust1371 branch from b86303d to bc209e6 Compare February 8, 2021 22:11
@alliepiper alliepiper merged commit a307334 into NVIDIA:main Feb 9, 2021
@alliepiper alliepiper deleted the bug/inline_regression/gh.thrust1371 branch February 9, 2021 20:28
Successfully merging this pull request may close these issues.

(cudaErrorInvalidDevice) when trying to perform a thrust::reduce