Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sycl::is_compatible is currently not very useful #7561

Open
al42and opened this issue Nov 28, 2022 · 6 comments · Fixed by #9769
Open

sycl::is_compatible is currently not very useful #7561

al42and opened this issue Nov 28, 2022 · 6 comments · Fixed by #9769
Assignees
Labels
bug Something isn't working confirmed

Comments

@al42and
Copy link
Contributor

al42and commented Nov 28, 2022

Describe the bug

The spec is somewhat vague about the behavior of is_compatible :

A kernel that is defined in the application is compatible with a device unless:
• It uses optional features which are not supported on the device, as described in Section 5.7; or
• It is decorated with a [[sycl::device_has()]] C++ attribute that lists an aspect that is not supported by
the device, as described in Section 5.8.1.

The current implementation seems to fulfill these requirements, so it is not technically broken.

However, it does not handle many cases related to whether the device was targeted during compilation. One would expect that is_compatible returning true would mean that the kernel can be run on the device. This is not the case.

E.g., when targeting SPIR-V, the function falsely reports OpenCL AMD device as compatible (an exception is thrown when trying to launch the kernel) and throws an exception when called for HIP and CUDA devices (one would expect it to return false):

$ clang++ -fsycl -fsycl-targets=spir64 test.cpp -o test
$ ONEAPI_DEVICE_SELECTOR=level_zero:gpu ./test # Works OK!
Checking Intel(R) UHD Graphics 770 [0x4680]
Device is compatible
$ ONEAPI_DEVICE_SELECTOR=opencl:gpu ./test # AMD device reported as compatible, while it is not
Checking Intel(R) UHD Graphics 770 [0x4680]
Device is compatible
Checking gfx1034
Device is compatible
terminate called after throwing an instance of 'sycl::_V1::runtime_error'
  what():  Native API failed. Native API returns: -59 (PI_ERROR_INVALID_OPERATION) -59 (PI_ERROR_INVALID_OPERATION)
Aborted (core dumped)
$ ONEAPI_DEVICE_SELECTOR=hip:gpu ./test # Now is_compatible itself throws instead of returning false
Checking AMD Radeon RX 6400
terminate called after throwing an instance of 'sycl::_V1::runtime_error'
  what():  Native API failed. Native API returns: -42 (PI_ERROR_INVALID_BINARY) -42 (PI_ERROR_INVALID_BINARY)
Aborted (core dumped)

If we target NVPTX backed with unsupported --offload-arch, the kernel is anyway reported as compatible, while it cannot be launched:

$ clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda -Xsycl-target-backend --offload-arch=sm_90 test.cpp -o test
$ ONEAPI_DEVICE_SELECTOR=level_zero:gpu ./test # Throws again
Checking Intel(R) UHD Graphics 770 [0x4680]
terminate called after throwing an instance of 'sycl::_V1::runtime_error'
  what():  Native API failed. Native API returns: -42 (PI_ERROR_INVALID_BINARY) -42 (PI_ERROR_INVALID_BINARY)
Aborted (core dumped)
$ ONEAPI_DEVICE_SELECTOR=cuda:gpu ./test # I have sm_86 device, which is not compatible, yet the function reports that it is:
Checking NVIDIA GeForce RTX 3060
Device is compatible

PI CUDA ERROR:
        Value:           209
        Name:            CUDA_ERROR_NO_BINARY_FOR_GPU
        Description:     no kernel image is available for execution on the device
        Function:        build_program
        Source Location: /home/aland/intel-sycl/llvm/sycl/plugins/cuda/pi_cuda.cpp:733


PI CUDA ERROR:
        Value:           400
        Name:            CUDA_ERROR_INVALID_HANDLE
        Description:     invalid resource handle
        Function:        cuda_piProgramRelease
        Source Location: /home/aland/intel-sycl/llvm/sycl/plugins/cuda/pi_cuda.cpp:3609

terminate called after throwing an instance of 'sycl::_V1::compile_program_error'
  what():  The program was built for 1 devices
Build program log for 'NVIDIA GeForce RTX 3060':
 -999 (Unknown PI error)
Aborted (core dumped)

To Reproduce

#include <iostream>
#include <CL/sycl.hpp>

class Kernel;

int main()
{
    for (const auto& dev : sycl::device::get_devices())
    {
        std::cout << "Checking " << dev.get_info<sycl::info::device::name>() << std::endl;
        bool deviceOk = sycl::is_compatible<Kernel>(dev);
        std::cout << "Device is " << (deviceOk ? "compatible" : "incompatible") << std::endl;
        if (deviceOk)
        {
            sycl::queue q{ dev };
            q.submit([&](sycl::handler& cgh) {
                 cgh.parallel_for<Kernel>(sycl::range<1>{ 1 },
                                          [=](sycl::id<1> threadId) { int x = threadId[0]; });
             }).wait_and_throw();
        }
    }
    return 0;
}

Environment (please complete the following information):

  • OS: Ubuntu 20.04
  • Target device and vendor: Intel, NVIDIA, AMD GPU
  • DPC++ version: clang version 16.0.0 (https://github.com/intel/llvm 67f6bba)
  • Dependencies version: Intel compute runtime 22.39.24347, CUDA 11.8, ROCm 5.3.3
@al42and al42and added the bug Something isn't working label Nov 28, 2022
@KornevNikita KornevNikita self-assigned this Nov 29, 2022
@AlexeySachkov
Copy link
Contributor

Tagging @gmlueck here to hear his feedback

@AlexeySachkov
Copy link
Contributor

Related spec clarification: KhronosGroup/SYCL-Docs#381

@KornevNikita
Copy link
Contributor

KhronosGroup/SYCL-Docs#381 is in master, so it can be implemented now.

@KornevNikita KornevNikita linked a pull request Jun 7, 2023 that will close this issue
dm-vodopyanov pushed a commit that referenced this issue Jun 14, 2023
Modify `is_compatible` to check if specific target is defined with
`-fsycl-targets` and change the result. Previously there was a situation
when kernel is compatible with the device by aspects, but actually it
fails to run on this device as it was compiled for another target
device.

Related spec change: KhronosGroup/SYCL-Docs#381

Resolves #7561
@KornevNikita
Copy link
Contributor

#9769 fixes this issue only partially.
There is still a need to check if the device architecture matches the target architecture of the compiled kernel. This functionality is in progress, I'll add it here when it's available.
And I'm not sure if it fixes these crashes when is_compatible throws itself. @al42and can I ask you to try your reproducer one more time?

@KornevNikita KornevNikita reopened this Jun 14, 2023
@al42and
Copy link
Contributor Author

al42and commented Jun 14, 2023

Thank you! The fix behaves as you described.

And I'm not sure if it fixes these crashes when is_compatible throws itself.

The AMD devices are now hidden in the OpenCL backend, so cannot reproduce :)

Full logs (click to expand)

Using the same test code and IntelLLVM ef03323 (with #9769).

$ clang++ -fsycl -fsycl-targets=spir64 test_7561.cpp -o test
$ ONEAPI_DEVICE_SELECTOR=level_zero:gpu ./test # Works OK!
Checking Intel(R) Arc(TM) A770 Graphics
Device is compatible
Checking Intel(R) UHD Graphics 770
Device is compatible
$ ONEAPI_DEVICE_SELECTOR=hip:gpu ./test # Now correctly returns false!
Checking AMD Radeon RX 6400
Device is incompatible
$ ONEAPI_DEVICE_SELECTOR=cuda:gpu ./test # Now correctly returns false!
Checking NVIDIA GeForce RTX 3060
Device is incompatible

As mentioned, a mismatched architecture is not caught yet (neither for AMD nor for NVIDIA):

$ clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda -Xsycl-target-backend --offload-arch=sm_90 test_7561.cpp -o test
$ ONEAPI_DEVICE_SELECTOR=level_zero:gpu ./test # Now correctly returns false!
Checking Intel(R) Arc(TM) A770 Graphics
Device is incompatible
Checking Intel(R) UHD Graphics 770
Device is incompatible
$ ONEAPI_DEVICE_SELECTOR=cuda:gpu ./test # I have sm_86 device, which is not compatible, yet the function reports that it is:
Checking NVIDIA GeForce RTX 3060
Device is compatible

UR CUDA ERROR:
        Value:           209
        Name:            CUDA_ERROR_NO_BINARY_FOR_GPU
        Description:     no kernel image is available for execution on the device
        Function:        buildProgram
        Source Location: /home/aland/intel-sycl/llvm/sycl/plugins/unified_runtime/ur/adapters/cuda/program.cpp:145

terminate called after throwing an instance of 'sycl::_V1::compile_program_error'
  what():  The program was built for 1 devices
Build program log for 'NVIDIA GeForce RTX 3060':
 -999 (Unknown PI error)
Aborted (core dumped)
$ ONEAPI_DEVICE_SELECTOR=hip:gpu ./test # Now correctly returns false!
Checking AMD Radeon RX 6400
Device is incompatible
$ clang++ -fsycl -fsycl-targets=spir64,nvptx64-nvidia-cuda,amdgcn-amd-amdhsa -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031 test_7561.cpp -o test
$ ONEAPI_DEVICE_SELECTOR=hip:gpu ./test # Compiled for gfx1031, device is gfx1034
Checking AMD Radeon RX 6400
Device is compatible

PI HIP ERROR:
        Value:           209
        Name:            hipErrorNoBinaryForGpu
        Description:     no kernel image is available for execution on the device
        Function:        build_program
        Source Location: /home/aland/intel-sycl/llvm/sycl/plugins/hip/pi_hip.cpp:744


PI HIP ERROR:
        Value:           400
        Name:            hipErrorInvalidHandle
        Description:     invalid resource handle
        Function:        hip_piProgramRelease
        Source Location: /home/aland/intel-sycl/llvm/sycl/plugins/hip/pi_hip.cpp:3609

terminate called after throwing an instance of 'sycl::_V1::compile_program_error'
  what():  The program was built for 1 devices
Build program log for 'AMD Radeon RX 6400':
5� -999 (Unknown PI error)
Aborted (core dumped)
# ^^^^ Uninitialized memory in the build log?

Compiling for multiple architectures works fine too 👍

$ clang++ -fsycl -fsycl-targets=spir64,nvptx64-nvidia-cuda,amdgcn-amd-amdhsa -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1034  test_7561.cpp -o test
$ ./test
Checking Intel(R) Arc(TM) A770 Graphics
Device is compatible
Checking Intel(R) UHD Graphics 770
Device is compatible
Checking 12th Gen Intel(R) Core(TM) i9-12900K
Device is compatible
Checking Intel(R) FPGA Emulation Device
Device is compatible
Checking Intel(R) Arc(TM) A770 Graphics
Device is compatible
Checking Intel(R) UHD Graphics 770
Device is compatible
Checking NVIDIA GeForce RTX 3060
Device is compatible
Checking AMD Radeon RX 6400
Device is compatible

fineg74 pushed a commit to fineg74/llvm that referenced this issue Jun 15, 2023
Modify `is_compatible` to check if specific target is defined with
`-fsycl-targets` and change the result. Previously there was a situation
when kernel is compatible with the device by aspects, but actually it
fails to run on this device as it was compiled for another target
device.

Related spec change: KhronosGroup/SYCL-Docs#381

Resolves intel#7561
@KornevNikita
Copy link
Contributor

@al42and thanks for the quick feedback! I'll keep this issue opened until is_compatible() fully matches the requirements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working confirmed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants