Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify "is_compatible" #381

Merged

Conversation

gmlueck
Copy link
Contributor

@gmlueck gmlueck commented Mar 15, 2023

This PR addresses an issue that was raised during the review of KhronosGroup/SYCL-CTS#553. It is also related to the clarification made in #370.

When we clarified the semantics of any_device_has and all_devices_have, we decided that these tell information about the "compilation environment" used to compile the SYCL translation unit. This implies that kernels in a TU might not be compatible with some devices depending on the "compilation environment". For example, a vendor's compiler might support some compiler flag that allows kernels to run on GPU devices but not on other devices.

While writing the CTS, we realized that applications need some way to tell if a kernel was compiled in a way that restricts the devices on which it can run. It seems like the existing API is_compatible is a good fit, so this PR extends the semantics to say that this API also considers the "compilation environment" when determining compatibility.

This PR addresses an issue that was raised during the review of
KhronosGroup/SYCL-CTS#553.  It is also related to the clarification
made in KhronosGroup#370.

When we clarified the semantics of `any_device_has` and
`all_devices_have`, we decided that these tell information about the
"compilation environment" used to compile the SYCL translation unit.
This implies that kernels in a TU might not be compatible with some
devices depending on the "compilation environment".  For example, a
vendor's compiler might support some compiler flag that allows kernels
to run on GPU devices but not on other devices.

While writing the CTS, we realized that applications need some way to
tell if a kernel was compiled in a way that restricts the devices on
which it can run.  It seems like the existing API `is_compatible` is
a good fit, so this PR extends the semantics to say that this API also
considers the "compilation environment" when determining compatibility.
@keryell
Copy link
Member

keryell commented Mar 16, 2023

Need more reviews.

@@ -14999,7 +14999,12 @@ application is compatible with a device unless:
in <<sec:optional-kernel-features>>; or
* It is decorated with a [code]#[[sycl::device_has()]]# {cpp} attribute that
lists an aspect that is not supported by the device, as described in
<<sec:kernel.attributes>>.
<<sec:kernel.attributes>>; or
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_compatible doesn't really sound like "was this kernel compiled for this device?". With this change, users won't have an easy way to understand whether or not all optional features used in a kernel are supported by the device, because is_compatible could return false for other reasons.

Would it be better to instead expand definition of has_kernel_bundle to say that it may return false if compilation environment doesn't support the device?

Kernel definition/properties, i.e. list of aspects used by it should not change depending on compilation environment, because it is encoded in the source code. Content of kernel bundle and list of available device images, however, is not defined by the source code alone and can also be heavily dependent on compilation environment.

Tagging @rolandschulz

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the intent of is_compatible was that you can use it to determine whether a kernel can be submitted to a device. If we think that was the intent, then I believe this clarification makes sense.

I agree that an application could have some kernel K that uses features that are compatible with some device D, but due to the compilation environment K cannot be run on device D. This change to the spec makes it impossible for the application to test that K would be compatible with device D if the application was compiled with some other set of options. However, I'm not sure that such a query is useful. It seems like the thing the application cares about is whether K can be run on D, not some hypothetical question about whether K could be run on D if the application was compiled differently.

Note that it's less convenient to use has_kernel_bundle because the application need to create a kernel bundle in order to use that API. It's much more convenient to use is_compatible because you only need the kernel's type-name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I'm not sure that such a query is useful. It seems like the thing the application cares about is whether K can be run on D, not some hypothetical question about whether K could be run on D if the application was compiled differently.

Let's say you provide a library which is AOT-compiled for a certain list of devices. You may want to improve user experience of the library users by emitting a better error saying that they need to re-compile the library with different flags to make it work on that device; instead of having some unspecified exception at runtime.
However, I agree that it is kind of an artificial case, because as a library author you could somehow configure your compilation environment to ensure that there are some fallback paths for non-target devices; or simply have clear documentation about device requirements.

Note that it's less convenient to use has_kernel_bundle because the application need to create a kernel bundle in order to use that API.

You don't need a kernel bundle to query if a kernel bundle exists. has_kernel_bundle accepts context, vector of kernel_id and optionally list of devices.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both queries are useful. It is only hypothetical for run decision, but not for error reporting. An application might want to report "This kernel could run on the device you have available. But I was compiled without proper support. Please recompile me for best performance.".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree that since there are a few different reasons for why is_compatible may return false, it would be useful to have a way to expose the reason, I'm not sure how useful it would be as a mechanism for dispatching different kernels, but it could at least be a useful debug feature, to be able to identify exactly why the kernel couldn't be run on a certain device. Saying that, if we were to do this, I think that could be a different PR that doesn't need to hold up this one.

@@ -14999,7 +14999,12 @@ application is compatible with a device unless:
in <<sec:optional-kernel-features>>; or
* It is decorated with a [code]#[[sycl::device_has()]]# {cpp} attribute that
lists an aspect that is not supported by the device, as described in
<<sec:kernel.attributes>>.
<<sec:kernel.attributes>>; or
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree that since there are a few different reasons for why is_compatible may return false, it would be useful to have a way to expose the reason, I'm not sure how useful it would be as a mechanism for dispatching different kernels, but it could at least be a useful debug feature, to be able to identify exactly why the kernel couldn't be run on a certain device. Saying that, if we were to do this, I think that could be a different PR that doesn't need to hold up this one.

adoc/chapters/programming_interface.adoc Show resolved Hide resolved
@fraggamuffin
Copy link

delay until remaining open discussion closes

@tomdeakin
Copy link
Contributor

WG: ready to merge

@keryell keryell merged commit e6b4a5a into KhronosGroup:SYCL-2020/master May 4, 2023
@gmlueck gmlueck deleted the gmlueck/is-compatible branch May 15, 2023 20:07
KornevNikita added a commit to KornevNikita/llvm that referenced this pull request Jun 7, 2023
Modify is_compatible to check if specific target defined with
-fsycl-targets and change the result. Previously there was a situation
when kernel is compatible with the device by aspects, but actually it
fails to run on this device as it was compiled for another target
device.

KhronosGroup/SYCL-Docs#381
dm-vodopyanov pushed a commit to intel/llvm that referenced this pull request Jun 14, 2023
Modify `is_compatible` to check if specific target is defined with
`-fsycl-targets` and change the result. Previously there was a situation
when kernel is compatible with the device by aspects, but actually it
fails to run on this device as it was compiled for another target
device.

Related spec change: KhronosGroup/SYCL-Docs#381

Resolves #7561
fineg74 pushed a commit to fineg74/llvm that referenced this pull request Jun 15, 2023
Modify `is_compatible` to check if specific target is defined with
`-fsycl-targets` and change the result. Previously there was a situation
when kernel is compatible with the device by aspects, but actually it
fails to run on this device as it was compiled for another target
device.

Related spec change: KhronosGroup/SYCL-Docs#381

Resolves intel#7561
keryell added a commit that referenced this pull request Sep 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants