Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revise design for optional features and AOT #12252

Open
wants to merge 4 commits into
base: sycl
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
45 changes: 23 additions & 22 deletions sycl/doc/design/DeviceConfigFile.md
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ for this target. This information is optional because we plan to implement an
auto-detection mechanism that is able to infer the `aot-toolchain` from the
target name for well known targets.
- [Optional] `aot-toolchain-%option_name` information to be passed to the
`aot-toolchain` command. This information is optional. For some targets, the
`aot-toolchain` command. This information is optional. For some targets, the
auto-detection mechanism might be able to infer values for this. One example of this
information would be `ocloc-device %device_id`.

Expand All @@ -41,12 +41,13 @@ different tools and compiler modules:
- Compiler driver requires `aot-toolchain` and `ocloc-device` to trigger the
compilation for the required targets.
[https://github.com/intel/llvm/pull/6775/files]
- `sycl-aspect-filter`:
https://github.com/intel/llvm/blob/sycl/sycl/doc/design/OptionalDeviceFeatures.md#aspect-filter-tool
- `sycl-post-link` (see the description in the
[OptionalDeviceFeatures](./OptionalDeviceFeatures.md#new-filtering-pass-in-the-post-link-tool)
design doc).

Finally, overhead should be minimal. Particularly, users should not pay for what
they do not use. This motivates our decision to embed the default Device
Configuration File rather than releasing it as a separate file.
Configuration File rather than releasing it as a separate file.

## High-Level Design
The default Device Configuration File is a `.td` file located in the compiler
Expand All @@ -68,10 +69,10 @@ to implement new TableGen backends. Also, the [Search
Indexes](https://llvm.org/docs/TableGen/BackEnds.html#search-indexes) backend
already does something very similar to what we seek. It generates a table that
provides a lookup function, but it cannot be extended with new entries. We can
use _Search Indexes_ backend as inspiration for ours.
use _Search Indexes_ backend as inspiration for ours.

Our backend should generate a map where the key is the target name and the value
is an object of a custom class/struct including all the information required.
is an object of a custom class/struct including all the information required.

Firstly, we need to provide a file describing the `DynamicTable` class. An
example for this is `SearchableTable.td`, which describes `GenericEnum`, and
Expand Down Expand Up @@ -135,8 +136,8 @@ class DynamicTable {
This file should be included --either directly or indirectly-- in any other
`.td` file that uses `DynamicTable` class.

The default device configuration `.td` file should look like the one below:
```
The default device configuration `.td` file should look like the one below:
```
include "llvm/TableGen/DynamicTable.td"

// Aspect and all the aspects definitions could be outlined
Expand Down Expand Up @@ -182,8 +183,8 @@ def AspectExt_intel_device_id : Aspect<"ext_intel_device_id">;
def AspectExt_intel_memory_clock_rate : Aspect<"ext_intel_memory_clock_rate">;
def AspectExt_intel_memory_bus_width : Aspect<"ext_intel_memory_bus_width">;
def AspectEmulated : Aspect<"emulated">;
def TargetTable : DynamicTable {

def TargetTable : DynamicTable {
let FilterClass = "TargetInfo";
let Fields = ["TargetName", "aspects", "maySupportOtherAspects",
"subGroupSizes", "aotToolchain", "aotToolchainOptions"];
Expand All @@ -201,19 +202,19 @@ class TargetInfo <string tgtName, list<Aspect> aspectList, bit otherAspects,
string aotToolchainOptions = options;
}

def : TargetInfo<"TargetA", [AspectCpu, AspectAtomic64],
0, [8, 16], "ocloc", "-device tgtA">;
def : TargetInfo<"TargetA", [AspectCpu, AspectAtomic64],
0, [8, 16], "ocloc", "-device tgtA">;
def : TargetInfo<"TargetB", [AspectGpu, AspectFp16],
0, [8, 16], "ocloc", "-device tgtB">;
def : TargetInfo<"TargetC", [AspectEmulated, AspectImage],
0, [8, 32], "ocloc", "-device tgtC, -option2 val">;
```
Note: backends tested don't allow lists within `TargetInfo` class. This is a
Note: backends tested don't allow lists within `TargetInfo` class. This is a
backend limitation, rather than a TableGen limitation. Thus, we should be able
to lift this limitation in our own backend, as shown in the initial prototype
implemented to drive the design.

The generated `.inc` file should look like the example below:
The generated `.inc` file should look like the example below:
```c++
std::map<std::string, TargetInfo> TargetTable = {
{"TargetA",
Expand All @@ -228,7 +229,7 @@ We also need a header file that includes the `.inc` file generated by the
TableGen backend. Other backends don't generate the definition of `struct
TargetInfo`, and this seems a good idea to me: it simplifies the backend
implementation, and it is easier for developers to check the data structure
to understand how to work with it. The idea is simply to define the struct
to understand how to work with it. The idea is simply to define the struct
in this header file. This header file should look like the code below:
```c++
namespace DeviceConfigFile {
Expand All @@ -247,7 +248,7 @@ using TargetTable_t = std::map<std::string, TargetInfo>;

Other modules can query the map to get the information like in the example
below:
```c++
```c++
DeviceConfigFile::TargetInfo info = DeviceConfigFile::targets.find("TargetA");
if (info == DeviceConfigFile::targets.end()) {
/* Target not found */
Expand All @@ -262,7 +263,7 @@ if (info == DeviceConfigFile::targets.end()) {

## Tools and Modules Interacting with Device Config File
This is a list of the tools and compiler modules that require using the file:
- The *compiler driver* needs the file to determine the set of legal values for
- The *compiler driver* needs the file to determine the set of legal values for
`-fsycl-targets`.
- The *compiler driver* needs the file to define macros for `any_device_has/all_devices_have`.
- *Clang* needs the file to emit diagnostics related to `-fsycl-fixed-targets.`
Expand Down Expand Up @@ -299,8 +300,8 @@ map with the new information about targets. LLVM provides
[YAML/IO](https://llvm.org/docs/YamlIO.html) library to easily parse `.yaml`
files. The driver should propagate this option to all the tools that require
the Device Configuration File (e.g. `sycl-post-link`) so that each of the
tools can modify the map according to the user extensions described in the
`.yaml` file.
tools can modify the map according to the user extensions described in the
`.yaml` file.

As mentioned in [Requirements](#requirements), there is an auto-detection
mechanism for `aot-toolchain` and `aot-toolchain-options` that is able to
Expand All @@ -317,13 +318,13 @@ the other hand, it simply processes each new entry and updates the map with the
latest information found.

The auto-detection mechanism is a best effort to relieve users from specifying
`aot-toolchain` and `aot-toolchain-options` from well known devices. However,
`aot-toolchain` and `aot-toolchain-options` from well known devices. However,
it has its own limitations and potential issues:
- Rules for target names: As of now, auto-detection is only available for Intel GPU
targets. All targets starting with `intel_gpu_` will automatically set
`aot-toolchain=ocloc` and `aot-toolchain-options=-device suffix` being suffix the part
left after `intel_gpu_` prefix.
- User specifies `aot-toolchain` and `aot-toolchain-options` for a target name
- User specifies `aot-toolchain` and `aot-toolchain-options` for a target name
that can be auto-detected: user-specified information has precedence over auto-detected
information.

Expand All @@ -341,7 +342,7 @@ via `device::has` for each device listed in the `.td` file. Both lists should ma
This test could copy the mechanism of the test for `any_device_has` that goes over each
item in `aspects.def` and tries to instantiate `any_device_has` with that enumerator.

Neither of the tests provides guarantees that nothing went out-of-sync *per se*, we
Neither of the tests provides guarantees that nothing went out-of-sync *per se*, we
would require running the second test in all the targets described in the `.td` file
for such guarantees, but at least provides the mechanism to detect potential desyncs.

Loading