Skip to content

Commit

Permalink
[DOC] Initial docs on kernel extensions.
Browse files Browse the repository at this point in the history
- A very initial version of the documentation of implementing/building/using a custom kernel extension, as part of the user docs.
- All files displayed in these docs are also in scripts/examples/extensions/myKernels/ for easy use.
  • Loading branch information
pdamme committed May 3, 2024
1 parent 3b78fca commit 84fa96f
Show file tree
Hide file tree
Showing 9 changed files with 386 additions and 0 deletions.
223 changes: 223 additions & 0 deletions doc/Extensions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,223 @@
<!--
Copyright 2021 The DAPHNE Consortium
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Custom Extensions to DAPHNE

DAPHNE will be extensible in various respects.
Users will be able to add their own kernels, data types, value types, compiler passes, runtime schedulers, etc. without changing the DAPHNE source code itself.

So far, DAPHNE has initial support for adding custom kernels.

## Custom Kernel Extensions

Users can add their own custom kernels (physical operators) to DAPHNE following a three-step approach:

1. The extension is implemented as a stand-alone code base.
2. The extension is compiled as a shared library.
3. The extension is used in a DaphneDSL script or via DAPHNE's Python API.

Since this feature is still in an early stage, we only mention the most important points here rather than providing a full reference of what's supported.

Furthermore, we include a running example of adding two custom kernels for the summation of a dense matrix of single-precision floating-point values.
We are interested in two variants: one sequential implementation for the CPU and one implementation that uses SIMD instructions from Intel's AVX (256-bit vector registers) on the CPU.
All files shown below can be found in `/scripts/examples/extensions/myKernels/`.

### Step 1: Implementing a Kernel Extension

A kernel extension consists at least of the following:

- A *C++ source file*, which includes some essential DAPHNE headers and defines one or multiple kernel functions.
The kernel functions have to follows a certain interface and have `extern "C"` linkage.
A detailed documentation of the expected interfaces will be added later.
Within the kernel functions, extension developers have a lot of freedom.
Nevertheless, we also plan to provide some best practices and helpers to make extension development more productive.
- A *kernel catalog JSON file*, which provides some essential information on the kernels provided in the extension, such that DAPHNE knows how to use them.
This information includes: the mnemonic of the DaphneIR operation, the name of the kernel function, the list of result/argument types, the backend (e.g. CPU or a specific hardware accelerator), and the path to the shared library of the extension (relative to this JSON file).

To build the extension, it is recommendable to include a Makefile or similar as well.

*Running example:*

C++ source file `myKernels.cpp`:
```c++
#include <runtime/local/datastructures/DenseMatrix.h>

#include <immintrin.h> // for the SIMD-enabled kernel
#include <iostream>
#include <stdexcept>

class DaphneContext;

extern "C" {
// Custom sequential sum-kernel.
void mySumSeq(
float * res,
const DenseMatrix<float> * arg,
DaphneContext * ctx
) {
std::cerr << "hello from mySumSeq()" << std::endl;
const float * valuesArg = arg->getValues();
*res = 0;
for(size_t r = 0; r < arg->getNumRows(); r++) {
for(size_t c = 0; c < arg->getNumCols(); c++)
*res += valuesArg[c];
valuesArg += arg->getRowSkip();
}
}

// Custom SIMD-enabled sum-kernel.
void mySumSIMD(
float * res,
const DenseMatrix<float> * arg,
DaphneContext * ctx
) {
std::cerr << "hello from mySumSIMD()" << std::endl;

// Validation.
const size_t numCells = arg->getNumRows() * arg->getNumCols();
if(numCells % 8)
throw std::runtime_error(
"for simplicity, the number of cells must be "
"a multiple of 8"
);
if(arg->getNumCols() != arg->getRowSkip())
throw std::runtime_error(
"for simplicity, the argument must not be "
"a column segment of another matrix"
);
// SIMD accumulation (8x f32).
const float * valuesArg = arg->getValues();
__m256 acc = _mm256_setzero_ps();
for(size_t i = 0; i < numCells / 8; i++) {
acc = _mm256_add_ps(acc, _mm256_loadu_ps(valuesArg));
valuesArg += 8;
}

// Summation of accumulator elements.
*res =
(reinterpret_cast<float*>(&acc))[0] +
(reinterpret_cast<float*>(&acc))[1] +
(reinterpret_cast<float*>(&acc))[2] +
(reinterpret_cast<float*>(&acc))[3] +
(reinterpret_cast<float*>(&acc))[4] +
(reinterpret_cast<float*>(&acc))[5] +
(reinterpret_cast<float*>(&acc))[6] +
(reinterpret_cast<float*>(&acc))[7];
}
}
```

Kernel catalog file `myKernels.json`:
```json
[
{
"opMnemonic": "sumAll",
"kernelFuncName": "mySumSeq",
"resTypes": ["float"],
"argTypes": ["DenseMatrix<float>"],
"backend": "CPP",
"libPath": "libMyKernels.so"
},
{
"opMnemonic": "sumAll",
"kernelFuncName": "mySumSIMD",
"resTypes": ["float"],
"argTypes": ["DenseMatrix<float>"],
"backend": "CPP",
"libPath": "libMyKernels.so"
}
]
```

`Makefile`:
```make
libMyKernels.so: myKernels.o
g++ -shared myKernels.o -o libMyKernels.so

myKernels.o: myKernels.cpp
g++ -c -fPIC myKernels.cpp -I../../../../src/ -std=c++17 -O3 -mavx2 -o myKernels.o

clean:
rm -rf myKernels.o libMyKernels.so
```

### Step 2: Building a Kernel Extension

The kernel extension must be built as a shared library.
Additional details will follow.

*Running example:*

Given the Makefile above, the extension is built by simply running `make` in the extension's directory, which produces the shared library `libMyKernels.so`:

```bash
make
```

### Step 3: Using a Kernel Extension

The kernels in a kernel extension can be used either automatically by DAPHNE or manually by the user.
Automatic use is currently restricted to the selection of the kernel based on result/argument data/value types, but in the future we plan to support custom cost models as well.
Besides that, the manual employment of custom kernels is very useful for experimentation, e.g., to see the impact of the kernel in the context of a larger integrated data analysis pipeline.
To this end, DaphneDSL [compiler hints](/doc/DaphneDSL/LanguageRef.md#compiler-hints) tell DAPHNE to use a specific kernel, even though DAPHNE's optimizing compiler may not choose the kernel, otherwise.

*Running example:*

A minimal example using a summation on a matrix of single-precision floating-point values could look as follows:

`demo.daphne`:
```R
# Create a matrix of random f32 values in [0, 1] (400 MiB).
X = rand(10^4, 10^4, as.f32(0), as.f32(1), 1, 12345);
# Calculate the sum over the matrix.
s = sum(X);
# Print the sum.
print(s);
```

We execute this script from the DAPHNE root directory by:
```bash
bin/daphne scripts/examples/extensions/myKernels/demo.daphne
```

In order to manually use our custom sequential `sum`-kernel, we add the DAPHNE compiler hint `::mySumSeq` to the script:

`demoSeq.daphne`:
```R
X = rand(10^4, 10^4, as.f32(0), as.f32(1), 1, 12345);
s = sum::mySumSeq(X);
print(s);
```

We execute this script with the following command, whereby the argument `--kernel-ext` specified the kernel catalog JSON file of the extension to use:
```bash
bin/daphne --kernel-ext scripts/examples/extensions/myKernels/myKernels.json scripts/examples/extensions/myKernels/demoSeq.daphne
```

Alternatively, we can try our custom SIMD-enabled `sum`-kernel by adapting the compiler hint accordingly:

`demoSIMD.daphne`:
```R
X = rand(10^4, 10^4, as.f32(0), as.f32(1), 1, 12345);
s = sum::mySumSIMD(X);
print(s);
```

We execute this script by:
```bash
bin/daphne --kernel-ext scripts/examples/extensions/myKernels/myKernels.json scripts/examples/extensions/myKernels/demoSIMD.daphne
```
1 change: 1 addition & 0 deletions doc/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
- [A Few Early Example Algorithms in DaphneDSL](/scripts/algorithms/)
- [FileMetaData Format (reading and writing data)](/doc/FileMetaDataFormat.md)
- [Profiling DAPHNE using PAPI](/doc/Profiling.md)
- [Custom Extensions to DAPHNE](/doc/Extensions.md)

## Developer Documentation

Expand Down
2 changes: 2 additions & 0 deletions scripts/examples/extensions/myKernels/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
myKernels.o
libMyKernels.so
22 changes: 22 additions & 0 deletions scripts/examples/extensions/myKernels/Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
# Copyright 2024 The DAPHNE Consortium
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

libMyKernels.so: myKernels.o
g++ -shared myKernels.o -o libMyKernels.so

myKernels.o: myKernels.cpp
g++ -c -fPIC myKernels.cpp -I../../../../src/ -std=c++17 -O3 -mavx2 -o myKernels.o

clean:
rm -rf myKernels.o libMyKernels.so
20 changes: 20 additions & 0 deletions scripts/examples/extensions/myKernels/demo.daphne
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
# Copyright 2024 The DAPHNE Consortium
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Create a matrix of random f32 values in [0, 1] (400 MiB).
X = rand(10^4, 10^4, as.f32(0), as.f32(1), 1, 12345);
# Calculate the sum over the matrix.
s = sum(X);
# Print the sum.
print(s);
17 changes: 17 additions & 0 deletions scripts/examples/extensions/myKernels/demoSIMD.daphne
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
# Copyright 2024 The DAPHNE Consortium
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

X = rand(10^4, 10^4, as.f32(0), as.f32(1), 1, 12345);
s = sum::mySumSIMD(X);
print(s);
17 changes: 17 additions & 0 deletions scripts/examples/extensions/myKernels/demoSeq.daphne
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
# Copyright 2024 The DAPHNE Consortium
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

X = rand(10^4, 10^4, as.f32(0), as.f32(1), 1, 12345);
s = sum::mySumSeq(X);
print(s);
66 changes: 66 additions & 0 deletions scripts/examples/extensions/myKernels/myKernels.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
#include <runtime/local/datastructures/DenseMatrix.h>

#include <immintrin.h> // for the SIMD-enabled kernel
#include <iostream>
#include <stdexcept>

class DaphneContext;

extern "C" {
// Custom sequential sum-kernel.
void mySumSeq(
float * res,
const DenseMatrix<float> * arg,
DaphneContext * ctx
) {
std::cerr << "hello from mySumSeq()" << std::endl;
const float * valuesArg = arg->getValues();
*res = 0;
for(size_t r = 0; r < arg->getNumRows(); r++) {
for(size_t c = 0; c < arg->getNumCols(); c++)
*res += valuesArg[c];
valuesArg += arg->getRowSkip();
}
}

// Custom SIMD-enabled sum-kernel.
void mySumSIMD(
float * res,
const DenseMatrix<float> * arg,
DaphneContext * ctx
) {
std::cerr << "hello from mySumSIMD()" << std::endl;

// Validation.
const size_t numCells = arg->getNumRows() * arg->getNumCols();
if(numCells % 8)
throw std::runtime_error(
"for simplicity, the number of cells must be "
"a multiple of 8"
);
if(arg->getNumCols() != arg->getRowSkip())
throw std::runtime_error(
"for simplicity, the argument must not be "
"a column segment of another matrix"
);

// SIMD accumulation (8x f32).
const float * valuesArg = arg->getValues();
__m256 acc = _mm256_setzero_ps();
for(size_t i = 0; i < numCells / 8; i++) {
acc = _mm256_add_ps(acc, _mm256_loadu_ps(valuesArg));
valuesArg += 8;
}

// Summation of accumulator elements.
*res =
(reinterpret_cast<float*>(&acc))[0] +
(reinterpret_cast<float*>(&acc))[1] +
(reinterpret_cast<float*>(&acc))[2] +
(reinterpret_cast<float*>(&acc))[3] +
(reinterpret_cast<float*>(&acc))[4] +
(reinterpret_cast<float*>(&acc))[5] +
(reinterpret_cast<float*>(&acc))[6] +
(reinterpret_cast<float*>(&acc))[7];
}
}
Loading

0 comments on commit 84fa96f

Please sign in to comment.