Replacing PyHIP with new official python wrapper of ROCm HIP #285

Open · wants to merge 7 commits into master
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -3,6 +3,7 @@ All notable changes to this project will be documented in this file.
This project adheres to [Semantic Versioning](http://semver.org/).

## Unreleased
- Changed HIP Python bindings from pyhip-interface to the official hip-python package

## [1.0.0] - 2024-04-04
- HIP backend to support tuning HIP kernels on AMD GPUs
27 changes: 11 additions & 16 deletions INSTALL.rst
@@ -124,31 +124,26 @@ Or you could install Kernel Tuner and PyOpenCL together if you haven't done so already

If this fails, please see the PyOpenCL installation guide (https://wiki.tiker.net/PyOpenCL/Installation)

HIP and PyHIP
HIP and HIP Python
------------------

Before we can install PyHIP, you'll need to have the HIP runtime and compiler installed on your system.
Before we can install HIP Python, you'll need to have the HIP runtime and compiler installed on your system.
The HIP compiler is included as part of the ROCm software stack. Here is AMD's installation guide:

* `ROCm Documentation: HIP Installation Guide <https://docs.amd.com/bundle/HIP-Installation-Guide-v5.3/page/Introduction_to_HIP_Installation_Guide.html>`__

After you've installed HIP, you will need to install PyHIP. Run the following command in your terminal to install:
After you've installed HIP, you will need to install HIP Python. Run the following command in your terminal to install:

.. code-block:: bash

pip install pyhip-interface
First identify the first three digits of the version number of your ROCm™ installation.
Then install the HIP Python package(s) as follows:

Alternatively, you can install PyHIP from the source code. First, clone the repository from GitHub:

.. code-block:: bash
.. code-block:: shell

git clone https://github.com/jatinx/PyHIP

Then, navigate to the repository directory and run the following command to install:

.. code-block:: bash
python3 -m pip install -i https://test.pypi.org/simple hip-python~=$rocm_version
# if you want to install the CUDA Python interoperability package too, run:
python3 -m pip install -i https://test.pypi.org/simple hip-python-as-cuda~=$rocm_version

python setup.py install
For other installation options, check `hip-python on GitHub <https://github.com/ROCm/hip-python>`_.
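
After installation, you can do a quick sanity check from Python. The following is a minimal sketch; it assumes the hip-python convention that runtime calls return a tuple whose first element is an error code:

.. code-block:: python

    from hip import hip

    # hip-python wraps HIP runtime calls so that out-parameters are returned,
    # preceded by the error code
    err, count = hip.hipGetDeviceCount()
    assert err == hip.hipError_t.hipSuccess, err
    print(f"found {count} HIP device(s)")

    err, version = hip.hipRuntimeGetVersion()
    assert err == hip.hipError_t.hipSuccess, err
    print(f"HIP runtime version: {version}")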

Installing the git version
--------------------------
@@ -171,7 +166,7 @@ The runtime dependencies are:

- `cuda`: install pycuda along with kernel_tuner
- `opencl`: install pyopencl along with kernel_tuner
- `hip`: install pyhip along with kernel_tuner
- `hip`: install HIP Python along with kernel_tuner
- `tutorial`: install packages required to run the guides

These can be installed by appending e.g. ``-E cuda -E opencl -E hip``.
2 changes: 1 addition & 1 deletion README.md
@@ -32,7 +32,7 @@ What Kernel Tuner does:

## Installation

- First, make sure you have your [CUDA](https://kerneltuner.github.io/kernel_tuner/stable/install.html#cuda-and-pycuda), [OpenCL](https://kerneltuner.github.io/kernel_tuner/stable/install.html#opencl-and-pyopencl), or [HIP](https://kerneltuner.github.io/kernel_tuner/stable/install.html#hip-and-pyhipl) compiler installed
- First, make sure you have your [CUDA](https://kerneltuner.github.io/kernel_tuner/stable/install.html#cuda-and-pycuda), [OpenCL](https://kerneltuner.github.io/kernel_tuner/stable/install.html#opencl-and-pyopencl), or [HIP](https://kerneltuner.github.io/kernel_tuner/stable/install.html#hip-and-hip-python) compiler installed
- Then type: `pip install kernel_tuner[cuda]`, `pip install kernel_tuner[opencl]`, or `pip install kernel_tuner[hip]`
- or why not all of them: `pip install kernel_tuner[cuda,opencl,hip]`

2 changes: 1 addition & 1 deletion doc/source/backends.rst
@@ -58,7 +58,7 @@ used to compile the kernels.
:header: Feature, PyCUDA, CuPy, CUDA-Python, HIP
:widths: auto

Python package, "pycuda", "cupy", "cuda-python", "pyhip-interface"
Python package, "pycuda", "cupy", "cuda-python", "hip-python"
Selected with lang=, "CUDA", "CUPY", "NVCUDA", "HIP"
Compiler used, "nvcc", "nvrtc", "nvrtc", "hiprtc"

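To show how the last two rows of this table come together: the HIP backend (and with it the hip-python package) is selected through the ``lang`` argument of ``tune_kernel``. A minimal sketch, modeled on the vector_add example elsewhere in this changeset:

.. code-block:: python

    import numpy as np
    from kernel_tuner import tune_kernel

    kernel_string = """
    __global__ void vector_add(float *c, float *a, float *b, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            c[i] = a[i] + b[i];
        }
    }
    """

    size = 1000000
    a = np.random.randn(size).astype(np.float32)
    b = np.random.randn(size).astype(np.float32)
    c = np.zeros_like(a)
    args = [c, a, b, np.int32(size)]

    tune_params = {"block_size_x": [128 + 64 * i for i in range(15)]}

    # lang="HIP" selects the HIP backend; the kernel is compiled with hiprtc
    results, env = tune_kernel("vector_add", kernel_string, size, args,
                               tune_params, lang="HIP")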
2 changes: 1 addition & 1 deletion doc/source/design.rst
@@ -49,7 +49,7 @@ building blocks for implementing runners.
The observers are explained in :ref:`observers`.

At the bottom, the backends are shown.
PyCUDA, CuPy, cuda-python, PyOpenCL and PyHIP are for tuning either CUDA, OpenCL, or HIP kernels.
PyCUDA, CuPy, cuda-python, PyOpenCL and HIP Python are for tuning either CUDA, OpenCL, or HIP kernels.
The CompilerFunctions implementation can call any compiler; typically NVCC
or GCC is used. There is limited support for tuning Fortran kernels.
This backend was created not just to be able to tune C
6 changes: 3 additions & 3 deletions examples/hip/test_vector_add.py
@@ -5,11 +5,11 @@
from kernel_tuner import run_kernel
import pytest

#Check pyhip is installed and if a HIP capable device is present, if not skip the test
# Check if HIP Python is installed and a HIP-capable device is present; if not, skip the test
try:
from pyhip import hip, hiprtc
from hip import hip, hiprtc
except ImportError:
pytest.skip("PyHIP not installed or PYTHONPATH does not includes PyHIP")
pytest.skip("HIP Python not installed or PYTHONPATH does not includes HIP Python")
hip = None
hiprtc = None

6 changes: 3 additions & 3 deletions examples/hip/vector_add.py
@@ -30,8 +30,8 @@ def tune():
tune_params = OrderedDict()
tune_params["block_size_x"] = [128+64*i for i in range(15)]

results, env = tune_kernel("vector_add", kernel_string, size, args, tune_params, lang="HIP",
cache="vector_add_cache.json", log=logging.DEBUG)
results, env = tune_kernel("vector_add", kernel_string, size, args, tune_params, lang="HIP",
log=logging.DEBUG)

# Store the metadata of this run
store_metadata_file("vector_add-metadata.json")
@@ -40,4 +40,4 @@


if __name__ == "__main__":
tune()
tune()
38 changes: 34 additions & 4 deletions kernel_tuner/backends/compiler.py
@@ -26,6 +26,17 @@
except ImportError:
cp = None

try:
from hip import hip
except ImportError:
hip = None

try:
from hip._util.types import DeviceArray
except ImportError:
Pointer = Exception # using Exception here as a type that will never be among kernel arguments
DeviceArray = Exception


def is_cupy_array(array):
"""Check if something is a cupy array.
@@ -145,9 +156,9 @@ def ready_argument_list(self, arguments):
ctype_args = [None for _ in arguments]

for i, arg in enumerate(arguments):
if not (isinstance(arg, (np.ndarray, np.number)) or is_cupy_array(arg)):
raise TypeError(f"Argument is not numpy or cupy ndarray or numpy scalar but a {type(arg)}")
dtype_str = str(arg.dtype)
if not (isinstance(arg, (np.ndarray, np.number, DeviceArray)) or is_cupy_array(arg)):
raise TypeError(f"Argument is not numpy or cupy ndarray or numpy scalar or HIP Python DeviceArray but a {type(arg)}")
dtype_str = arg.typestr if isinstance(arg, DeviceArray) else str(arg.dtype)
if isinstance(arg, np.ndarray):
if dtype_str in dtype_map.keys():
# In numpy <= 1.15, ndarray.ctypes.data_as does not itself keep a reference
@@ -156,13 +167,20 @@
# (This changed in numpy > 1.15.)
# data_ctypes = data.ctypes.data_as(C.POINTER(dtype_map[dtype_str]))
data_ctypes = arg.ctypes.data_as(C.POINTER(dtype_map[dtype_str]))
numpy_arg = arg
else:
raise TypeError("unknown dtype for ndarray")
elif isinstance(arg, np.generic):
data_ctypes = dtype_map[dtype_str](arg)
numpy_arg = arg
elif is_cupy_array(arg):
data_ctypes = C.c_void_p(arg.data.ptr)
ctype_args[i] = Argument(numpy=arg, ctypes=data_ctypes)
numpy_arg = arg
elif isinstance(arg, DeviceArray):
data_ctypes = arg.as_c_void_p()
numpy_arg = None

ctype_args[i] = Argument(numpy=numpy_arg, ctypes=data_ctypes)
return ctype_args

def compile(self, kernel_instance):
@@ -380,6 +398,12 @@ def memcpy_dtoh(self, dest, src):
:param src: An Argument for some memory allocation
:type src: Argument
"""
# If src.numpy is None, it means we're dealing with a HIP Python DeviceArray
if src.numpy is None:
# Skip memory copies for HIP Python DeviceArray
# This is because DeviceArray manages its own memory and doesn't need
# explicit copies like numpy arrays do
return
if isinstance(dest, np.ndarray) and is_cupy_array(src.numpy):
# Implicit conversion to a NumPy array is not allowed.
value = src.numpy.get()
@@ -397,6 +421,12 @@ def memcpy_htod(self, dest, src):
:param src: A numpy or cupy array containing the source data
:type src: np.ndarray or cupy.ndarray
"""
# If dest.numpy is None, it means we're dealing with a HIP Python DeviceArray
if dest.numpy is None:
# Skip memory copies for HIP Python DeviceArray
# This is because DeviceArray manages its own memory and doesn't need
# explicit copies like numpy arrays do
return
if isinstance(dest.numpy, np.ndarray) and is_cupy_array(src):
# Implicit conversion to a NumPy array is not allowed.
value = src.get()
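
The changes to `ready_argument_list` and the two memcpy helpers share one idea: a HIP Python `DeviceArray` already resides on the device, so only its raw pointer is stored (with `numpy=None`) and host/device copies are skipped for it. Below is a minimal, self-contained sketch of that dispatch; `FakeDeviceArray` is a hypothetical stand-in for `hip._util.types.DeviceArray`, assumed to expose `typestr` and `as_c_void_p()`, and `Argument` here is a simplified namedtuple rather than the class used by the backend.

import ctypes as C
from collections import namedtuple

import numpy as np

Argument = namedtuple("Argument", ["numpy", "ctypes"])


class FakeDeviceArray:
    """Hypothetical stand-in for the two DeviceArray members used above."""

    def __init__(self, typestr, ptr):
        self.typestr = typestr  # e.g. "<f4", like numpy's array-interface typestr
        self._ptr = ptr

    def as_c_void_p(self):
        return C.c_void_p(self._ptr)


def prepare_argument(arg):
    """Mirror the dispatch in ready_argument_list for two of its cases."""
    if isinstance(arg, np.ndarray):
        # host data: keep the numpy array so memcpy_dtoh/htod can copy it later
        return Argument(numpy=arg, ctypes=arg.ctypes.data_as(C.c_void_p))
    if isinstance(arg, FakeDeviceArray):
        # device data: only the pointer is needed; numpy=None signals "skip copies"
        return Argument(numpy=None, ctypes=arg.as_c_void_p())
    raise TypeError(f"unsupported argument type {type(arg)}")


host_arg = np.zeros(4, dtype=np.float32)
device_arg = FakeDeviceArray("<f4", ptr=0)  # null pointer, for illustration only

print(prepare_argument(host_arg).numpy is host_arg)    # True
print(prepare_argument(device_arg).numpy is None)      # True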