HIP backend general issue #806
Here is a list of the current issues observed while building with the HIP backend:
Now that the ETI and tests are merged (or are about to be), we can make a list of what still needs to be done to get the backend fully functional.
HIP spot-check enabled tests
HIP tests currently failing
Issues in batchedDLA
Issues in Graph (offset==int and offset==size_t fail in the same way)
Issues in Sparse (offset==int and offset==size_t fail in the same way)
|
@lucbv I'll add amd/caraway options for the testing scripts this week |
Thanks, I have shared my current configuration on the internal repo (see the Technical tips section on the homepage). |
@lucbv I have a branch now that passes unit tests for CUDA, Serial, and OpenMP, and will (hopefully) also work on HIP when the unit tests are built for it. The only things still hardcoded for CUDA are those involving cusparse, cublas, graphs and streams. There are a couple places where |
@brian-kelley thanks for looking at this, I am still waiting on |
Using the latest ROCm LLVM compiler, the new list of failing tests is much shorter:
Graph
[ RUN ] hip.graph_graph_color_deterministic_double_int_int_TestExecSpace
Sparse
Some failures are related to complex atomics; updates in Kokkos Core should resolve these issues. |
More things are working now: with ROCm 4.5 and MI100 (on Caraway), all tests pass except for structured SpMV ( |
At this point we are testing HIP in our CI, and everything is building correctly :) |
This issue is meant to centralize issues and work being done to integrate the HIP backend in Kokkos-Kernels.
Ideally, I would like separate issues to be opened for specific technical problems and then referenced here, so that users and developers know what the known issues are and who is working on them.