[CPU][RV64] Implemented JIT Kernel for Eltwise ops (openvinotoolkit#2…
…8727)

### Details:
- *Added [xbyak_riscv](https://github.com/herumi/xbyak_riscv) library
support*
- *Slightly refactored the Eltwise source files to support different
architectures: common logic is united in methods of class `Eltwise`*
 - *Implemented `jit_generator` for RISC-V 64-bit*
- *Implemented `jit_uni_eltwise_generic` kernel for RISC-V 64-bit
devices with RVV 1.0 support. Now, if the device has RVV 1.0, JIT-supported
Eltwise ops are executed via the JIT kernel*
- *Implemented several JIT emitters for the following ops: Add, Sub,
Div, Mul, Clamp, Relu, PRelu, Exp, Sigmoid, PowerStatic*
- *Implemented `cpu_isa_traits` for RISC-V. Currently, the following are defined: G
(IMAFD, the default ISAs), C (compressed) and V (ratified RVV 1.0 only)*
 - *Updated relevant docs*
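
The ISA sets and the "use JIT only when RVV 1.0 is present" rule described above can be illustrated with a minimal sketch. All names here (`isa_bit`, `isa_g`, `supports`, `select_eltwise_path`) are hypothetical and chosen for illustration; they are not the identifiers used by the `cpu_isa_traits` added in this commit, and the sketch only assumes that G stands for IMAFD as stated in the description.

```cpp
#include <cstdint>
#include <string>

// Hypothetical ISA bit flags. Names and values are illustrative only and
// are NOT taken from the commit's cpu_isa_traits implementation.
enum isa_bit : std::uint32_t {
    isa_i = 1u << 0,  // base integer instructions
    isa_m = 1u << 1,  // integer multiplication and division
    isa_a = 1u << 2,  // atomic instructions
    isa_f = 1u << 3,  // single-precision floating point
    isa_d = 1u << 4,  // double-precision floating point
    isa_c = 1u << 5,  // compressed instructions
    isa_v = 1u << 6,  // ratified RVV 1.0
};

// G is treated as shorthand for the IMAFD set, as in the description above.
constexpr std::uint32_t isa_g = isa_i | isa_m | isa_a | isa_f | isa_d;

// Returns true when every bit in `required` is present in `available`.
constexpr bool supports(std::uint32_t available, std::uint32_t required) {
    return (available & required) == required;
}

// Chooses an execution path for an Eltwise op: the JIT kernel is picked only
// when the op has a JIT emitter and the device reports both G and V.
inline std::string select_eltwise_path(std::uint32_t available, bool op_has_jit_emitter) {
    if (op_has_jit_emitter && supports(available, isa_g | isa_v)) {
        return "jit";
    }
    return "reference";
}
```

Under these assumptions, a device reporting only G and C falls back to the non-JIT path, and so does an op without a JIT emitter on a device that has RVV 1.0.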

### Tickets:
 - *161878*

### TODO:
- [x] *Ran tests locally using all 3 build methods with cross-compilation:
`Eltwise` and `Activation` tests pass*
- [ ] *CI Validation?*
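
For readers unfamiliar with the ops covered by these tests, the following is a rough scalar sketch of the kind of reference semantics a JIT kernel's output could be checked against. It is not the test code from this commit: the function names are hypothetical, the `PowerStatic` formula `(scale * x + shift)^power` is an assumption about that op's definition, and the tolerance values are arbitrary.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical scalar references for some of the ops listed in the commit
// description. These are illustrative, not the commit's implementation.
inline float ref_relu(float x) {
    return std::max(x, 0.0f);
}

inline float ref_prelu(float x, float slope) {
    return x >= 0.0f ? x : slope * x;
}

inline float ref_clamp(float x, float lo, float hi) {
    assert(lo <= hi);  // the range must be well-formed
    return std::min(std::max(x, lo), hi);
}

inline float ref_sigmoid(float x) {
    return 1.0f / (1.0f + std::exp(-x));
}

// Assumption: PowerStatic computes (scale * x + shift) raised to `power`.
inline float ref_power_static(float x, float power, float scale, float shift) {
    return std::pow(scale * x + shift, power);
}

// Mixed absolute/relative comparison; the default tolerances are arbitrary.
inline bool almost_equal(float actual, float expected,
                         float abs_tol = 1e-5f, float rel_tol = 1e-4f) {
    return std::fabs(actual - expected) <= abs_tol + rel_tol * std::fabs(expected);
}
```

A test harness could run a vectorized kernel over a buffer and compare each element with the matching `ref_*` function through `almost_equal`, since approximations used for ops such as Exp or Sigmoid are generally not expected to be bit-exact.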
a-sidorova authored Mar 6, 2025
1 parent aa213c3 commit 40bf06e
Showing 33 changed files with 3,163 additions and 333 deletions.
3 changes: 3 additions & 0 deletions .gitmodules
@@ -90,3 +90,6 @@
[submodule "src/plugins/intel_cpu/thirdparty/kleidiai"]
path = src/plugins/intel_cpu/thirdparty/kleidiai
url = https://github.com/ARM-software/kleidiai.git
[submodule "src/plugins/intel_cpu/thirdparty/xbyak_riscv"]
path = src/plugins/intel_cpu/thirdparty/xbyak_riscv
url = https://github.com/herumi/xbyak_riscv.git
10 changes: 8 additions & 2 deletions docs/dev/build_riscv64.md
@@ -18,10 +18,16 @@ The software was validated on the following devices:
## How to build
Currently, there are three ways to build OpenVINO Runtime for 64-bit RISC-V platforms:

1. **Recommended**. The build with vectorized (using RVV instructions) primitives for limited scope of operations from [`SHL`](https://github.com/XUANTIE-RV/csi-nn2) using [`xuantie-gnu-toolchain`](https://github.com/XUANTIE-RV/). This GNU Compiler Toolchain supports RVV 0.7.1, ratified RVV 1.0 and Xuantie-specific instruction sets. The vector intrinsics don't use the common prefix `__riscv_`. This method provides the best performance available at the moment.
2. The build without optimized primitives using [`riscv-gnu-toolchain`](https://github.com/riscv-collab/riscv-gnu-toolchain.git). This GNU Compiler Toolchain supports RVV 0.7.1 and ratified RVV 1.0. The vector intrinsics use the common prefix `__riscv_`. However, as mentioned earlier, this build method doesn't yet provide optimized primitives implemented using the RVV intrinsics.
1. **Recommended**. The build with vectorized (using RVV intrinsics) primitives for a limited scope of operations from [`SHL`](https://github.com/XUANTIE-RV/csi-nn2) using [`xuantie-gnu-toolchain`](https://github.com/XUANTIE-RV/).
This GNU Compiler Toolchain supports RVV 0.7.1, ratified RVV 1.0 and Xuantie-specific instruction sets.
The vector intrinsics don't use the common prefix `__riscv_`.
This method provides the best performance available at the moment.
2. The build without optimized primitives implemented with RVV intrinsics using [`riscv-gnu-toolchain`](https://github.com/riscv-collab/riscv-gnu-toolchain.git). This GNU Compiler Toolchain supports RVV 0.7.1 and ratified RVV 1.0. The vector intrinsics use the common prefix `__riscv_`. However, as mentioned earlier, this build method doesn't yet provide optimized primitives implemented using the RVV intrinsics.
3. The build without optimized primitives using installed Linux packages. The compilers in these packages don't support RVV intrinsics.

> **NOTE**: Currently, the CPU Plugin in OpenVINO supports [Just-In-Time (JIT) code generation](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/src/emitters/README.md) for a limited scope of operations on devices with RVV 1.0.
All three ways to build OpenVINO Runtime for 64-bit RISC-V described above support JIT code generation.

### Steps

0. Prerequisite:
23 changes: 21 additions & 2 deletions src/plugins/intel_cpu/CMakeLists.txt
@@ -232,6 +232,11 @@ if(NOT AARCH64)
${CMAKE_CURRENT_SOURCE_DIR}/src/emitters/snippets/aarch64/*)
endif()

if (NOT RISCV64)
list(APPEND EXCLUDE_PATHS ${CMAKE_CURRENT_SOURCE_DIR}/src/emitters/plugin/riscv64/*
${CMAKE_CURRENT_SOURCE_DIR}/src/nodes/kernels/riscv64/*)
endif()

if (NOT ENABLE_MLAS_FOR_CPU)
list(APPEND EXCLUDE_PATHS ${CMAKE_CURRENT_SOURCE_DIR}/src/nodes/executors/mlas/*)
list(APPEND EXCLUDE_PATHS ${CMAKE_CURRENT_SOURCE_DIR}/src/mlas/*)
@@ -286,7 +291,12 @@ endif()
if(ENABLE_KLEIDIAI_FOR_CPU)
target_link_libraries(${TARGET_NAME} PRIVATE kleidiai)
endif()

if(RISCV64)
# Set `XBYAK_RISCV_V=1` to compile Xbyak-code for RVV-related instructions
target_compile_definitions(xbyak_riscv INTERFACE XBYAK_RISCV_V=1)
target_link_libraries(${TARGET_NAME} PRIVATE xbyak_riscv)
target_include_directories(${TARGET_NAME} SYSTEM INTERFACE $<TARGET_PROPERTY:xbyak_riscv::xbyak_riscv,INTERFACE_INCLUDE_DIRECTORIES>)
endif()
target_include_directories(${TARGET_NAME} SYSTEM PRIVATE $<TARGET_PROPERTY:dnnl,INCLUDE_DIRECTORIES>)

# Temporary solution to use template reference implementations in cases where optimized implementation
@@ -362,7 +372,13 @@ set_target_properties(${TARGET_NAME} PROPERTIES INTERPROCEDURAL_OPTIMIZATION_REL

if(BUILD_SHARED_LIBS)
add_library(${TARGET_NAME}_obj OBJECT ${SOURCES} ${HEADERS})
ov_link_system_libraries(${TARGET_NAME}_obj PUBLIC dnnl openvino::pugixml)

set(CPU_OBJ_LINK_SYSTEM LIBRARIES dnnl openvino::pugixml)
if(RISCV64)
list(APPEND CPU_OBJ_LINK_SYSTEM xbyak_riscv::xbyak_riscv)
endif()

ov_link_system_libraries(${TARGET_NAME}_obj PUBLIC ${CPU_OBJ_LINK_SYSTEM})

ov_add_version_defines(src/plugin.cpp ${TARGET_NAME}_obj)

@@ -393,6 +409,9 @@ if(BUILD_SHARED_LIBS)
if(ENABLE_KLEIDIAI_FOR_CPU)
target_include_directories(${TARGET_NAME}_obj SYSTEM PUBLIC $<TARGET_PROPERTY:kleidiai,INTERFACE_INCLUDE_DIRECTORIES>)
endif()
if(RISCV64)
target_include_directories(${TARGET_NAME}_obj SYSTEM PUBLIC $<TARGET_PROPERTY:xbyak_riscv::xbyak_riscv,INTERFACE_INCLUDE_DIRECTORIES>)
endif()

ov_set_threading_interface_for(${TARGET_NAME}_obj)

6 changes: 5 additions & 1 deletion src/plugins/intel_cpu/src/emitters/README.md
@@ -11,19 +11,22 @@ Just-in-time (JIT) emitters is a type of emitter designed for just-in-time code
For JIT source code generation `Xbyak JIT Assembler` is used:
* [Xbyak for X64](https://github.com/herumi/xbyak)
* [Xbyak for ARM64](https://github.com/fujitsu/xbyak_aarch64)
* [Xbyak for RISCV64](https://github.com/herumi/xbyak_riscv)

Emitters are split into two groups based on their usage model: common plugin emitters for complex kernels ([plugin emitters](https://github.com/openvinotoolkit/openvino/tree/master/src/plugins/intel_cpu/src/emitters/plugin)) and basic blocks for tensor compiler ([snippets emitters](https://github.com/openvinotoolkit/openvino/tree/master/src/plugins/intel_cpu/src/emitters/snippets)).

## Development

Each emitter is linked with an OpenVINO operation. For example, for `plugin emitters`:
* Element-wise `plugin emitters` are linked in the element-wise JIT kernel method `jit_uni_eltwise_generic::create_eltwise_emitter`, whose implementation depends on the platform:
* [X64 specific](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/src/nodes/eltwise.cpp)
* [X64 specific](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/src/nodes/kernels/x64/jit_uni_eltwise_generic.cpp)
* [ARM64 SIMD specific](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/src/nodes/kernels/aarch64/jit_uni_eltwise_generic.cpp)
* [RISCV64 RVV1.0 specific](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/src/nodes/kernels/riscv64/jit_uni_eltwise_generic.cpp)

JIT emitters are inherited from `jit_emitter` base class. The base class implementation depends on architecture:
* X64: [jit_emitter.hpp](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/src/emitters/plugin/x64/jit_emitter.hpp)
* ARM64: [jit_emitter.hpp](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/src/emitters/plugin/aarch64/jit_emitter.hpp)
* RISCV64: [jit_emitter.hpp](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/src/emitters/plugin/riscv64/jit_emitter.hpp)

### Class diagram
JIT emitters should be inherited from the `jit_emitter` base class, and their usage should be added in the platform-dependent JIT kernel.
@@ -107,6 +110,7 @@ There are two types of tests instantiations which are used to test JIT emitters:
* [platform independent element-wise operation tests](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/tests/functional/custom/single_layer_tests/instances/common/eltwise.cpp)
* [X64 specific element-wise operation tests](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/tests/functional/custom/single_layer_tests/instances/x64/eltwise.cpp)
* [ARM64 element-wise operation tests](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/tests/functional/custom/single_layer_tests/instances/arm/eltwise.cpp)
* [RISCV64 element-wise operation tests](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/tests/functional/custom/single_layer_tests/instances/riscv64/eltwise.cpp)
* element-wise operations which are used as activations:
* [platform independent activation operation tests](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/tests/functional/custom/single_layer_tests/instances/common/activation.cpp)
* [X64 activation operation tests](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/tests/functional/custom/single_layer_tests/instances/x64/activation.cpp)
Original file line number Diff line number Diff line change
@@ -17,23 +17,6 @@ using namespace dnnl::impl::utils;
using namespace dnnl::impl::cpu;
using namespace Xbyak_aarch64;

namespace {
ov::element::Type get_arithmetic_binary_exec_precision(const std::shared_ptr<ov::Node>& n) {
std::vector<ov::element::Type> input_precisions;
for (const auto& input : n->inputs()) {
input_precisions.push_back(input.get_source_output().get_element_type());
}

assert(std::all_of(input_precisions.begin(),
input_precisions.end(),
[&input_precisions](const ov::element::Type& precision) {
return precision == input_precisions[0];
}));

return input_precisions[0];
}
} // namespace

/// ABS ///
jit_abs_emitter::jit_abs_emitter(dnnl::impl::cpu::aarch64::jit_generator* host,
dnnl::impl::cpu::aarch64::cpu_isa_t host_isa,