[CPU][RV64] Implemented JIT Kernel for Eltwise ops #28727

Status: Open. Wants to merge 40 commits into base: master.
215040e
[CPU][RV64] Added xbyak riscv64 submodule
a-sidorova Jan 29, 2025
e3792b4
[CPU][RV64] Implemented jit_generator
a-sidorova Jan 29, 2025
45e0f60
[CPU][RV64] Implemented jit_uni_eltwise_generic, jit_emitter and jit_…
a-sidorova Jan 30, 2025
edc330b
[CPU][RV64] Added jit_rvv impl type
a-sidorova Feb 5, 2025
1961e33
[CPU][RV64] Supported Mul, Sub, Div
a-sidorova Feb 5, 2025
4e52a80
[CPU][RV64] Implemented li for size_t
a-sidorova Feb 5, 2025
dc45732
[CPU][RV64] Fixed isSupportedOp
a-sidorova Feb 5, 2025
cb093ca
[CPU][RV64] Updated context in jit_emitter
a-sidorova Feb 6, 2025
f6b5011
[CPU][RV64] Implemented get_max_lmul
a-sidorova Feb 6, 2025
b4475d7
[CPU][RV64] Supported aux gpr and vec for emitters
a-sidorova Feb 6, 2025
e3bcc12
[CPU][RV64] Updated broadcasing
a-sidorova Feb 6, 2025
7da50f9
[CPU][RV64] Implemented ReLU
a-sidorova Feb 6, 2025
dd2e61e
[CPU][RV64] Implemented LeakyRelu
a-sidorova Feb 9, 2025
9f0165e
[CPU][RV64] Implemented PRelu
a-sidorova Feb 10, 2025
a1244d6
[CPU][RV64] Implemented Clamp
a-sidorova Feb 10, 2025
15d5674
[CPU][RV64] Implement Exp
a-sidorova Feb 10, 2025
75b466b
[CPU][RV64] Implemented aux fpr registers
a-sidorova Feb 10, 2025
4eec942
[CPU][RV64] Implemented table
a-sidorova Feb 10, 2025
7d37c07
[CPU][RV64] Implemented Sigmoid
a-sidorova Feb 11, 2025
36c0baf
[CPU][RV64] Fixed Relu emitter
a-sidorova Feb 11, 2025
3b67c69
[CPU][RV64] Added isa template
a-sidorova Feb 11, 2025
5b3d31c
[CPU][RV64] Supported INT32 Add, Sub
a-sidorova Feb 11, 2025
f339480
[CPU][RV64] Updated name of some emitters
a-sidorova Feb 11, 2025
9866617
[CPU][RV64] Supported fusion
a-sidorova Feb 12, 2025
c0fc2e3
[CPU][RV64] Implemented check for RVV1.0 support
a-sidorova Feb 12, 2025
0a666e0
[CPU][RV64] Updated doc
a-sidorova Feb 12, 2025
3774e2c
[CPU][RV64] Updated CMakeLists.txt and GetPrimitiveType in tests
a-sidorova Feb 12, 2025
a24f696
[CPU][RV64] Fixed unused warning in cpu_isa_traits
a-sidorova Feb 12, 2025
4ac5ddb
[CPU][RV64] Fixed build
a-sidorova Feb 12, 2025
f2a82e6
[CPU][RV64] Minor self-review fixes
a-sidorova Feb 13, 2025
9901e08
[CPU][RV64] Implemented data section
a-sidorova Feb 13, 2025
a67a2aa
[CPU][RV64] Implemented limited PowerStatic
a-sidorova Feb 13, 2025
e70a2e1
[CPU][RV64] Some fixes in emitters
a-sidorova Feb 13, 2025
2cee4a5
[CPU][RV64] Updated namespaces
a-sidorova Feb 13, 2025
4da7006
[CPU][RV64] Fixed after-rebasing problems
a-sidorova Feb 13, 2025
59faeca
[CPU][RV64] Fixed binary call postamble/preamble
a-sidorova Feb 15, 2025
3d20f53
[CPU][RV64] Implement full support of PowerStatic
a-sidorova Feb 15, 2025
60e0714
[CPU][RV64] Fixed code-style
a-sidorova Feb 17, 2025
0cd24c8
[CPU][RV64] Updated xbyak + added align to emit_data
a-sidorova Feb 19, 2025
d73a5dc
[CPU][RV64] Removed jumps through data section
a-sidorova Feb 19, 2025
3 changes: 3 additions & 0 deletions .gitmodules
@@ -90,3 +90,6 @@
[submodule "src/plugins/intel_cpu/thirdparty/kleidiai"]
path = src/plugins/intel_cpu/thirdparty/kleidiai
url = https://git.gitlab.arm.com/kleidi/kleidiai.git
[submodule "src/plugins/intel_cpu/thirdparty/xbyak_riscv"]
path = src/plugins/intel_cpu/thirdparty/xbyak_riscv
url = https://github.com/herumi/xbyak_riscv.git
11 changes: 9 additions & 2 deletions docs/dev/build_riscv64.md
@@ -18,10 +18,17 @@ The software was validated on the following devices:
## How to build
Currently, there are three ways to build OpenVINO Runtime for 64-bit RISC-V platforms:

1. **Recommended**. The build with vectorized (using RVV instructions) primitives for limited scope of operations from [`SHL`](https://github.com/XUANTIE-RV/csi-nn2) using [`xuantie-gnu-toolchain`](https://github.com/XUANTIE-RV/). This GNU Compiler Toolchain supports RVV 0.7.1, ratified RVV 1.0 and Xuantie-specific instruction sets. The vector intrinsics don't use the common prefix `__riscv_`. This method provides the best performance available at the moment.
2. The build without optimized primitives using [`riscv-gnu-toolchain`](https://github.com/riscv-collab/riscv-gnu-toolchain.git). This GNU Compiler Toolchain supports RVV 0.7.1 and ratified RVV 1.0. The vector intrinsics use the common prefix `__riscv_`. However, as mentioned earlier, this build method doesn't yet provide optimized primitives implemented using the RVV intrinsics.
1. **Recommended**. The build with vectorized primitives (using RVV intrinsics) for a limited scope of operations from [`SHL`](https://github.com/XUANTIE-RV/csi-nn2) using [`xuantie-gnu-toolchain`](https://github.com/XUANTIE-RV/).
This GNU Compiler Toolchain supports RVV 0.7.1, ratified RVV 1.0 and Xuantie-specific instruction sets.
The vector intrinsics don't use the common prefix `__riscv_`.
This method provides the best performance available at the moment.
2. The build without optimized primitives implemented with RVV intrinsics using [`riscv-gnu-toolchain`](https://github.com/riscv-collab/riscv-gnu-toolchain.git). This GNU Compiler Toolchain supports RVV 0.7.1 and ratified RVV 1.0. The vector intrinsics use the common prefix `__riscv_`. However, as mentioned earlier, this build method doesn't yet provide optimized primitives implemented using the RVV intrinsics.
3. The build without optimized primitives using installed Linux packages. The compilers in these packages don't support RVV intrinsics.

> **NOTE**: Currently, the CPU Plugin in OpenVINO supports [Just-In-Time (JIT) code generation](../../src/plugins/intel_cpu/src/emitters/README.md) for a limited scope of Eltwise operations on devices with RVV 1.0.
All three build methods described above support JIT code generation.
This means that if the device on which inference is executed supports RVV 1.0 (GCV), Just-In-Time compiled optimized kernels will be used for some Eltwise operations for better performance.
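
The runtime condition in the note can be illustrated with a minimal sketch. It assumes a Linux target, where `getauxval(AT_HWCAP)` reports one bit per single-letter ISA extension. The helper names `hwcap_has_letter` and `device_reports_rvv` are hypothetical and are not taken from the plugin; the check actually added by this PR is not shown in the visible hunks and may work differently. Note also that the `V` bit alone does not distinguish ratified RVV 1.0 from earlier drafts.

```cpp
#if defined(__linux__)
#include <sys/auxv.h>  // getauxval, AT_HWCAP
#endif

// On Linux/RISC-V, AT_HWCAP carries one bit per single-letter ISA extension:
// bit 0 is 'A', bit 2 is 'C', bit 21 is 'V'.
// Hypothetical helper: only uppercase letters are accepted.
constexpr bool hwcap_has_letter(unsigned long hwcap, char letter) {
    return letter >= 'A' && letter <= 'Z' && ((hwcap >> (letter - 'A')) & 1UL) != 0;
}

// Compile-time sanity check of the bit layout described above.
static_assert(hwcap_has_letter(1UL << 21, 'V'), "'V' must map to bit 21");

// Hypothetical helper: reports whether the kernel advertises the vector
// extension. Always returns false when not built for Linux on RISC-V.
bool device_reports_rvv() {
#if defined(__riscv) && defined(__linux__)
    return hwcap_has_letter(getauxval(AT_HWCAP), 'V');
#else
    return false;
#endif
}
```

A caller could fall back to reference kernels whenever `device_reports_rvv()` returns `false`, which is the behaviour the note implies for devices without RVV 1.0.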

### Steps

0. Prerequisite:
23 changes: 21 additions & 2 deletions src/plugins/intel_cpu/CMakeLists.txt
@@ -228,6 +228,11 @@ if(NOT AARCH64)
${CMAKE_CURRENT_SOURCE_DIR}/src/emitters/snippets/aarch64/*)
endif()

if (NOT RISCV64)
list(APPEND EXCLUDE_PATHS ${CMAKE_CURRENT_SOURCE_DIR}/src/emitters/plugin/riscv64/*
${CMAKE_CURRENT_SOURCE_DIR}/src/nodes/kernels/riscv64/*)
endif()

if (NOT ENABLE_MLAS_FOR_CPU)
list(APPEND EXCLUDE_PATHS ${CMAKE_CURRENT_SOURCE_DIR}/src/nodes/executors/mlas/*)
list(APPEND EXCLUDE_PATHS ${CMAKE_CURRENT_SOURCE_DIR}/src/mlas/*)
@@ -282,7 +287,12 @@ endif()
if(ENABLE_KLEIDIAI_FOR_CPU)
target_link_libraries(${TARGET_NAME} PRIVATE kleidiai)
endif()

if(RISCV64)
# Set `XBYAK_RISCV_V=1` to compile Xbyak-code for RVV-related instructions
target_compile_definitions(xbyak_riscv INTERFACE XBYAK_RISCV_V=1)
target_link_libraries(${TARGET_NAME} PRIVATE xbyak_riscv)
target_include_directories(${TARGET_NAME} SYSTEM INTERFACE $<TARGET_PROPERTY:xbyak_riscv::xbyak_riscv,INTERFACE_INCLUDE_DIRECTORIES>)
endif()
target_include_directories(${TARGET_NAME} SYSTEM PRIVATE $<TARGET_PROPERTY:dnnl,INCLUDE_DIRECTORIES>)

# Temporary solution to use template reference implementations in cases where optimized implementation
@@ -358,7 +368,13 @@ set_target_properties(${TARGET_NAME} PROPERTIES INTERPROCEDURAL_OPTIMIZATION_REL

if(BUILD_SHARED_LIBS)
add_library(${TARGET_NAME}_obj OBJECT ${SOURCES} ${HEADERS})
ov_link_system_libraries(${TARGET_NAME}_obj PUBLIC dnnl openvino::pugixml)

set(CPU_OBJ_LINK_SYSTEM dnnl openvino::pugixml)
if(RISCV64)
list(APPEND CPU_OBJ_LINK_SYSTEM xbyak_riscv::xbyak_riscv)
endif()

ov_link_system_libraries(${TARGET_NAME}_obj PUBLIC ${CPU_OBJ_LINK_SYSTEM})

ov_add_version_defines(src/plugin.cpp ${TARGET_NAME}_obj)

@@ -389,6 +405,9 @@ if(BUILD_SHARED_LIBS)
if(ENABLE_KLEIDIAI_FOR_CPU)
target_include_directories(${TARGET_NAME}_obj SYSTEM PUBLIC $<TARGET_PROPERTY:kleidiai,INTERFACE_INCLUDE_DIRECTORIES>)
endif()
if(RISCV64)
target_include_directories(${TARGET_NAME}_obj SYSTEM PUBLIC $<TARGET_PROPERTY:xbyak_riscv::xbyak_riscv,INTERFACE_INCLUDE_DIRECTORIES>)
endif()

ov_set_threading_interface_for(${TARGET_NAME}_obj)

6 changes: 5 additions & 1 deletion src/plugins/intel_cpu/src/emitters/README.md
@@ -11,19 +11,22 @@ Just-in-time (JIT) emitters is a type of emitter designed for just-in-time code
For JIT source code generation `Xbyak JIT Assembler` is used:
* [Xbyak for X64](https://github.com/herumi/xbyak)
* [Xbyak for ARM64](https://github.com/fujitsu/xbyak_aarch64)
* [Xbyak for RISCV64](https://github.com/herumi/xbyak_riscv)

Emitters are split into two groups based on their usage model: common plugin emitters for complex kernels ([plugin emitters](https://github.com/openvinotoolkit/openvino/tree/master/src/plugins/intel_cpu/src/emitters/plugin)) and basic blocks for tensor compiler ([snippets emitters](https://github.com/openvinotoolkit/openvino/tree/master/src/plugins/intel_cpu/src/emitters/snippets)).

## Development

Each emitter is linked with an OpenVINO operation. For example, for `plugin emitters`:
* Element-wise `plugin emitters` are linked in the element-wise JIT kernel method `jit_uni_eltwise_generic::create_eltwise_emitter`, whose implementation depends on the platform:
* [X64 specific](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/src/nodes/eltwise.cpp)
* [X64 specific](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/src/nodes/kernels/x64/jit_uni_eltwise_generic.cpp)
* [ARM64 SIMD specific](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/src/nodes/kernels/aarch64/jit_uni_eltwise_generic.cpp)
* [RISCV64 RVV1.0 specific](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/src/nodes/kernels/riscv64/jit_uni_eltwise_generic.cpp)

JIT emitters are inherited from `jit_emitter` base class. The base class implementation depends on architecture:
* X64: [jit_emitter.hpp](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/src/emitters/plugin/x64/jit_emitter.hpp)
* ARM64: [jit_emitter.hpp](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/src/emitters/plugin/aarch64/jit_emitter.hpp)
* RISCV64: [jit_emitter.hpp](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/src/emitters/plugin/riscv64/jit_emitter.hpp)
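
The relationship between a base emitter class and a `create_eltwise_emitter`-style factory can be sketched as follows. This is a simplified illustration, not the plugin's code: `jit_emitter_sketch`, `Algorithm`, `add_emitter`, and `relu_emitter` are hypothetical names, and the emitters append pseudo-assembly strings instead of calling an Xbyak code generator. The register `ft0` is assumed to hold `0.0`.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the set of supported Eltwise algorithms.
enum class Algorithm { Add, Relu };

// Hypothetical stand-in for a jit_emitter-like base class.
class jit_emitter_sketch {
public:
    virtual ~jit_emitter_sketch() = default;
    virtual std::size_t get_inputs_num() const = 0;

    // Validates the operand count, then appends pseudo-assembly to `code`.
    void emit_code(const std::vector<std::string>& in,
                   const std::string& out,
                   std::vector<std::string>& code) const {
        if (in.size() != get_inputs_num())
            throw std::invalid_argument("unexpected number of inputs");
        emit_impl(in, out, code);
    }

private:
    virtual void emit_impl(const std::vector<std::string>& in,
                           const std::string& out,
                           std::vector<std::string>& code) const = 0;
};

class add_emitter : public jit_emitter_sketch {
public:
    std::size_t get_inputs_num() const override { return 2; }

private:
    void emit_impl(const std::vector<std::string>& in, const std::string& out,
                   std::vector<std::string>& code) const override {
        code.push_back("vfadd.vv " + out + ", " + in[0] + ", " + in[1]);
    }
};

class relu_emitter : public jit_emitter_sketch {
public:
    std::size_t get_inputs_num() const override { return 1; }

private:
    void emit_impl(const std::vector<std::string>& in, const std::string& out,
                   std::vector<std::string>& code) const override {
        // max(x, 0.0); ft0 is assumed to have been loaded with 0.0 beforehand.
        code.push_back("vfmax.vf " + out + ", " + in[0] + ", ft0");
    }
};

// The kernel-side factory maps an algorithm to its emitter.
std::unique_ptr<jit_emitter_sketch> create_eltwise_emitter(Algorithm alg) {
    switch (alg) {
    case Algorithm::Add:
        return std::make_unique<add_emitter>();
    case Algorithm::Relu:
        return std::make_unique<relu_emitter>();
    }
    throw std::invalid_argument("unsupported algorithm");
}
```

Calling the `Add` emitter with inputs `v0`, `v1` and output `v2`, then the `Relu` emitter on `v2` in place, appends two lines to the same buffer, which loosely mirrors how a fused Eltwise chain would be generated inside one kernel.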

### Class diagram
JIT emitters should be inherited from the `jit_emitter` base class, and their usage should be added to the platform-dependent JIT kernel.
@@ -107,6 +110,7 @@ There are two types of tests instantiations which are used to test JIT emitters:
* [platform independent element-wise operation tests](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/tests/functional/custom/single_layer_tests/instances/common/eltwise.cpp)
* [X64 specific element-wise operation tests](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/tests/functional/custom/single_layer_tests/instances/x64/eltwise.cpp)
* [ARM64 element-wise operation tests](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/tests/functional/custom/single_layer_tests/instances/arm/eltwise.cpp)
* [RISCV64 element-wise operation tests](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/tests/functional/custom/single_layer_tests/instances/riscv64/eltwise.cpp)
* element-wise operations which are used as activations:
* [platform independent activation operation tests](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/tests/functional/custom/single_layer_tests/instances/common/activation.cpp)
* [X64 activation operation tests](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/tests/functional/custom/single_layer_tests/instances/x64/activation.cpp)