Commit 5b43dbe
mlxd and maliasadi authored
Update doc/lightning_qubit/development/avx_kernels/kernel_tuning.rst
Co-authored-by: Ali Asadi <[email protected]>
1 parent 479c287 commit 5b43dbe

1 file changed: +1 -1 lines changed

doc/lightning_qubit/development/avx_kernels/kernel_tuning.rst

@@ -8,6 +8,6 @@ However, sometimes we may want to modify the above defaults to favour a given wo
 OpenMP threaded kernels
 -----------------------
 
-To enable OpenMP acceleration of the gate kernels, Lightning-Qubit can be compiled with the `-DLQ_ENABLE_KERNEL_OMP=ON` CMake flag. Not, that for gradient workloads with many observables, this may reduce performance in comparison with the default mode, so this behaviour is opt-in only.
+To enable OpenMP acceleration of the gate kernels, Lightning-Qubit can be compiled with the ``-DLQ_ENABLE_KERNEL_OMP=ON`` CMake flag. Not, that for gradient workloads with many observables, this may reduce performance in comparison with the default mode, so this behaviour is opt-in only.
 
 For workloads that show benefit from the use of threaded gate kernels, sometimes updating the CPU cache to accommodate recently modified data can become a bottleneck, and saturates the performance gained at high thread counts. This may be alleviated somewhat on systems supporting AVX2 and AVX-512 operations using the ``-DLQ_ENABLE_KERNEL_AVX_STREAMING=on`` CMake flag. This forces the data to avoid updating the CPU cache and can improve performance for larger workloads.
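
As a rough illustration of how the two flags named in this file might be combined, the sketch below assembles a CMake configure command and prints it without running it. The two ``-D`` options are taken from the documentation text in the diff; the source directory ``.``, the build directory ``build``, and the variable names are assumptions made for this example and are not taken from the Lightning-Qubit build instructions.

```shell
#!/bin/sh
# Dry-run sketch: build up the configure command and print it.
# Flags come from the documentation text; paths are assumed for illustration.

# Opt in to OpenMP-threaded gate kernels.
OMP_FLAG="-DLQ_ENABLE_KERNEL_OMP=ON"

# Opt in to streaming stores on AVX2 / AVX-512 capable systems.
STREAM_FLAG="-DLQ_ENABLE_KERNEL_AVX_STREAMING=on"

# Assumed layout: sources in the current directory, out-of-tree build in ./build.
CONFIGURE_CMD="cmake -S . -B build ${OMP_FLAG} ${STREAM_FLAG}"

# Print instead of executing, so the command can be inspected first.
echo "${CONFIGURE_CMD}"
```

Since both options are described as workload-dependent, it may be worth benchmarking a representative circuit with and without each flag before adopting them.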
