Accelerate multi-qubit gates #490

vincentmr · 2023-08-29T14:13:15Z

Before submitting

Please complete the following checklist when submitting a PR:

All new features must include a unit test.
If you've fixed a bug or added code that should be tested, add a test to the
tests directory!
All new functions and code must be clearly commented and documented.
If you do make documentation changes, make sure that the docs build and
render correctly by running make docs.
Ensure that the test suite passes, by running make test.
Add a new entry to the .github/CHANGELOG.md file, summarizing the
change, and including a link back to the PR.
Ensure that code is properly formatted by running make format.

When all the above are checked, delete everything above the dashed
line and fill in the pull request template.

Context:
This PR is a follow-up to #489. The general scheme for multi-qubit gates uses three layers of parallelism with team policies. This introduces several parameters that should be tuned for optimal performance but are currently left to Kokkos' heuristics. By contrast, the straightforward range-policy scheme of the 1- and 2-qubit kernels significantly outperforms the general scheme.

I introduce specialized 3- to 5-qubit kernels and draw the following conclusions:

Range-policy kernels match the team-policy kernel up to 4 qubits on the OPENMP backend and are slower beyond that.
Range-policy kernels are faster than the team-policy kernel up to at least 5 qubits on the CUDA and HIP backends.
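
To make the range-policy idea compared above concrete, here is a minimal serial sketch: one flat loop over the index groups of the state vector, each iteration applying the full gate matrix to its own group of amplitudes. It is plain C++ rather than Kokkos and is not the kernel code from this PR. The names `applyFlatLoop` and `insertZeroBits` are hypothetical, and the convention that targets are given as bit positions of the state index (first target = most significant bit of the matrix index) is an assumption made for illustration. In a Kokkos version, the outer `for` would presumably become the body of a single `Kokkos::parallel_for` over a range policy.

```cpp
#include <algorithm>
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Hypothetical helper: expand a compact index k by inserting a zero bit at
// each position listed in sortedTargets (which must be in ascending order).
std::size_t insertZeroBits(std::size_t k,
                           const std::vector<std::size_t> &sortedTargets) {
    for (std::size_t pos : sortedTargets) {
        const std::size_t low = k & ((std::size_t{1} << pos) - 1);
        const std::size_t high = (k >> pos) << (pos + 1);
        k = high | low;
    }
    return k;
}

// Apply a row-major (2^k x 2^k) matrix to the target bits of the state.
// Assumption: targets are bit positions of the state index.
void applyFlatLoop(std::vector<Complex> &state,
                   const std::vector<Complex> &matrix,
                   const std::vector<std::size_t> &targets) {
    const std::size_t nq = targets.size();
    const std::size_t dim = std::size_t{1} << nq;
    assert(matrix.size() == dim * dim);
    assert(state.size() % dim == 0);

    // Offset of each local basis state within one group of amplitudes.
    std::vector<std::size_t> offsets(dim, 0);
    for (std::size_t j = 0; j < dim; ++j) {
        for (std::size_t t = 0; t < nq; ++t) {
            if ((j >> (nq - 1 - t)) & 1U) {
                offsets[j] |= std::size_t{1} << targets[t];
            }
        }
    }

    std::vector<std::size_t> sorted(targets);
    std::sort(sorted.begin(), sorted.end());

    // Flat loop: each iteration owns one independent group of dim amplitudes.
    const std::size_t outer = state.size() / dim;
    std::vector<Complex> in(dim);
    for (std::size_t k = 0; k < outer; ++k) {
        const std::size_t base = insertZeroBits(k, sorted);
        for (std::size_t j = 0; j < dim; ++j) {
            in[j] = state[base | offsets[j]];
        }
        for (std::size_t i = 0; i < dim; ++i) {
            Complex acc{0.0, 0.0};
            for (std::size_t j = 0; j < dim; ++j) {
                acc += matrix[i * dim + j] * in[j];
            }
            state[base | offsets[i]] = acc;
        }
    }
}
```

Each outer iteration touches a disjoint set of amplitudes, so the loop has no tuning parameters beyond the iteration count. A team-policy scheme, by contrast, must also choose team and vector sizes.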

The following figures show timings to apply a QubitUnitary for OPENMP, CUDA and HIP respectively.

Description of the Change:
Introduce specialized 3- to 5-qubit unitary gate kernels. Refactor the applyMultiQubitOp wrapper in StateVectorKokkos.hpp. Functors are no longer templated on inverse; the conjugate transpose is now taken once upon entering applyMultiQubitOp instead of on the fly for each element of the for loop. Add a few tests.
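
The inverse handling described above can be sketched as follows: resolve the `inverse` flag once by building the conjugate transpose up front, so that the kernel itself only ever sees a plain matrix. This is a standalone illustration in standard C++, not the code in StateVectorKokkos.hpp. The helper names `conjugateTranspose` and `prepareMatrix` are hypothetical, and the row-major layout is an assumption.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Hypothetical helper: conjugate transpose of a row-major dim x dim matrix.
std::vector<Complex> conjugateTranspose(const std::vector<Complex> &m,
                                        std::size_t dim) {
    assert(m.size() == dim * dim);
    std::vector<Complex> out(m.size());
    for (std::size_t i = 0; i < dim; ++i) {
        for (std::size_t j = 0; j < dim; ++j) {
            out[j * dim + i] = std::conj(m[i * dim + j]);
        }
    }
    return out;
}

// Resolve `inverse` once, before any kernel runs. The kernel that consumes
// the returned matrix then needs no inverse template parameter or branch.
std::vector<Complex> prepareMatrix(const std::vector<Complex> &m,
                                   std::size_t dim, bool inverse) {
    if (inverse) {
        return conjugateTranspose(m, dim);
    }
    return m;
}
```

The up-front transpose costs one pass over the matrix, which is small next to the pass over the state vector, and it removes the per-element conjugation from the inner loop.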

Benefits:
Faster QubitUnitary, especially for observables on 3 or more qubits on GPU devices.

Possible Drawbacks:
None

Related GitHub Issues:

…ata` to work with devices. M pennylane_lightning/core/src/simulators/lightning_kokkos/StateVectorKokkos.hpp; `applyMatrix` bugfix: use intermediate hostview to copy matrix data; same bugfix for `getDataVector`. M pennylane_lightning/core/src/simulators/lightning_kokkos/algorithms/AdjointJacobianKokkos.hpp; use copy constructor. M pennylane_lightning/core/src/simulators/lightning_kokkos/measurements/MeasurementsKokkos.hpp; use copy constructor. M pennylane_lightning/core/src/simulators/lightning_kokkos/observables/ObservablesKokkos.hpp; use copy constructor. M requirements-dev.txt; add clang-format-14.


vincentmr · 2023-09-07T12:47:00Z

> Will come back to have a look once the GPU CI is added.

I'm unsure when we'll have a runner with CUDA-12 (and not sure we'll have any runner with HIP-capable devices any time soon), so could we move forward with this PR nevertheless?

multiphaseCFD · 2023-09-07T13:04:02Z

> > Will come back to have a look once the GPU CI is added.
>
> I'm unsure when we'll have a runner with CUDA-12 (and not sure we'll have any runner with HIP-capable devices any time soon), so could we move forward with this PR nevertheless?

yea, sounds good to me!

pennylane_lightning/core/src/simulators/lightning_kokkos/gates/GateFunctorsParam.hpp

AmintorDusko

Hi Vincent, Nice work!
Thank you for that!

pennylane_lightning/core/src/simulators/lightning_kokkos/measurements/ExpValFunctors.hpp

AmintorDusko

Once again, thank you for the nice job!

mlxd

Thanks @vincentmr

vincentmr and others added 30 commits August 21, 2023 10:52

Auto update version

27b54eb

Update changelog.

8ad1a26

Merge branch 'master' into bugfix/cuda12

8375e7d

Merge branch 'master' into bugfix/cuda12

1098402

Auto update version

68881d1

Merge branch 'master' into bugfix/cuda12

e3df23b

Auto update version

fcc7fa3

Add an argument to adjointJacobian to avoid syncing and copying state…

48d9615

… vector data in adjoint-diff.

Reformat

3248276

trigger CI

504c228

[skip ci] Update changelog.

27f8e81

Introduce std::unordered_map<std::string, ExpValFunc> expval_funcs_.

c45cd23

Introduce applyExpectationValueFunctor.

33ff620

Add binding to LKokkos expval(matrix, wires). Combine expval functor …

e0d3212

…calls into two templated methods. Call specialized expval methods when possible. Remove obsolete 'Apply directly' tests.

Update changelog.

4305edc

Add test for arbitrary expval(Hermitian).

5595e3c

Add getExpectationValueMultiQubitOpFunctor.

22c47f4

Add typename hint for macos.

1e1565d

Add typename macos.

614e4de

Use Kokkos::ThreadVectorRange policy for innerloop in getExpectationV…

b1afba8

…alueMultiQubitOpFunctor.

Merge branch 'master' into bugfix/cuda12

9142b16

Auto update version

3b3ee66

Merge branch 'bugfix/cuda12' into accel/expval

7b22095

Merge branch 'master' into bugfix/cuda12

2c7cefc

Auto update version

6dc7883

Couple fix for HIP.

53b48d2

Merge branch 'bugfix/cuda12' into accel/expval

cb43f40

WIP

d31f1fa

Add specialized 3-5 qubit expval functors.

51b7497

vincentmr and others added 2 commits September 7, 2023 05:41

Update changelog.

e3bcade

Merge branch 'template/expval' into accel/mqgate_tmp

a82c78f

Base automatically changed from template/expval to master September 7, 2023 14:35

vincentmr and others added 3 commits September 7, 2023 07:49

Merge remote-tracking branch 'origin/master' into accel/mqgate_tmp

a95c429

Auto update version

50314c2

Fix codefactor error.

9c69ca2

mlxd reviewed Sep 7, 2023

vincentmr and others added 3 commits September 7, 2023 15:01

Update pennylane_lightning/core/src/simulators/lightning_kokkos/gates…

14302d4

…/tests/Test_StateVectorKokkos_Param.cpp Co-authored-by: Lee James O'Riordan <[email protected]>

Auto update version

583241c

Merge branch 'master' into accel/mqgate_tmp

5779d59

AmintorDusko reviewed Sep 7, 2023

pennylane_lightning/core/src/simulators/lightning_kokkos/gates/GateFunctorsParam.hpp (outdated, resolved)

pennylane_lightning/core/src/simulators/lightning_kokkos/gates/GateFunctorsParam.hpp (outdated, resolved)

vincentmr requested a review from mlxd September 7, 2023 19:32

Define i000 vars more explicitly outside the macros.

f7e9846

vincentmr requested review from AmintorDusko and multiphaseCFD September 8, 2023 12:39

AmintorDusko previously approved these changes Sep 8, 2023

AmintorDusko reviewed Sep 8, 2023

pennylane_lightning/core/src/simulators/lightning_kokkos/measurements/ExpValFunctors.hpp (resolved)

AmintorDusko reviewed Sep 8, 2023

pennylane_lightning/core/src/simulators/lightning_kokkos/measurements/ExpValFunctors.hpp (resolved)

vincentmr and others added 5 commits September 8, 2023 11:36

Merge branch 'master' into accel/mqgate_tmp

331b652

trigger CI

42fb09e

Auto update version

8d9ed35

trigger CI

d6c0932

trigger CI

4bed0cb

AmintorDusko approved these changes Sep 8, 2023

mlxd approved these changes Sep 8, 2023

vincentmr merged commit 3fbd4ce into master Sep 8, 2023

vincentmr deleted the accel/mqgate_tmp branch September 8, 2023 20:55
