
Optimize the input state-vector copy into the LGPU #1071

Merged
merged 28 commits into master on Mar 5, 2025

Conversation

LuisAlfredoNu
Contributor

@LuisAlfredoNu LuisAlfredoNu commented Feb 27, 2025

Context:
After running different algorithms with LGPU and performing a memory profile, a memory bottleneck showed up in the LGPU Python layer: the peak memory usage is 3 times what the computation actually needs.


Description of the Change:
Remove temporary allocations and skip the index computation for the common cases.

  • Remove the temporary GPU allocations for input values and indices.
  • The input state vector is copied directly from the host when the target wires are contiguous and start at the most or least significant wire (which are the most common cases).
  • For custom target wires, LGPU follows the previous algorithm, but with a speedup in the index computation through OpenMP parallelization.
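The fast-path condition described in the bullets above can be sketched as follows. This is an illustrative check, not the PR's actual C++ code; the function name and the `target_wires`/`num_wires` parameters are assumptions for the sketch:

```python
def is_contiguous_boundary_wires(target_wires, num_wires):
    """Check whether the target wires are contiguous and anchored at the
    most- or least-significant end of the register, which enables a direct
    host-to-device copy of the input state vector."""
    wires = sorted(target_wires)
    # Contiguous: each wire follows the previous one with no gaps
    contiguous = all(b - a == 1 for a, b in zip(wires, wires[1:]))
    # Anchored at either end of the register
    starts_low = wires[0] == 0
    ends_high = wires[-1] == num_wires - 1
    return contiguous and (starts_low or ends_high)
```

For example, `[0, 1, 2]` or `[3, 4]` on a 5-wire register would take the fast path, while `[1, 2]` would fall back to the custom-wires algorithm.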

Benefits:
Using a test algorithm with 31 qubits produces the following memory profile:

Reduction of the memory peak from 100GB to 66GB

Note: memray measures all memory allocations, including the GPU allocations made through the cudaMalloc family of calls.
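For the custom-target-wires path mentioned in the description, each input amplitude has to be scattered to its index in the full register; the PR parallelizes that index computation with OpenMP in C++. The mapping itself can be sketched in NumPy, assuming PennyLane's convention that wire 0 is the most significant bit and that untargeted wires are fixed in |0⟩ (function name and structure are illustrative, not the PR's code):

```python
import numpy as np

def global_indices(target_wires, num_wires):
    """For each amplitude of a state defined on `target_wires`, compute its
    index in the full `num_wires` register, with all other wires in |0>."""
    k = len(target_wires)
    indices = np.zeros(2**k, dtype=np.int64)
    for i in range(2**k):
        idx = 0
        for bit_pos, wire in enumerate(target_wires):
            # Extract bit `bit_pos` of i (MSB first) and place it on `wire`
            bit = (i >> (k - 1 - bit_pos)) & 1
            idx |= bit << (num_wires - 1 - wire)
        indices[i] = idx
    return indices
```

For instance, `global_indices([0, 1], 3)` gives `[0, 2, 4, 6]` (amplitudes land on the two most significant bits), while `global_indices([1, 2], 3)` gives `[0, 1, 2, 3]`, the contiguous least-significant case that the fast path copies directly.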

Using the following toy circuit, where random_normalize_sv is a helper producing a random normalized state vector:

    input_state = random_normalize_sv(num_wires - 1)
    target_wires = range(num_wires - 1)
    dev = qml.device("lightning.gpu", wires=num_wires)

    @qml.qnode(dev)
    def circuit():
        qml.StatePrep(input_state, wires=target_wires)
        return qml.expval(qml.PauliZ(0))
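The snippet above calls random_normalize_sv, which is not defined in the PR description; a minimal sketch of such a helper (an assumption, not code from the PR) might be:

```python
import numpy as np

def random_normalize_sv(num_wires, seed=0):
    """Generate a random normalized state vector on `num_wires` qubits."""
    rng = np.random.default_rng(seed)
    dim = 2**num_wires
    # Complex amplitudes with Gaussian real and imaginary parts
    vec = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return vec / np.linalg.norm(vec)  # scale to unit norm
```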

This produces the timings shown in the benchmark plots attached to the PR.

Possible Drawbacks:

Related GitHub Issues:
[sc-58833]


Hello. You may have forgotten to update the changelog!
Please edit .github/CHANGELOG.md with:

  • A one-to-two sentence description of the change. You may include a small working example for new features.
  • A link back to this PR.
  • Your name (or GitHub username) in the contributors section.


codecov bot commented Feb 27, 2025

Codecov Report

Attention: Patch coverage is 96.59091% with 3 lines in your changes missing coverage. Please review.

Project coverage is 98.11%. Comparing base (109db9f) to head (901be60).
Report is 1 commit behind head on master.

Files with missing lines Patch % Lines
...lightning/core/src/utils/cuda_utils/DataBuffer.hpp 83.33% 2 Missing ⚠️
pennylane_lightning/lightning_gpu/_state_vector.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1071      +/-   ##
==========================================
+ Coverage   97.99%   98.11%   +0.12%     
==========================================
  Files         233      232       -1     
  Lines       40019    39268     -751     
==========================================
- Hits        39215    38527     -688     
+ Misses        804      741      -63     

@LuisAlfredoNu LuisAlfredoNu marked this pull request as ready for review February 27, 2025 22:24
@LuisAlfredoNu LuisAlfredoNu added the ci:use-gpu-runner Enable usage of GPU runner for this Pull Request label Feb 27, 2025
@AmintorDusko (Contributor) left a comment

Good job! A great improvement in memory usage. I assume you checked the codecov warnings and they are all false positives, right?

@multiphaseCFD (Member) left a comment

Nice one. Thanks @LuisAlfredoNu

@AmintorDusko (Contributor) left a comment

Thank you for your nice work!

@maliasadi (Member) left a comment

Awesome work! Thanks @LuisAlfredoNu 🥇

Don't forget to add your PR to the changelog :)

@maliasadi (Member) left a comment

Happy to approve 🥳

@LuisAlfredoNu LuisAlfredoNu merged commit 9e26a91 into master Mar 5, 2025
91 of 92 checks passed
@LuisAlfredoNu LuisAlfredoNu deleted the optimize_memory_lgpu branch March 5, 2025 18:21
Labels
ci:use-gpu-runner Enable usage of GPU runner for this Pull Request
6 participants