Add GroupQueryAttention with KV-Cache #3425

Merged
merged 81 commits into from
Oct 11, 2024

@turneram (Contributor) commented Sep 6, 2024

No description provided.

@turneram requested a review from pfultz2 September 6, 2024 17:57
@turneram marked this pull request as ready for review September 11, 2024 19:10
@turneram requested a review from causten as a code owner September 11, 2024 19:10
@turneram requested a review from bpickrel September 11, 2024 19:14
@migraphx-bot (Collaborator)

| Test | Batch | Rate new (725f34) | Rate old (a1e339) | Diff | Compare |
| --- | --- | --- | --- | --- | --- |
| torchvision-resnet50 | 64 | 3,260.60 | 3,261.15 | -0.02% | |
| torchvision-resnet50_fp16 | 64 | 6,985.29 | 6,998.48 | -0.19% | |
| torchvision-densenet121 | 32 | 2,438.58 | 2,437.74 | 0.03% | |
| torchvision-densenet121_fp16 | 32 | 4,107.52 | 4,082.54 | 0.61% | |
| torchvision-inceptionv3 | 32 | 1,639.29 | 1,637.99 | 0.08% | |
| torchvision-inceptionv3_fp16 | 32 | 2,762.34 | 2,759.69 | 0.10% | |
| cadene-inceptionv4 | 16 | 776.94 | 777.37 | -0.05% | |
| cadene-resnext64x4 | 16 | 810.02 | 809.89 | 0.02% | |
| slim-mobilenet | 64 | 7,535.95 | 7,538.41 | -0.03% | |
| slim-nasnetalarge | 64 | 211.84 | 211.83 | 0.01% | |
| slim-resnet50v2 | 64 | 3,504.66 | 3,505.62 | -0.03% | |
| bert-mrpc-onnx | 8 | 1,148.36 | 1,153.97 | -0.49% | |
| bert-mrpc-tf | 1 | 465.22 | 463.62 | 0.35% | |
| pytorch-examples-wlang-gru | 1 | 423.64 | 416.54 | 1.71% | |
| pytorch-examples-wlang-lstm | 1 | 474.21 | 375.99 | 26.12% | 🔆 |
| torchvision-resnet50_1 | 1 | 783.58 | 787.35 | -0.48% | |
| cadene-dpn92_1 | 1 | 400.96 | 398.81 | 0.54% | |
| cadene-resnext101_1 | 1 | 384.06 | 382.80 | 0.33% | |
| onnx-taau-downsample | 1 | 342.41 | 342.52 | -0.03% | |
| dlrm-criteoterabyte | 1 | 33.33 | 33.35 | -0.07% | |
| dlrm-criteoterabyte_fp16 | 1 | 52.77 | 52.52 | 0.49% | |
| agentmodel | 1 | 10,167.28 | 8,468.76 | 20.06% | 🔆 |
| unet_fp16 | 2 | 58.94 | 58.99 | -0.08% | |
| resnet50v1_fp16 | 1 | 953.52 | 916.35 | 4.06% | 🔆 |
| resnet50v1_int8 | 1 | 980.77 | 972.58 | 0.84% | |
| bert_base_cased_fp16 | 64 | 1,172.32 | 1,170.84 | 0.13% | |
| bert_large_uncased_fp16 | 32 | 363.60 | 363.63 | -0.01% | |
| bert_large_fp16 | 1 | 201.32 | 199.03 | 1.15% | |
| distilgpt2_fp16 | 16 | 2,204.29 | 2,203.30 | 0.05% | |
| yolov5s | 1 | 531.15 | 528.92 | 0.42% | |
| tinyllama | 1 | 43.42 | 43.76 | -0.78% | |
| vicuna-fastchat | 1 | 171.63 | 173.75 | -1.22% | |
| whisper-tiny-encoder | 1 | 417.81 | 418.39 | -0.14% | |
| whisper-tiny-decoder | 1 | 427.35 | 427.20 | 0.04% | |

Check results before merge 🔆
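
The bot comment does not say how the Diff column is derived. The reported values are consistent with the relative change of the new rate against the old rate, expressed as a percentage. The sketch below reproduces a few rows under that assumption; the function name `percent_diff` and the `rows` dictionary are illustrative and not part of MIGraphX or its bot, and the rule the bot uses to set the 🔆 flag is not modelled here because it is not stated.

```python
def percent_diff(rate_new: float, rate_old: float) -> float:
    """Relative change of rate_new against rate_old, in percent.

    Assumption: this is the formula behind the Diff column; the bot
    comment itself does not state it.
    """
    return (rate_new - rate_old) / rate_old * 100.0


# (rate new, rate old) pairs copied from the benchmark table.
rows = {
    "torchvision-resnet50": (3260.60, 3261.15),
    "pytorch-examples-wlang-lstm": (474.21, 375.99),
    "agentmodel": (10167.28, 8468.76),
}

for name, (new, old) in rows.items():
    # Two decimal places, matching the precision shown in the table.
    print(f"{name}: {percent_diff(new, old):.2f}%")
```

Comparing the printed values with the Diff column is a quick way to confirm that a positive Diff means the new commit ran faster than the old one.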

@migraphx-bot (Collaborator)


- ✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance
- ✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance
- ✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance
- ✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance
- ✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance
- ✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance
- ✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance
- ✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance
- ✅ agentmodel: PASSED: MIGraphX meets tolerance
- ✅ unet: PASSED: MIGraphX meets tolerance
- ✅ resnet50v1: PASSED: MIGraphX meets tolerance
- ✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance
- 🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
- ✅ bert_large: PASSED: MIGraphX meets tolerance
- ✅ yolov5s: PASSED: MIGraphX meets tolerance
- ✅ tinyllama: PASSED: MIGraphX meets tolerance
- ✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance
- ✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance
- ✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance
- ✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

@causten merged commit 7370075 into develop on Oct 11, 2024
29 checks passed
@causten deleted the gqa-jit branch October 11, 2024 15:25
Labels: Perf Improve, roadmap (Tasks to finish for a release)
6 participants