[Neuron][Kernel] Vectorize KV cache load in FlashPagedAttention to maximize DMA bandwidth #13245

Merged
simon-mo merged 12 commits into vllm-project:main from lingfanyu:fast_vectorized_dma on Feb 21, 2025

Commits

Commits on Feb 13, 2025

Commits on Feb 14, 2025

Commits on Feb 18, 2025

Commits on Feb 19, 2025