Wenxh/fp8 on a100 v5 (#1)
Group Gemm Version
wenxcs authored Jun 14, 2024
1 parent 03e3bda commit aca4a33
Showing 11 changed files with 997 additions and 446 deletions.
8 changes: 7 additions & 1 deletion requirements-cuda.txt
@@ -8,4 +8,10 @@ vllm-nccl-cu12>=2.18,<2.19 # for downloading nccl library
torch == 2.2.1
xformers == 0.0.25 # Requires PyTorch 2.2.1

cupy-cuda12x
# Dependencies for pycublas-moe-group-gemm
gitpython
pytest
loguru
# In case of an invalid URL, please install from this file:
# pip install gitpython pytest loguru vllm/model_executor/layers/fused_moe/pycublas.zip
git+https://github.com/wenxcs/pycublas.git@moe-group-gemm
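
After installing from this file, a quick way to confirm that the newly listed dependencies are importable might look like the sketch below. It is not part of the commit. The helper name `missing_modules` is hypothetical, and the import name `pycublas` for the package installed from the Git URL is an assumption (`git` is the import name of gitpython).

```python
import importlib.util

# Hypothetical mapping: pip requirement name -> top-level import name.
# "pycublas" as an import name is an assumption, not confirmed by the commit.
REQUIRED = {
    "gitpython": "git",
    "pytest": "pytest",
    "loguru": "loguru",
    "pycublas": "pycublas",
}


def missing_modules(required):
    """Return the requirement names whose import name cannot be found."""
    return sorted(
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

If `pycublas` is reported missing, the comment in the requirements file suggests installing from the bundled `pycublas.zip` instead of the Git URL.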