[Model] use FusedMoE layer in Jamba #6935

avshalomman · 2024-07-30T09:10:42Z

This PR replaces the Jamba-specific MoE layer impl with the standard vLLM FusedMoE layer.
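For readers unfamiliar with what an MoE layer computes, here is a minimal pure-Python sketch of top-k expert routing for a single token. It is not vLLM's FusedMoE implementation and not the Jamba code this PR removes; the names `moe_forward` and `expert_mlp`, the gated SiLU expert shape, and the `renormalize` flag are all assumptions made for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def silu(v):
    return v / (1.0 + math.exp(-v))


def expert_mlp(x, w_gate, w_up, w_down):
    """Hypothetical gated MLP expert: down(silu(gate(x)) * up(x))."""
    gate = [silu(v) for v in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


def moe_forward(x, router_w, experts, top_k, renormalize=True):
    """Route one token to its top_k experts and mix their outputs."""
    probs = softmax(matvec(router_w, x))
    # Indices of the top_k highest-probability experts.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    weights = [probs[i] for i in chosen]
    if renormalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    out = [0.0] * len(x)
    # Naive loop over the selected experts, one expert at a time.
    for w, idx in zip(weights, chosen):
        y = expert_mlp(x, *experts[idx])
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, chosen, weights


def rand_matrix(rows, cols):
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Tiny illustrative sizes; real models are far larger.
random.seed(0)
hidden, inter, n_experts = 4, 8, 4
router_w = rand_matrix(n_experts, hidden)
experts = [
    (rand_matrix(inter, hidden), rand_matrix(inter, hidden), rand_matrix(hidden, inter))
    for _ in range(n_experts)
]
x = [random.uniform(-1.0, 1.0) for _ in range(hidden)]
out, chosen, weights = moe_forward(x, router_w, experts, top_k=2)
```

The per-expert loop is the part a fused layer would be expected to replace with batched kernels; the routing logic stays conceptually the same.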

github-actions · 2024-07-30T09:10:55Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which consists of a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of the default ones by unblocking the steps in your fast-check build in the Buildkite UI.

Once the PR is approved and ready to go, please make sure to run full CI as it is required to merge (or just use auto-merge).

To run full CI, you can do one of these:

- Comment /ready on the PR
- Add the ready label to the PR
- Enable auto-merge

🚀

avshalomman · 2024-07-30T12:44:22Z

/ready

robertgshaw2-redhat · 2024-07-30T13:01:48Z

Thanks for making this change! Did not realize Jamba had FusedMoE during the refactor

avshalomman · 2024-07-30T13:24:59Z

I see that some tests in the extended suite are failing, but they don't seem to be related to this change.

mgoin

LGTM but is there an active test in CI to ensure this works?

vllm/model_executor/models/jamba.py

avshalomman · 2024-07-31T06:42:25Z

> LGTM but is there an active test in CI to ensure this works?

There are 10 general tests for the Jamba model, and they pass after the switch to the FusedMoE layer.

avshalomman · 2024-07-31T07:21:05Z

/ready

mgoin

Thanks for the answers, LGTM

avshalomman · 2024-07-31T18:13:27Z

> Thanks for the answers, LGTM

Thanks! How can I re-run the failing tests? They look like transient, unrelated errors.

use FusedMoE layer in Jamba

71b0b53

fix ruff

6ffe62b

avshalomman changed the title ~~use FusedMoE layer in Jamba~~ [Model] use FusedMoE layer in Jamba Jul 30, 2024

github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) Jul 30, 2024

mgoin reviewed Jul 30, 2024

avshalomman added 2 commits July 31, 2024 10:07

initiating ci

061ce18

resorting imports

78a2b85

mgoin approved these changes Jul 31, 2024

simon-mo merged commit 2ee8d3b into vllm-project:main Jul 31, 2024
68 of 72 checks passed

dtrifiro mentioned this pull request Aug 5, 2024

Sync with [email protected] opendatahub-io/vllm#120

Closed

kylesayrs pushed a commit to neuralmagic/vllm that referenced this pull request Aug 17, 2024

[Model] use FusedMoE layer in Jamba (vllm-project#6935)

26cbbbe

Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024

[Model] use FusedMoE layer in Jamba (vllm-project#6935)

5337a43

Signed-off-by: Alvant <[email protected]>

KuntaiDu pushed a commit to KuntaiDu/vllm that referenced this pull request Nov 20, 2024

[Model] use FusedMoE layer in Jamba (vllm-project#6935)

db0343b
