
[Disco] Add loader for presharded params. #15957

Merged: Lunderberg merged 3 commits into apache:unity from disco_load_presharded_params on Nov 9, 2023

Conversation

Lunderberg
Contributor

Prior to this commit, sharding of model weights was always performed when initializing the model. This could cause slow initialization, especially for larger numbers of GPUs, as all model weights are initially transferred to GPU-0, before being scattered to all workers.

This commit updates the `tvm::runtime::ShardLoaderObj` to also allow loading of pre-sharded model weights. With pre-sharded model weights, the tensors are sharded while the model is being built, and each worker independently loads the specific model weights that it requires.
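
The difference between the two loading paths can be illustrated with a small, self-contained sketch. This is not the TVM or Disco API: plain Python lists stand in for tensors and GPU workers, and every function name below (`shard`, `load_and_scatter`, `preshard`, `load_presharded`) is hypothetical, chosen only for illustration. It assumes each weight splits evenly into contiguous shards, which may not match the real sharding rules.

```python
# Illustrative only: plain Python lists stand in for tensors and GPU workers.
# None of these names come from TVM or Disco.

def shard(weight, num_workers):
    """Split a flat weight into equal contiguous shards, one per worker."""
    if len(weight) % num_workers != 0:
        raise ValueError("weight length must be divisible by num_workers")
    size = len(weight) // num_workers
    return [weight[i * size:(i + 1) * size] for i in range(num_workers)]


def load_and_scatter(storage, num_workers):
    """Shard at initialization: worker 0 reads each full weight, then scatters."""
    worker0_elements = 0
    per_worker = [{} for _ in range(num_workers)]
    for name, weight in storage.items():
        worker0_elements += len(weight)  # the full tensor passes through worker 0
        for worker_id, piece in enumerate(shard(weight, num_workers)):
            per_worker[worker_id][name] = piece
    return per_worker, worker0_elements


def preshard(storage, num_workers):
    """Shard at build time: store each shard under a (name, worker_id) key."""
    presharded = {}
    for name, weight in storage.items():
        for worker_id, piece in enumerate(shard(weight, num_workers)):
            presharded[(name, worker_id)] = piece
    return presharded


def load_presharded(presharded, worker_id):
    """Pre-sharded load: a worker reads only the entries tagged with its own id."""
    return {
        name: piece
        for (name, wid), piece in presharded.items()
        if wid == worker_id
    }


storage = {"w": list(range(8))}
scattered, worker0_elements = load_and_scatter(storage, 4)
presharded = preshard(storage, 4)
independent = [load_presharded(presharded, i) for i in range(4)]
```

Both paths leave every worker holding the same shards. In the first, all weight data flows through worker 0 during initialization; in the second, the slicing has already happened at build time and each worker touches only its own entries.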

@Lunderberg
Contributor Author

Lunderberg commented Oct 20, 2023

This PR was developed in collaboration with @csullivan, and is based on #15676.

@Lunderberg
Contributor Author

Rebased onto main to re-run CI, as 2-week-old CI results are a bit stale for my preferences.

@junrushao Could I get a review on this PR?

@Lunderberg Lunderberg force-pushed the disco_load_presharded_params branch from 9f019ca to 8a62451 Compare November 6, 2023 20:00
Member

@masahi masahi left a comment


I confirmed that it works. @junrushao Any concerns in merging this?

@Lunderberg Lunderberg merged commit e359e7a into apache:unity Nov 9, 2023
@Lunderberg Lunderberg deleted the disco_load_presharded_params branch November 9, 2023 14:05

4 participants