[Feature]: support layer size undividable by pp size in pipeline parallel inference #6114

youkaichao · 2024-07-03T18:51:15Z

🚀 The feature, motivation and pitch

Tracking issue for supporting a layer count that is not divisible by the pp size in pipeline parallel inference.

Alternatives

No response

Additional context

No response

youkaichao · 2024-07-03T18:52:01Z

~~First step: merge `init_device` into `worker.__init__`, before we create the model runner and cache engine, so that they know the pp rank.~~

Initializing the distributed environment interacts with spec decode in complicated ways. The temporary solution is to store the rank in the parallel config.
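For readers following along, here is a minimal sketch of why the pp rank has to be known when the model is built: each rank needs it to work out which layers it owns, and the split cannot be even when the layer count is not divisible by the pp size. Everything here is an assumption for illustration. `ParallelConfigSketch` and `layer_range` are hypothetical names, not vLLM APIs, and giving the leftover layers to the lowest ranks is just one common policy, not necessarily what the linked PR does.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ParallelConfigSketch:
    """Hypothetical stand-in for a parallel config that also stores the rank."""
    pipeline_parallel_size: int
    pp_rank: int


def layer_range(num_layers: int, config: ParallelConfigSketch) -> Tuple[int, int]:
    """Return the half-open [start, end) range of layers owned by this pp rank."""
    base, remainder = divmod(num_layers, config.pipeline_parallel_size)
    # Assumed policy: ranks below `remainder` each take one extra layer.
    start = config.pp_rank * base + min(config.pp_rank, remainder)
    end = start + base + (1 if config.pp_rank < remainder else 0)
    return start, end


if __name__ == "__main__":
    # 30 layers over 4 pipeline stages: 30 is not divisible by 4.
    for rank in range(4):
        print(rank, layer_range(30, ParallelConfigSketch(4, rank)))
```

With 30 layers and a pp size of 4, this policy gives the ranks 8, 8, 7, and 7 layers respectively, and the ranges cover every layer exactly once.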

youkaichao added the feature request label Jul 3, 2024

youkaichao mentioned this issue Jul 3, 2024

[core][distributed] support layer size undividable by pp size in pipeline parallel inference #6115

Merged

youkaichao closed this as completed in #6115 Jul 3, 2024
