Move initializers from subgraph to the main graph to reduce memory #12310

Merged 3 commits on Jul 26, 2022

Contributor

@tianleiwu tianleiwu commented Jul 25, 2022

Description:

The GPT-2 beam search model uses about 2.5x as much memory as the model size (for example, a 0.6 GB model uses 1.5 GB of memory). The cause is that initializers kept in the subgraph use more memory during model creation and graph resolution in ORT (this looks like a bug that duplicates initializers). As a workaround, this PR moves the initializers from the subgraph to the main graph.

The same workaround cannot be applied to T5 because ORT fails when it is. The cause needs investigation, so this PR applies only to GPT-2.
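For illustration only, the idea of moving initializers can be sketched as below. This is not the code from this PR: the function name `move_initializers`, the `min_elements` threshold, and the tensor names are hypothetical. The function is written against the shape of `onnx.GraphProto` (an `initializer` sequence whose items have `name` and `dims`), and the demo uses stand-in objects so it runs without `onnx` installed.

```python
from math import prod
from types import SimpleNamespace


def move_initializers(main_graph, subgraph, min_elements=1024):
    """Move large initializers from `subgraph` into `main_graph`.

    Both arguments are expected to behave like onnx.GraphProto: an
    `initializer` sequence whose items have `name` and `dims`.
    `min_elements` is a hypothetical threshold, not a value from the PR.
    Returns the names of the moved initializers.
    """
    main_names = {t.name for t in main_graph.initializer}
    moved, kept = [], []
    for tensor in subgraph.initializer:
        # Keep small tensors, and any tensor whose name already exists
        # in the main graph, inside the subgraph.
        if prod(tensor.dims) >= min_elements and tensor.name not in main_names:
            moved.append(tensor)
        else:
            kept.append(tensor)

    # Attach the moved tensors to the main graph first, then rebuild the
    # subgraph list in place. Subgraph nodes can still resolve the moved
    # tensors by name from the outer scope.
    main_graph.initializer.extend(moved)
    del subgraph.initializer[:]
    subgraph.initializer.extend(kept)
    return [t.name for t in moved]


# Stand-in objects (hypothetical names) so the sketch runs without onnx.
big = SimpleNamespace(name="wte.weight", dims=[32, 64])
small = SimpleNamespace(name="eps", dims=[1])
main = SimpleNamespace(initializer=[])
sub = SimpleNamespace(initializer=[big, small])
moved_names = move_initializers(main, sub)
```

With a real model, the same call would presumably take `model.graph` and the `GraphProto` stored in the subgraph attribute of the beam search node. Skipping names that already exist in the main graph is a conservative choice in this sketch, since a collision could silently change which tensor a node reads.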

Motivation and Context

  • Why is this change required? What problem does it solve?
  • If it fixes an open issue, please link to the issue here.

#12246

@tianleiwu tianleiwu requested a review from wangyems July 25, 2022 23:18
@tianleiwu tianleiwu requested a review from viboga July 25, 2022 23:48
@tianleiwu tianleiwu merged commit 51a7998 into master Jul 26, 2022
@tianleiwu tianleiwu deleted the tlwu/beam_search_onnx_move_initializers branch July 26, 2022 18:30