Move initializers from subgraph to the main graph to reduce memory #12310

tianleiwu · 2022-07-25T23:18:34Z

Description:

The GPT-2 beam search model uses about 2.5x its model size in memory (for example, a 0.6 GB model uses 1.5 GB). The cause is that initializers kept in the subgraph use more memory during model creation and graph resolution in ORT (this looks like a bug that duplicates initializers). Here we use a workaround: move the initializers from the subgraph to the main graph.

A similar workaround cannot be applied to T5 because ORT fails when it is. The cause needs investigation, so this PR applies only to GPT-2.

Motivation and Context

#12246

tianleiwu added 2 commits July 25, 2022 23:08

move initializers from decoder to the main graph

ef83057

update comments

52ceca4

tianleiwu requested a review from wangyems July 25, 2022 23:18

Merge branch 'master' into tlwu/beam_search_onnx_move_initializers

aafe6f4

tianleiwu requested a review from viboga July 25, 2022 23:48

wangyems approved these changes Jul 26, 2022

tianleiwu merged commit 51a7998 into master Jul 26, 2022

tianleiwu deleted the tlwu/beam_search_onnx_move_initializers branch July 26, 2022 18:30
