This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[1.7.x] Fix memory leaks in Gluon #18358

Merged
merged 1 commit into from
May 19, 2020

Conversation

leezu
Contributor

@leezu leezu commented May 18, 2020

Backport #18328

Fix leak of ndarray objects in the frontend due to reference cycle.

Backport of 3e676fc
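
To illustrate the kind of leak described above, here is a minimal sketch in plain Python. It is not the MXNet code: `Handle`, `StrongOwner`, `StrongChild`, `WeakOwner`, `WeakChild`, and `freed_without_gc` are hypothetical names, and the sketch assumes CPython's reference counting. A strong back-reference from child to owner forms a cycle, so the resource-holding object is not released when the owner is dropped; replacing the back-reference with a `weakref.ref` breaks the cycle.

```python
import gc
import weakref


class Handle:
    """Hypothetical stand-in for a frontend object that owns native memory."""


class StrongChild:
    def __init__(self, owner):
        self.owner = owner          # strong back-reference: owner <-> child cycle
        self.handle = Handle()


class StrongOwner:
    def __init__(self):
        self.child = StrongChild(self)


class WeakChild:
    def __init__(self, owner):
        self._owner = weakref.ref(owner)  # weak back-reference: no cycle
        self.handle = Handle()

    @property
    def owner(self):
        # Returns None once the owner has been destroyed.
        return self._owner()


class WeakOwner:
    def __init__(self):
        self.child = WeakChild(self)


def freed_without_gc(owner_cls):
    """Return True if dropping the owner releases its Handle immediately,
    i.e. by reference counting alone, with the cycle collector disabled."""
    gc.disable()
    try:
        owner = owner_cls()
        probe = weakref.ref(owner.child.handle)
        del owner
        return probe() is None
    finally:
        gc.enable()
```

Under these assumptions, `freed_without_gc(StrongOwner)` reports that the handle is still alive after the owner is dropped, while `freed_without_gc(WeakOwner)` reports that it was released. With the cycle, the handle lives until the cyclic garbage collector happens to run, which can look like a leak when the handle pins a large native buffer.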
@leezu leezu requested a review from szha as a code owner May 18, 2020 20:32
@mxnet-bot

Hey @leezu, thanks for submitting the PR.
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [unix-cpu, unix-gpu, windows-gpu, sanity, clang, centos-gpu, miscellaneous, edge, website, centos-cpu, windows-cpu]


Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@leezu leezu added Backport 1.x Pending backport to 1.x v1.x Targeting v1.x branch R1.7.0 labels May 18, 2020
@leezu
Contributor Author

leezu commented May 19, 2020

@mxnet-bot run ci [all]

@ciyongch
Contributor

@TaoLv @pengzhao-intel please review and help merge it :)

@TaoLv TaoLv changed the title [1.7] Fix memory leaks in Gluon [1.7.x] Fix memory leaks in Gluon May 19, 2020
@TaoLv
Member

TaoLv commented May 19, 2020

Please also make sure it's fixed on branch v1.x.

@TaoLv TaoLv merged commit c4d9270 into apache:v1.7.x May 19, 2020
@ciyongch
Contributor

@TaoLv here's the PR for v1.x branch: #18359

@leezu leezu deleted the 17memleak branch May 19, 2020 02:40
@leezu
Contributor Author

leezu commented May 22, 2020

@zhreshold should we revert this PR on the 1.7 and 1.x branches, as weakref doesn't work with deepcopy?

https://github.com/dmlc/gluon-cv/blob/428ee05d7ae4f2955ef00380a1b324b05e6bc80f/gluoncv/data/transforms/presets/yolo.py#L144

(Though I'm surprised that copy.deepcopy worked in the first place; for example, using deepcopy probably leads to double free and/or use of invalid memory as you duplicate the ndarray handles of the parameters)
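
One way weak references and `copy.deepcopy` can interact badly is sketched below in plain Python. The `Scope` and `Block` classes are hypothetical stand-ins, not the Gluon classes, and this may or may not be the exact failure hit in the linked gluon-cv code. The standard `copy` module treats `weakref.ref` objects as atomic, so a deep copy reuses the same weak reference, and the copied object's back-reference still points at the original.

```python
import copy
import weakref


class Scope:
    """Hypothetical helper that keeps a weak back-reference to its block."""

    def __init__(self, block):
        self._block = weakref.ref(block)


class Block:
    """Hypothetical container that owns a Scope pointing back at itself."""

    def __init__(self):
        self.scope = Scope(self)


original = Block()
clone = copy.deepcopy(original)

# The Scope object itself is duplicated, but the weakref inside it is
# copied as-is, so it still refers to `original`, not to `clone`.
points_at_original = clone.scope._block() is original
points_at_clone = clone.scope._block() is clone
```

Here `points_at_original` ends up true and `points_at_clone` false: the clone's scope silently refers to the wrong block, and it would start returning `None` if the original were destroyed.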

@zhreshold
Member

@leezu The fix for the memory leak is nice to have and I don't see any problem with this particular PR; however, deepcopy of a full network isn't a rare use case, which means a fix for deepcopy is needed anyway.

In fact, the __deepcopy__ behavior of a Block has to be overridden so that the _BlockScope is renewed in the new Block, and the suspicious copy behavior of parameters has to be verified and secured.
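
A rough sketch of what such an override could look like, using hypothetical `Scope` and `Block` classes rather than the real Gluon `Block` and `_BlockScope`: `__deepcopy__` copies every attribute except the scope, then builds a fresh scope bound to the new object. It does not address the parameter-handle concern raised earlier in the thread; the `params` dict here holds plain Python values only.

```python
import copy
import weakref


class Scope:
    """Hypothetical scope holding a weak back-reference to its block."""

    def __init__(self, block):
        self._block = weakref.ref(block)

    @property
    def block(self):
        return self._block()


class Block:
    """Hypothetical block whose deep copy renews its scope."""

    def __init__(self, params=None):
        self.params = params if params is not None else {}
        self.scope = Scope(self)

    def __deepcopy__(self, memo):
        cls = self.__class__
        new = cls.__new__(cls)
        # Register the copy first so nested references to `self` resolve to it.
        memo[id(self)] = new
        for name, value in self.__dict__.items():
            if name == "scope":
                continue  # do not copy the old scope; it points at `self`
            setattr(new, name, copy.deepcopy(value, memo))
        # Renew the scope so its back-reference targets the new block.
        new.scope = Scope(new)
        return new


original = Block({"w": [1.0, 2.0]})
clone = copy.deepcopy(original)
```

With this override, `clone.scope.block` is `clone` and `original.scope.block` is still `original`, while `clone.params` is an independent copy of the original dictionary.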

rondogency added a commit to rondogency/incubator-mxnet that referenced this pull request Jul 10, 2020
szha pushed a commit that referenced this pull request Jul 12, 2020