Reduce memory usage for scripts with many Session objects #2934
Codecov Report
Coverage Diff:
| | develop | #2934 | +/- |
|---|---|---|---|
| Coverage | 93.41% | 93.41% | |
| Files | 63 | 63 | |
| Lines | 13561 | 13571 | +10 |
| Hits | 12668 | 12678 | +10 |
| Misses | 893 | 893 | |
06ac876 to 5c44af8
⛵ Thanks @jonemo. Awesome write up!
Can we get a changelog entry too before we release?
* release-1.29.131:
  Bumping version to 1.29.131
  Update to latest partitions and endpoints
  Update to latest models
  Reduce memory usage for scripts with many Session objects (#2934)
  Fix changelog
Addresses boto/boto3#3614
Alternative to #2889
This retains the cache on `botocore.endpoint_provider.EndpointProvider.resolve_endpoint` while mitigating the effect on memory usage reported in boto/boto3#3614.
Problem description
The current code results in excessive memory usage when a script creates a large number of botocore `Session` objects. Prior to botocore version 1.29.0, Python's garbage collector would have cleaned these objects up in most situations. Since the introduction of the `EndpointProvider` and its use of `lru_cache`, up to 100 `Session` objects stay in memory for as long as they are referenced in the cache.
Solution in this PR:
This PR proposes a solution where Python's `weakref` module is used to temporarily replace the full reference with a weak reference during the caching process. This way, the cache no longer interferes with the garbage collector but otherwise performs the same function. This comes at the expense of a small computational overhead for creating the weak reference (on every call) and resolving it (on cache misses).
Other solutions considered:
* `instance_cache` decorator (Replace `lru_cache` with `instance_cache` #2889): For long-lived sessions where each operation call results in a cache miss, this can lead to unbounded memory growth, because `instance_cache` has no `maxsize` parameter.
* `lru_cache` in the constructor (Memory leak after updating to 1.25.0 boto3#3614 (comment)): This solution addresses the problem and performs well on both memory and runtime metrics on all Python versions we tested. However, it relies on an undocumented way of using the `lru_cache` decorator (namely: not as a decorator) and could therefore result in unexpected behavior changes in future Python versions.
* Keep `lru_cache` in place but reduce `maxsize` to 10: This avoids the excessive memory usage but still uses on the order of MBs more memory than the solution in this PR with a `maxsize` of 100.
Benchmarking results:
Each line represents a single run.
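For readers unfamiliar with the pattern, here is a minimal sketch of the retention problem and the weak-reference approach described above. It is not the code from this PR: `weak_lru_cache`, `StrongProvider`, `WeakProvider`, `is_collected_after_use`, and the example URL are hypothetical names made up for illustration, and the sketch assumes instances are hashable and weak-referenceable.

```python
import functools
import gc
import weakref


def weak_lru_cache(maxsize=100):
    """Hypothetical decorator: an lru_cache keyed on a weak reference to self."""
    def decorator(method):
        @functools.lru_cache(maxsize=maxsize)
        def cached(self_ref, *args, **kwargs):
            # Only reached on a cache miss: resolve the weak reference here.
            return method(self_ref(), *args, **kwargs)

        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            # A weak reference is created on every call and used as the cache key.
            return cached(weakref.ref(self), *args, **kwargs)

        wrapper.cache_info = cached.cache_info
        return wrapper
    return decorator


class StrongProvider:
    # Plain lru_cache on a method: the cache key holds a strong reference to self.
    @functools.lru_cache(maxsize=100)
    def resolve(self, region):
        return f"https://service.{region}.example.com"  # made-up URL


class WeakProvider:
    # Same method, but the cache key only holds a weak reference to self.
    @weak_lru_cache(maxsize=100)
    def resolve(self, region):
        return f"https://service.{region}.example.com"  # made-up URL


def is_collected_after_use(cls):
    """Create an instance, populate the cache, drop it, and check if it was freed."""
    obj = cls()
    obj.resolve("us-east-1")
    probe = weakref.ref(obj)
    del obj
    gc.collect()
    return probe() is None


strong_collected = is_collected_after_use(StrongProvider)
weak_collected = is_collected_after_use(WeakProvider)
print(strong_collected, weak_collected)  # on CPython: False True
```

With the plain `lru_cache`, the instance survives until its entry is evicted from the cache, which is what keeps up to `maxsize` objects alive. With the weak-reference variant, the cache entry may outlive the instance, but the instance itself can be freed as soon as the script drops it.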