
Reduce memory usage for scripts with many Session objects #2934

Merged · 3 commits into boto:develop · May 9, 2023

Conversation

jonemo
Contributor

@jonemo jonemo commented May 8, 2023

Addresses boto/boto3#3614
Alternative to #2889

This retains the cache on botocore.endpoint_provider.EndpointProvider.resolve_endpoint while mitigating the effect on memory usage reported in boto/boto3#3614.

Problem description

The current code results in excessive memory usage when a large number of botocore Session objects are created in a script. Prior to botocore version 1.29.0, Python's garbage collector would have cleaned these objects up in most situations. The introduction of the EndpointProvider, and its use of lru_cache, now results in up to 100 Session objects staying in memory while they are referenced in the cache.
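A minimal sketch (hypothetical class and method names, not botocore's actual code) of why lru_cache on an instance method pins objects in memory: each cached call stores `self` as part of the cache key, so the cache holds a strong reference to the instance and the garbage collector cannot reclaim it.

```python
# Sketch only: demonstrates the retention effect, not botocore internals.
import functools
import gc
import weakref

class Resolver:
    @functools.lru_cache(maxsize=100)
    def resolve(self, name):
        return name.upper()

resolver = Resolver()
alive = weakref.ref(resolver)
resolver.resolve("s3")  # the cache now strongly references `resolver`
del resolver
gc.collect()
print(alive() is not None)  # True: the lru_cache entry keeps it alive
```

Clearing the cache (`Resolver.resolve.cache_clear()`) releases the instance, which illustrates that the cache entry is the only thing keeping it alive.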

Solution in this PR:

This PR proposes a solution where Python's weakref module is used to temporarily replace the strong reference with a weak reference during the caching process. This way, the cache no longer interferes with the garbage collector but otherwise performs the same function. This comes at the cost of a small computational overhead for creating the weak reference (on every call) and dereferencing it (on cache misses).

Other solutions considered:

  • Remove the cache entirely: This is the best solution for the use case with many sessions. However, creating many sessions is only required in rare circumstances. Session object reuse is the recommended best practice and the more common use case. For long-lived sessions where only a few services and operations are called, the cache has a positive impact on runtime that we want to retain.
  • Use botocore's instance_cache decorator (Replace lru_cache with instance_cache #2889): For long-lived sessions where each operation call results in a cache miss, this can lead to indefinitely growing memory usage. This is because instance_cache has no maxsize parameter.
  • Use lru_cache in the constructor (Memory leak after updating to 1.25.0 boto3#3614 (comment)): This solution addresses the problem and performs well on both memory and runtime metrics on all Python versions we tested. However, it relies on an undocumented way of using the lru_cache decorator (namely: not as a decorator) and could therefore result in unexpected behavior changes in future Python versions.
  • Keep lru_cache in place but reduce maxsize to 10: This avoids the excessive memory usage but still uses several megabytes more memory than the solution in this PR does with a maxsize of 100.
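The "lru_cache in the constructor" alternative above can be sketched like this (a hypothetical minimal class, not the shipped code): calling `functools.lru_cache` imperatively in `__init__` gives each instance its own bounded cache. The wrapper, the bound method, and the instance form a reference cycle that Python's cycle collector can reclaim, so the cache dies with the instance.

```python
import functools

class EndpointProvider:
    def __init__(self):
        # Undocumented-but-working pattern: lru_cache applied as a plain
        # callable (not as a decorator) to a bound method. Each instance
        # gets its own cache, reclaimed along with the instance.
        self.resolve_endpoint = functools.lru_cache(maxsize=100)(
            self._resolve_endpoint
        )

    def _resolve_endpoint(self, service):
        # Stand-in for the real resolution logic.
        return f"https://{service}.amazonaws.com"
```

As noted above, this works on the Python versions tested, but it depends on behavior that the lru_cache documentation does not guarantee.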

Benchmarking results:

[Three benchmark figures ("multiplot" charts); each line represents a single run.]

@codecov-commenter

codecov-commenter commented May 8, 2023

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (c95feba) 93.41% compared to head (5c44af8) 93.41%.


Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #2934   +/-   ##
========================================
  Coverage    93.41%   93.41%           
========================================
  Files           63       63           
  Lines        13561    13571   +10     
========================================
+ Hits         12668    12678   +10     
  Misses         893      893           
Impacted Files Coverage Δ
botocore/endpoint_provider.py 99.02% <100.00%> (-0.01%) ⬇️
botocore/utils.py 79.35% <100.00%> (+0.14%) ⬆️


@jonemo jonemo force-pushed the fix-lru-cache-memory-usage branch from 06ac876 to 5c44af8 Compare May 8, 2023 21:40
@jonemo jonemo requested a review from nateprewitt May 9, 2023 16:26
Contributor

@nateprewitt nateprewitt left a comment


⛵ Thanks @jonemo. Awesome write up!


@nateprewitt nateprewitt left a comment


Can we get a changelog entry too before we release?


@nateprewitt nateprewitt left a comment


:shipit:

@jonemo jonemo merged commit 5e6bb36 into boto:develop May 9, 2023
@jonemo jonemo deleted the fix-lru-cache-memory-usage branch May 9, 2023 17:15
aws-sdk-python-automation added a commit that referenced this pull request May 9, 2023
* release-1.29.131:
  Bumping version to 1.29.131
  Update to latest partitions and endpoints
  Update to latest models
  Reduce memory usage for scripts with many Session objects (#2934)
  Fix changelog