Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

User parameter data is referenced inside RelationalCommandCache #34028

Closed
renanhgon opened this issue Jun 18, 2024 · 28 comments · Fixed by #34803
Closed

User parameter data is referenced inside RelationalCommandCache #34028

renanhgon opened this issue Jun 18, 2024 · 28 comments · Fixed by #34803

Comments

@renanhgon
Copy link

renanhgon commented Jun 18, 2024

When I start my application, gradually, as processes are executed, the memory consumption just keeps increasing until my k8s pod simply dies due to lack of memory and gets restarted.

After analyzing the generated dumps, we observed the following:

image

To ensure that the dispose method of my DbContext is always executed, some logs were added during its creation and disposal.

public MyDbContext(IDatabaseConfig configuration, 
    ISeed seed, 
    ILogger<MyDbContext> logger)
{
    _contextId = Guid.NewGuid();
    _logger = logger;

    _logger.LogInformation("Context created. ID {0}", _contextId.ToString());

    _configuration = configuration;
    _seed = seed;
}

public override void Dispose()
{
    base.Dispose();
    _logger.LogInformation("Context destroyed. ID {0}", _contextId.ToString());
}

public override ValueTask DisposeAsync()
{
    var result = base.DisposeAsync();
    _logger.LogInformation("Context destroyed. ID {0}", _contextId.ToString());
    return result;
}

And from these logs, we were able to confirm that the object is indeed always created and disposed without any problems.

In some previous issues, I observed some possible causes within the OnConfiguring method, but I believe that is not the problem in my case.

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    if (!optionsBuilder.IsConfigured)
        optionsBuilder.UseNpgsql(_configuration.ConnectionString);

    base.OnConfiguring(optionsBuilder);
}

My DbContext instance is resolved in this way:

private static void RegisterContext(IServiceCollection services) 
    => services.AddDbContext<IMyDbContext, MyDbContext>();

The issue was created here because Microsoft.EntityFrameworkCore.Internal.ServiceProviderCache is indicated. But perhaps the problem might be related to the GarbageCollector?

EF Core version: 8.0.6
Database provider: Npgsql.EntityFrameworkCore.PostgreSQL
Target framework: NET 8.0
Operating system: Windows 10
IDE: Visual Studio 2022 17.10.1
My container is running with this base image: mcr.microsoft.com/dotnet/aspnet:8.0

@ajcvickers
Copy link
Contributor

This issue is lacking enough information for us to be able to fully understand what is happening. Please attach a small, runnable project or post a small, runnable code listing that reproduces what you are seeing so that we can investigate.

@renanhgon
Copy link
Author

Sure! I created this example repository with an implementation very close to my real scenario.
https://github.com/renanhgon/EFCoreMemoryLeak

When I run my application, just the devops/ping route triggered by k8s is enough to cause a constant increase in memory usage. You can simulate these pings directly from the browser console:

setInterval(() => fetch('/v1/devops/ping'), 10000);
setInterval(() => fetch('/v1/devops/ping'), 30000);

@cssatkmd
Copy link

cssatkmd commented Jul 3, 2024

Has this issued been looked at or given any time and effort? We can reproduce a large memory usage, but we have yet to create a reproduceable example.
The example above seems to have a leak, so can this be confirmed or mitigated because of something?

@peterkiss1
Copy link

Something similar is happening in my app;

EF Core version: 8.0.6 or EF Core version: 8.0.4
Database provider: Npgsql.EntityFrameworkCore.PostgreSQL 8.0.3

The setup is simple as the database holds 3 tables, from which 2 are in relation in a one-to-many mapping.
Application is a Web API, web requests are interacting with the context but also hosted services are doing database operations, context registered as transient, hosted services are creating their scope for each process then dispose the whole, DbContext also disposed everywhere still the memory usage is sky rocketing (low load is on the app). PostgreSQL driver has an issue with multiplexing and OpenTelemetry but we are not using multiplexing.

@peterkiss1
Copy link

Executed a couple of extra tests and

  • 8.0.0 looks better than 8.0.4 or 8.0.6
  • if i turn ON pooling for the context everything looks fine which is bad as pooling itself is not an essential thing and i'm afraid that something deeper is simple wrong with EF

Who has memory issues please check the suffering apps with pooling.

@ajcvickers
Copy link
Contributor

Note for team: we need to decide how to best investigate this kind of issue. My usual process doesn't work here, so I'd like to get advice from the team on how to proceed.

@ajcvickers
Copy link
Contributor

@renanhgon Can you please post the OutOfMemoryException you get from running the repro until it runs out of memory?

@ajcvickers ajcvickers removed their assignment Aug 31, 2024
@bleapleaper
Copy link

bleapleaper commented Sep 5, 2024

@ajcvickers Semi-similar issue where my stuff is kept in memory. Used PerfView (first time) to find that RelationalCommandCache uses my object in a cache key. No idea how it ended up there xD. However it would be nice if there was an option to turn on som debug messages when saving abnormal types to cache, wether MemoryCache or static

edit; never mind. memory cache says no entries but visual studio heap says two entries - no idea what is going on
edit2: oh its probably ef own service provider?

anyway ServiceProviderCache or some cache is holding back 1 million ...Core.Update.ColumnModification and 13000 entites from being garbage collected

@cliffankh0z
Copy link

cliffankh0z commented Sep 9, 2024

@ajcvickers found the bug, inspect IMemoryCache in a breakpoint and you'll see what is mean. Lists of Guids and all different kinds of parameter values are put in IMemoryCache as CommandCacheKey. Only need the type info for the cache key, not the actual parameter value. These object can link to other object, which links to other objects thereby keeping all of them in memory

Youll have to use optionsBuilder.UseMemoryCache(_memoryCache); to be able to inspect this

yinzara pushed a commit to yinzara/efcore that referenced this issue Oct 1, 2024
Only store necessary info for paramter values RelationalCommandCacheKey to prevent memory leaks.

Fixes dotnet#34028
@yinzara
Copy link
Contributor

yinzara commented Oct 1, 2024

I have now been able to reliably reproduce this issue and am producing a PR to fix it.

Thank you to @cliffankh0z for the PR that nearly fixed it already.

The following test case should work on both .NET 8 and .NET 9 (and is included in my PR) so it can be ported back to the release/8.0 branch easily. This fix is very important to get ported back ASAP as it is an active showstopper defect IMHO.

It shows the problem exists and now is fixed.

public class RelationalCommandCacheTest
{
    [ConditionalFact]
    public void MemoryLeak()
    {
        // Setup:
        // allow 1000 records to exist for a sizeable enough cache to see the problem
        const int cacheSize = 1000;

        // create a parameter expression and mock sql generator and processors to pass into all relational command cache creations
        var param = Expression.Parameter(typeof(string));
        var expression = Expression.Lambda<Func<string, string>>(param, param); // _ => _
        var memoryCache = new MemoryCache(Options.Create(new MemoryCacheOptions { SizeLimit = cacheSize }));
        var generator = new TestSqlGeneratorFactory();
        var processor = new TestSqlProcessorFactory();
        var cache = new RelationalCommandCache(memoryCache, generator, processor, expression, false
#if !NET8_0
            , null!
#endif
        );

        // fill the cache completely once for a good baseline memory size
        RunTest();
        var memoryBefore = GC.GetTotalMemory(true);

        // Act:
        // attempt to look up the records again which should not increase the memory needed for the cache
        // even though we will have all cache misses and new objects created with each lookup
        RunTest();
        var memoryAfter = GC.GetTotalMemory(true);

        // Verify:
        // memory footprint has not increased/decreased by more than a small margin of error (128k)
        var difference = Math.Abs(memoryAfter - memoryBefore) / 1024;
        Assert.True(difference < 128, $"Memory leak detected in RelationalCommandCache: {difference}kb");

        return;
        void RunTest()
        {
            for (var i = 0; i < cacheSize; i++) // run the size limit of the cache to completely replace all entries
            {
                var parameters = new ReadOnlyDictionary<string, object?>(new Dictionary<string, object?>
                {
                    // allocate a large object array with each parameter creation to exacerbate the problem
                    // use a random key to cause a consistent cache miss
                    [Guid.NewGuid().ToString()] = Enumerable.Range(1, 5000).Select(j => $"{j}").ToArray<object>()
                });
                cache.GetRelationalCommandTemplate(
                    parameters); // ignore the result so it gets garbage collected
            }
        }
    }

    private class TestSqlProcessorFactory : IRelationalParameterBasedSqlProcessorFactory
    {
#if NET8_0
        public RelationalParameterBasedSqlProcessor Create(bool useRelationalNulls)
#else
        public RelationalParameterBasedSqlProcessor Create(RelationalParameterBasedSqlProcessorParameters parameters)
#endif
            => new TestRelationalParameterBasedSqlProcessor();
    }

    private class TestRelationalParameterBasedSqlProcessor : RelationalParameterBasedSqlProcessor
    {
        internal TestRelationalParameterBasedSqlProcessor() : base(null!, null!) {}

        public override Expression Optimize(
            Expression queryExpression,
            IReadOnlyDictionary<string, object?> parametersValues,
            out bool canCache)
        {
            canCache = true;
            return null!;
        }
    }

    private class TestSqlGeneratorFactory : IQuerySqlGeneratorFactory
    {
        public QuerySqlGenerator Create()
            => new TestQuerySqlGenerator();
    }

    private class TestQuerySqlGenerator() : QuerySqlGenerator(new QuerySqlGeneratorDependencies(null!, null!))
    {
        public override IRelationalCommand GetCommand(Expression queryExpression)
            => new TestRelationalCommand();
    }

    private class TestRelationalCommand : IRelationalCommand
    {
        public string CommandText
            => throw new NotImplementedException();

        public IReadOnlyList<IRelationalParameter> Parameters
            => throw new NotImplementedException();

        public DbCommand CreateDbCommand(RelationalCommandParameterObject parameterObject, Guid commandId, DbCommandMethod commandMethod)
            => throw new NotImplementedException();

        public int ExecuteNonQuery(RelationalCommandParameterObject parameterObject)
            => throw new NotImplementedException();

        public Task<int> ExecuteNonQueryAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken = default)
            => throw new NotImplementedException();

        public object ExecuteScalar(RelationalCommandParameterObject parameterObject)
            => throw new NotImplementedException();

        public Task<object?> ExecuteScalarAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken = default)
            => throw new NotImplementedException();

        public RelationalDataReader ExecuteReader(RelationalCommandParameterObject parameterObject)
            => throw new NotImplementedException();

        public Task<RelationalDataReader> ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken = default)
            => throw new NotImplementedException();

        public void PopulateFrom(IRelationalCommandTemplate commandTemplate)
            => throw new NotImplementedException();
    }
}

@roji
Copy link
Member

roji commented Oct 1, 2024

@yinzara and others, I can indeed see that RelationalCommandCache retains references to user data, which indeed isn't ideal. But I'd like to make sure we're on the same page in terms of whether there's an actual leak here.

EF instantiates a RelationalCommandCache for each LINQ query it compiles. The point of that cache is to have one entry for each permutation of parameter nullability values; that is, if there are two parameters in the query, there should always be at most 4 entries in that query's cache. In other words, you can execute the same query as many times as you want with as many values, it will only ever retain the user values from the first four invocations. It's certainly not great that the user values are retained, but I'm not seeing a leak here in the regular sense.

Now, if you run different queries, you'll get different instances of RelationalCommandCache, which each will retain some user values. The total number of RelationalCommandCaches instances has an upper limit though, and old ones will get evicted. So again, I don't see a critical leak here either. @yinzara your repro above creates an unbounded number of RelationalCommandCache instances, which doesn't correspond to how EF actually works.

To be sure, I think we should improve things, but at this point I'm still not seeing a critical leak that needs to be fixed and backported. If I've missed something, an EF repro - not an artificial unit test-like repro that @yinzara wrote above - would help here.

yinzara pushed a commit to yinzara/efcore that referenced this issue Oct 1, 2024
Only store necessary info for paramter values RelationalCommandCacheKey to prevent memory leaks.

Fixes dotnet#34028
@yinzara
Copy link
Contributor

yinzara commented Oct 1, 2024

The total number of RelationalCommandCaches instances has an upper limit though,

Btw I updated my example to only create a single instance of RelationalCommandCache and the memory leak still occurs.

The issue isn't with the instances of RelationalCommandCache, the issue is the upper limit of the total number of IRelationalCommandTemplate(s) that get stored in the IMemoryCache during the lookup. The capacity of which is only constrained by whatever the IMemoryCache that's injected into it will hold (i.e. whatever its SizeLimit is) where each cache entry is 10L and has no expiration.

Yes I agree that within a single query plan, the cache will not grow except if you change the nullability of a parameter or the array length of a parameter.

However, every new Expression that is planned (every time a query is a new structure), you will have an instance of the parameters stored in active memory. These parameters could be references to objects loaded from the database that then reference tons of others leaving vast amount of memory used by this cache. In the end you will fill the entire IMemoryCache with these leaked objects of unknown memory size.

If someone has piece of badly written code the recompiles the Expression with each request, this will become a serious problem very quickly. Even UPDATE statements on tables with a lot of columns in which varying columns are changing would cause this.

So the default behavior if no configuration parameters are set is to have a cache that grows continually as new command queries are made on the instance and is only constrained by the default SizeLimit of 10240 which means 1024 entries will be maintained.

1024 entries could be megabytes (or more) of data depending on what's in the query parameters and something the application developer would have no control of directly. All of it completely wasted memory.

My test case only set the SizeLimit to 1000, a factor of 10 less than the default behavior. I was able to leak 2MB of memory on average with only a single parameter that was a string array of length 5000 and with strings of 4 characters or less: an object of less than 20k.

Forgive me but that seems like a showstopper to me.

@yinzara
Copy link
Contributor

yinzara commented Oct 2, 2024

One note about my PR though, I'm not sure the effect being described in the OPs example repository is fixed by my PR. If that example is actually causing a memory leak, it doesn't even execute any queries. It just starts a transaction. So I'm not sure how my fix could help it. They may both be problems. I don't know.

@roji
Copy link
Member

roji commented Oct 2, 2024

However, every new Expression that is planned (every time a query is a new structure), you will have an instance of the parameters stored in active memory.

Yes, that's right.

These parameters could be references to objects loaded from the database that then reference tons of others leaving vast amount of memory used by this cache. In the end you will fill the entire IMemoryCache with these leaked objects of unknown memory size.

Objects with references to other objects generally cannot be parameters, since only scalar types supported by the database can be query parameters. The problem here is with parameters which are huge strings/byte arrays.

If someone has piece of badly written code the recompiles the Expression with each request, this will become a serious problem very quickly.

This case of badly-written code creates lots of problem far beyond the capturing of user parameter data we're describing; the constant recompilation itself is a huge slowdown that is likely to be far more problematic than the captured data (and crucially, the recompilation happens no matter what in that case, whereas the parameter data capture is meaningful only with huge strings/byte arrays). So this case of badly-written code would stay highly problematic whether we fix this issue or not.

My test case only set the SizeLimit to 1000, a factor of 10 less than the default behavior. I was able to leak 2MB of memory on average with only a single parameter that was a string array of length 5000 and with strings of 4 characters or less: an object of less than 20k.

Forgive me but that seems like a showstopper to me

It's indeed possible to create contrived cases where the parameter data capture is very problematic. But at the end of the day, the problem here really is when the user executes lots of different queries, and those queries all have huge parameter data. I do agree this needs to be fixed - and probably also that this is worth backporting to 8.0 and 9.0 - but it's important to realize the exact scope of the problem and not exaggerate it. For example, this behavior has been identical since EF Core 1.0, and nobody has raised it so far (huge parameter data is generally quite rare and can cause other issues).

I'll bring this up with the team to decide what we do.

@peterkiss1
Copy link

@roji and what about I mentioned above? As I had - probably still have - a memory leak issue with EF, workaround is to have context pooling enabled.
I don't have a specific repro code or anything but I believe a few is at hand already to execute the same query with the same parameters but like 14000 times (executed these with some parallelism). I checked how I verified the workaround and basically this is what I've done many weeks ago.

@roji roji changed the title Memory Leak - Microsoft.EntityFrameworkCore.Internal.ServiceProviderCache User parameter data is referenced inside RelationalCommandCache Oct 2, 2024
@roji
Copy link
Member

roji commented Oct 2, 2024

@peterkiss1 this issue has indeed bifurcated to discuss multiple different things. As we have a clear problem with the capturing of parameter data, I'm repurposing this issue to track that.

For the other issues, we have @renanhgon's repro here, although at first glance that's quite a big project with a lot of unrelated things. We require your help to provide minimal, runnable repros as much as possible, since figuring out people's big projects can take a lot of time (this is one reason this wasn't investigated earlier).

In any case, I've opened #34809 to track this remaining report - please post there, and keep this issue for the captured parameter data problem discussed above.

yinzara added a commit to yinzara/efcore that referenced this issue Oct 3, 2024
Store only nullness and array lengths in struct form to prevent parameters memory leaks

Fixes dotnet#34028
yinzara added a commit to yinzara/efcore that referenced this issue Oct 3, 2024
Store only nullness and array lengths in struct form to prevent parameters memory leaks

Fixes dotnet#34028
yinzara added a commit to yinzara/efcore that referenced this issue Oct 3, 2024
Store only nullness and array lengths in struct form to prevent parameters memory leaks

Fixes dotnet#34028
yinzara added a commit to yinzara/efcore that referenced this issue Oct 4, 2024
Store only nullness and array lengths in struct form to prevent parameters memory leaks

Fixes dotnet#34028
roji added a commit that referenced this issue Oct 14, 2024
)

Store only nullness and array lengths in struct form to prevent parameters memory leaks

Fixes #34028

Co-authored-by: Shay Rojansky <[email protected]>
roji pushed a commit to roji/efcore that referenced this issue Oct 14, 2024
…net#34803)

Store only nullness and array lengths in struct form to prevent parameters memory leaks

Fixes dotnet#34028

Co-authored-by: Shay Rojansky <[email protected]>
(cherry picked from commit af420cd)
roji pushed a commit to roji/efcore that referenced this issue Oct 14, 2024
Store only nullness and array lengths in struct form to prevent parameters memory leaks

Fixes dotnet#34028

Co-authored-by: Shay Rojansky <[email protected]>

(cherry picked from commit af420cd)
roji pushed a commit to roji/efcore that referenced this issue Oct 14, 2024
Store only nullness and array lengths in struct form to prevent parameters memory leaks

Fixes dotnet#34028

Co-authored-by: Shay Rojansky <[email protected]>

(cherry picked from commit af420cd)
roji added a commit that referenced this issue Oct 14, 2024
)

Store only nullness and array lengths in struct form to prevent parameters memory leaks

Fixes #34028

(cherry picked from commit af420cd)

Co-authored-by: Matthew Vance <[email protected]>
roji added a commit that referenced this issue Oct 14, 2024
…mandCache (#34908)

Store only nullness and array lengths in struct form to prevent parameters memory leaks

Fixes #34028

Co-authored-by: Matthew Vance <[email protected]>
@ajcvickers
Copy link
Contributor

@roji Is this fixed? If so, which release?

@ajcvickers ajcvickers reopened this Nov 11, 2024
@roji roji added this to the 8.0.x milestone Nov 12, 2024
@roji
Copy link
Member

roji commented Nov 12, 2024

Good catch. This was backported to 8.0 via #34908, AFAIK should be out with the upcoming 8.0.9 8.0.11 - changed the milestone to 8.0.x (that's how we do it right?)

@ajcvickers
Copy link
Contributor

We use 8.0.x when we are taking it to Tactics but we don't have a approval for a specific release. If it is approved for 8.0.9, then it can go in the 8.0.9 milestone.

@roji roji modified the milestones: 8.0.x, 8.0.9, 8.0.11 Nov 12, 2024
@roji
Copy link
Member

roji commented Nov 12, 2024

OK. BTW I was wrong and the upcoming is 8.0.11, changed it to that.

@bleapleaper
Copy link

Probably fixed it for this type of cache, but still having issues.

I was unable to create a repro in a timely manner, but some investigation led me to QueryContext.cs _parameterValues which might be put into precompiled query cache. There is an Expression in the precompiled query cache which i struggled a little with.

private readonly IDictionary<string, object?> _parameterValues = new Dictionary<string, object?>();

Multiple consecutive imports will result in out of memory, we are doing dbContextOptions.UseMemoryCache(mc); MemoryCache.Clear() and GC.Collect() to work around this issue, but GC can be indeterminate sometimes. Running in azure kubernetes containers.

@roji roji closed this as completed Jan 23, 2025
@roji
Copy link
Member

roji commented Jan 23, 2025

I was unable to create a repro in a timely manner, but some investigation led me to QueryContext.cs _parameterValues which might be put into precompiled query cache.

When you say "precompiled query cache", are you referring to the "compiled query" feature (docs)? Or to the new "precompiled query" feature introduced in EF 9.0 for NativeAOT spuport (docs)? Or just to regular query functioning?

Multiple consecutive imports will result in out of memory, we are doing dbContextOptions.UseMemoryCache(mc); MemoryCache.Clear() and GC.Collect() to work around this issue, but GC can be indeterminate sometimes. Running in azure kubernetes containers.

I'm not sure I understand this... If GC.Collect() helps in any way, that means nothing is being referenced by EF, since the garbage collector only even collects objects which are completely unreferenced. In any case, can you provide some context on why you're using UseMemoryCache()?

Regardless, there simply isn't enough information here for us to investigate - I don't really understand from your description what problem you're investigating...

@bleapleaper
Copy link

I suspect QueryContext with _parameterValues somehow ends up in Microsoft.EntityFrameworkCore.Query.CompiledQueryCacheKeyGenerator.CompiledQueryCacheKey

About memory cache usage

  • UseMemoryCache() is so we can clear the contents and inspect what EF Core puts there
  • MemoryCache.Clear() clears cache - our workaround does not work without this step
  • GC.Collect() not strictly needed, but we wanted to clear unused immediately

@roji
Copy link
Member

roji commented Jan 23, 2025

I suspect QueryContext with _parameterValues somehow ends up in Microsoft.EntityFrameworkCore.Query.CompiledQueryCacheKeyGenerator.CompiledQueryCacheKey

Can you please try to put together a repro for that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants