User parameter data is referenced inside RelationalCommandCache #34028
Comments
This issue is lacking enough information for us to be able to fully understand what is happening. Please attach a small, runnable project or post a small, runnable code listing that reproduces what you are seeing so that we can investigate. |
Sure! I created this example repository with an implementation very close to my real scenario. When I run my application, just the devops/ping route triggered by k8s is enough to cause a constant increase in memory usage. You can simulate these pings directly from the browser console: setInterval(() => fetch('/v1/devops/ping'), 10000);
setInterval(() => fetch('/v1/devops/ping'), 30000); |
Has this issue been looked at or given any time and effort? We can reproduce large memory usage, but we have yet to create a reproducible example. |
Something similar is happening in my app; EF Core version: 8.0.6 or EF Core version: 8.0.4 The setup is simple: the database holds 3 tables, 2 of which are related through a one-to-many mapping. |
Executed a couple of extra tests and
Anyone who has memory issues, please check the affected apps with pooling. |
Note for team: we need to decide how to best investigate this kind of issue. My usual process doesn't work here, so I'd like to get advice from the team on how to proceed. |
@renanhgon Can you please post the |
@ajcvickers Semi-similar issue where my stuff is kept in memory. Used PerfView (first time) to find that RelationalCommandCache uses my object in a cache key. No idea how it ended up there xD. However, it would be nice if there was an option to turn on some debug messages when saving abnormal types to a cache, whether MemoryCache or static. Edit: never mind. The memory cache says no entries but the Visual Studio heap view says two entries - no idea what is going on. Anyway, ServiceProviderCache or some other cache is holding back 1 million ...Core.Update.ColumnModification and 13000 entities from being garbage collected |
@ajcvickers found the bug: inspect IMemoryCache at a breakpoint and you'll see what I mean. Lists of Guids and all different kinds of parameter values are put in IMemoryCache as CommandCacheKey. Only the type info is needed for the cache key, not the actual parameter value. These objects can link to other objects, which link to other objects, thereby keeping all of them in memory. You'll have to use optionsBuilder.UseMemoryCache(_memoryCache); to be able to inspect this |
Only store necessary info for paramter values RelationalCommandCacheKey to prevent memory leaks. Fixes dotnet#34028
I have now been able to reliably reproduce this issue and am producing a PR to fix it. Thank you to @cliffankh0z for the PR that nearly fixed it already. The following test case should work on both .NET 8 and .NET 9 (and is included in my PR) so it can be ported back to the It shows the problem exists and now is fixed.
public class RelationalCommandCacheTest
{
    [ConditionalFact]
    public void MemoryLeak()
    {
        // Setup:
        // allow 1000 records to exist for a sizeable enough cache to see the problem
        const int cacheSize = 1000;
        // create a parameter expression and mock sql generator and processors to pass into all relational command cache creations
        var param = Expression.Parameter(typeof(string));
        var expression = Expression.Lambda<Func<string, string>>(param, param); // _ => _
        var memoryCache = new MemoryCache(Options.Create(new MemoryCacheOptions { SizeLimit = cacheSize }));
        var generator = new TestSqlGeneratorFactory();
        var processor = new TestSqlProcessorFactory();
        var cache = new RelationalCommandCache(memoryCache, generator, processor, expression, false
#if !NET8_0
            , null!
#endif
        );
        // fill the cache completely once for a good baseline memory size
        RunTest();
        var memoryBefore = GC.GetTotalMemory(true);
        // Act:
        // attempt to look up the records again which should not increase the memory needed for the cache
        // even though we will have all cache misses and new objects created with each lookup
        RunTest();
        var memoryAfter = GC.GetTotalMemory(true);
        // Verify:
        // memory footprint has not increased/decreased by more than a small margin of error (128k)
        var difference = Math.Abs(memoryAfter - memoryBefore) / 1024;
        Assert.True(difference < 128, $"Memory leak detected in RelationalCommandCache: {difference}kb");
        return;

        void RunTest()
        {
            for (var i = 0; i < cacheSize; i++) // run the size limit of the cache to completely replace all entries
            {
                var parameters = new ReadOnlyDictionary<string, object?>(new Dictionary<string, object?>
                {
                    // allocate a large object array with each parameter creation to exacerbate the problem
                    // use a random key to cause a consistent cache miss
                    [Guid.NewGuid().ToString()] = Enumerable.Range(1, 5000).Select(j => $"{j}").ToArray<object>()
                });
                cache.GetRelationalCommandTemplate(
                    parameters); // ignore the result so it gets garbage collected
            }
        }
    }

    private class TestSqlProcessorFactory : IRelationalParameterBasedSqlProcessorFactory
    {
#if NET8_0
        public RelationalParameterBasedSqlProcessor Create(bool useRelationalNulls)
#else
        public RelationalParameterBasedSqlProcessor Create(RelationalParameterBasedSqlProcessorParameters parameters)
#endif
            => new TestRelationalParameterBasedSqlProcessor();
    }

    private class TestRelationalParameterBasedSqlProcessor : RelationalParameterBasedSqlProcessor
    {
        internal TestRelationalParameterBasedSqlProcessor() : base(null!, null!) {}

        public override Expression Optimize(
            Expression queryExpression,
            IReadOnlyDictionary<string, object?> parametersValues,
            out bool canCache)
        {
            canCache = true;
            return null!;
        }
    }

    private class TestSqlGeneratorFactory : IQuerySqlGeneratorFactory
    {
        public QuerySqlGenerator Create()
            => new TestQuerySqlGenerator();
    }

    private class TestQuerySqlGenerator() : QuerySqlGenerator(new QuerySqlGeneratorDependencies(null!, null!))
    {
        public override IRelationalCommand GetCommand(Expression queryExpression)
            => new TestRelationalCommand();
    }

    private class TestRelationalCommand : IRelationalCommand
    {
        public string CommandText
            => throw new NotImplementedException();

        public IReadOnlyList<IRelationalParameter> Parameters
            => throw new NotImplementedException();

        public DbCommand CreateDbCommand(RelationalCommandParameterObject parameterObject, Guid commandId, DbCommandMethod commandMethod)
            => throw new NotImplementedException();

        public int ExecuteNonQuery(RelationalCommandParameterObject parameterObject)
            => throw new NotImplementedException();

        public Task<int> ExecuteNonQueryAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken = default)
            => throw new NotImplementedException();

        public object ExecuteScalar(RelationalCommandParameterObject parameterObject)
            => throw new NotImplementedException();

        public Task<object?> ExecuteScalarAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken = default)
            => throw new NotImplementedException();

        public RelationalDataReader ExecuteReader(RelationalCommandParameterObject parameterObject)
            => throw new NotImplementedException();

        public Task<RelationalDataReader> ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken = default)
            => throw new NotImplementedException();

        public void PopulateFrom(IRelationalCommandTemplate commandTemplate)
            => throw new NotImplementedException();
    }
} |
@yinzara and others, I can indeed see that RelationalCommandCache retains references to user data, which isn't ideal. But I'd like to make sure we're on the same page in terms of whether there's an actual leak here.
EF instantiates a RelationalCommandCache for each LINQ query it compiles. The point of that cache is to have one entry for each permutation of parameter nullability values; that is, if there are two parameters in the query, there should always be at most 4 entries in that query's cache. In other words, you can execute the same query as many times as you want with as many values as you want, and it will only ever retain the user values from the first four invocations. It's certainly not great that the user values are retained, but I'm not seeing a leak here in the regular sense.
Now, if you run different queries, you'll get different instances of RelationalCommandCache, each of which will retain some user values. The total number of RelationalCommandCache instances has an upper limit though, and old ones will get evicted. So again, I don't see a critical leak here either.
@yinzara your repro above creates an unbounded number of RelationalCommandCache instances, which doesn't correspond to how EF actually works.
To be sure, I think we should improve things, but at this point I'm still not seeing a critical leak that needs to be fixed and backported. If I've missed something, an EF repro - rather than the artificial, unit-test-like repro that @yinzara wrote above - would help here. |
Btw I updated my example to only create a single instance of RelationalCommandCache and the memory leak still occurs. The issue isn't with the instances of RelationalCommandCache; the issue is the upper limit of the total number of IRelationalCommandTemplate(s) that get stored in the IMemoryCache during the lookup. The capacity of which is only constrained by whatever the
Yes, I agree that within a single query plan, the cache will not grow unless you change the nullability of a parameter or the array length of a parameter. However, for every new Expression that is planned (every time a query has a new structure), you will have an instance of the parameters stored in active memory. These parameters could be references to objects loaded from the database that then reference tons of others, leaving vast amounts of memory used by this cache. In the end you will fill the entire IMemoryCache with these leaked objects of unknown memory size. If someone has a piece of badly written code that recompiles the Expression with each request, this will become a serious problem very quickly. Even
So the default behavior, if no configuration parameters are set, is to have a cache that grows continually as new command queries are made on the instance and is only constrained by the default SizeLimit of 10240, which means 1024 entries will be maintained. 1024 entries could be megabytes (or more) of data depending on what's in the query parameters, and that is something the application developer would have no direct control over. All of it is completely wasted memory. My test case only set the
Forgive me but that seems like a showstopper to me. |
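The point above, that within one query the cache should only vary by parameter nullability or array length, can be illustrated with a minimal sketch. This is not EF Core's actual cache key; `ParameterShapeKey`, `Describe` and `Shape` are hypothetical names introduced purely for illustration. The idea is that a key built from the "shape" of each parameter holds no reference to the user's values, so those values remain collectable:

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration only; this is NOT EF Core's actual cache key type.
public static class ParameterShapeKey
{
    // Builds a key from parameter names plus each value's shape
    // (null, array of a given length, or any other non-null value).
    // The returned string holds no reference to the parameter values themselves.
    public static string Describe(IReadOnlyDictionary<string, object?> parameters)
        => string.Join(
            ";",
            parameters
                .OrderBy(p => p.Key, StringComparer.Ordinal)
                .Select(p => p.Key + ":" + Shape(p.Value)));

    private static string Shape(object? value)
        => value switch
        {
            null => "null",
            Array array => "array[" + array.Length + "]",
            _ => "value"
        };
}
```

Under these assumptions, two invocations with different values but the same nullness and array lengths map to the same key (for example `ids:array[2];name:value`), while passing null for `name` produces a different key.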
One note about my PR though: I'm not sure the effect being described in the OP's example repository is fixed by my PR. If that example is actually causing a memory leak, it doesn't even execute any queries. It just starts a transaction. So I'm not sure how my fix could help it. They may both be problems. I don't know. |
Yes, that's right.
Objects with references to other objects generally cannot be parameters, since only scalar types supported by the database can be query parameters. The problem here is with parameters which are huge strings/byte arrays.
This case of badly-written code creates lots of problems far beyond the capturing of user parameter data we're describing; the constant recompilation itself is a huge slowdown that is likely to be far more problematic than the captured data (and crucially, the recompilation happens no matter what in that case, whereas the parameter data capture is meaningful only with huge strings/byte arrays). So this case of badly-written code would stay highly problematic whether we fix this issue or not.
It's indeed possible to create contrived cases where the parameter data capture is very problematic. But at the end of the day, the problem here really is when the user executes lots of different queries, and those queries all have huge parameter data. I do agree this needs to be fixed - and probably also that this is worth backporting to 8.0 and 9.0 - but it's important to realize the exact scope of the problem and not exaggerate it. For example, this behavior has been identical since EF Core 1.0, and nobody has raised it so far (huge parameter data is generally quite rare and can cause other issues). I'll bring this up with the team to decide what we do. |
@roji and what about what I mentioned above? As I had - and probably still have - a memory leak issue with EF, the workaround is to have context pooling enabled. |
@peterkiss1 this issue has indeed bifurcated to discuss multiple different things. As we have a clear problem with the capturing of parameter data, I'm repurposing this issue to track that. For the other issues, we have @renanhgon's repro here, although at first glance that's quite a big project with a lot of unrelated things. We require your help to provide minimal, runnable repros as much as possible, since figuring out people's big projects can take a lot of time (this is one reason this wasn't investigated earlier). In any case, I've opened #34809 to track this remaining report - please post there, and keep this issue for the captured parameter data problem discussed above. |
Store only nullness and array lengths in struct form to prevent parameters memory leaks Fixes dotnet#34028
) Store only nullness and array lengths in struct form to prevent parameters memory leaks Fixes #34028 Co-authored-by: Shay Rojansky <[email protected]>
…net#34803) Store only nullness and array lengths in struct form to prevent parameters memory leaks Fixes dotnet#34028 Co-authored-by: Shay Rojansky <[email protected]> (cherry picked from commit af420cd)
Store only nullness and array lengths in struct form to prevent parameters memory leaks Fixes dotnet#34028 Co-authored-by: Shay Rojansky <[email protected]> (cherry picked from commit af420cd)
) Store only nullness and array lengths in struct form to prevent parameters memory leaks Fixes #34028 (cherry picked from commit af420cd) Co-authored-by: Matthew Vance <[email protected]>
…mandCache (#34908) Store only nullness and array lengths in struct form to prevent parameters memory leaks Fixes #34028 Co-authored-by: Matthew Vance <[email protected]>
@roji Is this fixed? If so, which release? |
Good catch. This was backported to 8.0 via #34908, AFAIK should be out with the upcoming |
We use 8.0.x when we are taking it to Tactics but we don't have an approval for a specific release. If it is approved for 8.0.9, then it can go in the 8.0.9 milestone. |
OK. BTW I was wrong and the upcoming is 8.0.11, changed it to that. |
Probably fixed it for this type of cache, but still having issues. I was unable to create a repro in a timely manner, but some investigation led me to QueryContext.cs _parameterValues, which might be put into the precompiled query cache. There is an Expression in the precompiled query cache which I struggled a little with. efcore/src/EFCore/Query/QueryContext.cs Line 24 in 6119066
Multiple consecutive imports will result in out of memory, we are doing dbContextOptions.UseMemoryCache(mc); MemoryCache.Clear() and GC.Collect() to work around this issue, but GC can be indeterminate sometimes. Running in azure kubernetes containers. |
When you say "precompiled query cache", are you referring to the "compiled query" feature (docs)? Or to the new "precompiled query" feature introduced in EF 9.0 for NativeAOT support (docs)? Or just to regular query functioning?
I'm not sure I understand this... If GC.Collect() helps in any way, that means nothing is being referenced by EF, since the garbage collector only ever collects objects which are completely unreferenced. In any case, can you provide some context on why you're using Regardless, there simply isn't enough information here for us to investigate - I don't really understand from your description what problem you're investigating... |
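The reachability argument above can be checked with a small sketch that uses only standard .NET APIs (`WeakReference`, `GC.Collect`). The `ReachabilityDemo` class and its static `Cache` list are hypothetical stand-ins for any long-lived cache and are not EF Core types. An object that is still strongly referenced from a cache survives a forced collection, so if `GC.Collect()` frees memory, that memory was not being held by such a reference:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical demo class; not part of EF Core.
public static class ReachabilityDemo
{
    // Stand-in for any long-lived cache that keeps strong references.
    private static readonly List<object> Cache = new();

    // Allocates in a separate, non-inlined method so that no local variable
    // in the caller keeps the payload alive by accident.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference Allocate(bool cached)
    {
        var payload = new byte[1024 * 1024];
        if (cached)
        {
            Cache.Add(payload); // strong reference from a static root
        }

        return new WeakReference(payload);
    }

    // Returns whether the payload is still alive after a forced full collection.
    public static bool SurvivesCollection(bool cached)
    {
        var weak = Allocate(cached);
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return weak.IsAlive;
    }
}
```

`SurvivesCollection(true)` returns true because the static list still references the payload. With `cached: false` the payload is typically collected, although the runtime does not strictly guarantee when that happens.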
I suspect QueryContext with _parameterValues somehow ends up in Microsoft.EntityFrameworkCore.Query.CompiledQueryCacheKeyGenerator.CompiledQueryCacheKey About memory cache usage
|
Can you please try to put together a repro for that? |
When I start my application, gradually, as processes are executed, the memory consumption just keeps increasing until my k8s pod simply dies due to lack of memory and gets restarted.
After analyzing the generated dumps, we observed the following:
To ensure that the dispose method of my DbContext is always executed, some logs were added during its creation and disposal.
And from these logs, we were able to confirm that the object is indeed always created and disposed without any problems.
In some previous issues, I observed some possible causes within the OnConfiguring method, but I believe that is not the problem in my case.
My DbContext instance is resolved in this way:
The issue was created here because Microsoft.EntityFrameworkCore.Internal.ServiceProviderCache is indicated. But perhaps the problem might be related to the GarbageCollector?
EF Core version: 8.0.6
Database provider: Npgsql.EntityFrameworkCore.PostgreSQL
Target framework: NET 8.0
Operating system: Windows 10
IDE: Visual Studio 2022 17.10.1
My container is running with this base image: mcr.microsoft.com/dotnet/aspnet:8.0