Performance
This page has tips for getting better performance out of Nevermore.
Nevermore 12.0+ is generally faster for most queries, and with larger documents, uses much less memory. If you're on an older version, consider upgrading. See what's new in Nevermore 12.0+ for benchmarks.
Nevermore is optimized particularly around reading, with a focus on being fast (similar to hand-coded `SqlCommand` speed) and with very few large memory allocations. This section explains a bit about how it all works.
When querying or loading documents, Nevermore uses a different "strategy" depending on the type being queried (primitive, value tuple, arbitrary class, or a mapped document). Each strategy behaves a little differently, but the core idea is the same.
When Nevermore calls `ExecuteReader`, it specifies the `SequentialAccess` command behavior for the data reader, which allows Nevermore to read and process data as it streams in from SQL Server without buffering the entire row. This is especially important for large documents. Each Nevermore strategy ensures data is only read in the correct order (i.e., field 2 is read before field 3, and so on).
Most strategies keep a cache of queries they've seen before, and a compiled func that knows how to execute the reader for that particular query.
- For primitives, it creates a func for each type you query (`int`, `string`)
- For ValueTuples, it creates a func for each tuple combination you query (`ValueTuple<int, int>`, `ValueTuple<string, int>`)
- For arbitrary classes (without a document map) and for documents, it creates one per combination of type and returned columns
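To make the caching concrete, here's a rough sketch (not Nevermore's actual code - the class name and key shape are illustrative) of how a compiled reader func can be cached and reused:

```csharp
using System;
using System.Collections.Concurrent;
using System.Data.Common;

// Illustrative only: each strategy keeps something shaped like this - a cache
// keyed by the query's type/column combination, holding the compiled reader.
static class CompiledReaderCache<T>
{
    static readonly ConcurrentDictionary<string, Func<DbDataReader, T>> Cache = new();

    public static Func<DbDataReader, T> GetOrCompile(
        string columnSignature,                  // e.g. "Id,FirstName,Age,JSON"
        Func<Func<DbDataReader, T>> compile)     // the slow expression-tree build
        => Cache.GetOrAdd(columnSignature, _ => compile());
}
```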
The compiled func will look as much as possible as if it was written by hand. For example, when reading a document with two mapped columns, it will generate and compile something like this:
```csharp
(DbDataReader reader, DocumentReaderContext context) =>
{
    var temp0 = reader.GetString(0);                                   // Id
    var temp1 = reader.GetString(1);                                   // FirstName
    var temp2 = reader.IsDBNull(2) ? (int?)null : reader.GetInt32(2);  // Age
    var result = context.DeserializeText<Contact>(reader, 7);          // JSON column
    result.Id = temp0;
    result.FirstName = temp1;
    result.Age = temp2;
    return result;
}
```
The above is overly simplified - there are extra null checks (e.g., when a column contains `DBNull.Value`), and some field assignments to keep track of which column is being read in case there's an error, and so on. But you get the picture. Note that each field is read into a temporary variable, because we can't deserialize until we encounter the JSON column, and the JSON column is usually not at the start of the query.
If you're curious, look at `DocumentReaderExpressionBuilder` in the source code. Also, if your query causes an exception, the error message will include the source code for the generated reader in case it isn't what you expect.
The compiled func is fast, but building it is slow, so the result is cached for future queries.
Nevermore can store JSON either as text, or in a compressed format.
For JSON as text, there are two strategies: one when we know there will always be small documents, and one if we think there could be large documents.
- Small documents: Nevermore simply calls `DbDataReader.GetString()` and then passes the result to JSON.NET for deserialization
- Large documents: Nevermore uses a custom `TextReader` that calls through to `DbDataReader.GetChars()`, which allows it to read and process the text directly into a buffer as needed, without loading the entire document into memory
The benefit of the large document approach is significant: Nevermore can completely avoid LOH allocations and keep overall memory usage down. However, for small documents (1K or smaller) it's slightly slower, which is why there are two strategies.
How does Nevermore know which strategy to use? It looks at an `ExpectLargeDocuments` property on the document map. If it's `true`, Nevermore always uses the large document strategy.
By default, `ExpectLargeDocuments` is `false` - but don't worry. If Nevermore reads a string larger than 1K when deserializing, it sets `ExpectLargeDocuments` to `true` and remembers it for next time (until the application restarts). In this way, Nevermore is self-tuning.
This means:
- If you know for sure that the table will have at least a few large documents, set `ExpectLargeDocuments` to `true` (as sketched below).
- If you are very confident it won't, or at least that for some customers it won't, leave it as `false`.
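For example, a document map for such a table might look something like this - a minimal sketch, where the `Contact` class, its columns, and the exact map syntax are illustrative and may differ slightly between Nevermore versions:

```csharp
public class ContactMap : DocumentMap<Contact>
{
    public ContactMap()
    {
        Column(c => c.FirstName);
        Column(c => c.Age);

        // This table is known to contain large documents, so always use the
        // streaming (large document) reader instead of waiting for Nevermore
        // to discover that and self-tune at runtime.
        ExpectLargeDocuments = true;
    }
}
```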
In Octopus Deploy's case, roughly 80% of documents fit the small category (1K or under). Looking at large customer databases, only certain tables tend to have clusters of large documents.
For compressed JSON, Nevermore calls `GetStream` on the data reader and sends it through a `GZipStream` to decompress it on the way to the JSON.NET deserializer.
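Roughly, that read path looks like the sketch below - this isn't Nevermore's actual code, and the column index, the `Contact` type, and the error handling are simplified assumptions:

```csharp
using System.Data.Common;
using System.IO;
using System.IO.Compression;
using Newtonsoft.Json;

static Contact ReadCompressedJson(DbDataReader reader, int jsonColumn)
{
    // GetStream avoids materialising the whole blob; GZipStream decompresses
    // as the JSON.NET reader pulls characters through it.
    using var compressed = reader.GetStream(jsonColumn);
    using var gzip = new GZipStream(compressed, CompressionMode.Decompress);
    using var text = new StreamReader(gzip);
    using var json = new JsonTextReader(text);
    return new JsonSerializer().Deserialize<Contact>(json);
}
```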
Nevermore also makes use of array pools when possible to reduce allocations, though all buffers used are quite small anyway.
Once the expression is compiled, the actual reading is simply done one row at a time by calling the compiled func.
In general, most optimization work in Nevermore is focused on reading data. For writing, the focus is mostly on memory use, especially with large documents.
When inserting or updating documents, Nevermore simply generates a SQL statement with the fields you'd expect, generates the command arguments, and executes it. The properties are read from the object into the `SqlCommand` using compiled funcs (one for each property, cached on the document map).
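As a sketch of the technique (not Nevermore's actual implementation - the helper name and the way parameters are added are assumptions), a per-property getter can be compiled once with expression trees and then reused:

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

static class PropertyGetters
{
    // Compile (object instance) => (object)((TDeclaring)instance).Property once,
    // then reuse it for every insert/update of that document type.
    public static Func<object, object> Compile(PropertyInfo property)
    {
        var instance = Expression.Parameter(typeof(object), "instance");
        var body = Expression.Convert(
            Expression.Property(Expression.Convert(instance, property.DeclaringType), property),
            typeof(object));
        return Expression.Lambda<Func<object, object>>(body, instance).Compile();
    }
}

// Usage (hypothetical): read the mapped column value for a command parameter.
// var getAge = PropertyGetters.Compile(typeof(Contact).GetProperty("Age"));
// command.Parameters.AddWithValue("Age", getAge(contact) ?? DBNull.Value);
```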
When serializing JSON (as text or compressed), Nevermore first has to serialize the document to a memory stream. For this, we use `RecyclableMemoryStream`, which in testing works much better, especially when documents can be quite large. The memory stream is then passed to the `SqlCommand` as a stream - when the command is executed, the stream is read. Again, this means there should be no LOH allocations when dealing with even very large documents.
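Sketched out for the compressed case, the write path looks roughly like this - the `JSONBlob` column name, the parameter handling, and the document type are illustrative, not Nevermore's actual code:

```csharp
using System.Data;
using System.IO;
using System.IO.Compression;
using Microsoft.Data.SqlClient;
using Microsoft.IO;
using Newtonsoft.Json;

static class JsonWriteSketch
{
    static readonly RecyclableMemoryStreamManager StreamManager = new();

    public static void AddJsonParameter(SqlCommand command, object document)
    {
        // Serialize (and compress) into a pooled, chunked memory stream so no
        // single large string or byte[] is ever allocated.
        var buffer = StreamManager.GetStream();
        using (var gzip = new GZipStream(buffer, CompressionLevel.Optimal, leaveOpen: true))
        using (var writer = new StreamWriter(gzip))
        {
            new JsonSerializer().Serialize(writer, document);
        }

        buffer.Position = 0;
        // SqlClient reads the stream as the command executes.
        command.Parameters.Add("JSONBlob", SqlDbType.VarBinary, -1).Value = buffer;
    }
}
```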
When JSON is compressed, Nevermore uses `GZipStream` with the compression level set to `Optimal` by default. This is because we assume the SQL Server database will be on another server, and the goal is to reduce network I/O on large documents. Documents are written infrequently but read often, so this seems like a good tradeoff. You can, however, change the compression level if you need to.
Beyond reading and writing documents, Nevermore has a few other operations - `ExecuteQuery`, `ExecuteReader` and so on. These all call the corresponding method on `SqlCommand`, so there isn't much to optimize.
Nevermore uses JSON.NET by default. You could instead use the `System.Text.Json` serializer, which seems to use less memory. You can do this by setting your own `IDocumentSerializer` on the store configuration. However, `System.Text.Json` assumes everything you do is in UTF-8, which isn't how SQL Server stores `nvarchar` text, so you might waste time converting anyway.
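Wiring in a custom serializer looks roughly like this - a sketch only, where `MySystemTextJsonSerializer` is a hypothetical class implementing `IDocumentSerializer`, and the exact configuration property may vary by Nevermore version:

```csharp
var configuration = new RelationalStoreConfiguration(connectionString)
{
    // Replace the default JSON.NET-based serializer with your own implementation.
    DocumentSerializer = new MySystemTextJsonSerializer()
};

var store = new RelationalStore(configuration);
```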
Consider enabling compression
If the time spent reading and writing JSON from the database and transferring it over the network between the application and SQL Server is a concern, consider enabling compression. This makes a very big difference in bandwidth-constrained environments, especially for larger documents.
TODO: Include
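As a sketch (the `JsonStorageFormat` setting shown here is an assumption - check the document map options for your Nevermore version), compression is opted into per document map:

```csharp
public class MyDocumentMap : DocumentMap<MyDocument>
{
    public MyDocumentMap()
    {
        // Store the JSON blob compressed rather than as plain text.
        JsonStorageFormat = JsonStorageFormat.CompressedOnly;
    }
}
```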
Avoid Stream with projection
There is an overload of `Stream` that takes a callback, which is used when doing a projection: you query with a join, then map the result to two (or three, or four) documents. It works well when the result is small (or when you query an arbitrary class), but not for large documents. This is because the data reader cannot use `SequentialAccess` (as your projection may read columns out of order), so the entire row must be read into memory.
Large strings/arrays
Nevermore works hard to avoid any LOH allocations (objects 85K or more) by streaming and buffering everything it can. However, if your document includes one huge allocation anyway - e.g., because it has an 85K HTML string embedded in it, or a giant array of ints - then those allocations will still be needed.
For this kind of design, you might want to consider a separate table: keep the document JSON (which is always fully deserialized) in one table, and store the huge string or array in a second table that you only fetch when the full data is needed (of course, maybe this data makes more sense to store on disk anyway!).
If you call `IRelationalTransaction.ExecuteReader` you'll get raw access to a data reader - you could use this, for example, to read the text using `GetChars` in chunks and write it to the HTML response directly, without ever allocating a large string.
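A chunked copy looks something like this - a sketch, assuming the query was executed with `SequentialAccess`, and where the column index and destination writer are illustrative:

```csharp
using System.Data.Common;
using System.IO;

static void CopyLargeText(DbDataReader reader, int column, TextWriter destination)
{
    var buffer = new char[8192]; // well under the 85K LOH threshold
    long offset = 0;
    long read;
    while ((read = reader.GetChars(column, offset, buffer, 0, buffer.Length)) > 0)
    {
        destination.Write(buffer, 0, (int)read);
        offset += read;
    }
}
```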