Dolt serializes and deserializes JSON unnecessarily. #7749

nicktobey · 2024-04-16T15:29:11Z

Currently, when running a query on a JSON column, the following sequence of events happens:

tree.GetField loads the entire on-disk representation of the JSON document into memory by calling tree.JSONDoc.byte(). This representation is currently a JSON string.
tree.GetField converts this string into a types.JSONDocument by deserializing the string into a Go map.
Before being displayed to the user or sent over the wire by the server, the document must be reserialized into a string.
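
The round trip described in the steps above can be sketched with the standard library alone. This is a minimal illustration, not Dolt's actual code path: `roundTrip` is a hypothetical helper, and `encoding/json` stands in for whatever parser Dolt uses internally.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTrip is a hypothetical stand-in for the eager path: it parses the
// stored JSON text into Go values (maps, slices, scalars) and then
// serializes those values back into text.
func roundTrip(stored []byte) ([]byte, error) {
	var doc interface{}
	if err := json.Unmarshal(stored, &doc); err != nil {
		return nil, err
	}
	return json.Marshal(doc)
}

func main() {
	stored := []byte(`{"a":1,"b":[true,null,"x"]}`)
	out, err := roundTrip(stored)
	if err != nil {
		panic(err)
	}
	// For a query that never inspects the document, all of the parsing
	// and marshaling work above only reproduces the input text.
	fmt.Println(string(out) == string(stored))
}
```

For this input the output text equals the stored text, so the allocation and parsing are pure overhead whenever the query only passes the column through.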

If the document is not inspected or modified by the query, we are deserializing and reserializing the document unnecessarily. I have benchmarks where this is 30% of the runtime for a simple select query.

We should have an optimization that detects when the structure of a JSON column is not actually needed for resolving a query, and treats the document as a text string instead.
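
One way to get this behavior is a wrapper that keeps the stored text and parses it only on first structural access. The sketch below assumes that approach; `lazyDoc`, `newLazyDoc`, `Text`, and `Value` are hypothetical names for illustration and are not the API of Dolt or go-mysql-server.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// lazyDoc is a hypothetical wrapper around stored JSON text. It defers
// parsing until a caller actually needs the document's structure.
type lazyDoc struct {
	raw    []byte
	once   sync.Once
	val    interface{}
	err    error
	parses int // number of times parsing ran; kept only for illustration
}

func newLazyDoc(raw []byte) *lazyDoc {
	return &lazyDoc{raw: raw}
}

// Text returns the stored text without parsing it. This is the path a
// pass-through SELECT would take.
func (d *lazyDoc) Text() string {
	return string(d.raw)
}

// Value parses the text on the first call and caches the result, so
// repeated structural access pays the parsing cost at most once.
func (d *lazyDoc) Value() (interface{}, error) {
	d.once.Do(func() {
		d.parses++
		d.err = json.Unmarshal(d.raw, &d.val)
	})
	return d.val, d.err
}

func main() {
	d := newLazyDoc([]byte(`{"name":"dolt","tags":["sql","git"]}`))

	// Pass-through: no parsing has happened yet.
	fmt.Println(d.Text(), d.parses)

	// Structural access: parsing happens here, once.
	v, err := d.Value()
	if err != nil {
		panic(err)
	}
	fmt.Println(v.(map[string]interface{})["name"], d.parses)
}
```

A trade-off of this design is that the text path never validates the document, so it relies on the JSON having been validated when it was written.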


nicktobey self-assigned this Apr 16, 2024

timsehn added the bug and performance labels Apr 16, 2024

This was referenced Apr 24, 2024

Add LazyJSONDocument, which wraps a JSON string and only deserializes it if needed. dolthub/go-mysql-server#2470

Merged

Use LazyJSONDocument when reading from a JSON column. #7785

Merged

nicktobey closed this as completed May 2, 2024

BrewTestBot mentioned this issue May 3, 2024

dolt 1.35.13 Homebrew/homebrew-core#170760

Merged
