Dolt serializes and deserializes JSON unnecessarily. #7749

Closed
nicktobey opened this issue Apr 16, 2024 · 0 comments
Labels: bug, performance

Currently, when running a query on a JSON column, the following sequence of events happens:

  • tree.GetField loads the entire on-disk representation of the JSON document into memory by calling tree.JSONDoc.byte(). This representation is currently a JSON string.
  • tree.GetField converts this string into a types.JSONDocument by deserializing the string into a Go map.
  • Before being displayed to the user or sent over the wire by the server, the document must be re-serialized into a string.
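
The round trip described in the steps above can be sketched with the standard library alone. This is a minimal illustration, not Dolt's code: roundTrip and passThrough are hypothetical names, and encoding/json stands in for whatever parser Dolt actually uses.

```go
package main

import (
	"encoding/json"
)

// roundTrip mimics the read path described above: the stored JSON text is
// parsed into a generic Go value and then serialized again for output.
// Hypothetical stand-in, not Dolt's actual implementation.
func roundTrip(stored []byte) ([]byte, error) {
	var doc interface{}
	if err := json.Unmarshal(stored, &doc); err != nil {
		return nil, err
	}
	return json.Marshal(doc)
}

// passThrough returns the stored bytes untouched. This is all a query needs
// when it never inspects or modifies the document.
func passThrough(stored []byte) ([]byte, error) {
	return stored, nil
}
```

For a document such as `{"a":1,"b":[true,null,"x"]}`, both functions return the same text, but roundTrip allocates a map, a slice, and boxed values along the way, while passThrough does no work proportional to the document size.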

If the document is not inspected or modified by the query, we are deserializing and re-serializing the document unnecessarily. I have benchmarks where this round trip accounts for 30% of the runtime of a simple select query.

We should have an optimization that detects when the structure of a JSON column is not actually needed for resolving a query, and treats the document as a text string instead.
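
One possible shape for this, sketched under assumptions: a wrapper that holds the stored text and parses it only on first structural access. LazyJSON, NewLazyJSON, Text, and Value are hypothetical names and not part of Dolt or go-mysql-server; the sketch also assumes the stored text is already in the form that should be returned to the client.

```go
package main

import (
	"encoding/json"
	"sync"
)

// LazyJSON is a hypothetical wrapper that keeps the stored JSON text and
// defers parsing until the structure of the document is actually needed.
type LazyJSON struct {
	raw    []byte
	once   sync.Once
	parsed interface{}
	err    error
	parses int // number of times the text was parsed, kept for illustration
}

// NewLazyJSON wraps stored JSON text without parsing it.
func NewLazyJSON(raw []byte) *LazyJSON {
	return &LazyJSON{raw: raw}
}

// Text returns the stored text as-is. A query that only selects the column
// can use this path and never pay for deserialization.
func (d *LazyJSON) Text() string {
	return string(d.raw)
}

// Value parses the text on first use and caches the result, so queries that
// inspect the document still pay the parsing cost at most once.
func (d *LazyJSON) Value() (interface{}, error) {
	d.once.Do(func() {
		d.parses++
		d.err = json.Unmarshal(d.raw, &d.parsed)
	})
	return d.parsed, d.err
}
```

With this design a plain `SELECT` of the column would call only Text, while a JSON function that needs the document's structure would call Value. The alternative of deciding at plan time whether the structure is needed avoids the wrapper but requires the analyzer to know how each expression uses the column.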
