RD-8272 implement json infer and parse #32

Merged
merged 7 commits into from
Jul 12, 2023

alexzerntev
Contributor

  • Added Json.InferAndParse. For this entry, a new sources package, raw-sources-in-memory, had to be built. It implements the ByteStreamLocation traits and, instead of reading from a URL like other Location objects, reads directly from memory. Its URI scheme is in-memory:.
  • Fixed some warnings
  • Removed deprecated sparkUrl field in Locations
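
A minimal sketch of the idea behind the in-memory location described in the first bullet, assuming a much simplified trait. `ByteStreamLocationSketch`, `InMemoryLocationSketch`, `getInputStream`, and `fromString` are hypothetical names used only for illustration; the project's real ByteStreamLocation traits are not shown in this PR excerpt and likely have more members.

```scala
import java.io.{ByteArrayInputStream, InputStream}
import java.nio.charset.StandardCharsets

// Hypothetical stand-in for the project's ByteStreamLocation traits
// (assumption: the real ones expose more members, e.g. retryStrategy).
trait ByteStreamLocationSketch {
  def rawUri: String
  def getInputStream: InputStream
}

// Serves bytes that are already held in memory instead of fetching them from a URL.
final class InMemoryLocationSketch(data: Array[Byte]) extends ByteStreamLocationSketch {
  override def rawUri: String = "in-memory:"

  // A fresh stream per call, so the same location can be read more than once.
  override def getInputStream: InputStream = new ByteArrayInputStream(data)
}

object InMemoryLocationSketch {
  // Convenience constructor for text payloads such as JSON documents.
  def fromString(text: String): InMemoryLocationSketch =
    new InMemoryLocationSketch(text.getBytes(StandardCharsets.UTF_8))
}
```

With this shape, a reader that only knows about byte streams can consume `InMemoryLocationSketch.fromString(json).getInputStream` exactly as it would a stream opened from a remote location.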


override def retryStrategy: RetryStrategy = NoRetry()

override def rawUri: String = "in-memory:"
Contributor

I'm torn here. I know we discussed this and agreed, but I wonder whether in-memory should instead be a base64 string encoding of the binary data. In that case, the data would be in the URL itself. It's probably silly and slower, so I'll leave it to your discretion.
If we'd like to expose "in-memory" locations to users, what would that look like?

Contributor Author

> I'm wondering if we shouldn't instead have in-memory be a base64 string encoding of a binary

That seems equivalent to me, but if the string is large enough we may pay extra encode/decode overhead.

> If we'd like to expose "in-memory" locations to users, how would that look like?

JsonInferAndRead("in-memory://" + Base64.Encode("data"))

JsonInferAndRead(InMemoryLocation.Build(string data, string encoding))

Something like that? It seems a little complex, though.
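
To make the base64 alternative discussed above concrete, here is a rough sketch of carrying the payload inside the URL. `InMemoryUrlSketch`, `encode`, and `decode` are hypothetical helpers, not part of this PR; the `in-memory://` prefix is taken from the example in the comment, and the URL-safe base64 alphabet is an assumption.

```scala
import java.util.Base64

// Hypothetical helper: packs a payload into an "in-memory://<base64>" URL
// and recovers it again. Not the API introduced by this PR.
object InMemoryUrlSketch {
  private val Prefix = "in-memory://"

  // URL-safe alphabet, so the result contains no '+' or '/' characters.
  def encode(data: Array[Byte]): String =
    Prefix + Base64.getUrlEncoder.encodeToString(data)

  def decode(url: String): Array[Byte] = {
    require(url.startsWith(Prefix), s"not an in-memory URL: $url")
    Base64.getUrlDecoder.decode(url.stripPrefix(Prefix))
  }
}
```

The overhead mentioned in the reply is visible in the sizes: padded base64 emits 4 characters for every 3 input bytes, so a 300-byte payload becomes a 400-character string, on top of one encode pass and one decode pass.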

alexzerntev merged commit f59dfdb into main Jul 12, 2023
alexzerntev deleted the RD-8272-implement-json-infer-and-parse branch July 12, 2023 08:03