chore(vrl): add development design document #8735

JeanMertz · 2021-08-16T08:23:54Z

There are still a few TODOs left in the document, but it's far enough along that an initial round of reviews is welcome. Any additions will be made in follow-up commits, to make reviewing them easier.

👀 RENDERED

Signed-off-by: Jean Mertz <[email protected]>

JeanMertz · 2021-08-16T08:33:19Z

lib/vrl/DESIGN.md

+- Use `parse_*` for string decoding functions (e.g. `parse_json` and
+  `parse_grok`).
+
+- Use `encode_*` for string encoding functions (e.g. `encode_base64`).


It should be noted that some decoding functions are named decode_* (decode_base64 and decode_percent).

Ideally, we'd rectify this, but that's a breaking change. We can create issues for when a current implementation goes against the design principles so that we can do a sweep of breaking changes before we hit 1.0.

Alternatively, we accept this, document it, and keep things as is, even when we release 1.0.

🤔 I'm torn on this. I prefer aligning on parse_base64, but when you use base64 directly, the operation is referred to as "decode".

I'd lean towards internal consistency, maybe document "this decodes base64 encoded values" similar to us referencing "exploding" with the unnest function

I think there is some distinction here between functions that parse a string into some other type (like parse_json or parse_int) and functions that turn a string into another string (decode_base64 or decode_percent).

To me parse_base64 reads pretty weird given no "parsing" is really happening, just translation. I think parse makes sense only for structured data.

It does make encode a little confusing, since it is used as the opposite of both decode and parse when you might only expect it to be the opposite of decode. I might suggest we use to_X, like to_json and to_key_value, but this would overload to_X given it is also used for converting to primitives. We could go the Go/Java route of unmarshal and marshal?

> Ideally, we'd rectify this, but that's a breaking change. We can create issues for when a current implementation goes against the design principles so that we can do a sweep of breaking changes before we hit 1.0.

I think we could also just alias the functions to rename them to the desired scheme; dropping the deprecated aliases at 1.0.

💭 marshal/unmarshal seem natural given my history in Go - but it is a larger change (aliasing seems reasonable)
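
The aliasing idea from this thread can be sketched outside of VRL. The Python below is only an illustration under stated assumptions, not how VRL registers functions: the `FUNCTIONS` registry, the `ALIASES` table, the `resolve` helper, and the `unmarshal_json` name are all hypothetical. It shows an old name resolving to the same implementation as the new name while flagging the old one as deprecated.

```python
import base64
import json
import warnings

# Hypothetical function registry: canonical name -> implementation.
FUNCTIONS = {
    "decode_base64": lambda s: base64.b64decode(s).decode("utf-8"),
    "unmarshal_json": json.loads,  # assumed new name, for illustration only
}

# Deprecated aliases: old name -> canonical name.
ALIASES = {
    "parse_json": "unmarshal_json",
}


def resolve(name):
    """Look up a function by name, warning when a deprecated alias is used."""
    if name in ALIASES:
        canonical = ALIASES[name]
        warnings.warn(
            f"{name} is deprecated, use {canonical} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        name = canonical
    return FUNCTIONS[name]
```

Dropping the aliases at 1.0 would then amount to emptying the alias table, which keeps the breaking change in one place.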

JeanMertz · 2021-08-16T08:34:39Z

lib/vrl/DESIGN.md

+
+- Return boolean from `is_*` functions (e.g. `is_string`).
+
+- Return an error when `parse_*` functions fail to decode the string.


All but one of the parse_* functions are fallible, with parse_query_string being the exception.

JeanMertz · 2021-08-16T08:35:49Z

lib/vrl/DESIGN.md

+
+- `parse_*` functions should almost always error when used incorrectly.
+
+- `to_*` functions must never fail.


This isn't currently true because we decided to have to_* functions also take a string, meaning they do both decoding and type conversion.

We can either update this rule to match reality, or (at some point before 1.0) do a breaking change to update the functions to match this rule.

💭 would that end up being something like:

if is_integer(.field) { .field = int(.field) }
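
To make the trade-off in this thread concrete, here is a minimal Python sketch of the split being discussed. It is not VRL and does not describe the current stdlib behaviour: `parse_int` and `to_int` are hypothetical stand-ins, and the set of types `to_int` accepts (including mapping null to 0) is an assumption made for the example.

```python
def parse_int(value: str) -> int:
    """Fallible: decoding a string may fail, so the caller must handle errors."""
    return int(value, 10)  # raises ValueError on invalid input


def to_int(value) -> int:
    """Pure type conversion: never fails for the types it accepts."""
    if isinstance(value, bool):  # checked first, since bool is a subclass of int
        return 1 if value else 0
    if isinstance(value, int):
        return value
    if value is None:
        return 0  # assumed mapping, for illustration only
    # Strings are rejected up front rather than decoded. In a language with
    # compile-time type checks this rejection would happen before runtime,
    # which is what keeps the conversion itself infallible.
    raise TypeError("to_int only accepts bool, int, or None")
```

Under this split, a string field would go through `parse_int` with explicit error handling, and `to_int` would be reserved for values whose type is already known.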

JeanMertz · 2021-08-16T08:37:30Z

lib/vrl/DESIGN.md

+       { "error": "invalid data format" }
+```
+
+TODO


If you've seen any common patterns in the wild that are worth mentioning here, please let me know!

👍 I like this pattern.

I am wondering if we'll want to introduce "pipelining" at some point like:

data = decode_base64(.message) |> parse_json

That is really just syntax sugar though.

Pipe operator support is tracked in https://github.com/timberio/vector/issues/5431, but I don't think there's anything to note here, since it isn't actually part of the language yet.
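
As a rough illustration of why pipelining is only sugar, the Python sketch below shows that a left-to-right pipeline computes the same result as nesting the calls. The `pipe` helper is hypothetical, the two functions are simple stand-ins for their VRL namesakes, and none of this says anything about how a `|>` operator would be implemented in VRL.

```python
import base64
import json
from functools import reduce


def decode_base64(value: str) -> str:
    """Stand-in for the VRL function of the same name."""
    return base64.b64decode(value).decode("utf-8")


def parse_json(value: str):
    """Stand-in for the VRL function of the same name."""
    return json.loads(value)


def pipe(value, *functions):
    """Feed `value` through each function in turn, left to right."""
    return reduce(lambda acc, fn: fn(acc), functions, value)


message = base64.b64encode(b'{"status": 200}').decode("ascii")

# Both forms compute the same thing; the pipeline only reads left to right.
nested = parse_json(decode_base64(message))
piped = pipe(message, decode_base64, parse_json)
```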

spencergilbert

One comment I'd generally make about VRL is that users seem to generally be using ! just about everywhere. This may be because most of the docs/examples use it for terseness, or because it's easier, or even because their event flow is much more known in advance and they need less handling.
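
The difference between aborting on error and handling it can be sketched as follows. This is Python rather than VRL, and the two helpers are hypothetical: `parse_json` returns a result and error pair to mimic a fallible call, and `parse_json_or_abort` mimics the `!` form by aborting on any error.

```python
import json


def parse_json(value: str):
    """Return (result, error) instead of raising, mimicking a fallible call."""
    try:
        return json.loads(value), None
    except json.JSONDecodeError as exc:
        return None, str(exc)


def parse_json_or_abort(value: str):
    """Counterpart of the `!` form: any error aborts processing of the event."""
    result, err = parse_json(value)
    if err is not None:
        raise RuntimeError(f"aborting: {err}")
    return result


# Handling the error explicitly keeps the event flowing with a fallback value.
parsed, err = parse_json("not json")
if err is not None:
    parsed = {}
```

The aborting form is shorter, which may be part of why it shows up so often, but it only suits inputs whose shape is known in advance.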

jszwedko

Thanks @JeanMertz ! This seems like a great start and will help guide decisions in VRL in the future. I especially like the target audience and calling out of anti-patterns that we've rejected.

I can review again when the TODOs are addressed.

jszwedko · 2021-08-18T20:21:08Z

lib/vrl/DESIGN.md

+- For one or more parameters, the first parameter must be the "target" of the
+  function (e.g. `parse_regex(target: <string>, pattern: <regex>)`).
+
+- The first parameter must therefor almost always be named `target`.


Suggested change

-- The first parameter must therefor almost always be named `target`.
+- The first parameter must therefore almost always be named `target`.

binarylogic

This is an excellent start. I left a few comments; I'm curious what you think.

bruceg

Looks like a good start. I think it would be worth adding a section about how VRL handles types and type inference, particularly in light of the statement that "All errors are caught at compile time."

StephenWakely

This is amazing!

Would it be worth going into more detail about the type defs returned from a function? For example, it could be worth mentioning that some functions return objects with fields that may or may not be defined - the type defs for those fields should be in the definition as Kind::Bytes | Kind::Null.
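
A union such as Kind::Bytes | Kind::Null can be pictured as a set of bit flags. The Python below is a hedged sketch of that idea only: the `Kind` enum mirrors the names in the comment above but is not VRL's actual type, and the `type_def` mapping, its field names, and the `may_be_null` helper are hypothetical.

```python
from enum import Flag, auto


class Kind(Flag):
    """Hypothetical stand-in for a set of value kinds, modelled as bit flags."""
    BYTES = auto()
    INTEGER = auto()
    NULL = auto()


# Hypothetical type definition for an object returned by some function:
# "host" is always present, while "port" may be missing and is nullable.
type_def = {
    "host": Kind.BYTES,
    "port": Kind.INTEGER | Kind.NULL,
}


def may_be_null(kind: Kind) -> bool:
    """True when a field of this kind might be null and needs handling."""
    return bool(kind & Kind.NULL)
```

Declaring the optional field as a union lets a checker insist that callers handle the null case before using the value.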

lukesteensen

Overall this looks great, though I'd think it'd be helpful to discuss the design of the fallibility system itself a bit here too. The way that it interacts with the type system in particular seems to make up the bulk of the language's learning curve.

JeanMertz · 2021-09-28T12:40:38Z

This one should be ready for a final round of reviews.

I'll file issues for any lingering comments, to allow us to iterate on this document in future PRs, and avoid keeping this PR open until all i's are dotted and t's are crossed.

jszwedko

Nice work on this!

StephenWakely · 2021-09-30T12:46:57Z

lib/vrl/DESIGN.md

+  performant as it can be, and there's no way to use functions in such a way
+  that performance of a VRL program tanks. We might introduce network calls at
+  some point, if we find a good caching solution to solve most of our concerns,
+  but so far we've avoided any network calls inside our stdlib.


I think the biggest issue here is async. If we want to start introducing IO we will need to rewire VRL calls to be async. If we do it well it shouldn't affect performance too much since the runtime can be handling other events while waiting on IO.

That's a fair point. I don't think it fits this document, but worth keeping in mind.

initial VRL design document

96be808

JeanMertz added the "type: task" and "domain: vrl" labels Aug 16, 2021

JeanMertz requested review from StephenWakely, lukesteensen, bruceg and binarylogic August 16, 2021 08:23

fixup! initial VRL design document

89460d0

JeanMertz commented Aug 16, 2021

spencergilbert reviewed Aug 16, 2021

jszwedko reviewed Aug 18, 2021

binarylogic suggested changes Aug 19, 2021

JeanMertz added 3 commits August 20, 2021 14:58

process review feedback

4acc901

add section on syntax vs functions

d9e4f67

error section

d6dda6c

spencergilbert approved these changes Aug 20, 2021

JeanMertz requested a review from binarylogic August 20, 2021 14:32

binarylogic reviewed Aug 20, 2021

bruceg reviewed Aug 20, 2021

StephenWakely reviewed Aug 30, 2021

lukesteensen reviewed Aug 30, 2021

expand documentation

94bb86a

JeanMertz requested review from StephenWakely, bruceg and binarylogic September 28, 2021 12:39

JeanMertz requested a review from jszwedko September 28, 2021 14:40

jszwedko approved these changes Sep 28, 2021

spencergilbert approved these changes Sep 29, 2021

StephenWakely reviewed Sep 30, 2021

StephenWakely reviewed Sep 30, 2021

fixup! initial VRL design document

3d811d7

JeanMertz mentioned this pull request Oct 1, 2021

Use unmarshal / marshal terminology for converting from/to serialized structured data #9405

Open

trim

5c5a74e

JeanMertz enabled auto-merge (squash) October 1, 2021 13:14

JeanMertz disabled auto-merge October 1, 2021 14:07

JeanMertz merged commit 59075fc into master Oct 1, 2021

JeanMertz deleted the jean/vrl-design-doc branch October 1, 2021 14:07
