Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recursive macro expansion in TDL containers #647

Merged
merged 5 commits into from
Sep 29, 2023
Merged

Recursive macro expansion in TDL containers #647

merged 5 commits into from
Sep 29, 2023

Conversation

zslayton
Copy link
Contributor

@zslayton zslayton commented Sep 27, 2023

Builds on #645.

This PR adds the logic needed to recursively evaluate macro invocations found inside containers in TDL (the Template Definition Language).

Prior to this change, e-expressions (macro invocations in the data stream) inside containers would be recursively expanded:

[1, 2, (:values 3 4), 5, 6, (:make_string (:values foo bar) baz), 7] 

// Evaluates to:

[1, 2, 3, 4, 5, 6, "foobarbaz", 7]

but macro invocations in a TDL expression would not:

(values [1, 2, (values 3 4), 5, (void), 6])

// "Evaluates" to:

[1, 2, (values 3 4), 5, (void), 6]

with the first level of invocation (the outer (values ...) being evaluated, but invocations nested within a container (here: a list) would not.

Following this patch, TDL macro invocations found in lists and structs are evaluated correctly:

(values [1, 2, (values 3 4), 5, (void), 6])

// Evaluates to:

[1, 2, 3, 4, 5, 6]

// and

(values {a: 1, b: (values 2 3), c: (void), d: 4})

// Evaluates to:

{a: 1, b: 2, b: 3, d: 4}

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@zslayton zslayton changed the base branch from main to 1_1-text-reader September 27, 2023 21:29
@codecov
Copy link

codecov bot commented Sep 27, 2023

Codecov Report

Attention: 73 lines in your changes are missing coverage. Please review.

Files Coverage Δ
src/lazy/binary/raw/sequence.rs 47.22% <100.00%> (ø)
src/lazy/binary/raw/struct.rs 45.33% <100.00%> (ø)
src/lazy/encoding.rs 38.46% <ø> (ø)
src/lazy/expanded/e_expression.rs 100.00% <100.00%> (ø)
src/lazy/expanded/tdl_macro.rs 80.43% <100.00%> (ø)
src/lazy/struct.rs 70.00% <100.00%> (-0.18%) ⬇️
src/lazy/system_reader.rs 87.73% <ø> (-0.10%) ⬇️
src/lazy/text/buffer.rs 94.91% <100.00%> (ø)
src/lazy/text/raw/sequence.rs 68.55% <100.00%> (ø)
src/lazy/value.rs 80.83% <100.00%> (-1.34%) ⬇️
... and 10 more

... and 3 files with indirect coverage changes

📢 Thoughts on this report? Let us know!.

Copy link
Contributor Author

@zslayton zslayton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ PR tour

@@ -12,7 +12,7 @@ use crate::lazy::binary::raw::value::LazyRawBinaryValue;
use crate::lazy::decoder::private::{LazyContainerPrivate, LazyRawValuePrivate};
use crate::lazy::decoder::{
LazyDecoder, LazyRawFieldExpr, LazyRawReader, LazyRawSequence, LazyRawStruct, LazyRawValue,
LazyRawValueExpr,
LazyRawValueExpr, RawFieldExpr, RawValueExpr,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ The enum LazyRawValueExpr<'data, D> can be either a value literal (D::Value) or a macro invocation (D::MacroInvocation). Starting with this PR, we needed the ability to talk about value expressions that might have come from templates, which are not a kind of stream encoding and so do not have a corresponding implementation D: LazyDecoder<'data>.

The new enum RawValueExpr<V, M> allows any types to be used as the value (V) and the macro invocation (M). LazyRawValueExpr<'data, D> is now a type alias for the most common kind of value expression:

pub type LazyRawValueExpr<'data, D> =
    RawValueExpr<
        <D as LazyDecoder<'data>>::Value,
        <D as LazyDecoder<'data>>::MacroInvocation
    >;

An analogous change has been made for the LazyRawFieldExpr/RawFieldExpr pair.

Comment on lines -117 to +121
Some(Ok(LazyRawValueExpr::ValueLiteral(value))) => Some(Ok(
LazyRawValueExpr::ValueLiteral(LazyRawAnyValue::from(value)),
)),
Some(Ok(LazyRawValueExpr::MacroInvocation(invocation))) => Some(Ok(
LazyRawValueExpr::MacroInvocation(LazyRawAnyMacroInvocation {
Some(Ok(RawValueExpr::ValueLiteral(value))) => {
Some(Ok(RawValueExpr::ValueLiteral(LazyRawAnyValue::from(value))))
}
Some(Ok(RawValueExpr::MacroInvocation(invocation))) => Some(Ok(
RawValueExpr::MacroInvocation(LazyRawAnyMacroInvocation {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Now that LazyRawValueExpr<'data, D> is a type alias, existing concrete uses of it would need to supply type annotations:

// Before:
LazyRawValueExpr::ValueLiteral(v)
// After:
LazyRawValueExpr::<'data, D>::ValueLiteral(v)

However, the compiler is able to infer the correct types for RawValuExpr, to which the LazyRawValueExpr alias points. I've switched instantiations of LazyRawValueExpr to RawValueExpr for concision.

Unfortunately, that change makes up the lion's share of the line count in this diff. x_x

An analogous change has been made for the LazyRawFieldExpr/RawFieldExpr pair.

pub enum ArgumentKind<'top, 'data: 'top, D: LazyDecoder<'data>, M: MacroInvocation<'data, D>> {
pub enum ArgumentKind<'top, 'data, D: LazyDecoder<'data>, M: MacroInvocation<'data, D>> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ In a few places, I've removed an "outlives" constraint: 'data: 'top. While it is always true throughout our API, only some methods and trait implementations "care." Anywhere else it's just line noise and--since lifetimes are one of the tougher bits of Rust to grok--probably a barrier to understanding.

Comment on lines -101 to +102
M: 'data + 'top,
'data: 'top,
M: 'top,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This is an equivalent formulation, but I liked that it had the familiar 'data: 'top constraint and emphasizes the bit that our macro invocation type will live as long as 'top.

spooky: PhantomData<(&'data D, &'data M)>,
spooky: PhantomData<(&'data D, M)>,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ It took way too long to realize that this throwaway &'data M added to a PhantomData was causing a lifetime constraint in the struct expander to be unsatisfiable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is'data strictly necessary for D? (Also, since it's Phantom, does the & matter?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed 'data to appear somewhere in MacroEvaluator because it's being used in the type's trait bounds. I can leave it with D because D is guaranteed to outlive 'data, but I couldn't have it with M because the types that represent a macro invocation include 'top lifetime template values. (&'top Element)

}

impl<
'iter,
'top: 'iter,
'top,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Similar to an earlier change: this lifetime constraint is true, but the implementation doesn't care so it's just line noise.

@@ -391,7 +398,6 @@ impl<
evaluator,
context,
initial_stack_depth,
spooky: PhantomData,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ An earlier version of this type needed PhantomData, but it no longer does.

Comment on lines +538 to +540
MacroEvaluator::<'data, D, M, BumpVec<'top, MacroExpansion<D, M>>>::new_transient(
context,
);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ There's a subtle bug here. Even though the type annotations specify BumpVec<'top, _>, the new() method always constructs an evaluator backed by a Vec.

I've opened #648 to track fixing this.

@@ -177,101 +229,212 @@ impl<'top, 'data, D: LazyDecoder<'data>> Iterator for ExpandedStructIterator<'to
ref mut state,
} = *self;
match source {
ExpandedStructIteratorSource::ValueLiteral(evaluator, iter) => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This change happens to address feedback in #645.

@@ -50,7 +50,6 @@ use crate::{IonError, IonResult, SymbolTable};
/// ```
pub struct LazyList<'top, 'data, D: LazyDecoder<'data>> {
pub(crate) expanded_list: LazyExpandedList<'top, 'data, D>,
pub(crate) symbol_table: &'top SymbolTable,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ The Lazy[Value|List|SExp|Struct types don't need their own SymbolTable reference; the expanded value they wrap has an EncodingContext<'top> they can use.

@zslayton zslayton marked this pull request as ready for review September 28, 2023 15:13
@zslayton zslayton requested a review from popematt September 28, 2023 15:13
@zslayton zslayton changed the title Tdl containers Recursive macro expansion in TDL containers Sep 28, 2023
Base automatically changed from 1_1-text-reader to main September 28, 2023 19:54
Comment on lines +49 to +53
// `RawValueExpr` above has no ties to a particular encoding. The `LazyRawValueExpr` type alias
// below uses the `Value` and `MacroInvocation` associated types from the decoder `D`. In most
// places, this is a helpful constraint; we can talk about the value expression in terms of the
// LazyDecoder it's associated with. However, in some places (primarily when expanding template
// values that don't have a LazyDecoder) we need to be able to use it without constraints.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be moved to the doc comments of one or both of RawValueExpr and LazyRawValueExpr since it's the best explanation of why you would use one vs the other.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, I'll add this to #652.

spooky: PhantomData<(&'data D, &'data M)>,
spooky: PhantomData<(&'data D, M)>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is'data strictly necessary for D? (Also, since it's Phantom, does the & matter?)

@zslayton zslayton merged commit d73f6bb into main Sep 29, 2023
@zslayton zslayton deleted the tdl_containers branch September 29, 2023 12:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants