Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Text reader implementation cleanup #763

Merged
merged 6 commits into from
May 9, 2024
Merged

Text reader implementation cleanup #763

merged 6 commits into from
May 9, 2024

Conversation

zslayton
Copy link
Contributor

@zslayton zslayton commented May 7, 2024

Many of the types that comprise the text reader API are quite large. This PR makes a variety of small changes to shrink their layout, hopefully helping to minimize the need for memcpys and reducing bump allocated memory usage. Benchmarks showed modest improvements in the 1-3% range. I suggest reviewing the commits individually.

Following PR #760, field name information is no longer stored in the LazyRawTextValue; it has its own type that is only constructed in the context of a struct. Commit 3f68fab modifies text values so that their TextBufferView layout starts with the annotations sequence (if any) and ends with the final byte of the value. Prior to this change, many values held a slice that contained the rest of the available input. This change makes it possible to use the TextBufferView's offset and length as a proxy for the value's own. It also eliminates most of the metadata stored in the LazyRawTextValue, leaving only a data_offset: u16 that indicates where the first byte of the value is.

The same commit also modified the MacroEvaluator to avoid storing an extra copy of the allocator when one could be fetched from a nested BumpVec. Similarly, the MacroEvaluator can now reference the EncodingContext that lives on each MacroExpansion.

Commit 054df22 causes the MacroEvaluator to use a capacity hint when bump-allocating space to store macro argument expressions.

Commit a8514b6 moves the EncodingContext that was previously copied into instances of most types into the bump allocator and passes around a reference to it instead. This shrank the size of many types by 16 bytes.

Commit c0256c0 removes the superfluous MatchedRawTextValue type, which was intended to be a shim allowing the 1.0 and 1.1 text value types to use the same structure. I was able to remove the need for it via generics.

Commit 375915d shrinks the size of the macro parameter index stored in each TemplateVariableReference from usize to u16. This made space for the TemplateVariableReference's enum discriminant, saving 8 bytes.


Here are some example data type size reductions for AnyEncoding:

Data type Size Before Size After
LazyValue 128 96
LazyField 208 160
StructIterator 320 224

While some types were specifically target for improvement, many types benefitted from the EncodingContext change.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@zslayton zslayton requested review from nirosys and popematt May 8, 2024 19:14
@popematt
Copy link
Contributor

popematt commented May 9, 2024

Given that mem::size_of returns the size in bytes, I assume that the sizes given in your table are also in bytes. If that's the case, I'm a little bit surprised at how large some of those types are, but my point of comparison is the IonValue and IonElement types, which are for JVM... so it's not really a fair comparison.

For my own interest, do you know if there are any tools that can be used that show how the bytes are used for the various parts of a type?

@zslayton zslayton merged commit 6bd5532 into main May 9, 2024
32 checks passed
@zslayton zslayton deleted the text-reader-cleanup branch May 9, 2024 15:50
@zslayton
Copy link
Contributor Author

zslayton commented May 9, 2024

Given that mem::size_of returns the size in bytes, I assume that the sizes given in your table are also in bytes. If that's the case, I'm a little bit surprised at how large some of those types are, but my point of comparison is the IonValue and IonElement types, which are for JVM... so it's not really a fair comparison.

For my own interest, do you know if there are any tools that can be used that show how the bytes are used for the various parts of a type?

The best available method is to add the -Zprint-type-sizes flag to a nightly release build:

RUSTFLAGS=-Zprint-type-sizes cargo +nightly build --release

This produces a horribly noisy wall of unstructured type size information. If you store the output in a file like so:

RUSTFLAGS=-Zprint-type-sizes cargo +nightly build --release 2>&1 > /tmp/sizes

you can use the top-type-sizes tool to post-process that output into something more helpful:

cargo install top-type-sizes

top-type-sizes \
    # Print the type information for the top 200 types. (Default limit: 100)
    -l 200 \
    # Include types defined in the module `lazy`
    -p lazy \
    # Exclude types defined in the standard library (`Option<T>`, `Result<T>`, etc)
    -e std \
    # Input file
    < /tmp/sizes
    # Output file
    > /tmp/top_sizes

Here are some example type size readouts:

/// A value produced by expanding the 'raw' view of the input data.
#[derive(Copy, Clone)]
pub struct LazyExpandedValue<'top, Encoding: LazyDecoder> {
pub(crate) context: EncodingContext<'top>,
pub(crate) source: ExpandedValueSource<'top, Encoding>,
// If this value came from a variable reference in a template macro expansion, the
// template and the name of the variable can be found here.
pub(crate) variable: Option<TemplateVariableReference<'top>>,
}

104 lazy::expanded::LazyExpandedValue<'_, lazy::any_encoding::AnyEncoding> align=8
     72 source
      8 context
     24 variable

pub enum ExpandedValueSource<'top, D: LazyDecoder> {
/// This value was a literal in the input stream.
ValueLiteral(D::Value<'top>),
/// This value was part of a template definition.
Template(Environment<'top, D>, TemplateElement<'top>),
/// This value was the computed result of a macro invocation like `(:make_string `...)`.
Constructed(
// TODO: Make this an associated type on the LazyDecoder trait so 1.0 types can set
// it to `Never` and the compiler can eliminate this code path where applicable.
// Constructed data stored in the bump allocator. Holding references instead of the data
// itself allows this type (and those that contain it) to impl `Copy`.
&'top [&'top str], // Annotations (if any)
&'top ExpandedValueRef<'top, D>, // Value
),
}

72 lazy::expanded::ExpandedValueSource<'_, lazy::any_encoding::AnyEncoding> align=8
     72 variant ValueLiteral
     48 variant Template
          8 <padding>
         16 0 align=8
         24 1
     32 variant Constructed
          8 <padding>
         16 0 align=8
          8 1

I found this information in the Rust Performance mdBook.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants