Text reader implementation cleanup #763
Given that For my own interest, do you know if there are any tools that can be used that show how the bytes are used for the various parts of a type?
The best available method is to pass the `-Zprint-type-sizes` flag to a nightly build:

    RUSTFLAGS=-Zprint-type-sizes cargo +nightly build --release

This produces a horribly noisy wall of unstructured type size information. If you store the output in a file like so:

    RUSTFLAGS=-Zprint-type-sizes cargo +nightly build --release 2>&1 > /tmp/sizes

you can use the `top-type-sizes` tool to process it:

    cargo install top-type-sizes

    # -l 200   Print the type information for the top 200 types. (Default limit: 100)
    # -p lazy  Include types defined in the module `lazy`
    # -e std   Exclude types defined in the standard library (`Option<T>`, `Result<T, E>`, etc)
    # Input file: /tmp/sizes. Output file: /tmp/top_sizes
    top-type-sizes -l 200 -p lazy -e std < /tmp/sizes > /tmp/top_sizes

Here are some example type size readouts:

ion-rust/src/lazy/expanded/mod.rs Lines 500 to 508 in 158bff7

ion-rust/src/lazy/expanded/mod.rs Lines 445 to 459 in 158bff7

I found this information in the Rust Performance mdBook.
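For a quick look at a single type without a nightly toolchain, the standard library can report sizes, alignments, and field offsets directly. The sketch below is only an illustration: `ExampleValue` and its fields are hypothetical stand-ins, not types from ion-rust, and `offset_of!` assumes Rust 1.77 or later.

```rust
use std::mem::{align_of, offset_of, size_of};

/// Hypothetical stand-in for a lazily parsed value; not an ion-rust type.
#[allow(dead_code)]
struct ExampleValue {
    data_offset: u16,
    is_null: bool,
    length: usize,
}

/// Returns (field name, offset, size) for each field, sorted by offset.
fn field_layout() -> Vec<(&'static str, usize, usize)> {
    let mut fields = vec![
        ("data_offset", offset_of!(ExampleValue, data_offset), size_of::<u16>()),
        ("is_null", offset_of!(ExampleValue, is_null), size_of::<bool>()),
        ("length", offset_of!(ExampleValue, length), size_of::<usize>()),
    ];
    fields.sort_by_key(|&(_, offset, _)| offset);
    fields
}

/// Bytes in the struct that are not occupied by any field (padding).
fn padding_bytes() -> usize {
    let used: usize = field_layout().iter().map(|&(_, _, size)| size).sum();
    size_of::<ExampleValue>() - used
}

fn main() {
    println!(
        "ExampleValue: size = {}, align = {}",
        size_of::<ExampleValue>(),
        align_of::<ExampleValue>()
    );
    for (name, offset, size) in field_layout() {
        println!("  {name}: offset {offset}, size {size}");
    }
    println!("  padding: {} bytes", padding_bytes());
}
```

This only covers types you name by hand, so it complements rather than replaces the `-Zprint-type-sizes` output, which also reports enum variants and compiler-generated types.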
Many of the types that comprise the text reader API are quite large. This PR makes a variety of small changes to shrink their layout, hopefully helping to minimize the need for `memcpy`s and reducing bump-allocated memory usage. Benchmarks showed modest improvements in the 1-3% range. I suggest reviewing the commits individually.

Following PR #760, field name information is no longer stored in the `LazyRawTextValue`; it has its own type that is only constructed in the context of a struct. Commit 3f68fab modifies text values so that their `TextBufferView` layout starts with the annotations sequence (if any) and ends with the final byte of the value. Prior to this change, many values held a slice that contained the rest of the available input. This change makes it possible to use the `TextBufferView`'s offset and length as a proxy for the value's own. It also eliminates most of the metadata stored in the `LazyRawTextValue`, leaving only a `data_offset: u16` that indicates where the first byte of the value is.

The same commit also modified the `MacroEvaluator` to avoid storing an extra copy of the allocator when one could be fetched from a nested `BumpVec`. Similarly, the `MacroEvaluator` can now reference the `EncodingContext` that lives on each `MacroExpansion`.

Commit 054df22 causes the `MacroEvaluator` to use a capacity hint when bump-allocating space to store macro argument expressions.

Commit a8514b6 moves the `EncodingContext` that was previously copied into instances of most types into the bump allocator and passes around a reference to it instead. This shrank the size of many types by 16 bytes.

Commit c0256c0 removes the superfluous `MatchedRawTextValue` type, which was intended to be a shim allowing the 1.0 and 1.1 text value types to use the same structure. I was able to remove the need for it via generics.

Commit 375915d shrinks the size of the macro parameter index stored in each `TemplateVariableReference` from `usize` to `u16`. This made space for the `TemplateVariableReference`'s enum discriminant, saving 8 bytes.

Here are some example data type size reductions for `AnyEncoding`: `LazyValue`, `LazyField`, and `StructIterator`.
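To make the two main size-saving ideas concrete, here is a minimal sketch of narrowing an index from `usize` to `u16` so the enum discriminant can share its word, and of holding a reference to a context instead of a copy. All of the types below (`WideExpr`, `NarrowExpr`, `ExampleContext`, `ValueWithCopy`, `ValueWithRef`) are hypothetical and do not mirror the actual ion-rust definitions; exact sizes depend on the target and on the compiler's layout choices, so the code prints them rather than assuming them.

```rust
use std::mem::size_of;

/// Hypothetical expression type whose variable variant stores a `usize` index.
#[allow(dead_code)]
enum WideExpr {
    Variable { template_id: u64, parameter_index: usize },
    Literal(u64),
}

/// Same shape, but with a `u16` index, which can leave spare bytes
/// beside the index for the discriminant.
#[allow(dead_code)]
enum NarrowExpr {
    Variable { template_id: u64, parameter_index: u16 },
    Literal(u64),
}

/// Hypothetical context made of two slices; not ion-rust's `EncodingContext`.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct ExampleContext<'a> {
    symbols: &'a [&'a str],
    macros: &'a [&'a str],
}

/// A value that embeds its own copy of the context.
#[allow(dead_code)]
struct ValueWithCopy<'a> {
    context: ExampleContext<'a>,
    offset: usize,
}

/// A value that only points at a context stored elsewhere.
#[allow(dead_code)]
struct ValueWithRef<'a> {
    context: &'a ExampleContext<'a>,
    offset: usize,
}

/// Returns (size with `usize` index, size with `u16` index).
fn expr_sizes() -> (usize, usize) {
    (size_of::<WideExpr>(), size_of::<NarrowExpr>())
}

/// Returns (size with copied context, size with referenced context).
fn value_sizes() -> (usize, usize) {
    (
        size_of::<ValueWithCopy<'static>>(),
        size_of::<ValueWithRef<'static>>(),
    )
}

fn main() {
    let (wide, narrow) = expr_sizes();
    println!("WideExpr: {wide} bytes, NarrowExpr: {narrow} bytes");

    let (by_copy, by_ref) = value_sizes();
    println!("ValueWithCopy: {by_copy} bytes, ValueWithRef: {by_ref} bytes");
}
```

The trade-off in the second pair is an extra pointer dereference when the context is read, in exchange for smaller values that are cheaper to move.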
While some types were specifically targeted for improvement, many types benefitted from the `EncodingContext` change.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.