-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implements writing delimited and length-prefixed sequence types #701
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ PR tour
let untyped_ptr: *mut () = ptr.cast(); | ||
untyped_ptr | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ These methods were redundantly defined in two different modules, so I created a private unsafe_helpers
module to house them.
// container. | ||
// 2. a delimited container, this will be the parent's own encoding buffer, to which the delimited | ||
// container start opcode has already been written. | ||
buffer: &'value mut BumpVec<'top, u8>, | ||
} | ||
|
||
impl<'value, 'top> BinaryContainerWriter_1_1<'value, 'top> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ I did some refactoring of lifetimes that allowed me to remove some methods/types.
Previously, each container writer had two lifetime parameters and could instantiate an affiliated type with a single lifetime parameter to be used as a closure argument.
BinaryContainerWriter_1_1<'value, 'top>
->BinaryContainerValuesWriter_1_1<'value>
BinaryListWriter_1_1<'value, 'top>
->BinaryListValuesWriter_1_1<'value>
BinarySExpWriter_1_1<'value, 'top>
->BinarySExpValuesWriter_1_1<'value>
BinarStructWriter_1_1<'value, 'top>
->BinaryStructFieldsWriter_1_1<'value>
Now, the two-lifetime types can be used as the closure argument themselves which simplifies things quite a bit.
@@ -24,6 +22,7 @@ use crate::{ | |||
pub struct BinaryValueWriter_1_1<'value, 'top> { | |||
allocator: &'top BumpAllocator, | |||
encoding_buffer: &'value mut BumpVec<'top, u8>, | |||
delimited_containers: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ Where previously I was using type families to statically express different configurations (e.g. with/without annotations), I'm going to start moving towards branching on stored values.
pub fn with_delimited_containers(mut self) -> Self { | ||
self.delimited_containers = true; | ||
self | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ This builder-style config method allows users to write code like:
writer
.value_writer()
.with_delimited_containers()
.write_sexp(|sexp| {
for text in str_array {
sexp.write_string(text)?;
}
})?;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you start using delimited containers, and then you want to write a non-delimited container, how would you do that? E.g.: I want to write a struct where one field is a delimited list and another field is a length-prefixed list; or I want to write a delimited list that contains one or more length-prefixed s-expressions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each value gets its own ValueWriter
; they're single-use. In your example, a value writer would be created for the outer struct, and another would be created for each of the nested field values, allowing you to configure the encoding options for each one separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or I want to write a delimited list that contains one or more length-prefixed s-expressions
writer
.value_writer()
.with_delimited_containers()
.write_list(|list| {
list
// The type's `WriteAsIon` impl will use the default ValueWriter configuration (currently
// always length-prefixed)
.write([1, 2, 3].as_sexp())?
// Also uses the default ValueWriter configuration (currently always length-prefixed)
.write_sexp(|sexp| sexp.write(1)?.write(2)?.write(3)?)?
// Allows you to explicitly configure the value writer
.value_writer()
.with_delimited_containers()
.write([1, 2, 3].as_sexp())?;
Ok(())
})?;
The above would produce a delimited list that contains two length-prefixed s-expressions and a delimited s-expression.
>( | ||
mut self, | ||
sexp_fn: F, | ||
) -> IonResult<()> { | ||
self.sexp_writer().write_values(sexp_fn) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ While it was very tempting to try and write a generic "sequence" implementation for the list
/sexp
impls to call, in practice I found that it required a lot of generics and ended up being not much shorter. I decided to just give each their own impl for simplicity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just an idea, not a request for a change.
If you really want to DRY it up, I think it should work to do something like this:
fn write_length_prefixed_list<
F: for<'a> FnOnce(&mut <Self as ValueWriter>::ListWriter<'a>) -> IonResult<()>,
>(
mut self,
list_fn: F,
) -> IonResult<()> {
const LIST_NIBBLE: u8 = 0xA;
self.write_length_prefixed_sequence(LIST_NIBBLE, |container_writer| {
// Create a BinaryListWriter_1_1 to pass to the user's closure.
list_fn(&mut BinaryListWriter_1_1::new(container_writer))
})
}
fn write_length_prefixed_sexp<
F: for<'a> FnOnce(&mut <Self as ValueWriter>::SExpWriter<'a>) -> IonResult<()>,
>(
mut self,
sexp_fn: F,
) -> IonResult<()> {
const SEXP_NIBBLE: u8 = 0xB;
self.write_length_prefixed_sequence(SEXP_NIBBLE, |container_writer| {
// Create a BinarySExpWriter_1_1 to pass to the user's closure.
sexp_fn(&mut BinarySExpWriter_1_1::new(container_writer))
})
}
fn write_length_prefixed_sequence<F: FnOnce(BinaryContainerWriter_1_1) -> IonResult<()>>(mut self, container_nibble: u8, container_fn: F) -> IonResult<()> {
// We're writing a length-prefixed sexp/list, so we need to set up a space to encode the list's children.
let child_encoding_buffer = self
.allocator
.alloc_with(|| BumpVec::new_in(self.allocator));
let mut container_writer =
BinaryContainerWriter_1_1::new(self.allocator, child_encoding_buffer);
// Pass it to the closure, allowing the user to encode child values.
container_fn(container_writer)?;
// Write the appropriate opcode for a sexp of this length
let encoded_length = container_writer.buffer().len();
match encoded_length {
0..=15 => {
let opcode = (container_nibble << 4) | encoded_length as u8;
self.push_byte(opcode);
}
_ => {
let opcode = 0xF | container_nibble; // SExp or List w/FlexUInt length
self.push_byte(opcode);
FlexUInt::write_u64(self.encoding_buffer, encoded_length as u64)?;
}
}
self.encoding_buffer
.extend_from_slice(sexp_writer.container_writer.buffer());
Ok(())
}
type StructWriter<'a> = BinaryStructFieldsWriter_1_1<'a>; | ||
type ListWriter<'a> = BinaryListWriter_1_1<'value, 'top>; | ||
type SExpWriter<'a> = BinarySExpWriter_1_1<'value, 'top>; | ||
type StructWriter<'a> = BinaryStructWriter_1_1<'value, 'top>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ The aforementioned change in closure argument types.
let untyped_ptr: *mut () = ptr.cast(); | ||
untyped_ptr | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🗺️ This is the second file where these methods were defined; they were removed and usages updated to use the unsafe_helpers
module.
pub fn with_delimited_containers(mut self) -> Self { | ||
self.delimited_containers = true; | ||
self | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you start using delimited containers, and then you want to write a non-delimited container, how would you do that? E.g.: I want to write a struct where one field is a delimited list and another field is a length-prefixed list; or I want to write a delimited list that contains one or more length-prefixed s-expressions?
>( | ||
mut self, | ||
sexp_fn: F, | ||
) -> IonResult<()> { | ||
self.sexp_writer().write_values(sexp_fn) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just an idea, not a request for a change.
If you really want to DRY it up, I think it should work to do something like this:
fn write_length_prefixed_list<
F: for<'a> FnOnce(&mut <Self as ValueWriter>::ListWriter<'a>) -> IonResult<()>,
>(
mut self,
list_fn: F,
) -> IonResult<()> {
const LIST_NIBBLE: u8 = 0xA;
self.write_length_prefixed_sequence(LIST_NIBBLE, |container_writer| {
// Create a BinaryListWriter_1_1 to pass to the user's closure.
list_fn(&mut BinaryListWriter_1_1::new(container_writer))
})
}
fn write_length_prefixed_sexp<
F: for<'a> FnOnce(&mut <Self as ValueWriter>::SExpWriter<'a>) -> IonResult<()>,
>(
mut self,
sexp_fn: F,
) -> IonResult<()> {
const SEXP_NIBBLE: u8 = 0xB;
self.write_length_prefixed_sequence(SEXP_NIBBLE, |container_writer| {
// Create a BinarySExpWriter_1_1 to pass to the user's closure.
sexp_fn(&mut BinarySExpWriter_1_1::new(container_writer))
})
}
fn write_length_prefixed_sequence<F: FnOnce(BinaryContainerWriter_1_1) -> IonResult<()>>(mut self, container_nibble: u8, container_fn: F) -> IonResult<()> {
// We're writing a length-prefixed sexp/list, so we need to set up a space to encode the list's children.
let child_encoding_buffer = self
.allocator
.alloc_with(|| BumpVec::new_in(self.allocator));
let mut container_writer =
BinaryContainerWriter_1_1::new(self.allocator, child_encoding_buffer);
// Pass it to the closure, allowing the user to encode child values.
container_fn(container_writer)?;
// Write the appropriate opcode for a sexp of this length
let encoded_length = container_writer.buffer().len();
match encoded_length {
0..=15 => {
let opcode = (container_nibble << 4) | encoded_length as u8;
self.push_byte(opcode);
}
_ => {
let opcode = 0xF | container_nibble; // SExp or List w/FlexUInt length
self.push_byte(opcode);
FlexUInt::write_u64(self.encoding_buffer, encoded_length as u64)?;
}
}
self.encoding_buffer
.extend_from_slice(sexp_writer.container_writer.buffer());
Ok(())
}
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.