Improve array builder documentation (#3949) #3951

Merged 3 commits on Mar 28, 2023
6 changes: 6 additions & 0 deletions arrow-array/src/builder/generic_list_builder.rs
@@ -40,6 +40,12 @@ pub struct GenericListBuilder<OffsetSize: OffsetSizeTrait, T: ArrayBuilder> {
    values_builder: T,
}

impl<O: OffsetSizeTrait, T: ArrayBuilder + Default> Default for GenericListBuilder<O, T> {
    fn default() -> Self {
        Self::new(T::default())
    }
}

impl<OffsetSize: OffsetSizeTrait, T: ArrayBuilder> GenericListBuilder<OffsetSize, T> {
    /// Creates a new [`GenericListBuilder`] from a given values array builder
    pub fn new(values_builder: T) -> Self {
131 changes: 130 additions & 1 deletion arrow-array/src/builder/mod.rs
@@ -15,7 +15,136 @@
// specific language governing permissions and limitations
// under the License.

//! Defines builders for the various array types
//! Defines builders that can be used to safely build arrays
//!
//! # Basic Usage
//!
//! Builders can be used to build simple, non-nested arrays
//!
//! ```
//! # use arrow_array::builder::Int32Builder;
//! # use arrow_array::PrimitiveArray;
//! let mut a = Int32Builder::new();
//! a.append_value(1);
//! a.append_null();
//! a.append_value(2);
//! let a = a.finish();
//!
//! assert_eq!(a, PrimitiveArray::from(vec![Some(1), None, Some(2)]));
//! ```
//!
//! ```
//! # use arrow_array::builder::StringBuilder;
//! # use arrow_array::{Array, StringArray};
//! let mut a = StringBuilder::new();
//! a.append_value("foo");
//! a.append_value("bar");
//! a.append_null();
//! let a = a.finish();
//!
//! assert_eq!(a, StringArray::from_iter([Some("foo"), Some("bar"), None]));
//! ```
//!
//! # Nested Usage
//!
//! Builders can also be used to build more complex nested arrays, such as lists
//!
//! ```
//! # use arrow_array::builder::{Int32Builder, ListBuilder};
//! # use arrow_array::ListArray;
//! # use arrow_array::types::Int32Type;
//! let mut a = ListBuilder::new(Int32Builder::new());
//! // [1, 2]
//! a.values().append_value(1);
//! a.values().append_value(2);
//! a.append(true);
//! // null
//! a.append(false);
//! // []
//! a.append(true);
//! // [3, null]
//! a.values().append_value(3);
//! a.values().append_null();
//! a.append(true);
//!
//! // [[1, 2], null, [], [3, null]]
//! let a = a.finish();
//!
//! assert_eq!(a, ListArray::from_iter_primitive::<Int32Type, _, _>([
//!     Some(vec![Some(1), Some(2)]),
//!     None,
//!     Some(vec![]),
//!     Some(vec![Some(3), None])]
//! ))
//! ```
//!
//! # Custom Builders
//!
//! It is common to have a collection of statically defined Rust types that
//! you want to convert to Arrow arrays. An example of doing so is below
//!
//! ```
//! # use std::any::Any;
//! # use arrow_array::builder::{ArrayBuilder, Int32Builder, ListBuilder, StringBuilder};
//! # use arrow_array::{ArrayRef, RecordBatch, StructArray};
//! # use arrow_schema::{DataType, Field};
//! # use std::sync::Arc;
//! /// A custom row representation
//! struct MyRow {
//!     i32: i32,
//!     optional_i32: Option<i32>,
//!     string: Option<String>,
//!     i32_list: Option<Vec<Option<i32>>>,
//! }
//!
//! /// Converts `Vec<MyRow>` into `StructArray`
//! #[derive(Debug, Default)]
//! struct MyRowBuilder {
//!     i32: Int32Builder,
//!     string: StringBuilder,
//!     i32_list: ListBuilder<Int32Builder>,
//! }
//!
//! impl MyRowBuilder {
//!     fn append(&mut self, row: &MyRow) {
//!         self.i32.append_value(row.i32);
//!         self.string.append_option(row.string.as_ref());
//!         self.i32_list.append_option(row.i32_list.as_ref().map(|x| x.iter().copied()));
//!     }
//!
//!     /// Note: returns StructArray to allow nesting within another array if desired
Contributor

Maybe it is also worth mentioning (or adding a function that does so) that it could return a 4 column RecordBatch as well

//!     fn finish(&mut self) -> StructArray {
//!         let i32 = Arc::new(self.i32.finish()) as ArrayRef;
//!         let i32_field = Field::new("i32", DataType::Int32, false);
//!
//!         let string = Arc::new(self.string.finish()) as ArrayRef;
//!         // `MyRow::string` is an `Option<String>`, so this field must be nullable
//!         let string_field = Field::new("string", DataType::Utf8, true);
//!
//!         let i32_list = Arc::new(self.i32_list.finish()) as ArrayRef;
//!         let value_field = Box::new(Field::new("item", DataType::Int32, true));
//!         let i32_list_field = Field::new("i32_list", DataType::List(value_field), true);
//!
//!         StructArray::from(vec![
//!             (i32_field, i32),
//!             (string_field, string),
//!             (i32_list_field, i32_list),
//!         ])
//!     }
//! }
//!
//! impl<'a> Extend<&'a MyRow> for MyRowBuilder {
//!     fn extend<T: IntoIterator<Item = &'a MyRow>>(&mut self, iter: T) {
//!         iter.into_iter().for_each(|row| self.append(row));
//!     }
//! }
//!
//! /// Converts a slice of [`MyRow`] to a [`RecordBatch`]
//! fn rows_to_batch(rows: &[MyRow]) -> RecordBatch {
//!     let mut builder = MyRowBuilder::default();
//!     builder.extend(rows);
//!     RecordBatch::from(&builder.finish())
//! }
//! ```

mod boolean_buffer_builder;
pub use boolean_buffer_builder::*;
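
Note on the nested example: the calls to `values()` followed by `append(true)` or `append(false)` work because an Arrow list array keeps all child values in one flat sequence and marks each list slot with an offset and a validity flag. The sketch below models that bookkeeping using only the standard library. `ToyListBuilder` and `build_example` are hypothetical names introduced for illustration; they are not part of arrow-array, and the real builders use Arrow buffers rather than `Vec`.

```rust
/// Hypothetical, std-only model of a list builder (not the arrow-array type).
#[derive(Debug)]
struct ToyListBuilder {
    /// Flattened child values; `None` marks a null element.
    values: Vec<Option<i32>>,
    /// `offsets[i]..offsets[i + 1]` is the child range of list slot `i`.
    offsets: Vec<usize>,
    /// One flag per list slot; `false` marks a null list.
    validity: Vec<bool>,
}

impl ToyListBuilder {
    fn new() -> Self {
        // The offsets always start with a leading zero.
        Self { values: Vec::new(), offsets: vec![0], validity: Vec::new() }
    }

    /// Mutable access to the child values, mirroring `values()` in the docs.
    fn values(&mut self) -> &mut Vec<Option<i32>> {
        &mut self.values
    }

    /// Closes the current list slot, recording where it ends and whether it is null.
    fn append(&mut self, is_valid: bool) {
        self.offsets.push(self.values.len());
        self.validity.push(is_valid);
    }

    /// Materialises the slots as nested `Vec`s and resets the builder.
    fn finish(&mut self) -> Vec<Option<Vec<Option<i32>>>> {
        let out: Vec<Option<Vec<Option<i32>>>> = self
            .validity
            .iter()
            .enumerate()
            .map(|(i, &valid)| {
                valid.then(|| self.values[self.offsets[i]..self.offsets[i + 1]].to_vec())
            })
            .collect();
        *self = Self::new();
        out
    }
}

/// Mirrors the `Default` impl added in this PR by delegating to `new`.
impl Default for ToyListBuilder {
    fn default() -> Self {
        Self::new()
    }
}

/// Replays the nested example from the module docs against the toy model.
fn build_example() -> Vec<Option<Vec<Option<i32>>>> {
    let mut a = ToyListBuilder::default();
    a.values().push(Some(1));
    a.values().push(Some(2));
    a.append(true); // [1, 2]
    a.append(false); // null
    a.append(true); // []
    a.values().push(Some(3));
    a.values().push(None);
    a.append(true); // [3, null]
    a.finish()
}
```

Deriving `Default` on the toy type would leave `offsets` empty and break the range lookup in `finish`, which is why the sketch delegates to `new` instead.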