Allow handling large external buffers #396

Merged 5 commits · Dec 13, 2023
Changes from 2 commits
6 changes: 3 additions & 3 deletions examples/export/main.rs
@@ -71,7 +71,7 @@ fn export(output: Output) {

     let (min, max) = bounding_coords(&triangle_vertices);

-    let buffer_length = (triangle_vertices.len() * mem::size_of::<Vertex>()) as u32;
+    let buffer_length = (triangle_vertices.len() * mem::size_of::<Vertex>()) as u64;
     let buffer = json::Buffer {
         byte_length: buffer_length,
         extensions: Default::default(),
@@ -87,7 +87,7 @@ fn export(output: Output) {
         buffer: json::Index::new(0),
         byte_length: buffer.byte_length,
         byte_offset: None,
-        byte_stride: Some(mem::size_of::<Vertex>() as u32),
+        byte_stride: Some(mem::size_of::<Vertex>() as u64),
         extensions: Default::default(),
         extras: Default::default(),
         name: None,
@@ -198,7 +198,7 @@ fn export(output: Output) {
         header: gltf::binary::Header {
             magic: *b"glTF",
             version: 2,
-            length: json_offset + buffer_length,
+            length: json_offset + buffer_length as u32, // This may truncate long buffers
         },
         bin: Some(Cow::Owned(to_padded_byte_vector(triangle_vertices))),
         json: Cow::Owned(json_string.into_bytes()),
8 changes: 4 additions & 4 deletions gltf-json/src/buffer.rs
@@ -47,7 +47,7 @@ impl ser::Serialize for Target {
 pub struct Buffer {
     /// The length of the buffer in bytes.
     #[serde(default, rename = "byteLength")]
-    pub byte_length: u32,
+    pub byte_length: u64,

     /// Optional user-defined name for this object.
     #[cfg(feature = "names")]
@@ -81,22 +81,22 @@ pub struct View {

     /// The length of the `BufferView` in bytes.
     #[serde(rename = "byteLength")]
-    pub byte_length: u32,
+    pub byte_length: u64,

     /// Offset into the parent buffer in bytes.
     #[serde(
         default,
         rename = "byteOffset",
         skip_serializing_if = "Option::is_none"
     )]
-    pub byte_offset: Option<u32>,
+    pub byte_offset: Option<u64>,

     /// The stride in bytes between vertex attributes or other interleavable data.
     ///
     /// When zero, data is assumed to be tightly packed.
     #[serde(rename = "byteStride")]
     #[serde(skip_serializing_if = "Option::is_none")]
-    pub byte_stride: Option<u32>,
+    pub byte_stride: Option<u64>,

     /// Optional user-defined name for this object.
     #[cfg(feature = "names")]
1 change: 1 addition & 0 deletions gltf-json/src/validation.rs
@@ -182,6 +182,7 @@ impl std::fmt::Display for Error {
 // These types are assumed to be always valid.
 impl Validate for bool {}
 impl Validate for u32 {}
+impl Validate for u64 {}
 impl Validate for i32 {}
 impl Validate for f32 {}
 impl Validate for [f32; 3] {}
4 changes: 2 additions & 2 deletions src/accessor/util.rs
@@ -8,8 +8,8 @@ fn buffer_view_slice<'a, 's>(
     view: buffer::View<'a>,
     get_buffer_data: &dyn Fn(buffer::Buffer<'a>) -> Option<&'s [u8]>,
 ) -> Option<&'s [u8]> {
-    let start = view.offset();
-    let end = start + view.length();
+    let start = usize::try_from(view.offset()).ok()?;
+    let end = usize::try_from(start as u64 + view.length()).ok()?;
Contributor
Isn't this going to fail on 32-bit systems?

Contributor Author
Yes, this will fail on 32-bit systems when the buffer is too large to fit into the 32-bit address space, which seems fair to me. Technically, the part of the buffer accessed in this function specifically might still fit, but I'm not sure that edge case is worth handling. Would you like to handle it differently?

     get_buffer_data(view.buffer()).and_then(|slice| slice.get(start..end))
 }
1 change: 1 addition & 0 deletions src/animation/mod.rs
@@ -4,6 +4,7 @@ use crate::{accessor, scene, Document};
 use crate::Buffer;

 pub use json::animation::{Interpolation, Property};
+#[cfg(feature = "extensions")]
 use serde_json::{Map, Value};

 /// Iterators.
13 changes: 7 additions & 6 deletions src/buffer.rs
@@ -4,6 +4,7 @@ use std::ops;
 use crate::Document;

 pub use json::buffer::Target;
+#[cfg(feature = "extensions")]
 use serde_json::{Map, Value};

 /// A buffer points to binary data representing geometry, animations, or skins.
@@ -91,8 +92,8 @@ impl<'a> Buffer<'a> {
     }

     /// The length of the buffer in bytes.
-    pub fn length(&self) -> usize {
-        self.json.byte_length as usize
+    pub fn length(&self) -> u64 {
Contributor
This is just going to lead to casts every time it is used. I think usize was the right choice. Buffers exceeding u32::MAX in length become problematic on 32-bit systems; I don't think you can even address a slice in Rust beyond u32::MAX on a 32-bit system.

I agree that internally the length should be stored as u64, but I think the offsets and lengths should be reported as usize.
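
For illustration, here is a hypothetical accessor (not from this crate) showing the ergonomics of each return type:

```rust
// Hypothetical stand-in for a buffer view, used only to compare the two APIs.
struct View {
    byte_length: u64,
}

impl View {
    fn length_u64(&self) -> u64 {
        self.byte_length
    }
    fn length_usize(&self) -> usize {
        // Truncates silently on 32-bit targets when byte_length > usize::MAX.
        self.byte_length as usize
    }
}

fn main() {
    let view = View { byte_length: 16 };
    let data = vec![0u8; 16];

    // u64 return type: every slicing call site needs an explicit conversion.
    let a = &data[..usize::try_from(view.length_u64()).unwrap()];

    // usize return type: cast-free at the call site, but an out-of-range value
    // would already have been lost inside the accessor.
    let b = &data[..view.length_usize()];

    assert_eq!(a.len(), b.len());
}
```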

Contributor
Loading the glTF file should fail on 32-bit systems if there are buffers with lengths larger than u32::MAX.

Contributor Author
Hmm, I figured it's worth preserving the actual information losslessly for as long as possible, so users of the interface can decide how they want to deal with it. Do you want to truncate the information here by casting to usize, which may lead to unexpected behaviour, or should this function be fallible?

Contributor
An application might want to be able to access a glTF that is too big to fit into memory…

  • as a viewer or editor that has a use for showing metadata to the user even without rendering;
  • because it will be loading only selected objects from the glTF file (e.g. the scene the user selects, or selected “layers” of a complex dataset visualization); or
  • to process a large buffer in a streaming fashion.

Therefore, it's not necessarily the case that loading a glTF containing buffers that won't fit into memory is useless. Note that if a user wants a usize, they can call usize::try_from() on the u64, and the result is precisely either a usize or an error that they can .unwrap() or ? to fail on. (But perhaps there's an even better API possible; I haven't looked at the bigger picture.)
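
For example, a caller that needs a usize could write something along these lines (an illustrative sketch, not code from this crate):

```rust
// Illustrative helper: turn the u64 offset/length reported by the API into a
// slice, failing cleanly where the values cannot be represented as usize.
fn view_slice(data: &[u8], offset: u64, length: u64) -> Option<&[u8]> {
    let start = usize::try_from(offset).ok()?; // None if it exceeds the address space
    let end = usize::try_from(offset.checked_add(length)?).ok()?;
    data.get(start..end)
}

fn main() {
    let data = vec![0u8; 16];
    assert!(view_slice(&data, 4, 8).is_some());
    // A range that cannot be represented yields None instead of truncating.
    assert!(view_slice(&data, u64::MAX, 1).is_none());
}
```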

Contributor Author
@aloucks Do you have an update on how you would prefer this to be handled? It sounds like you'd like to fail when loading a glTF file that has any buffer that is too large, and then just hard-cast to usize here later? Is the use case of working with larger files on 32-bit systems not worth supporting?

Member
I'm tempted to settle on usize, as 32-bit systems are the exception rather than the norm nowadays. The WASM runtime, however, is a popular target for this library, and 64-bit addressing there is unstable. For this reason, I think it's important that the user can handle this scenario gracefully somehow.

For the majority of users running on 64-bit systems, the cast from u32 to usize is already a bit annoying, so I suggest a few things:

  1. Create a newtype struct Address(u64) (the name is debatable) in gltf-json (see the sketch below).
  2. Implement Validate for Address and report an error on 32-bit systems if it exceeds the usize range.
  3. In the main/wrapper crate, change the return type for existing u32 offsets/sizes to usize, using as to cast internally.

This should cover all of the points mentioned previously. 32-bit systems will still be able to process large glTF (and use the wrapper via from_json_without_validation), or the out-of-range size will be reported to the user otherwise.

@ShaddyDC, would you like to implement this yourself? Otherwise, I can chip in and push to this branch if you like.
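
A rough, self-contained sketch of the idea (the type name, the conversion method, and the error shape are placeholders; the real Validate integration is not shown here):

```rust
// Placeholder sketch only: the range check is expressed as a plain conversion
// method rather than through gltf-json's Validate machinery.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct Address(pub u64);

impl Address {
    /// Converts to usize, reporting the raw value when it does not fit into
    /// the target's address space (e.g. on 32-bit or wasm32 targets).
    pub fn to_usize(self) -> Result<usize, u64> {
        usize::try_from(self.0).map_err(|_| self.0)
    }
}

fn main() {
    assert_eq!(Address(1024).to_usize(), Ok(1024));

    match Address(u64::MAX).to_usize() {
        Ok(n) => println!("fits in usize: {n}"), // on 64-bit targets
        Err(raw) => println!("size {raw} exceeds the platform's address space"), // on 32-bit targets
    }
}
```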

Contributor Author (ShaddyDC, Dec 12, 2023)
That seems like a good way to go about it. I can take a shot at an implementation, but if you've got the time, I'd appreciate it if you could take care of it. I'm not yet familiar with the validation implementation.

Member
Leave it with me 😄

Member
I've opened a PR on your fork: ShaddyDC#1

Contributor Author (ShaddyDC, Dec 13, 2023)
LGTM, merged! Though I think you should also have commit access.

+        self.json.byte_length
     }

     /// Optional user-defined name for this object.
@@ -150,13 +151,13 @@ impl<'a> View<'a> {
     }

     /// Returns the length of the buffer view in bytes.
-    pub fn length(&self) -> usize {
-        self.json.byte_length as usize
+    pub fn length(&self) -> u64 {
+        self.json.byte_length
     }

     /// Returns the offset into the parent buffer in bytes.
-    pub fn offset(&self) -> usize {
-        self.json.byte_offset.unwrap_or(0) as usize
+    pub fn offset(&self) -> u64 {
+        self.json.byte_offset.unwrap_or(0)
     }

     /// Returns the stride in bytes between vertex attributes or other interleavable
10 changes: 6 additions & 4 deletions src/import.rs
@@ -123,11 +123,11 @@ pub fn import_buffers(
     let mut buffers = Vec::new();
     for buffer in document.buffers() {
         let data = buffer::Data::from_source_and_blob(buffer.source(), base, &mut blob)?;
-        if data.len() < buffer.length() {
+        if (data.len() as u64) < buffer.length() {
             return Err(Error::BufferLength {
                 buffer: buffer.index(),
                 expected: buffer.length(),
-                actual: data.len(),
+                actual: data.len() as u64,
             });
         }
         buffers.push(data);
@@ -191,8 +191,10 @@ impl image::Data {
             },
             image::Source::View { view, mime_type } => {
                 let parent_buffer_data = &buffer_data[view.buffer().index()].0;
-                let begin = view.offset();
-                let end = begin + view.length();
+                let begin = usize::try_from(view.offset())
+                    .map_err(|_| Error::OverlargeBuffer(view.offset()))?;
+                let end = usize::try_from(begin as u64 + view.length())
+                    .map_err(|_| Error::OverlargeBuffer(begin as u64 + view.length()))?;
                 let encoded_image = &parent_buffer_data[begin..end];
                 let encoded_format = match mime_type {
                     "image/png" => Png,
11 changes: 9 additions & 2 deletions src/lib.rs
@@ -213,10 +213,10 @@ pub enum Error {
         buffer: usize,

         /// The expected buffer length in bytes.
-        expected: usize,
+        expected: u64,

         /// The number of bytes actually available.
-        actual: usize,
+        actual: u64,
     },

     /// JSON deserialization error.
@@ -257,6 +257,9 @@ pub enum Error {

     /// glTF validation error.
     Validation(Vec<(json::Path, json::validation::Error)>),
+
+    /// Buffer size does not fit into the target platform's usize type.
+    OverlargeBuffer(u64),
 }

 /// glTF JSON wrapper plus binary payload.
@@ -613,6 +616,10 @@ impl std::fmt::Display for Error {
                 }
                 Ok(())
             }
+            Error::OverlargeBuffer(n) => write!(
+                f,
+                "buffer with size over {n} exceeds platform address space"
+            ),
         }
     }
 }