Stop using arrow unions #6388
Punting on datatype conversions by simplifying types

archetype Image {
pixel_buffer: PixelBuffer,
pixel_format: PixelFormat,
resolution: Resolution2D,
stride: Option<PixelStride>,
}
enum PixelFormat {
/* Image formats */
RGBA8_SRGB_22,
RG32F,
NV12,
// ...
/* Depth formats */
F16,
F32,
F32_LINEAR_XXX,
// ...
/* Segmentation formats */
U8,
U16,
// ...
}
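As a concrete illustration of how a PixelFormat enum like the one above could drive buffer interpretation, here is a hypothetical Rust sketch mapping a few variants to their packed bytes-per-pixel. The variant names and sizes are assumptions for illustration, not part of the proposal; planar, subsampled formats like NV12 are excluded because they do not have a whole number of bytes per pixel.

```rust
// Hypothetical sketch: mapping a PixelFormat-like enum to packed
// bytes-per-pixel. NV12 is omitted: it is planar and 2x2-subsampled
// (12 bits/pixel), so it must be sized via its planes instead.
#[derive(Clone, Copy, Debug, PartialEq)]
enum PixelFormat {
    Rgba8Srgb, // 4 channels x 1 byte
    Rg32F,     // 2 channels x 4 bytes
    F16,       // depth, 2 bytes
    F32,       // depth, 4 bytes
    U8,        // segmentation, 1 byte
    U16,       // segmentation, 2 bytes
}

fn bytes_per_pixel(fmt: PixelFormat) -> usize {
    match fmt {
        PixelFormat::Rgba8Srgb => 4,
        PixelFormat::Rg32F => 8,
        PixelFormat::F16 => 2,
        PixelFormat::F32 => 4,
        PixelFormat::U8 => 1,
        PixelFormat::U16 => 2,
    }
}
```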
archetype ImageEncoded {
blob: ImageBlob,
media_type: Option<MediaType>,
}
archetype DepthImage {
depth_buffer: PixelBuffer,
depth_format: PixelFormat,
depth_meter: DepthMeter,
resolution: Resolution2D,
stride: Option<PixelStride>,
}
archetype SegmentationImage {
buffer: PixelBuffer,
buffer_format: PixelFormat,
resolution: Resolution2D,
stride: Option<PixelStride>,
}
component PixelStride {
bytes_per_row: u32,
bytes_per_plane: Option<u32>,
}
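To show how the PixelStride component above would be consumed, here is a hypothetical sketch of a pixel-offset computation. The function name and signature are assumptions; the point is that `bytes_per_row` may exceed `width * bytes_per_pixel` when rows are padded for alignment, and `bytes_per_plane` only matters for planar formats.

```rust
// Hypothetical sketch of locating a pixel in a raw buffer using the
// proposed PixelStride fields. Not part of the proposal itself.
struct PixelStride {
    bytes_per_row: u32,
    bytes_per_plane: Option<u32>, // only set for planar formats like NV12
}

/// Byte offset of pixel (x, y) within plane `plane`, for a packed format
/// with `bytes_per_pixel` bytes per sample.
fn pixel_offset(stride: &PixelStride, bytes_per_pixel: u32, x: u32, y: u32, plane: u32) -> usize {
    let plane_base = plane as usize * stride.bytes_per_plane.unwrap_or(0) as usize;
    plane_base
        + y as usize * stride.bytes_per_row as usize
        + x as usize * bytes_per_pixel as usize
}
```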
archetype TensorU8 {
buffer: BufferU8,
// One of these
shape: TensorShape,
shape: Vec<TensorDimension>,
}
component BufferU8 {
data: [u8],
}
archetype TensorF32 {
buffer: BufferF32,
// One of these
shape: TensorShape,
shape: Vec<TensorDimension>,
}
component BufferF32 {
data: [f32],
}
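For the typed tensor buffers above, element access reduces to a flat-index computation over the shape. The following is a hedged sketch of that computation, assuming a row-major (C-order) layout; the function name is made up for illustration.

```rust
// Hypothetical sketch: turning a multi-dimensional index into a flat
// offset into a typed buffer (e.g. BufferF32), assuming row-major order.
fn flat_index(shape: &[usize], index: &[usize]) -> Option<usize> {
    // Reject rank mismatches and out-of-bounds indices.
    if shape.len() != index.len() || index.iter().zip(shape).any(|(i, s)| i >= s) {
        return None;
    }
    // Row-major: the last dimension varies fastest.
    Some(index.iter().zip(shape).fold(0, |acc, (i, s)| acc * s + i))
}
```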
// Two possibilities:
// - Only legal to set one of them
// - Or apply them all in deterministic order
archetype Transform {
mat4: Option<Mat4>,
translation: Option<Translation3>,
mat3: Option<Mat3>,
rotation: Option<Rotation3D>,
scale3: Option<Scale3D>,
scale: Option<Scale>,
}
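To make the second option ("apply them all in deterministic order") concrete, here is a hypothetical sketch restricted to two of the Transform fields for brevity. The chosen order (scale, then translate) and the use of plain arrays instead of a math library are assumptions for illustration only.

```rust
// Hypothetical sketch of the "deterministic order" option for the
// Transform archetype, using only two of its optional fields.
struct Transform {
    translation: Option<[f32; 3]>,
    scale: Option<f32>, // uniform scale, as a simplification
}

/// Assumed deterministic order: scale first, then translate.
fn apply(t: &Transform, p: [f32; 3]) -> [f32; 3] {
    let s = t.scale.unwrap_or(1.0);
    let tr = t.translation.unwrap_or([0.0; 3]);
    [p[0] * s + tr[0], p[1] * s + tr[1], p[2] * s + tr[2]]
}
```

Unset fields fall back to identity, so partially-filled Transforms compose without special cases.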
// TODO: Separate the skeleton stuff in its own archetype -- figure it out.
archetype AnnotationContext {
class_ids: Vec<ClassId>,
colors: Vec<Color>,
labels: Vec<Text>,
}
archetype ScalarU8 {
value: ScalarU8,
}
component ScalarU8 {
value: u8,
}
archetype ScalarF32 {
value: ScalarF32,
}
component ScalarF32 {
value: f32,
}

Conclusion
Punting on field accessor DSL by simplifying types
Conclusion
Other random killings
Some additional notes on the above:

Why should Images use an untyped buffer + pixel format while tensors use a typed buffer?

While at first glance this proposal might seem to introduce an inconsistency, in practice it serves to highlight the fundamental differences between these two approaches to data representation.

Images describe a (possibly multi-channel) pixel value over a 2D image plane. They are almost always grounded in data received from sensors or sent to displays. Because this usage is tied to special-built hardware, pragmatic encodings have arisen that describe pixel values more efficiently for implementation purposes. It is not uncommon for pixel encodings to pack data in ways that simply don't align with a uniform-shape tensor representation: see chroma subsampling, Bayer patterns, etc. It is also quite common to consider an approximate or interpolated pixel value, as the data is inherently 2D-spatial. As such, a raw buffer + image encoding really is the most authentic representation we can achieve. For many low-level image libraries or sensor drivers we should be able to map this structure directly to an API that lets us access or load the raw image buffer plus some metadata.

Tensors, on the other hand, map much more generally to multi-dimensional arrays. They are often used in pure data and computational contexts that have nothing to do with images. Due to the wildly varied applications, the patterns of tensor compression (beyond things like run-length encoding, or sparse/dense representation) are much more varied and domain-specific. This means there simply aren't forms of tensor encoding as common and applicable as what you see with images. In this case, a strongly typed buffer of primitives dramatically simplifies questions of indexing and tensor-value access.
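The chroma-subsampling point above can be made concrete with NV12, one of the formats in the proposed PixelFormat enum. This is an illustrative sketch (not from the issue): NV12 stores a full-resolution Y plane followed by an interleaved UV plane at half resolution in both axes, so the buffer cannot be viewed as a uniform HxWxC tensor.

```rust
// Illustrative sketch: total buffer size of an NV12 image.
// Y plane: 1 byte per pixel at full resolution.
// UV plane: interleaved U and V samples, each covering a 2x2 pixel block.
fn nv12_buffer_len(width: usize, height: usize) -> usize {
    assert!(width % 2 == 0 && height % 2 == 0, "NV12 requires even dimensions");
    let y_plane = width * height;
    let uv_plane = (width / 2) * (height / 2) * 2;
    y_plane + uv_plane // 1.5 bytes per pixel overall
}
```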
This is the exact approach taken by the Arrow variable-shape tensor spec (https://arrow.apache.org/docs/format/CanonicalExtensions.html#variable-shape-tensor). Again, most tensor libraries work under this assumption, so feeding a tensor library from a typed buffer + shape will be the most natural way to work with this data.

What about "RGB" Tensors?

All that said, it's still a very common pattern for a user to decode an image into an HxWxC (or CxHxW) tensor, and this is, in fact, what many users will expect to provide as input (a numpy array, for instance). Even for users working with images, whether the user expects to provide an Image (buffer + encoding) or a Tensor (ndarray) will depend heavily on where they sit in the software stack of their organization. Rather than fight against this, we may also want to support an "ImageTensor" archetype: a Tensor datatype which we know stores the pixels of an image in one of the common tensor arrangements. This would not support any pixel-encoded images, only those that had already been decoded into multi-channel tensors.
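The difference between the two common tensor arrangements an "ImageTensor" might declare can be sketched as a pair of index computations. These helper names and signatures are hypothetical, written only to show why the layout tag matters when reading the same pixel/channel sample.

```rust
// Hypothetical sketch: flat offsets for the same (y, x, channel) sample
// under the two common image-tensor layouts, assuming row-major storage.

/// HxWxC layout: channels are interleaved per pixel.
fn hwc_index(w: usize, c: usize, y: usize, x: usize, ch: usize) -> usize {
    (y * w + x) * c + ch
}

/// CxHxW layout: each channel is a contiguous plane.
fn chw_index(h: usize, w: usize, y: usize, x: usize, ch: usize) -> usize {
    ch * h * w + y * w + x
}
```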
Most of the choices for working with tensors fall into one of 4 categories.

Typed buffer, multiple data-types (the proposal)

Pros:
Cons:
The current hypothesis is that proliferating types is a known challenge that can be mostly automated with a mixture of code-gen and some helper code, whereas datatype conversion is an unknown challenge. Still, this puts us on a pathway where, once we support multi-typed components, we mostly delete a bunch of code and everything gets simpler. Any type conversions move from visualizer-space to data-query-space, but the types and arrow representations we work with don't actually need to change.

Untyped buffer with type-id

Pros
Cons
Typed buffer with union

Pros
Cons
Why
Arrow unions have downsides:
TODO
TimeRangeBoundary
TensorBuffer
Related
TensorBuffer with datatype generics #9119