Added docs about data representation #1847

Merged
merged 1 commit into from
Jun 8, 2023
43 changes: 43 additions & 0 deletions docs/data-representation.md
@@ -0,0 +1,43 @@
# Data representation

ChaiNNer has to deal with multiple types of data (images, text, numbers, etc.). To make data handling easier, chaiNNer requires all node implementations to follow certain conventions. These conventions are guaranteed by inputs and enforced by outputs.

## Numbers

Depending on the precision of the inputs, chaiNNer will either use `float` (precision > 0) or `int` (precision = 0) to represent numbers.

This is guaranteed by all numeric inputs. Numeric outputs do not enforce any such thing and will accept both `float` and `int`.
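
The following is a minimal sketch of the convention above, not chaiNNer's actual implementation. The helper name `coerce_number` and its rounding behavior are assumptions made for illustration only.

```python
from typing import Union


def coerce_number(value: float, precision: int) -> Union[int, float]:
    """Hypothetical helper: return an int for precision 0, a float otherwise."""
    if precision < 0:
        raise ValueError("precision must be non-negative")
    if precision == 0:
        # Precision 0 means the node receives an `int`.
        return int(round(value))
    # Any precision above 0 means the node receives a `float`.
    return float(round(value, precision))


whole = coerce_number(3.0, precision=0)
fractional = coerce_number(1, precision=2)
```

Here `whole` is an `int` and `fractional` is a `float`, so a node can rely on the Python type that matches the precision of its input.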

## Text

Text is represented as `str`. If a number is connected to a text input, it will be converted to `str`.

## Images

Images are represented using numpy `ndarray` and are guaranteed to have the following:

- They are `float32` arrays.
- All values are finite (no NaN, inf, -inf) and range between 0 and 1 (inclusive).
- The shape of the array is either `(height, width, channels)` (with `channels > 1`) or `(height, width)`. Grayscale/single-channel images are represented as 2D arrays.
- Channels are in BGR order for 3-channel images and BGRA order for 4-channel images.
- The array is readonly. Writing to it will cause a runtime error.

Note: You may **not** assume that chaiNNer only supports images with 1, 3, or 4 channels. It is possible that chaiNNer will support images with more channels in the future. If your node relies on this assumption, you should use `ImageInput`'s `channels` argument to enforce it. Simply add `channels=[1, 3, 4]`.
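
The guarantees listed above can be written as plain NumPy checks. This is only an illustrative sketch: `check_image_conventions` is a hypothetical helper, not part of chaiNNer, and it cannot verify channel order.

```python
import numpy as np


def check_image_conventions(img: np.ndarray) -> None:
    """Hypothetical check: raise AssertionError if a convention is violated."""
    assert img.dtype == np.float32, "image must be float32"
    assert np.isfinite(img).all(), "image must not contain NaN or inf"
    assert img.min() >= 0 and img.max() <= 1, "values must be in [0, 1]"
    assert img.ndim in (2, 3), "shape must be (h, w) or (h, w, c)"
    if img.ndim == 3:
        # Single-channel images must be 2D, so 3D arrays need channels > 1.
        assert img.shape[2] > 1, "single-channel images must be 2D"
    assert not img.flags.writeable, "image must be read-only"


# Build an example image that follows the conventions.
img = np.zeros((4, 6, 3), dtype=np.float32)
img.setflags(write=False)
check_image_conventions(img)

# Writing to a read-only array raises ValueError in NumPy.
try:
    img[0, 0, 0] = 1.0
    write_failed = False
except ValueError:
    write_failed = True
```

A node that needs to modify an image should work on a copy (for example `img.copy()`), which is writable.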

`ImageOutput` will try to convert the output image to fit the above conventions. It will:

1. Convert the image to `float32` if it isn't already. Integer formats (e.g. `uint8`) will automatically be brought into 0 to 1 range.
2. Clip values between 0 and 1 (inclusive).
3. Convert single-channel 3D images to 2D.

Since channel order cannot be checked, you have to guarantee this yourself.
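
The three conversion steps above can be sketched roughly as follows. This is not the code `ImageOutput` actually runs; `normalize_output` is a hypothetical function, and scaling integer formats by their maximum representable value is an assumption.

```python
import numpy as np


def normalize_output(img: np.ndarray) -> np.ndarray:
    """Hypothetical sketch of the conversions described above."""
    # Step 1: convert to float32; integer formats are scaled to 0..1
    # (assumption: divide by the dtype's maximum value, e.g. 255 for uint8).
    if np.issubdtype(img.dtype, np.integer):
        max_value = np.iinfo(img.dtype).max
        img = img.astype(np.float32) / np.float32(max_value)
    elif img.dtype != np.float32:
        img = img.astype(np.float32)

    # Step 2: clip values to the inclusive range [0, 1].
    img = np.clip(img, 0, 1)

    # Step 3: convert single-channel 3D images to 2D.
    if img.ndim == 3 and img.shape[2] == 1:
        img = img[:, :, 0]

    return img


gray_u8 = np.full((2, 2, 1), 255, dtype=np.uint8)
gray = normalize_output(gray_u8)  # float32 array of shape (2, 2)
```

Note that nothing in this sketch touches channel order, which is why the node author remains responsible for returning BGR or BGRA data.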

## Colors

Colors are represented using chaiNNer's `Color` class.

The `Color` class is a wrapper around a tuple of floats that represent the channels of the color. Just like with images, the channels are in BGR or BGRA order. You can generally think of `Color` as a 1x1 image.
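
To make the "1x1 image" analogy concrete, here is a small sketch. `ColorSketch` and its `to_image` method are hypothetical stand-ins written for this example; they are not chaiNNer's `Color` class and do not reflect its real API.

```python
from dataclasses import dataclass
from typing import Tuple

import numpy as np


@dataclass(frozen=True)
class ColorSketch:
    """Hypothetical stand-in: a tuple of floats in BGR or BGRA order."""

    channels: Tuple[float, ...]

    def to_image(self) -> np.ndarray:
        # Build a 1x1 float32 image whose single pixel holds the channels.
        img = np.array(self.channels, dtype=np.float32).reshape(1, 1, -1)
        img.setflags(write=False)  # match the read-only image convention
        return img


# Pure red is (0, 0, 1) in BGR order: blue=0, green=0, red=1.
red = ColorSketch((0.0, 0.0, 1.0))
pixel = red.to_image()
```

In this sketch `pixel` has shape `(1, 1, 3)` and the red value sits at channel index 2, mirroring the BGR order used for images.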

## Models

ChaiNNer has types for models of each platform (PyTorch, NCNN, ONNX). Models will be represented using platform-specific classes. See their input and output classes for more information.
2 changes: 1 addition & 1 deletion docs/nodes.md
@@ -54,7 +54,7 @@ Metadata is used to define a contract between a node, the rest of the backend, a

Nodes must explicitly declare all their inputs and outputs as part of their metadata. This is done using the `inputs` and `outputs` properties. Each argument of the python function has a corresponding input, and the return value of the function has a corresponding output. See [The anatomy of a node](#the-anatomy-of-a-node) for an example.

-The main purpose of the explicitly declaring inputs and outputs is to provide more information about the node. E.g. the type information is used to determine which connections are valid, the minimum/maximum information is used to validate user inputs, and placeholder and defaults are used to improve the user experience.
+The main purpose of the explicitly declaring inputs and outputs is to provide more information about the node. E.g. the type information is used to determine which connections are valid, the minimum/maximum information is used to validate user inputs, and placeholder and defaults are used to improve the user experience. They also [make guarantees about input data and enforce conventions](./data-representation.md).

The most common classes used to declare inputs are `NumberInput`, `TextInput`, and `ImageInput`. There is also `SliderInput` as an alternative to `NumberInput`. Many more inputs are available.
