Use unsigned long for axis of concat operation #345

Closed
miaobin opened this issue Feb 15, 2023 · 11 comments · Fixed by #359

Comments

@miaobin

miaobin commented Feb 15, 2023

partial interface MLGraphBuilder {
  MLOperand concat(sequence<MLOperand> inputs, long axis);
};

The spec currently limits the value of axis to the interval [0, N), where N is the rank of all the inputs. Since axis can never be negative, we may use unsigned long for it.

@huningxin
Contributor

huningxin commented Feb 23, 2023

@miaobin, thanks for raising this issue. Given that the valid value of axis is in the range [0, N), it makes sense to me that it should be of type unsigned long.

There may also be an opportunity to unify the axis definitions across this spec. For example, MLSplitOptions.axis is also long, and the spec says "A negative value is interpreted as counting back from the end." This is inconsistent with concat, as #307 mentions.

An unsigned long axis may simplify the mapping to some native ML APIs: DirectML's DML_JOIN_OPERATOR_DESC and DML_SPLIT_OPERATOR_DESC declare axis as UINT, and XNNPACK's xnn_define_concatenate2() and xnn_define_even_split2() declare axis as size_t. However, other native ML APIs use signed integers: MPSGraph's concatTensors and splitTensor use NSInteger for axis, and NNAPI's ANEURALNETWORKS_CONCATENATION and ANEURALNETWORKS_SPLIT use ANEURALNETWORKS_INT32 as well.

That said, I suppose a negative axis value might be useful from the perspective of a user of the graph API. Because the MLOperand interface doesn't expose the rank, user code can simply use -1 to specify the last dimension without knowing the rank, as with reductions.

The implementation should not be much more complex either, because internally it knows the rank. Converting a negative axis value to a positive one (if the native ML API requires it) can be as simple as axis += rank.
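
A minimal sketch of that conversion, assuming an implementation that accepts a signed axis and maps it to a native API expecting a non-negative one (the helper name NormalizeAxis is illustrative, not from the spec or any CL):

#include <cstdint>
#include <optional>

// Validates a signed axis against the rank known at build time and folds a
// negative value into [0, rank) with axis += rank, as described above.
std::optional<uint32_t> NormalizeAxis(int32_t axis, uint32_t rank) {
  const int64_t signed_rank = static_cast<int64_t>(rank);
  if (axis < -signed_rank || axis >= signed_rank)
    return std::nullopt;  // Out of the valid [-rank, rank) range.
  return static_cast<uint32_t>(axis < 0 ? axis + signed_rank : axis);
}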

WDYT?

@BruceDai
Contributor

> And there may be an opportunity to unify the axis definitions across this spec.

Linking to #307: batchNormalization also uses an axis option.

huningxin added a commit to huningxin/webnn that referenced this issue Feb 23, 2023
@huningxin
Contributor

huningxin commented Mar 1, 2023

This is related to the Chromium concat CL. In the CL review, @wacky6 mentioned:

> FYI, I do have a slight preference for using unsigned long

Jiewei, feel free to share your additional thoughts. Thanks!

@wacky6

wacky6 commented Mar 2, 2023

There are a few things to consider in the WebNN context:

  • Intended user: do we expect developers to use this API directly (where negative indices make more sense), or via a framework (e.g. tfjs, which can easily figure out the correct index before calling this API)?
  • Implementation complexity and robustness: a signed integer needs extra caution during the validation stage. The existing CL's CheckedNumeric plus the if (x < 0) handling takes more mental energy to understand than unsigned math (do the math, then AssignIfValid); see the sketch after this list.
  • Last-dimension usage: I think this is related to the static vs. dynamic graph issue. If we only support static graphs (i.e. tensor shapes can't change after graph initialization), then all dimensions are known at graph build time. In that case a negative index is mere syntactic sugar, at the cost of more complicated and error-prone C++ math and more WPT test cases.
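
A minimal sketch of the unsigned-math validation described in the second point, versus the extra care a signed axis needs; names are illustrative and not taken from the actual CL:

#include <cstdint>

// With an unsigned long axis, validation is a single range check against the
// rank; there is no negative branch to reason about.
bool IsValidConcatAxis(uint32_t axis, uint32_t rank) {
  return axis < rank;
}
// With a signed axis, the same validation must first reject values outside
// [-rank, rank) and then fold negatives into [0, rank), which is where the
// CheckedNumeric-style care comes in.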

@huningxin
Contributor

huningxin commented Mar 2, 2023

@wacky6, thanks for your perspective as an implementation reviewer, very helpful!

  • Intended user:

WebNN aims to be a framework backend, so the users of this API would most likely be framework developers.

I agree with you that a framework can support negative axis in its own way and do the conversion if the backend API only supports an unsigned integer axis.

For example, although TFLite supports negative axis, XNNPACK only supports an unsigned integer axis. The TFLite XNNPACK delegate does the conversion with if (axis < 0) axis += rank. As another example, the ONNX Runtime DirectML execution provider uses GetDmlAdjustedAxis() to convert the negative axis supported by ONNX to the unsigned integer axis supported by DirectML.

So if WebNN supports an unsigned integer axis, a framework that uses WebNN as its backend can implement the conversion in a similar way.

  • Implementation complexity and robustness:

I agree the unsigned math would be much easier and less error-prone.

If WebNN supports an unsigned integer axis, an implementation targeting a backend that uses a signed integer axis, such as MPSGraph, needs to be cautious about overflow when converting the unsigned integer to a signed integer.
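
A minimal sketch of such an overflow-checked conversion, not tied to any particular backend API (ToSignedAxis is a hypothetical helper):

#include <cstdint>
#include <limits>
#include <optional>

// Converts an unsigned axis to the signed type a backend expects, failing
// instead of overflowing when the value does not fit.
std::optional<int64_t> ToSignedAxis(uint64_t axis) {
  if (axis > static_cast<uint64_t>(std::numeric_limits<int64_t>::max()))
    return std::nullopt;
  return static_cast<int64_t>(axis);
}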

  • The last dimension usage

I agree it is less useful given that WebNN only supports static graphs.

@huningxin
Contributor

huningxin commented Mar 2, 2023

Another point: this spec should make the axis definition consistent across operations, including batchNormalization as @BruceDai mentioned in #307, and transpose as @lisa0314 mentioned in #317. I copied Bruce's table below and added all the ops that use axis/axes for your reference.

| Op | Axis | Type | Value description |
| --- | --- | --- | --- |
| batchNormalization | MLBatchNormalizationOptions.axis | long | When it's not specified, the default value is 1. When input is a 4-D tensor of the "nchw" or "nhwc" layout, options.axis should be set to 1 or 3 respectively. |
| concat | axis (second parameter) | long | The value is in the interval [0, N) where N is the rank of all the inputs. |
| split | MLSplitOptions.axis | long | Defaults to 0. A negative value is interpreted as counting back from the end. |
| Reduction operations | MLReduceOptions.axes | sequence&lt;long&gt; | The dimensions to reduce, where -1 means the last dimension. If not present, all dimensions are reduced. |
| resample2d | MLResample2dOptions.axes | sequence&lt;long&gt; | A sequence of length 2. The two consecutive dimensions of the input tensor to which the interpolation algorithm applies. The valid values in the sequence are [0, 1], [1, 2] or [2, 3]. When not specified, the sequence is assumed to be [2, 3]. |
| slice | MLSliceOptions.axes | sequence&lt;long&gt; | The dimensions of the input shape to which starts and sizes apply. The values in the sequence are either within the [0, r-1] range where r is the input tensor rank, or the [-r, -1] range where negative values mean counting back from the end of the input shape. When not specified, the sequence is assumed to be [0, 1, ..., r-1]. |
| squeeze | MLSqueezeOptions.axes | sequence&lt;long&gt; | Indices to the shape dimensions of size 1 to eliminate. When not specified, every shape dimension of size 1 in the tensor is eliminated. |
| transpose | MLTransposeOptions.permutation | sequence&lt;long&gt; | The values used to permute the output shape. When not specified, it's set to [N-1, ..., 0], where N is the rank of the input tensor, which makes the output a transposed tensor of the input. When specified, the number of values must equal the rank of the input tensor, and the values must be within the range 0 to N-1 with no duplicates. |

@huningxin
Contributor

/cc @wchao1115 for inputs. Thanks!

@wchao1115
Collaborator

@fdwr

@fdwr
Collaborator

fdwr commented Mar 3, 2023

@huningxin Thanks for the holistic table above comparing operators. From the consistency POV, I strongly agree that all operators sharing similar indices/axes should either all adopt the special negative-number policy or all omit it (so if split supports negative axes, then concat should too, or vice versa). I don't find the negative-number policy that convenient anyway, given other existing constraints in WebNN like ahead-of-time shape inference. I'm not sure who the original target audience for WebNN is (more data scientists vs. more framework implementers), but my view has been that WebNN is lower level than, say, PyTorch or TensorFlow, and should be more explicit, leaving higher-level framework policies like this to be resolved before reaching WebNN. This also makes testing WebNN conformance simpler.

The one case where I would argue in favor of negative axes, for generality, is a model that supports arbitrary-rank inputs (e.g. although uncommon, ONNX models allow arbitrary-rank inputs 1D/2D/3D... by leaving the input TensorShapeProto shape blank, where from-the-end indexing is useful). But since WebNN requires statically known shapes ahead of time, there should be no ambiguous cases where negative numbers can't be resolved ahead of time.

@huningxin
Contributor

@fdwr , great points!

> This also makes testing WebNN conformance simpler.

+1. And I think WebNN test developers, @BruceDai , would be happy about that.

With the input from both @fdwr and @wacky6, I'll drop PR #352 and make another PR that unifies the axis/axes definitions by using an unsigned integer with a consistent value range. I'll invite you to review once it is ready.

Thanks!

@huningxin
Contributor

PR #359, which uses an unsigned integer axis, is ready for review. Thanks!

dontcallmedom pushed a commit that referenced this issue Mar 27, 2023