
Add matmul #49

Merged
merged 2 commits into webmachinelearning:master on Apr 2, 2020
Conversation

huningxin
Contributor

@huningxin huningxin commented Mar 11, 2020

This PR is based on the matmul signature proposal and compatibility table.

@wchao1115 @nsthorat @BenjaminPoulain and @anssiko, please take a look. Thanks!


Preview | Diff

@wchao1115 wchao1115 self-requested a review March 19, 2020 04:56
@wchao1115
Collaborator

@huningxin numpy behavior is that if only one of the two operands is 1D, the promotion of the 1D tensor to 2D takes place. But if both are 1D, the operation is a dot product, returning a scalar tensor.

  1. It might be more clear to add the case when both operands are 1D.
  2. As for the scalar output, can Operand be a scalar value? What's the convention?

@huningxin
Contributor Author

huningxin commented Mar 19, 2020

@wchao1115 , thanks for the review.

  1. It might be more clear to add the case when both operands are 1D.

This case can be covered by

  • If a is 1-D, it is converted to a 2-D tensor by prepending a 1 to its dimensions.
  • If b is 1-D, it is converted to a 2-D tensor by appending a 1 to its dimensions.

But it is still a good idea to be clearer. I'll add this case.

  2. As for the scalar output, can Operand be a scalar value? What's the convention?

An Operand can be a scalar value. In that case, its corresponding OperandDescriptor has a scalar type, e.g. float32.

For example, to create a scalar float32 constant with value 0:

const tensor = nn.constant({type: 'float32'}, new Float32Array([0.0]));
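As an aside, the convention above can be illustrated with a plain-JavaScript sketch (not the WebNN API; the helper name is hypothetical): a scalar is effectively a rank-0 tensor, i.e. a descriptor with no dimensions, holding exactly one element.

```javascript
// Hypothetical helper: the element count of a shape is the product of
// its dimensions. An empty (absent) dimensions list describes a scalar
// and therefore yields exactly one element.
function elementCount(dimensions = []) {
  return dimensions.reduce((count, d) => count * d, 1);
}

console.log(elementCount([]));     // scalar: 1
console.log(elementCount([2, 3])); // 2x3 matrix: 6
```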

@gramalingam
Contributor

Note that numpy does not allow inputs to be scalar. https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.matmul.html

@huningxin
Contributor Author

huningxin commented Mar 19, 2020

Thanks @gramalingam for the comment.

Note that numpy does not allow inputs to be scalar. https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.matmul.html

Yes. It follows numpy's behavior. Do you think we need to mention it specifically?

@gramalingam
Contributor

@huningxin , that's good. I was just confused by the previous two comments into thinking that "Operand can be a scalar" meant the input could be a scalar. It may be worth being explicit in the documentation, thanks.

index.bs Outdated
Comment on lines 307 to 319
**Returns:** an {{Operand}}. The output N-D tensor that contains the matrix
product of two input tensors.

Computes the matrix product of two input tensors. It behaves as follows:
- If both *a* and *b* are 2-D, they are multiplied like conventional
matrices.
- If either *a* or *b* is N-D, N > 2, it is treated as a stack of
matrices (rank-3) with dimensions corresponding to the last two
indices. The matrix multiplication will be broadcasted accordingly.
- If *a* is 1-D, it is converted to a 2-D tensor by prepending a 1 to
its dimensions.
- If *b* is 1-D, it is converted to a 2-D tensor by appending a 1 to
its dimensions.
Collaborator

Please consider this wording instead. Hopefully it spells out more fully how the API would behave.

**Returns:** an {{Operand}}. The output N-D tensor that contains the matrix
product of two input tensors.

Computes the matrix product of two input tensors. It behaves as follows:
    - If both *a* and *b* are 2-D, they are multiplied like conventional
        matrices and produce a 2-D tensor as the output.
    - If either *a* or *b* is N-D, N > 2, it is treated as a stack of matrices
        (rank-3) with dimensions corresponding to the last two indices. The
        other input broadcasts its dimensions accordingly to allow for matrix 
        multiplications of the last two dimensions of both inputs. The output
        is an N-D tensor of the same shape as the N-D input.
    - If only *a* is 1-D, it is converted to a 2-D tensor by prepending a 1 to
        its dimensions.
    - If only *b* is 1-D, it is converted to a 2-D tensor by appending a 1 to
        its dimensions.
    - If both *a* and *b* are 1-D, the operation is a vector dot-product, which
        produces a scalar output.
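To make the five cases concrete, here is a plain-JavaScript sketch (not the WebNN API; the function name is hypothetical) that computes the output shape from the two input shapes, following the rules above plus the numpy convention that a dimension introduced by 1-D promotion is removed from the result:

```javascript
// Hypothetical shape-inference sketch for matmul. Shapes are arrays of
// dimension sizes; a scalar is represented by the empty array.
function matmulOutputShape(aDims, bDims) {
  // Both 1-D: vector dot-product, scalar (rank-0) output.
  if (aDims.length === 1 && bDims.length === 1) {
    if (aDims[0] !== bDims[0]) throw new Error('incompatible vectors');
    return [];
  }
  // Promote a 1-D operand to 2-D: prepend 1 to a, append 1 to b.
  const a = aDims.length === 1 ? [1, aDims[0]] : aDims.slice();
  const b = bDims.length === 1 ? [bDims[0], 1] : bDims.slice();
  const [m, kA] = a.slice(-2);
  const [kB, n] = b.slice(-2);
  if (kA !== kB) throw new Error('inner dimensions do not match');
  // Broadcast the leading (batch) dimensions, numpy-style:
  // align right, a size of 1 stretches to match the other side.
  const aBatch = a.slice(0, -2);
  const bBatch = b.slice(0, -2);
  const rank = Math.max(aBatch.length, bBatch.length);
  const batch = [];
  for (let i = rank; i >= 1; --i) {
    const dA = aBatch[aBatch.length - i] ?? 1;
    const dB = bBatch[bBatch.length - i] ?? 1;
    if (dA !== dB && dA !== 1 && dB !== 1) throw new Error('cannot broadcast');
    batch.push(Math.max(dA, dB));
  }
  const out = [...batch, m, n];
  // Remove the dimension that 1-D promotion introduced.
  if (aDims.length === 1) out.splice(out.length - 2, 1);
  if (bDims.length === 1) out.pop();
  return out;
}

console.log(matmulOutputShape([2, 3, 4, 5], [5, 4])); // [2, 3, 4, 4]
console.log(matmulOutputShape([3], [3]));             // [] (scalar)
console.log(matmulOutputShape([3], [3, 4]));          // [4]
```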

Contributor

@gramalingam gramalingam Mar 19, 2020

What does "rank-3" mean above? Shouldn't it be (rank-2)? How about saying

... it is treated as a stack of matrices of rank (N-2) with matrix dimensions
corresponding to the last two dimensions

Collaborator

What does "rank-3" mean above? Shouldn't it be (rank-2)? How about saying

... it is treated as a stack of matrices of rank (N-2) with matrix dimensions
corresponding to the last two dimensions

That was the original wording from @huningxin. I believe it refers to the stacking part of the sentence and not the matrix part, i.e. the stacking is considered a third rank. But I agree it's confusing. I like your suggestion but also think the sentence is already clear without it. How about we just remove the '(rank-3)' part?

Contributor

Since this is the first use of broadcasting, we should also include some general description of broadcasting. The description above is a little imprecise if the broadcasting is bidirectional ... I assume it is bidirectional?

Contributor

@gramalingam gramalingam Mar 19, 2020

How about replacing the line

The output is an N-D tensor of the same shape as the N-D input.

by

If the two tensors have dimensions [a1, ..., ap, M, N] and [b1, ..., bq, N, K],
then the output is a tensor of dimensions [c1, ..., cr, M, K] where [c1, ..., cr]
is the result of broadcasting [a1, ... ap] and [b1, ..., bq].
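A small plain-JavaScript sketch of that formula (the helper name is hypothetical): given the leading dimensions [a1, ..., ap] and [b1, ..., bq], the broadcast result [c1, ..., cr] is computed by aligning the two lists on the right and letting a size of 1 stretch to match the other side, per the numpy rule.

```javascript
// Hypothetical helper: numpy-style broadcasting of two batch-dimension
// lists. Dimensions are aligned on the right; a missing dimension is
// treated as 1, and a size of 1 stretches to match the other side.
function broadcastBatchDims(a, b) {
  const r = Math.max(a.length, b.length);
  const out = new Array(r);
  for (let i = 1; i <= r; ++i) {
    const dA = a[a.length - i] ?? 1;
    const dB = b[b.length - i] ?? 1;
    if (dA !== dB && dA !== 1 && dB !== 1) {
      throw new Error(`cannot broadcast ${dA} with ${dB}`);
    }
    out[r - i] = Math.max(dA, dB);
  }
  return out;
}

// e.g. inputs of shapes [8, 1, M, N] and [3, N, K]:
console.log(broadcastBatchDims([8, 1], [3])); // [8, 3] -> output [8, 3, M, K]
```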

Contributor

Removing the (rank-3) part is also fine.

Collaborator

Since we decided to follow the numpy broadcasting rule, I think it's a good idea to add a link referencing the numpy rule here in the API spec. The ONNX broadcasting rule is compatible with the numpy rule but further classifies broadcasting into two categories -- either uni- or multi-directional. In this case, since it only concerns 2 operands and either one of them can be N-D, this distinction, I believe, is not relevant here, i.e. the broadcasting behavior of the N-D operand simply follows the numpy rule as previously stated.

Contributor Author

@huningxin huningxin Mar 20, 2020

@wchao1115 @gramalingam thanks for your comments.

What does "rank-3" mean above?

'(rank-3)' refers to the stack. I agree it is confusing. I updated the PR to remove it.

Since we decided to follow the numpy broadcasting rule, I think it's a good idea to add a link referencing the numpy rule here in the API spec.

It sounds good to me. I will add that link as a normative reference to the PR. How about saying

The matrix multiplication will be broadcasted accordingly by
following [numpy-broadcasting-rule]. 

The output is an N-D tensor of the same shape as the N-D input.

This may not be true. For example, if a has shape [2, 3, 4, 5] and b has shape [5, 4], the resulting tensor will have a shape of [2, 3, 4, 4].

As we refer to the numpy broadcasting rule, how about we say

The output is an N-D tensor whose rank is the maximum rank of the input tensors. 
For each dimension of the output tensor, its size is the maximum size along that 
dimension of the input tensors.
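For the [2, 3, 4, 5] x [5, 4] example above, the resulting shape can be checked by hand with a small plain-JavaScript calculation (shapes taken from the comment; the batch/matrix split is an illustrative assumption):

```javascript
// a: [2, 3, 4, 5] -> batch dims [2, 3], matrix dims [4, 5] (M=4, N=5)
// b: [5, 4]       -> no batch dims,     matrix dims [5, 4] (N=5, K=4)
const aShape = [2, 3, 4, 5];
const bShape = [5, 4];
// Broadcasting [2, 3] with b's empty batch leaves [2, 3]; the matrix
// product of [4, 5] and [5, 4] contributes [M, K] = [4, 4].
const outShape = [
  ...aShape.slice(0, -2),     // broadcast batch dims: [2, 3]
  aShape[aShape.length - 2],  // M = 4
  bShape[bShape.length - 1],  // K = 4
];
console.log(outShape); // [2, 3, 4, 4]
```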

Contributor

Regarding the output dimension: yes, we need to change "The output is an N-D tensor of the same shape as the N-D input." @huningxin 's proposal looks good, except that "the maximum size along that dimension of the input tensors" does not apply to the last two dimensions, which are subject to the matrix-multiplication rules. (Which is why I suggested a slightly different wording above; but we can adapt Ningxin's words accordingly as well.)

Contributor Author

@gramalingam, your suggestion sounds good. I'll incorporate that. Thanks.

@huningxin
Contributor Author

@wchao1115 @gramalingam , I pushed a new commit 2fd2baf to address your comments. Please take another look. Thanks!

@huningxin
Contributor Author

@wchao1115 @gramalingam , thanks for the review and approval.

@anssiko
Member

anssiko commented Apr 2, 2020

This PR has received adequate review and all the feedback has been addressed.

@huningxin @wchao1115 feel free to merge this PR.

@huningxin huningxin merged commit b341ec2 into webmachinelearning:master Apr 2, 2020
4 participants