Handling unsupported OperandType #36
@huningxin Should "tensor-quant8-asymm" simply be "tensor-int8"? Does the current naming imply that the content of the tensor is coupled with quantization data, e.g. a zero point? It might be cleaner to define the data type here independently of the operation (e.g. quantization) that originated it.
Yes. It is supposed to be used together with
I notice that native APIs support quantization in different ways: ONNX defines quantization operations, such as QLinearConv, while NNAPI defines quantized tensor types. We may need to create a new issue to investigate compatibility among them. WDYT?
@huningxin My feedback is not about compatibility. If this enum is meant to describe the data type of an operand, then it should not mix with the notion of a process to be applied with that operand. It just makes it more confusing and hard to evolve over time. If an API requires or produces quantized data blocks, naturally it will also require or produce quantization parameters e.g. zero point and scale (for linear quantization). The fact that the quantized data type is int8 or int4 is independent of the quantization process the data type is involved in. Therefore, defining an enum that mixes the two concepts together is not a good design and makes it hard to extend it in the future. |
@wchao1115, thanks for your feedback. This enum is used by
@huningxin You can define
This sounds like a good design. @wchao1115, could you please open a new issue for this proposal? Thanks.
I agree. To rephrase the question as I understand it: do the quantization parameters (zero point/scale) have to be part of the type (fixed statically), or can we let them be determined dynamically?
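The two options in that question can be made concrete with a small sketch (hypothetical names, not the spec API): (a) parameters fixed statically as part of the operand's type, or (b) parameters supplied as ordinary runtime inputs.

```javascript
// Illustrative only: both shapes below are made up for this comparison.

// (a) Static: quantization parameters are baked into the operand descriptor,
// so they are known at graph-construction time and cannot change per run.
const staticOperand = {
  type: 'tensor-quant8-asymm',
  dimensions: [10],
  scale: 0.05,
  zeroPoint: 128,
};

// (b) Dynamic: the operand carries only a storage type, and the parameters
// are fed as separate inputs at execution time (e.g. computed per batch).
const dynamicOperand = { type: 'tensor-int8', dimensions: [10] };
const scaleInput = { type: 'float32' };
const zeroPointInput = { type: 'int32' };
```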
Created new issue #44.
@wchao1115, do you think we still need this issue? I suppose #44 is only about redesigning the quantization, am I correct?
Yes |
Per the discussion in the Feb 20 CG meeting, reopening this issue. When hardware doesn't support a data type in a model, the compilation should fail with an exception. The proposal would depend on #46, which changes the compilation interface.
When an API throws an exception, it usually means it hit a condition it doesn't expect and that its immediate caller can't easily recover from, e.g. a hardware failure in the middle of an I/O operation. This is why an exception is not the same as a returned error code, and why a catch statement at the call site should be rare. A data type not supported by the underlying hardware is an error code, not an exception, because there is a defined behavior for what the immediate caller of the API should do to recover from that condition, i.e. compile with a fallback option. To handle this situation, we will need to look at the current
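A minimal sketch of the error-code-plus-fallback pattern described above, assuming a made-up device-capability table and compile function (nothing here is the actual API):

```javascript
// Hypothetical device capability table; names and types are illustrative.
const deviceSupport = {
  'npu': new Set(['tensor-float16', 'tensor-int8']),
  'cpu': new Set(['tensor-float32', 'tensor-float16', 'tensor-int8']),
};

// Compilation reports an unsupported operand type as a structured result
// (an "error code") rather than throwing, because the caller has a defined
// recovery path: try the next device.
function compile(model, device) {
  const missing = model.operandTypes.filter(
    (t) => !deviceSupport[device].has(t));
  return missing.length === 0
    ? { ok: true, device }
    : { ok: false, error: 'unsupported-operand-type', missing };
}

function compileWithFallback(model, devices) {
  for (const device of devices) {
    const result = compile(model, device);
    if (result.ok) return result; // recovered without any try/catch
  }
  // Only when no fallback exists is the condition truly exceptional.
  throw new Error('no device supports this model');
}
```

With this shape, an app targeting float32 first tries the NPU, gets an unsupported-operand-type result, and silently falls back to the CPU.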
Was this solved by #50?
Indeed we can close this per #50 (comment), @wchao1115 please feel free to re-open if you think otherwise. |
The current spec defines the OperandType enum. The float16 and tensor-float16 types are being added in #35. However, as mentioned by @wchao1115 in #26 (comment), there are situations where the selected device doesn't have native support for an OperandType. For example, some CPUs may not support tensor-float16, some GPUs may not support tensor-quant8-asymm, and some AI accelerators may not support tensor-float32. To allow the app to gracefully handle these situations, e.g. select a different device or use a different model with a supported operand type, the API should report an unsupported OperandType error.

Opening this issue to explore the definition of the unsupported OperandType error and the API behavior for returning it.
Thoughts?