
Regression: Torch exported Onnx doesn't run after Onnxruntime>=1.17 update - [ShapeInferenceError] #20808

Closed
MengLinMaker opened this issue May 24, 2024 · 6 comments


MengLinMaker commented May 24, 2024

Describe the issue (Issue solved, see closing comment)

Previously, a Torch==2.3 model exported to Onnx would run on Onnxruntime==1.16.

Currently, the same exported model runs on neither Onnxruntime==1.17 nor Onnxruntime==1.18.

The error originates from: [ShapeInferenceError] First input does not have rank 2

However, I could not track down the location of the error using Netron.

It's also possible that the problem is caused by torch.onnx.export, which can be found in this tutorial.

Truncated log, for context:

  File "/Users/menglinmaker/Documents/2-Engineering/Personal/Musidi/transcribe/src/transcribe/inference.py", line 19, in inference
    model = InferenceSession(f'layer/model_onnx/{onnx_path}', providers=['CPUExecutionProvider'])
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/menglinmaker/Documents/2-Engineering/Personal/Musidi/transcribe/.venv/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/Users/menglinmaker/Documents/2-Engineering/Personal/Musidi/transcribe/.venv/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 483, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)

Last line of the log - I believe this is the cause of the error:

onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Node (MatMulBnFusion_Gemm) Op (Gemm) [ShapeInferenceError] First input does not have rank 2

To reproduce

I'm using a custom model. If necessary, I can create a Google Colab notebook if that helps; a hypothetical sketch of the failing pattern is included below.
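In the meantime, here is a minimal, hypothetical sketch (not the author's actual model) of the shape situation the error names: a Linear (exported as MatMul) feeding a BatchNorm over a rank-3 input, so the first input of a fused Gemm would not be rank 2. Whether this exact snippet trips onnxruntime's MatMul + BatchNormalization fusion depends on the torch and onnxruntime versions:

import torch
import onnxruntime

# Hypothetical stand-in model: Linear (MatMul) followed by BatchNorm
class TinyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 8)
        self.bn = torch.nn.BatchNorm1d(4)  # normalizes the channel dim of a rank-3 input

    def forward(self, x):
        return self.bn(self.linear(x))

model = TinyModel().eval()
dummy = torch.randn(2, 4, 8)  # rank-3 input, so the MatMul input is not rank 2
torch.onnx.export(model, dummy, "tiny.onnx")  # relies on torch's default opset

# Session creation is where the regression surfaced on Onnxruntime>=1.17
session = onnxruntime.InferenceSession("tiny.onnx", providers=["CPUExecutionProvider"])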

Urgency

I can no longer update Onnxruntime without breaking the application.

Platform

Mac

OS Version

macOS Sonoma 14.5

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.17.0 and above

ONNX Runtime API

Python

Architecture

ARM64

Execution Provider

Default CPU

Execution Provider Library Version

No response

@MengLinMaker MengLinMaker changed the title Regression: Torch exported Onnx doesn't run after OnnxRT update Regression: Torch exported Onnx doesn't run after Onnxruntime>=1.17 update May 24, 2024
MengLinMaker commented May 24, 2024

Here's the DEBUG (0) log from Onnxruntime - I couldn't find any helpful info:

2024-05-24 23:25:21.261904 [I:onnxruntime:, inference_session.cc:533 TraceSessionOptions] Session Options {  execution_mode:0 execution_order:DEFAULT enable_profiling:0 optimized_model_filepath: enable_mem_pattern:1 enable_mem_reuse:1 enable_cpu_mem_arena:1 profile_file_prefix:onnxruntime_profile_ session_logid: session_log_severity_level:0 session_log_verbosity_level:0 max_num_graph_transformation_steps:10 graph_optimization_level:3 intra_op_param:OrtThreadPoolParams { thread_pool_size: 0 auto_set_affinity: 0 allow_spinning: 1 dynamic_block_base_: 0 stack_size: 0 affinity_str:  set_denormal_as_zero: 0 } inter_op_param:OrtThreadPoolParams { thread_pool_size: 0 auto_set_affinity: 0 allow_spinning: 1 dynamic_block_base_: 0 stack_size: 0 affinity_str:  set_denormal_as_zero: 0 } use_per_session_threads:1 thread_pool_allow_spinning:1 use_deterministic_compute:0 config_options: {  } }
2024-05-24 23:25:21.262142 [I:onnxruntime:, inference_session.cc:433 operator()] Flush-to-zero and denormal-as-zero are off
2024-05-24 23:25:21.262152 [I:onnxruntime:, inference_session.cc:441 ConstructorCommon] Creating and using per session threadpools since use_per_session_threads_ is true
2024-05-24 23:25:21.262158 [I:onnxruntime:, inference_session.cc:459 ConstructorCommon] Dynamic block base set to 0
2024-05-24 23:25:21.289917 [I:onnxruntime:, inference_session.cc:1602 Initialize] Initializing session.
2024-05-24 23:25:21.304691 [I:onnxruntime:, graph_partitioner.cc:900 InlineFunctionsAOT] This model does not have any local functions defined. AOT Inlining is not performed
2024-05-24 23:25:21.305127 [I:onnxruntime:, graph_transformer.cc:15 Apply] GraphTransformer EnsureUniqueDQForNodeUnit modified: 0 with status: OK
2024-05-24 23:25:21.322771 [I:onnxruntime:, graph_transformer.cc:15 Apply] GraphTransformer Level1_RuleBasedTransformer modified: 1 with status: OK
multiprocessing.pool.RemoteTraceback:
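For reference, a sketch of how a log at this verbosity can be produced (the model path is a placeholder); setting SessionOptions.log_severity_level to 0 enables verbose output:

import onnxruntime

so = onnxruntime.SessionOptions()
so.log_severity_level = 0  # 0 = verbose, the most detailed level
session = onnxruntime.InferenceSession(
    "model.onnx",  # hypothetical path to the exported model
    sess_options=so,
    providers=["CPUExecutionProvider"],
)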

@MengLinMaker MengLinMaker changed the title Regression: Torch exported Onnx doesn't run after Onnxruntime>=1.17 update Regression: Torch exported Onnx doesn't run after Onnxruntime>=1.17 update - [ShapeInferenceError] May 24, 2024

MengLinMaker commented May 25, 2024

Solved the issue!

Cause:

Onnx opset version was not compatible with onnxruntime.

Note: This is not an issue with ONNXRuntime.


Fix:

  1. Check which ONNX opset and onnx version your onnxruntime release requires. E.g. onnxruntime==1.18 requires onnx==1.16 and opset 21 (a version-check sketch follows the snippet in step 2).

  2. Upgrade the model's ONNX opset:

import onnx
from onnx import version_converter

modelPath = 'model.onnx'  # placeholder: path to the exported model

oldModel = onnx.load(modelPath)
# Rewrite the model to the opset version onnxruntime expects (21 here)
upgradedModel = version_converter.convert_version(oldModel, 21)
onnx.save(upgradedModel, modelPath)
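For the check in step 1 - assuming standard onnx and onnxruntime installs and a hypothetical model path - the installed package versions and the opset the model was exported with can be printed like this:

import onnx
import onnxruntime

print('onnx:', onnx.__version__)
print('onnxruntime:', onnxruntime.__version__)

# Each entry in opset_import names a domain and the opset version the model uses
model = onnx.load('model.onnx')  # hypothetical path
for opset in model.opset_import:
    print(opset.domain or 'ai.onnx', opset.version)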

MengLinMaker commented May 31, 2024

This issue will be updated in a few months from the 31st of May 2024:
pytorch/pytorch#127167

As a general best practice, I recommend explicitly stating the ONNX opset version when exporting; a sketch follows.
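A minimal sketch of that recommendation, using a hypothetical tiny model and opset 17 as the pinned target:

import torch

# Hypothetical stand-in for the real model
model = torch.nn.Linear(8, 8).eval()
dummy_input = torch.randn(1, 8)

# Pin the opset explicitly rather than relying on the exporter's default
torch.onnx.export(model, dummy_input, 'model.onnx', opset_version=17)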

csukuangfj (Contributor) commented:

> Onnx opset version was not compatible with onnxruntime.

@MengLinMaker Could you list any references for it?

The page
https://onnxruntime.ai/docs/reference/compatibility.html#onnx-opset-support
says:

> ONNX Runtime supports all opsets from the latest released version of the ONNX spec. All versions of ONNX Runtime support ONNX opsets from ONNX v1.2.1+ (opset version 7 and higher).

which means the latest onnxruntime would support all opsets >= 7.


MengLinMaker commented Jul 10, 2024

> > Onnx opset version was not compatible with onnxruntime.
>
> @MengLinMaker Could you list any references for it?
>
> The page
> https://onnxruntime.ai/docs/reference/compatibility.html#onnx-opset-support
> says:
>
> > ONNX Runtime supports all opsets from the latest released version of the ONNX spec. All versions of ONNX Runtime support ONNX opsets from ONNX v1.2.1+ (opset version 7 and higher).
>
> which means the latest onnxruntime would support all opsets >= 7.

Yep, there are no issues with ONNX Runtime that I could find, except that the statement you referenced is inaccurate: a specific opset version was required to solve this issue.

PyTorch hard-codes the default opset for ONNX conversion; a sketch showing how to see which default it picks follows.
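One way to observe the hard-coded default, with a hypothetical tiny model: export without specifying opset_version and read the opset back out of the file.

import torch
import onnx

# Hypothetical tiny model; export without specifying opset_version
model = torch.nn.Linear(4, 4).eval()
torch.onnx.export(model, torch.randn(1, 4), 'default_opset.onnx')

# Read back the opset the exporter chose by default
exported = onnx.load('default_opset.onnx')
print([(o.domain or 'ai.onnx', o.version) for o in exported.opset_import])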

MengLinMaker commented:

Update:
This issue should be solved by pytorch/pytorch#134571.
PyTorch will be able to generate opset 21 ONNX for ONNXRuntime>=1.17 once the PR is merged and a new package is released.
