Enable provider unit tests for TensorRT #802
Conversation
Would have expected more changes than just enabling the unit tests for TensorRT... some are sure to fail in the current state.
It would be nice to have some comments about why specific tests are being disabled.
onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.h
-  static const int kMaxBatchSize = 1;
-  static const int kMaxWorkSpaceSize = 1 << 30;
+  static const int kMaxBatchSize = 13;
+  static const int kMaxWorkSpaceSize = 1 << 24;
why is this changing?
It's better to add an env variable for MaxWorkSpaceSize too.
@@ -50,6 +51,9 @@ void TestConvOp(const ConvOpAttributes& attributes,
   if (!is_mkldnn_supported) {
     excluded_providers.insert(kMklDnnExecutionProvider);
   }
+  if (!is_tensorrt_supported) {
+    excluded_providers.insert(kTensorrtExecutionProvider);
None of the conv tests pass for TensorRT, right?
In that case, maybe it's better to not have an "is_tensorrt_supported" option at all. (Our expectation is that they all pass once weights as inputs is supported?)
We can remove the is_mkldnn_supported option as well; I think that's also not needed anymore.
The reason is that even though all the tests in the same file fail, the causes of failure may differ. We may need to re-enable some of the tests once certain issues are fixed while others remain open.
I think it's better to aim for fewer code changes to improve readability and maintainability.
The current state is that all the tests fail for Conv, ConvTranspose, and others. In general, I think it's better not to add code which is not used.
@@ -45,7 +46,11 @@ void TestConvTransposeOp(const ConvTransposeOpAttributes& attributes,
     test.AddInput<float>(szNames[i], input_shapes[i], inputs[i]);
   }
   test.AddOutput<float>("Y", expected_output_shape, expected_output);
-  test.Run(expect_result, err_str);
+  std::unordered_set<std::string> excluded_providers;
+  if (!is_tensorrt_supported) {
Similar comment to Conv above.
Maybe it's easier to just exclude TensorRT, and not have an option for the case where we need to disable all the tests.
…unsupported tests
onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.h
@@ -50,6 +50,7 @@ void TestConvOp(const ConvOpAttributes& attributes,
   if (!is_mkldnn_supported) {
     excluded_providers.insert(kMklDnnExecutionProvider);
   }
+  excluded_providers.insert(kTensorrtExecutionProvider);
Please add a comment explaining why.
@@ -45,7 +45,7 @@ void TestConvTransposeOp(const ConvTransposeOpAttributes& attributes,
     test.AddInput<float>(szNames[i], input_shapes[i], inputs[i]);
   }
   test.AddOutput<float>("Y", expected_output_shape, expected_output);
-  test.Run(expect_result, err_str);
+  test.Run(expect_result, err_str, {kTensorrtExecutionProvider});
Ditto. Please try to add more comments broadly explaining why certain tests are disabled.
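For example, the exclusion could carry a short note along these lines (a sketch only; the comment wording is inferred from the weights-as-inputs discussion above, not taken from the PR):

  // TensorRT does not yet support convolution weights passed as explicit inputs,
  // so these tests currently fail on the TensorRT execution provider.
  // Exclude it until weights-as-inputs support lands.
  test.Run(expect_result, err_str, {kTensorrtExecutionProvider});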
tools/ci_build/build.py
@@ -480,6 +480,8 @@ def setup_tensorrt_vars(args):
                  "tensorrt_home='{}' valid={}."
                  .format(tensorrt_home, tensorrt_home_valid))

+    os.environ["TENSORRT_MAX_BATCH_SIZE"] = "13"
Add a comment? BTW, why 13? Is that the maximum batch size that we test in the unit tests?
If someone adds a test which requires a larger batch size, let's make it easy for them to figure out why.
 private:
  int max_batch_size_ = 1;
  const int max_workspace_size_ = 1 << 30;
wouldn't we want to make workspace_size configurable too?
Per the TensorRT docs, the max workspace size should be set as large as possible, and there is no need to make it configurable. Memory is actually allocated only as needed when creating an IExecutionContext.
1 << 30 doesn't strike me as meaning "as large as possible".
How did you choose that value? If it can't be guaranteed to work for all cases, then it would be safer to make it configurable.
Actually, we should also consider the reverse scenario:
on devices that don't have as much memory (e.g. Jetson Nano), one may want to limit the workspace to a small size.
It's better to make this configurable to cover both ends of the spectrum.
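A minimal sketch of what such an override could look like, assuming the builder-setup code shown below; the ORT_TENSORRT_MAX_WORKSPACE_SIZE name is hypothetical, not something the PR defines:

  #include <cstdlib>  // std::getenv, std::strtoull

  // Hypothetical: override the 1 GB default workspace size via an env variable,
  // keeping the default when the variable is unset or does not parse cleanly.
  size_t max_workspace_size = 1 << 30;
  if (const char* ws_env = std::getenv("ORT_TENSORRT_MAX_WORKSPACE_SIZE")) {
    char* end = nullptr;
    unsigned long long parsed = std::strtoull(ws_env, &end, 10);
    if (end != ws_env && *end == '\0' && parsed > 0) {
      max_workspace_size = static_cast<size_t>(parsed);
    }
  }
  trt_builder->setMaxWorkspaceSize(max_workspace_size);

This keeps the 1 << 30 default for typical machines while letting memory-constrained devices dial it down.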
   trt_builder->setMaxBatchSize(kMaxBatchSize);
   trt_builder->setMaxWorkspaceSize(kMaxWorkSpaceSize);

+  const char* batch_env = getenv("TENSORRT_MAX_BATCH_SIZE");
It would be better if the env variable started with ORT_,
to ensure we don't potentially collide with real TensorRT env variables.
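For instance, the batch-size lookup could use the prefixed name and validate the value before applying it (a sketch; ORT_TENSORRT_MAX_BATCH_SIZE is the suggested name, not what the PR currently reads):

  // Hypothetical: prefer an ORT_-prefixed variable so we never collide with
  // environment variables that TensorRT itself might read.
  int max_batch_size = kMaxBatchSize;  // compiled-in default
  if (const char* batch_env = std::getenv("ORT_TENSORRT_MAX_BATCH_SIZE")) {
    int parsed = std::atoi(batch_env);
    if (parsed > 0) {  // ignore empty, non-numeric, or non-positive values
      max_batch_size = parsed;
    }
  }
  trt_builder->setMaxBatchSize(max_batch_size);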
added some more comments.