[TensorRT EP] support TensorRT 8.5 #13867

Merged
merged 52 commits into main from chi_trt85 on Dec 14, 2022
Changes shown below are from 43 of the 52 commits.

Commits (52)
a5971f0
test TRT 8.5 GA
jywu-msft Nov 4, 2022
4d11ee8
update onnx-tensorrt submodule to 8.5-GA
jywu-msft Nov 9, 2022
fa2a58a
test builtin parser
jywu-msft Nov 14, 2022
ce021a2
try OSS parser again
jywu-msft Nov 15, 2022
b82948d
add back --gpus all
chilo-ms Nov 18, 2022
8c1bb7f
Revert to the state where build and test are running in container
chilo-ms Nov 21, 2022
0dd1129
Revert to the state where build and test are running in container (co…
chilo-ms Nov 21, 2022
1514ff9
Revert to the state where build and test are running in container (co…
chilo-ms Nov 21, 2022
42666d3
Update linux-gpu-tensorrt-ci-pipeline.yml
chilo-ms Nov 21, 2022
de0f435
skip tests for known issues
chilo-ms Nov 22, 2022
5a217de
skip tests for known issues
chilo-ms Nov 22, 2022
4f5ef22
Update TRT Windows CI ymal
chilo-ms Dec 6, 2022
0677f5e
Merge branch 'main' into chi_trt85
chilo-ms Dec 6, 2022
9288869
update CI ymals
chilo-ms Dec 6, 2022
100b934
Merge branch 'chi_trt85' of https://github.com/microsoft/onnxruntime …
chilo-ms Dec 6, 2022
331a947
use original pool
chilo-ms Dec 7, 2022
f59bd59
add placeholder flag for package pipelines
chilo-ms Dec 7, 2022
26d9c84
increase timeout for TRT EP
chilo-ms Dec 7, 2022
50d583a
revert increase timeout
chilo-ms Dec 7, 2022
7494080
add back timeout
chilo-ms Dec 7, 2022
c59a421
remove place holder since it still causes application deadlock
chilo-ms Dec 7, 2022
cade3ab
increase timeout to 10 hours
chilo-ms Dec 8, 2022
48d66ff
Merge branch 'main' into chi_trt85
chilo-ms Dec 8, 2022
10611cb
update deps.txt
chilo-ms Dec 8, 2022
948279a
remove increased time since merging the main
chilo-ms Dec 8, 2022
a21b306
fix bug
chilo-ms Dec 8, 2022
ff83678
include https://github.com/microsoft/onnxruntime/pull/13918 to fix co…
chilo-ms Dec 9, 2022
ea0c763
add comment to deps.txt
chilo-ms Dec 9, 2022
01741c1
fix bug
chilo-ms Dec 9, 2022
c32458c
increase timeout
chilo-ms Dec 9, 2022
2bcdef4
fix python format
chilo-ms Dec 9, 2022
9c199c1
format compliance
chilo-ms Dec 9, 2022
e1d6aaf
increase timeout for package pipeline
chilo-ms Dec 10, 2022
af7d169
fix bug for increasing timeout for package pipeline
chilo-ms Dec 10, 2022
35ab897
fix bug for increasing timeout for package pipeline
chilo-ms Dec 10, 2022
a1fded4
test CUDA_MODULE_LOADING=LAZY
chilo-ms Dec 10, 2022
a62f6e1
skip unnecessay and time-consuming unittests for TRT EP
chilo-ms Dec 11, 2022
d7c7ba7
only run TRT related tests
chilo-ms Dec 11, 2022
0e89046
fix format bug
chilo-ms Dec 11, 2022
63d664b
rename flag
chilo-ms Dec 12, 2022
f390402
remove timeout for TRT EP unittests
chilo-ms Dec 12, 2022
5fc64ca
remove timeout for TRT EP unittests
chilo-ms Dec 12, 2022
ed16a9b
add --skip_and_perform_filtered_tensorrt_tests to package pipeline
chilo-ms Dec 12, 2022
61fdf47
make timeout configurable
chilo-ms Dec 12, 2022
11c9d29
make timeout configurable (cont.)
chilo-ms Dec 12, 2022
c3376d9
make timeout configurable
chilo-ms Dec 12, 2022
ba4b59e
make timeout configurable
chilo-ms Dec 12, 2022
af57f18
make timeout configurable
chilo-ms Dec 12, 2022
68c185d
make timeout configurable (fix bug)
chilo-ms Dec 12, 2022
f3ccdd6
refactor
chilo-ms Dec 13, 2022
8b63162
refactor
chilo-ms Dec 13, 2022
ffce45c
fix for flake8 error
chilo-ms Dec 13, 2022
3 changes: 2 additions & 1 deletion cgmanifests/generate_cgmanifest.py
@@ -59,6 +59,7 @@ def add_github_dep(name, parsed_url):
if dep not in git_deps:
git_deps[dep] = name
else:
# TODO: support urls like: https://github.com/onnx/onnx-tensorrt/archive/refs/tags/release/7.1.zip
if len(segments) == 5:
tag = PurePosixPath(segments[4]).stem
if tag.endswith(".tar"):
@@ -72,6 +73,7 @@ def add_github_dep(name, parsed_url):
return
# Make a REST call to convert to tag to a git commit
url = "https://api.github.com/repos/%s/%s/git/refs/tags/%s" % (org_name, repo_name, tag)
print("requesting %s ..." % url)
res = requests.get(url, auth=(args.username, args.token))
response_json = res.json()
tag_object = response_json["object"]
@@ -148,7 +150,6 @@ def normalize_path_separators(path):
"submodule",
"foreach",
"--quiet",
"--recursive",
"'{}' '{}' $toplevel/$sm_path".format(
normalize_path_separators(sys.executable),
normalize_path_separators(os.path.join(SCRIPT_DIR, "print_submodule_info.py")),
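For reference, the REST call added above turns a dependency's git tag into a commit SHA. A minimal standalone sketch of that lookup, using only the public GitHub API endpoint already visible in the script (the helper name and the optional auth argument are illustrative, not part of generate_cgmanifest.py):

import requests

def resolve_tag_to_commit(org: str, repo: str, tag: str, auth=None) -> str:
    # Same endpoint as generate_cgmanifest.py: GET /repos/{org}/{repo}/git/refs/tags/{tag}
    url = "https://api.github.com/repos/%s/%s/git/refs/tags/%s" % (org, repo, tag)
    print("requesting %s ..." % url)
    res = requests.get(url, auth=auth)
    res.raise_for_status()
    obj = res.json()["object"]
    # Lightweight tags point straight at a commit; annotated tags point at a tag object
    # whose SHA would need one more lookup to reach the commit.
    return obj["sha"]

# Example (tag taken from cmake/deps.txt): resolve_tag_to_commit("pybind", "pybind11", "v2.10.1")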
6 changes: 3 additions & 3 deletions cmake/deps.txt
@@ -24,13 +24,13 @@ microsoft_wil;https://github.com/microsoft/wil/archive/5f4caba4e7a9017816e47becd
mimalloc;https://github.com/microsoft/mimalloc/archive/refs/tags/v2.0.3.zip;e4f37b93b2da78a5816c2495603a4188d316214b
mp11;https://github.com/boostorg/mp11/archive/refs/tags/boost-1.79.0.zip;c8f04e378535ededbe5af52c8f969d2dedbe73d5
onnx;https://github.com/onnx/onnx/archive/5a5f8a5935762397aa68429b5493084ff970f774.zip;edc8e1338c02f3ab222f3d803a24e17608c13895
#Branch name: 8.4-GA
onnx_tensorrt;https://github.com/onnx/onnx-tensorrt/archive/87c7a70688fd98fb355b8976f41425b40e4fe52f.zip;b97d112d9d6efa180c9b94e05268f2ff3294a534
# Use a commit that is several commits after the 8.5-GA branch (https://github.com/onnx/onnx-tensorrt/commit/369d6676423c2a6dbf4a5665c4b5010240d99d3c)
onnx_tensorrt;https://github.com/onnx/onnx-tensorrt/archive/369d6676423c2a6dbf4a5665c4b5010240d99d3c.zip;62119892edfb78689061790140c439b111491275
Review comment (Member): leave a comment indicating which branch it's from; previously there was a comment for 8.4-GA.
protobuf;https://github.com/protocolbuffers/protobuf/archive/refs/tags/v3.18.3.zip;b95bf7e9de9c2249b6c1f2ca556ace49999e90bd
psimd;https://github.com/Maratyszcza/psimd/archive/072586a71b55b7f8c584153d223e95687148a900.zip;1f5454b01f06f9656b77e4a5e2e31d7422487013
pthreadpool;https://github.com/Maratyszcza/pthreadpool/archive/1787867f6183f056420e532eec640cba25efafea.zip;e43e80781560c5ab404a4da20f34d846f5f5d101
pybind11;https://github.com/pybind/pybind11/archive/refs/tags/v2.10.1.zip;769b6aa67a77f17a770960f604b727645b6f6a13
pytorch_cpuinfo;https://github.com/pytorch/cpuinfo/archive/5916273f79a21551890fd3d56fc5375a78d1598d.zip;2be4d2ae321fada97cb39eaf4eeba5f8c85597cf
re2;https://github.com/google/re2/archive/refs/tags/2022-06-01.zip;aa77313b76e91b531ee7f3e45f004c6a502a5374
safeint;https://github.com/dcleblanc/SafeInt/archive/ff15c6ada150a5018c5ef2172401cb4529eac9c0.zip;913a4046e5274d329af2806cb53194f617d8c0ab
tensorboard;https://github.com/tensorflow/tensorboard/archive/373eb09e4c5d2b3cc2493f0949dc4be6b6a45e81.zip;67b833913605a4f3f499894ab11528a702c2b381
tensorboard;https://github.com/tensorflow/tensorboard/archive/373eb09e4c5d2b3cc2493f0949dc4be6b6a45e81.zip;67b833913605a4f3f499894ab11528a702c2b381
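For context on the review comment above: each non-comment line in cmake/deps.txt is a simple name;url;sha1 record. A minimal sketch of reading that format (the parse_deps helper below is illustrative and not part of the repository):

def parse_deps(path: str) -> dict:
    """Parse deps.txt lines of the form "name;url;sha1", skipping blanks and # comments."""
    deps = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            name, url, sha1 = line.split(";")
            deps[name] = {"url": url, "sha1": sha1}
    return deps

# Example: parse_deps("cmake/deps.txt")["onnx_tensorrt"]["sha1"]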
10 changes: 6 additions & 4 deletions cmake/onnxruntime_providers.cmake
@@ -635,10 +635,12 @@ if (onnxruntime_USE_TENSORRT)
FetchContent_Declare(
onnx_tensorrt
URL ${DEP_URL_onnx_tensorrt}
URL_HASH SHA1=${DEP_SHA1_onnx_tensorrt}
URL_HASH SHA1=${DEP_SHA1_onnx_tensorrt}
)
FetchContent_MakeAvailable(onnx_tensorrt)
include_directories(${onnx_tensorrt_SOURCE_DIR})
# The onnx_tensorrt repo contains a test program, getSupportedAPITest, which doesn't build on Windows because it uses
# unistd.h, so we must exclude it from our build; onnxruntime_fetchcontent_makeavailable serves that purpose.
onnxruntime_fetchcontent_makeavailable(onnx_tensorrt)
include_directories(${onnx_tensorrt_SOURCE_DIR})
set(CMAKE_CXX_FLAGS ${OLD_CMAKE_CXX_FLAGS})
if ( CMAKE_COMPILER_IS_GNUCC )
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-unused-parameter")
@@ -1479,7 +1481,7 @@ if (onnxruntime_USE_TVM)
# wd4100: identifier' : unreferenced formal parameter
# wd4127: conditional expression is constant
# wd4244: conversion from 'int' to 'char', possible loss of data
# TODO: 4244 should not be disabled
# TODO: 4244 should not be disabled
target_compile_options(onnxruntime_providers_tvm PRIVATE "/wd4100" "/wd4127" "/wd4244")
else()
target_compile_options(onnxruntime_providers_tvm PRIVATE "-Wno-error=type-limits")
22 changes: 15 additions & 7 deletions cmake/onnxruntime_unittests.cmake
@@ -708,13 +708,21 @@ endif()

set(test_all_args)
if (onnxruntime_USE_TENSORRT)
# TRT EP CI takes much longer time when updating to TRT 8.2
# So, we only run trt ep and exclude other eps to reduce CI test time.
#
# The test names of model tests were using sequential number in the past.
# This PR https://github.com/microsoft/onnxruntime/pull/10220 (Please see ExpandModelName function in model_tests.cc for more details)
# made test name contain the "ep" and "model path" information, so we can easily filter the tests using cuda ep or other ep with *cpu__* or *xxx__*.
list(APPEND test_all_args "--gtest_filter=-*cpu__*:*cuda__*" )
if (onnxruntime_SKIP_AND_PERFORM_FILTERED_TENSORRT_TESTS)
# TRT EP package pipelines take much longer to run tests with TRT 8.5. We can't use the placeholder to reduce testing time because it causes an application deadlock during tests.
Review discussion on this change:

A reviewer (Member): what's the impact of this? How much did the test time increase, and what test coverage do we lose?

jywu-msft (Member, Dec 12, 2022): can we make the timeout configurable and schedule a daily run which runs through all the tests?

chilo-ms (Contributor, Author, Dec 12, 2022), replying to the first question: The test time was 2.5 hours for TRT 8.4, but with TRT 8.5 it increases to more than 9 hours and still had not finished (I think it needs several more hours). With this change we won't run any unit tests other than TensorrtExecutionProviderTest, but we will still run the model tests.

chilo-ms (Contributor, Author, Dec 12, 2022), replying to the second question: Yes, we can.
# Therefore we only run filtered TRT EP tests.
list(APPEND test_all_args "--gtest_filter=*tensorrt_*:*TensorrtExecutionProviderTest*" )
#list(APPEND test_all_args "--gtest_filter=-*cpu_*:*cuda_*:*ContribOpTest*:*QuantGemmTest*:*QLinearConvTest*:*MurmurHash3OpTest*:*PadOpTest*:*QLinearConvTest*" )
else()
# TRT EP CI takes much longer time when updating to TRT 8.2
# So, we only run trt ep and exclude other eps to reduce CI test time.
#
# The test names of model tests were using sequential number in the past.
# This PR https://github.com/microsoft/onnxruntime/pull/10220 (Please see ExpandModelName function in model_tests.cc for more details)
# made test name contain the "ep" and "model path" information, so we can easily filter the tests using cuda ep or other ep with *cpu_* or *xxx_*.
list(APPEND test_all_args "--gtest_filter=-*cpu_*:*cuda_*" )
endif()

endif ()

AddTest(
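As a usage sketch (not part of the PR): the filter assembled above is passed to the onnxruntime_test_all gtest binary, so the same filtered selection can be reproduced by hand. The binary path below is assumed from a local build output directory:

import subprocess
import sys

def run_filtered_trt_tests(build_dir: str) -> int:
    # Mirrors the --gtest_filter value set when onnxruntime_SKIP_AND_PERFORM_FILTERED_TENSORRT_TESTS is ON.
    cmd = [
        f"{build_dir}/onnxruntime_test_all",
        "--gtest_filter=*tensorrt_*:*TensorrtExecutionProviderTest*",
    ]
    return subprocess.call(cmd)

if __name__ == "__main__":
    sys.exit(run_filtered_trt_tests(sys.argv[1] if len(sys.argv) > 1 else "."))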
8 changes: 4 additions & 4 deletions onnxruntime/test/contrib_ops/quantize_ops_test.cc
@@ -203,7 +203,7 @@ void TestQuantizeLinearPerTensorFloatUint8(bool use_initializer_except_x) {
255, 0,
255, 0,
255, 0});
test.Run();
test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kTensorrtExecutionProvider}); // TensorRT doesn't support UINT8 for quantization
}

TEST(QuantizeLinearContribOpTest, QuantizeLinear_per_tensor_float_uint8) {
@@ -270,7 +270,7 @@ TEST(QuantizeLinearContribOpTest, QuantizeLinear_per_tensor_half_uint8) {
255, 0,
255, 0,
255, 0});
test.Run();
test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kTensorrtExecutionProvider}); // TensorRT doesn't support UINT8 for quantization
}

TEST(QuantizeLinearContribOpTest, QuantizeLinear_per_tensor_half_int8) {
@@ -317,7 +317,7 @@ TEST(QuantizeLinearContribOpTest, QuantizeLinear_per_channel) {
{0, 2, 3, 255,
0, 1, 2, 255,
0, 0, 1, 250});
test.Run();
test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kTensorrtExecutionProvider}); // TensorRT doesn't support UINT8 for quantization
}

// quantize with broadcasting and negative axis (-2 resolves to axis 0)
@@ -335,7 +335,7 @@ TEST(QuantizeLinearContribOpTest, QuantizeLinear_per_channel_negative_axis) {
{0, 2, 3, 255,
0, 1, 2, 255,
0, 0, 1, 250});
test.Run();
test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kTensorrtExecutionProvider}); // TensorRT doesn't support UINT8 for quantization
}
} // namespace test
} // namespace onnxruntime
4 changes: 2 additions & 2 deletions onnxruntime/test/contrib_ops/tensor_op_test.cc
@@ -120,7 +120,7 @@ void MeanVarianceNormalizationAcrossChannels(bool across_channels, bool normaliz
test.AddAttribute("normalize_variance", normalize_variance ? one : zero);
test.AddInput<float>("input", {N, C, H, W}, X);
test.AddOutput<float>("output", {N, C, H, W}, result);
test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kOpenVINOExecutionProvider}); //OpenVINO doesn't support MVN operator below opset 9
test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kOpenVINOExecutionProvider, kTensorrtExecutionProvider}); //OpenVINO doesn't support MVN operator below opset 9. TensorRT doesn't support opset 8 of MVN operator.
}

void MeanVarianceNormalizationPerChannel(bool across_channels, bool normalize_variance) {
@@ -187,7 +187,7 @@ void MeanVarianceNormalizationPerChannel(bool across_channels, bool normalize_va
test.AddAttribute("normalize_variance", normalize_variance ? one : zero);
test.AddInput<float>("input", {N, C, H, W}, X);
test.AddOutput<float>("output", {N, C, H, W}, result);
test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kOpenVINOExecutionProvider}); //OpenVINO doesn't support MVN operator below opset 9
test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kOpenVINOExecutionProvider, kTensorrtExecutionProvider}); //OpenVINO doesn't support MVN operator below opset 9. TensorRT doesn't support opset 8 of MVN operator.
}

TEST(MVNContribOpTest, MeanVarianceNormalizationCPUTest_Version1_TO_8) {
3 changes: 2 additions & 1 deletion onnxruntime/test/providers/cpu/generator/random_test.cc
@@ -71,7 +71,8 @@ void RunRandomNormalLike3DFloat(bool infer_dtype = false) {

test.AddOutput<float>("Y", dims, expected_output);

test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kCudaExecutionProvider, kRocmExecutionProvider});
// TensorRT does not support manual seed overrides, so results will not match
test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kCudaExecutionProvider, kRocmExecutionProvider, kTensorrtExecutionProvider});
}

TEST(Random, RandomNormalLike3DDouble) {
4 changes: 2 additions & 2 deletions onnxruntime/test/providers/cpu/math/element_wise_ops_test.cc
@@ -2883,7 +2883,7 @@ TEST(ModOpTest, Int8_mixed_sign) {
test.AddInput<int8_t>("Y", {6}, {2, -3, 8, -2, 3, 5});
test.AddOutput<int8_t>("Z", {6}, {0, -2, 5, 0, 2, 3});

test.Run();
test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kTensorrtExecutionProvider}); // TensorRT needs quantization scales to run these in INT8, so skip for now
}

TEST(ModOpTest, Int8_mixed_sign_fmod) {
@@ -2894,7 +2894,7 @@ TEST(ModOpTest, Int8_mixed_sign_fmod) {
test.AddInput<int8_t>("Y", {6}, {2, -3, 8, -2, 3, 5});
test.AddOutput<int8_t>("Z", {6}, {0, 1, 5, 0, -1, 3});

test.Run();
test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kTensorrtExecutionProvider}); // TensorRT needs quantization scales to run these in INT8, so skip for now
}

TEST(ModOpTest, UInt8_mod) {
2 changes: 1 addition & 1 deletion onnxruntime/test/providers/cpu/nn/shrink_test.cc
@@ -97,7 +97,7 @@ const std::vector<MLFloat16> ConvertFloatToMLFloat16(const std::vector<float>& f

TEST(MathOpTest, ShrinkInt8Type) {
const auto& test_cases = GenerateSignedTestCases<int8_t>();
RunShrinkTest<int8_t>(test_cases);
RunShrinkTest<int8_t>(test_cases, {kTensorrtExecutionProvider}); // TensorRT needs quantization scales to run these in INT8, so skip for now
}

TEST(MathOpTest, ShrinkUint8Type) {
14 changes: 7 additions & 7 deletions onnxruntime/test/providers/cpu/tensor/quantize_linear_test.cc
@@ -228,7 +228,7 @@ TEST(QuantizeLinearOpTest, Uint8) {
test.AddInput<float>("y_scale", {}, {2.0f});
test.AddInput<uint8_t>("y_zero_point", {}, {128});
test.AddOutput<uint8_t>("y", dims, {128, 129, 130, 255, 1, 0});
test.Run();
test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kTensorrtExecutionProvider}); // TensorRT doesn't support UINT8 for quantization
}

// quantize with scalar zero point and scale
@@ -296,7 +296,7 @@ TEST(QuantizeLinearOpTest, 2D) {
{0, 0, 1, 250,
0, 0, 1, 250,
0, 0, 1, 250});
test.Run();
test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kTensorrtExecutionProvider}); // TensorRT doesn't support UINT8 for quantization
}

// quantize with scalar data
@@ -306,7 +306,7 @@ TEST(QuantizeLinearOpTest, Scalar) {
test.AddInput<float>("y_scale", {}, {2.0f});
test.AddInput<uint8_t>("y_zero_point", {}, {128});
test.AddOutput<uint8_t>("y", {}, {130});
test.Run();
test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kTensorrtExecutionProvider}); // TensorRT doesn't support UINT8 for quantization
}

// quantize with scalar data
@@ -315,7 +315,7 @@ TEST(QuantizeLinearOpTest, DISABLED_QuantizeLinear_Without_Zero_Point) {
test.AddInput<float>("x", {}, {3});
test.AddInput<float>("y_scale", {}, {2.0f});
test.AddOutput<uint8_t>("y", {}, {2});
test.Run();
test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kTensorrtExecutionProvider}); // TensorRT doesn't support UINT8 for quantization
}

TEST(QuantizeLinearOpTest, Per_Channel_Axis_Default) {
@@ -331,7 +331,7 @@ TEST(QuantizeLinearOpTest, Per_Channel_Axis_Default) {
{64, 101, 127, 177,
65, 100, 128, 182,
66, 102, 128, 187});
test.Run();
test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kTensorrtExecutionProvider}); // TensorRT doesn't support UINT8 for quantization
}

TEST(QuantizeLinearOpTest, Per_Channel_Axis_0) {
@@ -348,7 +348,7 @@ TEST(QuantizeLinearOpTest, Per_Channel_Axis_0) {
{0, 2, 3, 255,
0, 1, 2, 255,
0, 0, 1, 250});
test.Run();
test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kTensorrtExecutionProvider}); // TensorRT doesn't support UINT8 for quantization
}

// quantize with per-channel and negative axis (-2 resolves to axis 0)
@@ -366,7 +366,7 @@ TEST(QuantizeLinearOpTest, Per_Channel_Axis_neg) {
{0, 2, 3, 255,
0, 1, 2, 255,
0, 0, 1, 250});
test.Run();
test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kTensorrtExecutionProvider}); // TensorRT doesn't support UINT8 for quantization
}

} // namespace test
7 changes: 7 additions & 0 deletions tools/ci_build/build.py
@@ -491,6 +491,11 @@ def convert_arg_line_to_args(self, arg_line):
"--tensorrt_placeholder_builder", action="store_true", help="Instantiate Placeholder TensorRT Builder"
)
parser.add_argument("--tensorrt_home", help="Path to TensorRT installation dir")
parser.add_argument(
"--skip_and_perform_filtered_tensorrt_tests",
action="store_true",
help="Skip time-consuming and only perform filtered tests for TensorRT EP",
)
parser.add_argument("--use_migraphx", action="store_true", help="Build with MIGraphX")
parser.add_argument("--migraphx_home", help="Path to MIGraphX installation dir")
parser.add_argument("--use_full_protobuf", action="store_true", help="Use the full protobuf library")
@@ -876,6 +881,8 @@ def generate_build_tree(
"-Donnxruntime_ENABLE_MICROSOFT_INTERNAL=" + ("ON" if args.enable_msinternal else "OFF"),
"-Donnxruntime_USE_VITISAI=" + ("ON" if args.use_vitisai else "OFF"),
"-Donnxruntime_USE_TENSORRT=" + ("ON" if args.use_tensorrt else "OFF"),
"-Donnxruntime_SKIP_AND_PERFORM_FILTERED_TENSORRT_TESTS="
+ ("ON" if args.skip_and_perform_filtered_tensorrt_tests else "OFF"),
"-Donnxruntime_USE_TENSORRT_BUILTIN_PARSER=" + ("ON" if args.use_tensorrt_builtin_parser else "OFF"),
"-Donnxruntime_TENSORRT_PLACEHOLDER_BUILDER=" + ("ON" if args.tensorrt_placeholder_builder else "OFF"),
# set vars for TVM
@@ -116,7 +116,7 @@ jobs:
buildArch: x64
msbuildPlatform: x64
packageName: x64-tensorrt
buildparameter: --use_tensorrt --tensorrt_home="C:\local\TensorRT-8.4.1.5.Windows10.x86_64.cuda-11.6.cudnn8.4" --cuda_version=11.6 --cuda_home="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6" --enable_onnx_tests --enable_wcos --build_java --cmake_extra_defines "CMAKE_CUDA_ARCHITECTURES=37;52;60;61;70;75;80"
buildparameter: --use_tensorrt --skip_and_perform_filtered_tensorrt_tests --tensorrt_home="C:\local\TensorRT-8.5.1.7.Windows10.x86_64.cuda-11.8.cudnn8.6" --cuda_version=11.6 --cuda_home="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6" --enable_onnx_tests --enable_wcos --build_java --cmake_extra_defines "CMAKE_CUDA_ARCHITECTURES=37;52;60;61;70;75;80"
runTests: ${{ parameters.RunOnnxRuntimeTests }}
buildJava: true
java_artifact_id: onnxruntime_gpu
@@ -294,11 +294,11 @@ jobs:
Steps:
- script: |
tools/ci_build/get_docker_image.py \
--dockerfile tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cuda11_6_tensorrt8_4 \
--dockerfile tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cuda11_6_tensorrt8_5 \
--context tools/ci_build/github/linux/docker \
--docker-build-args "--network=host --build-arg POLICY=manylinux2014 --build-arg PLATFORM=x86_64 --build-arg DEVTOOLSET_ROOTPATH=/opt/rh/devtoolset-11/root --build-arg PREPEND_PATH=/opt/rh/devtoolset-11/root/usr/bin: --build-arg LD_LIBRARY_PATH_ARG=/opt/rh/devtoolset-11/root/usr/lib64:/opt/rh/devtoolset-11/root/usr/lib:/opt/rh/devtoolset-11/root/usr/lib64/dyninst:/opt/rh/devtoolset-11/root/usr/lib/dyninst:/usr/local/lib64 --build-arg BUILD_UID=$( id -u )" \
--container-registry onnxruntimebuildcache \
--repository onnxruntimecuda116xtrt84build
--repository onnxruntimecuda116xtrt85build
displayName: "Getonnxruntimecuda116xtrt84build image for tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cuda11_6_tensorrt8_4"
workingDirectory: $(Build.SourcesDirectory)/onnxruntime
ContainerRegistry: onnxruntimebuildcache
@@ -351,7 +351,7 @@ jobs:
inputs:
script: |
docker run --gpus all -e CC=/opt/rh/devtoolset-11/root/usr/bin/cc -e CXX=/opt/rh/devtoolset-11/root/usr/bin/c++ -e CFLAGS="-Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -fstack-clash-protection -fcf-protection -O3 -Wl,--strip-all" -e CXXFLAGS="-Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -fstack-clash-protection -fcf-protection -O3 -Wl,--strip-all" -e NVIDIA_VISIBLE_DEVICES=all --rm --volume $(Build.SourcesDirectory):/src_dir \
--volume $(Build.ArtifactStagingDirectory):/artifact_src -e NIGHTLY_BUILD onnxruntimecuda116xtrt84build \
--volume $(Build.ArtifactStagingDirectory):/artifact_src -e NIGHTLY_BUILD onnxruntimecuda116xtrt85build \
/src_dir/onnxruntime-inference-examples/c_cxx/squeezenet/run_capi_application.sh -o /src_dir/onnxruntime -p /artifact_src/onnxruntime-linux-x64-gpu-$(OnnxRuntimeVersion).tgz -w /src_dir/onnxruntime-inference-examples/c_cxx/squeezenet
workingDirectory: '$(Build.ArtifactStagingDirectory)'
