Enhance OTEL testing to capture and verify Cancellation Requests and Non-Decoupled model inference. #7132
Merged
18 commits
bb13bb8
Cancel OTEL TC
indrajit96 96385fc
Update TC number
indrajit96 a2932be
Merge remote-tracking branch 'origin/main' into indrajit_otel_testing
indrajit96 c935822
OTEL Cancel TC added
indrajit96 3c71a5d
Minor updates before PR
indrajit96 4a724e0
Merge remote-tracking branch 'origin/main' into indrajit_otel_testing
indrajit96 3ae9c25
Remove debugs
indrajit96 ec182a6
Review comments fixed
indrajit96 f4305f3
Revert "Review comments fixed"
indrajit96 fa2d331
Review comments fixed
indrajit96 b58388f
Review Comments fixed
indrajit96 4f102b0
Pre-Commit typo
indrajit96 1b7a9d5
Merge branch 'main' into indrajit_otel_testing
indrajit96 e464e3b
Add new test for queued cancellation
indrajit96 63001f1
Merge branch 'indrajit_otel_testing' of github.com:triton-inference-s…
indrajit96 2c884b5
Added test fo cancelling queued request
indrajit96 5046571
More Comments
indrajit96 ec175c0
Merge branch 'main' into indrajit_otel_testing
indrajit96
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
# Copyright 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
# | ||
# Redistribution and use in source and binary forms, with or without | ||
# modification, are permitted provided that the following conditions | ||
# are met: | ||
# * Redistributions of source code must retain the above copyright | ||
# notice, this list of conditions and the following disclaimer. | ||
# * Redistributions in binary form must reproduce the above copyright | ||
# notice, this list of conditions and the following disclaimer in the | ||
# documentation and/or other materials provided with the distribution. | ||
# * Neither the name of NVIDIA CORPORATION nor the names of its | ||
# contributors may be used to endorse or promote products derived | ||
# from this software without specific prior written permission. | ||
# | ||
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY | ||
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE | ||
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR | ||
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR | ||
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, | ||
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, | ||
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR | ||
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY | ||
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE | ||
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
|
||
import json | ||
import time | ||
|
||
import numpy as np | ||
import triton_python_backend_utils as pb_utils | ||
|
||
|
||
class TritonPythonModel: | ||
    def initialize(self, args): | ||
        # Parse the model configuration passed in by the server. | ||
        self.model_config = json.loads(args["model_config"]) | ||
|
||
    def execute(self, requests): | ||
        """This function is called on inference request.""" | ||
        # Sleep for less than the collector timeout, which is 10 | ||
        time.sleep(2) | ||
        responses = [] | ||
        for _ in requests: | ||
            # Return a constant single-element FP32 tensor as OUTPUT0 | ||
            out_0 = np.array([1], dtype=np.float32) | ||
            out_tensor_0 = pb_utils.Tensor("OUTPUT0", out_0) | ||
            responses.append(pb_utils.InferenceResponse([out_tensor_0])) | ||
|
||
        return responses |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
# Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
# | ||
# Redistribution and use in source and binary forms, with or without | ||
# modification, are permitted provided that the following conditions | ||
# are met: | ||
# * Redistributions of source code must retain the above copyright | ||
# notice, this list of conditions and the following disclaimer. | ||
# * Redistributions in binary form must reproduce the above copyright | ||
# notice, this list of conditions and the following disclaimer in the | ||
# documentation and/or other materials provided with the distribution. | ||
# * Neither the name of NVIDIA CORPORATION nor the names of its | ||
# contributors may be used to endorse or promote products derived | ||
# from this software without specific prior written permission. | ||
# | ||
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY | ||
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE | ||
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR | ||
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR | ||
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, | ||
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, | ||
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR | ||
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY | ||
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE | ||
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
|
||
name: "input_all_required" | ||
backend: "python" | ||
input [ | ||
  { | ||
    name: "INPUT0" | ||
    data_type: TYPE_FP32 | ||
    dims: [ -1 ] | ||
  }, | ||
  { | ||
    name: "INPUT1" | ||
    data_type: TYPE_FP32 | ||
    dims: [ -1 ] | ||
  }, | ||
  { | ||
    name: "INPUT2" | ||
    data_type: TYPE_FP32 | ||
    dims: [ -1 ] | ||
  } | ||
] | ||
|
||
output [ | ||
  { | ||
    name: "OUTPUT0" | ||
    data_type: TYPE_FP32 | ||
    dims: [ 1 ] | ||
  } | ||
] | ||
|
||
instance_group [{ kind: KIND_CPU }] |
Could you please clarify what the delay in the model is? And if it is model-related, why are we using the COLLECTOR_TIMEOUT variable?

We want to capture the traces of a cancellation request while the inference is in the COMPUTE stage.
The model "input_all_required" has a delay in its compute phase, so the cancellation request can be sent while the request is still waiting there.
The idea here is to wait before we try to read the traces from the file.
If we do not wait, the file is empty and contains no traces.
I updated the comments in the test to reflect this better.
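
The wait described above could also be written as a bounded poll rather than a fixed sleep. The following is only a minimal sketch, not the code from this PR: the helper name `wait_for_traces`, its parameters, and the assumption that the collector writes one JSON object per line are all hypothetical, since the actual trace file format is not shown here.

```python
import json
import os
import time


def wait_for_traces(path, timeout_s=10.0, poll_interval_s=0.1):
    """Poll `path` until it holds at least one complete JSON line.

    Hypothetical helper: assumes one JSON object per line, which may not
    match the real collector output. Returns the parsed records, or an
    empty list if `timeout_s` elapses first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        records = []
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                for line in f:
                    line = line.strip()
                    if not line:
                        continue
                    try:
                        records.append(json.loads(line))
                    except json.JSONDecodeError:
                        # A line is still being written; discard this
                        # pass and retry on the next poll.
                        records = []
                        break
        if records:
            return records
        if time.monotonic() >= deadline:
            return []
        time.sleep(poll_interval_s)
```

A test could call `wait_for_traces(trace_file, timeout_s=10.0)` after sending the cancellation and assert on the returned records. An empty list then means no traces arrived within the timeout, which is the "file is empty" case mentioned above.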