Fixes in request cancellation doc (#6409)
tanmayv25 authored Oct 11, 2023
1 parent 9c707e3 commit 85487a1
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions docs/user_guide/request_cancellation.md
@@ -28,7 +28,7 @@

# Request Cancellation

-Starting from 23.10, Triton supports handling request cancellation received
+Starting from r23.10, Triton supports handling request cancellation received
from the gRPC client or a C API user. Long running inference requests such
as for auto generative large language models may run for an indeterminate
amount of time or indeterminate number of steps. Additionally clients may
@@ -39,7 +39,7 @@ resources.

## Issuing Request Cancellation

-### Triton C API
+### In-Process C API

[In-Process Triton Server C API](../customization_guide/inference_protocols.md#in-process-triton-server-api) has been enhanced with `TRITONSERVER_InferenceRequestCancel`
and `TRITONSERVER_InferenceRequestIsCancelled` to issue cancellation and query
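For readers following the change, here is a minimal sketch of how an in-process C API user might call these entry points. It assumes a `TRITONSERVER_InferenceRequest*` handle obtained through the usual request-creation calls; the helper name `cancel_and_check` and the error handling are illustrative only, not part of the documented change.

```c
// Illustrative sketch: cancel an in-flight request and poll its status.
// Assumes `request` is a TRITONSERVER_InferenceRequest* that has already
// been created and (optionally) submitted for execution.
#include <stdbool.h>
#include <stdio.h>

#include "triton/core/tritonserver.h"

static void
cancel_and_check(TRITONSERVER_InferenceRequest* request)
{
  // Ask Triton to cancel the request at the next opportunity.
  TRITONSERVER_Error* err = TRITONSERVER_InferenceRequestCancel(request);
  if (err != NULL) {
    fprintf(stderr, "cancel failed: %s\n", TRITONSERVER_ErrorMessage(err));
    TRITONSERVER_ErrorDelete(err);
    return;
  }

  // Query whether the cancellation has been recorded on the request.
  bool is_cancelled = false;
  err = TRITONSERVER_InferenceRequestIsCancelled(request, &is_cancelled);
  if (err != NULL) {
    fprintf(stderr, "query failed: %s\n", TRITONSERVER_ErrorMessage(err));
    TRITONSERVER_ErrorDelete(err);
    return;
  }
  printf("request cancelled: %s\n", is_cancelled ? "yes" : "no");
}
```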
@@ -77,9 +77,9 @@ detection and handling within Triton core is work in progress.

## Handling in Backend

-Upon receiving request cancellation, triton does its best to terminate request
+Upon receiving request cancellation, Triton does its best to terminate request
at various points. However, once a request has been given to the backend
-for execution, it is upto the individual backends to detect and handle
+for execution, it is up to the individual backends to detect and handle
request termination.
Currently, the following backends support early termination:
- [vLLM backend](https://github.com/triton-inference-server/vllm_backend)
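As an illustration of backend-side handling (not part of this diff), a backend's execution loop could poll the cancellation flag between generation steps. The sketch below assumes the `TRITONBACKEND_RequestIsCancelled` entry point from `tritonbackend.h`; verify the exact name and signature against the Triton release you build against.

```c
// Illustrative sketch: check whether a request handed to the backend has
// been cancelled, so the backend can stop work early.
// Assumes TRITONBACKEND_RequestIsCancelled is available in this release.
#include <stdbool.h>

#include "triton/core/tritonbackend.h"

static bool
should_stop(TRITONBACKEND_Request* request)
{
  bool is_cancelled = false;
  TRITONSERVER_Error* err =
      TRITONBACKEND_RequestIsCancelled(request, &is_cancelled);
  if (err != NULL) {
    // On error, keep running rather than silently dropping the request.
    TRITONSERVER_ErrorDelete(err);
    return false;
  }
  return is_cancelled;
}

// Inside the backend's per-step loop, the check might look like:
//   while (!done) {
//     if (should_stop(request)) {
//       /* stop generation and send a final, cancelled response */
//       break;
//     }
//     /* ... run the next generation step ... */
//   }
```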
