feat: Support for request id field in generate API #7392

Merged 9 commits on Jul 10, 2024
10 changes: 8 additions & 2 deletions docs/protocol/extension_generate.md
@@ -87,10 +87,12 @@ return an error.

$generate_request =
{
+ "id" : $string, #optional
"text_input" : $string,
"parameters" : $parameters #optional
}

+ * "id" : An identifier for this request. Optional, but if specified this identifier must be returned in the response.
* "text_input" : The text input that the model should generate output from.
* "parameters" : An optional object containing zero or more parameters for this
generate request expressed as key/value pairs. See
@@ -121,14 +123,15 @@ specification to set the parameters.
Below is an example of sending a generate request with the additional model parameters `stream` and `temperature`.

```
- $ curl -X POST localhost:8000/v2/models/mymodel/generate -d '{"text_input": "client input", "parameters": {"stream": false, "temperature": 0}}'
+ $ curl -X POST localhost:8000/v2/models/mymodel/generate -d '{"id": "42", "text_input": "client input", "parameters": {"stream": false, "temperature": 0}}'

POST /v2/models/mymodel/generate HTTP/1.1
Host: localhost:8000
Content-Type: application/json
Content-Length: <xx>
{
- "text_input": "client input",
+ "id" : "42",
+ "text_input" : "client input",
"parameters" :
{
"stream": false,
@@ -145,11 +148,13 @@ the HTTP body.

$generate_response =
{
+ "id" : $string,
"model_name" : $string,
"model_version" : $string,
"text_output" : $string
}

* "id" : The "id" identifier given in the request, if any.
* "model_name" : The name of the model used for inference.
* "model_version" : The specific model version used for inference.
* "text_output" : The output of the inference.
@@ -159,6 +164,7 @@ the HTTP body.
```
200
{
+ "id" : "42",
"model_name" : "mymodel",
"model_version" : "1",
"text_output" : "model output"
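For reviewers who want to exercise the documented contract from a client, here is a minimal Python sketch of building a generate request with the optional "id" field and checking that a response echoes it. The helper names `build_generate_request` and `response_matches_request` are hypothetical and not part of Triton or its client libraries; the example response is the one from the documentation above, not captured server output.

```python
import json
from typing import Any, Dict, Optional


def build_generate_request(
    text_input: str,
    request_id: Optional[str] = None,
    parameters: Optional[Dict[str, Any]] = None,
) -> str:
    """Serialize a generate request body; "id" and "parameters" are optional."""
    body: Dict[str, Any] = {}
    if request_id is not None:
        body["id"] = request_id
    body["text_input"] = text_input
    if parameters is not None:
        body["parameters"] = parameters
    return json.dumps(body)


def response_matches_request(request_body: str, response_body: str) -> bool:
    """Return True if the response echoes the request's "id", when one was sent."""
    sent_id = json.loads(request_body).get("id")
    if sent_id is None:
        return True  # no id was sent, so there is nothing to correlate
    return json.loads(response_body).get("id") == sent_id


request_body = build_generate_request(
    "client input",
    request_id="42",
    parameters={"stream": False, "temperature": 0},
)

# Response body copied from the documented example, with the "id" echoed back.
example_response = json.dumps(
    {
        "id": "42",
        "model_name": "mymodel",
        "model_version": "1",
        "text_output": "model output",
    }
)
```

The serialized `request_body` can be passed to `curl -d` or any HTTP client as the POST body for `/v2/models/mymodel/generate`.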
1 change: 1 addition & 0 deletions src/http_server.cc
@@ -3327,6 +3327,7 @@ HTTPAPIServer::HandleGenerate(
// thus the string must live as long as the JSON message).
triton::common::TritonJson::Value request;
RETURN_AND_CALLBACK_IF_ERR(EVRequestToJson(req, &request), error_callback);
+ RETURN_AND_CALLBACK_IF_ERR(ParseJsonTritonRequestID(request, irequest), error_callback);

RETURN_AND_CALLBACK_IF_ERR(
generate_request->ConvertGenerateRequest(
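The ordering in this hunk (parse the request JSON, read the request id, then convert the generate inputs) can be modeled with a small Python sketch. This is an illustration only, not the C++ implementation: `parse_request_id` is a hypothetical stand-in for `ParseJsonTritonRequestID`, and both rejecting a non-string "id" and omitting "id" from the response when none was sent are assumptions, since the diff does not show that behavior.

```python
import json
from typing import Any, Dict


def parse_request_id(request: Dict[str, Any]) -> str:
    """Hypothetical stand-in for ParseJsonTritonRequestID: read the optional "id"."""
    request_id = request.get("id", "")
    if not isinstance(request_id, str):
        # Assumed error behavior; the real handler's behavior is not shown in the diff.
        raise ValueError('"id" must be a string')
    return request_id


def handle_generate(raw_body: str) -> Dict[str, Any]:
    """Toy handler: read the id before converting inputs, then echo it back."""
    request = json.loads(raw_body)
    request_id = parse_request_id(request)  # mirrors the position of the added line
    text_input = request["text_input"]

    # Placeholder "inference" so the sketch is self-contained.
    response: Dict[str, Any] = {
        "model_name": "mymodel",
        "model_version": "1",
        "text_output": "echo: " + text_input,
    }
    if request_id:
        response = {"id": request_id, **response}
    return response
```

Calling `handle_generate('{"id": "42", "text_input": "client input"}')` returns a response dictionary whose "id" is "42", matching the documented request/response pairing.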