docs: Re-structure User Guides for Discoverability (#7807)
Co-authored-by: Meenakshi Sharma <[email protected]> Co-authored-by: Kyle McGill <[email protected]>
1 parent
838966a
commit 0194c3d
Showing 44 changed files with 6,114 additions and 976 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
######## | ||
vLLM | ||
######## | ||
|
||
.. toctree:: | ||
:hidden: | ||
:caption: vLLM | ||
:maxdepth: 2 | ||
|
||
../vllm_backend/README | ||
Multi-LoRA <../vllm_backend/docs/llama_multi_lora_tutorial> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
############# | ||
API Reference | ||
############# | ||
|
||
.. toctree:: | ||
:maxdepth: 1 | ||
:hidden: | ||
|
||
OpenAI API <openai_readme.md> | ||
kserve |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
############################ | ||
In-Process Triton Server API | ||
############################ | ||
|
||
|
||
The Triton Inference Server provides a backwards-compatible C API, with Python and Java bindings, that | ||
allows Triton to be linked directly into a C/C++, Java, or Python application. This API | ||
is called the "Triton Server API" or just "Server API" for short. The | ||
API is implemented in the Triton shared library which is built from | ||
source contained in the `core | ||
repository <https://github.com/triton-inference-server/core>`__. On Linux | ||
this library is libtritonserver.so and on Windows it is | ||
tritonserver.dll. In the Triton Docker image the shared library is | ||
found in /opt/tritonserver/lib. The header file that defines and | ||
documents the Server API is | ||
`tritonserver.h <https://github.com/triton-inference-server/core/blob/main/include/triton/core/tritonserver.h>`__. | ||
`Java bindings for In-Process Triton Server API <../customization_guide/inprocess_java_api.html#java-bindings-for-in-process-triton-server-api>`__ | ||
are built on top of ``tritonserver.h`` and can be used by Java applications that | ||
need to run Triton in-process. | ||
|
||
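As a minimal sketch of the library locations described above, the following Python snippet picks the shared-library file name per platform and, only if the file is actually present, loads it with ``ctypes`` and calls ``TRITONSERVER_ApiVersion`` from ``tritonserver.h``. The helper names (``triton_library_name``, ``triton_library_path``, ``query_api_version``) are hypothetical and not part of Triton. Treating every non-Windows platform like Linux is an assumption made for illustration.

```python
import ctypes
import os
import platform
import posixpath

# Directory where the Triton Docker image keeps the shared library (from the text above).
DOCKER_LIB_DIR = "/opt/tritonserver/lib"


def triton_library_name(system=None):
    """Return the Triton shared-library file name for an OS name such as 'Linux'."""
    system = system or platform.system()
    if system == "Windows":
        return "tritonserver.dll"
    # Assumption: any non-Windows platform is handled like Linux in this sketch.
    return "libtritonserver.so"


def triton_library_path(lib_dir=DOCKER_LIB_DIR, system=None):
    """Join the library directory with the platform-specific file name."""
    return posixpath.join(lib_dir, triton_library_name(system))


def query_api_version(path):
    """Return (major, minor) of the Server API, or None if the library file is absent."""
    if not os.path.exists(path):
        return None
    lib = ctypes.CDLL(path)
    major = ctypes.c_uint32()
    minor = ctypes.c_uint32()
    # The function returns a TRITONSERVER_Error*; a null pointer (None here) means success.
    lib.TRITONSERVER_ApiVersion.restype = ctypes.c_void_p
    lib.TRITONSERVER_ApiVersion.argtypes = [
        ctypes.POINTER(ctypes.c_uint32),
        ctypes.POINTER(ctypes.c_uint32),
    ]
    err = lib.TRITONSERVER_ApiVersion(ctypes.byref(major), ctypes.byref(minor))
    if err is not None:
        raise RuntimeError("TRITONSERVER_ApiVersion reported an error")
    return major.value, minor.value
```

Calling ``query_api_version(triton_library_path())`` inside the Triton Docker image would be one way to confirm that the library loads. Outside the image the function simply returns ``None``.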
All capabilities of Triton server are encapsulated in the shared | ||
library and are exposed via the Server API. The ``tritonserver`` | ||
executable implements HTTP/REST and GRPC endpoints and uses the Server | ||
API to communicate with core Triton logic. The primary source files | ||
for the endpoints are `grpc_server.cc <https://github.com/triton-inference-server/server/blob/main/src/grpc/grpc_server.cc>`__ and | ||
`http_server.cc <https://github.com/triton-inference-server/server/blob/main/src/http_server.cc>`__. In these source files you can | ||
see the Server API being used. | ||
|
||
You can use the Server API in your own application as well. A simple | ||
example using the Server API can be found in | ||
`simple.cc <https://github.com/triton-inference-server/server/blob/main/src/simple.cc>`__. | ||
|
||
.. toctree:: | ||
:maxdepth: 1 | ||
:hidden: | ||
|
||
C/C++ <../customization_guide/inprocess_c_api.md> | ||
python | ||
Java <../customization_guide/inprocess_java_api.md> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
########## | ||
KServe API | ||
########## | ||
|
||
|
||
Triton uses the | ||
`KServe community standard inference protocols <https://github.com/kserve/kserve/tree/master/docs/predict-api/v2>`__ | ||
to define HTTP/REST and GRPC APIs plus several extensions. | ||
|
||
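To make the HTTP/REST side of the protocol concrete, here is a minimal sketch of how an inference request could be assembled following the KServe v2 layout (``POST /v2/models/<model>/infer`` with an ``inputs`` list). The helper ``build_infer_request``, the model name, the tensor name, and the ``localhost:8000`` base URL are all assumptions chosen for illustration. The snippet only builds the URL and JSON body and does not contact a server.

```python
import json


def build_infer_request(model_name, input_name, datatype, shape, data,
                        base_url="http://localhost:8000"):
    """Build the URL and JSON body for a KServe v2 HTTP inference request.

    base_url is an assumed address; adjust it to wherever the server listens.
    """
    url = f"{base_url}/v2/models/{model_name}/infer"
    body = {
        "inputs": [
            {
                "name": input_name,      # tensor name expected by the model
                "shape": list(shape),    # tensor dimensions
                "datatype": datatype,    # for example "FP32" or "INT64"
                "data": list(data),      # flattened tensor contents
            }
        ]
    }
    return url, json.dumps(body)


# Hypothetical model and tensor names, used only to show the request shape.
url, payload = build_infer_request("my_model", "INPUT0", "FP32", [1, 2], [0.5, 1.5])
```

The resulting ``url`` and ``payload`` can be passed to any HTTP client as a POST request with a JSON content type.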
.. toctree:: | ||
:maxdepth: 1 | ||
:hidden: | ||
|
||
HTTP/REST and GRPC Protocol <../customization_guide/inference_protocols.md> | ||
kserve_extension |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
########## | ||
Extensions | ||
########## | ||
|
||
To fully enable all of its capabilities, | ||
Triton also implements `HTTP/REST and GRPC | ||
extensions <https://github.com/triton-inference-server/server/tree/main/docs/protocol>`__ | ||
to the KServe inference protocol. | ||
|
||
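In the KServe v2 protocol, the server metadata response carries an ``extensions`` list, so a client can check which extensions a server advertises before relying on them. The sketch below assumes such a metadata dictionary has already been fetched and decoded. The helper ``missing_extensions`` and the sample metadata values are hypothetical.

```python
def missing_extensions(server_metadata, required):
    """Return the required extension names that the server does not advertise."""
    advertised = set(server_metadata.get("extensions", []))
    return sorted(name for name in required if name not in advertised)


# Hypothetical, hand-written metadata; a real response comes from the server.
sample_metadata = {
    "name": "triton",
    "version": "0.0.0",
    "extensions": ["classification", "sequence", "binary_tensor_data"],
}

missing = missing_extensions(sample_metadata, ["sequence", "statistics"])
```

With this sample metadata, ``missing`` contains only ``"statistics"``, so a client could fall back or fail early before using that extension.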
.. toctree:: | ||
:maxdepth: 1 | ||
:hidden: | ||
|
||
Binary tensor data extension <../protocol/extension_binary_data.md> | ||
Classification extension <../protocol/extension_classification.md> | ||
Schedule policy extension <../protocol/extension_schedule_policy.md> | ||
Sequence extension <../protocol/extension_sequence.md> | ||
Shared-memory extension <../protocol/extension_shared_memory.md> | ||
Model configuration extension <../protocol/extension_model_configuration.md> | ||
Model repository extension <../protocol/extension_model_repository.md> | ||
Statistics extension <../protocol/extension_statistics.md> | ||
Trace extension <../protocol/extension_trace.md> | ||
Logging extension <../protocol/extension_logging.md> | ||
Parameters extension <../protocol/extension_parameters.md> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
../../python/openai/README.md |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
###### | ||
Python | ||
###### | ||
|
||
.. include:: python_readme.rst | ||
|
||
.. toctree:: | ||
:maxdepth: 1 | ||
:hidden: | ||
|
||
Kafka I/O <../tutorials/Triton_Inference_Server_Python_API/examples/kafka-io/README.md> | ||
Rayserve <../tutorials/Triton_Inference_Server_Python_API/examples/rayserve/README.md> |