
Standardize KServe multi-model management SPI and add built-in support #159

Closed
njhill opened this issue May 12, 2022 · 2 comments

njhill commented May 12, 2022

For dynamic loading/unloading of models, Triton defines a "Model Repository" API which is described as an extension to the KServe v2 dataplane API.

This includes both REST and gRPC variants of the following API endpoints:

```
POST v2/repository/index
POST v2/repository/models/${MODEL_NAME}/load
POST v2/repository/models/${MODEL_NAME}/unload
```
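For example, the REST variants could be exercised along these lines; this is only a minimal sketch, with a placeholder server address and model name:

```python
import requests

BASE_URL = "http://localhost:8000"  # placeholder server address
MODEL_NAME = "my-model"             # placeholder model name

# Load (or reload) a model from the repository into the serving runtime.
requests.post(f"{BASE_URL}/v2/repository/models/{MODEL_NAME}/load").raise_for_status()

# Unload the model when it is no longer needed.
requests.post(f"{BASE_URL}/v2/repository/models/{MODEL_NAME}/unload").raise_for_status()

# List the repository contents (index); not needed for ModelMesh's usage.
print(requests.post(f"{BASE_URL}/v2/repository/index").json())
```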

MLServer followed this and has implemented the same API, but unfortunately its gRPC service definition uses different service and package names:

  • MLServer defines a separate service named inference.model_repository.ModelRepositoryService
  • Triton just includes them as additional methods in the same inference.GRPCInferenceService data-plane service
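
To make the difference concrete, the fully-qualified gRPC method path a client has to use differs between the two. The method and message names below follow Triton's grpc_service.proto and the address is a placeholder, so treat this as an illustrative sketch rather than the exact definitions:

```python
import grpc

channel = grpc.insecure_channel("localhost:8001")  # placeholder address

# Triton: repository methods are additional RPCs on the data-plane service itself.
TRITON_LOAD = "/inference.GRPCInferenceService/RepositoryModelLoad"

# MLServer (prior to standardizing): a separate service in its own package.
MLSERVER_LOAD = "/inference.model_repository.ModelRepositoryService/RepositoryModelLoad"

# An adapter therefore has to select the right path per runtime, e.g. with grpc's
# generic unary-unary invocation and the generated request/response types
# (types shown commented out because they come from generated stubs):
#   load = channel.unary_unary(
#       TRITON_LOAD,
#       request_serializer=RepositoryModelLoadRequest.SerializeToString,
#       response_deserializer=RepositoryModelLoadResponse.FromString,
#   )
#   load(RepositoryModelLoadRequest(model_name="my-model"))
```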

ModelMesh uses these endpoints in its built-in support for Triton/MLServer to manage the models in each runtime instance, but currently the logic is mostly specific to each runtime because of the differing service names and different filesystem layout requirements. Note that only the load/unload methods are used; index isn't required.

This appears to be at least a de facto standard KServe API for model management, so it would make sense to support it as an option for other/custom model server implementations via our built-in adapter, as an alternative to implementing the native model-mesh gRPC model runtime SPI.

First, though, we should decide on the official/standard package and service name to use for the gRPC service, and copy its specification into the KServe repo somewhere.

njhill commented Jun 1, 2022

Looks like MLServer has now standardized on the Triton package/service names: SeldonIO/MLServer#616 🎉

kserve-oss-bot pushed a commit to kserve/modelmesh-runtime-adapter that referenced this issue May 26, 2023
#### Motivation

Related to [updating the MLServer runtime image](kserve/modelmesh-serving#355), the `ModelRepository` endpoint was deprecated. 

References:
kserve/modelmesh-serving#159
SeldonIO/MLServer#616

#### Modifications
- Updated protobuf 
- Updated mock server testing
- Updated runtime-adapter code to call new endpoint

#### Result
- MLServer runtime adapter no longer uses the deprecated model repository API

Signed-off-by: Rafael Vasquez <[email protected]>

rafvasq commented Jan 19, 2024

Closed by kserve/modelmesh-runtime-adapter#45.

rafvasq closed this as completed Jan 19, 2024