
Standardize KServe multi-model management SPI and add built-in support #159

Closed
njhill opened this issue May 12, 2022 · 2 comments

njhill commented May 12, 2022

For dynamic loading/unloading of models, Triton defines a "Model Repository" API which is described as an extension to the KServe v2 dataplane API.

This includes both REST and gRPC variants of the following API endpoints:

```
POST v2/repository/index
POST v2/repository/models/${MODEL_NAME}/load
POST v2/repository/models/${MODEL_NAME}/unload
```
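For example, the REST variants could be exercised along these lines; this is only a minimal sketch, with a placeholder server address and model name:

```python
import requests

BASE_URL = "http://localhost:8000"  # placeholder server address
MODEL_NAME = "my-model"             # placeholder model name

# Load (or reload) a model from the repository into the serving runtime.
requests.post(f"{BASE_URL}/v2/repository/models/{MODEL_NAME}/load").raise_for_status()

# Unload the model when it is no longer needed.
requests.post(f"{BASE_URL}/v2/repository/models/{MODEL_NAME}/unload").raise_for_status()

# List the repository contents (index); not needed for ModelMesh's usage.
print(requests.post(f"{BASE_URL}/v2/repository/index").json())
```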

MLServer followed this and has implemented the same API, but unfortunately its gRPC service definition uses different service and package names:

  • MLServer defines a separate service named inference.model_repository.ModelRepositoryService
  • Triton just includes them as additional methods in the same inference.GRPCInferenceService data-plane service
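
To make the difference concrete, the fully-qualified gRPC method path a client has to use differs between the two. The method and message names below follow Triton's grpc_service.proto and the address is a placeholder, so treat this as an illustrative sketch rather than the exact definitions:

```python
import grpc

channel = grpc.insecure_channel("localhost:8001")  # placeholder address

# Triton: repository methods are additional RPCs on the data-plane service itself.
TRITON_LOAD = "/inference.GRPCInferenceService/RepositoryModelLoad"

# MLServer (prior to standardizing): a separate service in its own package.
MLSERVER_LOAD = "/inference.model_repository.ModelRepositoryService/RepositoryModelLoad"

# An adapter therefore has to select the right path per runtime, e.g. with grpc's
# generic unary-unary invocation and the generated request/response types
# (types shown commented out because they come from generated stubs):
#   load = channel.unary_unary(
#       TRITON_LOAD,
#       request_serializer=RepositoryModelLoadRequest.SerializeToString,
#       response_deserializer=RepositoryModelLoadResponse.FromString,
#   )
#   load(RepositoryModelLoadRequest(model_name="my-model"))
```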

ModelMesh uses these endpoints in its built-in support for Triton/MLServer to manage the models in each runtime instance, but currently the logic is mostly specific to each runtime because of the differing service names and different filesystem layout requirements. Note that only the load/unload methods are used; index isn't required.

This appears to be at least a de facto standard KServe API for model management, so it would make sense to support it as an option for other/custom model server implementations via our built-in adapter, as an alternative to implementing the native model-mesh gRPC model runtime SPI.

First, though, we should decide on the official/standard package and service name to use for the gRPC service, and copy its specification into the KServe repo somewhere.

njhill commented Jun 1, 2022

Looks like MLServer has now standardized on the Triton package/service names: SeldonIO/MLServer#616 🎉

kserve-oss-bot pushed a commit to kserve/modelmesh-runtime-adapter that referenced this issue May 26, 2023
#### Motivation

Related to [updating the MLServer runtime image](kserve/modelmesh-serving#355), the `ModelRepository` endpoint was deprecated. 

References:
kserve/modelmesh-serving#159
SeldonIO/MLServer#616

#### Modifications
- Updated protobuf 
- Updated mock server testing
- Updated runtime-adapter code to call new endpoint

#### Result
- MLServer runtime adapter no longer uses the deprecated model repository API

Signed-off-by: Rafael Vasquez <[email protected]>

rafvasq commented Jan 19, 2024

Closed by kserve/modelmesh-runtime-adapter#45.

rafvasq closed this as completed Jan 19, 2024