For dynamic loading/unloading of models, Triton defines a "Model Repository" API, which is described as an extension to the KServe v2 dataplane API. This includes both REST and gRPC variants of the following API endpoints:
- `POST v2/repository/index`
- `POST v2/repository/models/${MODEL_NAME}/load`
- `POST v2/repository/models/${MODEL_NAME}/unload`
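For illustration, here is a minimal sketch of invoking the REST variants of these endpoints from Go; the base URL and model name are placeholders rather than values taken from any particular deployment.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// Minimal sketch: issue the Model Repository REST calls against a server
// exposing the KServe v2 API. baseURL and modelName are placeholders.
func main() {
	baseURL := "http://localhost:8000" // assumed local server address
	modelName := "example-model"       // hypothetical model name

	// List the repository index.
	post(baseURL + "/v2/repository/index")

	// Load, then unload, a single model by name.
	post(fmt.Sprintf("%s/v2/repository/models/%s/load", baseURL, modelName))
	post(fmt.Sprintf("%s/v2/repository/models/%s/unload", baseURL, modelName))
}

func post(url string) {
	resp, err := http.Post(url, "application/json", bytes.NewBufferString("{}"))
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println(url, "->", resp.Status)
}
```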
MLServer followed this and implemented the same API, but unfortunately its gRPC service definition uses different service and package names:
- MLServer defines a separate service named `inference.model_repository.ModelRepositoryService`
- Triton just includes them as additional methods in the same `inference.GRPCInferenceService` data-plane service
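To make the divergence concrete, the sketch below (not actual adapter code) shows how the fully-qualified gRPC method for the same logical load call differs between the two runtimes. The RPC name `RepositoryModelLoad` follows Triton's published extension; that MLServer mirrors the same RPC names under its separate service is an assumption for illustration.

```go
package repository

// Fully-qualified gRPC method paths for the same logical "load model" call.
// The RPC name follows Triton's model repository extension; the MLServer path
// assumes the same RPC names under its separate service (an assumption, not
// verified against a specific MLServer version).
const (
	// Triton: repository methods live on the main data-plane service.
	tritonLoadMethod = "/inference.GRPCInferenceService/RepositoryModelLoad"

	// MLServer: a separate service in a separate package.
	mlserverLoadMethod = "/inference.model_repository.ModelRepositoryService/RepositoryModelLoad"
)

// loadMethodFor picks the method path based on which runtime is being fronted,
// which is why the current adapter logic ends up runtime-specific.
func loadMethodFor(runtime string) string {
	if runtime == "mlserver" {
		return mlserverLoadMethod
	}
	return tritonLoadMethod
}
```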
ModelMesh uses these in its built-in support for Triton/MLServer to manage models in each Triton/MLServer instance, but currently the logic is mostly specific to each server because of the differing service names and different filesystem layout requirements. Note that only the load/unload methods are used; the index method isn't required.
This appears to be at least a de facto standard KServe API for model management, so it would make sense to support it as an option for other/custom model server implementations via our built-in adapter, as an alternative to implementing the native model-mesh gRPC model runtime SPI.
First, though, we should decide on the official/standard package and service name to use for the gRPC service, and copy its specification into the KServe repo somewhere.
#### Motivation
Related to [updating the MLServer runtime image](kserve/modelmesh-serving#355): the `ModelRepository` endpoint has been deprecated in MLServer.
References:
- kserve/modelmesh-serving#159
- SeldonIO/MLServer#616
#### Modifications
- Updated protobuf
- Updated mock server testing
- Updated runtime-adapter code to call new endpoint
#### Result
- MLServer runtime adapter no longer uses the deprecated model repository API
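Purely as an illustration of the mock-server style of testing mentioned above (this is not the repository's actual test code), a gRPC test double can accept arbitrary methods and record their fully-qualified names, which is enough to assert that the deprecated model-repository service is no longer being called. The method string invoked at the end is a hypothetical example, not MLServer's actual replacement endpoint.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"sync"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	"google.golang.org/grpc/test/bufconn"
	"google.golang.org/protobuf/types/known/emptypb"
)

func main() {
	lis := bufconn.Listen(1024 * 1024)

	var (
		mu     sync.Mutex
		called []string
	)

	// Mock server: accept any method, record its fully-qualified name,
	// and return an empty reply so unary callers complete normally.
	srv := grpc.NewServer(grpc.UnknownServiceHandler(
		func(_ interface{}, stream grpc.ServerStream) error {
			if m, ok := grpc.MethodFromServerStream(stream); ok {
				mu.Lock()
				called = append(called, m)
				mu.Unlock()
			}
			_ = stream.RecvMsg(&emptypb.Empty{}) // drain the request
			return stream.SendMsg(&emptypb.Empty{})
		}))
	go srv.Serve(lis)
	defer srv.Stop()

	conn, err := grpc.DialContext(context.Background(), "bufnet",
		grpc.WithContextDialer(func(ctx context.Context, _ string) (net.Conn, error) {
			return lis.DialContext(ctx)
		}),
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Hypothetical call the code under test might make; the method path
	// here is an illustrative assumption, not MLServer's actual API.
	_ = conn.Invoke(context.Background(),
		"/inference.GRPCInferenceService/RepositoryModelLoad",
		&emptypb.Empty{}, &emptypb.Empty{})

	mu.Lock()
	fmt.Println("methods called:", called)
	mu.Unlock()
}
```

Using `grpc.UnknownServiceHandler` avoids depending on any particular generated stubs, so the same test double works regardless of which service definition the adapter ends up targeting.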
Signed-off-by: Rafael Vasquez <[email protected]>