docs: Update docs to make isvc focal #190
Conversation
@@ -11,9 +11,9 @@ The model data itself is pulled from one or more external [storage instances](pr
 ModelMesh Serving makes use of two core Kubernetes Custom Resource types:

 - `ServingRuntime` - Templates for Pods that can serve one or more particular model formats. There are three "built-in" runtimes that cover the out-of-the-box model types (Triton, MLServer, and OpenVINO Model Server (OVMS)); [custom runtimes](runtimes/) can be defined by creating additional ones.
-- [`Predictor`](predictors/) - This represents a logical endpoint for serving predictions using a particular model. The Predictor spec specifies the model type, the storage in which it resides, and the path to the model within that storage. The corresponding endpoint is "stable" and will seamlessly transition between different model versions or types when the spec is updated.
+- [`InferenceService`](predictors/) - This is the main interface KServe uses for managing models on Kubernetes. ModelMesh Serving can be used for deploying `InferenceService` predictors, which represent a logical endpoint for serving predictions using a particular model. The `InferenceService` predictor spec specifies the model format, the storage location in which the model resides, and other optional configuration. The corresponding endpoint is "stable" and will seamlessly transition between different model versions or types when the spec is updated. Note that many features like transformers, explainers, and canary rollouts do not currently apply or fully work using InferenceServices with `deploymentMode` set to `ModelMesh`. Also, `PodSpec` fields that are set in the `InferenceService` predictor spec will be ignored.
I am not sure the `InferenceService` link should point to the ModelMesh predictors docs. I understand ModelMesh implements only the predictor component of the KServe InferenceService spec, but it feels a little strange to follow the link and see nothing about InferenceService. I would suggest either removing the link or having it point to the KServe InferenceService docs.
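For readers following the change, here is a minimal sketch of the InferenceService-centric deployment this PR documents. The `serving.kserve.io/deploymentMode: ModelMesh` annotation is what tells KServe to hand the InferenceService to ModelMesh; the name, model format, storage key, and path below are hypothetical placeholders, and depending on the KServe version a `storageUri` field may be used instead of the `storage` block.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-sklearn-isvc # hypothetical name
  annotations:
    # Tells KServe to deploy this InferenceService via ModelMesh
    serving.kserve.io/deploymentMode: ModelMesh
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storage:
        key: localMinIO # hypothetical key into the storage-config secret
        path: sklearn/mnist-svm.joblib # hypothetical path within the bucket
```

Note that, per the diff above, any `PodSpec` fields set in the predictor spec would be ignored by ModelMesh.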
- `modelId` - The internal id of the model in question. This includes a hash of the InferenceService's predictor spec.
- `time` - The time at which the failure occurred, if applicable.

Upon creation, InferenceService the model status will always transition to `Loaded` state (unless the loading fails), but later if unused it is possible that they end up in a `Standby` state which means they are still available to serve requests but the first request could incur a loading delay. Whether this happens is a function of the available capacity and usage pattern of other models. It's possible that models will transition from `Standby` back to `Loaded` "by themselves" if more capacity becomes available.
Maybe just me, not quite sure:

- what "InferenceService the model status" means. I suppose it means the model status of the InferenceService?
- what "they" refers to in "...they end up in a `Standby` state which means they are still available to serve requests..." I suppose "they" means InferenceServices?
Yea, it's not quite clear here. I will reword.
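For context on the fields being discussed, a hypothetical status block for a model that failed to load might look like the sketch below. The field names mirror the Predictor-style status described in these docs (`activeModelState`, `lastFailureInfo` with `modelId` and `time`); the concrete values are invented for illustration.

```yaml
status:
  activeModelState: FailedToLoad
  targetModelState: ""
  transitionStatus: BlockedByFailedLoad
  lastFailureInfo:
    # modelId embeds a hash of the InferenceService's predictor spec
    modelId: example-sklearn-isvc__isvc-5f8a2c1d3e # hypothetical id
    reason: ModelLoadFailed # hypothetical reason value
    message: "Model could not be retrieved from storage" # hypothetical
    time: "2021-11-01T18:25:43Z" # when the failure occurred
```

A healthy but idle model would instead report `activeModelState: Standby` (or `Loaded` once traffic arrives), with no `lastFailureInfo` set.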
Thanks, @pvaneck! Looks very good, just a few minor comments.
Thanks, @chinhuang007, for the thorough review!
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chinhuang007, pvaneck

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
Motivation
The InferenceService CRD has evolved enough to become the primary interface for interacting with ModelMesh.
The documentation should reflect that.
Modifications
Documentation was adjusted to make the InferenceService CRD focal as opposed to the previous Predictor CRD.
Examples and snippets were added for InferenceServices.
Result
Users will learn and become more familiar with deploying on ModelMesh using the KServe InferenceService.