- Creating a .mar file for an eager mode model
- Creating a .mar file for a TorchScript mode model
- Serving torchvision image classification models
- Serving custom model with custom service handler
- Serving text classification model
- Serving text classification model with scriptable tokenizer
- Serving object detection model
- Serving image segmentation model
- Serving Hugging Face transformers model
- Serving image generator model
- Serving machine translation model
- Serving WaveGlow text-to-speech synthesizer model
- Serving multimodal framework model
- Serving Image Classification Workflow
- Serving Neural Machine Translation Workflow
- Serving TorchRec DLRM (Recommender Model)
The following examples show how to create and serve model archives with TorchServe.
Use the following steps to create a torch model archive (.mar) for running an eager mode torch model in TorchServe:
- Pre-requisites to create a torch model archive (.mar):
  - serialized-file (.pt): This file holds the state_dict in the case of an eager mode model.
  - model-file (.py): This file contains the model class, extended from torch.nn.Module, that represents the model architecture. This parameter is mandatory for eager mode models. The file must contain only one class definition extended from torch.nn.Module.
  - index_to_name.json: This file contains the mapping of predicted index to class. The default TorchServe handlers return the predicted index and probability. This file can be passed to the model archiver using the --extra-files parameter.
  - version: The model's version.
  - handler: The name of a TorchServe default handler, or the path to a custom inference handler (.py).
- Syntax
torch-model-archiver --model-name <model_name> --version <model_version_number> --model-file <path_to_model_architecture_file> --serialized-file <path_to_state_dict_file> --handler <path_to_custom_handler_or_default_handler_name> --extra-files <path_to_index_to_name_json_file>
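As a concrete illustration of the syntax above, the following Python sketch assembles the eager mode command as an argument list and prints it. Only the flags come from the syntax line. The helper name build_eager_archiver_command and every value passed to it (my_model, model.py, model_state_dict.pt, and so on) are hypothetical placeholders, not files shipped with TorchServe.

```python
import shlex


def build_eager_archiver_command(
    model_name, version, model_file, serialized_file, handler, extra_files=None
):
    """Return the torch-model-archiver argument list for an eager mode model.

    Hypothetical helper for illustration; it only assembles the flags
    shown in the syntax line and does not run anything.
    """
    command = [
        "torch-model-archiver",
        "--model-name", model_name,
        "--version", version,
        "--model-file", model_file,            # mandatory for eager mode models
        "--serialized-file", serialized_file,  # file holding the state_dict
        "--handler", handler,                  # default handler name or path to a .py file
    ]
    if extra_files:
        # Optional extra files, for example index_to_name.json.
        command += ["--extra-files", extra_files]
    return command


# Hypothetical values, for illustration only.
command = build_eager_archiver_command(
    model_name="my_model",
    version="1.0",
    model_file="model.py",
    serialized_file="model_state_dict.pt",
    handler="image_classifier",
    extra_files="index_to_name.json",
)

# Print the command as a single shell-quoted line.
print(shlex.join(command))
```

To create the archive for real, the same list could be passed to subprocess.run(command, check=True) on a machine where torch-model-archiver is installed and the referenced files exist.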
Use the following steps to create a torch model archive (.mar) for running a TorchScript mode torch model in TorchServe:
- Pre-requisites to create a torch model archive (.mar):
  - serialized-file (.pt): This file holds the state_dict in the case of an eager mode model, or an executable ScriptModule in the case of TorchScript.
  - index_to_name.json: This file contains the mapping of predicted index to class. The default TorchServe handlers return the predicted index and probability. This file can be passed to the model archiver using the --extra-files parameter.
  - version: The model's version.
  - handler: The name of a TorchServe default handler, or the path to a custom inference handler (.py).
- Syntax
torch-model-archiver --model-name <model_name> --version <model_version_number> --serialized-file <path_to_executable_script_module> --extra-files <path_to_index_to_name_json_file> --handler <path_to_custom_handler_or_default_handler_name>
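Both archive types accept an index_to_name.json file through --extra-files. The sketch below writes such a file with Python's standard library. The three class labels and the output location are hypothetical, and the flat index-to-label layout is an assumption about the simplest form this mapping can take, so check it against the handler you plan to use.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical labels, ordered so that list position equals the predicted class index.
class_names = ["cat", "dog", "bird"]

# JSON object keys are always strings, so index 0 is stored under the key "0".
index_to_name = {str(index): name for index, name in enumerate(class_names)}

# Written to a temporary directory here; in practice, place it next to the model files.
output_path = Path(tempfile.mkdtemp()) / "index_to_name.json"
output_path.write_text(json.dumps(index_to_name, indent=2))

# Read the file back to confirm that the mapping round-trips.
loaded = json.loads(output_path.read_text())
print(loaded)
```

The resulting path is what would be supplied as the value of --extra-files when building the archive.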
The following example demonstrates how to create an image classifier model archive, serve it on TorchServe, and run image prediction using TorchServe's default image_classifier handler.
The following example demonstrates how to create and serve a custom NN model with a custom handler in TorchServe.
The following example demonstrates how to create and serve a custom text classification NN model with the default text_classifier handler provided by TorchServe.
This example shows how to combine a text classification model with a scriptable tokenizer into a single, scripted artifact to serve with TorchServe. A scriptable tokenizer is a tokenizer compatible with TorchScript.
The following example demonstrates how to create and serve a pretrained fast-rcnn NN model with the default object_detector handler provided by TorchServe.
The following example demonstrates how to create and serve a pretrained fcn NN model with the default image_segmenter handler provided by TorchServe.
The following example demonstrates how to create and serve pretrained transformer models from Hugging Face, such as BERT, RoBERTa, and XLM.
The following example demonstrates how to create and serve a pretrained DCGAN model from facebookresearch/pytorch_GAN_zoo.
The following example demonstrates how to create and serve a neural translation model using fairseq.
The following example demonstrates how to create and serve the WaveGlow text-to-speech synthesizer.
The following example demonstrates how to create and serve a multimodal model that combines audio, text, and video.
The following example demonstrates how to create and serve a complex image classification workflow for dog breed classification.
The following example demonstrates how to create and serve a complex neural machine translation workflow.
This example shows how to deploy a Deep Learning Recommendation Model (DLRM) with TorchRec.