diff --git a/docs/IE_DG/Deep_Learning_Inference_Engine_DevGuide.md b/docs/IE_DG/Deep_Learning_Inference_Engine_DevGuide.md index 0f07f5503811f5..f8362188ab2366 100644 --- a/docs/IE_DG/Deep_Learning_Inference_Engine_DevGuide.md +++ b/docs/IE_DG/Deep_Learning_Inference_Engine_DevGuide.md @@ -1,7 +1,5 @@ # Inference Engine Developer Guide {#openvino_docs_IE_DG_Deep_Learning_Inference_Engine_DevGuide} -> **NOTE:** [Intel® System Studio](https://software.intel.com/content/www/us/en/develop/tools/oneapi/commercial-base-iot.html) (click "Intel® System Studio Users" tab) is an all-in-one, cross-platform tool suite, purpose-built to simplify system bring-up and improve system and IoT device application performance on Intel® platforms. If you are using the Intel® Distribution of OpenVINO™ with Intel® System Studio, go to [Get Started with Intel® System Studio](https://software.intel.com/en-us/articles/get-started-with-openvino-and-intel-system-studio-2019). - This Guide provides an overview of the Inference Engine describing the typical workflow for performing inference of a pre-trained and optimized deep learning model and a set of sample applications. > **NOTE:** Before you perform inference with the Inference Engine, your models should be converted to the Inference Engine format using the Model Optimizer or built directly in runtime using nGraph API. To learn about how to use Model Optimizer, refer to the [Model Optimizer Developer Guide](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md). To learn about the pre-trained and optimized models delivered with the OpenVINO™ toolkit, refer to [Pre-Trained Models](@ref omz_models_group_intel). @@ -111,10 +109,8 @@ The common workflow contains the following steps: 8. **Get the output** - After inference is completed, get the output memory or read the memory you provided earlier. Do this with the `InferenceEngine::IInferRequest::GetBlob()` method. 
## Video: Inference Engine Concept -[![](https://img.youtube.com/vi/e6R13V8nbak/0.jpg)](https://www.youtube.com/watch?v=e6R13V8nbak) -\htmlonly + -\endhtmlonly ## Further Reading diff --git a/docs/IE_DG/Extensibility_DG/AddingNGraphOps.md b/docs/IE_DG/Extensibility_DG/AddingNGraphOps.md index 8ca911f7d0cda9..ed4d65595320a5 100644 --- a/docs/IE_DG/Extensibility_DG/AddingNGraphOps.md +++ b/docs/IE_DG/Extensibility_DG/AddingNGraphOps.md @@ -1,6 +1,6 @@ # Custom nGraph Operation {#openvino_docs_IE_DG_Extensibility_DG_AddingNGraphOps} -Inference Engine Extension API allows you to register operation sets (opsets) with custom nGraph operations to support models with operations which OpenVINO™ does not support out-of-the-box. +The Inference Engine Extension API allows you to register operation sets (opsets) with custom nGraph operations to support models with operations that OpenVINO™ does not support out-of-the-box. ## Operation Class diff --git a/docs/IE_DG/Extensibility_DG/Extension.md b/docs/IE_DG/Extensibility_DG/Extension.md index 178d0099df68ee..e941cb9c13c1a8 100644 --- a/docs/IE_DG/Extensibility_DG/Extension.md +++ b/docs/IE_DG/Extensibility_DG/Extension.md @@ -25,5 +25,6 @@ Also, an `Extension` object should implement the following methods: Implement the InferenceEngine::IExtension::getOpSets method if the extension contains custom layers. Read [Custom nGraph Operation](AddingNGraphOps.md) for more information. -To integrate execution kernels to the extension library, read [How to Implement Custom CPU Operations](CPU_Kernel.md). -To register a custom ONNX\* operator to the extension library, read [Custom ONNX Operators](Custom_ONNX_Ops.md). +To understand how to integrate execution kernels into the extension library, read the [documentation about development of custom CPU kernels](CPU_Kernel.md). + +To understand how to register a custom ONNX operator in the extension library, read the [documentation about custom ONNX operators](Custom_ONNX_Ops.md). 
diff --git a/docs/IE_DG/Int8Inference.md b/docs/IE_DG/Int8Inference.md index 889af6a53278b1..2577e7dc4ecab7 100644 --- a/docs/IE_DG/Int8Inference.md +++ b/docs/IE_DG/Int8Inference.md @@ -1,12 +1,5 @@ # Low-Precision 8-bit Integer Inference {#openvino_docs_IE_DG_Int8Inference} -## Table of Contents -1. [Supported devices](#supported-devices) -2. [Low-Precision 8-bit Integer Inference Workflow](#low-precision-8-bit-integer-inference-workflow) -3. [Prerequisites](#prerequisites) -4. [Inference](#inference) -5. [Results analysis](#results-analysis) - ## Supported devices Low-precision 8-bit inference is optimized for: @@ -24,34 +17,35 @@ Low-precision 8-bit inference is optimized for: ## Low-Precision 8-bit Integer Inference Workflow -8-bit computations (referred to as `int8`) offer better performance compared to the results of inference in higher precision (for example, `fp32`), because they allow loading more data into a single processor instruction. Usually the cost for significant boost is a reduced accuracy. However, it is proved that an accuracy drop can be negligible and depends on task requirements, so that the application engineer can set up the maximum accuracy drop that is acceptable. +8-bit computations (referred to as `int8`) offer better performance than inference in higher precision (for example, `fp32`), because they allow loading more data into a single processor instruction. Usually, the cost of this significant performance boost is reduced accuracy. However, it has been shown that the accuracy drop can be negligible and depends on task requirements, so the application engineer can set the maximum accuracy drop that is acceptable. For 8-bit integer computations, a model must be quantized. Quantized models can be downloaded from [Overview of OpenVINO™ Toolkit Intel's Pre-Trained Models](@ref omz_models_group_intel). If the model is not quantized, you can use the [Post-Training Optimization Tool](@ref pot_README) to quantize the model. 
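As a rough illustration of why quantization costs some accuracy, the following Python sketch applies uniform quantization to a single value: it clamps the input to a range, snaps it to one of `levels` evenly spaced points, and maps the result back to a floating-point value. The function name `fake_quantize` and its parameters are hypothetical names chosen for this sketch; it is a simplified model of the idea under those assumptions, not the actual implementation used by the Post-Training Optimization Tool or by any plugin.

```python
def fake_quantize(x, low, high, levels=256):
    """Simulate uniform quantization of a scalar (illustrative only).

    x is clamped to [low, high], snapped to the nearest of `levels`
    evenly spaced points in that range, and returned as a float.
    """
    if not low < high:
        raise ValueError("low must be smaller than high")
    if levels < 2:
        raise ValueError("levels must be at least 2")
    # Clamp values that fall outside the representable range.
    clamped = min(max(x, low), high)
    # Distance between two neighbouring quantization points.
    step = (high - low) / (levels - 1)
    # Index of the nearest quantization point (an integer in [0, levels - 1]).
    index = round((clamped - low) / step)
    # Map the integer index back to a floating-point value.
    return low + index * step


if __name__ == "__main__":
    # Hypothetical sample values: two in range, two out of range.
    for value in (-0.2, 0.3, 0.7531, 1.4):
        quantized = fake_quantize(value, low=0.0, high=1.0)
        error = abs(value - quantized)
        print(f"{value:+.4f} -> {quantized:+.6f} (error {error:.6f})")
```

For inputs inside the range, the rounding error in this sketch is bounded by half a step, `(high - low) / (2 * (levels - 1))`. Values outside the range are saturated to the nearest bound, which is why the choice of range matters for the resulting accuracy.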
The quantization process adds [FakeQuantize](../ops/quantization/FakeQuantize_1.md) layers on activations and weights for most layers. Read more about mathematical computations in the [Uniform Quantization with Fine-Tuning](https://github.com/openvinotoolkit/nncf/blob/develop/docs/compression_algorithms/Quantization.md). When you pass the quantized IR to the OpenVINO™ plugin, the plugin automatically recognizes it as a quantized model and performs 8-bit inference. Note, if you pass a quantized model to another plugin that does not support 8-bit inference but supports all operations from the model, the model is inferred in precision that this plugin supports. -In *Runtime stage* stage, the quantized model is loaded to the plugin. The plugin uses `Low Precision Transformation` component to update the model to infer it in low precision: - - Update `FakeQuantize` layers to have quantized output tensors in low precision range and add dequantization layers to compensate the update. Dequantization layers are pushed through as many layers as possible to have more layers in low precision. After that, most layers have quantized input tensors in low precision range and can be inferred in low precision. Ideally, dequantization layers should be fused in the next `FakeQuantize` layer. - - Weights are quantized and stored in `Constant` layers. +In *Runtime stage*, the quantized model is loaded to the plugin. The plugin uses the `Low Precision Transformation` component to update the model to infer it in low precision: + - Update `FakeQuantize` layers to have quantized output tensors in a low precision range and add dequantization layers to compensate the update. Dequantization layers are pushed through as many layers as possible to have more layers in low precision. After that, most layers have quantized input tensors in the low precision range and can be inferred in low precision. Ideally, dequantization layers should be fused in the next `FakeQuantize` layer. 
+ - Quantize weights and store them in `Constant` layers. ## Prerequisites -Let's explore quantized [TensorFlow* implementation of ResNet-50](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-50-tf) model. Use [Model Downloader](@ref omz_tools_downloader) tool to download the `fp16` model from [OpenVINO™ Toolkit - Open Model Zoo repository](https://github.com/openvinotoolkit/open_model_zoo): +Let's explore the quantized [TensorFlow* implementation of ResNet-50](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-50-tf) model. Use the [Model Downloader](@ref omz_tools_downloader) tool to download the `fp16` model from [OpenVINO™ Toolkit - Open Model Zoo repository](https://github.com/openvinotoolkit/open_model_zoo): ```sh -./downloader.py --name resnet-50-tf --precisions FP16-INT8 +cd $INTEL_OPENVINO_DIR/deployment_tools/tools/model_downloader +./downloader.py --name resnet-50-tf --precisions FP16-INT8 --output_dir ``` -After that you should quantize model by the [Model Quantizer](@ref omz_tools_downloader) tool. +After that, you should quantize the model by the [Model Quantizer](@ref omz_tools_downloader) tool. For the dataset, you can choose to download the ImageNet dataset from [here](https://www.image-net.org/download.php). ```sh -./quantizer.py --model_dir public/resnet-50-tf --dataset_dir --precisions=FP16-INT8 +./quantizer.py --model_dir --name public/resnet-50-tf --dataset_dir --precisions=FP16-INT8 ``` ## Inference -The simplest way to infer the model and collect performance counters is [C++ Benchmark Application](../../inference-engine/samples/benchmark_app/README.md). +The simplest way to infer the model and collect performance counters is the [C++ Benchmark Application](../../inference-engine/samples/benchmark_app/README.md). 
```sh ./benchmark_app -m resnet-50-tf.xml -d CPU -niter 1 -api sync -report_type average_counters -report_folder pc_report_dir ``` -If you infer the model with the OpenVINO™ CPU plugin and collect performance counters, all operations (except last not quantized SoftMax) are executed in INT8 precision. +If you infer the model with the Inference Engine CPU plugin and collect performance counters, all operations (except the last non-quantized SoftMax) are executed in INT8 precision. ## Results analysis diff --git a/docs/IE_DG/Legal_Information.md b/docs/IE_DG/Legal_Information.md deleted file mode 100644 index 3b39dba5810fa4..00000000000000 --- a/docs/IE_DG/Legal_Information.md +++ /dev/null @@ -1,12 +0,0 @@ -# Legal Information {#openvino_docs_IE_DG_Legal_Information} - -No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
-Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.
-This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.
-The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request.
-Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting [www.intel.com/design/literature.htm](http://www.intel.com/design/literature.htm).
-Intel, Intel logo, Intel Core, VTune, Xeon are trademarks of Intel Corporation in the U.S. and other countries.
-\* Other names and brands may be claimed as the property of others.
-Copyright © 2016-2018 Intel Corporation.
-This software and the related documents are Intel copyrighted materials, and your use of them is governed by the express license under which they were provided to you (License). Unless the License provides otherwise, you may not use, modify, copy, publish, distribute, disclose or transmit this software or the related documents without Intel's prior written permission.
-This software and the related documents are provided as is, with no express or implied warranties, other than those that are expressly stated in the License.
diff --git a/docs/IE_DG/Samples_Overview.md b/docs/IE_DG/Samples_Overview.md index f9e21cf5e4dcce..6d3cb495831096 100644 --- a/docs/IE_DG/Samples_Overview.md +++ b/docs/IE_DG/Samples_Overview.md @@ -109,7 +109,7 @@ for the debug configuration — in `/intel64/Debug/`. The recommended Windows* build environment is the following: * Microsoft Windows* 10 -* Microsoft Visual Studio* 2017, or 2019 +* Microsoft Visual Studio* 2017, or 2019. Make sure that C++ CMake tools for Windows is [enabled](https://docs.microsoft.com/en-us/cpp/build/cmake-projects-in-visual-studio?view=msvc-160#:~:text=The%20Visual%20C%2B%2B%20Tools%20for,Visual%20Studio%20generators%20are%20supported). * CMake* version 3.10 or higher > **NOTE**: If you want to use Microsoft Visual Studio 2019, you are required to install CMake 3.14. diff --git a/docs/IE_DG/ShapeInference.md b/docs/IE_DG/ShapeInference.md index dcc4b5c3f8837b..a265f2e9703e2e 100644 --- a/docs/IE_DG/ShapeInference.md +++ b/docs/IE_DG/ShapeInference.md @@ -33,7 +33,7 @@ If a model has a hard-coded batch dimension, use `InferenceEngine::CNNNetwork::s Inference Engine takes three kinds of a model description as an input, which are converted into an `InferenceEngine::CNNNetwork` object: 1. [Intermediate Representation (IR)](../MO_DG/IR_and_opsets.md) through `InferenceEngine::Core::ReadNetwork` -2. [ONNX model](../IE_DG/OnnxImporterTutorial.md) through `InferenceEngine::Core::ReadNetwork` +2. [ONNX model](../IE_DG/ONNX_Support.md) through `InferenceEngine::Core::ReadNetwork` 3. [nGraph function](../nGraph_DG/nGraph_dg.md) through the constructor of `InferenceEngine::CNNNetwork` `InferenceEngine::CNNNetwork` keeps an `ngraph::Function` object with the model description internally. 
diff --git a/docs/IE_DG/supported_plugins/CPU.md b/docs/IE_DG/supported_plugins/CPU.md index 8f75a792adeeb2..12b005099ba092 100644 --- a/docs/IE_DG/supported_plugins/CPU.md +++ b/docs/IE_DG/supported_plugins/CPU.md @@ -105,17 +105,18 @@ These are general options, also supported by other plugins: | Parameter name | Parameter values | Default | Description | | :--- | :--- | :--- | :----------------------------------------------------------------------------------------------------------------------------| -| KEY_EXCLUSIVE_ASYNC_REQUESTS | YES/NO | NO | Forces async requests (also from different executable networks) to execute serially. This prevents potential oversubscription| -| KEY_PERF_COUNT | YES/NO | NO | Enables gathering performance counters | +| `KEY_EXCLUSIVE_ASYNC_REQUESTS` | `YES`/`NO` | `NO` | Forces async requests (also from different executable networks) to execute serially. This prevents potential oversubscription| +| `KEY_PERF_COUNT` | `YES`/`NO` | `NO` | Enables gathering performance counters | CPU-specific settings: -| Parameter name | Parameter values | Default | Description | -| :--- | :--- | :--- | :--- | -| KEY_CPU_THREADS_NUM | positive integer values| 0 | Specifies the number of threads that CPU plugin should use for inference. Zero (default) means using all (logical) cores| -| KEY_CPU_BIND_THREAD | YES/NUMA/NO | YES | Binds inference threads to CPU cores. 'YES' (default) binding option maps threads to cores - this works best for static/synthetic scenarios like benchmarks. The 'NUMA' binding is more relaxed, binding inference threads only to NUMA nodes, leaving further scheduling to specific cores to the OS. This option might perform better in the real-life/contended scenarios. Note that for the latency-oriented cases (number of the streams is less or equal to the number of NUMA nodes, see below) both YES and NUMA options limit number of inference threads to the number of hardware cores (ignoring hyper-threading) on the multi-socket machines. 
| -| KEY_CPU_THROUGHPUT_STREAMS | KEY_CPU_THROUGHPUT_NUMA, KEY_CPU_THROUGHPUT_AUTO, or positive integer values| 1 | Specifies number of CPU "execution" streams for the throughput mode. Upper bound for the number of inference requests that can be executed simultaneously. All available CPU cores are evenly distributed between the streams. The default value is 1, which implies latency-oriented behavior for single NUMA-node machine, with all available cores processing requests one by one. On the multi-socket (multiple NUMA nodes) machine, the best latency numbers usually achieved with a number of streams matching the number of NUMA-nodes.
KEY_CPU_THROUGHPUT_NUMA creates as many streams as needed to accommodate NUMA and avoid associated penalties.
KEY_CPU_THROUGHPUT_AUTO creates bare minimum of streams to improve the performance; this is the most portable option if you don't know how many cores your target machine has (and what would be the optimal number of streams). Note that your application should provide enough parallel slack (for example, run many inference requests) to leverage the throughput mode.
Non-negative integer value creates the requested number of streams. If a number of streams is 0, no internal streams are created and user threads are interpreted as stream master threads.| -| KEY_ENFORCE_BF16 | YES/NO| YES | The name for setting to execute in bfloat16 precision whenever it is possible. This option lets plugin know to downscale the precision where it sees performance benefits from bfloat16 execution. Such option does not guarantee accuracy of the network, you need to verify the accuracy in this mode separately, based on performance and accuracy results. It should be your decision whether to use this option or not. | + +| Parameter name | Parameter values | Default | Description | +| :--- | :--- | :--- |:-----------------------------------------------------------------------------| +| `KEY_CPU_THREADS_NUM` | `positive integer values`| `0` | Specifies the number of threads that CPU plugin should use for inference. Zero (default) means using all (logical) cores| +| `KEY_CPU_BIND_THREAD` | `YES`/`NUMA`/`NO` | `YES` | Binds inference threads to CPU cores. 'YES' (default) binding option maps threads to cores - this works best for static/synthetic scenarios like benchmarks. The 'NUMA' binding is more relaxed, binding inference threads only to NUMA nodes, leaving further scheduling to specific cores to the OS. This option might perform better in the real-life/contended scenarios. Note that for the latency-oriented cases (number of the streams is less or equal to the number of NUMA nodes, see below) both YES and NUMA options limit number of inference threads to the number of hardware cores (ignoring hyper-threading) on the multi-socket machines. | +| `KEY_CPU_THROUGHPUT_STREAMS` | `KEY_CPU_THROUGHPUT_NUMA`, `KEY_CPU_THROUGHPUT_AUTO`, or `positive integer values`| `1` | Specifies number of CPU "execution" streams for the throughput mode. Upper bound for the number of inference requests that can be executed simultaneously. 
All available CPU cores are evenly distributed between the streams. The default value is 1, which implies latency-oriented behavior for a single NUMA-node machine, with all available cores processing requests one by one. On a multi-socket (multiple NUMA nodes) machine, the best latency numbers are usually achieved with a number of streams matching the number of NUMA nodes.
`KEY_CPU_THROUGHPUT_NUMA` creates as many streams as needed to accommodate NUMA and avoid associated penalties.
`KEY_CPU_THROUGHPUT_AUTO` creates the bare minimum of streams needed to improve performance; this is the most portable option if you don't know how many cores your target machine has (and what the optimal number of streams would be). Note that your application should provide enough parallel slack (for example, run many inference requests) to leverage the throughput mode.
Non-negative integer value creates the requested number of streams. If a number of streams is 0, no internal streams are created and user threads are interpreted as stream master threads.| +| `KEY_ENFORCE_BF16` | `YES`/`NO`| `YES` | The name for setting to execute in bfloat16 precision whenever it is possible. This option lets plugin know to downscale the precision where it sees performance benefits from bfloat16 execution. Such option does not guarantee accuracy of the network, you need to verify the accuracy in this mode separately, based on performance and accuracy results. It should be your decision whether to use this option or not. | > **NOTE**: To disable all internal threading, use the following set of configuration parameters: `KEY_CPU_THROUGHPUT_STREAMS=0`, `KEY_CPU_THREADS_NUM=1`, `KEY_CPU_BIND_THREAD=NO`. diff --git a/docs/IE_DG/supported_plugins/GPU.md b/docs/IE_DG/supported_plugins/GPU.md index cc12be98a121e1..ab84dfbac06a9f 100644 --- a/docs/IE_DG/supported_plugins/GPU.md +++ b/docs/IE_DG/supported_plugins/GPU.md @@ -99,23 +99,24 @@ The plugin supports the configuration parameters listed below. All parameters must be set before calling InferenceEngine::Core::LoadNetwork() in order to take effect. When specifying key values as raw strings (that is, when using Python API), omit the `KEY_` prefix. + | Parameter Name | Parameter Values | Default | Description | |---------------------|-----------------------------|-----------------|-----------------------------------------------------------| | `KEY_CACHE_DIR` | `""` | `""` | Specifies a directory where compiled OCL binaries can be cached. First model loading generates the cache, and all subsequent LoadNetwork calls use precompiled kernels which significantly improves load time. 
If empty - caching is disabled | | `KEY_PERF_COUNT` | `YES` / `NO` | `NO` | Collect performance counters during inference | | `KEY_CONFIG_FILE` | `" [ ...]"` | `""` | Load custom layer configuration files | -| `KEY_GPU_PLUGIN_PRIORITY` | `<0-3>` | `0` | OpenCL queue priority (before usage, make sure your OpenCL driver supports appropriate extension)
Higher value means higher priority for OpenCL queue. 0 disables the setting. | -| `KEY_GPU_PLUGIN_THROTTLE` | `<0-3>` | `0` | OpenCL queue throttling (before usage, make sure your OpenCL driver supports appropriate extension)
Lower value means lower driver thread priority and longer sleep time for it. 0 disables the setting. | -| `KEY_CLDNN_ENABLE_FP16_FOR_QUANTIZED_MODELS` | `YES` / `NO` | `YES` | Allows using FP16+INT8 mixed precision mode, so non-quantized parts of a model will be executed in FP16 precision for FP16 IR. Does not affect quantized FP32 IRs | -| `KEY_GPU_NV12_TWO_INPUTS` | `YES` / `NO` | `NO` | Controls preprocessing logic for nv12 input. If it's set to YES, then device graph will expect that user will set biplanar nv12 blob as input wich will be directly passed to device execution graph. Otherwise, preprocessing via GAPI is used to convert NV12->BGR, thus GPU graph have to expect single input | -| `KEY_GPU_THROUGHPUT_STREAMS` | `KEY_GPU_THROUGHPUT_AUTO`, or positive integer| 1 | Specifies a number of GPU "execution" streams for the throughput mode (upper bound for a number of inference requests that can be executed simultaneously).
This option is can be used to decrease GPU stall time by providing more effective load from several streams. Increasing the number of streams usually is more effective for smaller topologies or smaller input sizes. Note that your application should provide enough parallel slack (e.g. running many inference requests) to leverage full GPU bandwidth. Additional streams consume several times more GPU memory, so make sure the system has enough memory available to suit parallel stream execution. Multiple streams might also put additional load on CPU. If CPU load increases, it can be regulated by setting an appropriate `KEY_GPU_PLUGIN_THROTTLE` option value (see above). If your target system has relatively weak CPU, keep throttling low.
The default value is 1, which implies latency-oriented behavior.
`KEY_GPU_THROUGHPUT_AUTO` creates bare minimum of streams to improve the performance; this is the most portable option if you are not sure how many resources your target machine has (and what would be the optimal number of streams).
A positive integer value creates the requested number of streams. | -| `KEY_EXCLUSIVE_ASYNC_REQUESTS` | `YES` / `NO` | `NO` | Forces async requests (also from different executable networks) to execute serially.| -| `KEY_GPU_MAX_NUM_THREADS` | `integer value` | `maximum # of HW threads available in host environment` | Specifies the number of CPU threads that can be used for GPU engine, e.g, JIT compilation of GPU kernels or cpu kernel processing within GPU plugin. The default value is set as the number of maximum available threads in host environment to minimize the time for LoadNetwork, where the GPU kernel build time occupies a large portion. Note that if the specified value is larger than the maximum available # of threads or less than zero, it is set as maximum available # of threads. It can be specified with a smaller number than the available HW threads according to the usage scenario, e.g., when the user wants to assign more CPU threads while GPU plugin is running. Note that setting this value with lower number will affect not only the network loading time but also the cpu layers of GPU networks that are optimized with multi-threading. | -| `KEY_GPU_ENABLE_LOOP_UNROLLING` | `YES` / `NO` | `YES` | Enables recurrent layers such as TensorIterator or Loop with fixed iteration count to be unrolled. It is turned on by default. Turning this key on will achieve better inference performance for loops with not too many iteration counts (less than 16, as a rule of thumb). Turning this key off will achieve better performance for both graph loading time and inference time with many iteration counts (greater than 16). Note that turning this key on will increase the graph loading time in proportion to the iteration counts. Thus, this key should be turned off if graph loading time is considered to be most important target to optimize. 
| -| `KEY_CLDNN_PLUGIN_PRIORITY` | `<0-3>` | `0` | OpenCL queue priority (before usage, make sure your OpenCL driver supports appropriate extension)
Higher value means higher priority for OpenCL queue. 0 disables the setting. **Deprecated**. Please use KEY_GPU_PLUGIN_PRIORITY | -| `KEY_CLDNN_PLUGIN_THROTTLE` | `<0-3>` | `0` | OpenCL queue throttling (before usage, make sure your OpenCL driver supports appropriate extension)
Lower value means lower driver thread priority and longer sleep time for it. 0 disables the setting. **Deprecated**. Please use KEY_GPU_PLUGIN_THROTTLE | -| `KEY_CLDNN_GRAPH_DUMPS_DIR` | `""` | `""` | clDNN graph optimizer stages dump output directory (in GraphViz format) **Deprecated**. Will be removed in the next release | -| `KEY_CLDNN_SOURCES_DUMPS_DIR` | `""` | `""` | Final optimized clDNN OpenCL sources dump output directory. **Deprecated**. Will be removed in the next release | +| `KEY_GPU_PLUGIN_`
`PRIORITY` | `<0-3>` | `0` | OpenCL queue priority (before usage, make sure your OpenCL driver supports appropriate extension)
Higher value means higher priority for OpenCL queue. 0 disables the setting. | +| `KEY_GPU_PLUGIN_`
`THROTTLE` | `<0-3>` | `0` | OpenCL queue throttling (before usage, make sure your OpenCL driver supports appropriate extension)
Lower value means lower driver thread priority and longer sleep time for it. 0 disables the setting. | +| `KEY_CLDNN_ENABLE_`
`MODELS` | `YES` / `NO` | `YES` | Allows using FP16+INT8 mixed precision mode, so non-quantized parts of a model will be executed in FP16 precision for FP16 IR. Does not affect quantized FP32 IRs | +| `KEY_GPU_NV12_`
`TWO_INPUTS` | `YES` / `NO` | `NO` | Controls preprocessing logic for NV12 input. If it is set to `YES`, the device graph expects the user to set a biplanar NV12 blob as input, which is passed directly to the device execution graph. Otherwise, preprocessing via GAPI is used to convert NV12->BGR, so the GPU graph has to expect a single input | +| `KEY_GPU_THROUGHPUT_`
`STREAMS` | `KEY_GPU_THROUGHPUT_AUTO`, or positive integer| 1 | Specifies a number of GPU "execution" streams for the throughput mode (upper bound for a number of inference requests that can be executed simultaneously).
This option can be used to decrease GPU stall time by providing more effective load from several streams. Increasing the number of streams is usually more effective for smaller topologies or smaller input sizes. Note that your application should provide enough parallel slack (e.g. running many inference requests) to leverage full GPU bandwidth. Additional streams consume several times more GPU memory, so make sure the system has enough memory available to suit parallel stream execution. Multiple streams might also put additional load on the CPU. If CPU load increases, it can be regulated by setting an appropriate `KEY_GPU_PLUGIN_THROTTLE` option value (see above). If your target system has a relatively weak CPU, keep throttling low.
The default value is 1, which implies latency-oriented behavior.
`KEY_GPU_THROUGHPUT_AUTO` creates the bare minimum of streams needed to improve performance; this is the most portable option if you are not sure how many resources your target machine has (and what the optimal number of streams would be).
A positive integer value creates the requested number of streams. | +| `KEY_EXCLUSIVE_ASYNC_`
`REQUESTS` | `YES` / `NO` | `NO` | Forces async requests (also from different executable networks) to execute serially.| +| `KEY_GPU_MAX_NUM_`
`THREADS` | `integer value` | `maximum # of HW threads available in host environment` | Specifies the number of CPU threads that can be used for the GPU engine, e.g., JIT compilation of GPU kernels or CPU kernel processing within the GPU plugin. The default value is the maximum number of threads available in the host environment, which minimizes the time for LoadNetwork, where the GPU kernel build time occupies a large portion. Note that if the specified value is larger than the maximum available # of threads or less than zero, it is set to the maximum available # of threads. It can be set to a smaller number than the available HW threads according to the usage scenario, e.g., when the user wants to assign more CPU threads to other work while the GPU plugin is running. Note that setting a lower value affects not only the network loading time but also the CPU layers of GPU networks that are optimized with multi-threading. | +| `KEY_GPU_ENABLE_`
`LOOP_UNROLLING` | `YES` / `NO` | `YES` | Enables recurrent layers such as TensorIterator or Loop with fixed iteration count to be unrolled. It is turned on by default. Turning this key on achieves better inference performance for loops with a small iteration count (less than 16, as a rule of thumb). Turning this key off achieves better graph loading time and inference time for loops with a large iteration count (greater than 16). Note that turning this key on increases the graph loading time in proportion to the iteration count. Thus, turn this key off if graph loading time is the most important target to optimize. | +| `KEY_CLDNN_PLUGIN_`
`PRIORITY` | `<0-3>` | `0` | OpenCL queue priority (before usage, make sure your OpenCL driver supports appropriate extension)
Higher value means higher priority for OpenCL queue. 0 disables the setting. **Deprecated**. Use `KEY_GPU_PLUGIN_PRIORITY` instead | +| `KEY_CLDNN_PLUGIN_`
`THROTTLE` | `<0-3>` | `0` | OpenCL queue throttling (before usage, make sure your OpenCL driver supports appropriate extension)
Lower value means lower driver thread priority and longer sleep time for it. 0 disables the setting. **Deprecated**. Use `KEY_GPU_PLUGIN_THROTTLE` instead | +| `KEY_CLDNN_GRAPH_`
`DUMPS_DIR` | `""` | `""` | clDNN graph optimizer stages dump output directory (in GraphViz format) **Deprecated**. Will be removed in the next release | +| `KEY_CLDNN_SOURCES_`
`DUMPS_DIR` | `""` | `""` | Final optimized clDNN OpenCL sources dump output directory. **Deprecated**. Will be removed in the next release | | `KEY_DUMP_KERNELS` | `YES` / `NO` | `NO` | Dump the final kernels used for custom layers. **Deprecated**. Will be removed in the next release | | `KEY_TUNING_MODE` | `TUNING_DISABLED`
`TUNING_USE_EXISTING` | `TUNING_DISABLED` | Disable inference kernel tuning
Create tuning file (expect much longer runtime)
Use an existing tuning file. **Deprecated**. Will be removed in the next release | | `KEY_TUNING_FILE` | `""` | `""` | Tuning file to create / use. **Deprecated**. Will be removed in the next release | diff --git a/docs/IE_DG/supported_plugins/MULTI.md b/docs/IE_DG/supported_plugins/MULTI.md index a3f7dc2afc9a89..cebc03ba135fdc 100644 --- a/docs/IE_DG/supported_plugins/MULTI.md +++ b/docs/IE_DG/supported_plugins/MULTI.md @@ -96,10 +96,8 @@ Notice that you can use the FP16 IR to work with multi-device (as CPU automatica Also notice that no demos are (yet) fully optimized for the multi-device, by means of supporting the OPTIMAL_NUMBER_OF_INFER_REQUESTS metric, using the GPU streams/throttling, and so on. ## Video: MULTI Plugin -[![](https://img.youtube.com/vi/xbORYFEmrqU/0.jpg)](https://www.youtube.com/watch?v=xbORYFEmrqU) -\htmlonly + -\endhtmlonly ## See Also * [Supported Devices](Supported_Devices.md) diff --git a/docs/Legal_Information.md b/docs/Legal_Information.md index 2f3526f2902677..2936ae2a949665 100644 --- a/docs/Legal_Information.md +++ b/docs/Legal_Information.md @@ -1,22 +1,20 @@ # Legal Information {#openvino_docs_Legal_Information} -This software and the related documents are Intel copyrighted materials, and your use of them is governed by the express license (the “License”) under which they were provided to you. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. Unless the License provides otherwise, you may not use, modify, copy, publish, distribute, disclose or transmit this software or the related documents without Intel's prior written permission. This software and the related documents are provided as is, with no express or implied warranties, other than those that are expressly stated in the License. 
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade. - -This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps. The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request. Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting [www.intel.com/design/literature.htm](https://www.intel.com/design/literature.htm). - Performance varies by use, configuration and other factors. Learn more at [www.intel.com/PerformanceIndex](https://www.intel.com/PerformanceIndex). - -Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. See backup for configuration details. No product or component can be absolutely secure. - -Your costs and results may vary. - + +Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. See backup for configuration details. No product or component can be absolutely secure. + +Your costs and results may vary. + Intel technologies may require enabled hardware, software or service activation. -© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. \*Other names and brands may be claimed as the property of others. +OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos. 
+© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others. + ## OpenVINO™ Logo To build equity around the project, the OpenVINO logo was created for both Intel and community usage. The logo may only be used to represent the OpenVINO toolkit and offerings built using the OpenVINO toolkit. - + ## Logo Usage Guidelines The OpenVINO logo must be used in connection with truthful, non-misleading references to the OpenVINO toolkit, and for no other purpose. -Modification of the logo or use of any separate element(s) of the logo alone is not allowed. +Modification of the logo or use of any separate element(s) of the logo alone is not allowed. \ No newline at end of file diff --git a/docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md b/docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md index 2aed66ba719934..378d559f895805 100644 --- a/docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md +++ b/docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md @@ -1,136 +1,54 @@ # Model Optimizer Developer Guide {#openvino_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide} +## Introduction + Model Optimizer is a cross-platform command-line tool that facilitates the transition between the training and deployment environment, performs static model analysis, and adjusts deep learning models for optimal execution on end-point target devices. -Model Optimizer process assumes you have a network model trained using a supported deep learning framework. The scheme below illustrates the typical workflow for deploying a trained deep learning model: +Model Optimizer process assumes you have a network model trained using supported deep learning frameworks: Caffe*, TensorFlow*, Kaldi*, MXNet* or converted to the ONNX* format. 
Model Optimizer produces an Intermediate Representation (IR) of the network, which can be inferred with the [Inference Engine](../IE_DG/Deep_Learning_Inference_Engine_DevGuide.md). + +> **NOTE**: Model Optimizer does not infer models. Model Optimizer is an offline tool that runs before the inference takes place. + +The scheme below illustrates the typical workflow for deploying a trained deep learning model: ![](img/workflow_steps.png) -Model Optimizer produces an Intermediate Representation (IR) of the network, which can be read, loaded, and inferred with the Inference Engine. The Inference Engine API offers a unified API across a number of supported Intel® platforms. The Intermediate Representation is a pair of files describing the model: +The IR is a pair of files describing the model: * .xml - Describes the network topology * .bin - Contains the weights and biases binary data. -> **TIP**: You also can work with the Model Optimizer inside the OpenVINO™ [Deep Learning Workbench](@ref workbench_docs_Workbench_DG_Introduction) (DL Workbench). -> [DL Workbench](@ref workbench_docs_Workbench_DG_Introduction) is a platform built upon OpenVINO™ and provides a web-based graphical environment that enables you to optimize, fine-tune, analyze, visualize, and compare -> performance of deep learning models on various Intel® architecture -> configurations. In the DL Workbench, you can use most of OpenVINO™ toolkit components. ->
-> Proceed to an [easy installation from Docker](@ref workbench_docs_Workbench_DG_Install_from_Docker_Hub) to get started. - -## What's New in the Model Optimizer in this Release? - -* Common changes: - * Implemented several optimization transformations to replace sub-graphs of operations with HSwish, Mish, Swish and SoftPlus operations. - * Model Optimizer generates IR keeping shape-calculating sub-graphs **by default**. Previously, this behavior was triggered if the "--keep_shape_ops" command line parameter was provided. The key is ignored in this release and will be deleted in the next release. To trigger the legacy behavior to generate an IR for a fixed input shape (folding ShapeOf operations and shape-calculating sub-graphs to Constant), use the "--static_shape" command line parameter. Changing model input shape using the Inference Engine API in runtime may fail for such an IR. - * Fixed Model Optimizer conversion issues resulted in non-reshapeable IR using the Inference Engine reshape API. - * Enabled transformations to fix non-reshapeable patterns in the original networks: - * Hardcoded Reshape - * In Reshape(2D)->MatMul pattern - * Reshape->Transpose->Reshape when the pattern can be fused to the ShuffleChannels or DepthToSpace operation - * Hardcoded Interpolate - * In Interpolate->Concat pattern - * Added a dedicated requirements file for TensorFlow 2.X as well as the dedicated install prerequisites scripts. - * Replaced the SparseToDense operation with ScatterNDUpdate-4. -* ONNX*: - * Enabled an ability to specify the model output **tensor** name using the "--output" command line parameter. 
- * Added support for the following operations: - * Acosh - * Asinh - * Atanh - * DepthToSpace-11, 13 - * DequantizeLinear-10 (zero_point must be constant) - * HardSigmoid-1,6 - * QuantizeLinear-10 (zero_point must be constant) - * ReduceL1-11, 13 - * ReduceL2-11, 13 - * Resize-11, 13 (except mode="nearest" with 5D+ input, mode="tf_crop_and_resize", and attributes exclude_outside and extrapolation_value with non-zero values) - * ScatterND-11, 13 - * SpaceToDepth-11, 13 -* TensorFlow*: - * Added support for the following operations: - * Acosh - * Asinh - * Atanh - * CTCLoss - * EuclideanNorm - * ExtractImagePatches - * FloorDiv -* MXNet*: - * Added support for the following operations: - * Acosh - * Asinh - * Atanh -* Kaldi*: - * Fixed bug with ParallelComponent support. Now it is fully supported with no restrictions. - -> **NOTE:** -> [Intel® System Studio](https://software.intel.com/en-us/system-studio) is an all-in-one, cross-platform tool suite, purpose-built to simplify system bring-up and improve system and IoT device application performance on Intel® platforms. If you are using the Intel® Distribution of OpenVINO™ with Intel® System Studio, go to [Get Started with Intel® System Studio](https://software.intel.com/en-us/articles/get-started-with-openvino-and-intel-system-studio-2019). 
- -## Table of Contents - -* [Preparing and Optimizing your Trained Model with Model Optimizer](prepare_model/Prepare_Trained_Model.md) - * [Configuring Model Optimizer](prepare_model/Config_Model_Optimizer.md) - * [Converting a Model to Intermediate Representation (IR)](prepare_model/convert_model/Converting_Model.md) - * [Converting a Model Using General Conversion Parameters](prepare_model/convert_model/Converting_Model_General.md) - * [Converting Your Caffe* Model](prepare_model/convert_model/Convert_Model_From_Caffe.md) - * [Converting Your TensorFlow* Model](prepare_model/convert_model/Convert_Model_From_TensorFlow.md) - * [Converting BERT from TensorFlow](prepare_model/convert_model/tf_specific/Convert_BERT_From_Tensorflow.md) - * [Converting GNMT from TensorFlow](prepare_model/convert_model/tf_specific/Convert_GNMT_From_Tensorflow.md) - * [Converting YOLO from DarkNet to TensorFlow and then to IR](prepare_model/convert_model/tf_specific/Convert_YOLO_From_Tensorflow.md) - * [Converting Wide and Deep Models from TensorFlow](prepare_model/convert_model/tf_specific/Convert_WideAndDeep_Family_Models.md) - * [Converting FaceNet from TensorFlow](prepare_model/convert_model/tf_specific/Convert_FaceNet_From_Tensorflow.md) - * [Converting DeepSpeech from TensorFlow](prepare_model/convert_model/tf_specific/Convert_DeepSpeech_From_Tensorflow.md) - * [Converting Language Model on One Billion Word Benchmark from TensorFlow](prepare_model/convert_model/tf_specific/Convert_lm_1b_From_Tensorflow.md) - * [Converting Neural Collaborative Filtering Model from TensorFlow*](prepare_model/convert_model/tf_specific/Convert_NCF_From_Tensorflow.md) - * [Converting TensorFlow* Object Detection API Models](prepare_model/convert_model/tf_specific/Convert_Object_Detection_API_Models.md) - * [Converting TensorFlow*-Slim Image Classification Model Library Models](prepare_model/convert_model/tf_specific/Convert_Slim_Library_Models.md) - * [Converting CRNN Model from 
TensorFlow*](prepare_model/convert_model/tf_specific/Convert_CRNN_From_Tensorflow.md) - * [Converting Your MXNet* Model](prepare_model/convert_model/Convert_Model_From_MxNet.md) - * [Converting a Style Transfer Model from MXNet](prepare_model/convert_model/mxnet_specific/Convert_Style_Transfer_From_MXNet.md) - * [Converting Your Kaldi* Model](prepare_model/convert_model/Convert_Model_From_Kaldi.md) - * [Converting Your ONNX* Model](prepare_model/convert_model/Convert_Model_From_ONNX.md) - * [Converting Faster-RCNN ONNX* Model](prepare_model/convert_model/onnx_specific/Convert_Faster_RCNN.md) - * [Converting Mask-RCNN ONNX* Model](prepare_model/convert_model/onnx_specific/Convert_Mask_RCNN.md) - * [Converting GPT2 ONNX* Model](prepare_model/convert_model/onnx_specific/Convert_GPT2.md) - * [Converting Your PyTorch* Model](prepare_model/convert_model/Convert_Model_From_PyTorch.md) - * [Converting F3Net PyTorch* Model](prepare_model/convert_model/pytorch_specific/Convert_F3Net.md) - * [Converting QuartzNet PyTorch* Model](prepare_model/convert_model/pytorch_specific/Convert_QuartzNet.md) - * [Converting YOLACT PyTorch* Model](prepare_model/convert_model/pytorch_specific/Convert_YOLACT.md) - * [Model Optimizations Techniques](prepare_model/Model_Optimization_Techniques.md) - * [Cutting parts of the model](prepare_model/convert_model/Cutting_Model.md) - * [Sub-graph Replacement in Model Optimizer](prepare_model/customize_model_optimizer/Subgraph_Replacement_Model_Optimizer.md) - * [Supported Framework Layers](prepare_model/Supported_Frameworks_Layers.md) - * [Intermediate Representation and Operation Sets](IR_and_opsets.md) - * [Operations Specification](../ops/opset.md) - * [Intermediate Representation suitable for INT8 inference](prepare_model/convert_model/IR_suitable_for_INT8_inference.md) - * [Model Optimizer Extensibility](prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md) - * [Extending Model Optimizer with New 
Primitives](prepare_model/customize_model_optimizer/Extending_Model_Optimizer_with_New_Primitives.md) - * [Extending Model Optimizer with Caffe Python Layers](prepare_model/customize_model_optimizer/Extending_Model_Optimizer_with_Caffe_Python_Layers.md) - * [Extending Model Optimizer with Custom MXNet* Operations](prepare_model/customize_model_optimizer/Extending_MXNet_Model_Optimizer_with_New_Primitives.md) - * [Legacy Mode for Caffe* Custom Layers](prepare_model/customize_model_optimizer/Legacy_Mode_for_Caffe_Custom_Layers.md) - * [Model Optimizer Frequently Asked Questions](prepare_model/Model_Optimizer_FAQ.md) - -* [Known Issues](Known_Issues_Limitations.md) - -**Typical Next Step:** [Preparing and Optimizing your Trained Model with Model Optimizer](prepare_model/Prepare_Trained_Model.md) - -## Video: Model Optimizer Concept - -[![](https://img.youtube.com/vi/Kl1ptVb7aI8/0.jpg)](https://www.youtube.com/watch?v=Kl1ptVb7aI8) -\htmlonly - -\endhtmlonly - -## Video: Model Optimizer Basic Operation -[![](https://img.youtube.com/vi/BBt1rseDcy0/0.jpg)](https://www.youtube.com/watch?v=BBt1rseDcy0) -\htmlonly - -\endhtmlonly +Below is a simple command running Model Optimizer to generate an IR for the input model: + +```sh +python3 mo.py --input_model INPUT_MODEL +``` +To learn about all Model Optimizer parameters and conversion techniques, see the [Converting a Model to IR](prepare_model/convert_model/Converting_Model.md) page. + +> **TIP**: You can get started quickly with the Model Optimizer inside the OpenVINO™ [Deep Learning Workbench](@ref +> openvino_docs_get_started_get_started_dl_workbench) (DL Workbench). +> [DL Workbench](@ref workbench_docs_Workbench_DG_Introduction) is the OpenVINO™ toolkit UI that enables you to +> import a model, analyze its performance and accuracy, visualize the outputs, optimize and prepare the model for +> deployment on various Intel® platforms. + +## Videos +
+* Model Optimizer Concept. Duration: 3:56.
+* Model Optimizer Basic. Duration: 2:57.
+* Choosing the Right Precision. Duration: 4:18.
-## Video: Choosing the Right Precision -[![](https://img.youtube.com/vi/RF8ypHyiKrY/0.jpg)](https://www.youtube.com/watch?v=RF8ypHyiKrY) -\htmlonly - -\endhtmlonly diff --git a/docs/MO_DG/img/small_IR_graph_demonstration.png b/docs/MO_DG/img/small_IR_graph_demonstration.png index 91a3fe385ae32f..332c11fdb65b66 100644 --- a/docs/MO_DG/img/small_IR_graph_demonstration.png +++ b/docs/MO_DG/img/small_IR_graph_demonstration.png @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:c8ae479880ab43cdb12eeb2fbaaf3b7861f786413c583eeba906c5fdf4b66730 -size 30696 +oid sha256:e8a86ea362473121a266c0ec1257c8d428a4bb6438fecdc9d4a4f1ff5cfc9047 +size 26220 diff --git a/docs/MO_DG/img/workflow_steps.png b/docs/MO_DG/img/workflow_steps.png index 6bf780127ad14c..fee04b7cb33ebe 100644 --- a/docs/MO_DG/img/workflow_steps.png +++ b/docs/MO_DG/img/workflow_steps.png @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:5e22bc22d614c7335ae461a8ce449ea8695973d755faca718cf74b95972c94e2 -size 19773 +oid sha256:5281f26cbaa468dc4cafa4ce2fde35d338fe0f658bbb796abaaf793e951939f6 +size 13943 diff --git a/docs/MO_DG/prepare_model/Config_Model_Optimizer.md b/docs/MO_DG/prepare_model/Config_Model_Optimizer.md index 9b978d750aa586..3b190dd6272b33 100644 --- a/docs/MO_DG/prepare_model/Config_Model_Optimizer.md +++ b/docs/MO_DG/prepare_model/Config_Model_Optimizer.md @@ -1,8 +1,6 @@ -# Configuring the Model Optimizer {#openvino_docs_MO_DG_prepare_model_Config_Model_Optimizer} +# Installing Model Optimizer Pre-Requisites {#openvino_docs_MO_DG_prepare_model_Config_Model_Optimizer} -You must configure the Model Optimizer for the framework that was used to train -the model. This section tells you how to configure the Model Optimizer either -through scripts or by using a manual process. +Before running the Model Optimizer, you must install the Model Optimizer pre-requisites for the framework that was used to train the model. 
This section tells you how to install the pre-requisites either through scripts or by using a manual process. ## Using Configuration Scripts @@ -154,6 +152,10 @@ pip3 install -r requirements_onnx.txt ``` ## Using the protobuf Library in the Model Optimizer for Caffe\* +
+ Click to expand + + These procedures require: @@ -166,7 +168,7 @@ By default, the library executes pure Python\* language implementation, which is slow. These steps show how to use the faster C++ implementation of the protobuf library on Windows OS or Linux OS. -### Using the protobuf Library on Linux\* OS +#### Using the protobuf Library on Linux\* OS To use the C++ implementation of the protobuf library on Linux, it is enough to set up the environment variable: @@ -174,7 +176,7 @@ set up the environment variable: export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp ``` -### Using the protobuf Library on Windows\* OS +#### Using the protobuf Library on Windows\* OS On Windows, pre-built protobuf packages for Python versions 3.4, 3.5, 3.6, and 3.7 are provided with the installation package and can be found in @@ -262,6 +264,8 @@ python3 -m easy_install dist/protobuf-3.6.1-py3.6-win-amd64.egg set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp ``` +
+ ## See Also * [Converting a Model to Intermediate Representation (IR)](convert_model/Converting_Model.md) diff --git a/docs/MO_DG/prepare_model/Prepare_Trained_Model.md b/docs/MO_DG/prepare_model/Prepare_Trained_Model.md deleted file mode 100644 index a74d1b789a2f34..00000000000000 --- a/docs/MO_DG/prepare_model/Prepare_Trained_Model.md +++ /dev/null @@ -1,63 +0,0 @@ -# Preparing and Optimizing Your Trained Model {#openvino_docs_MO_DG_prepare_model_Prepare_Trained_Model} - -Inference Engine enables _deploying_ your network model trained with any of supported deep learning frameworks: Caffe\*, TensorFlow\*, Kaldi\*, MXNet\* or converted to the ONNX\* format. To perform the inference, the Inference Engine does not operate with the original model, but with its Intermediate Representation (IR), which is optimized for execution on end-point target devices. To generate an IR for your trained model, the Model Optimizer tool is used. - -## How the Model Optimizer Works - -Model Optimizer loads a model into memory, reads it, builds the internal representation of the model, optimizes it, and produces the Intermediate Representation. Intermediate Representation is the only format the Inference Engine accepts. - -> **NOTE**: Model Optimizer does not infer models. Model Optimizer is an offline tool that runs before the inference takes place. - -Model Optimizer has two main purposes: - -* **Produce a valid Intermediate Representation**. If this main conversion artifact is not valid, the Inference Engine cannot run. The primary responsibility of the Model Optimizer is to produce the two files (`.xml` and `.bin`) that form the Intermediate Representation. -* **Produce an optimized Intermediate Representation**. Pre-trained models contain layers that are important for training, such as the `Dropout` layer. These layers are useless during inference and might increase the inference time. 
In many cases, these operations can be automatically removed from the resulting Intermediate Representation. However, if a group of operations can be represented as a single mathematical operation, and thus as a single operation node in a model graph, the Model Optimizer recognizes such patterns and replaces this group of operation nodes with the only one operation. The result is an Intermediate Representation that has fewer operation nodes than the original model. This decreases the inference time. - -To produce a valid Intermediate Representation, the Model Optimizer must be able to read the original model operations, handle their properties and represent them in Intermediate Representation format, while maintaining validity of the resulting Intermediate Representation. The resulting model consists of operations described in the [Operations Specification](../../ops/opset.md). - -## What You Need to Know about Your Model - -Many common layers exist across known frameworks and neural network topologies. Examples of these layers are `Convolution`, `Pooling`, and `Activation`. To read the original model and produce the Intermediate Representation of a model, the Model Optimizer must be able to work with these layers. - -The full list of them depends on the framework and can be found in the [Supported Framework Layers](Supported_Frameworks_Layers.md) section. If your topology contains only layers from the list of layers, as is the case for the topologies used by most users, the Model Optimizer easily creates the Intermediate Representation. After that you can proceed to work with the Inference Engine. - -However, if you use a topology with layers that are not recognized by the Model Optimizer out of the box, see [Custom Layers in the Model Optimizer](customize_model_optimizer/Customize_Model_Optimizer.md) to learn how to work with custom layers. 
- -## Model Optimizer Directory Structure - -After installation with OpenVINO™ toolkit or Intel® Deep Learning Deployment Toolkit, the Model Optimizer folder has the following structure (some directories omitted for clarity): -``` -|-- model_optimizer - |-- extensions - |-- front - Front-End framework agnostic transformations (operations output shapes are not defined yet). - |-- caffe - Front-End Caffe-specific transformations and Caffe layers extractors - |-- CustomLayersMapping.xml.example - example of file for registering custom Caffe layers (compatible with the 2017R3 release) - |-- kaldi - Front-End Kaldi-specific transformations and Kaldi operations extractors - |-- mxnet - Front-End MxNet-specific transformations and MxNet symbols extractors - |-- onnx - Front-End ONNX-specific transformations and ONNX operators extractors - |-- tf - Front-End TensorFlow-specific transformations, TensorFlow operations extractors, sub-graph replacements configuration files. - |-- middle - Middle-End framework agnostic transformations (layers output shapes are defined). - |-- back - Back-End framework agnostic transformations (preparation for IR generation). 
- |-- mo - |-- back - Back-End logic: contains IR emitting logic - |-- front - Front-End logic: contains matching between Framework-specific layers and IR specific, calculation of output shapes for each registered layer - |-- graph - Graph utilities to work with internal IR representation - |-- middle - Graph transformations - optimizations of the model - |-- pipeline - Sequence of steps required to create IR for each framework - |-- utils - Utility functions - |-- tf_call_ie_layer - Source code that enables TensorFlow fallback in Inference Engine during model inference - |-- mo.py - Centralized entry point that can be used for any supported framework - |-- mo_caffe.py - Entry point particularly for Caffe - |-- mo_kaldi.py - Entry point particularly for Kaldi - |-- mo_mxnet.py - Entry point particularly for MXNet - |-- mo_onnx.py - Entry point particularly for ONNX - |-- mo_tf.py - Entry point particularly for TensorFlow -``` - -The following sections provide the information about how to use the Model Optimizer, from configuring the tool and generating an IR for a given model to customizing the tool for your needs: - -* [Configuring Model Optimizer](Config_Model_Optimizer.md) -* [Converting a Model to Intermediate Representation](convert_model/Converting_Model.md) -* [Custom Layers in Model Optimizer](customize_model_optimizer/Customize_Model_Optimizer.md) -* [Model Optimization Techniques](Model_Optimization_Techniques.md) -* [Model Optimizer Frequently Asked Questions](Model_Optimizer_FAQ.md) diff --git a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_MxNet.md b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_MxNet.md index 4b8c1816e8b318..85218eaf1a0a8c 100644 --- a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_MxNet.md +++ b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_MxNet.md @@ -27,14 +27,12 @@ A summary of the steps for optimizing and deploying a model that was trained wit |SSD-ResNet-50| 
[Repo](https://github.com/zhreshold/mxnet-ssd), [Symbol + Params](https://github.com/zhreshold/mxnet-ssd/releases/download/v0.6/resnet50_ssd_512_voc0712_trainval.zip)| |SSD-VGG-16-300| [Repo](https://github.com/zhreshold/mxnet-ssd), [Symbol + Params](https://github.com/zhreshold/mxnet-ssd/releases/download/v0.5-beta/vgg16_ssd_300_voc0712_trainval.zip)| |SSD-Inception v3| [Repo](https://github.com/zhreshold/mxnet-ssd), [Symbol + Params](https://github.com/zhreshold/mxnet-ssd/releases/download/v0.7-alpha/ssd_inceptionv3_512_voc0712trainval.zip)| -|FCN8 (Semantic Segmentation)| [Repo](https://github.com/apache/incubator-mxnet/tree/master/example/fcn-xs), [Symbol](https://www.dropbox.com/sh/578n5cxej7ofd6m/AAA9SFCBN8R_uL2CnAd3WQ5ia/FCN8s_VGG16-symbol.json?dl=0), [Params](https://www.dropbox.com/sh/578n5cxej7ofd6m/AABHWZHCtA2P6iR6LUflkxb_a/FCN8s_VGG16-0019-cpu.params?dl=0)| |MTCNN part 1 (Face Detection)| [Repo](https://github.com/pangyupo/mxnet_mtcnn_face_detection), [Symbol](https://github.com/pangyupo/mxnet_mtcnn_face_detection/blob/master/model/det1-symbol.json), [Params](https://github.com/pangyupo/mxnet_mtcnn_face_detection/blob/master/model/det1-0001.params)| |MTCNN part 2 (Face Detection)| [Repo](https://github.com/pangyupo/mxnet_mtcnn_face_detection), [Symbol](https://github.com/pangyupo/mxnet_mtcnn_face_detection/blob/master/model/det2-symbol.json), [Params](https://github.com/pangyupo/mxnet_mtcnn_face_detection/blob/master/model/det2-0001.params)| |MTCNN part 3 (Face Detection)| [Repo](https://github.com/pangyupo/mxnet_mtcnn_face_detection), [Symbol](https://github.com/pangyupo/mxnet_mtcnn_face_detection/blob/master/model/det3-symbol.json), [Params](https://github.com/pangyupo/mxnet_mtcnn_face_detection/blob/master/model/det3-0001.params)| |MTCNN part 4 (Face Detection)| [Repo](https://github.com/pangyupo/mxnet_mtcnn_face_detection), [Symbol](https://github.com/pangyupo/mxnet_mtcnn_face_detection/blob/master/model/det4-symbol.json), 
[Params](https://github.com/pangyupo/mxnet_mtcnn_face_detection/blob/master/model/det4-0001.params)| |Lightened_moon| [Repo](https://github.com/tornadomeet/mxnet-face/tree/master/model/lightened_moon), [Symbol](https://github.com/tornadomeet/mxnet-face/blob/master/model/lightened_moon/lightened_moon_fuse-symbol.json), [Params](https://github.com/tornadomeet/mxnet-face/blob/master/model/lightened_moon/lightened_moon_fuse-0082.params)| |RNN-Transducer| [Repo](https://github.com/HawkAaron/mxnet-transducer) | -|word_lm| [Repo](https://github.com/apache/incubator-mxnet/tree/master/example/rnn/word_lm) | **Other supported topologies** diff --git a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_TensorFlow.md b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_TensorFlow.md index 7e29a7668b2f24..17465ef6e62d8a 100644 --- a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_TensorFlow.md +++ b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_TensorFlow.md @@ -37,7 +37,7 @@ Detailed information on how to convert models from the TensorFlow 1 Detection Model Zoo is available in the [Converting TensorFlow Object Detection API Models](tf_specific/Convert_Object_Detection_API_Models.md) chapter. The table below contains models from the Object Detection Models zoo that are supported. +Detailed information on how to convert models from the TensorFlow 1 Object Detection Models Zoo and TensorFlow 2 Object Detection Models Zoo is available in the [Converting TensorFlow Object Detection API Models](tf_specific/Convert_Object_Detection_API_Models.md) chapter. The table below contains models from the Object Detection Models Zoo that are supported. 
| Model Name| TensorFlow 1 Object Detection API Models| | :------------- | -----:| @@ -405,10 +405,8 @@ Refer to [Supported Framework Layers ](../Supported_Frameworks_Layers.md) for th The Model Optimizer provides explanatory messages if it is unable to run to completion due to issues like typographical errors, incorrectly used options, or other issues. The message describes the potential cause of the problem and gives a link to the [Model Optimizer FAQ](../Model_Optimizer_FAQ.md). The FAQ has instructions on how to resolve most issues. The FAQ also includes links to relevant sections in the Model Optimizer Developer Guide to help you understand what went wrong. ## Video: Converting a TensorFlow Model -[![](https://img.youtube.com/vi/QW6532LtiTc/0.jpg)](https://www.youtube.com/watch?v=QW6532LtiTc) -\htmlonly + -\endhtmlonly ## Summary In this document, you learned: diff --git a/docs/MO_DG/prepare_model/convert_model/Converting_Model.md b/docs/MO_DG/prepare_model/convert_model/Converting_Model.md index 26ce1289b8c04e..60ab7e2ac71eaf 100644 --- a/docs/MO_DG/prepare_model/convert_model/Converting_Model.md +++ b/docs/MO_DG/prepare_model/convert_model/Converting_Model.md @@ -1,39 +1,20 @@ # Converting a Model to Intermediate Representation (IR) {#openvino_docs_MO_DG_prepare_model_convert_model_Converting_Model} -Use the mo.py script from the `/deployment_tools/model_optimizer` directory to run the Model Optimizer and convert the model to the Intermediate Representation (IR). -The simplest way to convert a model is to run mo.py with a path to the input model file and an output directory where you have write permissions: +Use the mo.py script from the `/deployment_tools/model_optimizer` directory to run the Model Optimizer and convert the model to the Intermediate Representation (IR): ```sh python3 mo.py --input_model INPUT_MODEL --output_dir ``` +You need to have write permissions for the output directory.
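The write-permission requirement for the output directory can be checked before launching a conversion. The following is only a minimal sketch under stated assumptions: `build_mo_command` is a hypothetical helper that is not part of the Model Optimizer, and `model.onnx` is a placeholder file name. Only the `--input_model` and `--output_dir` options are taken from the command shown in this page.

```python
import os
import shlex
import tempfile


def build_mo_command(input_model, output_dir):
    """Return the mo.py argument list after checking the output directory.

    Hypothetical helper for illustration only; it is not a Model Optimizer API.
    """
    if not os.path.isdir(output_dir):
        raise FileNotFoundError("Output directory does not exist: " + output_dir)
    # os.access reports whether the current user may write to the directory.
    if not os.access(output_dir, os.W_OK):
        raise PermissionError("No write permission for: " + output_dir)
    return ["python3", "mo.py",
            "--input_model", input_model,
            "--output_dir", output_dir]


if __name__ == "__main__":
    # A temporary directory stands in for a real output location.
    with tempfile.TemporaryDirectory() as out_dir:
        cmd = build_mo_command("model.onnx", out_dir)
        # Print the command instead of running it; mo.py may not be installed here.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list could be passed to `subprocess.run` from the Model Optimizer directory. Failing early on a missing or read-only directory avoids discovering the problem only when the conversion tries to write its results.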
-> **NOTE**: Some models require using additional arguments to specify conversion parameters, such as `--scale`, `--scale_values`, `--mean_values`, `--mean_file`. To learn about when you need to use these parameters, refer to [Converting a Model Using General Conversion Parameters](Converting_Model_General.md). - -The mo.py script is the universal entry point that can deduce the framework that has produced the input model by a standard extension of the model file: - -* `.caffemodel` - Caffe\* models -* `.pb` - TensorFlow\* models -* `.params` - MXNet\* models -* `.onnx` - ONNX\* models -* `.nnet` - Kaldi\* models. - -If the model files do not have standard extensions, you can use the ``--framework {tf,caffe,kaldi,onnx,mxnet,paddle}`` option to specify the framework type explicitly. - -For example, the following commands are equivalent: -```sh -python3 mo.py --input_model /user/models/model.pb -``` -```sh -python3 mo.py --framework tf --input_model /user/models/model.pb -``` +> **NOTE**: Some models require using additional arguments to specify conversion parameters, such as `--input_shape`, `--scale`, `--scale_values`, `--mean_values`, `--mean_file`. To learn about when you need to use these parameters, refer to [Converting a Model Using General Conversion Parameters](Converting_Model_General.md). To adjust the conversion process, you may use general parameters defined in the [Converting a Model Using General Conversion Parameters](Converting_Model_General.md) and Framework-specific parameters for: -* [Caffe](Convert_Model_From_Caffe.md), -* [TensorFlow](Convert_Model_From_TensorFlow.md), -* [MXNet](Convert_Model_From_MxNet.md), -* [ONNX](Convert_Model_From_ONNX.md), -* [Kaldi](Convert_Model_From_Kaldi.md). -* [Paddle](Convert_Model_From_Paddle.md). 
+* [Caffe](Convert_Model_From_Caffe.md) +* [TensorFlow](Convert_Model_From_TensorFlow.md) +* [MXNet](Convert_Model_From_MxNet.md) +* [ONNX](Convert_Model_From_ONNX.md) +* [Kaldi](Convert_Model_From_Kaldi.md) ## See Also diff --git a/docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md b/docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md index 2d267cda3e7172..913278a8e2ac0e 100644 --- a/docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md +++ b/docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md @@ -212,8 +212,7 @@ Launch the Model Optimizer for the Caffe bvlc_alexnet model with reversed input python3 mo.py --input_model bvlc_alexnet.caffemodel --reverse_input_channels --mean_values [255,255,255] --data_type FP16 --output_dir ``` -Launch the Model Optimizer for the Caffe bvlc_alexnet model with extensions listed in specified directories, specified mean_images binaryproto. - file For more information about extensions, please refer to [this](../customize_model_optimizer/Extending_Model_Optimizer_with_New_Primitives.md) page. +Launch the Model Optimizer for the Caffe bvlc_alexnet model with extensions listed in specified directories, specified mean_images binaryproto file. For more information about extensions, please refer to [this](../customize_model_optimizer/Extending_Model_Optimizer_with_New_Primitives.md) page. 
```sh python3 mo.py --input_model bvlc_alexnet.caffemodel --extensions /home/,/some/other/path/ --mean_file /path/to/binaryproto --output_dir ``` diff --git a/docs/MO_DG/prepare_model/convert_model/Cutting_Model.md b/docs/MO_DG/prepare_model/convert_model/Cutting_Model.md index d86368a9f708f5..203fc94862a7fa 100644 --- a/docs/MO_DG/prepare_model/convert_model/Cutting_Model.md +++ b/docs/MO_DG/prepare_model/convert_model/Cutting_Model.md @@ -19,7 +19,7 @@ Model Optimizer provides command line options `--input` and `--output` to specif * `--input` option accepts a comma-separated list of layer names of the input model that should be treated as new entry points to the model. * `--output` option accepts a comma-separated list of layer names of the input model that should be treated as new exit points from the model. -The `--input` option is required for cases unrelated to model cutting. For example, when the model contains several inputs and `--input_shape` or `--mean_values` options are used, you should use the `--input` option to specify the order of input nodes for correct mapping between multiple items provided in `--input_shape` and `--mean_values` and the inputs in the model. This is out of scope. +The `--input` option is required for cases unrelated to model cutting. For example, when the model contains several inputs and `--input_shape` or `--mean_values` options are used, you should use the `--input` option to specify the order of input nodes for correct mapping between multiple items provided in `--input_shape` and `--mean_values` and the inputs in the model. Details on these options are out of scope for this document, which focuses on model cutting. Model cutting is illustrated with Inception V1. This model is in `models/research/slim` repository. [This section](Converting_Model.md) describes pre-work to prepare the model for the Model Optimizer to be ready to proceed with this chapter. 
diff --git a/docs/MO_DG/prepare_model/convert_model/IR_suitable_for_INT8_inference.md b/docs/MO_DG/prepare_model/convert_model/IR_suitable_for_INT8_inference.md index fa4bdb50554913..4f9baa1386cb7d 100644 --- a/docs/MO_DG/prepare_model/convert_model/IR_suitable_for_INT8_inference.md +++ b/docs/MO_DG/prepare_model/convert_model/IR_suitable_for_INT8_inference.md @@ -9,7 +9,7 @@ Intermediate Representation (IR) should be specifically formed to be suitable fo Such an IR is called a Low Precision IR and you can generate it in two ways: - [Quantize regular IR with the Post-Training Optimization tool](@ref pot_README) - Use the Model Optimizer for a model pretrained for Low Precision inference: TensorFlow\* pre-TFLite models (`.pb` model file with `FakeQuantize*` operations) and ONNX\* quantized models. -Both Tensorflow and ONNX quantized models could be prepared by [Neural Network Compression Framework](https://github.com/openvinotoolkit/nncf/blob/develop/README.md) +Both TensorFlow and ONNX quantized models could be prepared by [Neural Network Compression Framework](https://github.com/openvinotoolkit/nncf/blob/develop/README.md). For an operation to be executed in INT8, it must have `FakeQuantize` operations as inputs. See the [specification of `FakeQuantize` operation](../../../ops/quantization/FakeQuantize_1.md) for details. @@ -17,7 +17,7 @@ See the [specification of `FakeQuantize` operation](../../../ops/quantization/Fa To execute the `Convolution` operation in INT8 on CPU, both data and weight inputs should have `FakeQuantize` as an input operation: ![](../../img/expanded_int8_Convolution_weights.png) -Low pecision IR is also suitable for FP32 and FP16 inference if a chosen plugin supports all operations of the IR, because the only difference between a Low Precision IR and FP16 or FP32 IR is the existence of `FakeQuantize` in the Low Precision IR. 
+Low precision IR is also suitable for FP32 and FP16 inference if a chosen plugin supports all operations of the IR, because the only difference between a Low Precision IR and FP16 or FP32 IR is the existence of `FakeQuantize` in the Low Precision IR. Plugins with Low Precision Inference support recognize these sub-graphs and quantize them during the inference time. Plugins without Low Precision support execute all operations, including `FakeQuantize`, as is in the FP32 or FP16 precision. diff --git a/docs/MO_DG/prepare_model/convert_model/mxnet_specific/Convert_Style_Transfer_From_MXNet.md b/docs/MO_DG/prepare_model/convert_model/mxnet_specific/Convert_Style_Transfer_From_MXNet.md index f0ec23d5a9f631..eb1a7094673e2f 100644 --- a/docs/MO_DG/prepare_model/convert_model/mxnet_specific/Convert_Style_Transfer_From_MXNet.md +++ b/docs/MO_DG/prepare_model/convert_model/mxnet_specific/Convert_Style_Transfer_From_MXNet.md @@ -90,6 +90,8 @@ Where the `models/13` string is composed of the following substrings: * `models/`: path to the folder that contains .nd files with pre-trained styles weights * `13`: prefix pointing to 13_decoder, which is the default decoder for the repository +>**NOTE**: If you get an error saying "No module named 'cPickle'", try running the script from this step in Python 2. Then return to Python 3 for the remaining steps. + You can choose any style from [collection of pre-trained weights](https://pan.baidu.com/s/1skMHqYp). (On the Chinese-language page, click the down arrow next to a size in megabytes. Then wait for an overlay box to appear, and click the blue button in it to download.) The `generate()` function generates `nst_vgg19-symbol.json` and `vgg19-symbol.json` files for the specified shape. In the code, it is [1024 x 768] for a 4:3 ratio, and you can specify another, for example, [224,224] for a square ratio. #### 6. 
Run the Model Optimizer to generate an Intermediate Representation (IR): diff --git a/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_F3Net.md b/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_F3Net.md index ffb16eb5f7cc5f..0d130197f74a2c 100644 --- a/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_F3Net.md +++ b/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_F3Net.md @@ -2,15 +2,19 @@ [F3Net](https://github.com/weijun88/F3Net): Fusion, Feedback and Focus for Salient Object Detection +## Clone the F3Net Model Repository + +To clone the repository, run the following command: +```bash +git clone http://github.com/weijun88/F3Net.git +``` + ## Download and Convert the Model to ONNX* To download the pre-trained model or train the model yourself, refer to the -[instruction](https://github.com/weijun88/F3Net/blob/master/README.md) in the F3Net model repository. Firstly, -convert the model to ONNX\* format. Create and run the script with the following content in the `src` -directory of the model repository: +[instruction](https://github.com/weijun88/F3Net/blob/master/README.md) in the F3Net model repository. First, convert the model to ONNX\* format. Create and run the script with the following content in the `src` directory of the model repository: ```python import torch - from dataset import Config from net import F3Net @@ -19,7 +23,7 @@ net = F3Net(cfg) image = torch.zeros([1, 3, 352, 352]) torch.onnx.export(net, image, 'f3net.onnx', export_params=True, do_constant_folding=True, opset_version=11) ``` -The script generates the ONNX\* model file f3net.onnx. The model conversion was tested with the repository hash commit `eecace3adf1e8946b571a4f4397681252f9dc1b8`. +The script generates the ONNX\* model file `f3net.onnx`. This model conversion was tested with the repository hash commit `eecace3adf1e8946b571a4f4397681252f9dc1b8`. 
## Convert ONNX* F3Net Model to IR diff --git a/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_RNNT.md b/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_RNNT.md index a58e886d4f4230..31de647f379158 100644 --- a/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_RNNT.md +++ b/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_RNNT.md @@ -20,15 +20,15 @@ mkdir rnnt_for_openvino cd rnnt_for_openvino ``` -**Step 3**. Download pretrained weights for PyTorch implementation from https://zenodo.org/record/3662521#.YG21DugzZaQ. -For UNIX*-like systems you can use wget: +**Step 3**. Download pretrained weights for PyTorch implementation from [https://zenodo.org/record/3662521#.YG21DugzZaQ](https://zenodo.org/record/3662521#.YG21DugzZaQ). +For UNIX*-like systems you can use `wget`: ```bash wget https://zenodo.org/record/3662521/files/DistributedDataParallel_1576581068.9962234-epoch-100.pt ``` The link was taken from `setup.sh` in the `speech_recoginitin/rnnt` subfolder. You will get exactly the same weights as -if you were following the steps from https://github.com/mlcommons/inference/tree/master/speech_recognition/rnnt. +if you were following the steps from [https://github.com/mlcommons/inference/tree/master/speech_recognition/rnnt](https://github.com/mlcommons/inference/tree/master/speech_recognition/rnnt). -**Step 4**. Install required python* packages: +**Step 4**. Install required Python packages: ```bash pip3 install torch toml ``` @@ -37,7 +37,7 @@ pip3 install torch toml `export_rnnt_to_onnx.py` and run it in the current directory `rnnt_for_openvino`: > **NOTE**: If you already have a full clone of MLCommons inference repository, you need to -> specify `mlcommons_inference_path` variable. +> specify the `mlcommons_inference_path` variable. 
```python import toml @@ -92,8 +92,7 @@ torch.onnx.export(model.joint, (f, g), "rnnt_joint.onnx", opset_version=12, python3 export_rnnt_to_onnx.py ``` -After completing this step, the files rnnt_encoder.onnx, rnnt_prediction.onnx, and rnnt_joint.onnx will be saved in -the current directory. +After completing this step, the files `rnnt_encoder.onnx`, `rnnt_prediction.onnx`, and `rnnt_joint.onnx` will be saved in the current directory. **Step 6**. Run the conversion command: @@ -102,6 +101,6 @@ python3 {path_to_openvino}/mo.py --input_model rnnt_encoder.onnx --input "input. python3 {path_to_openvino}/mo.py --input_model rnnt_prediction.onnx --input "input.1[1 1],1[2 1 320],2[2 1 320]" python3 {path_to_openvino}/mo.py --input_model rnnt_joint.onnx --input "0[1 1 1024],1[1 1 320]" ``` -Please note that hardcoded value for sequence length = 157 was taken from the MLCommons, but conversion to IR preserves -network [reshapeability](../../../../IE_DG/ShapeInference.md); this means you can change input shapes manually to any value either during conversion or -inference. +Please note that hardcoded value for sequence length = 157 was taken from the MLCommons but conversion to IR preserves +network [reshapeability](../../../../IE_DG/ShapeInference.md), this means you can change input shapes manually to any value either during conversion or +inference. \ No newline at end of file diff --git a/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_YOLACT.md b/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_YOLACT.md index 9fb7e1ca9e9ce3..50272a33f74d4c 100644 --- a/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_YOLACT.md +++ b/docs/MO_DG/prepare_model/convert_model/pytorch_specific/Convert_YOLACT.md @@ -138,7 +138,7 @@ git checkout 57b8f2d95e62e2e649b382f516ab41f949b57239 3. Set up the environment as described in `README.md`. -**Step 2**. 
Download a pre-trained model from the list attached in the `Evaluation` section of `README.md` document, for example `yolact_base_54_800000.pth`. +**Step 2**. Download a pre-trained model from the list attached in the `Evaluation` section of the [README.md](https://github.com/dbolya/yolact/blob/master/README.md) document, for example `yolact_base_54_800000.pth`. **Step 3**. Export the model to ONNX* format. @@ -187,5 +187,4 @@ python path/to/model_optimizer/mo.py \ --input_model /path/to/yolact.onnx \ --reverse_input_channels \ --scale 255 -``` - +``` \ No newline at end of file diff --git a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_XLNet_From_Tensorflow.md b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_XLNet_From_Tensorflow.md index cc121ab19e1ad9..ac706c664f2d1e 100644 --- a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_XLNet_From_Tensorflow.md +++ b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_XLNet_From_Tensorflow.md @@ -24,13 +24,15 @@ To get pb-file from the archive contents, you need to do the following. 1. 
Run commands ```sh - cd ~ - mkdir XLNet-Base - cd XLNet-Base - git clone https://github.com/zihangdai/xlnet - wget https://storage.googleapis.com/xlnet/released_models/cased_L-12_H-768_A-12.zip - unzip cased_L-12_H-768_A-12.zip - mkdir try_save +cd ~ +mkdir XLNet-Base +cd XLNet-Base +git clone https://github.com/zihangdai/xlnet +wget https://storage.googleapis.com/xlnet/released_models/cased_L-12_H-768_A-12.zip +unzip cased_L-12_H-768_A-12.zip +mkdir try_save +cd xlnet +sed -i "s/tf\.train\.Optimizer/tf\.train.Optimizer if tf.version < '1.15' else tf.compat.v1.train.Optimizer/g" model_utils.py ``` diff --git a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_YOLO_From_Tensorflow.md b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_YOLO_From_Tensorflow.md index 60674b1c768ad8..ae2de000433b67 100644 --- a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_YOLO_From_Tensorflow.md +++ b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_YOLO_From_Tensorflow.md @@ -67,7 +67,11 @@ git checkout ed60b90 ``` 3. Download [coco.names](https://raw.githubusercontent.com/pjreddie/darknet/master/data/coco.names) file from the DarkNet website **OR** use labels that fit your task. 4. Download the [yolov3.weights](https://pjreddie.com/media/files/yolov3.weights) (for the YOLOv3 model) or [yolov3-tiny.weights](https://pjreddie.com/media/files/yolov3-tiny.weights) (for the YOLOv3-tiny model) file **OR** use your pre-trained weights with the same structure -5. Run a converter: +5. Install PIL, which is used by the conversion script in the repo: +```sh +pip install PIL +``` +6. 
Run a converter: - for YOLO-v3: ```sh python3 convert_weights_pb.py --class_names coco.names --data_format NHWC --weights_file yolov3.weights diff --git a/docs/MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md b/docs/MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md index cda8458e4dd72f..567543a01a88dd 100644 --- a/docs/MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md +++ b/docs/MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md @@ -34,7 +34,7 @@ Model Optimizer extensibility mechanism enables support of new operations and custom transformations to generate the optimized intermediate representation (IR) as described in the [Deep Learning Network Intermediate Representation and Operation Sets in OpenVINO™](../../IR_and_opsets.md). This -mechanism is a core part of the Model Optimizer. The Model Optimizer itself uses it under the hood, being a huge set of examples on how to add custom logic to support your model. +mechanism is a core part of the Model Optimizer, which uses it under the hood, so the Model Optimizer itself is a huge set of examples for adding custom logic to support your model. There are several cases when the customization is needed: diff --git a/docs/benchmarks/performance_benchmarks_faq.md b/docs/benchmarks/performance_benchmarks_faq.md index 2ff33612097b38..b833f03c531862 100644 --- a/docs/benchmarks/performance_benchmarks_faq.md +++ b/docs/benchmarks/performance_benchmarks_faq.md @@ -19,31 +19,34 @@ All of the performance benchmarks were generated using the open-sourced tool wit #### 6. What image sizes are used for the classification network models? The image size used in the inference depends on the network being benchmarked. The following table shows the list of input sizes for each network model. 
-| **Model** | **Public Network** | **Task** | **Input Size** (Height x Width) | -|------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------|-----------------------------|-----------------------------------| -| [bert-large-uncased-whole-word-masking-squad](https://github.com/openvinotoolkit/open_model_zoo/tree/develop/models/intel/bert-large-uncased-whole-word-masking-squad-int8-0001) | BERT-large |question / answer |384| -| [deeplabv3-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/deeplabv3) | DeepLab v3 Tf |semantic segmentation | 513x513 | -| [densenet-121-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/densenet-121-tf) | Densenet-121 Tf |classification | 224x224 | -| [facenet-20180408-102900-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/facenet-20180408-102900) | FaceNet TF | face recognition | 160x160 | -| [faster_rcnn_resnet50_coco-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/faster_rcnn_resnet50_coco) | Faster RCNN Tf | object detection | 600x1024 | -| [googlenet-v1-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/googlenet-v1-tf) | GoogLeNet_ILSVRC-2012 | classification | 224x224 | -| [inception-v3-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/googlenet-v3) | Inception v3 Tf | classification | 299x299 | -| [mobilenet-ssd-CF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/mobilenet-ssd) | SSD (MobileNet)_COCO-2017_Caffe | object detection | 300x300 | -| [mobilenet-v1-1.0-224-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/mobilenet-v1-1.0-224-tf) | MobileNet v1 Tf | classification | 224x224 | -| 
[mobilenet-v2-1.0-224-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/mobilenet-v2-1.0-224) | MobileNet v2 Tf | classification | 224x224 | -| [mobilenet-v2-pytorch](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/mobilenet-v2-pytorch ) | Mobilenet V2 PyTorch | classification | 224x224 | -| [resnet-18-pytorch](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-18-pytorch) | ResNet-18 PyTorch | classification | 224x224 | -| [resnet-50-pytorch](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-50-pytorch) | ResNet-50 v1 PyTorch | classification | 224x224 | -| [resnet-50-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-50-tf) | ResNet-50_v1_ILSVRC-2012 | classification | 224x224 | -| [se-resnext-50-CF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/se-resnext-50) | Se-ResNext-50_ILSVRC-2012_Caffe | classification | 224x224 | -| [squeezenet1.1-CF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/squeezenet1.1) | SqueezeNet_v1.1_ILSVRC-2012_Caffe | classification | 227x227 | -| [ssd300-CF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/ssd300) | SSD (VGG-16)_VOC-2007_Caffe | object detection | 300x300 | -| [yolo_v3-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v3-tf) | TF Keras YOLO v3 Modelset | object detection | 300x300 | -| [yolo_v4-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v4-tf) | Yolo-V4 TF | object detection | 608x608 | -| [ssd_mobilenet_v1_coco-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/ssd_mobilenet_v1_coco) | ssd_mobilenet_v1_coco | object detection | 300x300 | -| [ssdlite_mobilenet_v2-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/ssdlite_mobilenet_v2) | 
ssd_mobilenet_v2 | object detection | 300x300 | -| [unet-camvid-onnx-0001](https://github.com/openvinotoolkit/open_model_zoo/blob/master/models/intel/unet-camvid-onnx-0001/description/unet-camvid-onnx-0001.md) | U-Net | semantic segmentation | 368x480 | - +| **Model** | **Public Network** | **Task** | **Input Size** (Height x Width) | +|------------------------------------------------------------------------------------------------------------------------------------|------------------------------------|-----------------------------|-----------------------------------| +| [bert-large-uncased-whole-word-masking-squad](https://github.com/openvinotoolkit/open_model_zoo/tree/develop/models/intel/bert-large-uncased-whole-word-masking-squad-int8-0001) | BERT-large |question / answer |384| +| [brain-tumor-segmentation-0001-MXNET](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/brain-tumor-segmentation-0001) | brain-tumor-segmentation-0001 | semantic segmentation | 128x128x128 | +| [brain-tumor-segmentation-0002-CF2](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/brain-tumor-segmentation-0002) | brain-tumor-segmentation-0002 | semantic segmentation | 128x128x128 | +| [deeplabv3-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/deeplabv3) | DeepLab v3 Tf | semantic segmentation | 513x513 | +| [densenet-121-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/densenet-121-tf) | Densenet-121 Tf | classification | 224x224 | +| [facenet-20180408-102900-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/facenet-20180408-102900) | FaceNet TF | face recognition | 160x160 | +| [faster_rcnn_resnet50_coco-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/faster_rcnn_resnet50_coco) | Faster RCNN Tf | object detection | 600x1024 | +| 
[inception-v4-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/develop/models/public/googlenet-v4-tf) | Inception v4 Tf (aka GoogleNet-V4) | classification | 299x299 | +| [inception-v3-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/googlenet-v3) | Inception v3 Tf | classification | 299x299 | +| [mobilenet-ssd-CF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/mobilenet-ssd) | SSD (MobileNet)_COCO-2017_Caffe | object detection | 300x300 | +| [mobilenet-v2-1.0-224-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/mobilenet-v2-1.0-224) | MobileNet v2 Tf | classification | 224x224 | +| [mobilenet-v2-pytorch](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/mobilenet-v2-pytorch ) | Mobilenet V2 PyTorch | classification | 224x224 | +| [resnet-18-pytorch](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-18-pytorch) | ResNet-18 PyTorch | classification | 224x224 | +| [resnet-50-pytorch](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-50-pytorch) | ResNet-50 v1 PyTorch | classification | 224x224 | +| [resnet-50-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-50-tf) | ResNet-50_v1_ILSVRC-2012 | classification | 224x224 | +| [se-resnext-50-CF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/se-resnext-50) | Se-ResNext-50_ILSVRC-2012_Caffe | classification | 224x224 | +| [squeezenet1.1-CF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/squeezenet1.1) | SqueezeNet_v1.1_ILSVRC-2012_Caffe | classification | 227x227 | +| [ssd300-CF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/ssd300) | SSD (VGG-16)_VOC-2007_Caffe | object detection | 300x300 | +| [yolo_v4-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v4-tf) | Yolo-V4 TF | 
object detection | 608x608 | +| [ssd_mobilenet_v1_coco-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/ssd_mobilenet_v1_coco) | ssd_mobilenet_v1_coco | object detection | 300x300 | +| [ssdlite_mobilenet_v2-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/ssdlite_mobilenet_v2) | ssdlite_mobilenet_v2 | object detection | 300x300 | +| [unet-camvid-onnx-0001](https://github.com/openvinotoolkit/open_model_zoo/blob/master/models/intel/unet-camvid-onnx-0001/description/unet-camvid-onnx-0001.md) | U-Net | semantic segmentation | 368x480 | +| [yolo-v3-tiny-tf](https://github.com/openvinotoolkit/open_model_zoo/tree/develop/models/public/yolo-v3-tiny-tf) | YOLO v3 Tiny | object detection | 416x416 | +| [ssd-resnet34-1200-onnx](https://github.com/openvinotoolkit/open_model_zoo/tree/develop/models/public/ssd-resnet34-1200-onnx) | ssd-resnet34 onnx model | object detection | 1200x1200 | +| [vgg19-caffe](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/vgg19-caffe2) | VGG-19 | classification | 224x224| + #### 7. Where can I purchase the specific hardware used in the benchmarking? Intel partners with various vendors all over the world. Visit the [Intel® AI: In Production Partners & Solutions Catalog](https://www.intel.com/content/www/us/en/internet-of-things/ai-in-production/partners-solutions-catalog.html) for a list of Equipment Makers and the [Supported Devices](../IE_DG/supported_plugins/Supported_Devices.md) documentation. You can also remotely test and run models before purchasing any hardware by using [Intel® DevCloud for the Edge](http://devcloud.intel.com/edge/). 
diff --git a/docs/benchmarks/performance_benchmarks_openvino.md b/docs/benchmarks/performance_benchmarks_openvino.md index 456f593db14461..be7c46410d752f 100644 --- a/docs/benchmarks/performance_benchmarks_openvino.md +++ b/docs/benchmarks/performance_benchmarks_openvino.md @@ -29,81 +29,86 @@ Measuring inference performance involves many variables and is extremely use-cas \htmlonly - + \endhtmlonly \htmlonly - + \endhtmlonly \htmlonly - + \endhtmlonly \htmlonly - + \endhtmlonly \htmlonly - + \endhtmlonly \htmlonly - + \endhtmlonly \htmlonly - + \endhtmlonly \htmlonly - + \endhtmlonly \htmlonly - + \endhtmlonly \htmlonly - + \endhtmlonly \htmlonly - + \endhtmlonly - \htmlonly - + \endhtmlonly + \htmlonly - + \endhtmlonly +\htmlonly + +\endhtmlonly \htmlonly - + \endhtmlonly \htmlonly - + \endhtmlonly \htmlonly - + \endhtmlonly \htmlonly - + \endhtmlonly + + ## Platform Configurations -Intel® Distribution of OpenVINO™ toolkit performance benchmark numbers are based on release 2021.3. +Intel® Distribution of OpenVINO™ toolkit performance benchmark numbers are based on release 2021.4. -Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer. Performance results are based on testing as of March 15, 2021 and may not reflect all publicly available updates. See configuration disclosure for details. No product can be absolutely secure. +Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer. Performance results are based on testing as of June 18, 2021 and may not reflect all publicly available updates. See configuration disclosure for details. No product can be absolutely secure. Performance varies by use, configuration and other factors. 
Learn more at [www.intel.com/PerformanceIndex](https://www.intel.com/PerformanceIndex).
@@ -127,15 +132,15 @@ Testing by Intel done on: see test date for each HW platform below.
 | Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS |
 | Kernel Version | 5.3.0-24-generic | 5.3.0-24-generic | 5.3.0-24-generic |
 | BIOS Vendor | American Megatrends Inc.* | American Megatrends Inc. | Intel Corporation |
-| BIOS Version | 0904 | 607 | SE5C620.86B.02.01.0009.092820190230 |
-| BIOS Release | April 12, 2019 | May 29, 2020 | September 28, 2019 |
+| BIOS Version | 0904 | 607 | SE5C620.86B.02.01.0013.121520200651 |
+| BIOS Release | April 12, 2019 | May 29, 2020 | December 15, 2020 |
 | BIOS Settings | Select optimized default settings, save & exit | Select optimized default settings, save & exit | Select optimized default settings, change power policy to "performance", save & exit
 | Batch size | 1 | 1 | 1
 | Precision | INT8 | INT8 | INT8
 | Number of concurrent inference requests | 4 | 5 | 32
-| Test Date | March 15, 2021 | March 15, 2021 | March 15, 2021
-| Power dissipation, TDP in Watt | [71](https://ark.intel.com/content/www/us/en/ark/products/134854/intel-xeon-e-2124g-processor-8m-cache-up-to-4-50-ghz.html#tab-blade-1-0-1) | [125](https://ark.intel.com/content/www/us/en/ark/products/199336/intel-xeon-w-1290p-processor-20m-cache-3-70-ghz.html) | [125](https://ark.intel.com/content/www/us/en/ark/products/193394/intel-xeon-silver-4216-processor-22m-cache-2-10-ghz.html#tab-blade-1-0-1) |
-| CPU Price on Mach 15th, 2021, USD. Prices may vary | [213](https://ark.intel.com/content/www/us/en/ark/products/134854/intel-xeon-e-2124g-processor-8m-cache-up-to-4-50-ghz.html) | [539](https://ark.intel.com/content/www/us/en/ark/products/199336/intel-xeon-w-1290p-processor-20m-cache-3-70-ghz.html) |[1,002](https://ark.intel.com/content/www/us/en/ark/products/193394/intel-xeon-silver-4216-processor-22m-cache-2-10-ghz.html) |
+| Test Date | June 18, 2021 | June 18, 2021 | June 18, 2021
+| Rated maximum TDP/socket in Watt | [71](https://ark.intel.com/content/www/us/en/ark/products/134854/intel-xeon-e-2124g-processor-8m-cache-up-to-4-50-ghz.html#tab-blade-1-0-1) | [125](https://ark.intel.com/content/www/us/en/ark/products/199336/intel-xeon-w-1290p-processor-20m-cache-3-70-ghz.html) | [125](https://ark.intel.com/content/www/us/en/ark/products/193394/intel-xeon-silver-4216-processor-22m-cache-2-10-ghz.html#tab-blade-1-0-1) |
+| CPU Price/socket on June 21, 2021, USD. Prices may vary | [213](https://ark.intel.com/content/www/us/en/ark/products/134854/intel-xeon-e-2124g-processor-8m-cache-up-to-4-50-ghz.html) | [539](https://ark.intel.com/content/www/us/en/ark/products/199336/intel-xeon-w-1290p-processor-20m-cache-3-70-ghz.html) |[1,002](https://ark.intel.com/content/www/us/en/ark/products/193394/intel-xeon-silver-4216-processor-22m-cache-2-10-ghz.html) |
 **CPU Inference Engines (continue)**
@@ -149,84 +154,104 @@ Testing by Intel done on: see test date for each HW platform below.
 | Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS |
 | Kernel Version | 5.3.0-24-generic | 5.3.0-24-generic | 5.3.0-24-generic |
 | BIOS Vendor | Intel Corporation | Intel Corporation | Intel Corporation |
-| BIOS Version | SE5C620.86B.02.01.0009.092820190230 | SE5C620.86B.02.01.0009.092820190230 | WLYDCRB1.SYS.0020.P86.2103050636 |
-| BIOS Release | September 28, 2019 | September 28, 2019 | March 5, 2021 |
+| BIOS Version | SE5C620.86B.02.01.0013.121520200651 | SE5C620.86B.02.01.0013.121520200651 | WLYDCRB1.SYS.0020.
P86.2103050636 | +| BIOS Release | December 15, 2020 | December 15, 2020 | March 5, 2021 | | BIOS Settings | Select optimized default settings,
change power policy to "performance",
save & exit | Select optimized default settings,
change power policy to "performance",
save & exit | Select optimized default settings,
change power policy to "performance",
save & exit | | Batch size | 1 | 1 | 1 | | Precision | INT8 | INT8 | INT8 | | Number of concurrent inference requests |32 | 52 | 80 | -| Test Date | March 15, 2021 | March 15, 2021 | March 22, 2021 | -| Power dissipation, TDP in Watt | [105](https://ark.intel.com/content/www/us/en/ark/products/193953/intel-xeon-gold-5218t-processor-22m-cache-2-10-ghz.html#tab-blade-1-0-1) | [205](https://ark.intel.com/content/www/us/en/ark/products/192482/intel-xeon-platinum-8270-processor-35-75m-cache-2-70-ghz.html#tab-blade-1-0-1) | [270](https://ark.intel.com/content/www/us/en/ark/products/212287/intel-xeon-platinum-8380-processor-60m-cache-2-30-ghz.html) | -| CPU Price, USD
Prices may vary | [1,349](https://ark.intel.com/content/www/us/en/ark/products/193953/intel-xeon-gold-5218t-processor-22m-cache-2-10-ghz.html) (on Mach 15th, 2021) | [7,405](https://ark.intel.com/content/www/us/en/ark/products/192482/intel-xeon-platinum-8270-processor-35-75m-cache-2-70-ghz.html) (on Mach 15th, 2021) | [8,099](https://ark.intel.com/content/www/us/en/ark/products/212287/intel-xeon-platinum-8380-processor-60m-cache-2-30-ghz.html) (on March 26th, 2021) | +| Test Date | June 18, 2021 | June 18, 2021 | June 18, 2021 | +| Rated maximum TDP/socket in Watt | [105](https://ark.intel.com/content/www/us/en/ark/products/193953/intel-xeon-gold-5218t-processor-22m-cache-2-10-ghz.html#tab-blade-1-0-1) | [205](https://ark.intel.com/content/www/us/en/ark/products/192482/intel-xeon-platinum-8270-processor-35-75m-cache-2-70-ghz.html#tab-blade-1-0-1) | [270](https://ark.intel.com/content/www/us/en/ark/products/212287/intel-xeon-platinum-8380-processor-60m-cache-2-30-ghz.html) | +| CPU Price/socket on June 21, 2021, USD
Prices may vary | [1,349](https://ark.intel.com/content/www/us/en/ark/products/193953/intel-xeon-gold-5218t-processor-22m-cache-2-10-ghz.html) | [7,405](https://ark.intel.com/content/www/us/en/ark/products/192482/intel-xeon-platinum-8270-processor-35-75m-cache-2-70-ghz.html) | [8,099](https://ark.intel.com/content/www/us/en/ark/products/212287/intel-xeon-platinum-8380-processor-60m-cache-2-30-ghz.html) | **CPU Inference Engines (continue)** -| | Intel® Core™ i7-8700T | Intel® Core™ i9-10920X | 11th Gen Intel® Core™ i7-1185G7 | -| -------------------- | ----------------------------------- |--------------------------------------| --------------------------------| -| Motherboard | GIGABYTE* Z370M DS3H-CF | ASUS* PRIME X299-A II | Intel Corporation
Validation Platform | -| CPU | Intel® Core™ i7-8700T CPU @ 2.40GHz | Intel® Core™ i9-10920X CPU @ 3.50GHz | 11th Gen Intel® Core™ i7-1185G7 @ 3.00GHz | -| Hyper Threading | ON | ON | ON | -| Turbo Setting | ON | ON | ON | -| Memory | 4 x 16 GB DDR4 2400MHz | 4 x 16 GB DDR4 2666MHz | 2 x 8 GB DDR4 3200MHz | -| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | -| Kernel Version | 5.3.0-24-generic | 5.3.0-24-generic | 5.8.0-05-generic | -| BIOS Vendor | American Megatrends Inc.* | American Megatrends Inc.* | Intel Corporation | -| BIOS Version | F11 | 505 | TGLSFWI1.R00.3425.
A00.2010162309 | -| BIOS Release | March 13, 2019 | December 17, 2019 | October 16, 2020 | -| BIOS Settings | Select optimized default settings,
set OS type to "other",
save & exit | Default Settings | Default Settings | -| Batch size | 1 | 1 | 1 | -| Precision | INT8 | INT8 | INT8 | -| Number of concurrent inference requests |4 | 24 | 4 | -| Test Date | March 15, 2021 | March 15, 2021 | March 15, 2021 | -| Power dissipation, TDP in Watt | [35](https://ark.intel.com/content/www/us/en/ark/products/129948/intel-core-i7-8700t-processor-12m-cache-up-to-4-00-ghz.html#tab-blade-1-0-1) | [165](https://ark.intel.com/content/www/us/en/ark/products/198012/intel-core-i9-10920x-x-series-processor-19-25m-cache-3-50-ghz.html) | [28](https://ark.intel.com/content/www/us/en/ark/products/208664/intel-core-i7-1185g7-processor-12m-cache-up-to-4-80-ghz-with-ipu.html#tab-blade-1-0-1) | -| CPU Price on Mach 15th, 2021, USD
Prices may vary | [303](https://ark.intel.com/content/www/us/en/ark/products/129948/intel-core-i7-8700t-processor-12m-cache-up-to-4-00-ghz.html) | [700](https://ark.intel.com/content/www/us/en/ark/products/198012/intel-core-i9-10920x-x-series-processor-19-25m-cache-3-50-ghz.html) | [426](https://ark.intel.com/content/www/us/en/ark/products/208664/intel-core-i7-1185g7-processor-12m-cache-up-to-4-80-ghz-with-ipu.html#tab-blade-1-0-0) | +| | Intel® Core™ i7-8700T | Intel® Core™ i9-10920X | +| -------------------- | ----------------------------------- |--------------------------------------| +| Motherboard | GIGABYTE* Z370M DS3H-CF | ASUS* PRIME X299-A II | +| CPU | Intel® Core™ i7-8700T CPU @ 2.40GHz | Intel® Core™ i9-10920X CPU @ 3.50GHz | +| Hyper Threading | ON | ON | +| Turbo Setting | ON | ON | +| Memory | 4 x 16 GB DDR4 2400MHz | 4 x 16 GB DDR4 2666MHz | +| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | +| Kernel Version | 5.3.0-24-generic | 5.3.0-24-generic | +| BIOS Vendor | American Megatrends Inc.* | American Megatrends Inc.* | +| BIOS Version | F14c | 1004 | +| BIOS Release | March 23, 2021 | March 19, 2021 | +| BIOS Settings | Select optimized default settings,
set OS type to "other",
save & exit | Default Settings | +| Batch size | 1 | 1 | +| Precision | INT8 | INT8 | +| Number of concurrent inference requests |4 | 24 | +| Test Date | June 18, 2021 | June 18, 2021 | +| Rated maximum TDP/socket in Watt | [35](https://ark.intel.com/content/www/us/en/ark/products/129948/intel-core-i7-8700t-processor-12m-cache-up-to-4-00-ghz.html#tab-blade-1-0-1) | [165](https://ark.intel.com/content/www/us/en/ark/products/198012/intel-core-i9-10920x-x-series-processor-19-25m-cache-3-50-ghz.html) | +| CPU Price/socket on June 21, 2021, USD
Prices may vary | [303](https://ark.intel.com/content/www/us/en/ark/products/129948/intel-core-i7-8700t-processor-12m-cache-up-to-4-00-ghz.html) | [700](https://ark.intel.com/content/www/us/en/ark/products/198012/intel-core-i9-10920x-x-series-processor-19-25m-cache-3-50-ghz.html) | +**CPU Inference Engines (continue)** +| | 11th Gen Intel® Core™ i7-1185G7 | 11th Gen Intel® Core™ i7-11850HE | +| -------------------- | --------------------------------|----------------------------------| +| Motherboard | Intel Corporation
Validation Platform | Intel Corporation
Validation Platform | +| CPU | 11th Gen Intel® Core™ i7-1185G7 @ 3.00GHz | 11th Gen Intel® Core™ i7-11850HE @ 2.60GHz | +| Hyper Threading | ON | ON | +| Turbo Setting | ON | ON | +| Memory | 2 x 8 GB DDR4 3200MHz | 2 x 16 GB DDR4 3200MHz | +| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04.4 LTS | +| Kernel Version | 5.8.0-05-generic | 5.8.0-050800-generic | +| BIOS Vendor | Intel Corporation | Intel Corporation | +| BIOS Version | TGLSFWI1.R00.3425.
A00.2010162309 | TGLIFUI1.R00.4064.
A01.2102200132 | +| BIOS Release | October 16, 2020 | February 20, 2021 | +| BIOS Settings | Default Settings | Default Settings | +| Batch size | 1 | 1 | +| Precision | INT8 | INT8 | +| Number of concurrent inference requests |4 | 4 | +| Test Date | June 18, 2021 | June 18, 2021 | +| Rated maximum TDP/socket in Watt | [28](https://ark.intel.com/content/www/us/en/ark/products/208664/intel-core-i7-1185g7-processor-12m-cache-up-to-4-80-ghz-with-ipu.html) | [45](https://ark.intel.com/content/www/us/en/ark/products/213799/intel-core-i7-11850h-processor-24m-cache-up-to-4-80-ghz.html) | +| CPU Price/socket on June 21, 2021, USD
Prices may vary | [426](https://ark.intel.com/content/www/us/en/ark/products/208664/intel-core-i7-1185g7-processor-12m-cache-up-to-4-80-ghz-with-ipu.html) | [395](https://ark.intel.com/content/www/us/en/ark/products/213799/intel-core-i7-11850h-processor-24m-cache-up-to-4-80-ghz.html) | **CPU Inference Engines (continue)** -| | Intel® Core™ i5-8500 | Intel® Core™ i5-10500TE | -| -------------------- | ---------------------------------- | ----------------------------------- | -| Motherboard | ASUS* PRIME Z370-A | GIGABYTE* Z490 AORUS PRO AX | -| CPU | Intel® Core™ i5-8500 CPU @ 3.00GHz | Intel® Core™ i5-10500TE CPU @ 2.30GHz | -| Hyper Threading | OFF | ON | -| Turbo Setting | ON | ON | -| Memory | 2 x 16 GB DDR4 2666MHz | 2 x 16 GB DDR4 @ 2666MHz | -| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | -| Kernel Version | 5.3.0-24-generic | 5.3.0-24-generic | -| BIOS Vendor | American Megatrends Inc.* | American Megatrends Inc.* | -| BIOS Version | 2401 | F3 | -| BIOS Release | July 12, 2019 | March 25, 2020 | -| BIOS Settings | Select optimized default settings,
save & exit | Select optimized default settings,
set OS type to "other",
save & exit | -| Batch size | 1 | 1 | -| Precision | INT8 | INT8 | -| Number of concurrent inference requests | 3 | 4 | -| Test Date | March 15, 2021 | March 15, 2021 | -| Power dissipation, TDP in Watt | [65](https://ark.intel.com/content/www/us/en/ark/products/129939/intel-core-i5-8500-processor-9m-cache-up-to-4-10-ghz.html#tab-blade-1-0-1)| [35](https://ark.intel.com/content/www/us/en/ark/products/203891/intel-core-i5-10500te-processor-12m-cache-up-to-3-70-ghz.html) | -| CPU Price on Mach 15th, 2021, USD
Prices may vary | [192](https://ark.intel.com/content/www/us/en/ark/products/129939/intel-core-i5-8500-processor-9m-cache-up-to-4-10-ghz.html) | [195](https://ark.intel.com/content/www/us/en/ark/products/203891/intel-core-i5-10500te-processor-12m-cache-up-to-3-70-ghz.html) | +| | Intel® Core™ i3-8100 | Intel® Core™ i5-8500 | Intel® Core™ i5-10500TE | +| -------------------- |----------------------------------- | ---------------------------------- | ----------------------------------- | +| Motherboard | GIGABYTE* Z390 UD | ASUS* PRIME Z370-A | GIGABYTE* Z490 AORUS PRO AX | +| CPU | Intel® Core™ i3-8100 CPU @ 3.60GHz | Intel® Core™ i5-8500 CPU @ 3.00GHz | Intel® Core™ i5-10500TE CPU @ 2.30GHz | +| Hyper Threading | OFF | OFF | ON | +| Turbo Setting | OFF | ON | ON | +| Memory | 4 x 8 GB DDR4 2400MHz | 2 x 16 GB DDR4 2666MHz | 2 x 16 GB DDR4 @ 2666MHz | +| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | +| Kernel Version | 5.3.0-24-generic | 5.3.0-24-generic | 5.3.0-24-generic | +| BIOS Vendor | American Megatrends Inc.* | American Megatrends Inc.* | American Megatrends Inc.* | +| BIOS Version | F8 | 2401 | F3 | +| BIOS Release | May 24, 2019 | July 12, 2019 | March 25, 2020 | +| BIOS Settings | Select optimized default settings,
set OS type to "other",
save & exit | Select optimized default settings,
save & exit | Select optimized default settings,
set OS type to "other",
save & exit | +| Batch size | 1 | 1 | 1 | +| Precision | INT8 | INT8 | INT8 | +| Number of concurrent inference requests | 4 | 3 | 4 | +| Test Date | June 18, 2021 | June 18, 2021 | June 18, 2021 | +| Rated maximum TDP/socket in Watt | [65](https://ark.intel.com/content/www/us/en/ark/products/126688/intel-core-i3-8100-processor-6m-cache-3-60-ghz.html#tab-blade-1-0-1)| [65](https://ark.intel.com/content/www/us/en/ark/products/129939/intel-core-i5-8500-processor-9m-cache-up-to-4-10-ghz.html#tab-blade-1-0-1)| [35](https://ark.intel.com/content/www/us/en/ark/products/203891/intel-core-i5-10500te-processor-12m-cache-up-to-3-70-ghz.html) | +| CPU Price/socket on June 21, 2021, USD
Prices may vary | [117](https://ark.intel.com/content/www/us/en/ark/products/126688/intel-core-i3-8100-processor-6m-cache-3-60-ghz.html) | [192](https://ark.intel.com/content/www/us/en/ark/products/129939/intel-core-i5-8500-processor-9m-cache-up-to-4-10-ghz.html) | [195](https://ark.intel.com/content/www/us/en/ark/products/203891/intel-core-i5-10500te-processor-12m-cache-up-to-3-70-ghz.html) | **CPU Inference Engines (continue)** -| | Intel Atom® x5-E3940 | Intel Atom® x6425RE | Intel® Core™ i3-8100 | -| -------------------- | --------------------------------------|------------------------------- |----------------------------------- | -| Motherboard | | Intel Corporation /
ElkhartLake LPDDR4x T3 CRB | GIGABYTE* Z390 UD | -| CPU | Intel Atom® Processor E3940 @ 1.60GHz | Intel Atom® x6425RE
Processor @ 1.90GHz | Intel® Core™ i3-8100 CPU @ 3.60GHz | -| Hyper Threading | OFF | OFF | OFF | -| Turbo Setting | ON | ON | OFF | -| Memory | 1 x 8 GB DDR3 1600MHz | 2 x 4GB DDR4 3200 MHz | 4 x 8 GB DDR4 2400MHz | -| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | -| Kernel Version | 5.3.0-24-generic | 5.8.0-050800-generic | 5.3.0-24-generic | -| BIOS Vendor | American Megatrends Inc.* | Intel Corporation | American Megatrends Inc.* | -| BIOS Version | 5.12 | EHLSFWI1.R00.2463.
A03.2011200425 | F8 | -| BIOS Release | September 6, 2017 | November 22, 2020 | May 24, 2019 | -| BIOS Settings | Default settings | Default settings | Select optimized default settings,
set OS type to "other",
save & exit | -| Batch size | 1 | 1 | 1 | -| Precision | INT8 | INT8 | INT8 | -| Number of concurrent inference requests | 4 | 4 | 4 | -| Test Date | March 15, 2021 | March 15, 2021 | March 15, 2021 | -| Power dissipation, TDP in Watt | [9.5](https://ark.intel.com/content/www/us/en/ark/products/96485/intel-atom-x5-e3940-processor-2m-cache-up-to-1-80-ghz.html) | [12](https://ark.intel.com/content/www/us/en/ark/products/207899/intel-atom-x6425re-processor-1-5m-cache-1-90-ghz.html) | [65](https://ark.intel.com/content/www/us/en/ark/products/126688/intel-core-i3-8100-processor-6m-cache-3-60-ghz.html#tab-blade-1-0-1)| -| CPU Price, USD
Prices may vary | [34](https://ark.intel.com/content/www/us/en/ark/products/96485/intel-atom-x5-e3940-processor-2m-cache-up-to-1-80-ghz.html) (on March 15th, 2021) | [59](https://ark.intel.com/content/www/us/en/ark/products/207899/intel-atom-x6425re-processor-1-5m-cache-1-90-ghz.html) (on March 26th, 2021) | [117](https://ark.intel.com/content/www/us/en/ark/products/126688/intel-core-i3-8100-processor-6m-cache-3-60-ghz.html) (on March 15th, 2021) | +| | Intel Atom® x5-E3940 | Intel Atom® x6425RE | Intel® Celeron® 6305E | +| -------------------- | --------------------------------------|------------------------------- |----------------------------------| +| Motherboard | Intel Corporation
Validation Platform | Intel Corporation
Validation Platform | Intel Corporation
Validation Platform | +| CPU | Intel Atom® Processor E3940 @ 1.60GHz | Intel Atom® x6425RE
Processor @ 1.90GHz | Intel® Celeron®
6305E @ 1.80GHz | +| Hyper Threading | OFF | OFF | OFF | +| Turbo Setting | ON | ON | ON | +| Memory | 1 x 8 GB DDR3 1600MHz | 2 x 4GB DDR4 3200MHz | 2 x 8 GB DDR4 3200MHz | +| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | Ubuntu 18.04.5 LTS | +| Kernel Version | 5.3.0-24-generic | 5.8.0-050800-generic | 5.8.0-050800-generic | +| BIOS Vendor | American Megatrends Inc.* | Intel Corporation | Intel Corporation | +| BIOS Version | 5.12 | EHLSFWI1.R00.2463.
A03.2011200425 | TGLIFUI1.R00.4064.A02.2102260133 | +| BIOS Release | September 6, 2017 | November 22, 2020 | February 26, 2021 | +| BIOS Settings | Default settings | Default settings | Default settings | +| Batch size | 1 | 1 | 1 | +| Precision | INT8 | INT8 | INT8 | +| Number of concurrent inference requests | 4 | 4 | 4| +| Test Date | June 18, 2021 | June 18, 2021 | June 18, 2021 | +| Rated maximum TDP/socket in Watt | [9.5](https://ark.intel.com/content/www/us/en/ark/products/96485/intel-atom-x5-e3940-processor-2m-cache-up-to-1-80-ghz.html) | [12](https://ark.intel.com/content/www/us/en/ark/products/207899/intel-atom-x6425re-processor-1-5m-cache-1-90-ghz.html) | [15](https://ark.intel.com/content/www/us/en/ark/products/208072/intel-celeron-6305e-processor-4m-cache-1-80-ghz.html)| +| CPU Price/socket on June 21, 2021, USD
Prices may vary | [34](https://ark.intel.com/content/www/us/en/ark/products/96485/intel-atom-x5-e3940-processor-2m-cache-up-to-1-80-ghz.html) | [59](https://ark.intel.com/content/www/us/en/ark/products/207899/intel-atom-x6425re-processor-1-5m-cache-1-90-ghz.html) |[107](https://ark.intel.com/content/www/us/en/ark/products/208072/intel-celeron-6305e-processor-4m-cache-1-80-ghz.html) | @@ -239,8 +264,8 @@ Testing by Intel done on: see test date for each HW platform below. | Batch size | 1 | 1 | | Precision | FP16 | FP16 | | Number of concurrent inference requests | 4 | 32 | -| Power dissipation, TDP in Watt | 2.5 | [30](https://www.arrow.com/en/products/mustang-v100-mx8-r10/iei-technology?gclid=Cj0KCQiA5bz-BRD-ARIsABjT4ng1v1apmxz3BVCPA-tdIsOwbEjTtqnmp_rQJGMfJ6Q2xTq6ADtf9OYaAhMUEALw_wcB) | -| CPU Price, USD
Prices may vary | [69](https://ark.intel.com/content/www/us/en/ark/products/140109/intel-neural-compute-stick-2.html) (from March 15, 2021) | [1180](https://www.arrow.com/en/products/mustang-v100-mx8-r10/iei-technology?gclid=Cj0KCQiA5bz-BRD-ARIsABjT4ng1v1apmxz3BVCPA-tdIsOwbEjTtqnmp_rQJGMfJ6Q2xTq6ADtf9OYaAhMUEALw_wcB) (from March 15, 2021) | +| Rated maximum TDP/socket in Watt | 2.5 | [30](https://www.arrow.com/en/products/mustang-v100-mx8-r10/iei-technology?gclid=Cj0KCQiA5bz-BRD-ARIsABjT4ng1v1apmxz3BVCPA-tdIsOwbEjTtqnmp_rQJGMfJ6Q2xTq6ADtf9OYaAhMUEALw_wcB) | +| CPU Price/socket on June 21, 2021, USD
Prices may vary | [69](https://ark.intel.com/content/www/us/en/ark/products/140109/intel-neural-compute-stick-2.html) | [425](https://www.arrow.com/en/products/mustang-v100-mx8-r10/iei-technology?gclid=Cj0KCQiA5bz-BRD-ARIsABjT4ng1v1apmxz3BVCPA-tdIsOwbEjTtqnmp_rQJGMfJ6Q2xTq6ADtf9OYaAhMUEALw_wcB) | | Host Computer | Intel® Core™ i7 | Intel® Core™ i5 | | Motherboard | ASUS* Z370-A II | Uzelinfo* / US-E1300 | | CPU | Intel® Core™ i7-8700 CPU @ 3.20GHz | Intel® Core™ i5-6600 CPU @ 3.30GHz | @@ -252,9 +277,9 @@ Testing by Intel done on: see test date for each HW platform below. | BIOS Vendor | American Megatrends Inc.* | American Megatrends Inc.* | | BIOS Version | 411 | 5.12 | | BIOS Release | September 21, 2018 | September 21, 2018 | -| Test Date | March 15, 2021 | March 15, 2021 | +| Test Date | June 18, 2021 | June 18, 2021 | -Please follow this link for more detailed configuration descriptions: [Configuration Details](https://docs.openvinotoolkit.org/resources/benchmark_files/system_configurations_2021.3.html) +Please follow this link for more detailed configuration descriptions: [Configuration Details](https://docs.openvinotoolkit.org/resources/benchmark_files/system_configurations_2021.4.html) \htmlonly