Feature/azaytsev/cherry picks from 2021 4 (openvinotoolkit#7389)
* Added info on DockerHub CI Framework

* Feature/azaytsev/change layout (openvinotoolkit#3295)

* Changes according to feedback comments

* Replaced @ref's with html links

* Fixed links, added a title page for installing from repos and images, fixed formatting issues

* Added links

* minor fix

* Added DL Streamer to the list of components installed by default

* Link fixes

* Link fixes

* ovms doc fix (openvinotoolkit#2988)

* added OpenVINO Model Server

* ovms doc fixes

Co-authored-by: Trawinski, Dariusz <[email protected]>

* Updated openvino_docs.xml

* Updated the link to software license agreements

* Revert "Updated the link to software license agreements"

This reverts commit 706dac5.

* Updated legal info (openvinotoolkit#6409)

# Conflicts:
#	thirdparty/ade

* Cherry-pick 4833c8d

[DOCS]Changed DL WB related docs and tips (openvinotoolkit#6318)

* changed DL WB related docs and tips

* added two tips to benchmark and changed layout

* changed layout

* changed links

* page title added

* changed tips

* ie layout fixed

* updated diagram and hints

* changed tooltip and ref link

* changed tooltip link

* changed DL WB description

* typo fix
# Conflicts:
#	docs/doxygen/ie_docs.xml
#	thirdparty/ade

* Cherry-pick 6405

Feature/azaytsev/mo devguide changes (openvinotoolkit#6405)

* MO devguide edits

* MO devguide edits

* MO devguide edits

* MO devguide edits

* MO devguide edits

* Experimenting with videos

* Experimenting with videos

* Experimenting with videos

* Experimenting with videos

* Experimenting with videos

* Experimenting with videos

* Experimenting with videos

* Experimenting with videos

* Experimenting with videos

* Additional edits

* Additional edits

* Updated the workflow diagram

* Minor fix

* Experimenting with videos

* Updated the workflow diagram

* Removed  Prepare_Trained_Model, changed the title for Config_Model_Optimizer

* Rolled back

* Revert "Rolled back"

This reverts commit 6a4a3e1.

* Revert "Removed  Prepare_Trained_Model, changed the title for Config_Model_Optimizer"

This reverts commit 0810bd5.

* Fixed ie_docs.xml, Removed  Prepare_Trained_Model, changed the title for Config_Model_Optimizer

* Fixed ie_docs.xml

* Minor fix

* <details> tag issue

* <details> tag issue

* Fix <details> tag issue

* Fix <details> tag issue

* Fix <details> tag issue
# Conflicts:
#	thirdparty/ade

* Cherry-pick openvinotoolkit#6419

* [Runtime] INT8 inference documentation update

* [Runtime] INT8 inference documentation: typo was fixed

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <[email protected]>

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <[email protected]>

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <[email protected]>

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <[email protected]>

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <[email protected]>

* Table of Contents was removed

Co-authored-by: Anastasiya Ageeva <[email protected]>
# Conflicts:
#	docs/IE_DG/Int8Inference.md
#	thirdparty/ade

* Cherry pick (openvinotoolkit#6437)

* Q2 changes

* Changed Convert_RNNT.md

Co-authored-by: baychub <[email protected]>
# Conflicts:
#	docs/IE_DG/Int8Inference.md
#	docs/install_guides/installing-openvino-conda.md
#	docs/install_guides/pypi-openvino-dev.md
#	thirdparty/ade

* Cherry-pick (openvinotoolkit#6447)

* Added benchmark page changes

* Make the picture smaller

* Added Intel® Iris® Xe MAX Graphics

* Changed the TIP about DL WB

* Added Note on the driver for Intel® Iris® Xe MAX Graphics

* Fixed formatting

* Added the link to Intel® software for general purpose GPU capabilities

* OVSA ovsa_get_started updates

* Fixed link
# Conflicts:
#	thirdparty/ade

* Cherry-pick openvinotoolkit#6450

* fix layout

* 4
# Conflicts:
#	thirdparty/ade

* Cherry-pick openvinotoolkit#6466

* Cherry-pick openvinotoolkit#6548

* install docs fixes

* changed video width

* CMake reference added

* fixed table

* added backticks and table formatting

* new table changes

* GPU table changes

* added more backticks and changed table format

* gpu table changes

* Update get_started_dl_workbench.md

Co-authored-by: Andrey Zaytsev <[email protected]>
# Conflicts:
#	thirdparty/ade

* [Runtime] INT8 inference documentation update (openvinotoolkit#6419)

* [Runtime] INT8 inference documentation update

* [Runtime] INT8 inference documentation: typo was fixed

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <[email protected]>

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <[email protected]>

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <[email protected]>

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <[email protected]>

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <[email protected]>

* Table of Contents was removed

Co-authored-by: Anastasiya Ageeva <[email protected]>
# Conflicts:
#	docs/IE_DG/Int8Inference.md
#	thirdparty/ade

* Cherry-pick openvinotoolkit#6651

* Edits to MO

Per findings spreadsheet

* macOS changes

per issue spreadsheet

* Fixes from review spreadsheet

Mostly IE_DG fixes

* Consistency changes

* Make doc fixes from last round of review

* Add GSG build-all details

* Fix links to samples and demos pages

* Make MO_DG v2 changes

* Add image view step to classify demo

* Put MO dependency with others

* Edit docs per issues spreadsheet

* Add file to pytorch_specific

* More fixes per spreadsheet

* Prototype sample page

* Add build section

* Update README.md

* Batch download/convert by default

* Add detail to How It Works

* Minor change

* Temporary restored topics

* corrected layout

* Resized

* Added white background into the picture

* fixed link to omz_tools_downloader

* fixed title in the layout

Co-authored-by: baychub <[email protected]>
Co-authored-by: baychub <[email protected]>
# Conflicts:
#	docs/doxygen/ie_docs.xml

* Cherry-pick  (openvinotoolkit#6789) [59449][DOCS] GPU table layout change

* changed argument display

* added br tag to more arguments

* changed argument display in GPU table

* changed more arguments

* changed Quantized_ models display
# Conflicts:
#	thirdparty/ade

* Sync doxygen-ignore

* Removed ref to FPGA.md

* Fixed link to ONNX format doc

Co-authored-by: Trawinski, Dariusz <[email protected]>
Co-authored-by: Tatiana Savina <[email protected]>
Co-authored-by: Edward Shogulin <[email protected]>
Co-authored-by: Nikolay Tyukaev <[email protected]>
5 people authored and akuporos committed Sep 29, 2021
1 parent 4767931 commit d05fc79
Showing 85 changed files with 898 additions and 1,009 deletions.
6 changes: 1 addition & 5 deletions docs/IE_DG/Deep_Learning_Inference_Engine_DevGuide.md
@@ -1,7 +1,5 @@
# Inference Engine Developer Guide {#openvino_docs_IE_DG_Deep_Learning_Inference_Engine_DevGuide}

> **NOTE:** [Intel® System Studio](https://software.intel.com/content/www/us/en/develop/tools/oneapi/commercial-base-iot.html) (click "Intel® System Studio Users" tab) is an all-in-one, cross-platform tool suite, purpose-built to simplify system bring-up and improve system and IoT device application performance on Intel® platforms. If you are using the Intel® Distribution of OpenVINO™ with Intel® System Studio, go to [Get Started with Intel® System Studio](https://software.intel.com/en-us/articles/get-started-with-openvino-and-intel-system-studio-2019).
This Guide provides an overview of the Inference Engine describing the typical workflow for performing inference of a pre-trained and optimized deep learning model and a set of sample applications.

> **NOTE:** Before you perform inference with the Inference Engine, your models should be converted to the Inference Engine format using the Model Optimizer or built directly in runtime using nGraph API. To learn about how to use Model Optimizer, refer to the [Model Optimizer Developer Guide](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md). To learn about the pre-trained and optimized models delivered with the OpenVINO™ toolkit, refer to [Pre-Trained Models](@ref omz_models_group_intel).
@@ -111,10 +109,8 @@ The common workflow contains the following steps:
8. **Get the output** - After inference is completed, get the output memory or read the memory you provided earlier. Do this with the `InferenceEngine::IInferRequest::GetBlob()` method.

## Video: Inference Engine Concept
[![](https://img.youtube.com/vi/e6R13V8nbak/0.jpg)](https://www.youtube.com/watch?v=e6R13V8nbak)
\htmlonly

<iframe width="560" height="315" src="https://www.youtube.com/embed/e6R13V8nbak" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
\endhtmlonly

## Further Reading

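As a quick illustration of the workflow this hunk documents (ending with step 8 and `GetBlob()`), here is a minimal C++ sketch of loading a model, running inference, and reading the output. The file name `model.xml` and the single-output assumption are placeholders for illustration, not part of the commit.

```cpp
#include <inference_engine.hpp>
#include <string>

int main() {
    using namespace InferenceEngine;

    Core core;
    // Read an IR produced by the Model Optimizer (file name is a placeholder).
    CNNNetwork network = core.ReadNetwork("model.xml");
    ExecutableNetwork executable = core.LoadNetwork(network, "CPU");

    InferRequest request = executable.CreateInferRequest();
    // Input blobs would be filled here in a real application.
    request.Infer();

    // Step 8: read the output memory after inference is completed.
    std::string outputName = network.getOutputsInfo().begin()->first;
    Blob::Ptr output = request.GetBlob(outputName);
    (void)output;  // process the results here
    return 0;
}
```
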
2 changes: 1 addition & 1 deletion docs/IE_DG/Extensibility_DG/AddingNGraphOps.md
@@ -1,6 +1,6 @@
# Custom nGraph Operation {#openvino_docs_IE_DG_Extensibility_DG_AddingNGraphOps}

Inference Engine Extension API allows you to register operation sets (opsets) with custom nGraph operations to support models with operations which OpenVINO™ does not support out-of-the-box.
The Inference Engine Extension API allows you to register operation sets (opsets) with custom nGraph operations to support models with operations that OpenVINO™ does not support out-of-the-box.

## Operation Class

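For context, a minimal sketch of the operation class this page describes is shown below; the class name `CustomRelu`, its identity-like shape propagation, and the absence of attributes are illustrative assumptions, and the exact set of required overrides may differ between releases.

```cpp
#include <ngraph/ngraph.hpp>
#include <memory>

// Hypothetical custom nGraph operation used only as an illustration.
class CustomRelu : public ngraph::op::Op {
public:
    static constexpr ngraph::NodeTypeInfo type_info{"CustomRelu", 0};
    const ngraph::NodeTypeInfo& get_type_info() const override { return type_info; }

    CustomRelu() = default;
    explicit CustomRelu(const ngraph::Output<ngraph::Node>& arg) : Op({arg}) {
        constructor_validate_and_infer_types();
    }

    void validate_and_infer_types() override {
        // The output keeps the element type and shape of the input.
        set_output_type(0, get_input_element_type(0), get_input_partial_shape(0));
    }

    std::shared_ptr<ngraph::Node> clone_with_new_inputs(
            const ngraph::OutputVector& new_args) const override {
        return std::make_shared<CustomRelu>(new_args.at(0));
    }

    bool visit_attributes(ngraph::AttributeVisitor& visitor) override {
        return true;  // no attributes in this sketch
    }
};

constexpr ngraph::NodeTypeInfo CustomRelu::type_info;
```
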
5 changes: 3 additions & 2 deletions docs/IE_DG/Extensibility_DG/Extension.md
@@ -25,5 +25,6 @@ Also, an `Extension` object should implement the following methods:
Implement the InferenceEngine::IExtension::getOpSets method if the extension contains custom layers.
Read [Custom nGraph Operation](AddingNGraphOps.md) for more information.

To integrate execution kernels to the extension library, read [How to Implement Custom CPU Operations](CPU_Kernel.md).
To register a custom ONNX\* operator to the extension library, read [Custom ONNX Operators](Custom_ONNX_Ops.md).
To understand how to integrate execution kernels to the extension library, read the [documentation about development of custom CPU kernels](CPU_Kernel.md).

To understand how to register custom ONNX operator to the extension library, read the [documentation about custom ONNX operators](Custom_ONNX_Ops.md).
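
To make the `getOpSets` requirement concrete, here is a hedged sketch of an extension that registers a custom opset; `CustomExtension`, the opset name `custom_opset`, and the reuse of the `CustomRelu` operation sketched for the previous file are assumptions for illustration, and the exact `IExtension` interface can vary between releases.

```cpp
#include <ie_iextension.h>
#include <ngraph/opsets/opset.hpp>
#include <map>
#include <string>

#include "custom_relu.hpp"  // hypothetical header declaring the CustomRelu operation sketched above

class CustomExtension : public InferenceEngine::IExtension {
public:
    void GetVersion(const InferenceEngine::Version*& versionInfo) const noexcept override {
        static const InferenceEngine::Version version{{2, 1}, "1.0", "custom_extension"};
        versionInfo = &version;
    }

    void Unload() noexcept override {}

    // Exposes the custom operation set so models containing CustomRelu can be read.
    std::map<std::string, ngraph::OpSet> getOpSets() override {
        ngraph::OpSet opset;
        opset.insert<CustomRelu>();
        return {{"custom_opset", opset}};
    }
};
```
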
28 changes: 11 additions & 17 deletions docs/IE_DG/Int8Inference.md
@@ -1,12 +1,5 @@
# Low-Precision 8-bit Integer Inference {#openvino_docs_IE_DG_Int8Inference}

## Table of Contents
1. [Supported devices](#supported-devices)
2. [Low-Precision 8-bit Integer Inference Workflow](#low-precision-8-bit-integer-inference-workflow)
3. [Prerequisites](#prerequisites)
4. [Inference](#inference)
5. [Results analysis](#results-analysis)

## Supported devices

Low-precision 8-bit inference is optimized for:
@@ -24,34 +17,35 @@ Low-precision 8-bit inference is optimized for:

## Low-Precision 8-bit Integer Inference Workflow

8-bit computations (referred to as `int8`) offer better performance compared to the results of inference in higher precision (for example, `fp32`), because they allow loading more data into a single processor instruction. Usually the cost for significant boost is a reduced accuracy. However, it is proved that an accuracy drop can be negligible and depends on task requirements, so that the application engineer can set up the maximum accuracy drop that is acceptable.
8-bit computations (referred to as `int8`) offer better performance compared to the results of inference in higher precision (for example, `fp32`), because they allow loading more data into a single processor instruction. Usually the cost for significant boost is reduced accuracy. However, it is proved that an accuracy drop can be negligible and depends on task requirements, so that the application engineer can set up the maximum accuracy drop that is acceptable.

For 8-bit integer computations, a model must be quantized. Quantized models can be downloaded from [Overview of OpenVINO™ Toolkit Intel's Pre-Trained Models](@ref omz_models_group_intel). If the model is not quantized, you can use the [Post-Training Optimization Tool](@ref pot_README) to quantize the model. The quantization process adds [FakeQuantize](../ops/quantization/FakeQuantize_1.md) layers on activations and weights for most layers. Read more about mathematical computations in the [Uniform Quantization with Fine-Tuning](https://github.com/openvinotoolkit/nncf/blob/develop/docs/compression_algorithms/Quantization.md).

When you pass the quantized IR to the OpenVINO™ plugin, the plugin automatically recognizes it as a quantized model and performs 8-bit inference. Note, if you pass a quantized model to another plugin that does not support 8-bit inference but supports all operations from the model, the model is inferred in precision that this plugin supports.

In *Runtime stage* stage, the quantized model is loaded to the plugin. The plugin uses `Low Precision Transformation` component to update the model to infer it in low precision:
- Update `FakeQuantize` layers to have quantized output tensors in low precision range and add dequantization layers to compensate the update. Dequantization layers are pushed through as many layers as possible to have more layers in low precision. After that, most layers have quantized input tensors in low precision range and can be inferred in low precision. Ideally, dequantization layers should be fused in the next `FakeQuantize` layer.
- Weights are quantized and stored in `Constant` layers.
In *Runtime stage*, the quantized model is loaded to the plugin. The plugin uses the `Low Precision Transformation` component to update the model to infer it in low precision:
- Update `FakeQuantize` layers to have quantized output tensors in a low precision range and add dequantization layers to compensate the update. Dequantization layers are pushed through as many layers as possible to have more layers in low precision. After that, most layers have quantized input tensors in the low precision range and can be inferred in low precision. Ideally, dequantization layers should be fused in the next `FakeQuantize` layer.
- Quantize weights and store them in `Constant` layers.

## Prerequisites

Let's explore quantized [TensorFlow* implementation of ResNet-50](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-50-tf) model. Use [Model Downloader](@ref omz_tools_downloader) tool to download the `fp16` model from [OpenVINO™ Toolkit - Open Model Zoo repository](https://github.com/openvinotoolkit/open_model_zoo):
Let's explore the quantized [TensorFlow* implementation of ResNet-50](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-50-tf) model. Use the [Model Downloader](@ref omz_tools_downloader) tool to download the `fp16` model from [OpenVINO™ Toolkit - Open Model Zoo repository](https://github.com/openvinotoolkit/open_model_zoo):
```sh
./downloader.py --name resnet-50-tf --precisions FP16-INT8
cd $INTEL_OPENVINO_DIR/deployment_tools/tools/model_downloader
./downloader.py --name resnet-50-tf --precisions FP16-INT8 --output_dir <your_model_directory>
```
After that you should quantize model by the [Model Quantizer](@ref omz_tools_downloader) tool.
After that, you should quantize the model by the [Model Quantizer](@ref omz_tools_downloader) tool. For the dataset, you can choose to download the ImageNet dataset from [here](https://www.image-net.org/download.php).
```sh
./quantizer.py --model_dir public/resnet-50-tf --dataset_dir <DATASET_DIR> --precisions=FP16-INT8
./quantizer.py --name resnet-50-tf --model_dir <your_model_directory> --dataset_dir <DATASET_DIR> --precisions=FP16-INT8
```

## Inference

The simplest way to infer the model and collect performance counters is [C++ Benchmark Application](../../inference-engine/samples/benchmark_app/README.md).
The simplest way to infer the model and collect performance counters is the [C++ Benchmark Application](../../inference-engine/samples/benchmark_app/README.md).
```sh
./benchmark_app -m resnet-50-tf.xml -d CPU -niter 1 -api sync -report_type average_counters -report_folder pc_report_dir
```
If you infer the model with the OpenVINO™ CPU plugin and collect performance counters, all operations (except last not quantized SoftMax) are executed in INT8 precision.
If you infer the model with the Inference Engine CPU plugin and collect performance counters, all operations (except the last non-quantized SoftMax) are executed in INT8 precision.

## Results analysis

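In addition to the `benchmark_app` command above, the same check can be sketched in C++; the example below assumes the quantized `resnet-50-tf.xml` IR from the previous steps and inspects per-layer performance counters, where kernel names such as `jit_avx512_I8` are only examples of what an INT8 execution may report.

```cpp
#include <inference_engine.hpp>
#include <iostream>

int main() {
    using namespace InferenceEngine;

    Core core;
    // Quantized IR produced by the downloader/quantizer steps above (path is a placeholder).
    CNNNetwork network = core.ReadNetwork("resnet-50-tf.xml");

    // Enable performance counters so the executed precision of each layer can be inspected.
    ExecutableNetwork executable = core.LoadNetwork(
        network, "CPU", {{CONFIG_KEY(PERF_COUNT), CONFIG_VALUE(YES)}});

    InferRequest request = executable.CreateInferRequest();
    // Input blobs would be filled here in a real application.
    request.Infer();

    // Layers executed in INT8 typically report an I8 kernel in exec_type.
    for (const auto& counter : request.GetPerformanceCounts()) {
        std::cout << counter.first << ": " << counter.second.exec_type << std::endl;
    }
    return 0;
}
```
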
12 changes: 0 additions & 12 deletions docs/IE_DG/Legal_Information.md

This file was deleted.

2 changes: 1 addition & 1 deletion docs/IE_DG/Samples_Overview.md
@@ -109,7 +109,7 @@ for the debug configuration — in `<path_to_build_directory>/intel64/Debug/`.

The recommended Windows* build environment is the following:
* Microsoft Windows* 10
* Microsoft Visual Studio* 2017, or 2019
* Microsoft Visual Studio* 2017, or 2019. Make sure that C++ CMake tools for Windows is [enabled](https://docs.microsoft.com/en-us/cpp/build/cmake-projects-in-visual-studio?view=msvc-160#:~:text=The%20Visual%20C%2B%2B%20Tools%20for,Visual%20Studio%20generators%20are%20supported).
* CMake* version 3.10 or higher

> **NOTE**: If you want to use Microsoft Visual Studio 2019, you are required to install CMake 3.14.
2 changes: 1 addition & 1 deletion docs/IE_DG/ShapeInference.md
@@ -33,7 +33,7 @@ If a model has a hard-coded batch dimension, use `InferenceEngine::CNNNetwork::s

Inference Engine takes three kinds of a model description as an input, which are converted into an `InferenceEngine::CNNNetwork` object:
1. [Intermediate Representation (IR)](../MO_DG/IR_and_opsets.md) through `InferenceEngine::Core::ReadNetwork`
2. [ONNX model](../IE_DG/OnnxImporterTutorial.md) through `InferenceEngine::Core::ReadNetwork`
2. [ONNX model](../IE_DG/ONNX_Support.md) through `InferenceEngine::Core::ReadNetwork`
3. [nGraph function](../nGraph_DG/nGraph_dg.md) through the constructor of `InferenceEngine::CNNNetwork`

`InferenceEngine::CNNNetwork` keeps an `ngraph::Function` object with the model description internally.
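
For reference, the three kinds of model description listed in this hunk map to the following C++ calls; the file names and the tiny nGraph function are placeholders used only for illustration.

```cpp
#include <inference_engine.hpp>
#include <ngraph/ngraph.hpp>
#include <ngraph/opsets/opset1.hpp>
#include <memory>

int main() {
    using namespace InferenceEngine;
    Core core;

    // 1. Intermediate Representation (IR) through Core::ReadNetwork.
    CNNNetwork fromIR = core.ReadNetwork("model.xml", "model.bin");

    // 2. ONNX model through the same Core::ReadNetwork call.
    CNNNetwork fromOnnx = core.ReadNetwork("model.onnx");

    // 3. nGraph function through the CNNNetwork constructor.
    auto param = std::make_shared<ngraph::opset1::Parameter>(
        ngraph::element::f32, ngraph::Shape{1, 3, 224, 224});
    auto relu = std::make_shared<ngraph::opset1::Relu>(param);
    auto function = std::make_shared<ngraph::Function>(
        ngraph::NodeVector{relu}, ngraph::ParameterVector{param});
    CNNNetwork fromNGraph(function);

    // If the model has a hard-coded batch dimension, it can be reset as the page describes.
    fromIR.setBatchSize(8);
    return 0;
}
```
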
