diff --git a/docs/doxygen/ie_docs.xml b/docs/doxygen/ie_docs.xml index 9ef9073e409a1e..133562d135b24d 100644 --- a/docs/doxygen/ie_docs.xml +++ b/docs/doxygen/ie_docs.xml @@ -135,6 +135,11 @@ limitations under the License. + + + + + diff --git a/docs/ops/detection/ExperimentalDetectronDetectionOutput_6.md b/docs/ops/detection/ExperimentalDetectronDetectionOutput_6.md new file mode 100644 index 00000000000000..cc0a026734feee --- /dev/null +++ b/docs/ops/detection/ExperimentalDetectronDetectionOutput_6.md @@ -0,0 +1,193 @@ +## ExperimentalDetectronDetectionOutput {#openvino_docs_ops_detection_ExperimentalDetectronDetectionOutput_6} + +**Versioned name**: *ExperimentalDetectronDetectionOutput-6* + +**Category**: Object detection + +**Short description**: The *ExperimentalDetectronDetectionOutput* operation performs non-maximum suppression to generate +the detection output using information on location and score predictions. + +**Detailed description**: The operation performs the following steps: + +1. 
Applies deltas to box coordinates [x0, y0, x1, y1] and takes coordinates of +refined boxes according to the formulas: + +`x0_new = ctr_x + (dx - 0.5 * exp(min(d_log_w, max_delta_log_wh))) * box_w` + +`y0_new = ctr_y + (dy - 0.5 * exp(min(d_log_h, max_delta_log_wh))) * box_h` + +`x1_new = ctr_x + (dx + 0.5 * exp(min(d_log_w, max_delta_log_wh))) * box_w - 1.0` + +`y1_new = ctr_y + (dy + 0.5 * exp(min(d_log_h, max_delta_log_wh))) * box_h - 1.0` + +* `box_w` and `box_h` are the width and height of the box, respectively: + +`box_w = x1 - x0 + 1.0` + +`box_h = y1 - y0 + 1.0` + +* `ctr_x` and `ctr_y` are the coordinates of the box center: + +`ctr_x = x0 + 0.5 * box_w` + +`ctr_y = y0 + 0.5 * box_h` + +* `dx`, `dy`, `d_log_w` and `d_log_h` are deltas calculated according to the formulas below, and `deltas_tensor` is the +second input: + +`dx = deltas_tensor[roi_idx, 4 * class_idx + 0] / deltas_weights[0]` + +`dy = deltas_tensor[roi_idx, 4 * class_idx + 1] / deltas_weights[1]` + +`d_log_w = deltas_tensor[roi_idx, 4 * class_idx + 2] / deltas_weights[2]` + +`d_log_h = deltas_tensor[roi_idx, 4 * class_idx + 3] / deltas_weights[3]` + +2. If *class_agnostic_box_regression* is `true`, removes predictions for background classes. +3. Clips boxes to the image. +4. Applies *score_threshold* on detection scores. +5. Applies non-maximum suppression class-wise with *nms_threshold* and returns *post_nms_count* or fewer detections per +class. +6. Returns *max_detections_per_image* detections if the total number of detections is more than *max_detections_per_image*; +otherwise, returns all detections, and the remaining output tensor elements are filled with undefined +values. + +**Attributes**: + +* *score_threshold* + + * **Description**: The *score_threshold* attribute specifies a threshold to consider only detections whose scores are + larger than the threshold. 
+ * **Range of values**: non-negative floating point number + * **Type**: float + * **Default value**: None + * **Required**: *yes* + +* *nms_threshold* + + * **Description**: The *nms_threshold* attribute specifies a threshold to be used in the NMS stage. + * **Range of values**: non-negative floating point number + * **Type**: float + * **Default value**: None + * **Required**: *yes* + +* *num_classes* + + * **Description**: The *num_classes* attribute specifies the number of detected classes. + * **Range of values**: non-negative integer number + * **Type**: int + * **Default value**: None + * **Required**: *yes* + +* *post_nms_count* + + * **Description**: The *post_nms_count* attribute specifies the maximal number of detections per class. + * **Range of values**: non-negative integer number + * **Type**: int + * **Default value**: None + * **Required**: *yes* + +* *max_detections_per_image* + + * **Description**: The *max_detections_per_image* attribute specifies the maximal number of detections per image. + * **Range of values**: non-negative integer number + * **Type**: int + * **Default value**: None + * **Required**: *yes* + +* *class_agnostic_box_regression* + + * **Description**: The *class_agnostic_box_regression* attribute is a flag that specifies whether to delete background + classes or not. + * **Range of values**: + * `true` means background classes should be deleted + * `false` means background classes should not be deleted + * **Type**: boolean + * **Default value**: false + * **Required**: *no* + +* *max_delta_log_wh* + + * **Description**: The *max_delta_log_wh* attribute specifies the maximal delta of logarithms for width and height. + * **Range of values**: floating point number + * **Type**: float + * **Default value**: None + * **Required**: *yes* + +* *deltas_weights* + + * **Description**: The *deltas_weights* attribute specifies weights for bounding box size deltas. 
+ * **Range of values**: a list of non-negative floating point numbers + * **Type**: float[] + * **Default value**: None + * **Required**: *yes* + +**Inputs** + +* **1**: A 2D tensor of type *T* with input ROIs, with shape `[number_of_ROIs, 4]` providing the ROIs as 4-tuples: +[x1, y1, x2, y2]. The batch dimension of first, second, and third inputs +should be the same. **Required.** + +* **2**: A 2D tensor of type *T* with shape `[number_of_ROIs, num_classes * 4]` providing deltas for input boxes. + **Required.** + +* **3**: A 2D tensor of type *T* with shape `[number_of_ROIs, num_classes]` providing detections scores. **Required.** + +* **4**: A 2D tensor of type *T* with shape `[1, 3]` contains three elements + `[image_height, image_width, scale_height_and_width]` providing input image size info. **Required.** + +**Outputs** + +* **1**: A 2D tensor of type *T* with shape `[max_detections_per_image, 4]` providing boxes indices. + +* **2**: A 1D tensor of type *T_IND* with shape `[max_detections_per_image]` providing classes indices. + +* **3**: A 1D tensor of type *T* with shape `[max_detections_per_image]` providing scores indices. + +**Types** + +* *T*: any supported floating point type. + +* *T_IND*: `int64` or `int32`. 
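The refinement formulas in step 1 of the detailed description can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the plugin implementation: the function name `refine_box` and its argument layout are hypothetical, and the four deltas are assumed to be already sliced out of the second input for one ROI and one class.

```python
import math


def refine_box(box, deltas, deltas_weights, max_delta_log_wh):
    """Apply one set of deltas to one box (hypothetical helper).

    box            -- (x0, y0, x1, y1)
    deltas         -- the 4 raw values deltas_tensor[roi_idx, 4 * class_idx + k]
    deltas_weights -- the 4 values of the deltas_weights attribute
    """
    x0, y0, x1, y1 = box

    # Box size and center, using the "+ 1.0" pixel convention of the formulas.
    box_w = x1 - x0 + 1.0
    box_h = y1 - y0 + 1.0
    ctr_x = x0 + 0.5 * box_w
    ctr_y = y0 + 0.5 * box_h

    # Deltas are divided by the per-coordinate weights.
    dx = deltas[0] / deltas_weights[0]
    dy = deltas[1] / deltas_weights[1]
    d_log_w = deltas[2] / deltas_weights[2]
    d_log_h = deltas[3] / deltas_weights[3]

    # Log-size deltas are clamped from above before exponentiation.
    half_w = 0.5 * math.exp(min(d_log_w, max_delta_log_wh))
    half_h = 0.5 * math.exp(min(d_log_h, max_delta_log_wh))

    x0_new = ctr_x + (dx - half_w) * box_w
    y0_new = ctr_y + (dy - half_h) * box_h
    x1_new = ctr_x + (dx + half_w) * box_w - 1.0
    y1_new = ctr_y + (dy + half_h) * box_h - 1.0
    return (x0_new, y0_new, x1_new, y1_new)
```

With all deltas equal to zero the box is returned unchanged, which is a quick sanity check of the `+ 1.0` / `- 1.0` convention.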
+ + +**Example** + +```xml + + + + + 1000 + 4 + + + 1000 + 324 + + + 1000 + 81 + + + 1 + 3 + + + + + 100 + 4 + + + 100 + + + 100 + + + 100 + + + +``` diff --git a/docs/ops/detection/ExperimentalDetectronGenerateProposalsSingleImage_6.md b/docs/ops/detection/ExperimentalDetectronGenerateProposalsSingleImage_6.md new file mode 100644 index 00000000000000..7cfacbeed58ff0 --- /dev/null +++ b/docs/ops/detection/ExperimentalDetectronGenerateProposalsSingleImage_6.md @@ -0,0 +1,112 @@ +## ExperimentalDetectronGenerateProposalsSingleImage {#openvino_docs_ops_detection_ExperimentalDetectronGenerateProposalsSingleImage_6} + +**Versioned name**: *ExperimentalDetectronGenerateProposalsSingleImage-6* + +**Category**: Object detection + +**Short description**: The *ExperimentalDetectronGenerateProposalsSingleImage* operation computes ROIs and their scores +based on input data. + +**Detailed description**: The operation performs the following steps: + +1. Transposes and reshapes predicted bounding boxes deltas and scores to get them into the same order as the anchors. +2. Transforms anchors into proposals using deltas and clips proposals to an image. +3. Removes predicted boxes with either height or width < *min_size*. +4. Sorts all `(proposal, score)` pairs by score from highest to lowest; order of pairs with equal scores is undefined. +5. Takes top *pre_nms_count* proposals, if total number of proposals is less than *pre_nms_count* takes all proposals. +6. Applies non-maximum suppression with *nms_threshold*. +7. Takes top *post_nms_count* proposals and returns these top proposals and their scores. If total number of proposals +is less than *post_nms_count* returns output tensors filled with zeroes. + +**Attributes**: + +* *min_size* + + * **Description**: The *min_size* attribute specifies minimum box width and height. 
+ * **Range of values**: non-negative floating point number + * **Type**: float + * **Default value**: None + * **Required**: *yes* + +* *nms_threshold* + + * **Description**: The *nms_threshold* attribute specifies threshold to be used in the NMS stage. + * **Range of values**: non-negative floating point number + * **Type**: float + * **Default value**: None + * **Required**: *yes* + +* *pre_nms_count* + + * **Description**: The *pre_nms_count* attribute specifies number of top-n proposals before NMS. + * **Range of values**: non-negative integer number + * **Type**: int + * **Default value**: None + * **Required**: *yes* + +* *post_nms_count* + + * **Description**: The *post_nms_count* attribute specifies number of top-n proposals after NMS. + * **Range of values**: non-negative integer number + * **Type**: int + * **Default value**: None + * **Required**: *yes* + +**Inputs** + +* **1**: A 1D tensor of type *T* with 3 elements `[image_height, image_width, scale_height_and_width]` providing input +image size info. **Required.** + +* **2**: A 2D tensor of type *T* with shape `[height * width * number_of_channels, 4]` providing anchors. **Required.** + +* **3**: A 3D tensor of type *T* with shape `[number_of_channels * 4, height, width]` providing deltas for anchors. +Height and width for third and fourth inputs should be equal. **Required.** + +* **4**: A 3D tensor of type *T* with shape `[number_of_channels, height, width]` providing proposals scores. +**Required.** + +**Outputs** + +* **1**: A 2D tensor of type *T* with shape `[post_nms_count, 4]` providing ROIs. + +* **2**: A 1D tensor of type *T* with shape `[post_nms_count]` providing ROIs scores. + +**Types** + +* *T*: any supported floating point type. 
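Steps 4 to 7 of the detailed description (sorting, pre-NMS top-n, NMS, post-NMS top-n) can be sketched as below. This is a minimal illustration, not the reference implementation. The names `iou` and `select_proposals` are hypothetical, the IoU is computed on plain `(x1 - x0) * (y1 - y0)` areas (the specification does not state the pixel convention), and the last step is read as "unused output rows are zero-filled", which is an assumption.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_proposals(proposals, scores, pre_nms_count, post_nms_count, nms_threshold):
    # Step 4: sort indices by score, highest first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Step 5: keep at most pre_nms_count candidates.
    order = order[:pre_nms_count]
    # Step 6: greedy NMS, dropping boxes that overlap an already kept box.
    kept = []
    for i in order:
        if all(iou(proposals[i], proposals[j]) <= nms_threshold for j in kept):
            kept.append(i)
    # Step 7: keep at most post_nms_count results and zero-fill the rest.
    kept = kept[:post_nms_count]
    pad = post_nms_count - len(kept)
    rois = [list(proposals[i]) for i in kept] + [[0.0] * 4 for _ in range(pad)]
    out_scores = [scores[i] for i in kept] + [0.0] * pad
    return rois, out_scores
```

The outputs always have `post_nms_count` rows, matching the output shapes `[post_nms_count, 4]` and `[post_nms_count]`.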
+ +**Example** + +```xml + + + + + 3 + + + 12600 + 4 + + + 12 + 50 + 84 + + + 3 + 50 + 84 + + + + + 1000 + 4 + + + 1000 + + + +``` diff --git a/docs/ops/detection/ExperimentalDetectronPriorGridGenerator_6.md b/docs/ops/detection/ExperimentalDetectronPriorGridGenerator_6.md new file mode 100644 index 00000000000000..54a684d98bb7af --- /dev/null +++ b/docs/ops/detection/ExperimentalDetectronPriorGridGenerator_6.md @@ -0,0 +1,116 @@ +## ExperimentalDetectronPriorGridGenerator {#openvino_docs_ops_detection_ExperimentalDetectronPriorGridGenerator_6} + +**Versioned name**: *ExperimentalDetectronPriorGridGenerator-6* + +**Category**: Object detection + +**Short description**: The *ExperimentalDetectronPriorGridGenerator* operation generates prior grids of specified sizes. + +**Detailed description**: The operation takes coordinates of centres of boxes and adds strides with offset `0.5` to them to +calculate coordinates of prior grids. + +The numbers of generated cells are `featmap_height` and `featmap_width` if *h* and *w* are zeroes; otherwise, *h* and *w*, +respectively. The steps of the generated grid are `image_height` / `featmap_height` and `image_width` / `featmap_width` if +*stride_y* and *stride_x* are zeroes; otherwise, *stride_y* and *stride_x*, respectively. + +`featmap_height`, `featmap_width`, `image_height` and `image_width` are the spatial dimensions of the second and third +inputs, respectively. + +**Attributes**: + +* *flatten* + + * **Description**: The *flatten* attribute specifies whether the output tensor should be 2D or 4D. + * **Range of values**: + * `true` - the output tensor should be a 2D tensor + * `false` - the output tensor should be a 4D tensor + * **Type**: boolean + * **Default value**: true + * **Required**: *no* + +* *h* + + * **Description**: The *h* attribute specifies the number of cells of the generated grid with respect to height. 
+ * **Range of values**: non-negative integer number less than or equal to `featmap_height` + * **Type**: int + * **Default value**: 0 + * **Required**: *no* + +* *w* + + * **Description**: The *w* attribute specifies the number of cells of the generated grid with respect to width. + * **Range of values**: non-negative integer number less than or equal to `featmap_width` + * **Type**: int + * **Default value**: 0 + * **Required**: *no* + +* *stride_x* + + * **Description**: The *stride_x* attribute specifies the step of the generated grid with respect to the x coordinate. + * **Range of values**: non-negative float number + * **Type**: float + * **Default value**: 0.0 + * **Required**: *no* + +* *stride_y* + + * **Description**: The *stride_y* attribute specifies the step of the generated grid with respect to the y coordinate. + * **Range of values**: non-negative float number + * **Type**: float + * **Default value**: 0.0 + * **Required**: *no* + +**Inputs** + +* **1**: A 2D tensor of type *T* with shape `[number_of_priors, 4]` containing priors. **Required.** + +* **2**: A 4D tensor of type *T* with input feature map `[1, number_of_channels, featmap_height, featmap_width]`. This +operation uses only sizes of this input tensor, not its data. **Required.** + +* **3**: A 4D tensor of type *T* with input image `[1, number_of_channels, image_height, image_width]`. The number of +channels of both feature map and input image tensors must match. This operation uses only sizes of this input tensor, +not its data. **Required.** + +**Outputs** + +* **1**: A tensor of type *T* with priors grid with shape `[featmap_height * featmap_width * number_of_priors, 4]` +if *flatten* is `true`, or `[featmap_height, featmap_width, number_of_priors, 4]` otherwise. +If 0 < *h* < `featmap_height` and/or 0 < *w* < `featmap_width`, the output data size is less than +`featmap_height` * `featmap_width` * `number_of_priors` * 4, and the remaining output tensor elements are filled with +undefined values. 
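The grid generation described above can be sketched as follows for the flattened (2D) output. This is an illustration under assumptions rather than the actual kernel: the function name `prior_grid` is hypothetical, priors are assumed to be boxes given relative to a cell center, and the cell center offset is assumed to be added to all four coordinates of every prior.

```python
def prior_grid(priors, featmap_hw, image_hw, h=0, w=0, stride_x=0.0, stride_y=0.0):
    """Return a flat list of boxes, ordered by row, column, then prior."""
    featmap_h, featmap_w = featmap_hw
    image_h, image_w = image_hw

    # Zero-valued attributes fall back to sizes derived from the inputs.
    grid_h = h if h else featmap_h
    grid_w = w if w else featmap_w
    step_x = stride_x if stride_x else image_w / featmap_w
    step_y = stride_y if stride_y else image_h / featmap_h

    grid = []
    for iy in range(grid_h):
        for ix in range(grid_w):
            # Cell center: index plus the 0.5 offset, scaled by the step.
            cx = (ix + 0.5) * step_x
            cy = (iy + 0.5) * step_y
            for x0, y0, x1, y1 in priors:
                grid.append([x0 + cx, y0 + cy, x1 + cx, y1 + cy])
    return grid
```

For 3 priors, a `25 x 42` feature map and an `800 x 1344` image, the sketch produces `25 * 42 * 3 = 3150` boxes with a step of 32 in both directions.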
+ +**Types** + +* *T*: any supported floating point type. + +**Example** + +```xml + + + + + 3 + 4 + + + 1 + 256 + 25 + 42 + + + 1 + 3 + 800 + 1344 + + + + + 3150 + 4 + + + +``` diff --git a/docs/ops/detection/ExperimentalDetectronROIFeatureExtractor_6.md b/docs/ops/detection/ExperimentalDetectronROIFeatureExtractor_6.md new file mode 100644 index 00000000000000..a44411bc9c4cbb --- /dev/null +++ b/docs/ops/detection/ExperimentalDetectronROIFeatureExtractor_6.md @@ -0,0 +1,139 @@ +## ExperimentalDetectronROIFeatureExtractor {#openvino_docs_ops_detection_ExperimentalDetectronROIFeatureExtractor_6} + +**Versioned name**: *ExperimentalDetectronROIFeatureExtractor-6* + +**Category**: Object detection + +**Short description**: *ExperimentalDetectronROIFeatureExtractor* is the [ROIAlign](ROIAlign_3.md) operation applied +over a feature pyramid. + +**Detailed description**: *ExperimentalDetectronROIFeatureExtractor* maps input ROIs to the levels of the pyramid +depending on the sizes of ROIs and parameters of the operation, and then extracts features via ROIAlign from +corresponding pyramid levels. + +The operation applies the *ROIAlign* algorithm to the pyramid layers: + +`output[i, :, :, :] = ROIAlign(inputPyramid[j], rois[i])` + +`j = PyramidLevelMapper(rois[i])` + +PyramidLevelMapper maps the ROI to the pyramid level using the following formula: + +`j = floor(2 + log2(sqrt(w * h) / 224))` + +Here 224 is the canonical ImageNet pre-training size, 2 is the pyramid starting level, and `w`, `h` are the ROI width and height. + +For more details, see the following source: +[Feature Pyramid Networks for Object Detection](https://arxiv.org/pdf/1612.03144.pdf). + +**Attributes**: + +* *output_size* + + * **Description**: The *output_size* attribute specifies the width and height of the output tensor. 
+ * **Range of values**: a positive integer number + * **Type**: int + * **Default value**: None + * **Required**: *yes* + +* *sampling_ratio* + + * **Description**: The *sampling_ratio* attribute specifies the number of sampling points per output value. If 0, + then an adaptive number computed as `ceil(roi_width / output_width)` is used, and likewise for height. + * **Range of values**: a non-negative integer number + * **Type**: int + * **Default value**: None + * **Required**: *yes* + +* *pyramid_scales* + + * **Description**: The *pyramid_scales* attribute lists `image_size / layer_size[l]` ratios for pyramid layers `l=1,...,L`, + where `L` is the number of pyramid layers, and `image_size` refers to the network's input image. Note that the pyramid's + largest layer may have a smaller size than the input image, e.g. `image_size` is `800 x 1344` in the XML example below. + * **Range of values**: a list of positive integer numbers + * **Type**: int[] + * **Default value**: None + * **Required**: *yes* + +* *aligned* + + * **Description**: The *aligned* attribute specifies whether to add an offset (`-0.5`) to ROI sizes or not. + * **Range of values**: + * `true` - add offset to ROIs sizes + * `false` - do not add offset to ROIs sizes + * **Type**: boolean + * **Default value**: false + * **Required**: *no* + +**Inputs**: + +* **1**: 2D input tensor of type *T* with shape `[number_of_ROIs, 4]` providing the ROIs as 4-tuples: +[x1, y1, x2, y2]. Coordinates *x* and *y* refer to the network's input +*image_size*. **Required**. + +* **2**, ..., **L**: Pyramid of 4D input tensors with feature maps. Shape must be +`[1, number_of_channels, layer_size[l], layer_size[l]]`. The number of channels must be the same for all layers of the +pyramid. The layer width and height must be equal to `layer_size[l] = image_size / pyramid_scales[l]`. **Required**. + +**Outputs**: + +* **1**: 4D output tensor of type *T* with ROIs features. Shape must be +`[number_of_ROIs, number_of_channels, output_size, output_size]`. 
Channels number is the same as for all images in the +input pyramid. + +* **2**: 2D output tensor of type *T* with reordered ROIs according to their mapping to the pyramid levels. Shape +must be the same as for 1 input: `[number_of_ROIs, 4]`. + +**Types** + +* *T*: any supported floating point type. + +**Example** + +```xml + + + + + 1000 + 4 + + + 1 + 256 + 200 + 336 + + + 1 + 256 + 100 + 168 + + + 1 + 256 + 50 + 84 + + + 1 + 256 + 25 + 42 + + + + + 1000 + 256 + 7 + 7 + + + 1000 + 4 + + + +``` diff --git a/docs/ops/opset6.md b/docs/ops/opset6.md index bf25a29a4d4291..dbe17d468611d2 100644 --- a/docs/ops/opset6.md +++ b/docs/ops/opset6.md @@ -50,6 +50,11 @@ declared in `namespace opset6`. * [Equal](comparison/Equal_1.md) * [Erf](arithmetic/Erf_1.md) * [Exp](activation/Exp_1.md) +* [ExperimentalDetectronDetectionOutput_6](detection/ExperimentalDetectronDetectionOutput_6.md) +* [ExperimentalDetectronGenerateProposalsSingleImage_6](detection/ExperimentalDetectronGenerateProposalsSingleImage_6.md) +* [ExperimentalDetectronPriorGridGenerator_6](detection/ExperimentalDetectronPriorGridGenerator_6.md) +* [ExperimentalDetectronROIFeatureExtractor_6](detection/ExperimentalDetectronROIFeatureExtractor_6.md) +* [ExperimentalDetectronTopKROIs_6](sort/ExperimentalDetectronTopKROIs_6.md) * [ExtractImagePatches](movement/ExtractImagePatches_3.md) * [FakeQuantize](quantization/FakeQuantize_1.md) * [Floor](arithmetic/Floor_1.md) diff --git a/docs/ops/sort/ExperimentalDetectronTopKROIs_6.md b/docs/ops/sort/ExperimentalDetectronTopKROIs_6.md new file mode 100644 index 00000000000000..107f6311c53c3e --- /dev/null +++ b/docs/ops/sort/ExperimentalDetectronTopKROIs_6.md @@ -0,0 +1,61 @@ +## ExperimentalDetectronTopKROIs {#openvino_docs_ops_sort_ExperimentalDetectronTopKROIs_6} + +**Versioned name**: *ExperimentalDetectronTopKROIs-6* + +**Category**: Sort + +**Short description**: The *ExperimentalDetectronTopKROIs* operation is TopK operation applied to probabilities of input +ROIs. 
+ +**Detailed description**: The operation sorts input ROIs by probability in descending order and returns *max_rois* +ROIs. The order of sorted ROIs with equal probabilities is undefined. If the number of ROIs is less than *max_rois*, +the operation returns all ROIs sorted in descending order, and the rest of the output tensor elements are filled with +undefined values. + +**Attributes**: + +* *max_rois* + + * **Description**: The *max_rois* attribute specifies the maximal number of output ROIs. + * **Range of values**: non-negative integer number + * **Type**: int + * **Default value**: 0 + * **Required**: *no* + +**Inputs** + +* **1**: A 2D tensor of type *T* with shape `[number_of_ROIs, 4]` describing the ROIs as 4-tuples: +[x1, y1, x2, y2]. **Required.** + +* **2**: A 1D tensor of type *T* with shape `[number_of_ROIs]` containing probabilities for input ROIs. **Required.** + +**Outputs** + +* **1**: A 2D tensor of type *T* with shape `[max_rois, 4]` describing *max_rois* ROIs with highest probabilities. + +**Types** + +* *T*: any supported floating point type. + +**Example** + +```xml + + + + + 5000 + 4 + + + 5000 + + + + + 1000 + 4 + + + +```
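The selection performed by *ExperimentalDetectronTopKROIs* can be sketched in a few lines of Python. This is a simplified illustration, not the actual implementation: the function name `topk_rois` is hypothetical, and when fewer than *max_rois* ROIs are available the sketch simply returns a shorter list instead of a fixed-size tensor with undefined trailing values.

```python
def topk_rois(rois, probs, max_rois):
    """Return up to max_rois ROIs with the highest probabilities."""
    # Sort ROI indices by probability, highest first.
    # Python's sort is stable, but the specification leaves the order
    # of ROIs with equal probabilities undefined.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [rois[i] for i in order[:max_rois]]
```

For example, with probabilities `[0.2, 0.9, 0.5]` and `max_rois = 2`, the second and third ROIs are returned, in that order.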