Add specification for ExperimentalDetectron* operations (#5128)

openvinotoolkit · Apr 7, 2021 · 1c84064 · 1c84064
1 parent c370284
commit 1c84064

Showing 7 changed files with 631 additions and 0 deletions.
diff --git a/docs/doxygen/ie_docs.xml b/docs/doxygen/ie_docs.xml
@@ -135,6 +135,11 @@ limitations under the License.
                 <tab type="user" title="Equal-1" url="@ref openvino_docs_ops_comparison_Equal_1"/>
                 <tab type="user" title="Erf-1" url="@ref openvino_docs_ops_arithmetic_Erf_1"/>
                 <tab type="user" title="Exp-1" url="@ref openvino_docs_ops_activation_Exp_1"/>
+                <tab type="user" title="ExperimentalDetectronDetectionOutput-6" url="@ref openvino_docs_ops_detection_ExperimentalDetectronDetectionOutput_6"/>
+                <tab type="user" title="ExperimentalDetectronGenerateProposalsSingleImage-6" url="@ref openvino_docs_ops_detection_ExperimentalDetectronGenerateProposalsSingleImage_6"/>
+                <tab type="user" title="ExperimentalDetectronPriorGridGenerator-6" url="@ref openvino_docs_ops_detection_ExperimentalDetectronPriorGridGenerator_6"/>
+                <tab type="user" title="ExperimentalDetectronROIFeatureExtractor-6" url="@ref openvino_docs_ops_detection_ExperimentalDetectronROIFeatureExtractor_6"/>
+                <tab type="user" title="ExperimentalDetectronTopKROIs-6" url="@ref openvino_docs_ops_sort_ExperimentalDetectronTopKROIs_6"/>
                 <tab type="user" title="ExtractImagePatches-3" url="@ref openvino_docs_ops_movement_ExtractImagePatches_3"/>
                 <tab type="user" title="FakeQuantize-1" url="@ref openvino_docs_ops_quantization_FakeQuantize_1"/>
                 <tab type="user" title="FloorMod-1" url="@ref openvino_docs_ops_arithmetic_FloorMod_1"/>

diff --git a/docs/ops/detection/ExperimentalDetectronDetectionOutput_6.md b/docs/ops/detection/ExperimentalDetectronDetectionOutput_6.md
@@ -0,0 +1,193 @@
+## ExperimentalDetectronDetectionOutput <a name="ExperimentalDetectronDetectionOutput"></a> {#openvino_docs_ops_detection_ExperimentalDetectronDetectionOutput_6}
+
+**Versioned name**: *ExperimentalDetectronDetectionOutput-6*
+
+**Category**: Object detection
+
+**Short description**: The *ExperimentalDetectronDetectionOutput* operation performs non-maximum suppression to generate
+the detection output using information on location and score predictions.
+
+**Detailed description**: The operation performs the following steps:
+
+1.  Applies deltas to the input boxes [x<sub>1</sub>, y<sub>1</sub>, x<sub>2</sub>, y<sub>2</sub>] (written as `x0`, `y0`, `x1`, `y1` in the formulas below) and computes the coordinates of
+refined boxes according to the formulas:
+
+`x0_new = ctr_x + (dx - 0.5 * exp(min(d_log_w, max_delta_log_wh))) * box_w`
+
+`y0_new = ctr_y + (dy - 0.5 * exp(min(d_log_h, max_delta_log_wh))) * box_h`
+
+`x1_new = ctr_x + (dx + 0.5 * exp(min(d_log_w, max_delta_log_wh))) * box_w - 1.0`
+
+`y1_new = ctr_y + (dy + 0.5 * exp(min(d_log_h, max_delta_log_wh))) * box_h - 1.0`
+
+* `box_w` and `box_h` are the width and height of the box, respectively:
+
+`box_w = x1 - x0 + 1.0`
+
+`box_h = y1 - y0 + 1.0`
+
+* `ctr_x` and `ctr_y` are the coordinates of the box center:
+
+`ctr_x = x0 + 0.5 * box_w`
+
+`ctr_y = y0 + 0.5 * box_h`
+
+* `dx`, `dy`, `d_log_w` and `d_log_h` are deltas calculated according to the formulas below, where `deltas_tensor` is the
+second input:
+
+`dx = deltas_tensor[roi_idx, 4 * class_idx + 0] / deltas_weights[0]`
+
+`dy = deltas_tensor[roi_idx, 4 * class_idx + 1] / deltas_weights[1]`
+
+`d_log_w = deltas_tensor[roi_idx, 4 * class_idx + 2] / deltas_weights[2]`
+
+`d_log_h = deltas_tensor[roi_idx, 4 * class_idx + 3] / deltas_weights[3]`
+
+2.  If *class_agnostic_box_regression* is `true`, removes predictions for background classes.
+3.  Clips boxes to the image.
+4.  Applies *score_threshold* on detection scores.
+5.  Applies non-maximum suppression class-wise with *nms_threshold* and returns *post_nms_count* or fewer detections per
+class.
+6.  Returns *max_detections_per_image* detections if the total number of detections is more than *max_detections_per_image*;
+otherwise, returns all detections, and the remaining elements of the output tensors are filled with undefined
+values.
+
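The box refinement formulas in step 1 can be sketched in plain Python as follows. This is only an illustration of the arithmetic for a single box and a single class, not the reference implementation; the function name `refine_box` and its argument layout are hypothetical, and `deltas` stands for the four values `deltas_tensor[roi_idx, 4 * class_idx + 0..3]`.

```python
import math


def refine_box(box, deltas, deltas_weights, max_delta_log_wh):
    """Apply regression deltas to one box given as (x0, y0, x1, y1).

    Hypothetical helper: `deltas` holds the four raw values for one ROI and
    one class, `deltas_weights` is the attribute of the same name.
    """
    x0, y0, x1, y1 = box

    # Box size and center, using the "+ 1.0" pixel convention from the spec.
    box_w = x1 - x0 + 1.0
    box_h = y1 - y0 + 1.0
    ctr_x = x0 + 0.5 * box_w
    ctr_y = y0 + 0.5 * box_h

    # Raw deltas are divided by the per-component weights.
    dx = deltas[0] / deltas_weights[0]
    dy = deltas[1] / deltas_weights[1]
    d_log_w = deltas[2] / deltas_weights[2]
    d_log_h = deltas[3] / deltas_weights[3]

    # Log-size deltas are capped by max_delta_log_wh before exponentiation.
    half_w = 0.5 * math.exp(min(d_log_w, max_delta_log_wh))
    half_h = 0.5 * math.exp(min(d_log_h, max_delta_log_wh))

    x0_new = ctr_x + (dx - half_w) * box_w
    y0_new = ctr_y + (dy - half_h) * box_h
    x1_new = ctr_x + (dx + half_w) * box_w - 1.0
    y1_new = ctr_y + (dy + half_h) * box_h - 1.0
    return x0_new, y0_new, x1_new, y1_new
```

With all deltas equal to zero the function returns the input box unchanged, which is a quick sanity check of the `+ 1.0` / `- 1.0` terms.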
+**Attributes**:
+
+* *score_threshold*
+
+    * **Description**: The *score_threshold* attribute specifies a threshold to consider only detections whose scores are
+    larger than the threshold.
+    * **Range of values**: non-negative floating point number
+    * **Type**: float
+    * **Default value**: None
+    * **Required**: *yes*
+
+* *nms_threshold*
+
+    * **Description**: The *nms_threshold* attribute specifies a threshold to be used in the NMS stage.
+    * **Range of values**: non-negative floating point number
+    * **Type**: float
+    * **Default value**: None
+    * **Required**: *yes*
+
+* *num_classes*
+
+    * **Description**: The *num_classes* attribute specifies the number of detected classes.
+    * **Range of values**: non-negative integer number
+    * **Type**: int
+    * **Default value**: None
+    * **Required**: *yes*
+
+* *post_nms_count*
+
+    * **Description**: The *post_nms_count* attribute specifies the maximal number of detections per class.
+    * **Range of values**: non-negative integer number
+    * **Type**: int
+    * **Default value**: None
+    * **Required**: *yes*
+
+* *max_detections_per_image*
+
+    * **Description**: The *max_detections_per_image* attribute specifies the maximal number of detections per image.
+    * **Range of values**: non-negative integer number
+    * **Type**: int
+    * **Default value**: None
+    * **Required**: *yes*
+
+* *class_agnostic_box_regression*
+
+    * **Description**: The *class_agnostic_box_regression* attribute is a flag that specifies whether to delete background
+    classes or not.
+    * **Range of values**:
+      * `true` means background classes should be deleted
+      * `false` means background classes should not be deleted
+    * **Type**: boolean
+    * **Default value**: false
+    * **Required**: *no*
+
+* *max_delta_log_wh*
+
+    * **Description**: The *max_delta_log_wh* attribute specifies the maximal delta of logarithms for width and height.
+    * **Range of values**: floating point number
+    * **Type**: float
+    * **Default value**: None
+    * **Required**: *yes*
+
+* *deltas_weights*
+
+    * **Description**: The *deltas_weights* attribute specifies weights for bounding box size deltas.
+    * **Range of values**: a list of non-negative floating point numbers
+    * **Type**: float[]
+    * **Default value**: None
+    * **Required**: *yes*
+
+**Inputs**
+
+* **1**: A 2D tensor of type *T* with shape `[number_of_ROIs, 4]` providing the input ROIs as 4-tuples:
+[x<sub>1</sub>, y<sub>1</sub>, x<sub>2</sub>, y<sub>2</sub>]. The batch dimension of the first, second, and third inputs
+should be the same. **Required.**
+
+* **2**: A 2D tensor of type *T* with shape `[number_of_ROIs, num_classes * 4]` providing deltas for input boxes.
+ **Required.**
+
+* **3**: A 2D tensor of type *T* with shape `[number_of_ROIs, num_classes]` providing detection scores. **Required.**
+
+* **4**: A 2D tensor of type *T* with shape `[1, 3]` containing three elements
+ `[image_height, image_width, scale_height_and_width]` providing input image size info. **Required.**
+
+**Outputs**
+
+* **1**: A 2D tensor of type *T* with shape `[max_detections_per_image, 4]` providing the coordinates of detected boxes.
+
+* **2**: A 1D tensor of type *T_IND* with shape `[max_detections_per_image]` providing class indices.
+
+* **3**: A 1D tensor of type *T* with shape `[max_detections_per_image]` providing detection scores.
+
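Steps 4 to 6 of the detailed description, together with the fixed output shapes listed above, can be sketched in plain Python as shown below. This is a rough illustration under several assumptions, not the reference implementation: the names `iou`, `greedy_nms` and `select_detections` are hypothetical, the overlap measure reuses the `+ 1.0` pixel convention of `box_w` and `box_h`, a box is suppressed when its overlap with a kept box exceeds *nms_threshold*, and the unused tail of each output is filled with zeros purely as a stand-in for the undefined values mentioned in step 6.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes.

    Assumption: same "+ 1.0" pixel convention as box_w / box_h in the spec.
    """
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]) + 1.0)
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]) + 1.0)
    inter = iw * ih
    area_a = (a[2] - a[0] + 1.0) * (a[3] - a[1] + 1.0)
    area_b = (b[2] - b[0] + 1.0) * (b[3] - b[1] + 1.0)
    return inter / (area_a + area_b - inter)


def greedy_nms(dets, nms_threshold, post_nms_count):
    """Greedy NMS over (score, box) pairs; keeps at most post_nms_count."""
    kept = []
    for score, box in sorted(dets, key=lambda d: d[0], reverse=True):
        if len(kept) >= post_nms_count:
            break
        # Keep the box only if it does not overlap too much with a kept one.
        if all(iou(box, kept_box) <= nms_threshold for _, kept_box in kept):
            kept.append((score, box))
    return kept


def select_detections(dets_per_class, score_threshold, nms_threshold,
                      post_nms_count, max_detections_per_image):
    """dets_per_class maps class_idx -> list of (score, box) pairs."""
    found = []
    for class_idx, dets in dets_per_class.items():
        # Step 4: only scores larger than the threshold survive.
        confident = [d for d in dets if d[0] > score_threshold]
        # Step 5: class-wise NMS.
        for score, box in greedy_nms(confident, nms_threshold, post_nms_count):
            found.append((score, class_idx, box))
    # Step 6: keep the best max_detections_per_image detections overall.
    found.sort(key=lambda f: f[0], reverse=True)
    found = found[:max_detections_per_image]
    # Zeros here are a placeholder for the "undefined" tail of the outputs.
    pad = max_detections_per_image - len(found)
    boxes = [list(f[2]) for f in found] + [[0.0] * 4 for _ in range(pad)]
    classes = [f[1] for f in found] + [0] * pad
    scores = [f[0] for f in found] + [0.0] * pad
    return boxes, classes, scores
```

The three returned lists correspond to outputs 1, 2 and 3, each with exactly `max_detections_per_image` entries.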
+**Types**
+
+* *T*: any supported floating point type.
+
+* *T_IND*: `int64` or `int32`.
+
+
+**Example**
+
+```xml
+<layer ... type="ExperimentalDetectronDetectionOutput" version="opset6">
+    <data class_agnostic_box_regression="false" deltas_weights="10.0,10.0,5.0,5.0" max_delta_log_wh="4.135166645050049" max_detections_per_image="100" nms_threshold="0.5" num_classes="81" post_nms_count="2000" score_threshold="0.05000000074505806"/>
+    <input>
+        <port id="0">
+            <dim>1000</dim>
+            <dim>4</dim>
+        </port>
+        <port id="1">
+            <dim>1000</dim>
+            <dim>324</dim>
+        </port>
+        <port id="2">
+            <dim>1000</dim>
+            <dim>81</dim>
+        </port>
+        <port id="3">
+            <dim>1</dim>
+            <dim>3</dim>
+        </port>
+    </input>
+    <output>
+        <port id="4" precision="FP32">
+            <dim>100</dim>
+            <dim>4</dim>
+        </port>
+        <port id="5" precision="I32">
+            <dim>100</dim>
+        </port>
+        <port id="6" precision="FP32">
+            <dim>100</dim>
+        </port>
+    </output>
+</layer>
+```
diff --git a/docs/ops/detection/ExperimentalDetectronGenerateProposalsSingleImage_6.md b/docs/ops/detection/ExperimentalDetectronGenerateProposalsSingleImage_6.md
@@ -0,0 +1,112 @@
+## ExperimentalDetectronGenerateProposalsSingleImage <a name="ExperimentalDetectronGenerateProposalsSingleImage"></a> {#openvino_docs_ops_detection_ExperimentalDetectronGenerateProposalsSingleImage_6}
+
+**Versioned name**: *ExperimentalDetectronGenerateProposalsSingleImage-6*
+
+**Category**: Object detection
+
+**Short description**: The *ExperimentalDetectronGenerateProposalsSingleImage* operation computes ROIs and their scores
+based on input data.
+
+**Detailed description**: The operation performs the following steps:
+
+1.  Transposes and reshapes predicted bounding box deltas and scores to get them into the same order as the anchors.
+2.  Transforms anchors into proposals using deltas and clips proposals to an image.
+3.  Removes predicted boxes with either height or width < *min_size*.
+4.  Sorts all `(proposal, score)` pairs by score from highest to lowest; order of pairs with equal scores is undefined.
+5.  Takes the top *pre_nms_count* proposals; if the total number of proposals is less than *pre_nms_count*, takes all proposals.
+6.  Applies non-maximum suppression with *nms_threshold*.
+7.  Takes the top *post_nms_count* proposals and returns these proposals and their scores. If the total number of proposals
+is less than *post_nms_count*, the remaining elements of the output tensors are filled with zeroes.
+
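Steps 3 to 5 and step 7 above can be sketched in plain Python as follows. This is a minimal illustration, not the reference implementation: `select_proposals` and its `nms` hook are hypothetical names, the NMS of step 6 is left as an optional callable so the sketch stays short, the width and height are assumed to use a `+ 1.0` pixel convention, and zero padding of the tail reflects one reading of step 7.

```python
def select_proposals(proposals, scores, min_size, pre_nms_count,
                     post_nms_count, nms=None):
    """Filter, rank and pad proposals given as (x0, y0, x1, y1) boxes.

    `nms` is a hypothetical hook: a callable that takes and returns a list of
    (score, box) pairs sorted by descending score. None skips step 6.
    """
    pairs = []
    for box, score in zip(proposals, scores):
        # Step 3: drop boxes whose width or height is below min_size.
        # Assumption: sizes use the "+ 1.0" pixel convention.
        width = box[2] - box[0] + 1.0
        height = box[3] - box[1] + 1.0
        if width >= min_size and height >= min_size:
            pairs.append((score, box))

    # Step 4: sort by score, highest first (ties in unspecified order).
    pairs.sort(key=lambda p: p[0], reverse=True)
    # Step 5: keep at most pre_nms_count proposals.
    pairs = pairs[:pre_nms_count]
    # Step 6: optional NMS supplied by the caller.
    if nms is not None:
        pairs = nms(pairs)
    # Step 7: keep at most post_nms_count proposals and pad with zeros.
    pairs = pairs[:post_nms_count]
    missing = post_nms_count - len(pairs)
    rois = [list(box) for _, box in pairs] + [[0.0] * 4 for _ in range(missing)]
    roi_scores = [score for score, _ in pairs] + [0.0] * missing
    return rois, roi_scores
```

Both returned lists always have `post_nms_count` entries, matching the fixed output shapes of the operation.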
+**Attributes**:
+
+* *min_size*
+
+    * **Description**: The *min_size* attribute specifies the minimum box width and height.
+    * **Range of values**: non-negative floating point number
+    * **Type**: float
+    * **Default value**: None
+    * **Required**: *yes*
+
+* *nms_threshold*
+
+    * **Description**: The *nms_threshold* attribute specifies a threshold to be used in the NMS stage.
+    * **Range of values**: non-negative floating point number
+    * **Type**: float
+    * **Default value**: None
+    * **Required**: *yes*
+
+* *pre_nms_count*
+
+    * **Description**: The *pre_nms_count* attribute specifies the number of top-n proposals before NMS.
+    * **Range of values**: non-negative integer number
+    * **Type**: int
+    * **Default value**: None
+    * **Required**: *yes*
+
+* *post_nms_count*
+
+    * **Description**: The *post_nms_count* attribute specifies the number of top-n proposals after NMS.
+    * **Range of values**: non-negative integer number
+    * **Type**: int
+    * **Default value**: None
+    * **Required**: *yes*
+
+**Inputs**
+
+* **1**: A 1D tensor of type *T* with 3 elements `[image_height, image_width, scale_height_and_width]` providing input
+image size info. **Required.**
+
+* **2**: A 2D tensor of type *T* with shape `[height * width * number_of_channels, 4]` providing anchors. **Required.**
+
+* **3**: A 3D tensor of type *T* with shape `[number_of_channels * 4, height, width]` providing deltas for anchors.
+The height and width of the third and fourth inputs should be equal. **Required.**
+
+* **4**: A 3D tensor of type *T* with shape `[number_of_channels, height, width]` providing proposal scores.
+**Required.**
+
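The reordering described in step 1 of the detailed description, which brings the deltas of input 3 into the same row order as the anchors of input 2, might look like the sketch below. The exact element order is an assumption here (a transpose to `[height, width, number_of_channels * 4]` followed by a reshape to rows of 4, as is common in Detectron-style code); the spec text itself does not spell it out. The function name and the parameter `num_anchors`, which plays the role of `number_of_channels`, are hypothetical.

```python
def deltas_to_anchor_order(deltas, num_anchors, height, width):
    """Reorder deltas from [num_anchors * 4][height][width] to rows of 4.

    Assumed layout: row index = (h * width + w) * num_anchors + a, and the
    four deltas of anchor `a` live in channels a * 4 .. a * 4 + 3.
    """
    rows = []
    for h in range(height):
        for w in range(width):
            for a in range(num_anchors):
                rows.append([deltas[a * 4 + c][h][w] for c in range(4)])
    return rows
```

For the shapes used in the example of this specification (12 delta channels over a 50 x 84 grid, that is 3 anchors per position), the result has 50 * 84 * 3 = 12600 rows, which matches the first dimension of the anchors input.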
+**Outputs**
+
+* **1**: A 2D tensor of type *T* with shape `[post_nms_count, 4]` providing ROIs.
+
+* **2**: A 1D tensor of type *T* with shape `[post_nms_count]` providing ROI scores.
+
+**Types**
+
+* *T*: any supported floating point type.
+
+**Example**
+
+```xml
+<layer ... type="ExperimentalDetectronGenerateProposalsSingleImage" version="opset6">
+    <data min_size="0.0" nms_threshold="0.699999988079071" post_nms_count="1000" pre_nms_count="1000"/>
+    <input>
+        <port id="0">
+            <dim>3</dim>
+        </port>
+        <port id="1">
+            <dim>12600</dim>
+            <dim>4</dim>
+        </port>
+        <port id="2">
+            <dim>12</dim>
+            <dim>50</dim>
+            <dim>84</dim>
+        </port>
+        <port id="3">
+            <dim>3</dim>
+            <dim>50</dim>
+            <dim>84</dim>
+        </port>
+    </input>
+    <output>
+        <port id="4" precision="FP32">
+            <dim>1000</dim>
+            <dim>4</dim>
+        </port>
+        <port id="5" precision="FP32">
+            <dim>1000</dim>
+        </port>
+    </output>
+</layer>
+```