Add specifications for ExperimentalDetectron* operations #5128

Merged
5 changes: 5 additions & 0 deletions docs/doxygen/ie_docs.xml
@@ -135,6 +135,11 @@ limitations under the License.
<tab type="user" title="Equal-1" url="@ref openvino_docs_ops_comparison_Equal_1"/>
<tab type="user" title="Erf-1" url="@ref openvino_docs_ops_arithmetic_Erf_1"/>
<tab type="user" title="Exp-1" url="@ref openvino_docs_ops_activation_Exp_1"/>
<tab type="user" title="ExperimentalDetectronDetectionOutput-6" url="@ref openvino_docs_ops_detection_ExperimentalDetectronDetectionOutput_6"/>
<tab type="user" title="ExperimentalDetectronGenerateProposalsSingleImage-6" url="@ref openvino_docs_ops_detection_ExperimentalDetectronGenerateProposalsSingleImage_6"/>
<tab type="user" title="ExperimentalDetectronPriorGridGenerator-6" url="@ref openvino_docs_ops_detection_ExperimentalDetectronPriorGridGenerator_6"/>
<tab type="user" title="ExperimentalDetectronROIFeatureExtractor-6" url="@ref openvino_docs_ops_detection_ExperimentalDetectronROIFeatureExtractor_6"/>
<tab type="user" title="ExperimentalDetectronTopKROIs-6" url="@ref openvino_docs_ops_sort_ExperimentalDetectronTopKROIs_6"/>
<tab type="user" title="ExtractImagePatches-3" url="@ref openvino_docs_ops_movement_ExtractImagePatches_3"/>
<tab type="user" title="FakeQuantize-1" url="@ref openvino_docs_ops_quantization_FakeQuantize_1"/>
<tab type="user" title="FloorMod-1" url="@ref openvino_docs_ops_arithmetic_FloorMod_1"/>
193 changes: 193 additions & 0 deletions docs/ops/detection/ExperimentalDetectronDetectionOutput_6.md
@@ -0,0 +1,193 @@
## ExperimentalDetectronDetectionOutput <a name="ExperimentalDetectronDetectionOutput"></a> {#openvino_docs_ops_detection_ExperimentalDetectronDetectionOutput_6}

**Versioned name**: *ExperimentalDetectronDetectionOutput-6*

**Category**: Object detection

**Short description**: The *ExperimentalDetectronDetectionOutput* operation performs non-maximum suppression to generate
the detection output using information on location and score predictions.

**Detailed description**: The operation performs the following steps:

1. Applies deltas to the input boxes [x<sub>0</sub>, y<sub>0</sub>, x<sub>1</sub>, y<sub>1</sub>] and computes the coordinates of the
refined boxes according to the formulas:

`x0_new = ctr_x + (dx - 0.5 * exp(min(d_log_w, max_delta_log_wh))) * box_w`

`y0_new = ctr_y + (dy - 0.5 * exp(min(d_log_h, max_delta_log_wh))) * box_h`

`x1_new = ctr_x + (dx + 0.5 * exp(min(d_log_w, max_delta_log_wh))) * box_w - 1.0`

`y1_new = ctr_y + (dy + 0.5 * exp(min(d_log_h, max_delta_log_wh))) * box_h - 1.0`

* `box_w` and `box_h` are the width and height of the box, respectively:

`box_w = x1 - x0 + 1.0`

`box_h = y1 - y0 + 1.0`

* `ctr_x` and `ctr_y` are the coordinates of the box center:

`ctr_x = x0 + 0.5 * box_w`

`ctr_y = y0 + 0.5 * box_h`

* `dx`, `dy`, `d_log_w` and `d_log_h` are deltas calculated according to the formulas below, where `deltas_tensor` is the
second input:

`dx = deltas_tensor[roi_idx, 4 * class_idx + 0] / deltas_weights[0]`

`dy = deltas_tensor[roi_idx, 4 * class_idx + 1] / deltas_weights[1]`

`d_log_w = deltas_tensor[roi_idx, 4 * class_idx + 2] / deltas_weights[2]`

`d_log_h = deltas_tensor[roi_idx, 4 * class_idx + 3] / deltas_weights[3]`

2. If *class_agnostic_box_regression* is `true`, removes predictions for background classes.
3. Clips boxes to the image.
4. Applies *score_threshold* to detection scores.
5. Applies non-maximum suppression class-wise with *nms_threshold* and returns *post_nms_count* or fewer detections per
class.
6. Returns *max_detections_per_image* detections if the total number of detections exceeds *max_detections_per_image*;
otherwise, returns all detections, and the remaining elements of the output tensors are filled with undefined values.
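
The box refinement in step 1 can be sketched in plain Python as follows. This is only an illustration of the formulas above, not the plugin implementation: the function name `refine_box` and its argument layout are hypothetical, and `deltas` is assumed to be the four values `deltas_tensor[roi_idx, 4 * class_idx + 0..3]` for one ROI and one class.

```python
import math


def refine_box(box, deltas, deltas_weights, max_delta_log_wh):
    """Apply one set of four deltas to one [x0, y0, x1, y1] box (hypothetical helper)."""
    x0, y0, x1, y1 = box

    # Box size and center, using the "+ 1.0" convention from the formulas.
    box_w = x1 - x0 + 1.0
    box_h = y1 - y0 + 1.0
    ctr_x = x0 + 0.5 * box_w
    ctr_y = y0 + 0.5 * box_h

    # Raw deltas are divided by the corresponding deltas_weights entries.
    dx = deltas[0] / deltas_weights[0]
    dy = deltas[1] / deltas_weights[1]
    d_log_w = deltas[2] / deltas_weights[2]
    d_log_h = deltas[3] / deltas_weights[3]

    # Log-size deltas are clamped from above by max_delta_log_wh before exp().
    half_w = 0.5 * math.exp(min(d_log_w, max_delta_log_wh))
    half_h = 0.5 * math.exp(min(d_log_h, max_delta_log_wh))

    x0_new = ctr_x + (dx - half_w) * box_w
    y0_new = ctr_y + (dy - half_h) * box_h
    x1_new = ctr_x + (dx + half_w) * box_w - 1.0
    y1_new = ctr_y + (dy + half_h) * box_h - 1.0
    return [x0_new, y0_new, x1_new, y1_new]
```

With all-zero deltas the function returns the input box unchanged, which is a quick sanity check that the `+ 1.0` and `- 1.0` terms cancel as intended.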

**Attributes**:

* *score_threshold*

* **Description**: The *score_threshold* attribute specifies a threshold; only detections whose scores are
larger than the threshold are considered.
* **Range of values**: non-negative floating point number
* **Type**: float
* **Default value**: None
* **Required**: *yes*

* *nms_threshold*

* **Description**: The *nms_threshold* attribute specifies a threshold to be used in the NMS stage.
* **Range of values**: non-negative floating point number
* **Type**: float
* **Default value**: None
* **Required**: *yes*

* *num_classes*

* **Description**: The *num_classes* attribute specifies the number of detected classes.
* **Range of values**: non-negative integer number
* **Type**: int
* **Default value**: None
* **Required**: *yes*

* *post_nms_count*

* **Description**: The *post_nms_count* attribute specifies the maximal number of detections per class.
* **Range of values**: non-negative integer number
* **Type**: int
* **Default value**: None
* **Required**: *yes*

* *max_detections_per_image*

* **Description**: The *max_detections_per_image* attribute specifies the maximal number of detections per image.
* **Range of values**: non-negative integer number
* **Type**: int
* **Default value**: None
* **Required**: *yes*

* *class_agnostic_box_regression*

* **Description**: The *class_agnostic_box_regression* attribute is a flag that specifies whether to delete background
classes or not.
* **Range of values**:
* `true` means background classes should be deleted
* `false` means background classes should not be deleted
* **Type**: boolean
* **Default value**: false
* **Required**: *no*

* *max_delta_log_wh*

* **Description**: The *max_delta_log_wh* attribute specifies the maximal delta of the logarithms of width and height.
* **Range of values**: floating point number
* **Type**: float
* **Default value**: None
* **Required**: *yes*

* *deltas_weights*

* **Description**: The *deltas_weights* attribute specifies the weights by which the bounding box deltas are divided.
* **Range of values**: a list of non-negative floating point numbers
* **Type**: float[]
* **Default value**: None
* **Required**: *yes*

**Inputs**

* **1**: A 2D tensor of type *T* with input ROIs, with shape `[number_of_ROIs, 4]` providing the ROIs as 4-tuples:
[x<sub>0</sub>, y<sub>0</sub>, x<sub>1</sub>, y<sub>1</sub>]. The first dimension (`number_of_ROIs`) of the first, second, and third inputs
must be the same. **Required.**

* **2**: A 2D tensor of type *T* with shape `[number_of_ROIs, num_classes * 4]` providing deltas for input boxes.
**Required.**

* **3**: A 2D tensor of type *T* with shape `[number_of_ROIs, num_classes]` providing detections scores. **Required.**

* **4**: A 2D tensor of type *T* with shape `[1, 3]` containing three elements
`[image_height, image_width, scale_height_and_width]` that provide input image size info. **Required.**
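
The shape constraints between the four inputs can be checked with a short Python sketch. The helper name `check_detection_output_shapes` is hypothetical and is not part of any OpenVINO API; it only restates the shapes listed above.

```python
def check_detection_output_shapes(rois, deltas, scores, im_info, num_classes):
    """Return True if the four input shapes agree with the Inputs list (hypothetical helper)."""
    # Input 1: [number_of_ROIs, 4]
    if len(rois) != 2 or rois[1] != 4:
        return False
    number_of_rois = rois[0]
    return (
        # Input 2: four deltas per class for every ROI.
        tuple(deltas) == (number_of_rois, num_classes * 4)
        # Input 3: one score per class for every ROI.
        and tuple(scores) == (number_of_rois, num_classes)
        # Input 4: [image_height, image_width, scale_height_and_width].
        and tuple(im_info) == (1, 3)
    )
```

For instance, with 1000 ROIs and `num_classes` equal to 81, the second input must have `81 * 4 = 324` columns.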

**Outputs**

* **1**: A 2D tensor of type *T* with shape `[max_detections_per_image, 4]` providing boxes indices.

* **2**: A 1D tensor of type *T_IND* with shape `[max_detections_per_image]` providing classes indices.

* **3**: A 1D tensor of type *T* with shape `[max_detections_per_image]` providing scores indices.

**Types**

* *T*: any supported floating point type.

* *T_IND*: `int64` or `int32`.


**Example**

```xml
<layer ... type="ExperimentalDetectronDetectionOutput" version="opset6">
    <data class_agnostic_box_regression="false" deltas_weights="10.0,10.0,5.0,5.0" max_delta_log_wh="4.135166645050049" max_detections_per_image="100" nms_threshold="0.5" num_classes="81" post_nms_count="2000" score_threshold="0.05000000074505806"/>
    <input>
        <port id="0">
            <dim>1000</dim>
            <dim>4</dim>
        </port>
        <port id="1">
            <dim>1000</dim>
            <dim>324</dim>
        </port>
        <port id="2">
            <dim>1000</dim>
            <dim>81</dim>
        </port>
        <port id="3">
            <dim>1</dim>
            <dim>3</dim>
        </port>
    </input>
    <output>
        <port id="4" precision="FP32">
            <dim>100</dim>
            <dim>4</dim>
        </port>
        <port id="5" precision="I32">
            <dim>100</dim>
        </port>
        <port id="6" precision="FP32">
            <dim>100</dim>
        </port>
        <port id="7" precision="I32">
            <dim>100</dim>
        </port>
    </output>
</layer>
```
@@ -0,0 +1,112 @@
## ExperimentalDetectronGenerateProposalsSingleImage <a name="ExperimentalDetectronGenerateProposalsSingleImage"></a> {#openvino_docs_ops_detection_ExperimentalDetectronGenerateProposalsSingleImage_6}

**Versioned name**: *ExperimentalDetectronGenerateProposalsSingleImage-6*

**Category**: Object detection

**Short description**: The *ExperimentalDetectronGenerateProposalsSingleImage* operation computes ROIs and their scores
based on input data.

**Detailed description**: The operation performs the following steps:

1. Transposes and reshapes predicted bounding boxes deltas and scores to get them into the same order as the anchors.
2. Transforms anchors into proposals using deltas and clips proposals to an image.
3. Removes predicted boxes with either height or width < *min_size*.
4. Sorts all `(proposal, score)` pairs by score from highest to lowest; order of pairs with equal scores is undefined.
5. Takes the top *pre_nms_count* proposals; if the total number of proposals is less than *pre_nms_count*, takes all proposals.
6. Applies non-maximum suppression with *nms_threshold*.
7. Takes the top *post_nms_count* proposals and returns these top proposals and their scores. If the total number of proposals
is less than *post_nms_count*, returns output tensors filled with zeroes.
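
Steps 3 to 7 can be sketched in plain Python as shown below. This is a simplified illustration under stated assumptions, not the actual kernel: the names `iou` and `select_proposals` are hypothetical, boxes are assumed to be `[x0, y0, x1, y1]` lists that are already decoded and clipped, sizes are assumed to follow a `x1 - x0 + 1.0` pixel convention, and NMS is assumed to be the usual greedy variant. The sketch returns only the kept proposals and does not fill the output up to *post_nms_count*.

```python
def iou(a, b):
    """Intersection over union of two [x0, y0, x1, y1] boxes (assumes the +1.0 size convention)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]) + 1.0)
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]) + 1.0)
    inter = inter_w * inter_h
    area_a = (a[2] - a[0] + 1.0) * (a[3] - a[1] + 1.0)
    area_b = (b[2] - b[0] + 1.0) * (b[3] - b[1] + 1.0)
    union = area_a + area_b - inter
    return inter / union if union > 0.0 else 0.0


def select_proposals(proposals, scores, min_size, pre_nms_count,
                     nms_threshold, post_nms_count):
    """Filter, sort, and suppress proposals (hypothetical helper)."""
    # Step 3: drop boxes whose width or height is below min_size.
    pairs = [
        (p, s) for p, s in zip(proposals, scores)
        if p[2] - p[0] + 1.0 >= min_size and p[3] - p[1] + 1.0 >= min_size
    ]
    # Step 4: sort by score, highest first.
    pairs.sort(key=lambda pair: pair[1], reverse=True)
    # Step 5: keep at most pre_nms_count candidates.
    pairs = pairs[:pre_nms_count]
    # Step 6: greedy NMS; keep a box only if it does not overlap
    # any already kept box by more than nms_threshold.
    kept = []
    for p, s in pairs:
        if all(iou(p, q) <= nms_threshold for q, _ in kept):
            kept.append((p, s))
    # Step 7: keep at most post_nms_count survivors.
    kept = kept[:post_nms_count]
    return [p for p, _ in kept], [s for _, s in kept]
```

Keeping boxes and scores paired through every step avoids index bookkeeping, at the cost of some speed compared with an array-based implementation.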

**Attributes**:

* *min_size*

* **Description**: The *min_size* attribute specifies minimum box width and height.
* **Range of values**: non-negative floating point number
* **Type**: float
* **Default value**: None
* **Required**: *yes*

* *nms_threshold*

* **Description**: The *nms_threshold* attribute specifies the threshold to be used in the NMS stage.
* **Range of values**: non-negative floating point number
* **Type**: float
* **Default value**: None
* **Required**: *yes*

* *pre_nms_count*

* **Description**: The *pre_nms_count* attribute specifies number of top-n proposals before NMS.
* **Range of values**: non-negative integer number
* **Type**: int
* **Default value**: None
* **Required**: *yes*

* *post_nms_count*

* **Description**: The *post_nms_count* attribute specifies number of top-n proposals after NMS.
* **Range of values**: non-negative integer number
* **Type**: int
* **Default value**: None
* **Required**: *yes*

**Inputs**

* **1**: A 1D tensor of type *T* with 3 elements `[image_height, image_width, scale_height_and_width]` providing input
image size info. **Required.**

* **2**: A 2D tensor of type *T* with shape `[height * width * number_of_channels, 4]` providing anchors. **Required.**

* **3**: A 3D tensor of type *T* with shape `[number_of_channels * 4, height, width]` providing deltas for anchors.
The height and width of the third and fourth inputs must be equal. **Required.**

* **4**: A 3D tensor of type *T* with shape `[number_of_channels, height, width]` providing proposals scores.
**Required.**
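
The relationship between the four input shapes can be verified with a short Python sketch. The helper name `check_generate_proposals_shapes` is hypothetical; it only encodes the shapes listed above, taking `number_of_channels`, `height`, and `width` from the fourth input.

```python
def check_generate_proposals_shapes(im_info, anchors, deltas, scores):
    """Return True if the four input shapes agree with the Inputs list (hypothetical helper)."""
    # Inputs 3 and 4 must both be 3D.
    if len(deltas) != 3 or len(scores) != 3:
        return False
    number_of_channels, height, width = scores
    return (
        # Input 1: [image_height, image_width, scale_height_and_width].
        tuple(im_info) == (3,)
        # Input 2: one anchor per channel and spatial position.
        and tuple(anchors) == (height * width * number_of_channels, 4)
        # Input 3: four deltas per anchor, same height and width as the scores.
        and tuple(deltas) == (number_of_channels * 4, height, width)
    )
```

For example, scores of shape `[3, 50, 84]` imply `50 * 84 * 3 = 12600` anchors and a deltas tensor of shape `[12, 50, 84]`.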

**Outputs**

* **1**: A 2D tensor of type *T* with shape `[post_nms_count, 4]` providing ROIs.

* **2**: A 1D tensor of type *T* with shape `[post_nms_count]` providing ROIs scores.

**Types**

* *T*: any supported floating point type.

**Example**

```xml
<layer ... type="ExperimentalDetectronGenerateProposalsSingleImage" version="opset6">
    <data min_size="0.0" nms_threshold="0.699999988079071" post_nms_count="1000" pre_nms_count="1000"/>
    <input>
        <port id="0">
            <dim>3</dim>
        </port>
        <port id="1">
            <dim>12600</dim>
            <dim>4</dim>
        </port>
        <port id="2">
            <dim>12</dim>
            <dim>50</dim>
            <dim>84</dim>
        </port>
        <port id="3">
            <dim>3</dim>
            <dim>50</dim>
            <dim>84</dim>
        </port>
    </input>
    <output>
        <port id="4" precision="FP32">
            <dim>1000</dim>
            <dim>4</dim>
        </port>
        <port id="5" precision="FP32">
            <dim>1000</dim>
        </port>
    </output>
</layer>
```