Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[RetinaNet] Image Converter and ObjectDetector (#1906)
* Rebased phase 1 changes
* Rebased phase 1 changes
* nit
* Retina Phase 2
* nit
* Expose Anchor Generator as layer, docstring correction and test correction
* nit
* Add missing args for prediction heads
* - Use FeaturePyramidBackbone cls for RetinaNet backbone.
  - Correct test cases.
* fix decoding error
* - Add ground truth arg for RetinaNet model and remove source and target format from preprocessor
* nit
* Subclass Imageconverter and overload call method for object detection method
* Revert "Subclass Imageconverter and overload call method for object detection method"
  This reverts commit 3b26d3a.
* add names to layers
* correct fpn coarser level as per torch retinanet model
* nit
* Polish Prediction head and fpn layers to include flags and norm layers
* nit
* nit
* add prior probability flag for prediction head to use it for classification head and user friendly
* compute_shape seems redudant here and correct layers for channels_first
* keep compute_output_shape for fpn
* nit
* Change AnchorGen Implementation as per torch
* correct the source format of anchors format
* use plain rescaling and normalization no resizing for od models as it can effect the bounding boxes and the ops i backend framework dependent
* use single bbox format for model
* - Add arg for encoding format
  - Add required docstrings
  - Use `center_xywh` encoding for retinanet as per torch weights
* make anchor generator optional
* init as layers for anchor generator and label encoder and as one more arg for prediction head configuration
* nit
* - only consider levels from min level to backbone maxlevel fro feature extraction from image encoder
* nit
* nit
* update resizing as per new keras3 resizing layer for bboxes
* Revert "update resizing as per new keras3 resizing layer for bboxes"
  This reverts commit eb555ca.
* Add TODO's for keras bounding box ops
* Use keras layers to rescale and normalize
* check with plain values
* use convert_preprocessing_inputs function for basic operations as backend cause some gpu misplacement
* use keras for init variables
* modify task test for cases when test runs on gpu
* modify the order of steps
* fix tensor device placement error for torch backend
* this should fix error while image size is give and not given cases
* use numpy arrays
* make `yxyx` as default bbox format and some nit
* use image_size argument so that we dont break presets
* Add retinanet_resnet50_fpn_coco preset
* register retinanet presets
Showing 24 changed files with 1,504 additions and 230 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
# TODO: Once all bounding boxes are moved to the Keras repository, remove the | ||
# bounding box folder. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
import keras | ||
|
||
from keras_hub.src.api_export import keras_hub_export | ||
from keras_hub.src.models.task import Task | ||
|
||
|
||
@keras_hub_export("keras_hub.models.ImageObjectDetector") | ||
class ImageObjectDetector(Task): | ||
"""Base class for all image object detection tasks. | ||
The `ImageObjectDetector` tasks wrap a `keras_hub.models.Backbone` and | ||
a `keras_hub.models.Preprocessor` to create a model that can be used for | ||
object detection. `ImageObjectDetector` tasks take an additional | ||
`num_classes` argument, controlling the number of predicted output classes. | ||
To fine-tune with `fit()`, pass a dataset containing tuples of `(x, y)` | ||
where `x` is an image and `y` is a dictionary with `boxes` and | ||
`classes`. | ||
All `ImageObjectDetector` tasks include a `from_preset()` constructor which | ||
can be used to load a pre-trained config and weights. | ||
""" | ||
|
||
def compile( | ||
self, | ||
optimizer="auto", | ||
box_loss="auto", | ||
classification_loss="auto", | ||
metrics=None, | ||
**kwargs, | ||
): | ||
"""Configures the `ImageObjectDetector` task for training. | ||
The `ImageObjectDetector` task extends the default compilation signature of | ||
`keras.Model.compile` with defaults for `optimizer`, `box_loss`, | ||
`classification_loss`, and `metrics`. To override these defaults, pass any | ||
value to these arguments during compilation. | ||
Args: | ||
optimizer: `"auto"`, an optimizer name, or a `keras.Optimizer` | ||
instance. Defaults to `"auto"`, which uses the default optimizer | ||
for the given model and task. See `keras.Model.compile` and | ||
`keras.optimizers` for more info on possible `optimizer` values. | ||
box_loss: `"auto"`, a loss name, or a `keras.losses.Loss` instance. | ||
Defaults to `"auto"`, where a | ||
`keras.losses.Huber` loss will be | ||
applied for the object detector task. See | ||
`keras.Model.compile` and `keras.losses` for more info on | ||
possible `loss` values. | ||
classification_loss: `"auto"`, a loss name, or a `keras.losses.Loss` | ||
instance. Defaults to `"auto"`, where a | ||
`keras.losses.BinaryFocalCrossentropy` loss will be | ||
applied for the object detector task. See | ||
`keras.Model.compile` and `keras.losses` for more info on | ||
possible `loss` values. | ||
metrics: a list of metrics to be evaluated by | ||
the model during training and testing. Defaults to `None`. | ||
Custom metrics are not yet supported; passing any other value | ||
raises a `ValueError`. See `keras.Model.compile` and | ||
`keras.metrics` for more info on possible `metrics` values. | ||
**kwargs: See `keras.Model.compile` for a full list of arguments | ||
supported by the compile method. | ||
""" | ||
if optimizer == "auto": | ||
optimizer = keras.optimizers.Adam(5e-5) | ||
if box_loss == "auto": | ||
box_loss = keras.losses.Huber(reduction="sum") | ||
if classification_loss == "auto": | ||
activation = getattr(self, "activation", None) | ||
activation = keras.activations.get(activation) | ||
from_logits = activation != keras.activations.sigmoid | ||
classification_loss = keras.losses.BinaryFocalCrossentropy( | ||
from_logits=from_logits, reduction="sum" | ||
) | ||
if metrics is not None: | ||
raise ValueError("User metrics not yet supported") | ||
|
||
losses = { | ||
"bbox_regression": box_loss, | ||
"cls_logits": classification_loss, | ||
} | ||
|
||
super().compile( | ||
optimizer=optimizer, | ||
loss=losses, | ||
metrics=metrics, | ||
**kwargs, | ||
) |
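The two `"auto"` losses above can be illustrated with a small plain-Python sketch. It is not the Keras implementation: `huber` and `binary_focal_crossentropy` are hypothetical helpers written for this note, they work on single values rather than tensors, and they skip details such as epsilon clipping and per-axis averaging. The sketch assumes the Keras defaults `delta=1.0` and `gamma=2.0` with no class balancing, and it sums the per-element values to mirror `reduction="sum"`.

```python
import math


def huber(error, delta=1.0):
    """Element-wise Huber loss: quadratic near zero, linear in the tails."""
    abs_error = abs(error)
    if abs_error <= delta:
        return 0.5 * error * error
    return delta * (abs_error - 0.5 * delta)


def binary_focal_crossentropy(y_true, y_pred, gamma=2.0, from_logits=False):
    """Element-wise focal loss for one binary label (no class balancing)."""
    if from_logits:
        # Mirror the `from_logits` flag: apply a sigmoid to raw scores first.
        y_pred = 1.0 / (1.0 + math.exp(-y_pred))
    # p_t is the probability the model assigns to the true class.
    p_t = y_pred if y_true == 1 else 1.0 - y_pred
    # The (1 - p_t) ** gamma factor down-weights easy, well-classified examples.
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Box regression errors for two coordinates, summed like reduction="sum".
box_errors = [0.5, -2.0]
box_loss = sum(huber(e) for e in box_errors)

# Two class scores given as logits, summed like reduction="sum".
labels = [1, 0]
logits = [0.0, 0.0]
cls_loss = sum(
    binary_focal_crossentropy(t, z, from_logits=True)
    for t, z in zip(labels, logits)
)
```

In the commit, `from_logits` is only `False` when the task's `activation` resolves to a sigmoid, so a head that emits raw scores gets the sigmoid applied inside the loss instead.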
57 changes: 57 additions & 0 deletions
keras_hub/src/models/image_object_detector_preprocessor.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
import keras | ||
|
||
from keras_hub.src.api_export import keras_hub_export | ||
from keras_hub.src.models.preprocessor import Preprocessor | ||
from keras_hub.src.utils.tensor_utils import preprocessing_function | ||
|
||
|
||
@keras_hub_export("keras_hub.models.ImageObjectDetectorPreprocessor") | ||
class ImageObjectDetectorPreprocessor(Preprocessor): | ||
"""Base class for object detector preprocessing layers. | ||
`ImageObjectDetectorPreprocessor` wraps a | ||
`keras_hub.layers.ImageConverter` to create a preprocessing layer for | ||
object detection tasks. It is intended to be paired with a | ||
`keras_hub.models.ImageObjectDetector` task. | ||
All `ImageObjectDetectorPreprocessor` layers take three inputs: `x`, `y`, and | ||
`sample_weight`. `x`, the first input, should always be included. It can | ||
be an image or a batch of images. See examples below. `y` and `sample_weight` | ||
are optional inputs that will be passed through unaltered. Usually, `y` will | ||
be a dict of `{"boxes": Tensor(batch_size, num_boxes, 4), | ||
"classes": Tensor(batch_size, num_boxes)}`. | ||
The layer returns either `x`, an `(x, y)` tuple if labels were provided, | ||
or an `(x, y, sample_weight)` tuple if labels and sample weight were | ||
provided. `x` will be the input images after all model preprocessing has | ||
been applied. | ||
All `ImageObjectDetectorPreprocessor` tasks include a `from_preset()` | ||
constructor which can be used to load a pre-trained config and vocabularies. | ||
You can call the `from_preset()` constructor directly on this base class, in | ||
which case the correct class for your model will be automatically | ||
instantiated. | ||
Args: | ||
image_converter: Preprocessing pipeline for images. | ||
Examples: | ||
```python | ||
preprocessor = keras_hub.models.ImageObjectDetectorPreprocessor.from_preset( | ||
"retinanet_resnet50", | ||
) | ||
``` | ||
""" | ||
|
||
def __init__( | ||
self, | ||
image_converter=None, | ||
**kwargs, | ||
): | ||
super().__init__(**kwargs) | ||
self.image_converter = image_converter | ||
|
||
@preprocessing_function | ||
def call(self, x, y=None, sample_weight=None): | ||
if self.image_converter: | ||
x = self.image_converter(x) | ||
return keras.utils.pack_x_y_sample_weight(x, y, sample_weight) |
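The pass-through behaviour described in the docstring can be sketched without Keras. This is a simplified stand-in, not the library code: `pack_outputs`, `preprocess`, and `rescale` are hypothetical names, images are nested lists instead of tensors, and the packing helper only covers the three cases the docstring lists.

```python
def pack_outputs(x, y=None, sample_weight=None):
    """Hypothetical stand-in for the packing step: drop trailing `None`s."""
    if y is None:
        return x
    if sample_weight is None:
        return (x, y)
    return (x, y, sample_weight)


def preprocess(x, y=None, sample_weight=None, image_converter=None):
    """Convert the images only; labels and weights pass through untouched."""
    if image_converter is not None:
        x = image_converter(x)
    return pack_outputs(x, y, sample_weight)


def rescale(image):
    """Toy image converter: map 0..255 pixel values to 0..1."""
    return [[pixel / 255.0 for pixel in row] for row in image]


image = [[0, 255], [255, 0]]
labels = {"boxes": [[0.0, 0.0, 1.0, 1.0]], "classes": [3]}

only_x = preprocess(image, image_converter=rescale)
with_labels = preprocess(image, labels, image_converter=rescale)
with_weights = preprocess(image, labels, [1.0], image_converter=rescale)
```

Because only `x` is transformed, any converter that changes image geometry would leave the boxes in `y` out of sync, which matches the commit's choice of plain rescaling and normalization without resizing.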
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
from keras_hub.src.models.retinanet.retinanet_backbone import RetinaNetBackbone | ||
from keras_hub.src.models.retinanet.retinanet_presets import backbone_presets | ||
from keras_hub.src.utils.preset_utils import register_presets | ||
|
||
register_presets(backbone_presets, RetinaNetBackbone) |