mobilenet_v3 added in keras-nlp #1782

Merged
merged 29 commits into from
Aug 28, 2024

Conversation

ushareng
Collaborator

No description provided.

Member

@mattdangerw mattdangerw left a comment

Thanks!

Collaborator

@divyashreepathihalli divyashreepathihalli left a comment

Thanks for the PR Usha! I left a few comments.
One question I have: can we consolidate the MobileNet architectures into a generic backbone, like we did for ResNet here - https://github.com/keras-team/keras-nlp/blob/keras-hub/keras_nlp/src/models/resnet/resnet_backbone.py
I found this blog post on the MobileNet versions - https://medium.com/@pandrii000/mobilenet-architectures-17fe7406d794
The pattern we are following for KerasHub is: if the model architectures are only slightly different and can be tweaked with a few args, we add them that way.
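
A minimal sketch of the "one backbone, a few args" pattern described in this comment. `describe_blocks` and all of its parameter names are hypothetical and exist only for illustration; they are not the KerasHub API, and the values below are made up rather than real MobileNet presets.

```python
def describe_blocks(stackwise_filters, stackwise_expansion, use_squeeze_excite, activation):
    """Return a plain description of each block a generic backbone would build.

    All argument names are hypothetical. The point is only that variants of
    one architecture family can share a single code path.
    """
    if len(stackwise_filters) != len(stackwise_expansion):
        raise ValueError("stackwise_* lists must have the same length")
    blocks = []
    for filters, expansion in zip(stackwise_filters, stackwise_expansion):
        blocks.append(
            {
                "filters": filters,
                "expansion": expansion,
                "squeeze_excite": use_squeeze_excite,
                "activation": activation,
            }
        )
    return blocks


# Two variants differ only in the arguments passed (illustrative values).
variant_a = describe_blocks([16, 24], [1, 6], use_squeeze_excite=False, activation="relu6")
variant_b = describe_blocks([16, 24], [1, 4], use_squeeze_excite=True, activation="hard_swish")
```

Under this assumption, adding another variant means adding another set of arguments, not another backbone class.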

@ushareng ushareng self-assigned this Aug 19, 2024
@ushareng
Collaborator Author

Hi @divyashreepathihalli, I went through the MobileNetV2 architecture here: https://github.com/keras-team/keras/blob/v3.5.0/keras/src/applications/mobilenet_v2.py
It can be integrated with V3. I would like to clarify one point: in V2, the number of layers and the corresponding parameters are currently hardcoded. Should they remain unchanged, or should we generalize the approach so that these values are derived from parameters, as in V3?

@mattdangerw
Member

@ushareng we should allow for the parameters to be constructor arguments reflected in the backbone's config.

Stuff like these lines should be an anti-pattern, at least for this port. One way to think about it is to imagine browsing the config.json online: https://www.kaggle.com/models/keras/bert?select=config.json It should tell you a lot about the structure of the model itself: number of layers, number of filters at different places in the model, etc.
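
To illustrate the point about constructor arguments being reflected in the config, here is a small sketch in plain Python. `ToyBackbone` is a hypothetical stand-in and not the KerasHub implementation; the argument names other than `image_shape` and the values are assumptions chosen for the example.

```python
import json


class ToyBackbone:
    """Hypothetical stand-in for a backbone; not the KerasHub implementation."""

    def __init__(self, stackwise_num_blocks, stackwise_filters, image_shape=(224, 224, 3)):
        # Every structural choice arrives as a constructor argument,
        # so nothing about the architecture is hidden in hardcoded tables.
        self.stackwise_num_blocks = list(stackwise_num_blocks)
        self.stackwise_filters = list(stackwise_filters)
        self.image_shape = tuple(image_shape)

    def get_config(self):
        # The config mirrors the constructor arguments one-to-one.
        return {
            "stackwise_num_blocks": self.stackwise_num_blocks,
            "stackwise_filters": self.stackwise_filters,
            "image_shape": list(self.image_shape),
        }

    @classmethod
    def from_config(cls, config):
        return cls(**config)


backbone = ToyBackbone(stackwise_num_blocks=[1, 2, 3], stackwise_filters=[16, 24, 40])
config_json = json.dumps(backbone.get_config(), indent=2)
restored = ToyBackbone.from_config(json.loads(config_json))
```

Someone reading `config_json` can see the number of stacks and the filters per stack without opening the source, which is the property the comment asks for.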

Member

@mattdangerw mattdangerw left a comment

Thanks! We need to decompose v2 vs v3 into a series of simple arguments to the model. Left some suggestions on how that might work.

Member

@mattdangerw mattdangerw left a comment

Thanks! Some more comments

@mattdangerw mattdangerw added the kokoro:force-run Runs Tests on GPU label Aug 23, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Aug 23, 2024
Member

@mattdangerw mattdangerw left a comment

looking much better! just some minor comments

output_filter: specifies whether to add conv and batch_norm in the end,
if set to None, it will not add these layers in the end.
'None' for MobileNetV1
activation: activation function to be used
Member

maybe let's do

input_filters
input_activation
output_filters
output_activation

Otherwise what this activation refers to is very unclear.

Collaborator Author

We are using the same activation function in the input and output.

Member

I don't think the duplication would hurt anything. We want these to be understandable to end users. If I see...

  • input_filters
  • stackwise_activation
  • activation
  • output_filters

I wouldn't be able to tell you what activation does. Nor does the docstring clear it up.

If I see...

  • input_filters
  • input_activation
  • stackwise_activation
  • stackwise_...
  • output_filters
  • output_activation

I would have a good guess at what is going on, even before reading the docstring.
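
A rough sketch of the naming argument made in this comment. `BackboneArgs` is a hypothetical container, not the merged constructor, and the values are illustrative; only the field names come from the lists in the comment.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class BackboneArgs:
    """Hypothetical argument container; only the field names follow the review."""

    input_filters: int
    input_activation: str  # activation of the input (stem) layers
    stackwise_activation: List[str]  # one activation per stack
    output_filters: Optional[int]  # None means no final conv and batch norm
    output_activation: str  # activation of the output layers


# The same value is passed twice. The duplication costs one extra argument,
# but each name says which part of the model it configures.
shared = "hard_swish"
args = BackboneArgs(
    input_filters=16,
    input_activation=shared,
    stackwise_activation=["relu", "relu", shared],
    output_filters=96,
    output_activation=shared,
)
config = asdict(args)
```

With a single `activation` field, the reader of `config` would have to check the source to learn where it applies; with the prefixed names the dictionary is self-describing.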

@mattdangerw mattdangerw added the kokoro:force-run Runs Tests on GPU label Aug 26, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Aug 26, 2024
Member

@mattdangerw mattdangerw left a comment

LGTM! Just some final small renaming comments.

Collaborator

@divyashreepathihalli divyashreepathihalli left a comment

LGTM

@divyashreepathihalli divyashreepathihalli added the kokoro:force-run Runs Tests on GPU label Aug 27, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Aug 27, 2024
Member

@mattdangerw mattdangerw left a comment

lgtm! Thanks for all the work on this.

@mattdangerw mattdangerw merged commit 9289ab7 into keras-team:keras-hub Aug 28, 2024
10 checks passed
@sachinprasadhs sachinprasadhs linked an issue Sep 3, 2024 that may be closed by this pull request
@divyashreepathihalli
Collaborator

@ushareng Thanks for contributing this model to KerasHub. Can you please also provide a weights conversion script for this model? We will not be able to publish the model without the weights and presets added on Kaggle.

mattdangerw pushed a commit to mattdangerw/keras-hub that referenced this pull request Sep 10, 2024
* mobilenet_v3 added in keras-nlp

* minor bug fixed in mobilenet_v3_backbone

* formatting corrected

* refactoring backbone

* correct_pad_downsample method added

* refactoring backbone

* parameters updated

* Testcaseupdated, expected output shape corrected

* code formatted with black

* testcase updated

* refactoring and description added

* comments updated

* added mobilenet v1 and v2

* merge conflict resolved

* version arg removed, and config options added

* input_shape changed to image_shape in arg

* config updated

* input shape corrected

* comments resolved

* activation function format changed

* minor bug fixed

* minor bug fixed

* added vision_backbone_test

* channel_first bug resolved

* channel_first cases working

* comments  resolved

* formatting fixed

* refactoring

---------

Co-authored-by: ushareng <[email protected]>
mattdangerw pushed a commit that referenced this pull request Sep 11, 2024
mattdangerw pushed a commit that referenced this pull request Sep 13, 2024
mattdangerw pushed a commit that referenced this pull request Sep 17, 2024
divyashreepathihalli added a commit that referenced this pull request Sep 25, 2024
* Add VGG16 backbone (#1737)

* Agg Vgg16 backbone

* update names

* update tests

* update test

* add image classifier

* incorporate review comments

* Update test case

* update backbone test

* add image classifier

* classifier cleanup

* code reformat

* add vgg16 image classifier

* make vgg generic

* update doc string

* update docstring

* add classifier test

* update tests

* update docstring

* address review comments

* code reformat

* update the configs

* address review comments

* fix task saved model test

* update init

* code reformatted

* Add `ResNetBackbone` and `ResNetImageClassifier` (#1765)

* Add ResNetV1 and ResNetV2

* Address comments

* Add CSP DarkNet backbone and classifier (#1774)

* Add CSP DarkNet

* Add CSP DarkNet

* snake_case function names

* change use_depthwise to block_type

* Add `FeaturePyramidBackbone` and port weights from `timm` for `ResNetBackbone` (#1769)

* Add FeaturePyramidBackbone and update ResNetBackbone

* Simplify the implementation

* Fix CI

* Make ResNetBackbone compatible with timm and add FeaturePyramidBackbone

* Add conversion implementation

* Update docstrings

* Address comments

* Add DenseNet (#1775)

* Add DenseNet

* fix testcase

* address comments

* nit

* fix lint errors

* move description

* Add ViTDetBackbone (#1776)

* add vit det vit_det_backbone

* update docstring

* code reformat

* fix tests

* address review comments

* bump year on all files

* address review comments

* rename backbone

* fix tests

* change back to ViT

* address review comments

* update image shape

* Add Mix transformer (#1780)

* Add MixTransformer

* fix testcase

* test changes and comments

* lint fix

* update config list

* modify testcase for 2 layers

* update input_image_shape -> image_shape (#1785)

* update input_image_shape -> image_shape

* update docstring example

* code reformat

* update tests

* Create __init__.py (#1788)

add missing __init__ file to vit_det

* Hack package build script to rename to keras-hub (#1793)

This is a temporary way to test out the keras-hub branch.
- Does a global rename of all symbols during package build.
- Registers the "old" name on symbol export for saving compat.
- Adds a github action to publish every commit to keras-hub as
  a new package.
- Removes our descriptions on PyPI temporarily, until we want
  to message this more broadly.

* Add CLIP and T5XXL for StableDiffusionV3 (#1790)

* Add `CLIPTokenizer`, `T5XXLTokenizer`, `CLIPTextEncoder` and `T5XXLTextEncoder`.

* Make CLIPTextEncoder as Backbone

* Add `T5XXLPreprocessor` and remove `T5XXLTokenizer`

Add `CLIPPreprocessor`

* Use `tf = None` at the top

* Replace manual implementation of `CLIPAttention` with `MultiHeadAttention`

* Add Bounding Box Utils (#1791)

* Bounding box utils

* - Correct test cases

* - Remove hard tensorflow dtype

* - fix api gen

* - Fix import for test cases
- Use setup for converters test case

* - fix api_gen issue

* - FIx api gen

* - Fix api gen error

* - Correct test cases as per new api changes

* mobilenet_v3 added in keras-nlp (#1782)

* mobilenet_v3 added in keras-nlp

* minor bug fixed in mobilenet_v3_backbone

* formatting corrected

* refactoring backbone

* correct_pad_downsample method added

* refactoring backbone

* parameters updated

* Testcaseupdated, expected output shape corrected

* code formatted with black

* testcase updated

* refactoring and description added

* comments updated

* added mobilenet v1 and v2

* merge conflict resolved

* version arg removed, and config options added

* input_shape changed to image_shape in arg

* config updated

* input shape corrected

* comments resolved

* activation function format changed

* minor bug fixed

* minor bug fixed

* added vision_backbone_test

* channel_first bug resolved

* channel_first cases working

* comments  resolved

* formatting fixed

* refactoring

---------

Co-authored-by: ushareng <[email protected]>

* Pkgoogle/efficient net migration (#1778)

* migrating efficientnet models to keras-hub

* merging changes from other sources

* autoformatting pass

* initial consolidation of efficientnet_backbone

* most updates and removing separate implementation

* cleanup, autoformatting, keras generalization

* removed layer examples outside of effiicient net

* many, mainly documentation changes, small test fixes

* Add the ResNet_vd backbone (#1766)

* Add ResNet_vd to ResNet backbone

* Addressed requested parameter changes

* Fixed tests and updated comments

* Added new parameters to docstring

* Add `VAEImageDecoder` for StableDiffusionV3 (#1796)

* Add `VAEImageDecoder` for StableDiffusionV3

* Use `keras.Model` for `VAEImageDecoder` and follows the coding style in `VAEAttention`

* Replace `Backbone` with `keras.Model` in `CLIPTextEncoder` and `T5XXLTextEncoder` (#1802)

* Add pyramid output for densenet, cspDarknet (#1801)

* add pyramid outputs

* fix testcase

* format fix

* make common testcase for pyramid outputs

* change default shape

* simplify testcase

* test case change and add channel axis

* Add `MMDiT` for StableDiffusionV3 (#1806)

* Add `MMDiT`

* Update

* Update

* Update implementation

* Add remaining bbox utils (#1804)

* - Add formats, iou, utils for bounding box

* - Add `AnchorGenerator`, `BoxMatcher` and `NonMaxSupression` layers

* - Remove scope_name  not required.

* use default keras name scope

* - Correct format error

* - Remove layers as of now and keep them at model level till keras core supports them

* - Correct api_gen

* Fix timm conversion for rersnet (#1814)

* Add `StableDiffusion3`

* Fix `_normalize_inputs`

* Separate CLIP encoders from SD3 backbone.

* Simplify `text_to_image` function.

* Address comments

* Minor update and add docstrings.

* Fix

* Update

* Rename to diffuser and decoder

* Define functional model

* Merge from upstream/master

* Delete old SD3

* Fix copyright

* Rename to keras_hub

* Address comments

* Update

* Fix CI

* Fix bugs occurred in keras3.1

---------

Co-authored-by: Divyashree Sreepathihalli <[email protected]>
Co-authored-by: Sachin Prasad <[email protected]>
Co-authored-by: Matt Watson <[email protected]>
Co-authored-by: Siva Sravana Kumar Neeli <[email protected]>
Co-authored-by: Usha Rengaraju <[email protected]>
Co-authored-by: ushareng <[email protected]>
Co-authored-by: pkgoogle <[email protected]>
Co-authored-by: gowthamkpr <[email protected]>
Development

Successfully merging this pull request may close these issues.

Add MobileNetV3
4 participants