Implementation of Normalizations #14
Conversation
…rm and LayerNorm and first testcase
@Smokrow Tyvm for the PR. In general it looks good; a few changes requested.
Regarding test cases, as you mentioned there are some more things we'll want to check, including some of the tests from: https://github.com/tensorflow/tensorflow/blob/v1.12.0/tensorflow/contrib/layers/python/layers/normalization_test.py
Looks like a layer normalization layer has been added to tf.keras experimental.
@karmel as discussed in our monthly meeting... LayerNorm was added to keras experimental which kind of stomps on the implementation we were going to add. Could you please update us on the keras experimental roadmap? Additionally, addons was going to add GroupNorm as a generalized normalization case (see below image). Any thoughts on how we should proceed?
Notes from discussions with the Keras team-- In general, this makes more sense in Addons than in experimental. It was added to experimental as a stopgap for some migration work, but we all agree this is a better fit for Addons than experimental, as the scope of use-cases is fairly narrow. In the future, we will make sure to de-dupe with Addons before adding to experimental, and to prefer Addons unless we are actually adding a Layer/etc. that will end up in core, but has some API kinks to work out. So, for this PR, if you could go ahead and dedupe with the experimental implementation and push here, we will remove the experimental implementation and use this one instead. Thanks, all, for working on this.
@Smokrow Would you be able to modify the test coverage to be as extensive as the reference in tf.keras experimental? Then we can merge and request the removal of LayerNormalization from core.
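For context, a minimal sketch of the style of check such a test suite might contain, assuming the layer is exported as `GroupNormalization` (the import path and argument names here are placeholders, not necessarily this PR's exact API):

```python
import numpy as np
import tensorflow as tf

# Placeholder import: the actual module path depends on how the layer is exported.
from tensorflow_addons.layers import GroupNormalization


class GroupNormalizationTest(tf.test.TestCase):
    def test_single_group_normalizes_whole_sample(self):
        # With a single group and no learned offset/scale, every sample should
        # come out with roughly zero mean and unit variance over (H, W, C).
        layer = GroupNormalization(groups=1, center=False, scale=False)
        x = np.random.rand(2, 4, 4, 3).astype(np.float32) * 10.0
        y = self.evaluate(layer(x))
        flat = y.reshape(2, -1)
        self.assertAllClose(flat.mean(axis=-1), np.zeros(2), atol=1e-2)
        self.assertAllClose(flat.std(axis=-1), np.ones(2), atol=1e-2)


if __name__ == "__main__":
    tf.test.main()
```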
On it. Thx for clarifying 👍
Hi, could you resolve conflicts?
This is still not finished.
@seanpmorgan I am having some trouble with running the tests. I am getting the following while running the
Looks like it is deep down in the Eager execution, but I am quite confused about what's going on down there 😄
I checked out the feature branch and failed to run the bazel tests. @Smokrow Hi, Moritz, could you fix the bazel configuration and make sure we can reproduce your problem? It would be greatly helpful for debugging :-)
@seanpmorgan ready for review
I left some comments; most of them are trivial code style problems. Thanks for the PR :-)
group_axes.insert(1, self.groups)

# reshape inputs to new group shape
group_shape = [group_axes[0], self.groups] + group_axes[2:]
Can it handle both channel-first and channel-last formats?
Yes. At this point the ordering would be [batch, group, channels, steps].
@facaiy Since the layer takes an explicit axis to work on, I am not quite sure "channels first" and "channels last" need separate handling. When somebody sets the axis exactly on the channel axis, I think they should be allowed to do that.
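To illustrate the reshape being discussed, here is a rough standalone sketch, assuming a 3D channels-first input of shape [batch, channels, steps] and a group count that divides the channel axis (the helper name is made up for illustration, not this PR's exact code):

```python
import tensorflow as tf


def reshape_into_groups(inputs, groups, axis=1):
    # Illustrative helper: split the channel axis into [groups, channels_per_group],
    # giving the ordering mentioned above, i.e. [batch, group, channels, steps]
    # for a channels-first 3D input.
    group_shape = inputs.shape.as_list()
    group_shape[axis] = group_shape[axis] // groups
    group_shape.insert(axis, groups)
    group_shape[0] = -1  # the batch dimension may be unknown
    return tf.reshape(inputs, group_shape)


x = tf.random.normal([2, 8, 16])               # [batch, channels, steps]
print(reshape_into_groups(x, groups=4).shape)  # (2, 4, 2, 16)
```

Because the channel axis is passed explicitly, the same split works for channels-last data by pointing `axis` at the channel dimension (e.g. `axis=-1`), which is the point made in the comment above.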
groups: Integer, the number of groups for Group Normalization.
    Can be in the range [1, N] where N is the input dimension.
    The input dimension must be divisible by the number of groups.
axis: Integer, the axis that should be normalized
In the case of a 4D input tensor, the axes that are normalized are either (C/G, H, W) or (Batch, G), depending on your definition of "be normalized". But it is in no way "C".
The BatchNorm layer takes axis=C, therefore if you want to make an analogy here, it would be axis=[Batch, G]. This analogy is a bit ugly, so I think a better way is to still define axis to be the channel dimension, and write clearer documentation about what this layer actually does for 2D and 4D tensors, respectively.
Same comment applies to other norms.
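As a concrete illustration of that point, a small sketch (channels-last 4D input; the group count is arbitrary) of which axes the statistics are actually reduced over:

```python
import tensorflow as tf

# 4D input [Batch, H, W, C], channels-last, with C divisible by the group count.
x = tf.random.normal([2, 4, 4, 6])
groups = 3

# Split the channel axis into [groups, C // groups]: [Batch, H, W, G, C/G].
grouped = tf.reshape(x, [2, 4, 4, groups, 6 // groups])

# Mean and variance are computed per (Batch, G), i.e. reduced over (H, W, C/G),
# not over the channel axis alone as a BatchNorm-style axis=C might suggest.
mean, variance = tf.nn.moments(grouped, axes=[1, 2, 4], keepdims=True)
print(mean.shape)  # (2, 1, 1, 3, 1)
```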
I think it is better to remove this explanation completely since it does not really belong in the code docs. There are currently Colab notebooks planned for better documentation and explanation, and I am writing one for the layer/group/instance normalization layers. If you want, I can reference you in the PR when it is finished.
removed wrong documentation
Removed explanation of layers. Will be added to colab
@seanpmorgan @facaiy ready for review 👍
I don't have any other comments. The description of the "axis" argument is still not very accurate, but it seems there are other plans to address it.
* Remove tf.logging as part of TF2
* Add normalization layers to init
* Update READMEs
So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the

ℹ️ Googlers: Go here for more info.
Thanks so much for the contribution, Moritz. Made some minor formatting changes & integrations with the project if you want to check them out. Looking forward to the example/demo notebook. Thanks for the review @ppwwyyxx, we'll be sure to address the axis documentation.
* Implemented GroupNorm, InstanceNorm and LayerNorm
First draft of a file for normalizations, implementing #6.
I have implemented GroupNorm, InstanceNorm and LayerNorm, and a first test case for GroupNorm (I will add a few more and will also implement some for Layer/Instance norm).
Could you give me quick feedback on the implementation, or on whether something is missing?
Thx in advance
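For reference, a rough usage sketch of the three layers described above, treating GroupNorm as the generalized case as discussed earlier in this thread (class names follow the PR description; the import path is an assumption):

```python
import tensorflow as tf
# Assumed export path; the actual module layout may differ.
from tensorflow_addons.layers import (
    GroupNormalization, InstanceNormalization, LayerNormalization)

inputs = tf.keras.Input(shape=(32, 32, 16))
x = tf.keras.layers.Conv2D(16, 3, padding="same")(inputs)
x = GroupNormalization(groups=4)(x)       # the general case
x = tf.keras.layers.Conv2D(16, 3, padding="same")(x)
x = InstanceNormalization()(x)            # one group per channel (groups == C)
x = tf.keras.layers.Conv2D(16, 3, padding="same")(x)
x = LayerNormalization()(x)               # a single group (groups == 1)
model = tf.keras.Model(inputs, x)
model.summary()
```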