add rrelu kernel #573
Conversation
@fsx950223 Thanks for the contribution! Generally LGTM. I know this is WIP, but I want to address some points first. Thanks again for bringing rrelu up!
Resolved review threads (outdated):
tensorflow_addons/custom_ops/activations/cc/kernels/rrelu_op.cc
tensorflow_addons/custom_ops/activations/cc/kernels/rrelu_op_gpu.cu.cc
It seems that formula (5) shown in https://arxiv.org/pdf/1505.00853.pdf is wrong. It should be x*(lower+upper)/2.
I could confirm that PyTorch computes it that way.
x = tf.constant([-2.0, -1.0, 0.0, 1.0, 2.0], dtype=dtype)
lower = 0.2
upper = 0.2
result, alpha = rrelu(x, lower, upper, training=True)
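For reference, a minimal NumPy sketch of the behaviour under discussion (an illustration only, not the PR's kernel or PyTorch's code; it assumes alpha is drawn uniformly from [lower, upper] in training and fixed at (lower + upper)/2 at inference):

import numpy as np

def rrelu_reference(x, lower, upper, training, seed=0):
    """Reference RReLU: random negative slope in training, mean slope at inference."""
    if training:
        rng = np.random.default_rng(seed)
        alpha = rng.uniform(lower, upper, size=x.shape)  # one slope per element
    else:
        alpha = (lower + upper) / 2.0  # i.e. x * (lower + upper) / 2 for x < 0
    return np.where(x >= 0, x, x * alpha), alpha

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y, _ = rrelu_reference(x, lower=0.2, upper=0.2, training=True)
# With lower == upper == 0.2 (as in the snippet above) the draw collapses to a
# single value, so the output is deterministic: [-0.4, -0.2, 0., 1., 2.]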
Do you have any ideas about the randomness test? @WindQAQ
I think we do not have to return alpha for the rrelu op. In your experiment, setting the seed in TF is not enough, right?
The with_alpha argument can control the return behavior, and I don't need a seed.
@fsx950223 I do not think it's good practice to have an argument that returns such things in a public API. Is alpha deterministic when setting the same TF seed?
No, it isn't, and I don't know how to make alpha deterministic. Is there a better way to set the seed?
Umm, I think we could just test values with training=False. As for gradient testing, we should check both of them. How do you feel about this?
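A minimal sketch of such a value test with training=False; the import path and the rrelu signature below are assumptions based on this PR and may differ:

import tensorflow as tf
from tensorflow_addons.activations import rrelu  # assumed import path for this PR's op

x = tf.constant([-2.0, -1.0, 0.0, 1.0, 2.0], dtype=tf.float32)
lower, upper = 0.1, 0.2
# In inference mode the negative slope is deterministic: the mean of [lower, upper].
expected = tf.where(x >= 0.0, x, x * (lower + upper) / 2.0)
result = rrelu(x, lower, upper, training=False)
tf.debugging.assert_near(result, expected, atol=1e-6)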
Force-pushed from 9a8e542 to bd20659.
Hi @fsx950223, when time allows, mind refactoring to use the
Generally LGTM. Some API design points still need to be discussed. Thanks!
if with_alpha:
    return result, alpha
else:
    return result
I have no idea whether we should expose an argument to return alpha or not. My survey is that Chainer does but PyTorch does not. What do you think, @fsx950223?
I return alpha just for testing. It could be removed if the test case can be fixed without alpha.
The problem seems to have been solved by using Philox random.
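For reference, a stateless (Philox-based) uniform is a pure function of the seed, so a test can regenerate the same alpha instead of having the op return it. The snippet below is only an illustration of that idea, not the kernel's actual RNG wiring:

import tensorflow as tf

x = tf.constant([-2.0, -1.0, 0.0, 1.0, 2.0])
lower, upper = 0.1, 0.2

# The same (shape, seed) pair always yields the same alpha, unlike the stateful
# tf.random.uniform, so expected values can be recomputed inside a test.
seed = tf.constant([42, 0], dtype=tf.int32)
alpha = tf.random.stateless_uniform(tf.shape(x), seed=seed, minval=lower, maxval=upper)
y = tf.where(x >= 0.0, x, x * alpha)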
grad = t.gradient(result, x)
expect_grad = _ref_rrelu_grad(x, alpha, dtype)
self.assertAllCloseAccordingToType(
    grad, expect_grad, atol=1e-4)
Could we use tf.test.compute_gradient to check gradients? Just like what you do in hardtanh.
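A rough sketch of what that check could look like (the rrelu import path and keyword names are assumptions; inputs are kept away from 0 so the numeric Jacobian is well-defined):

import numpy as np
import tensorflow as tf
from tensorflow_addons.activations import rrelu  # assumed import path for this PR's op

# tf.test.compute_gradient returns (theoretical, numeric) Jacobians of f at x.
x = tf.constant([-2.5, -1.5, 1.5, 2.5], dtype=tf.float64)
theoretical, numeric = tf.test.compute_gradient(
    lambda t: rrelu(t, lower=0.1, upper=0.2, training=False), [x])
np.testing.assert_allclose(theoretical[0], numeric[0], atol=1e-4)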
The test fails when the training argument is True, and it also fails when training is False and the input data is 0.
@fsx950223 Sorry for the late reply. Could you push the code so that I can test it locally? Thanks.
How do you debug the code? I can't step into the source code anymore since TF2 was released. @WindQAQ
Edit: I successfully re-configured the environment with ./configure.sh in the Docker container earlier today.
I believe _compute_numeric_jacobian is not suitable for this test case, because the activation has different gradients around 0. Please refer to the ReLU test case.
I think it's OK to modify the input values to avoid the non-smooth part. Please see ReLU's gradient check:
https://github.com/tensorflow/tensorflow/blob/66ea3ed9b8cbbbf01b0eabb14e436883895e4bde/tensorflow/python/kernel_tests/relu_op_test.py#L123
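For context, the trick in ReLU's gradient check is simply to keep every input farther from 0 than the finite-difference step, for example:

import numpy as np

# No entry lies within delta of 0, so central differences never straddle the kink.
delta = 1e-3
x = np.asarray([[-0.9, -0.7, -0.5, -0.3, -0.1],
                [0.1, 0.3, 0.5, 0.7, 0.9]], dtype=np.float64)
assert np.all(np.abs(x) > delta)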
The problem is that I can't jump to the code definition.
I tried it in VSCode and PyCharm, but neither of them worked.
Ohoh, got it. But can the code still execute successfully if the IDE cannot find the definitions of the functions? Sorry, I usually use Vim, so it took me a while to understand what you were talking about... Thanks!
Good
In tensorflow_addons/activations/rrelu_test.py:

    ("float64", np.float64))
@tf.function
def test_theoretical_gradients(self, dtype):
    x = tf.constant([-2.0, -1.0, 0.0, 1.0, 2.0], dtype=dtype)
    lower = 0.1
    upper = 0.2
    for training in [True, False]:
        with self.subTest(training=training):
            with tf.GradientTape() as t:
                t.watch(x)
                result, alpha = rrelu(
                    x, lower, upper, training=training, with_alpha=True)
            grad = t.gradient(result, x)
            expect_grad = _ref_rrelu_grad(x, alpha, dtype)
            self.assertAllCloseAccordingToType(
                grad, expect_grad, atol=1e-4)
Amazing. There is something wrong when debugging with VSCode, but it works well with PyCharm.
The CPU kernel and the GPU kernel have different behaviors?
Thanks for the nice and long work!
@fsx950223 Nice PR. Thanks for the contribution!
Kernel implementation of rrelu. @WindQAQ