Feat/make transformer decoder callable without causal mask #1083
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
I did not add a test specifically for this, since the current testing strategy does not test any of the other "_mask" arguments to the call() function. Because this argument sits at the same abstraction level, I decided to follow that convention.
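For reference, a test for the new flag could look roughly like the sketch below. This is a hypothetical example, not code from the PR; it only checks that disabling the causal mask changes the layer's output on random inputs.

```python
import tensorflow as tf
from keras_nlp.layers import TransformerDecoder


def test_use_causal_mask_flag():
    # Hypothetical sketch, not part of this PR: with the causal mask disabled,
    # earlier positions can attend to future tokens, so outputs should differ.
    decoder = TransformerDecoder(intermediate_dim=8, num_heads=2)
    x = tf.random.uniform(shape=(1, 4, 8))
    causal_out = decoder(x, use_causal_mask=True)
    full_out = decoder(x, use_causal_mask=False)
    assert not tf.reduce_all(tf.equal(causal_out, full_out))
```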
use_causal_mask: bool, defaults to True. If true, a causal mask
    (masking out future input) is applied on the decoder sequence.
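For context, a minimal usage sketch of the documented flag, assuming it lands on `TransformerDecoder.call()` as described above:

```python
import tensorflow as tf
import keras_nlp

decoder = keras_nlp.layers.TransformerDecoder(intermediate_dim=64, num_heads=4)
x = tf.random.uniform(shape=(2, 10, 32))  # (batch, sequence, feature)

causal_out = decoder(x)                       # default: causal mask applied
full_out = decoder(x, use_causal_mask=False)  # positions may attend to future tokens
```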
Naming and documentation are taken from the previous version, where this lived at: https://github.com/keras-team/keras-nlp/blob/cb0fa028971475879911ddf042a1473037775ee6/keras_nlp/layers/transformer_decoder.py#L191-L192
this lgtm, and matches the multi-head attention layer in core Keras
Looks great, thanks for the contribution!
Looks great! Thank you. A few minor comments.
if decoder_mask is not None:
    self_attention_mask = tf.minimum(decoder_mask, self_attention_mask)
if use_causal_mask:
    batch_size = tf.shape(decoder_sequence)[0]
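For context on the snippet above: `tf.minimum` keeps an attention link only when both the padding mask and the causal mask allow it. A standalone sketch with toy shapes (the variable names here are illustrative, not the layer's internals):

```python
import tensorflow as tf

seq_len = 4
# Causal mask: lower-triangular, position i may attend to positions <= i.
causal_mask = tf.linalg.band_part(tf.ones((1, seq_len, seq_len)), -1, 0)
# Padding mask for one sequence whose last token is padding.
padding_mask = tf.constant([[1.0, 1.0, 1.0, 0.0]])
decoder_mask = padding_mask[:, tf.newaxis, :] * padding_mask[:, :, tf.newaxis]
# Element-wise minimum: a position stays visible only if both masks say so.
combined_mask = tf.minimum(decoder_mask, causal_mask)
print(combined_mask)
```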
I think this branching is getting complex enough that we should split it out into a private method.
self_attention_mask = self._compute_self_attention_mask(
    decoder_sequence,
    self_attention_cache,
    self_attention_cache_update_index,
    use_causal_mask,
)
Good point, done.
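For readers following along, a simplified sketch of what such a private helper could look like is below. It is hypothetical, ignores the cache arguments the real call above passes in, and shows only the causal/padding mask merging.

```python
import tensorflow as tf


def _compute_self_attention_mask(
    self, decoder_sequence, decoder_mask, use_causal_mask
):
    # Hypothetical sketch of the helper's core logic, not the merged code.
    self_attention_mask = decoder_mask
    if use_causal_mask:
        batch_size = tf.shape(decoder_sequence)[0]
        seq_len = tf.shape(decoder_sequence)[1]
        # Lower-triangular causal mask: position i attends to positions <= i.
        causal_mask = tf.linalg.band_part(
            tf.ones((batch_size, seq_len, seq_len)), -1, 0
        )
        if self_attention_mask is None:
            self_attention_mask = causal_mask
        else:
            self_attention_mask = tf.minimum(self_attention_mask, causal_mask)
    return self_attention_mask
```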
/gcbrun
Fixes #1062