Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Fenced Doc-String #782 #785

Merged
merged 13 commits into from
Mar 3, 2023
Merged

Conversation

atharvapurdue
Copy link
Contributor

Following changes were made in Doc-String Examples -

  • Add models to the list in docstring_test.py
  • Import from the root of the library
  • Fix typo in f_net_classifier.py
  • add [MASK] in Vocaburay
  • remove num_heads parameter

The number of failed examples has gone down from 5 to 1, however I am getting 1 error in File "/content/keras-nlp/keras-nlp/keras-nlp/keras_nlp/models/f_net/f_net_masked_lm.py", line 73, in keras_nlp.models.f_net.f_net_masked_lm.FNetMaskedLM-

Node: 'f_net_masked_lm_1/f_net_backbone_3/segment_embedding/embedding_lookup' indices[0,3] = 4 is not in [0, 4) [[{{node f_net_masked_lm_1/f_net_backbone_3/segment_embedding/embedding_lookup}}]] [Op:__inference_train_function_458156]
Might need help from @abheesht17 , or any other developer who can assist.
Thanks!
Fixes keras-team/keras-nlp #782

@abheesht17
Copy link
Collaborator

Thanks, @atharvapurdue! The reason why it's throwing an error is because the default value of num_segments is 4 for FNet. Hence, this throws an embedding lookup error: https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/f_net/f_net_backbone.py#L126-L131. The best way to resolve it is to just change

        "segment_ids": tf.constant(
            [[1, 0, 0, 4, 0, 6, 7, 8]] * 2, shape=(2, 8)
        ),

to values less than 4. Just change it to something like [[0, 0, 0, 1, 1, 1, 0, 0]] * 2.

@atharvapurdue
Copy link
Contributor Author

atharvapurdue commented Feb 27, 2023

Yes! @abheesht17. I thought so, but wasn't sure. I will make the necessary changes. Thank you!

@atharvapurdue atharvapurdue changed the title Fix Fenced Doc-String (Partially) #782 Fix Fenced Doc-String #782 Feb 27, 2023
@atharvapurdue
Copy link
Contributor Author

Done ! @abheesht17 , Thanks for your help!
You can see the doc_string_test passing all tests here - https://colab.research.google.com/drive/1xVlUGJIDSH44bF851HZ01yD24ocJmul5#scrollTo=8DErjytmUTv_

Copy link
Member

@mattdangerw mattdangerw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, a few minor comments.

keras_nlp/models/distil_bert/distil_bert_preprocessor.py Outdated Show resolved Hide resolved
@@ -114,6 +114,9 @@ def test_fenced_docstrings():
"keras_nlp.models.xlm_roberta.xlm_roberta_preprocessor",
"keras_nlp.models.f_net.f_net_preprocessor",
"keras_nlp.models.f_net.f_net_tokenizer",
"keras_nlp.tokenizers.byte_pair_tokenizer",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these three additions should be in the "base classes" block a few lines up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the required changes @mattdangerw. Please have a look

@mattdangerw
Copy link
Member

Thank you! Works great!

@mattdangerw mattdangerw merged commit 791428e into keras-team:master Mar 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants