Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Fenced Docstring Testing #640

Merged
merged 22 commits into from
Jan 17, 2023

Conversation

abheesht17
Copy link
Collaborator

@abheesht17 abheesht17 commented Jan 6, 2023

Resolves #420

Copied over from https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/docs/fenced_doctest_lib.py.

Two issues that have to be solved:

  • All three failing fenced docstring tests are because we use an imaginary file for SPM vocabulary.
  • Currently, the tests operate on files (and not modules), i.e., we obtain a list of files and apply RegEx matching on them. Can we do it using modules and not files? The reason for that being, we will be able to handle the "f-string" kind of doc-strings.

CCing @mattdangerw for suggestions/help!

@abheesht17 abheesht17 changed the title Add Fence Docstring Testing Add Fenced Docstring Testing Jan 7, 2023
Copy link
Member

@mattdangerw mattdangerw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! This is super super useful for the library. Left some questions.

@@ -77,7 +77,7 @@ class BertBackbone(Backbone):
}

# Pretrained BERT encoder
model = keras_nlp.models.BertBackbone.from_preset("base_base_en_uncased")
model = keras_nlp.models.BertBackbone.from_preset("bert_base_en_uncased")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol, good we are running this :)

MESSAGE = textwrap.dedent(
"""\n
##############################################################
# Check the documentation (go/g3doctest) on how to write
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's not include this link, it's not public

)
def test_fenced_docstrings():

regex_pattern = re.compile(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a pattern to skip? If so we should probably name/comment this to make it clearer.

I wonder if there is an annotation we could add that would make this more flexible (and work for both types of docstrings). @do_not_test_docstring or something like that.

@@ -77,3 +94,59 @@ def test_docstrings():
if not result.wasSuccessful():
print(result)
assert result.wasSuccessful()


@pytest.mark.skipif(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can probably remove this case everywhere, we aren't supporting win32 natively anymore (only through WSL).

@pytest.mark.skipif(
sys.platform == "win32", reason="Numpy prints differently on windows"
)
def test_fenced_docstrings():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably mark this a large test, this will involve a lot of file downloads for the preset right?


suite.addTest(
doctest.DocFileSuite(
*keras_nlp_files,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you said you were going to look into doing this via modules and not files, is that still possible? What was the outcome of checking this out?

from keras_nlp.tests.doc_tests.docstring_lib import DoctestOutputChecker


class FencedCellOutputChecker(DoctestOutputChecker):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should attribute our source here

textwrap.dedent(
"""
*********************************************************************
* Caution: `fenced_doctest` patches `doctest.compile` don't use this
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a valid warning? If so, I'm pretty sure we will run both docstest in this same binary.

Maybe we are ok because this gets run second?

So to let doctest act like a notebook:

1. We need `mode="exec"` (easy)
2. We need the last expression to be printed (harder).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we want this... Nothing relies on output after a fenced code block today, but we could add this in.

Will discuss with folks.

@mattdangerw
Copy link
Member

@abheesht17 also it looks like things are failing on ci because of the lack of astor. We should add that (and any other new pip dependencies) to requirements-common.txt.

@abheesht17
Copy link
Collaborator Author

abheesht17 commented Jan 13, 2023

@chenmoneygithub, could you please generate a docker image for this for accelerator testing? I've added astor to requirements_common.txt. Thanks!

@@ -49,12 +49,15 @@ def from_preset(
Defaults to `True`.

Examples:
doctest.skip
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remember to remove this.

@abheesht17 abheesht17 mentioned this pull request Jan 16, 2023
)

ASTOR_WARNING = (
"`test_fenced_docstrings` requires `astor`. Please run `pip install astor`."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't usually keep warning string in constant like this, I'm just going to inline this.

Copy link
Member

@mattdangerw mattdangerw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM!

@mattdangerw mattdangerw merged commit 544aca0 into keras-team:master Jan 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Automatically test fenced code blocks in docstrings
2 participants