
doctest filters have to match both output and expected output #2277

Open · fingolfin opened this issue Sep 21, 2023 · 8 comments
@fingolfin (Contributor) commented Sep 21, 2023

Here is an MWE:

using Documenter

module MWE
"""

```jldoctest; filter = r"#"
julia> 2
2    # 1 + 1
```
"""
function dummy end
end

doctest(nothing, [MWE])

I would expect this to pass, but it doesn't. Doing ENV["JULIA_DEBUG"] = "Documenter" reveals that the filter is not (?) applied:

│ ```jldoctest; filter = r"#"
│ julia> 2
│ 2    # 1 + 1
│ ```
│
│ Subexpression:
│
│ 2
│
│ Evaluated output:
│
│ 2
│
│ Expected output:
│
│ 2    # 1 + 1
│
│ 1 doctest filter was applied:
│
│   r"#"
│
│ Evaluated output (filtered):
│
│ 2    # 1 + 1
│
│ Expected output (filtered):
│
│ 2    # 1 + 1

(note the confusing "Evaluated output (filtered)", fixed by PR #2276)

@fingolfin (Contributor, Author)

For context, this comes from JuliaMath/Primes.jl#140

@fingolfin (Contributor, Author)

OK maybe this is a hint: if I use r"2" then this "works":

┌ Debug: Verifying doctest at REPL[17]
│
│ ```jldoctest; filter = r"2"
│ julia> 2
│ 2    # 1 + 1
│ ```
│
│ Subexpression:
│
│ 2
│
│ Evaluated output:
│
│ 2
│
│ Expected output:
│
│ 2    # 1 + 1
│
│ 1 doctest filter was applied:
│
│   r"2"
│
│ Evaluated output (filtered):
│
│     # 1 + 1
│
│ Expected output (filtered):
│
│     # 1 + 1
└ @ Documenter ~/.julia/packages/Documenter/Meee1/src/doctests.jl:329

But if I use r"1" then it also doesn't work as I'd expect:

┌ Debug: Verifying doctest at REPL[20]
│
│ ```jldoctest; filter = r"1"
│ julia> 2
│ 2    # 1 + 1
│ ```
│
│ Subexpression:
│
│ 2
│
│ Evaluated output:
│
│ 2
│
│ Expected output:
│
│ 2    # 1 + 1
│
│ 1 doctest filter was applied:
│
│   r"1"
│
│ Evaluated output (filtered):
│
│ 2    # 1 + 1
│
│ Expected output (filtered):
│
│ 2    # 1 + 1
└ @ Documenter ~/.julia/packages/Documenter/Meee1/src/doctests.jl:329

@mortenpi (Member) commented Sep 21, 2023

So, this is non-obvious, but it is caused by the all(...) check here:

if all(occursin.((r,), strings))

Both the output and the expected output have to match before the filtering is applied. I am unsure of the use case, but it was added intentionally as a bugfix: #588

Minimally, this should be documented though.
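To make the described behavior concrete, here is a minimal sketch (not Documenter's actual implementation; filter_outputs is a hypothetical name) of what the all(...) check amounts to for a single plain-regex filter:

```julia
# Hypothetical sketch: a filter is applied only when the regex occurs in
# BOTH the evaluated output and the expected output; otherwise both strings
# are left untouched and the subsequent comparison fails.
function filter_outputs(evaluated::AbstractString, expected::AbstractString, r::Regex)
    if occursin(r, evaluated) && occursin(r, expected)
        return replace(evaluated, r => ""), replace(expected, r => "")
    end
    return evaluated, expected
end

filter_outputs("FOO:1234", "FOO:5678", r"[0-9]+")  # ("FOO:", "FOO:") -- now equal
filter_outputs("FOO:1234", "FOO:",     r"[0-9]+")  # unchanged -- the doctest fails
```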

@fredrikekre fredrikekre changed the title jldoctest filter with # does not work doctest filters have to match both output and expected output Sep 21, 2023
@mortenpi (Member)

So I was thinking about this a bit. We have 4 cases that can happen when we apply a filter to a doctest:

  1. Both evaluated output and expected output match.
  2. Neither the evaluated output nor expected output match.
  3. Evaluated output matches, but expected output doesn't.
  4. Expected output matches, but evaluated output doesn't.

For (1) we apply the filtering, and everyone is happy. For (2), the filtering would have no effect anyway, so it doesn't really matter if we apply it or not. The question is whether we should also apply the filter in cases (3) and (4). Currently we don't.

Let's suppose we have output like FOO:1234, with the number being random/changing, so we want to filter it with r"[0-9]+".

In case (3), where we would filter when the evaluated output matches the filter but the reference doesn't, the following doctest would pass:

# Expected output:
FOO:
# Evaluated output:
FOO:1234

This is not what the user would expect -- they clearly expect the part after FOO: to be empty. So we should not filter here.

One argument could be: since the expected output doesn't contain the pattern anyway, why are you even filtering this? But this case can happen when you use global doctest filters (e.g. DocTestFilters in an at-meta block); you will be applying filters to doctests where you don't expect them to actually filter anything (because the pattern doesn't appear in the reference output). I believe this is the test case in #588.
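For reference, such global filters are set via the DocTestFilters key in an at-meta block, which is Documenter's documented mechanism for page-wide filters:

```@meta
DocTestFilters = [r"[0-9]+"]
```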

Case (4), where expected output matches, but evaluated doesn't, would lead to the opposite doctest passing:

# Expected output:
FOO:1234
# Evaluated output:
FOO:

Here I think the user definitely wants the doctest to fail if the number after FOO: has gone missing, since they clearly expect a number to be there. So we should not filter here either. This, however, is exactly the situation in the comment-filtering example from the original post.

Some closing thoughts:

  • I think using an optional regex match (i.e. r"([0-9]+)?") should work, although this may indeed have performance problems, since you're essentially also matching empty strings.
  • If the optional regex doesn't work, maybe we can mark a filter in a way that it would still apply in case (4)? Opt-in for this seems fine.
  • Maybe we want some sort of "doctest preprocessing" functionality, where you can run filters on the expected output, before it goes into matching? This seems a bit complex, so I am not sure it's worth it though.
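The optional-match idea from the first bullet can be checked directly in the REPL. Using a concrete variant of the pattern (FOO is just the illustrative name from above), a regex that also matches the zero-digit case occurs in both outputs and is therefore always applied:

```julia
# [0-9]* (zero or more digits) instead of [0-9]+, so the pattern also
# matches the bare "FOO:" -- both sides then satisfy the occursin() check.
pat = r"FOO:[0-9]*"
occursin(pat, "FOO:1234"), occursin(pat, "FOO:")                      # (true, true)
replace("FOO:1234", pat => "FOO:") == replace("FOO:", pat => "FOO:")  # true
```

Note that this also makes case (4) pass, which is exactly the trade-off discussed above.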

@timholy (Contributor) commented Jan 8, 2025

Presumably this is the same bug, but this surprised me: in the doc markdown files, I have

```jldoctest; filter=[r"1\.9999\d+" => "2.0", r"0\.49999\d+" => "0.5", r"[ -]\d\.\d+e-\d\d" => "0.0", r"0\.9999\d+" => "1.0", r"1\.0000\d+" => "1.0"]
julia> from_points = [[0, 0], [1, 0], [0, 1]];

julia> to_points   = [[1, 1], [3, 1], [1.5, 3]];

julia> AffineMap(from_points => to_points)
AffineMap([2.0 0.5; 0.0 2.0], [1.0, 1.0])
```

and it seems to work:

julia> replace("AffineMap([1.9999999999999996 0.4999999999999999; -5.551115123125783e-16 2.0], [0.9999999999999999, 1.0000000000000002])", [r"1\.9999\d+" => "2.0", r"0\.49999\d+" => "0.5", r"[ -]\d\.\d+e-\d\d" => "0.0", r"0\.9999\d+" => "1.0", r"1\.0000\d+" => "1.0"]...)
"AffineMap([2.0 0.5; 0.0 2.0], [1.0, 1.0])"

and yet
[screenshot: the doctest nevertheless fails]

The insidious thing about that example is that it borks any ability to handle differences among CPUs wrt rounding. I was able to fix it by writing the filters as

filter=[r"(2\.0|1\.9999\d+)" => "2.0", r"(0\.5|0\.49999\d+)" => "0.5", r"(0\.0|[ -]\d\.\d+e-\d\d)" => "0.0", r"(1\.0(?!0)|1\.0000\d+|0\.9999\d+)" => "1.0"]

but that's pretty non-obvious.
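One way to see why the rewritten filters work: each pattern now also matches the already "clean" value, so it occurs in both the expected and the evaluated output, gets applied to both, and normalizes them to the same string. A quick REPL check (assuming Julia ≥ 1.7 for the multi-pair form of replace):

```julia
# The rewritten filters from above; each alternation includes the clean
# target value, so the pattern matches both outputs and is applied to both.
filters = [r"(2\.0|1\.9999\d+)" => "2.0",
           r"(0\.5|0\.49999\d+)" => "0.5",
           r"(0\.0|[ -]\d\.\d+e-\d\d)" => "0.0",
           r"(1\.0(?!0)|1\.0000\d+|0\.9999\d+)" => "1.0"]
expected  = "AffineMap([2.0 0.5; 0.0 2.0], [1.0, 1.0])"
evaluated = "AffineMap([1.9999999999999996 0.4999999999999999; -5.551115123125783e-16 2.0], [0.9999999999999999, 1.0000000000000002])"
# Both strings normalize to the same text, so the doctest passes:
replace(expected, filters...) == replace(evaluated, filters...)
```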

@timholy (Contributor) commented Jan 8, 2025

Can't your

FOO:1234

case be handled simply by writing the filter more carefully, e.g., r"FOO:\d+" => "FOO:"?
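That suggestion does behave sensibly under the current both-sides rule; a quick sketch:

```julia
sub = r"FOO:\d+" => "FOO:"
# Legitimate case: both outputs contain some number, both match the pattern,
# and the filter normalizes them to the same string, so the doctest passes:
replace("FOO:1234", sub)      # "FOO:"
replace("FOO:5678", sub)      # "FOO:"
# Degenerate case: a bare "FOO:" does not match r"FOO:\d+", so under the
# both-sides-must-match rule the filter is not applied and the doctest
# fails, as desired:
occursin(r"FOO:\d+", "FOO:")  # false
```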

@fingolfin (Contributor, Author)

I don't find the reasoning

Here I think the user definitely wants the doctest to fail if the number after FOO: has gone missing, since they clearly expect a number to be there.

for the example

# Expected output:
FOO:1234
# Evaluated output:
FOO:

very compelling. I could just as well argue "clearly the user expected 1234 to occur there", or "the user expected a number starting with 1 there". And what about this example, which I expect will pass, since the regex applies to both outputs?

# Expected output:
FOO:1234:1234
# Evaluated output:
FOO::1234

Overall, with global regex filters I am sure it will frequently happen that they get applied to things the author did not have in mind -- I actually think that's a (difficult) design issue with the entire regex filter system. That is not meant as a complaint; this is, after all, a hard problem, and I certainly don't have a great alternative in mind. I merely want to point out that this is, IMHO, a systematic issue.


Anyway, I don't expect to change minds here, and it may be too late anyway (changing the behavior could break people's existing tests).

So I hope it can at least be documented at https://documenter.juliadocs.org/stable/man/doctests/#Filtering-Doctests (and perhaps illustrated with an example). (Perhaps I can find some time to write something but I don't want to make promises I can't keep, sorry)

@timholy (Contributor) commented Jan 8, 2025

Personally I'd say this is worth fixing in the long run (Documenter 2.0), but certainly documenting this limitation in the short run makes sense.
