Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InterSphinx + DirHTML results in 'index' ref in objects.inv not loading correctly #7095

Closed
pradyunsg opened this issue Feb 4, 2020 · 5 comments

Comments

@pradyunsg
Copy link
Contributor

pradyunsg commented Feb 4, 2020

Describe the bug
When loading from an objects.inv from a project built with the dirhtml builder, the index ref has an incorrect URL associated with it.

To Reproduce

$ git clone https://github.com/pypa/pip
$ cd pip
$ pip install -r tools/requirements/docs.txt
$ sphinx-build -W -d /tmp/doctrees/html -b dirhtml docs/html docs/build/html
$ python -m sphinx.ext.intersphinx ./docs/build/html/objects.inv | grep "\tindex"
        index                                    - Python Package Installer              : pip
        index-overview                           Overview                                : development/architecture/package-finding/#index-overview

Notice how the first line doesn't have the correct title pip - Python Package Installer or URI ``.

Expected behavior
The correct URI is associated with the index page (eg. # or / or ``)

Your project
pip

Screenshots
N/A

Environment info

(does not matter, see below)

  • OS: MacOS
  • Python version: 3.8.0
  • Sphinx version: 2.3.1
  • Sphinx extensions: sphinx.ext.extlinks, pip_sphinxext (repo local file), sphinx.ext.intersphinx
  • Extra tools: not needed. :)

Additional context

My investigation into this bug: pypa/pip#7130

This issue occurs due to:

  • the dirhtml builder providing uri = '' for the index page
  • the use of greedy \s+ in the regex used during loading of lines in the intersphinx objects.inv format.

During writing, InventoryFile.dump writes a line containing <priority><space><space><heading> -- notice the two spaces back to back, one on either side of the empty ("") uri. This extra whitespace is greedily consumed by the regex, since it uses \s+ for matching the whitespace and it does not allow for an empty uri (it uses \S+ for matching uri).


AFAICT, there are two changes that can be made to fix this:

  1. Stop returning "" as the uri for index in dirhtml's builder
    • is / or # a good option?
  2. Update regex to:
    • use \s+? instead of \s+ for matching whitespace non-greedily
    • allow for the uri part to be empty (\S* instead of \S+).
    (?x)(.+?)\s+?(\S*:\S*)\s+?(-?\d+)\s+?(\S*)\s+?(.*)
    

1 provides the ability for older-sphinx to read objects.inv from newer-sphinx. 2 provides the ability for newer-sphinx to read objects.inv from older-sphinx. Implementing both would likely be best for interoperability between sphinx versions.

builder ↓ \ loader old new
old forever broken 2
new 1 1 or 2
@tk0miya
Copy link
Member

tk0miya commented Feb 5, 2020

Thank you for reporting and good investigation.
I think returning # is better. I'll fix this tomorrow.

@tk0miya
Copy link
Member

tk0miya commented Feb 6, 2020

Sadly, # does not work well for other kind of cross references. So I choose its filename; index.html for an alternative finally.

@pradyunsg
Copy link
Contributor Author

But is that incompatible with dirhtml's stated objective?

@tk0miya
Copy link
Member

tk0miya commented Feb 6, 2020

hmm... you're right. Then, only we can do is 2nd idea: Update regex to.
https://www.sphinx-doc.org/en/master/usage/builders/index.html#sphinx.builders.dirhtml.DirectoryHTMLBuilder

tk0miya added a commit to tk0miya/sphinx that referenced this issue Feb 6, 2020
tk0miya added a commit to tk0miya/sphinx that referenced this issue Feb 7, 2020
tk0miya added a commit that referenced this issue Feb 9, 2020
Fix #7095: dirhtml: Cross references are broken via intersphinx and :doc:
@tk0miya
Copy link
Member

tk0miya commented Feb 9, 2020

Fixed by #7102
Thank you for reporting!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants