Fix internal links in docs (#1031)
* update sphinx referencing

* update storage spec ref

* add test for docstring generation

* update changelog

* add zarr and mpi mappings

* update tutorials

* fix italics in tutorials to code blocks

* fix broken links and references

* set nitpicky link checking for sphinx

* update conf file

* fix italics to code blocks in tutorials

* revert RegionReference to type

* Update src/hdmf/spec/spec.py

Co-authored-by: Ryan Ly <[email protected]>

* add remaining warnings to nitpick_ignore

* raise sphinx warnings as errors

* rename workflow to reflect linkcheck updates

* Update CHANGELOG.md

* Update check_sphinx_links.yml job name

---------

Co-authored-by: Ryan Ly <[email protected]>
stephprince and rly authored Jan 19, 2024
1 parent: b4bdcef · commit: 3a3dd59
Showing 20 changed files with 82 additions and 59 deletions.
@@ -1,12 +1,12 @@
name: Check Sphinx external links
name: Check Sphinx links
on:
pull_request:
schedule:
- cron: '0 5 * * *' # once per day at midnight ET
workflow_dispatch:

jobs:
check-external-links:
check-sphinx-links:
runs-on: ubuntu-latest
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
@@ -29,5 +29,5 @@ jobs:
python -m pip install -r requirements-doc.txt -r requirements-opt.txt
python -m pip install .
- name: Check Sphinx external links
run: sphinx-build -b linkcheck ./docs/source ./test_build
- name: Check Sphinx internal and external links
run: sphinx-build -W -b linkcheck ./docs/source ./test_build
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -7,6 +7,7 @@
- Added `add_ref_termset`, updated helper methods for `HERD`, revised `add_ref` to support validations prior to populating the tables
and added `add_ref_container`. @mavaylon1 [#968](https://github.com/hdmf-dev/hdmf/pull/968)
- Use `stacklevel` in most warnings. @rly [#1027](https://github.com/hdmf-dev/hdmf/pull/1027)
- Fixed broken links in documentation and added internal link checking to workflows. @stephprince [#1031](https://github.com/hdmf-dev/hdmf/pull/1031)

### Minor Improvements
- Updated `__gather_columns` to ignore the order of bases when generating columns from the super class. @mavaylon1 [#991](https://github.com/hdmf-dev/hdmf/pull/991)
2 changes: 1 addition & 1 deletion docs/Makefile
@@ -149,7 +149,7 @@ changes:
@echo "The overview file is in $(BUILDDIR)/changes."

linkcheck:
$(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck
$(SPHINXBUILD) -W -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck
@echo
@echo "Link check complete; look for any errors in the above output " \
"or in $(BUILDDIR)/linkcheck/output.txt."
4 changes: 2 additions & 2 deletions docs/gallery/plot_external_resources.py
@@ -153,8 +153,8 @@ def __init__(self, **kwargs):
# ------------------------------------------------------
# It is important to keep in mind that when adding and :py:class:`~hdmf.common.resources.Object` to
# the :py:class:~hdmf.common.resources.ObjectTable, the parent object identified by
# :py:class:`~hdmf.common.resources.Object.object_id` must be the closest parent to the target object
# (i.e., :py:class:`~hdmf.common.resources.Object.relative_path` must be the shortest possible path and
# ``Object.object_id`` must be the closest parent to the target object
# (i.e., ``Object.relative_path`` must be the shortest possible path and
# as such cannot contain any objects with a ``data_type`` and associated ``object_id``).
#
# A common example would be with the :py:class:`~hdmf.common.table.DynamicTable` class, which holds
8 changes: 4 additions & 4 deletions docs/gallery/plot_generic_data_chunk_tutorial.py
@@ -119,10 +119,10 @@ def _get_dtype(self):
# optimal performance (typically 1 MB or less). In contrast, a :py:class:`~hdmf.data_utils.DataChunk` in
# HDMF acts as a block of data for writing data to dataset, and spans multiple HDF5 chunks to improve performance.
# This is achieved by avoiding repeat
# updates to the same `Chunk` in the HDF5 file, :py:class:`~hdmf.data_utils.DataChunk` objects for write
# should align with `Chunks` in the HDF5 file, i.e., the :py:class:`~hdmf.data_utils.DataChunk.selection`
# should fully cover one or more `Chunks` in the HDF5 file to avoid repeat updates to the same
# `Chunks` in the HDF5 file. This is what the `buffer` of the :py:class`~hdmf.data_utils.GenericDataChunkIterator`
# updates to the same ``Chunk`` in the HDF5 file, :py:class:`~hdmf.data_utils.DataChunk` objects for write
# should align with ``Chunks`` in the HDF5 file, i.e., the ``DataChunk.selection``
# should fully cover one or more ``Chunks`` in the HDF5 file to avoid repeat updates to the same
# ``Chunks`` in the HDF5 file. This is what the `buffer` of the :py:class`~hdmf.data_utils.GenericDataChunkIterator`
# does, which upon each iteration returns a single
# :py:class:`~hdmf.data_utils.DataChunk` object (by default > 1 GB) that perfectly spans many HDF5 chunks
# (by default < 1 MB) to help reduce the number of small I/O operations
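
Note: for readers following this tutorial, a minimal subclass consistent with the abstract methods referenced above (_get_data, _get_maxshape, _get_dtype) might look like the sketch below; the in-memory array stands in for a real data source, and the constructor keywords are assumptions, not a definitive recipe.

import numpy as np
from hdmf.data_utils import GenericDataChunkIterator


class InMemoryDataChunkIterator(GenericDataChunkIterator):
    """Sketch of an iterator that buffers an in-memory array (stand-in for a file or network source)."""

    def __init__(self, array, **kwargs):
        self._data_array = array  # must be set before super().__init__, which queries shape and dtype
        super().__init__(**kwargs)

    def _get_data(self, selection):
        # must return the selected region as an in-memory numpy array, not a lazy view
        return self._data_array[selection]

    def _get_maxshape(self):
        return self._data_array.shape

    def _get_dtype(self):
        return self._data_array.dtype


# each iteration yields a DataChunk whose selection is sized to span whole HDF5 chunks
iterator = InMemoryDataChunkIterator(np.random.rand(10_000, 384), buffer_gb=0.1, chunk_mb=1.0)
for chunk in iterator:
    pass  # e.g., hand each chunk to an HDF5 writer
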
2 changes: 1 addition & 1 deletion docs/gallery/plot_term_set.py
@@ -107,7 +107,7 @@
######################################################
# Viewing TermSet values
# ----------------------------------------------------
# :py:class:`~hdmf.term_set.TermSet` has methods to retrieve terms. The :py:func:`~hdmf.term_set.TermSet:view_set`
# :py:class:`~hdmf.term_set.TermSet` has methods to retrieve terms. The :py:func:`~hdmf.term_set.TermSet.view_set`
# method will return a dictionary of all the terms and the corresponding information for each term.
# Users can index specific terms from the :py:class:`~hdmf.term_set.TermSet`. LinkML runtime will need to be installed.
# You can do so by first running ``pip install linkml-runtime``.
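
Note: the corrected view_set reference is used roughly as in the sketch below; the schema path and the term name are placeholders from the tutorial, and linkml-runtime must be installed.

from hdmf.term_set import TermSet

# assumes `pip install linkml-runtime` and a LinkML schema file on disk (placeholder path)
terms = TermSet(term_schema_path="docs/gallery/example_term_set.yaml")

# view_set returns a dict of every term with its description and meaning (URI)
print(terms.view_set)

# individual terms can be retrieved by indexing the TermSet directly
print(terms["Homo sapiens"])  # hypothetical term defined in the example schema
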
2 changes: 1 addition & 1 deletion docs/make.bat
@@ -183,7 +183,7 @@ if "%1" == "changes" (
)

if "%1" == "linkcheck" (
%SPHINXBUILD% -b linkcheck %ALLSPHINXOPTS% %BUILDDIR%/linkcheck
%SPHINXBUILD% -W -b linkcheck %ALLSPHINXOPTS% %BUILDDIR%/linkcheck
if errorlevel 1 exit /b 1
echo.
echo.Link check complete; look for any errors in the above output ^
9 changes: 9 additions & 0 deletions docs/source/conf.py
@@ -76,6 +76,7 @@
"matplotlib": ("https://matplotlib.org/stable/", None),
"h5py": ("https://docs.h5py.org/en/latest/", None),
"pandas": ("https://pandas.pydata.org/pandas-docs/stable/", None),
"zarr": ("https://zarr.readthedocs.io/en/stable/", None),
}

# these links cannot be checked in github actions
@@ -84,6 +85,14 @@
"https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request",
]

nitpicky = True
nitpick_ignore = [('py:class', 'Intracomm'),
('py:class', 'h5py.RegionReference'),
('py:class', 'h5py._hl.dataset.Dataset'),
('py:class', 'function'),
('py:class', 'unittest.case.TestCase'),
]

# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]
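
Note: with nitpicky = True, Sphinx warns on every cross-reference it cannot resolve, and the -W flag added to the linkcheck builds in this commit turns those warnings into hard failures. If the nitpick_ignore list grows, a regex form is also available (a sketch only, not part of this commit; nitpick_ignore_regex is assumed to require Sphinx >= 4.1):

# docs/source/conf.py -- sketch only
nitpicky = True
nitpick_ignore_regex = [
    ("py:class", r"h5py\..*"),  # would cover h5py.RegionReference and h5py._hl.dataset.Dataset in one entry
]
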

12 changes: 6 additions & 6 deletions docs/source/overview_software_architecture.rst
@@ -81,19 +81,19 @@ Spec
* Interface for writing extensions or custom specification
* There are several main specification classes:

* :py:class:`~hdmf.spec.AttributeSpec` - specification for metadata
* :py:class:`~hdmf.spec.GroupSpec` - specification for a collection of
* :py:class:`~hdmf.spec.spec.AttributeSpec` - specification for metadata
* :py:class:`~hdmf.spec.spec.GroupSpec` - specification for a collection of
objects (i.e. subgroups, datasets, link)
* :py:class:`~hdmf.spec.DatasetSpec` - specification for dataset (like
* :py:class:`~hdmf.spec.spec.DatasetSpec` - specification for dataset (like
and n-dimensional array). Specifies data type, dimensions, etc.
* :py:class:`~hdmf.spec.LinkSpec` - specification for link (like a POSIX
* :py:class:`~hdmf.spec.spec.LinkSpec` - specification for link (like a POSIX
soft link)
* :py:class:`~hdmf.spec.spec.RefSpec` - specification for references
(References are like links, but stored as data)
* :py:class:`~hdmf.spec.DtypeSpec` - specification for compound data
* :py:class:`~hdmf.spec.spec.DtypeSpec` - specification for compound data
types. Used to build complex data type specification, e.g., to define
tables (used only in :py:class:`~hdmf.spec.spec.DatasetSpec` and
correspondingly :py:class:`~hdmf.spec.DatasetSpec`)
correspondingly :py:class:`~hdmf.spec.spec.DatasetSpec`)

* **Main Modules:** :py:class:`hdmf.spec`
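
Note: as a rough illustration of how these spec classes compose (a sketch; the type name MyTimeSeries is a placeholder, not part of HDMF):

from hdmf.spec import AttributeSpec, DatasetSpec, GroupSpec

# hypothetical type definition: a group holding one dataset that carries a unit attribute
my_group = GroupSpec(
    doc="A container for an example recording",
    data_type_def="MyTimeSeries",  # placeholder type name
    datasets=[
        DatasetSpec(
            name="data",
            doc="The recorded values",
            dtype="float64",
            shape=(None,),
            attributes=[AttributeSpec(name="unit", doc="Unit of measurement", dtype="text")],
        )
    ],
)
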

2 changes: 1 addition & 1 deletion docs/source/validation.rst
@@ -3,7 +3,7 @@
Validating HDMF Data
====================

Validation of NWB files is available through :py:mod:`~pynwb`. See the `PyNWB documentation
Validation of NWB files is available through ``pynwb``. See the `PyNWB documentation
<https://pynwb.readthedocs.io/en/stable/validation.html>`_ for more information.

--------
4 changes: 2 additions & 2 deletions src/hdmf/backends/hdf5/h5_utils.py
@@ -77,7 +77,7 @@ def append(self, dataset, data):
Append a value to the queue
:param dataset: The dataset where the DataChunkIterator is written to
:type dataset: Dataset
:type dataset: :py:class:`~h5py.Dataset`
:param data: DataChunkIterator with the data to be written
:type data: AbstractDataChunkIterator
"""
@@ -604,7 +604,7 @@ def filter_available(filter, allow_plugin_filters):
:param filter: String with the name of the filter, e.g., gzip, szip etc.
int with the registered filter ID, e.g. 307
:type filter: String, int
:type filter: str, int
:param allow_plugin_filters: bool indicating whether the given filter can be dynamically loaded
:return: bool indicating whether the given filter is available
"""
6 changes: 3 additions & 3 deletions src/hdmf/backends/hdf5/h5tools.py
@@ -484,7 +484,7 @@ def read(self, **kwargs):
raise UnsupportedOperation("Cannot read data from file %s in mode '%s'. There are no values."
% (self.source, self.__mode))

@docval(returns='a GroupBuilder representing the data object', rtype='GroupBuilder')
@docval(returns='a GroupBuilder representing the data object', rtype=GroupBuilder)
def read_builder(self):
"""
Read data and return the GroupBuilder representing it.
@@ -978,7 +978,7 @@ def _filler():
'default': True},
{'name': 'export_source', 'type': str,
'doc': 'The source of the builders when exporting', 'default': None},
returns='the Group that was created', rtype='Group')
returns='the Group that was created', rtype=Group)
def write_group(self, **kwargs):
parent, builder = popargs('parent', 'builder', kwargs)
self.logger.debug("Writing GroupBuilder '%s' to parent group '%s'" % (builder.name, parent.name))
@@ -1033,7 +1033,7 @@ def __get_path(self, builder):
{'name': 'builder', 'type': LinkBuilder, 'doc': 'the LinkBuilder to write'},
{'name': 'export_source', 'type': str,
'doc': 'The source of the builders when exporting', 'default': None},
returns='the Link that was created', rtype='Link')
returns='the Link that was created', rtype=(SoftLink, ExternalLink))
def write_link(self, **kwargs):
parent, builder, export_source = getargs('parent', 'builder', 'export_source', kwargs)
self.logger.debug("Writing LinkBuilder '%s' to parent group '%s'" % (builder.name, parent.name))
2 changes: 1 addition & 1 deletion src/hdmf/common/alignedtable.py
@@ -29,7 +29,7 @@ class AlignedDynamicTable(DynamicTable):

@docval(*get_docval(DynamicTable.__init__),
{'name': 'category_tables', 'type': list,
'doc': 'List of DynamicTables to be added to the container. NOTE: Only regular '
'doc': 'List of DynamicTables to be added to the container. NOTE - Only regular '
'DynamicTables are allowed. Using AlignedDynamicTable as a category for '
'AlignedDynamicTable is currently not supported.', 'default': None},
{'name': 'categories', 'type': 'array_data',
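
Note: to make the docstring's restriction concrete, category tables are passed as plain DynamicTable objects, roughly as in this sketch (table names and descriptions are placeholders):

from hdmf.common import AlignedDynamicTable, DynamicTable

# each category must be a regular DynamicTable, not another AlignedDynamicTable
electrodes = DynamicTable(name="electrodes", description="per-channel properties")
waveforms = DynamicTable(name="waveforms", description="per-channel waveform statistics")

combined = AlignedDynamicTable(
    name="recording_metadata",
    description="row-aligned categories of channel metadata",
    category_tables=[electrodes, waveforms],
)
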
2 changes: 1 addition & 1 deletion src/hdmf/common/resources.py
@@ -897,7 +897,7 @@ def get_object_entities(self, **kwargs):

@docval({'name': 'use_categories', 'type': bool, 'default': False,
'doc': 'Use a multi-index on the columns to indicate which category each column belongs to.'},
rtype=pd.DataFrame, returns='A DataFrame with all data merged into a flat, denormalized table.')
rtype='pandas.DataFrame', returns='A DataFrame with all data merged into a flat, denormalized table.')
def to_dataframe(self, **kwargs):
"""
Convert the data from the keys, resources, entities, objects, and object_keys tables
23 changes: 10 additions & 13 deletions src/hdmf/data_utils.py
@@ -36,7 +36,7 @@ def extend_data(data, arg):
"""Add all the elements of the iterable arg to the end of data.
:param data: The array to extend
:type data: list, DataIO, np.ndarray, h5py.Dataset
:type data: list, DataIO, numpy.ndarray, h5py.Dataset
"""
if isinstance(data, (list, DataIO)):
data.extend(arg)
@@ -383,15 +383,12 @@ def _get_data(self, selection: Tuple[slice]) -> np.ndarray:
The developer of a new implementation of the GenericDataChunkIterator must ensure the data is actually
loaded into memory, and not simply mapped.
:param selection: Tuple of slices, each indicating the selection indexed with respect to maxshape for that axis
:type selection: tuple of slices
:param selection: tuple of slices, each indicating the selection indexed with respect to maxshape for that axis.
Each axis of tuple is a slice of the full shape from which to pull data into the buffer.
:type selection: Tuple[slice]
:returns: Array of data specified by selection
:rtype: np.ndarray
Parameters
----------
selection : tuple of slices
Each axis of tuple is a slice of the full shape from which to pull data into the buffer.
:rtype: numpy.ndarray
"""
raise NotImplementedError("The data fetching method has not been built for this DataChunkIterator!")

@@ -615,7 +612,7 @@ def __next__(self):
.. tip::
:py:attr:`numpy.s_` provides a convenient way to generate index tuples using standard array slicing. This
:py:obj:`numpy.s_` provides a convenient way to generate index tuples using standard array slicing. This
is often useful to define the DataChunk.selection of the current chunk
:returns: DataChunk object with the data and selection of the current chunk
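
Note: a short illustration of the tip above (a sketch; shapes are arbitrary):

import numpy as np
from hdmf.data_utils import DataChunk

# numpy.s_ builds the same tuple of slices you would write inside square brackets
selection = np.s_[0:1000, :]  # rows 0-999, all columns
chunk = DataChunk(data=np.zeros((1000, 32)), selection=selection)
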
@@ -800,17 +797,17 @@ def assertEqualShape(data1,
Ensure that the shape of data1 and data2 match along the given dimensions
:param data1: The first input array
:type data1: List, Tuple, np.ndarray, DataChunkIterator etc.
:type data1: List, Tuple, numpy.ndarray, DataChunkIterator
:param data2: The second input array
:type data2: List, Tuple, np.ndarray, DataChunkIterator etc.
:type data2: List, Tuple, numpy.ndarray, DataChunkIterator
:param name1: Optional string with the name of data1
:param name2: Optional string with the name of data2
:param axes1: The dimensions of data1 that should be matched to the dimensions of data2. Set to None to
compare all axes in order.
:type axes1: int, Tuple of ints, List of ints, or None
:type axes1: int, Tuple(int), List(int), None
:param axes2: The dimensions of data2 that should be matched to the dimensions of data1. Must have
the same length as axes1. Set to None to compare all axes in order.
:type axes1: int, Tuple of ints, List of ints, or None
:type axes1: int, Tuple(int), List(int), None
:param ignore_undetermined: Boolean indicating whether non-matching unlimited dimensions should be ignored,
i.e., if two dimension don't match because we can't determine the shape of either one, then
should we ignore that case or treat it as no match
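
Note: a sketch of how the parameters documented above fit together (assumes the return value exposes a boolean result flag, as ShapeValidatorResult in hdmf.data_utils does):

import numpy as np
from hdmf.data_utils import assertEqualShape

a = np.zeros((10, 5))
b = np.zeros((10, 3))

# compare only axis 0 of each array; axis 1 differs but is not checked here
outcome = assertEqualShape(a, b, axes1=0, axes2=0, name1="a", name2="b")
print(outcome.result)  # True, since a.shape[0] == b.shape[0]
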
4 changes: 2 additions & 2 deletions src/hdmf/spec/write.py
@@ -240,9 +240,9 @@ def export_spec(ns_builder, new_data_types, output_dir):
the given data type specs.
Args:
ns_builder - NamespaceBuilder instance used to build the
ns_builder: NamespaceBuilder instance used to build the
namespace and extension
new_data_types - Iterable of specs that represent new data types
new_data_types: Iterable of specs that represent new data types
to be added
"""

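
Note: for reference, the documented arguments come together roughly as in this sketch (namespace name, type name, and output directory are placeholders):

from hdmf.spec import GroupSpec, NamespaceBuilder
from hdmf.spec.write import export_spec

ns_builder = NamespaceBuilder(doc="Example extension", name="my-namespace", version="0.1.0")

new_data_types = [
    GroupSpec(doc="An example container type", data_type_def="MyContainer"),
]

# expected to write my-namespace.namespace.yaml and my-namespace.extensions.yaml into output_dir
export_spec(ns_builder, new_data_types, output_dir=".")
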
4 changes: 2 additions & 2 deletions src/hdmf/testing/testcase.py
@@ -239,8 +239,8 @@ def assertBuilderEqual(self,
:type check_path: bool
:param check_source: Check that the builder.source values are equal
:type check_source: bool
:param message: Custom message to add when any asserts as part of this assert are failing
:type message: str or None (default=None)
:param message: Custom message to add when any asserts as part of this assert are failing (default=None)
:type message: str or None
"""
self.assertTrue(isinstance(builder1, Builder), message)
self.assertTrue(isinstance(builder2, Builder), message)
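
Note: a sketch of the assertion in use inside an HDMF test case (assumes GroupBuilder accepts name and attributes keywords and that the test class derives from hdmf.testing.TestCase):

from hdmf.build import GroupBuilder
from hdmf.testing import TestCase


class TestBuilderComparison(TestCase):
    def test_builders_match(self):
        b1 = GroupBuilder(name="root", attributes={"comment": "example"})
        b2 = GroupBuilder(name="root", attributes={"comment": "example"})
        # path and source differences are ignored here; names, attributes, and children are compared
        self.assertBuilderEqual(b1, b2, check_path=False, check_source=False)
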