Refactor docs to use inclusive language. #342

Merged
merged 9 commits on Jul 9, 2020
Changes from 6 commits
5 changes: 3 additions & 2 deletions CONTRIBUTING.md
@@ -11,7 +11,7 @@ The API of each package as part of the framework is documented in the form of do
A more general introduction in the form of tutorials, guides, and recipes is published as part of the framework documentation: https://docs.signac.io.

Anyone is invited to add to or edit any part of the documentation.
To fix a spelling mistake or to make a minor edits, just click on the *Edit on GitHub** button in the top-right corner.
To fix a spelling mistake or to make minor edits, just click the **Edit on GitHub** button in the top-right corner.
For more substantial edits, we recommend cloning the signac-docs repository to your local machine.

## Triaging Issues
@@ -35,6 +35,7 @@ All contributors must agree to the Contributor Agreement ([ContributorAgreement.
* Preserve backwards-compatibility whenever possible, and make clear if something must change.
* Document any portions of the code that might be less clear to others, especially to new developers.
* Write API documentation in this package, and put usage information, guides, and concept overviews in the [framework documentation](https://docs.signac.io/) ([source](https://github.com/glotzerlab/signac-docs/)).
* Use inclusive language in all documentation and code. The [Google developer documentation style guide](https://developers.google.com/style/inclusive-documentation) is a helpful reference.

Please see the [Support](https://docs.signac.io/projects/signac-core/en/latest/support.html) section as part of the documentation for detailed development guidelines.

@@ -54,7 +55,7 @@ Pull requests should generally be approved by two reviewers prior to merge.

The following items represent a general guideline for points that should be considered during the review process:

* Breaking changes to the API should be avoided whenever possible and require approval by a lead maintainer.
* Breaking changes to the API should be avoided whenever possible and require approval by a project maintainer.
* Significant performance degradations must be avoided unless the regression is necessary to fix a bug.
* Updates for non-trivial bug fixes should be accompanied by a unit test that catches the related issue to avoid future regression.
* The code is easy to follow and sufficiently documented to be understandable even to developers who are not highly familiar with the code.
13 changes: 7 additions & 6 deletions changelog.txt
@@ -27,6 +27,7 @@ Deprecated
++++++++++

- Deprecate the ``create_access_modules`` method in ``Project``, to be removed in 2.0 (#303, #308).
- The ``MainCrawler`` class has replaced the ``MasterCrawler`` class. Both classes are deprecated.


[1.4.0] -- 2020-02-28
@@ -385,7 +386,7 @@ Added

- Introduction of the ``Collection`` class for the management of document collections, such as indexes in memory and on disk.
- Generation of file indexes directly via the ``signac.index_files()`` function.
- Generation of master indexes directly via the ``signac.index()`` function and the ``$ signac index`` command.
- Generation of main indexes directly via the ``signac.index()`` function and the ``$ signac index`` command.
- The API of ``signac_access.py`` files has been simplified, including the possibility to use a blank file for a minimal configuration.
- Use the ``$ signac project --access`` command to create a minimal access module in addition to ``Project.create_access_module()``.
- The update of existing index collections has been simplified by using the ``export()`` function with the ``update=True`` argument, which means that stale documents (those whose associated file or state point no longer exists) are automatically identified and removed.
@@ -395,16 +396,16 @@ Added
Changed (breaking API)
++++++++++++++++++++++

- The ``$ signac index`` command generates a master index instead of a project index. To generate a project index from the command line use ``$ signac project --index`` instead.
- The ``$ signac index`` command generates a main index instead of a project index. To generate a project index from the command line use ``$ signac project --index`` instead.
- The ``SignacProjectCrawler`` class expects the project's root directory as first argument, not the workspace directory.
- The ``get_crawlers()`` function defined within a ``signac_access.py`` access module is expected to yield crawler instances directly, not a mapping of crawler ids and instances.
- The simplification of the ``signac_access.py`` module API is reflected in a reduction of arguments to the ``Project.create_access_module()`` method.

Changed (non-breaking)
++++++++++++++++++++++

- The ``RegexFileCrawler``, ``SignacProjectCrawler`` and ``MasterCrawler`` classes were moved into the root namespace.
- If a ``MasterCrawler`` object is instantiated with the ``raise_on_error`` argument set to True, any errors encountered during crawling are raised instead of ignored and skipped; this simplifies the debugging of erroneous access modules.
- The ``RegexFileCrawler``, ``SignacProjectCrawler`` and ``MainCrawler`` classes were moved into the root namespace.
- If a ``MainCrawler`` object is instantiated with the ``raise_on_error`` argument set to True, any errors encountered during crawling are raised instead of ignored and skipped; this simplifies the debugging of erroneous access modules.
- Improved error message for invalid configuration files.
- Better error messages for invalid ``$ signac find`` queries.
- Check a host configuration on the command line via ``$ signac host --test``.
@@ -546,7 +547,7 @@ Added
Changed
+++++++

- The MasterCrawler logic has been simplified; their primary function is the compilation of index documents from slave crawlers, all export logic, including data mirroring is now provided by the ``signac.export()`` function.
- The MainCrawler logic has been simplified; its primary function is the compilation of index documents from subcrawlers. All export logic, including data mirroring, is now provided by the ``signac.export()`` function.
- Each index document is now uniquely coupled with only one file or data object, which is why ``signac.fetch()`` replaces ``signac.fetch_one()``; the latter has been deprecated and is currently an alias of the former.
- The ``signac.fetch()`` function always returns a file-like object, regardless of format definition.
- The format argument in the crawler ``define()`` function is now optional and now has well-defined behavior for str types. It is encouraged to define a format with a str constant rather than a file-like object type.
@@ -557,7 +558,7 @@ Changed
- The ``contrib.crawler`` module has been renamed to ``contrib.indexing`` to better reflect the semantic context.
- The ``signac.export()`` function now implements the logic for data linking and mirroring.
- Provide default argument for '--indent' option for ``$ signac statepoint`` command.
- Log, but do not reraise exceptions during ``MasterCrawler`` execution, making the compilation of master indexes more robust against errors.
- Log, but do not reraise exceptions during ``MainCrawler`` execution, making the compilation of main indexes more robust against errors.
- The object representation of ``Job`` and ``Project`` instances is simplified.
- The warning verbosity has been reduced when importing modules with optional dependencies.

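The deprecation entry above (``MainCrawler`` replacing ``MasterCrawler``) follows a common rename pattern: keep the old name as a thin subclass that warns on instantiation. A minimal self-contained sketch — the class bodies here are hypothetical stand-ins, not signac's actual implementation:

```python
import warnings

class MainCrawler:
    # Hypothetical stand-in for the renamed class.
    def __init__(self, root):
        self.root = root

class MasterCrawler(MainCrawler):
    # Deprecated alias kept so existing code keeps working.
    def __init__(self, *args, **kwargs):
        warnings.warn(
            "MasterCrawler has been replaced by MainCrawler; "
            "both are deprecated.", DeprecationWarning)
        super().__init__(*args, **kwargs)

# Instantiating the old name still works, but emits a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    crawler = MasterCrawler(root=".")

assert crawler.root == "."
assert any(issubclass(w.category, DeprecationWarning) for w in caught)
```

Because the alias subclasses the new name, ``isinstance`` checks against ``MainCrawler`` continue to hold for old code.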
2 changes: 1 addition & 1 deletion doc/api.rst
@@ -159,7 +159,7 @@ Top-level functions
.. automodule:: signac
:members:
:show-inheritance:
:exclude-members: Project,Collection,RegexFileCrawler,MasterCrawler,SignacProjectCrawler,JSONDict,H5Store,H5StoreManager
:exclude-members: Project,Collection,RegexFileCrawler,MainCrawler,MasterCrawler,SignacProjectCrawler,JSONDict,H5Store,H5StoreManager


Submodules
3 changes: 2 additions & 1 deletion signac/__init__.py
@@ -32,6 +32,7 @@
from .contrib import index_files
from .contrib import index
from .contrib import RegexFileCrawler
from .contrib import MainCrawler
from .contrib import MasterCrawler
from .contrib import SignacProjectCrawler
from .diff import diff_jobs
@@ -57,7 +58,7 @@
'export_pymongo', 'fs',
'index_files', 'index',
'RegexFileCrawler',
'MasterCrawler',
'MainCrawler', 'MasterCrawler',
'SignacProjectCrawler',
'buffered', 'is_buffered', 'flush', 'get_buffer_size', 'get_buffer_load',
'JSONDict',
6 changes: 3 additions & 3 deletions signac/__main__.py
@@ -335,7 +335,7 @@ def main_clone(args):


def main_index(args):
_print_err("Compiling master index for path '{}'...".format(
_print_err("Compiling main index for path '{}'...".format(
os.path.realpath(args.root)))
if args.tags:
args.tags = set(args.tags)
@@ -1303,11 +1303,11 @@ def main():
'root',
nargs='?',
default='.',
help="Specify the root path from where the master index is to be compiled.")
help="Specify the root path from where the main index is to be compiled.")
parser_index.add_argument(
'-t', '--tags',
nargs='+',
help="Specify tags for this master index compilation.")
help="Specify tags for this main index compilation.")
parser_index.set_defaults(func=main_index)

parser_find = subparsers.add_parser(
3 changes: 2 additions & 1 deletion signac/contrib/__init__.py
@@ -11,6 +11,7 @@
from .indexing import RegexFileCrawler
from .indexing import JSONCrawler
from .indexing import SignacProjectCrawler
from .indexing import MainCrawler
from .indexing import MasterCrawler
from .indexing import fetch
from .indexing import fetched
@@ -29,7 +30,7 @@
'indexing',
'Project', 'TemporaryProject', 'get_project', 'init_project', 'get_job',
'BaseCrawler', 'RegexFileCrawler', 'JSONCrawler', 'SignacProjectCrawler',
'MasterCrawler', 'fetch', 'fetched',
'MainCrawler', 'MasterCrawler', 'fetch', 'fetched',
'export_one', 'export', 'export_to_mirror', 'export_pymongo',
'index_files', 'index',
'Collection',
38 changes: 24 additions & 14 deletions signac/contrib/indexing.py
@@ -431,8 +431,8 @@ def crawl(self, depth=0):


# this class is deprecated
class MasterCrawler(BaseCrawler):
r"""Compiles a master index from indexes defined in access modules.
class MainCrawler(BaseCrawler):
r"""Compiles a main index from indexes defined in access modules.

An instance of this crawler will search the data space for access
modules, which by default are named ``signac_access.py``. Once such
@@ -456,7 +456,7 @@ def get_indexes(root):
def get_crawlers(root):
yield MyCrawler(root)

In case that the master crawler has tags, the ``get_indexes()`` function
In case that the main crawler has tags, the ``get_indexes()`` function
will always be ignored while crawlers yielded from the ``get_crawlers()``
function will only be executed in case that they match at least one
of the tags.
@@ -496,7 +496,7 @@ def get_indexes(root):
details="The indexing module is deprecated.")
def __init__(self, root, raise_on_error=False):
self.raise_on_error = raise_on_error
super(MasterCrawler, self).__init__(root=root)
super(MainCrawler, self).__init__(root=root)

def _docs_from_module(self, dirpath, fn):
name = os.path.join(dirpath, fn)
@@ -536,15 +536,15 @@ def _check_tags(tags):

if hasattr(module, 'get_crawlers'):
for crawler in module.get_crawlers(dirpath):
logger.info("Executing slave crawler:\n {}".format(crawler))
logger.info("Executing subcrawler:\n {}".format(crawler))
if _check_tags(getattr(crawler, 'tags', None)):
for doc in crawler.crawl():
doc.setdefault(
KEY_PROJECT, os.path.relpath(dirpath, self.root))
yield doc

def docs_from_file(self, dirpath, fn):
"""Compile master index from file in case it is an access module.
"""Compile main index from file in case it is an access module.

:param dirpath: The path of the file relative to root.
:param fn: The filename of the file.
@@ -563,6 +563,16 @@ def docs_from_file(self, dirpath, fn):
logger.debug("Completed indexing from '{}'.".format(os.path.join(dirpath, fn)))


# Deprecated API
class MasterCrawler(MainCrawler):
def __init__(self, *args, **kwargs):
warnings.warn(
"The MasterCrawler class has been replaced by the MainCrawler class. "
"Both classes are deprecated and will be removed in a future release.",
DeprecationWarning)
super(MasterCrawler, self).__init__(*args, **kwargs)


@deprecated(deprecated_in="1.3", removed_in="2.0", current_version=__version__,
details="The indexing module is deprecated.")
def fetch(doc_or_id, mode='r', mirrors=None, num_tries=3, timeout=60, ignore_local=False):
@@ -931,9 +941,9 @@ class Crawler(RegexFileCrawler):
@deprecated(deprecated_in="1.3", removed_in="2.0", current_version=__version__,
details="The indexing module is deprecated.")
def index(root='.', tags=None, depth=0, **kwargs):
r"""Generate a master index.
r"""Generate a main index.

A master index is compiled from other indexes by searching
A main index is compiled from other indexes by searching
for modules named ``signac_access.py`` and compiling all
indexes which are yielded from a function ``get_indexes(root)``
defined within that module as well as the indexes generated by
@@ -950,21 +960,21 @@ def get_indexes(root):
yield signac.index_files(root, r'.*\.txt')

Internally, this function constructs an instance of
:py:class:`.MasterCrawler` and all extra key-word arguments
will be forwarded to the constructor of said master crawler.
:py:class:`.MainCrawler` and all extra keyword arguments
will be forwarded to the constructor of said main crawler.

:param root: Look for access modules under this directory path.
:type root: str
:param tags: If tags are provided, do not execute slave crawlers
:param tags: If tags are provided, do not execute subcrawlers
that don't match the same tags.
:param depth: Limit the search to the specified directory depth.
:param kwargs: These keyword-arguments are forwarded to the
internal MasterCrawler instance.
internal MainCrawler instance.
:type depth: int
:yields: The master index documents as instances of dict.
:yields: The main index documents as instances of dict.
"""

class Crawler(MasterCrawler):
class Crawler(MainCrawler):
pass

if tags is not None:
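The ``MainCrawler`` docstring above describes access modules that expose ``get_indexes()`` and/or ``get_crawlers()``. For illustration, here is a hypothetical minimal ``signac_access.py`` of that shape; the index documents are made up, not taken from signac:

```python
# signac_access.py (hypothetical example access module)

def get_indexes(root):
    # Yield one pre-built index: a list of index documents (plain dicts).
    yield [{"_id": "doc-0", "root": root, "format": None}]

def get_crawlers(root):
    # Yield crawler instances. When the main crawler itself carries tags,
    # only crawlers matching at least one of those tags are executed.
    # This example module defines no crawlers.
    return iter(())

# A main crawler that finds this module would consume it roughly like so:
index = next(get_indexes("/data"))
assert index[0]["root"] == "/data"
assert list(get_crawlers("/data")) == []
```

Per the docstring, when the main crawler has tags, ``get_indexes()`` is ignored entirely and only tag-matching crawlers from ``get_crawlers()`` run.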
27 changes: 16 additions & 11 deletions signac/contrib/project.py
@@ -29,7 +29,7 @@
from .job import Job
from .hashing import calc_id
from .indexing import SignacProjectCrawler
from .indexing import MasterCrawler
from .indexing import MainCrawler
from .utility import _mkdir_p, split_and_print_progress, _nested_dicts_to_dotted_keys
from .schema import ProjectSchema
from .errors import WorkspaceError
@@ -47,7 +47,7 @@ def get_indexes(root):
yield signac.get_project(root).index()
"""

ACCESS_MODULE_MASTER = """#!/usr/bin/env python
ACCESS_MODULE_MAIN = """#!/usr/bin/env python
# -*- coding: utf-8 -*-
import signac

@@ -1450,32 +1450,37 @@ class Crawler(SignacProjectCrawler):

@deprecated(deprecated_in="1.5", removed_in="2.0", current_version=__version__,
details="Access modules are deprecated.")
def create_access_module(self, filename=None, master=True):
def create_access_module(self, filename=None, main=True, master=None):
"""Create the access module for indexing.

This method generates the access module required to make
this project's index part of a master index.
this project's index part of a main index.

:param filename: The name of the access module file.
Defaults to the standard name and should usually
not be changed.
:type filename: str
:param master: If True, add directives for the compilation
of a master index when executing the module.
:type master: bool
:param main: If True, add directives for the compilation
of a main index when executing the module.
:type main: bool
:param master: Deprecated parameter. Replaced by main.
:returns: The name of the created access module.
:rtype: str
"""
if master is not None:
warnings.warn("The parameter master has been renamed to main.", DeprecationWarning)
main = master

if filename is None:
filename = os.path.join(
self.root_directory(),
MasterCrawler.FN_ACCESS_MODULE)
MainCrawler.FN_ACCESS_MODULE)
with open(filename, 'x') as file:
if master:
file.write(ACCESS_MODULE_MASTER)
if main:
file.write(ACCESS_MODULE_MAIN)
else:
file.write(ACCESS_MODULE_MINIMAL)
if master:
if main:
mode = os.stat(filename).st_mode | stat.S_IEXEC
os.chmod(filename, mode)
logger.info("Created access module file '{}'.".format(filename))
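The ``master`` → ``main`` parameter rename in ``create_access_module`` above uses a keyword-deprecation pattern that can be sketched in isolation. The function body below is illustrative only, not signac's actual implementation:

```python
import warnings

def create_access_module(filename=None, main=True, master=None):
    # Accept the old keyword, warn, and forward its value to the new one.
    if master is not None:
        warnings.warn("The parameter master has been renamed to main.",
                      DeprecationWarning)
        main = master
    return main

# Passing the old keyword still works, but warns:
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    assert create_access_module(master=False) is False

assert any(issubclass(w.category, DeprecationWarning) for w in caught)

# The new keyword (and the default) produce no warning:
assert create_access_module() is True
assert create_access_module(main=False) is False
```

Using ``None`` as the sentinel default lets the function distinguish "caller passed the old keyword" from "caller omitted it", so the warning fires only for legacy call sites.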
16 changes: 8 additions & 8 deletions tests/test_indexing.py
@@ -283,9 +283,9 @@ def test_json_crawler(self):
ids = set(doc['_id'] for doc in docs)
assert len(ids) == len(docs)

def test_master_crawler(self):
def test_main_crawler(self):
self.setup_project()
crawler = indexing.MasterCrawler(root=self._tmp_dir.name)
crawler = indexing.MainCrawler(root=self._tmp_dir.name)
crawler.tags = {'test1'}
no_find = True
with pytest.deprecated_call():
@@ -326,7 +326,7 @@ def test_fetch(self):
with pytest.raises(errors.FetchError):
signac.fetch(dict())
self.setup_project()
crawler = indexing.MasterCrawler(root=self._tmp_dir.name)
crawler = indexing.MainCrawler(root=self._tmp_dir.name)
crawler.tags = {'test1'}
with pytest.deprecated_call():
docs = list(crawler.crawl())
@@ -343,7 +343,7 @@

def test_export_one(self):
self.setup_project()
crawler = indexing.MasterCrawler(root=self._tmp_dir.name)
crawler = indexing.MainCrawler(root=self._tmp_dir.name)
crawler.tags = {'test1'}
index = self.get_index_collection()
with pytest.deprecated_call():
@@ -355,7 +355,7 @@

def test_export(self):
self.setup_project()
crawler = indexing.MasterCrawler(root=self._tmp_dir.name)
crawler = indexing.MainCrawler(root=self._tmp_dir.name)
crawler.tags = {'test1'}
index = self.get_index_collection()
with pytest.deprecated_call():
@@ -403,7 +403,7 @@ def test_export_with_update(self):

def test_export_to_mirror(self):
self.setup_project()
crawler = indexing.MasterCrawler(root=self._tmp_dir.name)
crawler = indexing.MainCrawler(root=self._tmp_dir.name)
crawler.tags = {'test1'}
index = self.get_index_collection()
mirror = _TestFS()
@@ -425,9 +425,9 @@
with mirror.get(doc['file_id']):
pass

def test_master_crawler_tags(self):
def test_main_crawler_tags(self):
self.setup_project()
crawler = indexing.MasterCrawler(root=self._tmp_dir.name)
crawler = indexing.MainCrawler(root=self._tmp_dir.name)
with pytest.deprecated_call():
assert 0 == len(list(crawler.crawl()))
crawler.tags = None
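The tests above wrap crawler calls in ``pytest.deprecated_call()``. For readers without a pytest background, roughly the same check can be written with only the standard library; ``legacy()`` here is a made-up stand-in for a deprecated crawler call:

```python
import warnings

def legacy():
    # Made-up function standing in for a deprecated API call.
    warnings.warn("legacy() is deprecated", DeprecationWarning)
    return 42

# Roughly what pytest.deprecated_call() asserts: the call both
# succeeds and emits at least one DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = legacy()

assert result == 42
assert any(issubclass(w.category, DeprecationWarning) for w in caught)
```

``pytest.deprecated_call()`` additionally fails the test when no warning is raised at all, which the recording context above checks via the final assertion.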