traverse_project: Better handle overlap between pyenv and code/deps dir
Issue #465 highlights a peculiar case where the top-level project
directory is also itself the root of a virtualenv. Our traversal code
will skip any (given or detected) Python environment, under the
assumption that we never want to analyze code/deps from a package
installed within. However, here we have a special case where -- from the
traversal code's POV -- we have found a pyenv rooted at a directory that
also appears in the given settings.code and/or settings.deps (by default
settings.code, settings.deps and settings.pyenvs are all ".").

Work around this special case as follows: When a Python environment is
found rooted at a path that is _also_ directly given in either
settings.code or settings.deps, then we _don't_ want to skip traversal
of the entire directory; rather, we only skip traversal of the specific
subdirectories that form the actual contents of the environment:

 - Any package dirs found by validate_pyenv_source() (i.e. any
   "site-packages" dirs).
 - The "bin/" (or "Scripts\" on Windows) subdir where we expect to find
   any tools installed into the environment.
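
The rule above can be sketched as a small standalone function. This is only
an illustration, not the FawltyDeps API: the names dirs_to_skip, pyenv_root,
searched_dirs and platform are hypothetical, and plain Path objects stand in
for the package dirs returned by validate_pyenv_source().

```python
from pathlib import Path
from typing import Iterable, Set


def dirs_to_skip(
    pyenv_root: Path,
    package_dirs: Iterable[Path],
    searched_dirs: Set[Path],
    platform: str,
) -> Set[Path]:
    """Return the directories to exclude from traversal for one pyenv.

    Hypothetical helper: searched_dirs plays the role of
    settings.code | settings.deps, and platform that of sys.platform.
    """
    if pyenv_root not in searched_dirs:
        # Ordinary case: the whole environment is skipped.
        return {pyenv_root}
    # Overlap case: keep traversing the root, but skip the parts that
    # hold installed packages and installed tools.
    skipped = set(package_dirs)
    tools_dir = "Scripts" if platform.startswith("win") else "bin"
    skipped.add(pyenv_root / tools_dir)
    return skipped
```

For a project whose root "." is also the venv root, this would skip only the
site-packages dir and "bin/" (or "Scripts\"), leaving the rest of "." open
for code/deps discovery.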

We could probably extend this handling to _any_ Python environment
(including those found after we start the traversal proper), but I'm not
100% sure that this won't trigger false positives where some weird
Python environment (e.g. a Conda or poetry2nix environment) happens to
contain extra "stuff" missing from the above list that would adversely
affect our analysis.
jherland committed Jan 28, 2025
1 parent 729ed39 commit 339aaf7
Showing 1 changed file with 13 additions and 1 deletion.
14 changes: 13 additions & 1 deletion fawltydeps/traverse_project.py
@@ -1,6 +1,7 @@
 """Traverse a project to identify appropriate inputs to FawltyDeps."""

 import logging
+import sys
 from pathlib import Path
 from typing import AbstractSet, Iterator, Optional, Set, Tuple, Type, Union

@@ -129,7 +130,18 @@ def find_sources(  # noqa: C901, PLR0912, PLR0915
         if package_dirs is not None:  # Python environment dir given directly
             logger.debug(f"find_sources() Found {package_dirs}")
             yield from package_dirs
-            traversal.skip_dir(path)  # disable traversal of path below
+            if path in (settings.code | settings.deps):
+                # We are also searching this dir for code/deps, hence we should
+                # not skip it altogether, but rather only skip the parts of it
+                # that likely contain installed 3rd-party packages
+                for pyenv_src in package_dirs:
+                    traversal.skip_dir(pyenv_src.path)
+                if sys.platform.startswith("win"):  # Windows
+                    traversal.skip_dir(path / "Scripts")
+                else:  # Assume POSIX
+                    traversal.skip_dir(path / "bin")
+            else:
+                traversal.skip_dir(path)  # disable traversal of entire Python env
         else:  # must traverse directory to find Python environments
             traversal.add(path, PyEnvSource)
