Address packaging warnings related to data files included inside the flintrock package directory #368

Open
nchammas opened this issue Nov 26, 2023 · 0 comments

nchammas commented Nov 26, 2023

Installing Flintrock with `pip install -vvv -e .` shows a few warnings like this:

  /private/var/folders/_0/vqvhqvnj5wq9s4s5drcb30pc0000gn/T/pip-build-env-lm653aie/overlay/lib/python3.8/site-packages/setuptools/command/build_py.py:207: _Warning: Package 'flintrock.scripts' is absent from the `packages` configuration.
  !!

          ********************************************************************************
          ############################
          # Package would be ignored #
          ############################
          Python recognizes 'flintrock.scripts' as an importable package[^1],
          but it is absent from setuptools' `packages` configuration.

          This leads to an ambiguous overall configuration. If you want to distribute this
          package, please make sure that 'flintrock.scripts' is explicitly added
          to the `packages` configuration field.

          Alternatively, you can also rely on setuptools' discovery methods
          (for example by using `find_namespace_packages(...)`/`find_namespace:`
          instead of `find_packages(...)`/`find:`).

          You can read more about "package discovery" on setuptools documentation page:

          - https://setuptools.pypa.io/en/latest/userguide/package_discovery.html

          If you don't want 'flintrock.scripts' to be distributed and are
          already explicitly excluding 'flintrock.scripts' via
          `find_namespace_packages(...)/find_namespace` or `find_packages(...)/find`,
          you can try to use `exclude_package_data`, or `include-package-data=False` in
          combination with a more fine grained `package-data` configuration.

          You can read more about "package data files" on setuptools documentation page:

          - https://setuptools.pypa.io/en/latest/userguide/datafiles.html


          [^1]: For Python, any directory (with suitable naming) can be imported,
                even if it does not contain any `.py` files.
                On the other hand, currently there is no concept of package data
                directory, all directories are treated like packages.
          ********************************************************************************

  !!

These warnings are triggered for all the data folders we package for use during cluster launch and configuration:

  • flintrock/scripts
  • flintrock/templates/hadoop/conf
  • flintrock/templates/spark/conf

This happens because Python can import any subdirectory as a package, even one that contains no `.py` files, so a subdirectory of data doubles ambiguously as a Python package. This is discussed extensively in pypa/setuptools#3340.
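To make the ambiguity concrete, here is a minimal sketch using only the standard library. The names `demo_pkg` and `setup-cluster.sh` are made up for illustration and are not Flintrock's actual files. It builds a regular package containing a data-only subdirectory and asks the import system whether that subdirectory resolves as a package.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def build_demo_tree(root: Path) -> None:
    """Create a regular package with a data-only subdirectory (hypothetical names)."""
    pkg = root / "demo_pkg"
    (pkg / "scripts").mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    # A data file only: no __init__.py and no .py modules in this directory.
    (pkg / "scripts" / "setup-cluster.sh").write_text("#!/bin/sh\n")


def resolves_as_package(root: Path, dotted_name: str) -> bool:
    """Return True if the import system sees `dotted_name` as a package."""
    sys.path.insert(0, str(root))
    try:
        importlib.invalidate_caches()
        spec = importlib.util.find_spec(dotted_name)
        # Packages (including namespace packages) have search locations set.
        return spec is not None and spec.submodule_search_locations is not None
    finally:
        sys.path.remove(str(root))
        # find_spec imports parent packages; drop them so the demo leaves no trace.
        top_level = dotted_name.split(".")[0]
        for name in [m for m in sys.modules if m.split(".")[0] == top_level]:
            del sys.modules[name]


with tempfile.TemporaryDirectory() as tmp:
    build_demo_tree(Path(tmp))
    data_dir_is_package = resolves_as_package(Path(tmp), "demo_pkg.scripts")

print(data_dir_is_package)
```

The data directory resolves as a namespace package even though it holds nothing importable, which is the situation the setuptools warning describes.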

And just for the record, we also package the following individual file as data (which doesn't trigger any warnings):

  • config.yaml.template

It's not a big deal, and I don't think it will impact users for now. Setuptools may in the future be more forceful about discouraging this kind of mixing of Python packages and data.

I don't know how to fix this, so I'm just noting it here for the future. Perhaps we need to reorganize the code from a flat-layout to a src-layout and have the data files live outside the flintrock/ package directory. I'm not sure.
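One alternative the warning text itself suggests is switching from `find_packages(...)` to `find_namespace_packages(...)`. The sketch below shows only how the two discovery functions differ on a temporary tree that mirrors the three data folders listed above. It assumes setuptools is installed, and the `include` pattern is an illustrative choice. Whether this is the right fix for Flintrock, and how it interacts with the existing package data configuration, is not verified here.

```python
import tempfile
from pathlib import Path

from setuptools import find_namespace_packages, find_packages


def make_tree(root: Path) -> None:
    """Recreate the directory shape from the issue, with no real content."""
    for data_dir in ("scripts", "templates/hadoop/conf", "templates/spark/conf"):
        (root / "flintrock" / data_dir).mkdir(parents=True)
    # Only the top-level directory is a regular package with __init__.py.
    (root / "flintrock" / "__init__.py").write_text("")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    make_tree(root)
    # find_packages only reports directories that contain __init__.py.
    regular = sorted(find_packages(where=str(root)))
    # find_namespace_packages also reports plain directories under the package.
    namespace = sorted(
        find_namespace_packages(where=str(root), include=["flintrock", "flintrock.*"])
    )

print(regular)
print(namespace)
```

With namespace discovery the data folders become explicitly listed packages, which removes the ambiguity the warning complains about. The src-layout reorganization mentioned above would instead move the data out of the package directory altogether.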

Currently using:

  • pip 23.3.1
  • setuptools 56.0.0