Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-121607: Edited source file import recipe to make it more clear #121519

Merged
merged 11 commits into from
Sep 13, 2024

Conversation

ChrisBarker-NOAA
Copy link
Contributor

@ChrisBarker-NOAA ChrisBarker-NOAA commented Jul 8, 2024

gh-87005: Edited source file import recipe to make it more clear

Suggest change to the recipe in the importlib docs for: "Importing a source file directly"

Changes:

Added a note that SourceFileLoader.load_module() is deprecated.

Put the actual recipe code in a function to:

  • make it a bit more clear where the boundaries are
  • have something ready for folks to copy&paste into their code.

Changed the tokenize module to the json module to make it more clear that it's an arbitrary example. At least to me, tokenize has enough to do with importing modules that it took me a bit to realize that there was no tokenizing going on in this case at all and it was just an arbitrary example.

Added a touch more explanation.


📚 Documentation preview 📚: https://cpython-previews--121519.org.readthedocs.build/

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code block should follow PEP 8. Move the import json to the top, and put an extra newline between the import_from_path definition from and the comment. Should look like this:

import importlib.util
import json
import sys


def import_from_path(module_name, file_path):
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module


# For illustrative purposes a name and file path of an
# existing module is needed -- use the json module
# as an example
file_path = json.__file__
module_name = json.__name__

# equivalent of ``import json``
json = import_from_path(module_name, file_path)

@@ -1584,20 +1584,30 @@ Note that if ``name`` is a submodule (contains a dot),
Importing a source file directly
''''''''''''''''''''''''''''''''

.. note:: ``SourceFileLoader.load_module()`` has been deprecated -- this recipe should be used instead.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the extra note? The method isn't used here and it's documented as deprecated, so why call it out here and add a visible callout in the rendered docs that some might find distracting?

Copy link
Contributor Author

@ChrisBarker-NOAA ChrisBarker-NOAA Jul 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did that because the advice to use SourceFileLoader.load_module() is found all over the internet (e.g. https://krbnite.github.io/How-to-Import-a-Python-Module-from-an-Arbitrary-Path/)
-- and it seems a lot simpler than this recipe, and it doesn't raise a deprecation error. so I thought it would be helpful to make it clear why the more complex recipe is preferred.

I removed the note directive to make it less distracting.

That being said, while it might be helpful now, it'll be clutter in the future -- so your call if you want to simply remove it.

Copy link
Contributor

@ncoghlan ncoghlan Jul 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having a lead-in paragraph makes sense, but I think it's pointing at the wrong alternative: the reason importing an arbitrary path is a bad idea is because it's a good way to confuse the import system, and that's why the stdlib's only shorthand API for doing that is deprecated without a straightforward replacement.

The example given uses a top level module/package, so it's potentially OK (although I'm not 100% certain without checking the code for module_from_spec - if that bypasses sys.modules, you might end up with two different versions of JSONDecodeError kicking around, which can easily land you in exception handling hell), but if the file is part of an already importable package than you're far more likely to get some weird state duplication going on (it's a good way to get yourself caught in the "double import trap").

The much safer alternative for executing a Python file and getting access to its top-level module state is to use runpy.run_path (which returns the global namespace that results from executing the file rather than creating a module object).

In my experience, folks wanting to import from arbitrary paths is usually a classic case of the "XY problem": "X" is "I want to run a Python file from a Python program and get access to the resulting global variables", which becomes the "Y" question of "How do I import a Python module given only its path name?". With import being a statement, and the execfile builtin ancient history, it's an understandable leap for people to make, but that doesn't mean it's correct when there's no actual need to create a module object at all (let alone register it with the import system).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it doesn't raise a deprecation error

... yet. 😉 #121604

I did that because the advice to use SourceFileLoader.load_module() is found all over the internet

So is probably outdated Python 2 advice, but that doesn't mean we need to call it out. 😉

I think having a lead-in paragraph makes sense, but I think it's pointing at the wrong alternative: the reason importing an arbitrary path is a bad idea is because it's a good way to confuse the import system

I agree. I think it would be better to have a sentence or paragraph pointing out that the recipe is an approximation of an import statement where you specify the file path. It should also point out that modifying sys.path may be better than circumventing the import system or using runpy.run_path() if you just need the global namespace and not a module object.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this is already in the docs! All this PR does is clean up the snippet a little bit.

Doc/library/importlib.rst Outdated Show resolved Hide resolved
Doc/library/importlib.rst Outdated Show resolved Hide resolved
@bedevere-app
Copy link

bedevere-app bot commented Jul 9, 2024

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

ChrisBarker-NOAA and others added 2 commits July 9, 2024 22:17
@ChrisBarker-NOAA
Copy link
Contributor Author

ChrisBarker-NOAA commented Jul 10, 2024

@ZeroIntensity: the json import serves only as a way to get a module file path for the example use case -- it's not part of the recipe code, so I think it's better to keep it where it was.

@ChrisBarker-NOAA
Copy link
Contributor Author

Thanks for the review.

I have made the requested changes; please review again

@bedevere-app
Copy link

bedevere-app bot commented Jul 10, 2024

Thanks for making the requested changes!

@brettcannon: please review the changes made to this pull request.

@bedevere-app bedevere-app bot requested a review from brettcannon July 10, 2024 05:29
Doc/library/importlib.rst Outdated Show resolved Hide resolved
Co-authored-by: Peter Bierma <[email protected]>
Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM.

@@ -1584,20 +1584,29 @@ Note that if ``name`` is a submodule (contains a dot),
Importing a source file directly
''''''''''''''''''''''''''''''''

``SourceFileLoader.load_module()`` has been deprecated -- this recipe should be used instead.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be changed to be a warning about the recipe and to consider alternatives before using it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't feel qualified to write that warning -- I suppose I could take some of @ncoghlan's note above and try, but maybe someone else could propose text.

FWIW, I guess ncoghlan's note is why this is a recipe in the docs, and not a function in importlib, but I DO think this is a useful thing to do -- yes, i suppose a bit of a foot gun, but still useful.

In regards to the potential double import problem, in this case of the json module - yes, using the recipe for something on the import path is a bad idea, for sure -- that's only there to provide a example that can be run -- I borrowed that from the existing docs, but maybe it would be better to not provide a runnable example, and rather something like:

my_mod = import_from_path("my_mod", "path/to/a_python_file.py")

For the use-case that got me here, I'm using Python as a configuration definition -- yes, very dangerous if you are getting the config from an untrusted source, but that's not the case here (internal data analysis tool) -- and I very much do want a module, not a dict of a namespace.

I think this would be very useful for, e.g. a scriptable application as well, so users could provide external python scripts that could be run.

But, as @ZeroIntensity said, this is already in the docs -- I'm just trying to make it a bit more clear.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be very useful for, e.g. a scriptable application as well, so users could provide external python scripts that could be run.

In most cases, runpy can do that, but that's beside the point anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can take what I outlined as what should be in the paragraph and just write better prose:

I think it would be better to have a sentence or paragraph pointing out that the recipe is an approximation of an import statement where you specify the file path. It should also point out that modifying sys.path may be better than circumventing the import system or using runpy.run_path() if you just need the global namespace and not a module object.

Comment on lines +1605 to +1610
import json
file_path = json.__file__
module_name = json.__name__

# Similar outcome as `import json`.
json = import_from_path(module_name, file_path)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I included this example because there had been a similar one in the previous version.

However, it is (always?) a bad idea to use this recipe to import a module on sys.path, particularly an stdlib one -- so maybe better to simply not provide a runnable example?

I expect the code itself is clear enough on how it can be used.

@ChrisBarker-NOAA
Copy link
Contributor Author

I have made the requested changes; please review again

@bedevere-app
Copy link

bedevere-app bot commented Jul 15, 2024

Thanks for making the requested changes!

@brettcannon: please review the changes made to this pull request.

@bedevere-app bedevere-app bot requested a review from brettcannon July 15, 2024 22:14
Doc/library/importlib.rst Outdated Show resolved Hide resolved
Doc/library/importlib.rst Outdated Show resolved Hide resolved
ChrisBarker-NOAA and others added 2 commits July 15, 2024 15:36
@brettcannon brettcannon added needs backport to 3.12 bug and security fixes needs backport to 3.13 bugs and security fixes labels Sep 13, 2024
@brettcannon brettcannon merged commit 3880917 into python:main Sep 13, 2024
24 of 27 checks passed
@miss-islington-app
Copy link

Thanks @ChrisBarker-NOAA for the PR, and @brettcannon for merging it 🌮🎉.. I'm working now to backport this PR to: 3.12, 3.13.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Sep 13, 2024
…ar (pythonGH-121519)

(cherry picked from commit 3880917)

Co-authored-by: Chris Barker <[email protected]>
Co-authored-by: Brett Cannon <[email protected]>
Co-authored-by: Peter Bierma <[email protected]>
@bedevere-app
Copy link

bedevere-app bot commented Sep 13, 2024

GH-124080 is a backport of this pull request to the 3.13 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Sep 13, 2024
…ar (pythonGH-121519)

(cherry picked from commit 3880917)

Co-authored-by: Chris Barker <[email protected]>
Co-authored-by: Brett Cannon <[email protected]>
Co-authored-by: Peter Bierma <[email protected]>
@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label Sep 13, 2024
@brettcannon
Copy link
Member

Thanks, @ChrisBarker-NOAA !

@bedevere-app
Copy link

bedevere-app bot commented Sep 13, 2024

GH-124081 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.12 bug and security fixes label Sep 13, 2024
@ChrisBarker-NOAA
Copy link
Contributor Author

Thanks all -- good to see something get finished!

Yhg1s pushed a commit that referenced this pull request Sep 24, 2024
…ear (GH-121519) (#124080)

gh-121607: Edited source file import recipe to make it more clear (GH-121519)
(cherry picked from commit 3880917)

Co-authored-by: Chris Barker <[email protected]>
Co-authored-by: Brett Cannon <[email protected]>
Co-authored-by: Peter Bierma <[email protected]>
brettcannon added a commit that referenced this pull request Oct 10, 2024
…ear (GH-121519) (GH-124081)

gh-121607: Edited source file import recipe to make it more clear (GH-121519)
(cherry picked from commit 3880917)

Co-authored-by: Chris Barker <[email protected]>
Co-authored-by: Brett Cannon <[email protected]>
Co-authored-by: Peter Bierma <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Documentation in the Doc dir skip news
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants