Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for editable wheels #47

Closed
FFY00 opened this issue Apr 18, 2022 · 26 comments
Closed

Add support for editable wheels #47

FFY00 opened this issue Apr 18, 2022 · 26 comments
Labels
enhancement New feature or request
Milestone

Comments

@FFY00
Copy link
Member

FFY00 commented Apr 18, 2022

The last thing preventing this from being adopted in the wild is probably the missing support for editable wheels.

For the implementation, the idea is simple, create an editable wheel with the files from the build directory. The editable install would always need to be tied down to the build directory, which is the most annoying thing with this approach, but doing it other way would be more difficult, and very likely require some more work from the build system users.

The main worries with the implementation is that we might need to fix up some RPATHs, but I am hoping we won't, as that would bring in more complexity.
We will also likely need to do some code refactoring to re-use the logic from the normal wheel building, but I suspect that might end being too complex and not worth it. In which case, we might be better off with a little bit of code duplication. The jury's still out on that, we need to investigate and then make a call.

@rgommers does this sound reasonable to you?

@FFY00 FFY00 added the enhancement New feature or request label Apr 18, 2022
@rgommers
Copy link
Contributor

The editable install would always need to be tied down to the build directory

I'm not sure what this means exactly, can you elaborate?

The files needed for a working package are split between in-tree and build-dir. My understanding was that you planned to write out an install directory with symlinks for all files, with those symlinks pointing back to both directories.

@FFY00
Copy link
Member Author

FFY00 commented Apr 18, 2022

Not really, if I have an install directory, we'd have to run meson install to use it. That can be problematic as it is very tricky for us to inject code to do so on import. The proper distributable wheel is derived from a meson install installation. What I am planning to do here is to use the build directory itself instead.

Does that make sense to you? Is there anything you want me to clarify?

@rgommers
Copy link
Contributor

Ah, do you mean put symlinks into the build dir, pointing back at .py files (and data files etc.) that are under version control in-tree? If so, that makes sense, yes.

@FFY00
Copy link
Member Author

FFY00 commented Apr 18, 2022

No, the editable wheel would include symlinks to the build directory or source tree, depending if the file is generated or not.

The current approach for normal wheels is to run meson install and get the files from there, here we would be creating a wheel with symlinks to the build directory or source tree directly, intead of going through meson install.

The issue is that we would depend on the build directory existing, but I don't think it is that big of a deal since we already depend on the source tree anyway.

An alternative would be to maintain our own build directory, but that has a couple issues which I can go into more detail if you want.

@rgommers
Copy link
Contributor

No, the editable wheel would include symlinks to the build directory or source tree, depending if the file is generated or not.

Ah, now I get it. I thought there was an issue with this, that's why I didn't make the connection that the symlinks would be in the wheel. Found the discussion now: cupy/cupy#6261 (comment). It says " when generating a wheel all symlinks are "materialized"". Not sure if that applies here or not - I guess not, because the root cause is the wheel package not providing support for symlinks (the CuPy issue links to pypa/wheel#203), and we're not using wheel here.

What I had expected was writing directly a separate directory of symlinks, and then use a .pth file or another mechanism to point to that directory.

Okay, sounds like this should work and if it does, seems like a good plan to me.

The issue is that we would depend on the build directory existing, but I don't think it is that big of a deal since we already depend on the source tree anyway.

Agreed, that's perfectly fine I'd say.

@FFY00
Copy link
Member Author

FFY00 commented Apr 18, 2022

What I had expected was writing directly a separate directory of symlinks, and then use a .pth file or another mechanism to point to that directory.

That would be a bit tricky, and is not really needed since we can include symlinks in the wheel.

I think long term my plan would probably be to add an import hook, which would do the same job as a .pth file but make the implementation easier and more efficient. Unfortunately, there isn't a mechanism to inject the hook yet, and that is something I want to work on. I think there is an agreement already on how this should work (I couldn't find the links from a quick search), so we mostly just need to drive the PEP.

@rgommers
Copy link
Contributor

I think long term my plan would probably be to add an import hook

Now I'm a little confused again - if the wheel contains all the symlinks we need, why do we also need an import hook? Or are these two unrelated points?

@FFY00
Copy link
Member Author

FFY00 commented Apr 18, 2022

That is just a future plan, which would be similar to what you described initially with the .pth file. Right now the simplest approach is just to put symlinks in the wheel, later we might want to implement some checks and other stuff to make the user experience better, but that requires running code on import, which will require an approach like that 😅

@leofang
Copy link

leofang commented Apr 22, 2022

Right now the simplest approach is just to put symlinks in the wheel

I am confused. The madness about wheel support is you can't do this.

@FFY00
Copy link
Member Author

FFY00 commented Apr 22, 2022

Yeah, I was looking into that. The issue is that pypa/wheel does not support extracting symlinks, making this approach unusable, as most projects use it to extract wheels.

We will have to go with the more complex approach of using a .pth file.

@FFY00
Copy link
Member Author

FFY00 commented Dec 28, 2022

Implemented in #87, we will go through a pre-release before a proper release with the new feature.

@rgommers rgommers modified the milestone: v0.12.0 Dec 28, 2022
@rgommers
Copy link
Contributor

🎉. Let's close this then

@rgommers rgommers added this to the v0.13.0 milestone Dec 28, 2022
@lagru
Copy link

lagru commented Dec 29, 2022

(If you prefer a different place to discuss this, I'm happy to post there!)

Hey, thanks everyone for working on this in #87! I've just tested this with the current main (8a13f8d) and can confirm that this does seem to use the new meson-based system when creating an editable install of scikit-image (both with and without --no-build-isolation). 🎉 I think the synchronizing behavior on import is useful if a bit opaque without MESONPY_EDITABLE_VERBOSE=1 set. I'd argue that setting should be enabled by default.

However, I also noticed that this is not (yet) a drop-in replacement for the behavior of old-style editable installs: as far as I understand it the approach implemented in #87 doesn't actually use the source files of a project but maintains a copy in .mesonpy/editable/install/usr/lib/python3.10/site-packages/. This copy is updated each time the editable install is imported. Naive question: what is blocking us from pointing scikit_image-editable-hook.pth at the source directory and syncing .mesonpy/editable/build/ with the source directory? I'd be very happy to try and help with this or alternative solutions.

Unfortunately, the current behavior does not provide the main feature of an editable install that I had hoped for: that executed .py files are identical with the source files in a project. I would rather call the current implementation an "syncing or updating install" rather than an editable one. IDEs / debuggers (e.g. PyCharm) execute the code in .mesonpy. This get's awkward really quickly, e.g. when debugging. I have to navigate to the appropriate file in .mesonpy by hand, trigger a sync/rebuild, place a breakpoint and then start the debugger. If I want to iterate on the original source code (the one stuff in .mesonpy is synced with) I have to repeat this process several times. To me, that makes this workflow really painful.

I also noticed that the current trigger mechanism also doesn't play well with ipython's autoreload magic

Sorry, for not providing this feedback earlier, I did not really understand the approach that was suggested here. I also realize that this is a "but it breaks my workflow". 😉 However, I imagine that many other developers will face similar problems. So I thought I'd bring this up in the hope that there's a solution.

At this point, I'm also wondering how you guys solve this use case for yourselves. Are you using tools that make this issue go away or do you just rarely use a debugger?

@rgommers
Copy link
Contributor

@lagru thanks for your feedback! Before diving into possible design improvements, let's first detail out the usage issues a bit more. Using the debugger seems like the key issue (also brought up for SciPy).

Unfortunately, the current behavior does not provide the main feature of an editable install that I had hoped for: that executed .py files are identical with the source files in a project.

This is only an issue when there's an observable difference. If there's an auto-sync on every relevant user action, it should not matter.

I have to navigate to the appropriate file in .mesonpy by hand, trigger a sync/rebuild, place a breakpoint and then start the debugger.
...
Are you using tools that make this issue go away or do you just rarely use a debugger?

I have to say that I'm not using a debugger on a daily basis, but when I do use one then here are the ways I use it:

  1. For compiled code, by running gdb from the command line (see http://scipy.github.io/devdocs/dev/contributor/compiled_code.html)
  2. For Python code, figure out where I want to stop and then insert import ipdb; ipdb.set_trace() there

That should work fine (auto-syncing will copy that set_trace statement to where the code encounters it. So question to you: can you explain in more detail how you use a debugger and what exactly is not working?

@tupui
Copy link

tupui commented Dec 30, 2022

Moving part of the discussion we had on SciPy's issue tracker.

If I use Pytest from the CLI and ask it to discover SciPy's test or start an interpreter and import SciPy, it's working. The issue is that it uses files from .mesonpy/editable instead of the source themselves. See bellow.

❯ python -m pytest --pyargs scipy.stats.tests.test_qmc
========================================================================================================= test session starts =========================================================================================================
platform darwin -- Python 3.10.6, pytest-7.1.2, pluggy-1.0.0
rootdir: /Users/tupui
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, timeout-2.1.0, cov-3.0.0
collected 135 items                                                                                                                                                                                                                   

Documents/Code/scipy/.mesonpy/editable/install/opt/homebrew/Caskroom/mambaforge/base/envs/scipy-dev/lib/python3.10/site-packages/scipy/stats/tests/test_qmc.py .......................................s.s.s.s.s.s...sss........ [ 47%]
............................s.s.s.s.s...sss............................                                                                                                                                                         [100%]

=================================================================================================== 118 passed, 17 skipped in 1.90s ===================================================================================================

This is a problem as with the IDE's if you put a breakpoint or ask to run a specific test, it's doing this in the original file. It has no knowledge of .mesonpy/editable.

And when trying to use the IDE to launch a test (you have the original file open, and click on one test to make it run), it complains about an import mismatch:

ImportError while loading conftest '/Users/tupui/Documents/Code/scipy/scipy/conftest.py'.
_pytest.pathlib.ImportPathMismatchError: ('scipy.conftest', '/Users/tupui/Documents/Code/scipy/.mesonpy/editable/install/opt/homebrew/Caskroom/mambaforge/base/envs/scipy-dev/lib/python3.10/site-packages/scipy/conftest.py', PosixPath('/Users/tupui/Documents/Code/scipy/scipy/conftest.py'))

The current workaround is to open in the IDE files in .mesonpy/editable and put the break points there or start tests from there. Hence not "in-place".

@FFY00
Copy link
Member Author

FFY00 commented Dec 30, 2022

I wonder if a custom loader built on top of the Traversable protocol will behave correctly here. If it doesn't, we may be able to fix it in pdb, but it needs to be explored.

@lagru
Copy link

lagru commented Dec 30, 2022

[...] let's first detail out the usage issues a bit more. [...] can you explain in more detail how you use a debugger and what exactly is not working?

Good call and gladly! A very common use case is the following: I want to test the behavior of a Python snippet, either to reproduce a bug report or test some other functionality. I'll paste that into PyCharm's IPython console, enable the debugger, navigate to the appropriate source file(s) in the IDE to set a breakpoint and then execute the prompt. For the navigation part I can use all the bell's an whistles the IDE provides, usually I just search for the function I need or click on a file path in a provided traceback (the last bit works with #87 too).

This approach allows for fast iteration, as I can easily repeat the prompt with different inputs, modify breakpoints and try different patches in the same place. With #87, the code I step through during debugging and the code I try patches on is in two different places. #87 helps in synchronizing in one direction but I doesn't help with having to open the duplicated file in the IDE twice and keeping in mind which is which. And it doesn't help that the executed part is nested several directories deep. I hope this explanation makes it a lot clearer for you.

Thanks for pointing out ipdb, I'll give it a whirl! However, ideally I would hope that meson-python would play nice out of the box with as much developer tooling out there as possible. Which brings me to the next part.

[...] that executed .py files are identical with the source files in a project.

This is only an issue when there's an observable difference. If there's an auto-sync on every relevant user action, it should not matter.

There is an observable difference: the location in the file tree is different. This results in a lot of issues because it's an assumption made by most dev tools in Python land: e.g. pytest as pointed out by @tupui, debuggers, autoreload, ... and many we are not aware of (yet). The synchronization solves only one part of the problem, and does so under the big assumption that a fresh import has been made beforehand (e.g. not true for workflows staying / iterating in an active shell).

I wonder if a custom loader built on top of the Traversable protocol will behave correctly here. If it doesn't, we may be able to fix it in pdb, but it needs to be explored.

This would only solve the problem for one tool out there, wouldn't it? Patching or accommodating dev tools out there (or expecting them to do it) sounds like a significant additional maintenance burden to me.

Earlier I mentioned pointing scikit_image-editable-hook.pth at the source directory and moving build binaries from .mesonpy/editable/build/ to the correct place in the source directory? The idea is to leave the meson build process untouched and isolated but afterwards move the created artifacts back to the source tree, which seems like an "easy" problem to me. Would this or something like it be feasible? If time and effort are an issue here, I say the additional work is well worth it. And I'm happy to back that up with help if that's wanted and useful. 🙂

@rgommers
Copy link
Contributor

rgommers commented Jan 3, 2023

It seems to me that symlinks will still be the right way to solve this. That way the IDE code navigation will end up opening the right files for sure; pdb should also work then I'd hope (but not 100% sure of that).

Once wheels support symlinks, that should be the way to go imho. Moving that symlink support along is good for other reasons as well.

Earlier I mentioned pointing scikit_image-editable-hook.pth at the source directory and moving build binaries from .mesonpy/editable/build/ to the correct place in the source directory? The idea is to leave the meson build process untouched and isolated but afterwards move the created artifacts back to the source tree, which seems like an "easy" problem to me

That's not great I think. Having the source tree be completely unchanged is a natural and nice property of out of tree builds. For example, it makes it unnecessary to maintain elaborate .gitignore files, it ensures that install_subdir(...) will only pick up the intended files, it allows having multiple builds in parallel in different build dirs, etc.

@lagru
Copy link

lagru commented Jan 3, 2023

Would .mesonpy contain symlinks to the source tree? That could indeed make synchronization unnecessary. I'm not sure if this is what you mean and how it will effect the debugging issue though. That might depend on how a debugger is finding the source file of an imported module, maybe through the module's __file__ or __path__ attributes? I think these will still point at .mesonpy as long as editable-hook.pth is pointing there as well. Symlinks seem not to get resolved during the import (at least that's what happened in a quick local test). But I think I get now what @FFY00 might be hinting at with the "custom loader built on top of the Traversable protocol"; maybe that can resolve symlinks during import... 👍

Having the source tree be completely unchanged is a natural and nice property of out of tree builds. For example, it makes it unnecessary to maintain elaborate .gitignore files, it ensures that install_subdir(...) will only pick up the intended files, it allows having multiple builds in parallel in different build dirs, etc.

That is indeed a really nice behavior to have in a build system and I completely agree with this as a worthwhile goal.

@rgommers
Copy link
Contributor

rgommers commented Jan 4, 2023

Would .mesonpy contain symlinks to the source tree?

Yes indeed. That was our original idea, and it seems like the most obvious one. The blocker was/is that the wheel format does not support symlinks.

And when trying to use the IDE to launch a test (you have the original file open, and click on one test to make it run), it complains about an import mismatch:

This conftest.py mismatch thing may not be too difficult to solve. @tupui I think the only problem here is that conftest.py is found when walking up the directory tree starting from the test you're trying to run via the IDE interface. If you'd rename conftest.py to _conftest and then installed it using [fs.copyfile`](https://mesonbuild.com/Fs-module.html#copyfile):

fs.copyfile('_conftest.py', 'conftest.py', install: true, install_dir: py3.get_install_dir() / 'scipy')

then I suspect that things will work fine. Same as things being already fine if you have your tests/ dir next to your package source tree, rather than all the tests inside the package.

@tupui
Copy link

tupui commented Jan 5, 2023

@rgommers I needed to do install: false. I am not sure I did all correctly in the build file though. Still, in the install directory I have conftest.py whereas in the source directory I have _conftest.py.

Now it's going a bit further but fails with another mismatch (this is what I launch a test from the UI, clicking a green arrow):

/opt/homebrew/Caskroom/mambaforge/base/envs/scipy-dev/bin/python /Applications/PyCharm.app/Contents/plugins/python/helpers/pycharm/_jb_pytest_runner.py --target test_qmc.py::TestUtils 
Testing started at 16:49 ...
Launching pytest with arguments test_qmc.py::TestUtils --no-header --no-summary -q in /Users/tupui/Documents/Code/scipy/scipy/stats/tests

============================= test session starts ==============================
collecting ... ERROR: not found: /Users/tupui/Documents/Code/scipy/scipy/stats/tests/test_qmc.py::TestUtils
(no name '/Users/tupui/Documents/Code/scipy/scipy/stats/tests/test_qmc.py::TestUtils' in any of [<Module test_qmc.py>])


scipy/stats/tests/test_qmc.py:None (scipy/stats/tests/test_qmc.py)
import file mismatch:
imported module 'scipy.stats.tests.test_qmc' has this __file__ attribute:
  /Users/tupui/Documents/Code/scipy/.mesonpy/editable/install/opt/homebrew/Caskroom/mambaforge/base/envs/scipy-dev/lib/python3.10/site-packages/scipy/stats/tests/test_qmc.py
which is not the same as the test file we want to collect:
  /Users/tupui/Documents/Code/scipy/scipy/stats/tests/test_qmc.py

@rgommers
Copy link
Contributor

rgommers commented Jan 6, 2023

Ah, that's a problem. I'm not sure there's a way around that import file mismatch thing. It seems necessary to either teach the IDE to run the correct file (e.g., by using pytest --pyargs ... always) or to teach pytest about this somehow.

@FFY00
Copy link
Member Author

FFY00 commented Jan 6, 2023

The only reasonable solution I can think of right now is to skip the meson install step, and do the install ourselves, using symlinks, as @rgommers proposed above.

@dnicolodi
Copy link
Member

dnicolodi commented Jan 10, 2023

Would implementing a custom importlib importer looking up modules and extensions in the source directory and in the build directory via the meson install path data be a viable alternative solution?

I haven't played with this aspect of Python before, but if the import mechanism can be made to load code from a zip file, it can be probably be taught to load code from arbitrary paths. What would need to be done is to use the install path data provided by meson to construct a "virtual" module hierarchy of the package content and traverse it to find the file to load for a given module name. I'll try to have a look to whether this is a viable solution later today.

Update: re-reading the comments in #87, I think the approach taken by @FFY00 initially was similar to what is proposed here. IIUC implementing support for package resources (in the importlib.resources sense) made the implementation complex. However, I think the complexity of implementing this is likely lower than for the other solutions proposed to have the code effectively loaded from the source files in the source directory. I'm still motivated to give this (another) try.

@FFY00
Copy link
Member Author

FFY00 commented Jan 10, 2023

Yes, I believe that it would, but we would only know for sure until we try it. But for now, what I mentioned above should be enough, if having a symlink is enough to trick IDEs into working properly.

The plan was always to eventually shift into a custom import loader approach.

@lagru
Copy link

lagru commented Feb 21, 2023

Just a note: on v0.13.0rc0 I replaced the site-packages folder in build-install with a symlink to my repository root which contains the skimage package. This workaround effectively reproduces the old behavior of editable installs perfectly (for now). I'm using meson-python through devpy build (installed commit).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

6 participants