-
Notifications
You must be signed in to change notification settings - Fork 811
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make torchtext available on conda-forge #2023
Comments
I fully agree! If any help is needed I'm here. Just a reminder that you only need to publish once, because the updates are automatically done. It's worth entering the conda-forge ecosystem and it has more than 20k packages |
I'm trying to get a recipe using https://www.marcelotrevisani.com/grayskull which I use to get every recipe and I don't see why it doesn't work for this package. |
@Nayef211 @parmeet @mthrok @osalpekar The Conda-forge community is trying to publish this package, could any of you help us figure this out? |
@ezyang @malfet @zou3519 @jerryzh168 @peterbell10 Sorry for mentioning you guys but you seem to be the main maintainers of PyTorch package, maybe any of you could give us some light on how to publish this on conda-forge. We are trying to publish We are trying to publish a package called |
@giswqs comment from autogluon/autogluon#612 (comment)
|
[edit] I see @giswqs already opened a PR that adds it to the forge |
That would be great! This is the main documentation about adding packages, but this is not very specific. https://conda-forge.org/docs/maintainer/adding_pkgs.html I bet @giswqs could enlighten us here, but if he can't, I will talk to one of the maintainers from conda-forge who are more familiar with the details of this kind of transition to make |
That pr is about trying to merge to conda forge, but still a long way to go. In the documentation I sent there is a section about feedstock structure and another about publishing, which should give you an idea. We need to build a recipe using requirements but at the moment we can't do that due to the current structure, which makes it harder. We are still trying figuring it out, but thanks for reaching out |
Don't we have to first change the build process so that it supports building on top of third party libraries from separate package? cc #1615 |
Yes, that's where we need help. Any guidance for changing the build process to support third party libraries?
|
Can someone share the recipe that was used to build torchtext on the pytorch channel on Anaconda? I guess it would be a similar recipe for building it on conda-forge. |
It's in packaging directory. https://github.com/pytorch/text/tree/main/packaging/torchtext |
Third party libraries are all configured here. I think introducing the switch to alternate the behavior between build-from-source and search-existing-installation, then actually implementing the search mechanism is one way to achieve that. |
That's the main issue for not being able to publish |
I am supportive of this direction but I don't really know about how conda recipe works, so I don't think I am good fit for recipe maintainer.
Those are files to setup environment in PyTorch CI, including downloading and installing Conda. And these scripts call the final conda build process. I imagine conda feed-stack CI infra already has equivalent setup mechanism. So that should not be a big issue. (but I can be wrong here)
Looking at PyTorch CMakeLists.txt, it can be as simple as defining the option https://github.com/pytorch/pytorch/blob/master/cmake/Dependencies.cmake#L136-L141 Couple of questions
|
Once the package is published, most of the updates are done automatically. Except when you add new packages or remove them, however, this can easily be done by creating a PR that changes the recipe. It literally takes 2 to 3 minutes and once published, others can (and will) help to fix it if it has any issues... The conda-forge community is very supportive and that's why it has more than 20.000 packages. It's easy to manage the updates because most of them are automatically |
Many times the structure of the package already published in pypi already allows conda-forge to be published in conda-forge channel by just using greyskull (package to create conda forge recipes). It seems that we have to manually do that with torchtext, but I'm not sure we even can... that's why we are asking for help here |
I don't have experience in this as I have not created a recipe with third party libraries before. Here is the conda-forge documentation regarding third party libraries.
Conda-forge can either pull the source from PyPI or GitHub. See the recipe here. For torchtext, I pulled the source code from the latest release |
I briefly looked into this, the general direction I think that can work is
For testing, one can simply install the dependencies from 1 via I do not have bandwidth to own this now. I think the ideal situation is that the logic for |
@h-vetinari do you think that would be helpful for conda-forge/staged-recipes#17129? mthrok is one of the maintainers of pytorch that is helping us to deal with this |
I already mentioned this in conda-forge/staged-recipes#21740:
So yes, the @mthrok's proposal would be very helpful, but currently I'm focusing on debugging the whole build end-to-end, and then we can think about upstreaming patches in a way that's both acceptable for the project here and helpful for our packaging. I'm pretty getting pretty far in conda-forge/staged-recipes#17129 now, currently running into a problem I commented about in #1886 (interestingly, GH does not create cross-references for links in a review, it seems...) |
|
If anyone is interested, I pushed a branch with the changes that were necessary to get this to work for conda-forge. There's several things around (not) building submodules, which are obviously not meant as patches that should be upstreamed directly (this is the part that @mthrok's proposal would cover much better by having some CMake switches like There's other adaptations to our infrastructure that are not necessarily suited to this repository (yet), but perhaps someone is interested in taking those and banging them into shape. Not sure I have much time to do it myself, but anyone's free to take my patches (and I've signed the contributor license agreement already for other meta projects, so that should be fine). |
Also worth noting, we don't have support on windows yet, because windows support for pytorch in conda-forge is still missing, primarily due to a combination of CI constraints & consequent maintenance constraints. |
@h-vetinari thanks for calling this out. Would you mind creating an issue to help us keep track of this? Also giving a very brief summary of the the things that were done differently in your branch (and their benefits) which could be adopted in the torchtext repo would be immensely helpful when we look into this in the near future! :) |
Note that we can upstream the C++ 17 patch immediately and we should be doing that before 2.0 release branch cut as PyTorch raised the requirement to C++17. See also: pytorch/audio#2973 |
Done #2031 |
🚀 Feature
Motivation
Although torchtext is available on the Anaconda pytorch channel, it is not yet available on the conda-forge channel. Downstream packages of pytorch can't be made available on conda-forge until torchtext has been published on conda-forge.
Pitch
Making torchtext available on conda-forge will benefit downstream packages and numerous conda-forge users.
Alternatives
I am not aware of any alternatives.
Additional context
Relevant issue: autogluon/autogluon#612
The text was updated successfully, but these errors were encountered: