
Add a cibuildwheel workflow #6188

Merged (1 commit) on Nov 28, 2021

Conversation

@AllSeeingEyeTolledEweSew (Contributor) commented May 7, 2021

This adds a cibuildwheel workflow.

Features

  • Builds and tests wheels in 28 flavors
  • Can be run manually, with an option to upload to pypi
  • Runs automatically when a release is published, and unconditionally uploads to pypi
  • If a PR touches cibuildwheel infrastructure (cibuildwheel.yml or tools/cibuildwheel/**), we will test by building a representative sample of wheels (the full suite of wheels takes hours)
  • Caches build artifacts, so re-runs are fast
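A rough sketch of what those triggers might look like in the workflow file (the dispatch input name is hypothetical; the paths match the infrastructure files named above):

```yaml
# Illustrative sketch only; the actual workflow may differ.
on:
  workflow_dispatch:
    inputs:
      publish:                # hypothetical input name
        description: Upload the wheels to pypi
        default: "false"
  release:
    types: [published]        # unconditional upload on release
  pull_request:
    paths:                    # only PRs touching cibuildwheel infrastructure
      - .github/workflows/cibuildwheel.yml
      - tools/cibuildwheel/**
      - pyproject.toml
```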

Changes

  • The python project name is changed from python-libtorrent to just libtorrent. This is closer to common practice in python, and the name was available on pypi
  • Other changes to setup.py follow the principle that --config-mode=distutils should create a wheel-suitable build by default

Tests

I patched this (and other dependent build system changes) onto the v1.2.14 and v2.0.4 tags, and uploaded the resulting wheels to test-pypi.

Release procedures and notes

  • @arvidn should create a pypi API token restricted to the "libtorrent" project, and add it as a github actions secret (PYPI_API_TOKEN)
  • EDIT: also create an API token on test-pypi restricted to the separate "libtorrent" project there, and add it as a github actions secret TEST_PYPI_API_TOKEN
  • I already registered the project on pypi (EDIT: and test-pypi), and invited @arvidn as an owner

Proposed release procedure:

  • before you create a release, manually run the cibuildwheel workflow without publishing to pypi, to make sure everything builds
  • If everything builds, run the workflow again and publish to pypi. Or just create the release, and the workflow will publish automatically.
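With the GitHub CLI, that procedure might look roughly like this (the "publish" input name is hypothetical and depends on how the workflow declares its inputs):

```shell
# Dry run: build all flavors without publishing
gh workflow run cibuildwheel.yml

# If everything builds, run again with publishing enabled
gh workflow run cibuildwheel.yml -f publish=true
```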

Release procedure caveats

NOTE: pypi uploads are immutable, and filenames cannot be reused even after files are deleted. Publish with care!!

  • I already uploaded some files for libtorrent 2.0.4, then deleted them, not realizing those filenames can never be used again
    • If desired, this could be remedied by publishing a 2.0.4.1 release
  • It's good practice to upload to https://test.pypi.org/ before uploading the "production" index at https://pypi.org/
  • But "libtorrent" at test-pypi is held by someone else, I believe by github user @mayli
  • I have contacted @mayli by email to transfer ownership of the test-pypi project EDIT: mayli transferred ownership of the project, thanks!

Build Flavors

  • cp37-cp37m-macosx_10_9_x86_64
  • cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64
  • cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64
  • cp37-cp37m-musllinux_1_1_aarch64
  • cp37-cp37m-musllinux_1_1_x86_64
  • cp37-cp37m-win32
  • cp37-cp37m-win_amd64
  • cp38-cp38-macosx_10_9_x86_64
  • cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64
  • cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64
  • cp38-cp38-musllinux_1_1_aarch64
  • cp38-cp38-musllinux_1_1_x86_64
  • cp38-cp38-win32
  • cp38-cp38-win_amd64
  • cp39-cp39-macosx_10_9_x86_64
  • cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64
  • cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64
  • cp39-cp39-musllinux_1_1_aarch64
  • cp39-cp39-musllinux_1_1_x86_64
  • cp39-cp39-win32
  • cp39-cp39-win_amd64
  • cp310-cp310-macosx_10_9_x86_64
  • cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64
  • cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64
  • cp310-cp310-musllinux_1_1_aarch64
  • cp310-cp310-musllinux_1_1_x86_64
  • cp310-cp310-win32
  • cp310-cp310-win_amd64
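Each identifier above is a cibuildwheel build identifier of the form &lt;python tag&gt;-&lt;abi tag&gt;-&lt;platform tag&gt;. A small sketch of splitting one apart (parse_flavor is just an illustrative helper, not part of this PR):

```python
def parse_flavor(identifier: str) -> dict:
    """Split a cibuildwheel build identifier into its three tags."""
    python_tag, abi_tag, platform_tag = identifier.split("-", 2)
    return {"python": python_tag, "abi": abi_tag, "platform": platform_tag}

print(parse_flavor("cp310-cp310-musllinux_1_1_aarch64"))
# {'python': 'cp310', 'abi': 'cp310', 'platform': 'musllinux_1_1_aarch64'}
```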

Missing flavors

Flavors we'd like to build but can't

  • macos arm64
    • the x86 macos runners can cross-compile, but I can't seem to make it work. We are invoking b2 architecture=arm address-model=64 which should be right, but there are still link errors about mixing x86_64 objects.
    • macos on x86_64 cannot run or emulate arm64 code, so these builds can't be tested until there are m1 mac runners
  • macos universal2
    • b2 architecture=combined address-model=64 does not build this (should pick -arch x86_64 -arch arm64), even in boost 1.77.0. I tried cxxflags="-arch x86_64 -arch arm64", but the compiler says these flags are unused.
    • as above, these can't be tested until there are m1 mac runners. apparently mac on m1 can emulate x86_64, so universal binaries can be tested on a single runner, just not the runners we have
  • musllinux
    • this is new, and cibuildwheel doesn't support it yet EDIT: the latest cibuildwheel supports this!
  • pypy
    • the python bindings don't build with this. I'll investigate

Other flavors

The build times already take hours, and the artifacts are large due to static linking, and pypi has finite hosting space. We should add flavors as requested, rather than building everything.

Python 3.6 will reach end of life in 2021-12, so I've preemptively removed it from the list to shorten build times. numpy has already done this in its most recent release.

Future optimizations

The builds currently take hours on aarch64 and other emulated architectures. It would be nice to reduce this.

abi3

The biggest win would be to use abi3 which would let us produce one build per platform rather than building for every python version. However

Prebuild dependencies

  • openssl and boost-python (in all python versions) are built from source every time. We could cache them somewhere. On linux, using actions/cache is difficult because the builds run in docker. The right approach is probably to build a custom docker image with prebuilt dependencies. This would especially cut down build times for aarch64 and other emulated architectures.
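Such an image could be sketched along these lines (base image, openssl version, and paths are illustrative only):

```dockerfile
# Hypothetical custom build image with openssl prebuilt,
# derived from the stock manylinux image cibuildwheel uses
FROM quay.io/pypa/manylinux2014_aarch64

# Build and install openssl once at image-build time
RUN curl -LO https://www.openssl.org/source/openssl-1.1.1l.tar.gz \
    && tar xf openssl-1.1.1l.tar.gz \
    && cd openssl-1.1.1l \
    && ./config --prefix=/usr/local/ssl \
    && make -j"$(nproc)" \
    && make install_sw
```

cibuildwheel could then be pointed at the custom image via its manylinux-aarch64-image option.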

macos ccache

We build artifacts for a given platform sequentially in a single job, which lets us use ccache. On the Linux builds this works well today.

On mac, ccache reports 0% hit rates between builds for some reason. The only thing that changes is the python version. We should investigate this.

windows ccache equivalent

On Windows, there is no build caching. It would be nice to have. However the Windows builds already complete in ~40 minutes, so this is a smaller gain compared to others.

@AllSeeingEyeTolledEweSew force-pushed the asetes-cibw branch 24 times, most recently from 387dd0f to 09d7eb9 on May 8, 2021 08:35
@AllSeeingEyeTolledEweSew (Contributor, Author)

@arvidn could you take a look at the linux and mac workflows? They have compile errors but I don't understand why.

@arvidn (Owner) commented May 8, 2021

    ../../src/disk_io_job.cpp:111:8: error: 'visit<libtorrent::aux::(anonymous namespace)::caller_visitor, std::__1::variant<...> &>' is unavailable: introduced in macOS 10.14

Apparently std::visit over std::variant isn't part of the runtime on macOS until 10.14, so there needs to be a requirement for at least that version. It's something I need to fix in the Jamfile (and cmake file) to specify the oldest version to support.

I think the Jamfile and CMakeLists.txt both need to specify MACOSX_DEPLOYMENT_TARGET=10.14.

@arvidn (Owner) commented May 8, 2021

I'm not sure why this shows up in this PR though.

@AllSeeingEyeTolledEweSew (Contributor, Author)

> I'm not sure why this shows up in this PR though.

I think I needed to configure cibuildwheel for this, but didn't. https://cibuildwheel.readthedocs.io/en/stable/cpp_standards/#macos-and-deployment-target-versions

Is something similar happening with the Linux error? I'll try to configure a newer manylinux image and see if it gets fixed.
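Per the cibuildwheel docs linked above, the deployment target is pinned through the build environment; in pyproject.toml that would presumably look like:

```toml
[tool.cibuildwheel.macos]
# std::visit over std::variant needs the macOS 10.14 runtime
environment = { MACOSX_DEPLOYMENT_TARGET = "10.14" }
```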

@AllSeeingEyeTolledEweSew force-pushed the asetes-cibw branch 2 times, most recently from ad39f21 to c7b8d23 on May 9, 2021 19:58
@arvidn (Owner) commented Nov 23, 2021

> That's orders of magnitude longer than it should be. What's ./test/rootfs? Is your checkout clean?

No, I have lots of stuff in my repo, and I would like to keep it. I take it this will pull in everything in the libtorrent directory then, whether it's needed or not.

@AllSeeingEyeTolledEweSew (Contributor, Author)

> No, I have lots of stuff in my repo, and I would like to keep it. I take it this will pull in everything in the libtorrent directory then, whether it's needed or not.

Yes. On Linux it creates a docker container, copies the whole directory, and builds in isolation. I don't think cibuildwheel has options to limit what gets copied.

@cas-- (Contributor) commented Nov 23, 2021

With Docker, you would normally use a .dockerignore file; I wonder if that would work here?

@AllSeeingEyeTolledEweSew (Contributor, Author)

> With Docker, you would normally use a .dockerignore file; I wonder if that would work here?

I believe .dockerignore only applies to a COPY in a Dockerfile; this copy is done by cibuildwheel itself, into an already-running container.

The copy takes less than a second on a fresh checkout (see the PR check), and we don't expect cibuildwheel to be run locally except when debugging Linux builds, so I don't think we need to optimize copying, or do work to defend against interference from local build artifacts.

@AllSeeingEyeTolledEweSew (Contributor, Author)

I changed the implementation from the run_cibuildwheel.py wrapper script to just using declarative config in pyproject.toml. Their config system is now robust enough to support everything we were doing before, and I think it's slightly easier to read.

Full workflow (all build flavors) here

    paths:
    - .github/workflows/cibuildwheel.yml
    - tools/cibuildwheel/**
    - pyproject.toml
@arvidn (Owner):

this is assuming that anything that would break cibuildwheel would also break the existing builds of the python bindings, right?
Isn't this the only place where the binding is built in a manylinux environment, though?

@AllSeeingEyeTolledEweSew (Contributor, Author):

Changes to the referenced files could break cibuildwheel without breaking builds of the python bindings.

cibuildwheel is the easiest (or only) way to use manylinux/musllinux environments, yes.

Not sure if this answers what you're asking, but I expect cibuildwheel to be redundant with the other python build workflows, except when we want the specialized environments for building release wheels.

@AllSeeingEyeTolledEweSew (Contributor, Author):

In theory we could remove the existing python bindings workflows and just use cibuildwheel with limited flavor selection instead for all PRs, but I think there are a few disadvantages:

  • The cibuildwheel workflow builds openssl from source on Linux, which takes longer
  • The cibuildwheel workflow wouldn't test against distro versions of openssl, which can be nice

@arvidn (Owner) commented Nov 24, 2021

I checked out your branch in a clean repo, but without run_cibuildwheel.py, how do I build this locally now? It seems the commands are now hidden inside github actions

    }

    function do_standard_install {
@arvidn (Owner):

I don't see this function being used anywhere

@AllSeeingEyeTolledEweSew (Contributor, Author):

These scripts are vendored from https://github.com/pypa/manylinux/tree/main/docker/build_scripts. I took them as a semi-official reference for how to build up-to-date openssl. (manylinux's build scripts build openssl as a dependency for other build tools, then delete openssl, because they don't want to be on the hook to deploy updates for heartbleed 2.0).

I thought it would be nice to add a cron workflow to automatically pull these scripts from their repo, but I wanted to keep this PR simple

@AllSeeingEyeTolledEweSew (Contributor, Author)

> I checked out your branch in a clean repo, but without run_cibuildwheel.py, how do I build this locally now? It seems the commands are now hidden inside github actions

Now it's just python -m cibuildwheel.
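For a quick local sanity check, that might look like this (requires Docker for the Linux containers; restricting the flavor is just to keep the run short):

```shell
pip install cibuildwheel

# Build a single flavor; identifiers match the "Build Flavors" list above
CIBW_BUILD="cp39-manylinux_x86_64" python -m cibuildwheel --platform linux
```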

I used the pypa/cibuildwheel action in the workflow definition because

@arvidn (Owner) commented Nov 26, 2021

I get this build failure. I'll investigate:

  src/module.cpp: In function ‘void init_module_libtorrent()’:
  src/module.cpp:34:5: error: ‘Py_Initialize’ was not declared in this scope
       Py_Initialize();
       ^~~~~~~~~~~~~
  src/module.cpp:34:5: note: suggested alternative: ‘PyPy_IsInitialized’
       Py_Initialize();
       ^~~~~~~~~~~~~
       PyPy_IsInitialized

      "g++"   -fPIC -O3 -finline-functions -Wno-inline -Wall -Wno-deprecated-declarations -fvisibility-inlines-hidden -fPIC -fvisibility=hidden -DBOOST_ALL_NO_LIB -DBOOST_ASIO_ENABLE_CANCELIO -DBOOST_ASIO_HAS_STD_CHRONO -DBOOST_MULTI_INDEX_DISABLE_SERIALIZATION -DBOOST_PYTHON_STATIC_LIB -DBOOST_SYSTEM_NO_DEPRECATED -DBOOST_SYSTEM_STATIC_LINK=1 -DNDEBUG -DOPENSSL_NO_SSL2 -DTORRENT_USE_I2P=1 -DTORRENT_USE_LIBCRYPTO -DTORRENT_USE_OPENSSL -D_FILE_OFFSET_BITS=64 -D_WIN32_WINNT=0x0600  -I"../../include" -I"../../include/libtorrent" -I"/opt/python/pp37-pypy37_pp73/include" -I"/tmp/boost" -I"/usr/sfw/include" -I"src"  -c -o "bin/gcc-8/release/address-model-64/crypto-openssl/libtorrent-python-pic-on/lt-visibility-hidden/python-3.7/src/module.o" "src/module.cpp"

@arvidn (Owner) commented Nov 26, 2021

interestingly, with this patch it builds:

diff --git a/bindings/python/src/boost_python.hpp b/bindings/python/src/boost_python.hpp
index ed636e4b2..2abda563f 100644
--- a/bindings/python/src/boost_python.hpp
+++ b/bindings/python/src/boost_python.hpp
@@ -7,6 +7,7 @@
 
 #include <cstdio>
 #include <libtorrent/aux_/disable_warnings_push.hpp>
+#include <Python.h>
 #include <boost/python.hpp>
 
 #include <boost/bind/placeholders.hpp>
diff --git a/bindings/python/src/module.cpp b/bindings/python/src/module.cpp
index 305f4872f..89837aaa3 100644
--- a/bindings/python/src/module.cpp
+++ b/bindings/python/src/module.cpp
@@ -7,6 +7,7 @@
 #endif
 
 #include <boost/python/module.hpp>
+#include "boost_python.hpp"
 #include "libtorrent/config.hpp"
 
 void bind_utility();

but cibuildwheel still fails with:

Successfully installed libtorrent-1.2.15
    + sh -c 'cd /project/bindings/python && python test.py'
Traceback (most recent call last):
  File "test.py", line 5, in <module>
    import libtorrent as lt
ImportError: libtorrent-rasterbar.so.10.0.0: cannot open shared object file: No such file or directory

                                                              ✕ 5.83s
Error: Command ['sh', '-c', 'cd /project/bindings/python && python test.py'] failed with code 1. 

@AllSeeingEyeTolledEweSew (Contributor, Author)

That PyPy_IsInitialized indicates it's trying to build wheels for pypy. I had disabled that in the run_cibuildwheel.py wrapper but apparently forgot to port that config to pyproject.toml. I've updated that now.

The workflows specifically select cpython currently, so it didn't get caught there.

It might be easy to get pypy to build and work, but I didn't want to do it in this PR.
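For reference, the pyproject.toml setting that skips pypy would presumably be something like:

```toml
[tool.cibuildwheel]
# skip all pypy build identifiers (pp*); the bindings don't build there yet
skip = "pp*"
```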

@arvidn (Owner) commented Nov 26, 2021

ok, that seems to fix the build failure. I'm still seeing this though:

Processing /tmp/cibuildwheel/repaired_wheel/libtorrent-1.2.15-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Installing collected packages: libtorrent
Successfully installed libtorrent-1.2.15
    + sh -c 'cd /project/bindings/python && python test.py'
Traceback (most recent call last):
  File "test.py", line 5, in <module>
    import libtorrent as lt
ImportError: libtorrent-rasterbar.so.10.0.0: cannot open shared object file: No such file or directory

                                                              ✕ 5.89s
Error: Command ['sh', '-c', 'cd /project/bindings/python && python test.py'] failed with code 1. 

@arvidn (Owner) commented Nov 27, 2021

I found that you can see the build log with:

    CIBW_BUILD_VERBOSITY="3" python -m cibuildwheel --platform linux

Which reveals that the main libtorrent library is not built, just the python bindings. So it's really no surprise that the test fails saying the library isn't found.

I suppose it's reasonable to just build the binding, but I don't understand why pyproject.toml then also tries to run the tests. That won't ever work.

Am I misunderstanding something? Is it supposed to also build the main library?

@arvidn (Owner) commented Nov 27, 2021

I see, it's linking the python binding statically against both libtorrent and boost:

    b2 boost-link=static libtorrent-link=static crypto=openssl deprecated-functions=on variant=release address-model=64 python=3.7

clearly the python module that's actually loaded by the test wasn't linked statically

@arvidn (Owner) commented Nov 27, 2021

is the virtual machine torn down and deleted every time? I'm interested in being able to look around where files end up and analyze the build output. Is there a way to do that?

@arvidn added this to the 1.2.15 milestone on Nov 27, 2021
@cas-- (Contributor) commented Nov 27, 2021

@arvidn You can use the upload-artifact action to store output from a job and download it:

https://docs.github.com/en/actions/advanced-guides/storing-workflow-data-as-artifacts#downloading-or-deleting-artifacts

@arvidn (Owner) commented Nov 27, 2021

@cas-- as far as I understand, it's working in CI. I'm interested in being able to reproduce the build locally.

@AllSeeingEyeTolledEweSew (Contributor, Author)

I can't think of why the linking wouldn't work. I suspect the test is picking up the wrong build artifact again, somehow.

You could use CIBW_TEST_SKIP="*", which will mean the (broken) wheel will get copied out of the container, and can be inspected. If that looks intact (no missing link to libtorrent-rasterbar.so), I'd be more confident a build artifact is getting picked up.

Unfortunately it looks like cibuildwheel unconditionally removes the container, so you can't inspect intermediary files used in building.

You could start the docker container manually, and run each step manually in the container.

I'm not sure why I haven't been able to reproduce this failure locally. It should be exactly the same.
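A manual session along those lines might look like this (the image name is illustrative; pick whichever manylinux image the failing build used):

```shell
# cibuildwheel removes its container after the run, so start one by hand
docker run -it --rm -v "$PWD":/project quay.io/pypa/manylinux2010_x86_64 bash

# ...then, inside the container, re-run the build steps manually with the
# matching interpreter, e.g. /opt/python/cp37-cp37m/bin/python
```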

@arvidn merged commit 26f398c into arvidn:RC_1_2 on Nov 28, 2021
@AllSeeingEyeTolledEweSew (Contributor, Author)

🎉

@AllSeeingEyeTolledEweSew (Contributor, Author)

FYI: I believe manually-triggered workflows must exist in the default branch (RC_2_0) before they can be run against any branch, so you may not see an option to run this until it's merged through

@AllSeeingEyeTolledEweSew deleted the asetes-cibw branch on December 7, 2021 17:24
@jkterry1 commented Feb 5, 2022

@AllSeeingEyeTolledEweSew I'm finally trying to fully put this into production for the project I was telling you about, and my collaborators and I have a few questions for you. Could you please email me at [email protected]?
