GH-43951: [CI][Python] Use GitHub Packages for vcpkg cache #44644

Merged
merged 33 commits into from
Nov 15, 2024
43 changes: 27 additions & 16 deletions ci/docker/python-wheel-manylinux.dockerfile
@@ -69,36 +69,47 @@ RUN /arrow/ci/scripts/install_ccache.sh ${ccache} /usr/local
ARG vcpkg
COPY ci/vcpkg/*.patch \
ci/vcpkg/*linux*.cmake \
ci/vcpkg/vcpkg.json \
arrow/ci/vcpkg/
COPY ci/scripts/install_vcpkg.sh \
arrow/ci/scripts/
ENV VCPKG_ROOT=/opt/vcpkg
ARG build_type=release
ENV CMAKE_BUILD_TYPE=${build_type} \
VCPKG_FORCE_SYSTEM_BINARIES=1 \
VCPKG_OVERLAY_TRIPLETS=/arrow/ci/vcpkg \
PATH="${PATH}:${VCPKG_ROOT}" \
VCPKG_DEFAULT_TRIPLET=${arch_short}-linux-static-${build_type} \
VCPKG_FEATURE_FLAGS="manifests"

RUN arrow/ci/scripts/install_vcpkg.sh ${VCPKG_ROOT} ${vcpkg}
ENV PATH="${PATH}:${VCPKG_ROOT}"

COPY ci/vcpkg/vcpkg.json arrow/ci/vcpkg/
# cannot use the S3 feature here because while aws-sdk-cpp=1.9.160 contains
# ssl related fixes as well as we can patch the vcpkg portfile to support
# arm machines it hits ARROW-15141 where we would need to fall back to 1.8.186
# but we cannot patch those portfiles since vcpkg-tool handles the checkout of
# previous versions => use bundled S3 build
RUN vcpkg install \
VCPKG_FEATURE_FLAGS="manifests" \
VCPKG_FORCE_SYSTEM_BINARIES=1 \
VCPKG_OVERLAY_TRIPLETS=/arrow/ci/vcpkg
# For --mount=type=secret: The GITHUB_TOKEN is the only real secret but we use
# --mount=type=secret for GITHUB_REPOSITORY_OWNER and
# VCPKG_BINARY_SOURCES too because we don't want to store them
# into the built image in order to easily reuse the built image cache.
#
# For vcpkg install: cannot use the S3 feature here because while
# aws-sdk-cpp=1.9.160 contains ssl related fixes as well as we can
# patch the vcpkg portfile to support arm machines it hits ARROW-15141
# where we would need to fall back to 1.8.186 but we cannot patch
# those portfiles since vcpkg-tool handles the checkout of previous
# versions => use bundled S3 build
RUN --mount=type=secret,id=github_repository_owner \
--mount=type=secret,id=github_token \
--mount=type=secret,id=vcpkg_binary_sources \
export GITHUB_REPOSITORY_OWNER=$(cat /run/secrets/github_repository_owner); \
export GITHUB_TOKEN=$(cat /run/secrets/github_token); \
export VCPKG_BINARY_SOURCES=$(cat /run/secrets/vcpkg_binary_sources); \
arrow/ci/scripts/install_vcpkg.sh ${VCPKG_ROOT} ${vcpkg} && \
vcpkg install \
--clean-after-build \
--x-install-root=${VCPKG_ROOT}/installed \
--x-manifest-root=/arrow/ci/vcpkg \
--x-feature=azure \
--x-feature=azure \
--x-feature=flight \
--x-feature=gcs \
--x-feature=json \
--x-feature=parquet \
--x-feature=s3
--x-feature=s3 && \
rm -rf ~/.config/NuGet/

# Make sure auditwheel is up-to-date
RUN pipx upgrade auditwheel
33 changes: 31 additions & 2 deletions ci/scripts/install_vcpkg.sh
@@ -17,7 +17,7 @@
# specific language governing permissions and limitations
# under the License.

set -e
set -eu

if [ "$#" -lt 1 ]; then
echo "Usage: $0 <target-directory> [<vcpkg-version> [<vcpkg-ports-patch>]]"
@@ -42,7 +42,7 @@ pushd ${vcpkg_destination}

git checkout "${vcpkg_version}"

if [[ "$OSTYPE" == "msys" ]]; then
if [[ "${OSTYPE:-}" == "msys" ]]; then
./bootstrap-vcpkg.bat -disableMetrics
else
./bootstrap-vcpkg.sh -disableMetrics
@@ -53,4 +53,33 @@ if [ -f "${vcpkg_ports_patch}" ]; then
echo "Patch successfully applied to the VCPKG port files!"
fi

if [ -n "${GITHUB_TOKEN:-}" ] && \
[ -n "${GITHUB_REPOSITORY_OWNER:-}" ] && \
[ "${VCPKG_BINARY_SOURCES:-}" = "clear;nuget,GitHub,readwrite" ] ; then
Member

In which case would VCPKG_BINARY_SOURCES differ from clear;nuget,GitHub,readwrite? On manylinux2014_aarch64 it would be absent rather than set to a different value, right?

Member Author

If VCPKG_BINARY_SOURCES is anything other than clear;nuget,GitHub,readwrite (including the case where it is unset, as on manylinux2014_aarch64), we don't configure NuGet automatically.

We skip other VCPKG_BINARY_SOURCES values so that a different cache can be used. For example, we may want to use a local file cache for a local archery docker run. That is out of scope for this PR and is not tested. If we need it, we can work on it as a separate task.
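
The behavior described in this reply can be sketched in isolation. This is a minimal sketch, not code from the PR: the helper names should_configure_nuget and describe_case are hypothetical, and only the three-part condition is copied from the guard in install_vcpkg.sh.

```shell
# Hypothetical helper mirroring the guard in install_vcpkg.sh.
# Succeeds only when the token and owner are non-empty and
# VCPKG_BINARY_SOURCES has exactly the value the workflows export.
should_configure_nuget() {
  [ -n "${GITHUB_TOKEN:-}" ] &&
    [ -n "${GITHUB_REPOSITORY_OWNER:-}" ] &&
    [ "${VCPKG_BINARY_SOURCES:-}" = "clear;nuget,GitHub,readwrite" ]
}

# Hypothetical wrapper that reports the decision as text.
describe_case() {
  if should_configure_nuget; then
    echo "configure"
  else
    echo "skip"
  fi
}
```

With all three variables unset (the manylinux2014_aarch64 case) or with VCPKG_BINARY_SOURCES set to some other source such as a files cache, describe_case reports a skip; NuGet is configured only for the exact string above.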

if type dnf 2>/dev/null; then
dnf install -y epel-release
dnf install -y mono-complete
curl \
--location \
--output "${vcpkg_destination}/nuget" \
https://dist.nuget.org/win-x86-commandline/latest/nuget.exe
fi
PATH="${vcpkg_destination}:${PATH}"
nuget_url="https://nuget.pkg.github.com/${GITHUB_REPOSITORY_OWNER}/index.json"
nuget="$(vcpkg fetch nuget | tail -n 1)"
if type mono 2>/dev/null; then
nuget="mono ${nuget}"
fi
${nuget} \
sources add \
-source "${nuget_url}" \
-storepasswordincleartext \
-name "GitHub" \
-username "${GITHUB_REPOSITORY_OWNER}" \
-password "${GITHUB_TOKEN}"
${nuget} \
setapikey "${GITHUB_TOKEN}" \
-source "${nuget_url}"
fi

popd
10 changes: 10 additions & 0 deletions dev/tasks/java-jars/github.yml
@@ -19,6 +19,9 @@

{{ macros.github_header() }}

permissions:
packages: write

jobs:

build-cpp-ubuntu:
@@ -51,7 +54,14 @@ jobs:
- name: Build C++ libraries
env:
{{ macros.github_set_sccache_envvars()|indent(8) }}
GITHUB_TOKEN: {{ '${{ secrets.GITHUB_TOKEN }}' }}
run: |
if [ "${ARCH}" = "arm64v8" ]; then
# We can't use NuGet on manylinux2014_aarch64 because Mono is old.
:
else
export VCPKG_BINARY_SOURCES="clear;nuget,GitHub,readwrite"
fi
archery docker run \
-e ARROW_JAVA_BUILD=OFF \
-e ARROW_JAVA_TEST=OFF \
16 changes: 15 additions & 1 deletion dev/tasks/python-wheels/github.linux.yml
@@ -19,6 +19,9 @@

{{ macros.github_header() }}

permissions:
packages: write

jobs:
build:
name: "Build wheel for manylinux {{ manylinux_version }}"
@@ -49,7 +52,18 @@ jobs:

- name: Build wheel
shell: bash
run: archery docker run -e SETUPTOOLS_SCM_PRETEND_VERSION={{ arrow.no_rc_version }} python-wheel-manylinux-{{ manylinux_version }}
env:
GITHUB_TOKEN: {{ '${{ secrets.GITHUB_TOKEN }}' }}
run: |
if [ "{{ manylinux_version }}" = "2014" ] && [ "{{ arch }}" = "arm64" ]; then
# We can't use NuGet on manylinux2014_aarch64 because Mono is old.
:
else
export VCPKG_BINARY_SOURCES="clear;nuget,GitHub,readwrite"
fi
archery docker run \
-e SETUPTOOLS_SCM_PRETEND_VERSION={{ arrow.no_rc_version }} \
python-wheel-manylinux-{{ manylinux_version }}

- uses: actions/upload-artifact@v4
with:
21 changes: 5 additions & 16 deletions dev/tasks/python-wheels/github.osx.yml
@@ -37,6 +37,9 @@
VCPKG_OVERLAY_TRIPLETS: {{ "${{ github.workspace }}/arrow/ci/vcpkg" }}
VCPKG_ROOT: {{ "${{ github.workspace }}/vcpkg" }}

permissions:
packages: write

jobs:
build:
name: Build wheel for Python {{ python_version }} on macOS
@@ -69,27 +72,13 @@ jobs:
echo "VCPKG_VERSION=$vcpkg_version" >> $GITHUB_ENV

- name: Install Vcpkg
env:
GITHUB_TOKEN: {{ '${{ secrets.GITHUB_TOKEN }}' }}
run: arrow/ci/scripts/install_vcpkg.sh $VCPKG_ROOT $VCPKG_VERSION

- name: Add Vcpkg to PATH
run: echo ${VCPKG_ROOT} >> $GITHUB_PATH

- name: Setup NuGet Credentials
env:
GITHUB_TOKEN: {{ '${{ secrets.GITHUB_TOKEN }}' }}
run: |
mono $(vcpkg fetch nuget | tail -n 1) \
sources add \
-source "https://nuget.pkg.github.com/$GITHUB_REPOSITORY_OWNER/index.json" \
-storepasswordincleartext \
-name "GitHub" \
-username "$GITHUB_REPOSITORY_OWNER" \
-password "$GITHUB_TOKEN" \

mono $(vcpkg fetch nuget | tail -n 1) \
setapikey "$GITHUB_TOKEN" \
-source "https://nuget.pkg.github.com/$GITHUB_REPOSITORY_OWNER/index.json"

- name: Install Packages
run: |
vcpkg install \
4 changes: 3 additions & 1 deletion dev/tasks/python-wheels/github.windows.yml
@@ -33,7 +33,9 @@ jobs:
# note that we don't run docker build since there wouldn't be a cache hit
# and rebuilding the dependencies takes a fair amount of time
REPO: ghcr.io/ursacomputing/arrow
# BuildKit isn't really supported on Windows for now
# BuildKit isn't really supported on Windows for now.
# NuGet + GitHub Packages based vcpkg cache is also disabled for now.
# Because secret mount requires BuildKit.
DOCKER_BUILDKIT: 0

steps:
37 changes: 26 additions & 11 deletions docker-compose.yml
@@ -53,26 +53,31 @@
#
# See more in cpp/build-support/run-test.sh::print_coredumps

x-common: &common
GITHUB_ACTIONS:

x-ccache: &ccache
CCACHE_COMPILERCHECK: content
CCACHE_COMPRESS: 1
CCACHE_COMPRESSLEVEL: 6
CCACHE_MAXSIZE: 1G
CCACHE_DIR: /ccache

x-common: &common
GITHUB_ACTIONS:

x-cpp: &cpp
ARROW_RUNTIME_SIMD_LEVEL:
ARROW_SIMD_LEVEL:

x-sccache: &sccache
AWS_ACCESS_KEY_ID:
AWS_SECRET_ACCESS_KEY:
SCCACHE_BUCKET:
SCCACHE_REGION:
SCCACHE_S3_KEY_PREFIX: ${SCCACHE_S3_KEY_PREFIX:-sccache}

x-cpp: &cpp
ARROW_RUNTIME_SIMD_LEVEL:
ARROW_SIMD_LEVEL:
x-vcpkg-build-secrets: &vcpkg-build-secrets
- github_repository_owner
- github_token
- vcpkg_binary_sources

# CPU/memory limit presets to pass to Docker.
#
@@ -1123,14 +1128,15 @@ services:
arch: ${ARCH}
arch_short: ${ARCH_SHORT}
base: quay.io/pypa/manylinux2014_${ARCH_ALIAS}:2024-08-03-32dfa47
vcpkg: ${VCPKG}
manylinux: 2014
python: ${PYTHON}
python_abi_tag: ${PYTHON_ABI_TAG}
manylinux: 2014
vcpkg: ${VCPKG}
context: .
dockerfile: ci/docker/python-wheel-manylinux.dockerfile
cache_from:
- ${REPO}:${ARCH}-python-${PYTHON}-wheel-manylinux-2014-vcpkg-${VCPKG}
secrets: *vcpkg-build-secrets
environment:
<<: [*common, *ccache]
volumes:
@@ -1147,14 +1153,15 @@ services:
arch: ${ARCH}
arch_short: ${ARCH_SHORT}
base: quay.io/pypa/manylinux_2_28_${ARCH_ALIAS}:2024-08-03-32dfa47
vcpkg: ${VCPKG}
manylinux: 2_28
python: ${PYTHON}
python_abi_tag: ${PYTHON_ABI_TAG}
manylinux: 2_28
vcpkg: ${VCPKG}
context: .
dockerfile: ci/docker/python-wheel-manylinux.dockerfile
cache_from:
- ${REPO}:${ARCH}-python-${PYTHON}-wheel-manylinux-2-28-vcpkg-${VCPKG}
secrets: *vcpkg-build-secrets
environment:
<<: [*common, *ccache]
volumes:
@@ -1239,8 +1246,8 @@ services:
image: ${REPO}:python-${PYTHON}-wheel-windows-vs2019-vcpkg-${VCPKG}-${PYTHON_WHEEL_WINDOWS_IMAGE_REVISION}
build:
args:
vcpkg: ${VCPKG}
python: ${PYTHON}
vcpkg: ${VCPKG}
context: .
dockerfile: ci/docker/python-wheel-windows-vs2019.dockerfile
# This should make the pushed images reusable, but the image gets rebuilt.
@@ -2119,3 +2126,11 @@ services:
/bin/bash -c "
git config --global --add safe.directory /arrow &&
/arrow/dev/release/verify-release-candidate.sh $${VERIFY_VERSION} $${VERIFY_RC}"

secrets:
github_repository_owner:
environment: GITHUB_REPOSITORY_OWNER
github_token:
environment: GITHUB_TOKEN
vcpkg_binary_sources:
environment: VCPKG_BINARY_SOURCES
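
The secrets declared here reach the Dockerfile's RUN step as files that it reads back into environment variables with cat. A minimal sketch of that read-back follows, under stated assumptions: a temporary directory stands in for /run/secrets, the read_secret helper is hypothetical, and the file-existence check is an addition for illustration (the Dockerfile in this PR calls cat directly).

```shell
# Stand-in for /run/secrets; a real BuildKit build provides that path itself.
secrets_dir="$(mktemp -d)"
printf '%s' "clear;nuget,GitHub,readwrite" > "${secrets_dir}/vcpkg_binary_sources"

# Hypothetical helper: print a secret's content, or nothing if the file is absent.
read_secret() {
  if [ -f "${secrets_dir}/$1" ]; then
    cat "${secrets_dir}/$1"
  fi
}

# Same shape as the exports in the Dockerfile's RUN step.
VCPKG_BINARY_SOURCES="$(read_secret vcpkg_binary_sources)"
GITHUB_TOKEN="$(read_secret github_token)"
export VCPKG_BINARY_SOURCES GITHUB_TOKEN
```

Because the values are read at run time from mounted files instead of being passed as build arguments, they are not stored in the built image, which is the reason the Dockerfile comment gives for using secret mounts.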