This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[v1.8.x][BACKPORT] Stabilizing CI and making binaries Apache compliant #20015

Merged: 7 commits, Mar 14, 2021
2 changes: 1 addition & 1 deletion 3rdparty/mshadow/CMakeLists.txt
@@ -42,7 +42,7 @@ else()
target_compile_definitions(mshadow INTERFACE MSHADOW_USE_SSE=0)
endif()
if(USE_CUDNN)
target_compile_definitions(mshadow INTERFACE MSHADOW_USE_CUDNN)
target_compile_definitions(mshadow INTERFACE MSHADOW_USE_CUDNN=1)
endif()
if(MSHADOW_IN_CXX11)
target_compile_definitions(mshadow INTERFACE MSHADOW_IN_CXX11)
1 change: 1 addition & 0 deletions CMakeLists.txt
@@ -142,6 +142,7 @@ if(CMAKE_BUILD_TYPE STREQUAL "Distribution" AND UNIX AND NOT APPLE)
# Enforce DT_PATH instead of DT_RUNPATH
set(CMAKE_SHARED_LINKER_FLAGS "-Wl,--disable-new-dtags")
set(CMAKE_EXE_LINKER_FLAGS "-Wl,--disable-new-dtags")
set(Protobuf_USE_STATIC_LIBS ON)
endif()

set(CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/cmake/upstream;${CMAKE_CURRENT_SOURCE_DIR}/cmake/Modules;${CMAKE_MODULE_PATH}")
2 changes: 1 addition & 1 deletion cd/Jenkinsfile_cd_pipeline
@@ -36,7 +36,7 @@ pipeline {

parameters {
// Release parameters
string(defaultValue: "cpu,native,cu92,cu100,cu101,cu102,cu110", description: "Comma separated list of variants", name: "MXNET_VARIANTS")
string(defaultValue: "cpu,native,cu100,cu101,cu102,cu110,cu112", description: "Comma separated list of variants", name: "MXNET_VARIANTS")
booleanParam(defaultValue: false, description: 'Whether this is a release build or not', name: "RELEASE_BUILD")
}

2 changes: 1 addition & 1 deletion cd/Jenkinsfile_release_job
@@ -43,7 +43,7 @@ pipeline {
// any disruption caused by different COMMIT_ID values changing the job parameter configuration on
// Jenkins.
string(defaultValue: "mxnet_lib", description: "Pipeline to build", name: "RELEASE_JOB_TYPE")
string(defaultValue: "cpu,native,cu100,cu101,cu102,cu110", description: "Comma separated list of variants", name: "MXNET_VARIANTS")
string(defaultValue: "cpu,native,cu100,cu101,cu102,cu110,cu112", description: "Comma separated list of variants", name: "MXNET_VARIANTS")
booleanParam(defaultValue: false, description: 'Whether this is a release build or not', name: "RELEASE_BUILD")
string(defaultValue: "nightly_v1.x", description: "String used for naming docker images", name: "VERSION")
}
5 changes: 3 additions & 2 deletions cd/README.md
@@ -25,7 +25,7 @@ MXNet aims to support a variety of frontends, e.g. Python, Java, Perl, R, etc. a

The CD process is driven by the [CD pipeline job](Jenkinsfile_cd_pipeline), which orchestrates the order in which the artifacts are delivered. For instance, first publish the libmxnet library before publishing the pip package. It does this by triggering the [release job](Jenkinsfile_release_job) with a specific set of parameters for each delivery channel. The release job executes the specific release pipeline for a delivery channel across all MXNet *variants*.

A variant is a specific environment or features for which MXNet is compiled. For instance CPU, GPU with CUDA v10.0, CUDA v9.0 with MKL-DNN support, etc.
A variant is a specific environment or features for which MXNet is compiled. For instance CPU, GPU with CUDA v10.0, CUDA v11.0 with MKL-DNN support, etc.

Currently, below variants are supported. All of these variants except native have MKL-DNN backend enabled.

@@ -36,6 +36,7 @@ Currently, below variants are supported. All of these variants except native hav
* *cu101*: CUDA 10.1
* *cu102*: CUDA 10.2
* *cu110*: CUDA 11.0
* *cu112*: CUDA 11.2

*For more on variants, see [here](https://github.com/apache/incubator-mxnet/issues/8671)*

@@ -121,7 +122,7 @@ The "first mile" of the CD process is posting the mxnet binaries to the [artifac

##### Timeout

We shouldn't set global timeouts for the pipelines. Rather, the `step` being executed should be wrapped with a `timeout` function (as in the pipeline example above). The `max_time` is a global variable set at the [release job](Jenkinsfile_release_job) level.
We shouldn't set global timeouts for the pipelines. Rather, the `step` being executed should be wrapped with a `timeout` function (as in the pipeline example above). The `max_time` is a global variable set at the [release job](Jenkinsfile_release_job) level.

##### Node of execution

2 changes: 1 addition & 1 deletion cd/python/pypi/pypi_package.sh
@@ -18,7 +18,7 @@

set -ex

# variant = cpu, native, cu80, cu100, etc.
# variant = cpu, native, cu100, cu101, cu102, cu110, cu112 etc.
export mxnet_variant=${1:?"Please specify the mxnet variant"}

# Due to this PR: https://github.com/apache/incubator-mxnet/pull/14899
8 changes: 4 additions & 4 deletions cd/utils/artifact_repository.md
@@ -17,7 +17,7 @@

# Artifact Repository - Pushing and Pulling libmxnet

The artifact repository is an S3 bucket accessible only to restricted Jenkins nodes. It is used to store compiled MXNet artifacts that can be used by downstream CD pipelines to package the compiled libraries for different delivery channels (e.g. DockerHub, PyPI, Maven, etc.). The S3 object keys for the files being posted will be prefixed with the following distinguishing characteristics of the binary: branch, commit id, operating system, variant and dependency linking strategy (static or dynamic). For instance, s3://bucket/73b29fa90d3eac0b1fae403b7583fdd1529942dc/ubuntu16.04/cu92mkl/static/libmxnet.so
The artifact repository is an S3 bucket accessible only to restricted Jenkins nodes. It is used to store compiled MXNet artifacts that can be used by downstream CD pipelines to package the compiled libraries for different delivery channels (e.g. DockerHub, PyPI, Maven, etc.). The S3 object keys for the files being posted will be prefixed with the following distinguishing characteristics of the binary: branch, commit id, operating system, variant and dependency linking strategy (static or dynamic). For instance, s3://bucket/73b29fa90d3eac0b1fae403b7583fdd1529942dc/ubuntu18.04/cu100/static/libmxnet.so
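
The key layout described above can be sketched in Python. This is only an illustration, not the code in `artifact_repository.py`: the `ArtifactId` class, the `s3_key_prefix` function, the `v1.8.x` branch value and the `bucket` name are hypothetical. The ordering branch/commit/os/variant/linking is also an assumption taken from the order of the prose, and note that the example key in the text omits the branch component.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactId:
    """Hypothetical container for the characteristics that distinguish a binary."""
    branch: str
    commit_id: str
    os_name: str
    variant: str
    linking: str  # "static" or "dynamic"


def s3_key_prefix(artifact: ArtifactId) -> str:
    """Join the characteristics into a key prefix (assumed ordering)."""
    if artifact.linking not in ("static", "dynamic"):
        raise ValueError(f"unknown linking strategy: {artifact.linking}")
    parts = [
        artifact.branch,
        artifact.commit_id,
        artifact.os_name,
        artifact.variant,
        artifact.linking,
    ]
    return "/".join(parts)


# Illustrative values only; the branch name and bucket are made up.
prefix = s3_key_prefix(
    ArtifactId(
        branch="v1.8.x",
        commit_id="73b29fa90d3eac0b1fae403b7583fdd1529942dc",
        os_name="ubuntu18.04",
        variant="cu100",
        linking="static",
    )
)
print(f"s3://bucket/{prefix}/libmxnet.so")
```

Keeping every characteristic as its own path segment means a downstream pipeline can locate a binary by rebuilding the same prefix from its own parameters.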

An MXNet artifact is defined as the following set of files:

@@ -53,13 +53,13 @@ If not set, derived through the value of sys.platform (https://docs.python.org/3

**Variant**

Manually configured through the --variant argument. The current variants are: cpu, native, cu92, cu100, cu101, cu102 and cu110.
Manually configured through the --variant argument. The current variants are: cpu, native, cu100, cu101, cu102, cu110 and cu112.

As long as the tool is being run from the MXNet code base, the runtime feature detection tool (https://github.com/larroy/mxnet/blob/dd432b7f241c9da2c96bcb877c2dc84e6a1f74d4/docs/api/python/libinfo/libinfo.md) can be used to detect whether the library has been compiled with MKL (library has MKL-DNN feature enabled) and/or CUDA support (compiled with CUDA feature enabled).

If it has been compiled with CUDA support, the output of /usr/local/cuda/bin/nvcc --version can be mined for the exact CUDA version (eg. 8.0, 9.0, etc.).
If it has been compiled with CUDA support, the output of /usr/local/cuda/bin/nvcc --version can be mined for the exact CUDA version (eg. 10.0, 11.0, etc.).

By knowing which features are enabled on the binary, and if necessary, which CUDA version is installed on the machine, the value for the variant argument can be calculated. Eg. if CUDA features are enabled, and nvcc reports cuda version 10, then the variant would be cu100. If neither MKL-DNN nor CUDA features are enabled, the variant would be native.
By knowing which features are enabled on the binary, and if necessary, which CUDA version is installed on the machine, the value for the variant argument can be calculated. Eg. if CUDA features are enabled, and nvcc reports cuda version 10.0, then the variant would be cu100. If neither MKL-DNN nor CUDA features are enabled, the variant would be native.
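
A minimal Python sketch of the variant calculation described above follows. It is not the tool's implementation: `cuda_version_from_nvcc` and `derive_variant` are hypothetical names, and the regular expression is an assumption based on the `nvcc --version` strings that appear in the unit test. Mapping "MKL-DNN without CUDA" to `cpu` is inferred from the statement that all variants except native have MKL-DNN enabled.

```python
import re
from typing import Optional


def cuda_version_from_nvcc(nvcc_output: str) -> Optional[str]:
    """Return e.g. '100' for 'release 10.0', or None if no release is found."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    # Concatenate major and minor digits: 10.0 -> "100", 11.2 -> "112".
    return match.group(1) + match.group(2)


def derive_variant(has_cuda: bool, has_mkldnn: bool, nvcc_output: str = "") -> str:
    """Compute the variant name from detected features (assumed rules)."""
    if has_cuda:
        version = cuda_version_from_nvcc(nvcc_output)
        if version is None:
            raise ValueError("CUDA is enabled but no CUDA version was found")
        return "cu" + version
    # Assumption: CPU builds with MKL-DNN are "cpu", without it "native".
    return "cpu" if has_mkldnn else "native"


sample = "Cuda compilation tools, release 10.0, V10.0.130"
print(derive_variant(has_cuda=True, has_mkldnn=True, nvcc_output=sample))
print(derive_variant(has_cuda=False, has_mkldnn=False))
```

The sketch does not distinguish a CUDA build without MKL-DNN, since the text does not say how such a build would be named.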

**Dependency Linking**

3 changes: 3 additions & 0 deletions cd/utils/mxnet_base_image.sh
@@ -33,6 +33,9 @@ case ${mxnet_variant} in
cu110*)
echo "nvidia/cuda:11.0-cudnn8-runtime-ubuntu18.04"
;;
cu112*)
echo "nvidia/cuda:11.2.1-cudnn8-runtime-ubuntu18.04"
;;
cpu)
echo "ubuntu:18.04"
;;
4 changes: 2 additions & 2 deletions cd/utils/test_artifact_repository.py
@@ -144,9 +144,9 @@ def test_get_cuda_version(self, mock):
cuda_version = get_cuda_version()
self.assertEqual(cuda_version, '100')

mock.return_value = b'Cuda compilation tools, release 9.2, V9.2.148'
mock.return_value = b'Cuda compilation tools, release 11.0, V11.0.148'
cuda_version = get_cuda_version()
self.assertEqual(cuda_version, '92')
self.assertEqual(cuda_version, '110')

@patch('artifact_repository.check_output')
def test_get_cuda_version_not_found(self, mock):
4 changes: 2 additions & 2 deletions ci/docker/Dockerfile.build.ubuntu_build_cuda
@@ -16,12 +16,12 @@
# specific language governing permissions and limitations
# under the License.
#
# Dockerfile to build MXNet on Ubuntu 16.04 for GPU but on
# Dockerfile to build MXNet on Ubuntu 18.04 for GPU but on
# a CPU-only instance. This restriction is caused by the CPP-
# package generation, requiring the actual CUDA library to be
# present

FROM nvidia/cuda:10.1-devel-ubuntu16.04
FROM nvidia/cuda:10.1-devel-ubuntu18.04

WORKDIR /work/deps

4 changes: 2 additions & 2 deletions ci/docker/Dockerfile.build.ubuntu_cpu
@@ -16,9 +16,9 @@
# specific language governing permissions and limitations
# under the License.
#
# Dockerfile to build and run MXNet on Ubuntu 16.04 for CPU
# Dockerfile to build and run MXNet on Ubuntu 18.04 for CPU

FROM ubuntu:16.04
FROM ubuntu:18.04

WORKDIR /work/deps

4 changes: 2 additions & 2 deletions ci/docker/Dockerfile.build.ubuntu_cpu_c
@@ -16,9 +16,9 @@
# specific language governing permissions and limitations
# under the License.
#
# Dockerfile to build and run MXNet on Ubuntu 16.04 for CPU
# Dockerfile to build and run MXNet on Ubuntu 18.04 for CPU

FROM ubuntu:16.04
FROM ubuntu:18.04

WORKDIR /work/deps

4 changes: 2 additions & 2 deletions ci/docker/Dockerfile.build.ubuntu_cpu_jekyll
@@ -16,9 +16,9 @@
# specific language governing permissions and limitations
# under the License.
#
# Dockerfile to build and run MXNet on Ubuntu 16.04 for CPU
# Dockerfile to build and run MXNet on Ubuntu 18.04 for CPU

FROM ubuntu:16.04
FROM ubuntu:18.04

WORKDIR /work/deps

4 changes: 2 additions & 2 deletions ci/docker/Dockerfile.build.ubuntu_cpu_julia
@@ -16,9 +16,9 @@
# specific language governing permissions and limitations
# under the License.
#
# Dockerfile to build and run MXNet on Ubuntu 16.04 for CPU
# Dockerfile to build and run MXNet on Ubuntu 18.04 for CPU

FROM ubuntu:16.04
FROM ubuntu:18.04

WORKDIR /work/deps

4 changes: 2 additions & 2 deletions ci/docker/Dockerfile.build.ubuntu_cpu_lite
@@ -16,9 +16,9 @@
# specific language governing permissions and limitations
# under the License.
#
# Dockerfile to build and run MXNet on Ubuntu 16.04 for CPU
# Dockerfile to build and run MXNet on Ubuntu 18.04 for CPU

FROM ubuntu:16.04
FROM ubuntu:18.04

WORKDIR /work/deps

4 changes: 2 additions & 2 deletions ci/docker/Dockerfile.build.ubuntu_cpu_python
@@ -16,9 +16,9 @@
# specific language governing permissions and limitations
# under the License.
#
# Dockerfile to build and run MXNet on Ubuntu 16.04 for CPU
# Dockerfile to build and run MXNet on Ubuntu 18.04 for CPU

FROM ubuntu:16.04
FROM ubuntu:18.04

WORKDIR /work/deps

4 changes: 2 additions & 2 deletions ci/docker/Dockerfile.build.ubuntu_cpu_r
@@ -16,9 +16,9 @@
# specific language governing permissions and limitations
# under the License.
#
# Dockerfile to build and run MXNet on Ubuntu 16.04 for CPU
# Dockerfile to build and run MXNet on Ubuntu 18.04 for CPU

FROM ubuntu:16.04
FROM ubuntu:18.04

WORKDIR /work/deps

4 changes: 2 additions & 2 deletions ci/docker/Dockerfile.build.ubuntu_cpu_scala
@@ -16,9 +16,9 @@
# specific language governing permissions and limitations
# under the License.
#
# Dockerfile to build and run MXNet on Ubuntu 16.04 for CPU
# Dockerfile to build and run MXNet on Ubuntu 18.04 for CPU

FROM ubuntu:16.04
FROM ubuntu:18.04

WORKDIR /work/deps

4 changes: 2 additions & 2 deletions ci/docker/Dockerfile.build.ubuntu_gpu_cu100
@@ -16,9 +16,9 @@
# specific language governing permissions and limitations
# under the License.
#
# Dockerfile to run MXNet on Ubuntu 16.04 for GPU
# Dockerfile to run MXNet on Ubuntu 18.04 for GPU

FROM nvidia/cuda:10.0-devel-ubuntu16.04
FROM nvidia/cuda:10.0-devel-ubuntu18.04

WORKDIR /work/deps

4 changes: 2 additions & 2 deletions ci/docker/Dockerfile.build.ubuntu_gpu_cu101
@@ -16,9 +16,9 @@
# specific language governing permissions and limitations
# under the License.
#
# Dockerfile to run MXNet on Ubuntu 16.04 for GPU
# Dockerfile to run MXNet on Ubuntu 18.04 for GPU

FROM nvidia/cuda:10.1-devel-ubuntu16.04
FROM nvidia/cuda:10.1-devel-ubuntu18.04

WORKDIR /work/deps

6 changes: 3 additions & 3 deletions ci/docker/Dockerfile.build.ubuntu_gpu_cu102
@@ -16,9 +16,9 @@
# specific language governing permissions and limitations
# under the License.
#
# Dockerfile to run MXNet on Ubuntu 16.04 for GPU
# Dockerfile to run MXNet on Ubuntu 18.04 for GPU

FROM nvidia/cuda:10.2-devel-ubuntu16.04
FROM nvidia/cuda:10.2-devel-ubuntu18.04

WORKDIR /work/deps

@@ -65,7 +65,7 @@ COPY install/ubuntu_tutorials.sh /work/
RUN /work/ubuntu_tutorials.sh

ENV CUDA_VERSION=10.2.89
ENV CUDNN_VERSION=7.6.5.32
ENV CUDNN_VERSION=8.0.4.30
COPY install/ubuntu_cudnn.sh /work/
RUN /work/ubuntu_cudnn.sh

4 changes: 2 additions & 2 deletions ci/docker/Dockerfile.build.ubuntu_gpu_cu110
@@ -16,9 +16,9 @@
# specific language governing permissions and limitations
# under the License.
#
# Dockerfile to run MXNet on Ubuntu 16.04 for GPU
# Dockerfile to run MXNet on Ubuntu 18.04 for GPU

FROM nvidia/cuda:11.0-cudnn8-devel-ubuntu16.04
FROM nvidia/cuda:11.0-cudnn8-devel-ubuntu18.04

WORKDIR /work/deps

48 changes: 48 additions & 0 deletions ci/docker/Dockerfile.build.ubuntu_gpu_cu112
@@ -0,0 +1,48 @@
# -*- mode: dockerfile -*-
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
# Dockerfile to run MXNet on Ubuntu 18.04 for GPU

FROM nvidia/cuda:11.2.1-cudnn8-devel-ubuntu18.04

WORKDIR /work/deps

COPY install/requirements /work/

COPY install/ubuntu_core.sh /work/
RUN /work/ubuntu_core.sh

COPY install/deb_ubuntu_ccache.sh /work/
RUN /work/deb_ubuntu_ccache.sh

COPY install/ubuntu_python.sh /work/
RUN /work/ubuntu_python.sh

COPY install/ubuntu_docs.sh /work/
RUN /work/ubuntu_docs.sh

# Always last
ARG USER_ID=0
ARG GROUP_ID=0
COPY install/ubuntu_adduser.sh /work/
RUN /work/ubuntu_adduser.sh

COPY runtime_functions.sh /work/

WORKDIR /work/mxnet
ENV LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda/compat
3 changes: 2 additions & 1 deletion ci/docker/install/requirements
@@ -31,4 +31,5 @@ pylint==2.3.1 # pylint and astroid need to be aligned
astroid==2.3.3 # pylint and astroid need to be aligned
requests<2.19.0,>=2.18.4
scipy==1.2.1
setuptools<50
setuptools
coverage
1 change: 1 addition & 0 deletions ci/docker/install/ubuntu_caffe.sh
@@ -40,6 +40,7 @@ git clone http://github.com/BVLC/caffe.git

cd caffe
cp Makefile.config.example Makefile.config
echo "OPENCV_VERSION := 3" >> Makefile.config

echo "CPU_ONLY := 1" >> Makefile.config

6 changes: 2 additions & 4 deletions ci/docker/install/ubuntu_clang.sh
@@ -25,11 +25,9 @@ set -ex
apt-get update || true
# Install clang 3.9 (the same version as in XCode 8.*) and 6.0 (latest major release)
wget -qO - http://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add - && \
apt-add-repository "deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial-3.9 main" && \
apt-add-repository "deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial-6.0 main" && \
apt-add-repository "deb http://apt.llvm.org/bionic/ llvm-toolchain-bionic-6.0 main" && \
apt-get update && \
apt-get install -y clang-3.9 clang-6.0 clang-tidy-6.0 && \
clang-3.9 --version && \
apt-get install -y clang-6.0 clang-tidy-6.0 && \
clang-6.0 --version

# Use llvm's master version of run-clang-tidy.py. This version has mostly minor updates, but
9 changes: 7 additions & 2 deletions ci/docker/install/ubuntu_core.sh
@@ -38,10 +38,14 @@ apt-get install -y \
libcurl4-openssl-dev \
libjemalloc-dev \
libhdf5-dev \
libomp5 \
libomp-dev \
liblapack-dev \
libopenblas-dev \
libopencv-dev \
libturbojpeg \
libjpeg-turbo8-dev \
libjpeg8-dev \
libturbojpeg0-dev \
libzmq3-dev \
libtinfo-dev \
zlib1g-dev \
@@ -52,11 +56,12 @@ apt-get install -y \
sudo \
unzip \
vim-nox \
openjdk-8-jdk \
openjdk-8-jre \
wget

# Use libturbojpeg package as it is correctly compiled with -fPIC flag
# https://github.com/HaxeFoundation/hashlink/issues/147
ln -s /usr/lib/x86_64-linux-gnu/libturbojpeg.so.0.1.0 /usr/lib/x86_64-linux-gnu/libturbojpeg.so


# CMake 3.13.2+ is required