Dockerise CI pipeline (apache#3393)
gerardo authored and Chris Fei committed Jan 23, 2019
1 parent 4db4442 commit 1bfe1f5
Showing 36 changed files with 485 additions and 580 deletions.
89 changes: 17 additions & 72 deletions .travis.yml
@@ -19,95 +19,40 @@
sudo: true
dist: trusty
language: python
jdk:
- oraclejdk8
services:
- cassandra
- mongodb
- mysql
- postgresql
- rabbitmq
addons:
apt:
packages:
- slapd
- ldap-utils
- openssh-server
- mysql-server-5.6
- mysql-client-core-5.6
- mysql-client-5.6
- krb5-user
- krb5-kdc
- krb5-admin-server
- oracle-java8-installer
- python-selinux
postgresql: "9.2"
python:
- "2.7"
- "3.5"
env:
global:
- DOCKER_COMPOSE_VERSION=1.20.0
- SLUGIFY_USES_TEXT_UNIDECODE=yes
- TRAVIS_CACHE=$HOME/.travis_cache/
- KRB5_CONFIG=/etc/krb5.conf
- KRB5_KTNAME=/etc/airflow.keytab
# Travis on google cloud engine has a global /etc/boto.cfg that
# does not work with python 3
- BOTO_CONFIG=/tmp/bogusvalue
matrix:
- TOX_ENV=flake8
- TOX_ENV=py27-backend_mysql
- TOX_ENV=py27-backend_sqlite
- TOX_ENV=py27-backend_postgres
- TOX_ENV=py35-backend_mysql
- TOX_ENV=py35-backend_sqlite
- TOX_ENV=py35-backend_postgres
- TOX_ENV=flake8
- TOX_ENV=py35-backend_mysql PYTHON_VERSION=3
- TOX_ENV=py35-backend_sqlite PYTHON_VERSION=3
- TOX_ENV=py35-backend_postgres PYTHON_VERSION=3
- TOX_ENV=py27-backend_postgres KUBERNETES_VERSION=v1.9.0
- TOX_ENV=py35-backend_postgres KUBERNETES_VERSION=v1.10.0
matrix:
exclude:
- python: "3.5"
env: TOX_ENV=py27-backend_mysql
- python: "3.5"
env: TOX_ENV=py27-backend_sqlite
- python: "3.5"
env: TOX_ENV=py27-backend_postgres
- python: "2.7"
env: TOX_ENV=py35-backend_mysql
- python: "2.7"
env: TOX_ENV=py35-backend_sqlite
- python: "2.7"
env: TOX_ENV=py35-backend_postgres
- python: "2.7"
env: TOX_ENV=flake8
- python: "3.5"
env: TOX_ENV=py27-backend_postgres KUBERNETES_VERSION=v1.9.0
- python: "2.7"
env: TOX_ENV=py35-backend_postgres KUBERNETES_VERSION=v1.10.0
- TOX_ENV=py35-backend_postgres KUBERNETES_VERSION=v1.10.0 PYTHON_VERSION=3
cache:
directories:
- $HOME/.wheelhouse/
- $HOME/.cache/pip
- $HOME/.travis_cache/
before_install:
- yes | ssh-keygen -t rsa -C [email protected] -P '' -f ~/.ssh/id_rsa
- cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
- ln -s ~/.ssh/authorized_keys ~/.ssh/authorized_keys2
- chmod 600 ~/.ssh/*
- jdk_switcher use oraclejdk8
- sudo ls -lh $HOME/.cache/pip/
- sudo rm -rf $HOME/.cache/pip/* $HOME/.wheelhouse/*
- sudo chown -R travis.travis $HOME/.cache/pip
install:
# Use recent docker-compose version
- sudo rm /usr/local/bin/docker-compose
- curl -L https://github.com/docker/compose/releases/download/${DOCKER_COMPOSE_VERSION}/docker-compose-`uname -s`-`uname -m` > docker-compose
- chmod +x docker-compose
- sudo mv docker-compose /usr/local/bin
- pip install --upgrade pip
- pip install tox
- pip install codecov
before_script:
- cat "$TRAVIS_BUILD_DIR/scripts/ci/my.cnf" | sudo tee -a /etc/mysql/my.cnf
- mysql -e 'drop database if exists airflow; create database airflow' -u root
- sudo service mysql restart
- psql -c 'create database airflow;' -U postgres
- export PATH=${PATH}:/tmp/hive/bin
# Required for K8s v1.10.x. See
# https://github.com/kubernetes/kubernetes/issues/61058#issuecomment-372764783
- sudo mount --make-shared / && sudo service docker restart
script:
- ./scripts/ci/travis_script.sh
- docker-compose --log-level ERROR -f scripts/ci/docker-compose.yml run airflow-testing /app/scripts/ci/run-ci.sh
after_success:
- sudo chown -R travis.travis .
- codecov
180 changes: 90 additions & 90 deletions CONTRIBUTING.md
@@ -22,7 +22,6 @@ under the License.
Contributions are welcome and are greatly appreciated! Every
little bit helps, and credit will always be given.


# Table of Contents
* [TOC](#table-of-contents)
* [Types of Contributions](#types-of-contributions)
@@ -34,11 +33,10 @@ little bit helps, and credit will always be given.
* [Documentation](#documentation)
* [Development and Testing](#development-and-testing)
- [Setting up a development environment](#setting-up-a-development-environment)
- [Pull requests guidelines](#pull-request-guidelines)
- [Testing Locally](#testing-locally)
- [Running unit tests](#running-unit-tests)
* [Pull requests guidelines](#pull-request-guidelines)
* [Changing the Metadata Database](#changing-the-metadata-database)


## Types of Contributions

### Report Bugs
@@ -98,55 +96,110 @@ extras to build the full API reference.

## Development and Testing

### Set up a development env using Docker
### Set up a development environment

Go to your Airflow directory and start a new docker container. You can choose between Python 2 or 3, whichever you prefer.
There are three ways to set up an Apache Airflow development environment.

```
# Start docker in your Airflow directory
docker run -t -i -v `pwd`:/airflow/ -w /airflow/ -e SLUGIFY_USES_TEXT_UNIDECODE=yes python:2 bash
1. Using tools and libraries installed directly on your system.

Install Python (2.7.x or 3.4.x), MySQL, and libxml first, using system-level package
managers like yum or apt-get for Linux, or Homebrew for Mac OS. Refer to the [base CI Dockerfile](https://github.com/apache/incubator-airflow-ci/blob/master/Dockerfile.base) for
a comprehensive list of required packages.

Then install the Python development requirements. It is usually best to work in a virtualenv:

```bash
cd $AIRFLOW_HOME
virtualenv env
source env/bin/activate
pip install -e .[devel]
```
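
From there you can initialize the metadata database and run tests, for example `airflow initdb` followed by `./run_unit_tests.sh` (see [Running unit tests](#running-unit-tests) below).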

2. Using a Docker container

Go to your Airflow directory and start a new docker container. You can choose between Python 2 or 3, whichever you prefer.

```
# Start docker in your Airflow directory
docker run -t -i -v `pwd`:/airflow/ -w /airflow/ -e SLUGIFY_USES_TEXT_UNIDECODE=yes python:2 bash
# Go to the Airflow directory
cd /airflow/
# Install Airflow with all the required dependencies,
# including the devel which will provide the development tools
pip install -e ".[hdfs,hive,druid,devel]"
# Init the database
airflow initdb
nosetests -v tests/hooks/test_druid_hook.py
test_get_first_record (tests.hooks.test_druid_hook.TestDruidDbApiHook) ... ok
test_get_records (tests.hooks.test_druid_hook.TestDruidDbApiHook) ... ok
test_get_uri (tests.hooks.test_druid_hook.TestDruidDbApiHook) ... ok
test_get_conn_url (tests.hooks.test_druid_hook.TestDruidHook) ... ok
test_submit_gone_wrong (tests.hooks.test_druid_hook.TestDruidHook) ... ok
test_submit_ok (tests.hooks.test_druid_hook.TestDruidHook) ... ok
test_submit_timeout (tests.hooks.test_druid_hook.TestDruidHook) ... ok
test_submit_unknown_response (tests.hooks.test_druid_hook.TestDruidHook) ... ok
----------------------------------------------------------------------
Ran 8 tests in 3.036s
OK
```

# Install Airflow with all the required dependencies,
# including the devel which will provide the development tools
pip install -e .[devel,druid,hdfs,hive]
The Airflow code is mounted inside the Docker container, so if you change something using your favorite IDE, you can test it directly in the container.

# Init the database
airflow initdb
3. Using [Docker Compose](https://docs.docker.com/compose/) and Airflow's CI scripts.

nosetests -v tests/hooks/test_druid_hook.py
Start a docker container through Compose for development to avoid installing the packages directly on your system. The following will give you a shell inside a container, start all the required service containers (MySQL, PostgreSQL, krb5 and so on), and install all the dependencies:

test_get_first_record (tests.hooks.test_druid_hook.TestDruidDbApiHook) ... ok
test_get_records (tests.hooks.test_druid_hook.TestDruidDbApiHook) ... ok
test_get_uri (tests.hooks.test_druid_hook.TestDruidDbApiHook) ... ok
test_get_conn_url (tests.hooks.test_druid_hook.TestDruidHook) ... ok
test_submit_gone_wrong (tests.hooks.test_druid_hook.TestDruidHook) ... ok
test_submit_ok (tests.hooks.test_druid_hook.TestDruidHook) ... ok
test_submit_timeout (tests.hooks.test_druid_hook.TestDruidHook) ... ok
test_submit_unknown_response (tests.hooks.test_druid_hook.TestDruidHook) ... ok
```bash
docker-compose -f scripts/ci/docker-compose.yml run airflow-testing bash
# From the container
pip install -e .[devel]
# Run all the tests with python and mysql through tox
tox -e py35-backend_mysql
```
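
If you prefer not to open an interactive shell, a minimal sketch of running a single test module in one shot (the test path is only an illustration; substitute the tests you touched):

```bash
# Assumes the same compose file and service name used above
docker-compose -f scripts/ci/docker-compose.yml run airflow-testing \
  bash -c "pip install -e .[devel] && nosetests -v tests/hooks/test_druid_hook.py"
```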

----------------------------------------------------------------------
Ran 8 tests in 3.036s
### Running unit tests

To run tests locally, once your unit test environment is set up (directly on your
system or through our Docker setup), you should be able to simply run
``./run_unit_tests.sh`` at will.

For example, in order to just execute the "core" unit tests, run the following:

OK
```
```
./run_unit_tests.sh tests.core:CoreTest -s --logging-level=DEBUG
```

or a single test method:

The Airflow code is mounted inside the Docker container, so if you change something using your favorite IDE, you can test it directly in the container.
```
./run_unit_tests.sh tests.core:CoreTest.test_check_operators -s --logging-level=DEBUG
```

To run the whole test suite with Docker Compose, do:

```
# Install Docker Compose first, then this will run the tests
docker-compose -f scripts/ci/docker-compose.yml run airflow-testing /app/scripts/ci/run-ci.sh
```
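
This is essentially the same command the Travis `script` step runs (see `.travis.yml` above), so a local Docker Compose run closely mirrors CI.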

### Set up a development env using Virtualenv
Alternatively, you can set up [Travis CI](https://travis-ci.org/) on your repo to automate this.
It is free for open source projects.

Please install Python (2.7.x or 3.4.x), MySQL, and libxml by using system-level package
managers like yum, apt-get for Linux, or Homebrew for Mac OS first.
It is usually best to work in a virtualenv and use tox. Install development requirements:
For more information on how to run a subset of the tests, take a look at the
nosetests docs.
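
For instance (the module and method names below are only illustrations; substitute the tests relevant to your change):

```bash
nosetests tests/hooks/test_druid_hook.py            # a single test file
nosetests tests.core:CoreTest                       # a single test class
nosetests tests.core:CoreTest.test_check_operators  # a single test method
```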

cd $AIRFLOW_HOME
virtualenv env
source env/bin/activate
pip install -e .[devel]
tox
See also the list of test classes and methods in `tests/core.py`.

Feel free to customize based on the extras available in [setup.py](./setup.py)

### Pull Request Guidelines
## Pull Request Guidelines

Before you submit a pull request from your forked repo, check that it
meets these guidelines:
@@ -187,59 +240,6 @@ using `flake8 airflow tests`. `git diff upstream/master -u -- "*.py" | flake8 --
commit messages and adhere to them. It makes the lives of those who
come after you a lot easier.

### Testing locally

#### TL;DR
Tests can then be run with (see also the [Running unit tests](#running-unit-tests) section below):

./run_unit_tests.sh

Individual test files can be run with:

nosetests [path to file]

#### Running unit tests

We *highly* recommend setting up [Travis CI](https://travis-ci.org/) on
your repo to automate this. It is free for open source projects. If for
some reason you cannot, you can use the steps below to run tests.

Here are loose guidelines on how to get your environment to run the unit tests.
We do understand that no one out there can run the full test suite since
Airflow is meant to connect to virtually any external system and that you most
likely have only a subset of these in your environment. You should run the
CoreTests and tests related to things you touched in your PR.

To set up a unit test environment, first take a look at `run_unit_tests.sh` and
understand that your ``AIRFLOW_CONFIG`` points to an alternate config file
while running the tests. You shouldn't have to alter this config file but
you may if need be.

From that point, you can export these same environment variables in
your shell, start an Airflow webserver with ``airflow webserver -d``, and
configure your connections. The default connections used in the tests
should already have been created; you just need to point them to the systems
where you want your tests to run.
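
A minimal sketch of that flow (the exact values are assumptions; check `run_unit_tests.sh` in your checkout for the ones it uses):

```bash
# Mirror the unit test environment in your shell (values assumed from run_unit_tests.sh)
export AIRFLOW_HOME=${AIRFLOW_HOME:-~/airflow}
export AIRFLOW_CONFIG=$AIRFLOW_HOME/unittests.cfg
# Start a webserver in debug mode, then adjust the default connections in the UI
airflow webserver -d
```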

Once your unit test environment is set up, you should be able to simply run
``./run_unit_tests.sh`` at will.

For example, in order to just execute the "core" unit tests, run the following:

```
./run_unit_tests.sh tests.core:CoreTest -s --logging-level=DEBUG
```

or a single test method:

```
./run_unit_tests.sh tests.core:CoreTest.test_check_operators -s --logging-level=DEBUG
```

For more information on how to run a subset of the tests, take a look at the
nosetests docs.

See also the list of test classes and methods in `tests/core.py`.

### Changing the Metadata Database

3 changes: 2 additions & 1 deletion README.md
@@ -89,7 +89,8 @@ unit of work and continuity.

Want to help build Apache Airflow? Check out our [contributing documentation](https://github.com/apache/airflow/blob/master/CONTRIBUTING.md).

## Who uses Apache Airflow?

## Who uses Airflow?

As the Apache Airflow community grows, we'd like to keep track of who is using
the platform. Please send a PR with your company name and @githubhandle
2 changes: 0 additions & 2 deletions airflow/operators/python_operator.py
@@ -176,10 +176,8 @@ class PythonVirtualenvOperator(PythonOperator):
variable named virtualenv_string_args will be available (populated by
string_args). In addition, one can pass stuff through op_args and op_kwargs, and one
can use a return value.
Note that if your virtualenv runs in a different Python major version than Airflow,
you cannot use return values, op_args, or op_kwargs. You can use string_args though.
:param python_callable: A python function with no references to outside variables,
defined with def, which will be run in a virtualenv
:type python_callable: function