Merge branch 'main' into votable_issues

* main: (69 commits)
  Revert plotting-vs-y (SciTools#4601)
  Bump peter-evans/create-pull-request from 3.13.0 to 3.14.0 (SciTools#4608)
  Support false-easting and false-northing when loading Mercator-projected data (SciTools#4524)
  Bump peter-evans/create-pull-request from 3.12.1 to 3.13.0 (SciTools#4607)
  Stop using nc_time_axis.CalendarDateTime (SciTools#4584)
  Utility class in netcdf loader should not be public. (SciTools#4592)
  Overnight benchmarks (SciTools#4583)
  Yaml fixes + clarifications. (SciTools#4594)
  Bump actions/script from 5.1.0 to 6 (SciTools#4586)
  Add missing commit to v3.2.x and update version number (SciTools#4593)
  docs linkcheck skip (SciTools#4590)
  Finalise whatsnew and version string update (SciTools#4588)
  [pre-commit.ci] pre-commit autoupdate (SciTools#4587)
  fix test (SciTools#4585)
  Loading Benchmarks (SciTools#4477)
  Fix load_http bug, extend testing, and note to docs (SciTools#4580)
  New tool-agnostic ASV environment management (SciTools#4571)
  Fix refresh lockfile worrkflow pull request title (SciTools#4579)
  gha: lockfiles labels and auto-pr details (SciTools#4578)
  Bump actions/script from 4 to 5.1.0 (SciTools#4576)
  ...
tkknight · Mar 2, 2022 · ea2064a
2 parents c88a082 + 0adcbfa
Showing 170 changed files with 9,639 additions and 6,019 deletions.
diff --git a/.cirrus.yml b/.cirrus.yml
@@ -38,7 +38,7 @@ env:
   # Conda packages to be installed.
   CONDA_CACHE_PACKAGES: "nox pip"
   # Git commit hash for iris test data.
-  IRIS_TEST_DATA_VERSION: "2.5"
+  IRIS_TEST_DATA_VERSION: "2.7"
   # Base directory for the iris-test-data.
   IRIS_TEST_DATA_DIR: ${HOME}/iris-test-data
 
@@ -60,7 +60,6 @@ linux_task_template: &LINUX_TASK_TEMPLATE
       - echo "$(date +%Y).$(expr $(date +%U) / ${CACHE_PERIOD}):${CONDA_CACHE_BUILD}"
       - uname -r
     populate_script:
-      - export CONDA_OVERRIDE_LINUX="$(uname -r | cut -d'+' -f1)"
       - bash miniconda.sh -b -p ${HOME}/miniconda
       - conda config --set always_yes yes --set changeps1 no
       - conda config --set show_channel_urls True
@@ -141,8 +140,6 @@ task:
   only_if: ${SKIP_TEST_TASK} == ""
   << : *CREDITS_TEMPLATE
   matrix:
-    env:
-      PY_VER: 3.7
     env:
       PY_VER: 3.8
   name: "${CIRRUS_OS}: py${PY_VER} tests"
@@ -153,7 +150,6 @@ task:
   << : *IRIS_TEST_DATA_TEMPLATE
   << : *LINUX_TASK_TEMPLATE
   tests_script:
-    - export CONDA_OVERRIDE_LINUX="$(uname -r | cut -d'+' -f1)"
     - echo "[Resources]" > ${SITE_CFG}
     - echo "test_data_dir = ${IRIS_TEST_DATA_DIR}/test_data" >> ${SITE_CFG}
     - echo "doc_dir = ${CIRRUS_WORKING_DIR}/docs" >> ${SITE_CFG}
@@ -174,7 +170,6 @@ task:
   << : *IRIS_TEST_DATA_TEMPLATE
   << : *LINUX_TASK_TEMPLATE
   tests_script:
-    - export CONDA_OVERRIDE_LINUX="$(uname -r | cut -d'+' -f1)"
     - echo "[Resources]" > ${SITE_CFG}
     - echo "test_data_dir = ${IRIS_TEST_DATA_DIR}/test_data" >> ${SITE_CFG}
     - echo "doc_dir = ${CIRRUS_WORKING_DIR}/docs" >> ${SITE_CFG}
@@ -197,7 +192,6 @@ task:
   name: "${CIRRUS_OS}: py${PY_VER} link check"
   << : *LINUX_TASK_TEMPLATE
   tests_script:
-    - export CONDA_OVERRIDE_LINUX="$(uname -r | cut -d'+' -f1)"
     - mkdir -p ${MPL_RC_DIR}
     - echo "backend : agg" > ${MPL_RC_FILE}
     - echo "image.cmap : viridis" >> ${MPL_RC_FILE}

diff --git a/.github/dependabot.yml b/.github/dependabot.yml
@@ -0,0 +1,15 @@
+# Reference:
+# - https://docs.github.com/en/code-security/supply-chain-security/keeping-your-dependencies-updated-automatically/keeping-your-actions-up-to-date-with-dependabot
+# - https://docs.github.com/en/code-security/supply-chain-security/keeping-your-dependencies-updated-automatically/configuration-options-for-dependency-updates
+
+version: 2
+updates:
+
+  - package-ecosystem: "github-actions"
+    directory: "/"
+    schedule:
+      # Check for updates to GitHub Actions every weekday
+      interval: "daily"
+    labels:
+      - "New: Pull Request"
+      - "Bot"
diff --git a/.github/workflows/benchmark.yml b/.github/workflows/benchmark.yml
@@ -1,10 +1,11 @@
-# This is a basic workflow to help you get started with Actions
+# Use ASV to check for performance regressions in the last 24 hours' commits.
 
 name: benchmark-check
 
 on:
-  # Triggers the workflow on push or pull request events but only for the master branch
-  pull_request:
+  schedule:
+    # Runs every day at 23:00.
+    - cron: "0 23 * * *"
 
 jobs:
   benchmark:
@@ -16,35 +17,29 @@ jobs:
       IRIS_TEST_DATA_PATH: benchmarks/iris-test-data
       IRIS_TEST_DATA_VERSION: "2.5"
       # Lets us manually bump the cache to rebuild
+      ENV_CACHE_BUILD: "0"
       TEST_DATA_CACHE_BUILD: "2"
+      PY_VER: 3.8
 
     steps:
       # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
       - uses: actions/checkout@v2
-
-      - name: Fetch the PR base branch too
-        run: |
-          git fetch --depth=1 origin ${{ github.event.pull_request.base.ref }}
-          git branch _base FETCH_HEAD
-          echo PR_BASE_SHA=$(git rev-parse _base) >> $GITHUB_ENV
+        with:
+          fetch-depth: 0
 
       - name: Install Nox
         run: |
           pip install nox
 
-      - name: Cache .nox and .asv/env directories
+      - name: Cache environment directories
         id: cache-env-dir
         uses: actions/cache@v2
         with:
           path: |
             .nox
             benchmarks/.asv/env
-          # Make sure GHA never gets an exact cache match by using the unique
-          #  github.sha. This means it will always store this run as a new
-          #  cache (Nox may have made relevant changes during run). Cache
-          #  restoration still succeeds via the partial restore-key match.
-          key: ${{ runner.os }}-${{ github.sha }}
-          restore-keys: ${{ runner.os }}
+            $CONDA/pkgs
+          key: ${{ runner.os }}-${{ hashFiles('requirements/') }}-${{ env.ENV_CACHE_BUILD }}
 
       - name: Cache test data directory
         id: cache-test-data
@@ -62,16 +57,51 @@ jobs:
           unzip -q iris-test-data.zip
           mkdir --parents ${GITHUB_WORKSPACE}/${IRIS_TEST_DATA_LOC_PATH}
           mv iris-test-data-${IRIS_TEST_DATA_VERSION} ${GITHUB_WORKSPACE}/${IRIS_TEST_DATA_PATH}
-          
+
       - name: Set test data var
         run: |
           echo "OVERRIDE_TEST_DATA_REPOSITORY=${GITHUB_WORKSPACE}/${IRIS_TEST_DATA_PATH}/test_data" >> $GITHUB_ENV
 
-      - name: Run CI benchmarks
+      - name: Run overnight benchmarks
+        run: |
+          first_commit=$(git log --after="$(date -d "1 day ago" +"%Y-%m-%d") 23:00:00" --pretty=format:"%h" | tail -n 1)
+          if [ "$first_commit" != "" ]
+          then
+            nox --session="benchmarks(overnight)" -- $first_commit
+          fi
+
+      - name: Create issues for performance shifts
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
         run: |
-          mkdir --parents benchmarks/.asv
-          set -o pipefail
-          nox --session="benchmarks(ci compare)" | tee benchmarks/.asv/ci_compare.txt
+          if [ -d benchmarks/.asv/performance-shifts ]
+          then
+            cd benchmarks/.asv/performance-shifts
+            for commit_file in *
+            do
+              pr_number=$(git log "$commit_file"^! --oneline | grep -o "#[0-9]*" | tail -1 | cut -c 2-)
+              assignee=$(gh pr view $pr_number --json author -q '.["author"]["login"]' --repo $GITHUB_REPOSITORY)
+              title="Performance Shift(s): \`$commit_file\`"
+              body="
+          Benchmark comparison has identified performance shifts at commit \
+          $commit_file (#$pr_number). Please review the report below and \
+          take corrective/congratulatory action as appropriate \
+          :slightly_smiling_face:
+
+          <details>
+          <summary>Performance shift report</summary>
+
+          \`\`\`
+          $(cat $commit_file)
+          \`\`\`
+
+          </details>
+
+          Generated by GHA run [\`${{github.run_id}}\`](https://github.com/${{github.repository}}/actions/runs/${{github.run_id}})
+              "
+              gh issue create --title "$title" --body "$body" --assignee $assignee --label "Bot" --label "Type: Performance" --repo $GITHUB_REPOSITORY
+            done
+          fi
 
       - name: Archive asv results
         if: ${{ always() }}
@@ -80,4 +110,3 @@ jobs:
           name: asv-report
           path: |
             benchmarks/.asv/results
-            benchmarks/.asv/ci_compare.txt
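
The `pr_number` pipeline in the "Create issues for performance shifts" step above (`grep -o "#[0-9]*" | tail -1 | cut -c 2-`) takes the last `#<number>` reference in the commit's one-line log. A minimal Python sketch of the same idea follows, for illustration only. The function name and the sample subjects are hypothetical, and unlike the shell pattern (which also matches a bare `#`) this version requires at least one digit.

```python
import re
from typing import Optional


def pr_number_from_subject(subject: str) -> Optional[int]:
    """Return the last ``#<digits>`` reference in a commit subject, if any.

    Hypothetical helper: it mirrors the workflow's grep/tail/cut pipeline,
    assuming the pull request number appears as ``(#NNNN)`` in the subject.
    """
    matches = re.findall(r"#(\d+)", subject)
    # Take the last match, as `tail -1` does; return None if there is none.
    return int(matches[-1]) if matches else None


print(pr_number_from_subject("Overnight benchmarks (#4583)"))
print(pr_number_from_subject("Direct push with no PR reference"))
```

Returning `None` when no reference is found lets a caller skip the lookup explicitly, whereas the shell version leaves `pr_number` empty in that case.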
diff --git a/.github/workflows/refresh-lockfiles.yml b/.github/workflows/refresh-lockfiles.yml
@@ -22,7 +22,9 @@ on:
         default: "no"
   schedule:
     # Run once a week on a Saturday night 
-    - cron: 1 0 * * 6
+    # N.B. "should" be quoted, according to
+    # https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#onschedule
+    - cron: "1 0 * * 6"
 
 
 jobs:
@@ -35,7 +37,7 @@ jobs:
       # the lockfile bot has made the head commit, abort the workflow.
       # This job can be manually overridden by running directly from the github actions panel
       # (known as a "workflow_dispatch") and setting the `clobber` input to "yes".
-      - uses: actions/script@v4
+      - uses: actions/script@v6
         with:
           github-token: ${{ secrets.GITHUB_TOKEN }}
           script: |
@@ -71,7 +73,7 @@ jobs:
 
     strategy:
       matrix:
-        python: ['37', '38']
+        python: ['38']
 
     steps:
       - uses: actions/checkout@v2
@@ -108,13 +110,25 @@ jobs:
           rm -r artifacts
         
       - name: Create Pull Request
-        uses: peter-evans/create-pull-request@052fc72b4198ba9fbc81b818c6e1859f747d49a8
+        id: cpr
+        uses: peter-evans/create-pull-request@18f7dc018cc2cd597073088f7c7591b9d1c02672
         with:
           commit-message: Updated environment lockfiles
           committer: "Lockfile bot <[email protected]>"
           author: "Lockfile bot <[email protected]>"
           delete-branch: true
           branch: auto-update-lockfiles
-          title: Update CI environment lockfiles
+          title: "[iris.ci] environment lockfiles auto-update"
           body: |
             Lockfiles updated to the latest resolvable environment.
+          labels: |
+            New: Pull Request
+            Bot
+
+      - name: Check Pull Request
+        if: steps.cpr.outputs.pull-request-number != ''
+        run: |
+          echo "pull-request #${{ steps.cpr.outputs.pull-request-number }}"
+          echo "pull-request URL ${{ steps.cpr.outputs.pull-request-url }}"
+          echo "pull-request operation [${{ steps.cpr.outputs.pull-request-operation }}]"
+          echo "pull-request head SHA ${{ steps.cpr.outputs.pull-request-head-sha }}"
diff --git a/.github/workflows/stale.yml b/.github/workflows/stale.yml
@@ -1,16 +1,20 @@
 # See https://github.com/actions/stale
 
 name: Stale issues and pull-requests
+
 on:
   schedule:
-    - cron: 0 0 * * *
+    # Run once a day
+    # N.B. "should" be quoted, according to
+    # https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#onschedule
+    - cron: "0 0 * * *"
 
 jobs:
   stale:
     if: "github.repository == 'SciTools/iris'"
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/stale@v4.0.0
+      - uses: actions/stale@v4.1.0
         with:
           repo-token: ${{ secrets.GITHUB_TOKEN }}
 
@@ -59,11 +63,11 @@ jobs:
           stale-pr-label: Stale
 
           # Labels on issues exempted from stale.
-          exempt-issue-labels: |
+          exempt-issue-labels:
             "Status: Blocked,Status: Decision Required,Peloton 🚴‍♂️,Good First Issue"
 
           # Labels on prs exempted from stale.
-          exempt-pr-labels: |
+          exempt-pr-labels:
             "Status: Blocked,Status: Decision Required,Peloton 🚴‍♂️,Good First Issue"
 
           # Max number of operations per run.

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -29,7 +29,7 @@ repos:
     -   id: no-commit-to-branch
 
 -   repo: https://github.com/psf/black
-    rev: 21.12b0
+    rev: 22.1.0
     hooks:
     -   id: black
         pass_filenames: false
@@ -50,14 +50,14 @@ repos:
         args: [--filter-files]
 
 -   repo: https://github.com/asottile/blacken-docs
-    rev: v1.12.0
+    rev: v1.12.1
     hooks:
     -   id: blacken-docs
         types: [file, rst]
         additional_dependencies: [black==21.6b0]
 
 -   repo: https://github.com/aio-libs/sort-all
-    rev: v1.1.0
+    rev: v1.2.0
     hooks:
     -   id: sort-all
         types: [file, python]
diff --git a/benchmarks/README.md b/benchmarks/README.md
@@ -0,0 +1,80 @@
+# Iris Performance Benchmarking
+
+Iris uses an [Airspeed Velocity](https://github.com/airspeed-velocity/asv)
+(ASV) setup to benchmark performance. This is primarily designed to check for
+performance shifts between commits using statistical analysis, but can also
+be easily repurposed for manual comparative and scalability analyses.
+
+The benchmarks are automatically run overnight
+[by a GitHub Action](../.github/workflows/benchmark.yml), with any notable
+shifts in performance being flagged in a new GitHub issue.
+
+## Running benchmarks
+
+`asv ...` commands must be run from this directory. You will need to have ASV
+installed, as well as Nox (see
+[Benchmark environments](#benchmark-environments)).
+
+[Iris' noxfile](../noxfile.py) includes a `benchmarks` session that provides
+conveniences for setting up before benchmarking, and can also replicate the
+automated overnight run locally. See the session docstring for detail.
+
+### Environment variables
+
+* ``DATA_GEN_PYTHON`` - required - path to a Python executable that can be
+used to generate benchmark test objects/files; see
+[Data generation](#data-generation). The Nox session sets this automatically,
+but will defer to any value already set in the shell.
+* ``BENCHMARK_DATA`` - optional - path to a directory for benchmark synthetic
+test data, which the benchmark scripts will create if it doesn't already
+exist. Defaults to ``<root>/benchmarks/.data/`` if not set.
+
+## Writing benchmarks
+
+[See the ASV docs](https://asv.readthedocs.io/) for full detail.
+
+### Data generation
+**Important:** be sure not to use the benchmarking environment to generate any
+test objects/files, as this environment changes with each commit being
+benchmarked, creating inconsistent benchmark 'conditions'. The
+[generate_data](./benchmarks/generate_data/__init__.py) module offers a
+solution; read more detail there.
+
+### ASV re-run behaviour
+
+Note that ASV re-runs a benchmark multiple times between calls to its `setup()` routine.
+This is a problem for benchmarking certain Iris operations such as data
+realisation, since the data will no longer be lazy after the first run.
+Consider writing extra steps to restore objects' original state _within_ the
+benchmark itself.
+
+If adding steps to the benchmark will skew the result too much then re-running
+can be disabled by setting an attribute on the benchmark: `number = 1`. To
+maintain result accuracy this should be accompanied by increasing the number of
+repeats _between_ `setup()` calls using the `repeat` attribute.
+`warmup_time = 0` is also advisable since ASV performs independent re-runs to
+estimate run-time, and these will still be subject to the original problem.
+
+### Scaling / non-Scaling Performance Differences
+
+When comparing performance between commits, file types, etc. it can be helpful
+to know if the differences exist in scaling or non-scaling parts of the Iris
+functionality in question. This can be done using a size parameter, setting
+one value to be as small as possible (e.g. a scalar `Cube`), and the other to
+be significantly larger (e.g. a 1000x1000 `Cube`). Performance differences
+might only be seen for the larger value, or the smaller, or both, getting you
+closer to the root cause.
+
+## Benchmark environments
+
+We have disabled ASV's standard environment management, instead using an
+environment built using the same Nox scripts as Iris' test environments. This
+is done using ASV's plugin architecture - see
+[asv_delegated_conda.py](asv_delegated_conda.py) and the extra config items in
+[asv.conf.json](asv.conf.json).
+
+(ASV is written to control the environment(s) that benchmarks are run in -
+minimising external factors and also allowing it to compare between a matrix
+of dependencies (each in a separate environment). We have chosen to sacrifice
+these features in favour of testing each commit with its intended dependencies,
+controlled by Nox + lock-files).
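
The README's "ASV re-run behaviour" and "Scaling / non-Scaling Performance Differences" sections could be combined in a single benchmark along the following lines. This is a minimal sketch and is not taken from the Iris benchmark suite. `LazyArray` is a hypothetical stand-in for a lazily loaded cube so that the example runs without Iris, and the class and method names are illustrative. `params`, `param_names`, `number`, `repeat` and `warmup_time` are the ASV benchmark attributes the README refers to.

```python
class LazyArray:
    """Hypothetical stand-in for a lazy cube: computes once, then caches."""

    def __init__(self, size):
        self.size = size
        self._data = None

    def has_lazy_data(self):
        # True until realise() has been called for the first time.
        return self._data is None

    def realise(self):
        if self._data is None:
            self._data = [i * 0.5 for i in range(self.size)]
        return self._data


class Realisation:
    # Size parameter: one tiny case and one much larger case, to help
    # separate fixed overheads from costs that scale with the data.
    params = [1, 1_000_000]
    param_names = ["size"]

    # Run the timed function once per setup() call, so every timed run
    # starts from a fresh, still-lazy object.
    number = 1
    # Take more samples to compensate for the single run per sample
    # (the value 10 is an arbitrary choice for this sketch).
    repeat = 10
    # Skip warm-up runs, which would otherwise realise the data early.
    warmup_time = 0.0

    def setup(self, size):
        self.array = LazyArray(size)

    def time_realise(self, size):
        self.array.realise()
```

If a benchmark has to keep the default `number`, the alternative the README suggests is to restore the object's lazy state inside the timed method itself, accepting that the restore step is included in the measurement.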