
Expand CI tests using matrix; make dependencies less restrictive; fix ONNX tests #233

Merged (9 commits) on Dec 14, 2022

Conversation

tomaarsen (Member)

Resolves #232

Hello!

Pull Request Overview

  • Expand on the CI setup by:
    1. Running tests for all combinations of Windows and Ubuntu, Python 3.7 through 3.10, and both the most recent and the oldest allowed versions of datasets, sentence-transformers and evaluate.
    2. Caching requirements for a significant CI speedup.
  • Resolve less restrictive dependency pinning #232 by setting dependencies via e.g. datasets>=2.3.0 rather than datasets==2.3.0.
  • Fix two bugs in ONNX tests:
    • If a GPU was available, only the model was placed on the GPU while the inputs remained on the CPU, causing an error.
    • The numpy inputs had an invalid dtype (int32 instead of int64).

Details

The motivation for this PR is to make the SetFit dependencies less strict. Ideally, I would like the required dependency versions to be as lenient as possible, but then we need tests to ensure that these older versions still work as intended. This is a recurring problem in (open source) CI test suites, as the tests are generally only run with the most recent versions. This PR tackles that by introducing a new extras_require option for compatibility tests:

REQUIRED_PKGS = ["datasets>=2.3.0", "sentence-transformers>=2.2.1", "evaluate>=0.3.0"]

...

COMPAT_TESTS_REQUIRE = [requirement.replace(">=", "==") for requirement in REQUIRED_PKGS] + TESTS_REQUIRE

EXTRAS_REQUIRE = {
    ...
    "compat_tests": COMPAT_TESTS_REQUIRE,
}

This can be installed like so:

pip install "setfit[compat_tests]"

In short, this installs the oldest allowed versions, i.e. datasets==2.3.0, sentence-transformers==2.2.1 and evaluate==0.3.0. Experimentation shows that these are the oldest versions with which setfit still works as intended.
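The pinning transform from the setup.py snippet above can be checked in isolation. This is a minimal sketch; the helper name `pin_to_oldest` is illustrative and not part of the actual setup.py:

```python
# Sketch of the version-pinning transform used for the compat_tests extra:
# ">=" lower bounds become "==" exact pins, so the compat CI run installs
# exactly the oldest allowed version of each dependency.
REQUIRED_PKGS = ["datasets>=2.3.0", "sentence-transformers>=2.2.1", "evaluate>=0.3.0"]

def pin_to_oldest(requirements):
    """Turn lower-bound specifiers into exact pins (hypothetical helper)."""
    return [requirement.replace(">=", "==") for requirement in requirements]

print(pin_to_oldest(REQUIRED_PKGS))
# ['datasets==2.3.0', 'sentence-transformers==2.2.1', 'evaluate==0.3.0']
```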

The CI has then been updated to use a matrix strategy:

...
  test_sampling:
    name: Run unit tests
    strategy:
      matrix:
        python-version: ['3.7', '3.8', '3.9', '3.10']
        os: [ubuntu-latest, windows-latest]
        requirements: ['.[tests]', '.[compat_tests]']
      fail-fast: false
    runs-on: ${{ matrix.os }}

    ...

This spawns 16 separate CI runs, one for each combination of these parameters. This helps us find issues that occur exclusively on old or new Python versions, exclusively on Windows or Ubuntu, or exclusively with older or newer dependency versions. I am open to adding macos-latest to the matrix, but recognize that it would increase the number of test runners from 16 to 24.
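The matrix size can be sanity-checked by enumerating the combinations, mirroring what the GitHub Actions matrix strategy does (the lists below copy the values from the workflow snippet above):

```python
from itertools import product

# The three matrix axes from the workflow file.
python_versions = ["3.7", "3.8", "3.9", "3.10"]
oses = ["ubuntu-latest", "windows-latest"]
requirements = [".[tests]", ".[compat_tests]"]

# The matrix strategy spawns one job per combination: 4 * 2 * 2 = 16.
combos = list(product(python_versions, oses, requirements))
print(len(combos))  # 16
```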

Furthermore, the CI has now been equipped with simple pip requirements caching:

...
      - name: Try to load cached dependencies
        uses: actions/cache@v3
        id: restore-cache
        with:
          path: ${{ env.pythonLocation }}
          key: python-dependencies-${{ matrix.os }}-${{ matrix.python-version }}-${{ matrix.requirements }}-${{ hashFiles('setup.py') }}-${{ env.pythonLocation }}

      - name: Install dependencies on cache miss
        run: |
          python -m pip install --no-cache-dir --upgrade pip
          python -m pip install --no-cache-dir ${{ matrix.requirements }}
        if: steps.restore-cache.outputs.cache-hit != 'true'
...

Bug fixes

The following two commits tackle bugs in the ONNX exporter that were exposed by the aforementioned test suite on my fork.

  1. cec5287
    Previously, the model could be loaded onto the GPU automatically, while the inputs remained on the CPU after passing through the transformers tokenizer. This caused errors on machines with CUDA available, while the tests passed normally on machines without it, such as the CI runners. That is why this was not picked up earlier.
  2. 47ad8ba
    On the CI, all of the Windows builds (including my local Windows dual boot) failed with:
    FAILED tests/exporters/test_onnx.py::test_export_onnx_sklearn_head - onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Unexpected input data type. Actual: (tensor(int32)) , expected: (tensor(int64))
    
    I resolved this by casting the inputs to int64 in the test, after which all of the CI passed.
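The dtype fix can be sketched as follows. This is an illustrative reconstruction, not the exact code from the commit; the helper name `cast_inputs_to_int64` is hypothetical. The underlying cause is that on Windows, numpy creates integer arrays as int32 by default (the C `long` type is 32-bit there), while ONNX Runtime expects int64 input tensors:

```python
import numpy as np

def cast_inputs_to_int64(onnx_inputs):
    """Cast every integer input array to int64 before session.run() (hypothetical helper)."""
    return {
        name: array.astype(np.int64) if np.issubdtype(array.dtype, np.integer) else array
        for name, array in onnx_inputs.items()
    }

# Example: an int32 array (the Windows default for integer token ids) is widened to int64.
inputs = {"input_ids": np.array([[101, 2023, 102]], dtype=np.int32)}
fixed = cast_inputs_to_int64(inputs)
print(fixed["input_ids"].dtype)  # int64
```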

What now?

I believe this should be ready for merging if we agree on the changes to the CI.

Thank you @jannikmi for bringing this to my attention.

  • Tom Aarsen

@tomaarsen added the enhancement (New feature or request) and CI (Regarding the Continuous Integration) labels on Dec 13, 2022
@lewtun (Member) left a comment:

Wow, thanks for this big quality of life improvement to the CI @tomaarsen 🔥

Good call on relaxing the pinned library versions!

@lewtun lewtun merged commit 273244b into huggingface:main Dec 14, 2022