
Expand CI tests using matrix; make dependencies less restrictive; fix ONNX tests #233

Merged (9 commits) on Dec 14, 2022

Conversation

tomaarsen (Member)

Resolves #232

Hello!

Pull Request Overview

  • Expand on the CI setup by:
    1. Running tests for all combinations of Windows and Ubuntu, Python 3.7 through 3.10, and both the most recent and the oldest allowed versions of datasets, sentence-transformers and evaluate.
    2. Caching requirements for a significant CI speedup.
  • Resolve less restrictive dependency pinning #232 by setting dependencies via e.g. datasets>=2.3.0 rather than datasets==2.3.0.
  • Fix two bugs in ONNX tests:
    • If a GPU was available, only the model was placed on the GPU while the inputs remained on the CPU, causing an error.
    • The numpy inputs had an invalid dtype (int32 instead of int64).

Details

The motivation for this PR is to make the SetFit dependencies less strict. Ideally, I would like the required dependency versions to be as lenient as possible, but then we need tests to ensure that these older versions still work as intended. This is a recurring problem in (open source) CI test suites, as the tests are generally only run with the most recent versions. This PR tackles that by introducing a new extras_require option for compatibility tests:

REQUIRED_PKGS = ["datasets>=2.3.0", "sentence-transformers>=2.2.1", "evaluate>=0.3.0"]

...

COMPAT_TESTS_REQUIRE = [requirement.replace(">=", "==") for requirement in REQUIRED_PKGS] + TESTS_REQUIRE

EXTRAS_REQUIRE = {
    ...
    "compat_tests": COMPAT_TESTS_REQUIRE,
}

This can be installed like so:

pip install "setfit[compat_tests]"

In short, this installs the oldest allowed versions, i.e. datasets==2.3.0, sentence-transformers==2.2.1 and evaluate==0.3.0. Experimentation shows that these are the oldest versions with which setfit still works as intended.
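The pinning transform from the setup.py snippet above can be checked in isolation. This is a minimal sketch; the helper name `pin_to_oldest` is illustrative and not part of the actual setup.py:

```python
# Sketch of the version-pinning transform used for the compat_tests extra:
# ">=" lower bounds become "==" exact pins, so the compat CI run installs
# exactly the oldest allowed version of each dependency.
REQUIRED_PKGS = ["datasets>=2.3.0", "sentence-transformers>=2.2.1", "evaluate>=0.3.0"]

def pin_to_oldest(requirements):
    """Turn lower-bound specifiers into exact pins (hypothetical helper)."""
    return [requirement.replace(">=", "==") for requirement in requirements]

print(pin_to_oldest(REQUIRED_PKGS))
# ['datasets==2.3.0', 'sentence-transformers==2.2.1', 'evaluate==0.3.0']
```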

The CI has then been updated to use a matrix strategy:

...
  test_sampling:
    name: Run unit tests
    strategy:
      matrix:
        python-version: ['3.7', '3.8', '3.9', '3.10']
        os: [ubuntu-latest, windows-latest]
        requirements: ['.[tests]', '.[compat_tests]']
      fail-fast: false
    runs-on: ${{ matrix.os }}

    ...

This spawns 16 separate CI runs, one for each combination of these parameters. This helps us find issues that occur exclusively on old or new Python versions, exclusively on Windows or Ubuntu, or exclusively with older or newer dependency versions. I am open to adding macos-latest to the matrix, but recognize that it would increase the number of test runners from 16 to 24.
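The matrix size can be sanity-checked by enumerating the combinations, mirroring what the GitHub Actions matrix strategy does (the lists below copy the values from the workflow snippet above):

```python
from itertools import product

# The three matrix axes from the workflow file.
python_versions = ["3.7", "3.8", "3.9", "3.10"]
oses = ["ubuntu-latest", "windows-latest"]
requirements = [".[tests]", ".[compat_tests]"]

# The matrix strategy spawns one job per combination: 4 * 2 * 2 = 16.
combos = list(product(python_versions, oses, requirements))
print(len(combos))  # 16
```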

Furthermore, the CI has now been equipped with simple pip requirements caching:

...
      - name: Try to load cached dependencies
        uses: actions/cache@v3
        id: restore-cache
        with:
          path: ${{ env.pythonLocation }}
          key: python-dependencies-${{ matrix.os }}-${{ matrix.python-version }}-${{ matrix.requirements }}-${{ hashFiles('setup.py') }}-${{ env.pythonLocation }}

      - name: Install dependencies on cache miss
        run: |
          python -m pip install --no-cache-dir --upgrade pip
          python -m pip install --no-cache-dir ${{ matrix.requirements }}
        if: steps.restore-cache.outputs.cache-hit != 'true'
...

Bug fixes

The following two commits tackle bugs in the ONNX exporter that were exposed by the aforementioned test suite on my fork.

  1. cec5287
    Previously, the model could be loaded onto the GPU automatically, while the inputs remained on the CPU after passing through the transformers tokenizer. This caused errors on machines with CUDA available, while the tests passed normally on machines without it, such as the CI runners. That is why this was not picked up earlier.
  2. 47ad8ba
    On the CI, all of the Windows builds (including my local Windows dual boot) failed with:
    FAILED tests/exporters/test_onnx.py::test_export_onnx_sklearn_head - onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Unexpected input data type. Actual: (tensor(int32)) , expected: (tensor(int64))
    
    I resolved this by casting the inputs to int64 in the test, after which all of the CI passed.
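The dtype fix can be sketched as follows. This is an illustrative reconstruction, not the exact code from the commit; the helper name `cast_inputs_to_int64` is hypothetical. The underlying cause is that on Windows, numpy creates integer arrays as int32 by default (the C `long` type is 32-bit there), while ONNX Runtime expects int64 input tensors:

```python
import numpy as np

def cast_inputs_to_int64(onnx_inputs):
    """Cast every integer input array to int64 before session.run() (hypothetical helper)."""
    return {
        name: array.astype(np.int64) if np.issubdtype(array.dtype, np.integer) else array
        for name, array in onnx_inputs.items()
    }

# Example: an int32 array (the Windows default for integer token ids) is widened to int64.
inputs = {"input_ids": np.array([[101, 2023, 102]], dtype=np.int32)}
fixed = cast_inputs_to_int64(inputs)
print(fixed["input_ids"].dtype)  # int64
```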

What now?

I believe this should be ready for merging if we agree on the changes to the CI.

Thank you @jannikmi for bringing this to my attention.

  • Tom Aarsen

@tomaarsen added the enhancement (New feature or request) and CI (Regarding the Continuous Integration) labels on Dec 13, 2022
@lewtun (Member) left a comment:

Wow, thanks for this big quality of life improvement to the CI @tomaarsen 🔥

Good call on relaxing the pinned library versions!

@lewtun lewtun merged commit 273244b into huggingface:main Dec 14, 2022