Expand CI tests using matrix; make dependencies less restrictive; fix ONNX tests #233
Resolves #232
Hello!
Pull Request Overview
Make the dependencies on `datasets`, `sentence-transformers` and `evaluate` less restrictive: `datasets>=2.3.0` rather than `datasets==2.3.0`.

Details
The motivation for this PR is to make the SetFit dependencies less strict. Ideally, I would like the required dependency versions to be as lenient as possible, but then we do need tests to ensure that these older versions still work as intended. This is a recurring problem in (open source) CI test suites, as generally the tests are only run with the most recent versions. This PR tackles that by introducing a new `extras_require` option for compatibility tests:

This can be installed like so:
In short, this installs the oldest legal versions, i.e. `datasets==2.3.0`, `sentence-transformers==2.2.1` and `evaluate==0.3.0`. Experimentation shows that these are the oldest versions for which setfit still works as intended.

The CI has then been updated to use a matrix strategy:
This spawns 16 separate CI runs using the different combinations of these parameters. This helps us find issues that relate exclusively to old or new Python versions, exclusively to Windows or Ubuntu, or exclusively to older or newer dependency versions. I am open to adding `macos-latest` to the test suite, but recognize that it will increase the number of test runners from 16 to 24.

Furthermore, the CI has now been equipped with simple pip requirements caching.
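For readers who want to check the numbers above, here is a minimal Python sketch of the two ideas: pinning each lower bound to its oldest allowed version, and expanding a matrix into individual runs. It is an illustration under stated assumptions, not the code from this PR. The lower bounds are inferred from the oldest legal versions quoted above, the helper names `pin_to_oldest` and `expand_matrix` are hypothetical, the Python version labels are placeholders (only the count of four is implied by 16 runs over two operating systems and two dependency sets), and `windows-latest` is an assumed runner label.

```python
from itertools import product

# Lower bounds inferred from the oldest legal versions quoted in the description.
INSTALL_REQUIRES = [
    "datasets>=2.3.0",
    "sentence-transformers>=2.2.1",
    "evaluate>=0.3.0",
]


def pin_to_oldest(requirements):
    """Turn each 'package>=X' lower bound into an exact 'package==X' pin."""
    return [req.replace(">=", "==") for req in requirements]


def expand_matrix(axes):
    """Return one dict per combination of the matrix axes."""
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*axes.values())]


# Hypothetical axis values: only the totals (16 runs, 24 with macOS)
# come from the description; the labels below are placeholders.
MATRIX = {
    "python": ["py-a", "py-b", "py-c", "py-d"],
    "os": ["ubuntu-latest", "windows-latest"],
    "requirements": ["latest", "oldest"],
}

compat_pins = pin_to_oldest(INSTALL_REQUIRES)
runs = expand_matrix(MATRIX)
runs_with_macos = expand_matrix({**MATRIX, "os": MATRIX["os"] + ["macos-latest"]})

print(compat_pins)
print(len(runs), len(runs_with_macos))
```

Each extra axis value multiplies the run count, which is why adding a third operating system takes the total from 16 to 24 rather than adding a single run.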
Bug fixes
The following two commits tackle bugs in the ONNX exporter that were exposed by the aforementioned test suite on my fork.
Previously, the model could be loaded onto the GPU automatically, while the inputs remained on the CPU after passing through the `transformers` tokenizer. This causes issues on machines with CUDA available, while the tests pass normally on machines without it, like the CI runner. That is why this was not picked up previously.

On the CI, all of the Windows builds experienced:
What now?
I believe this should be ready for merging if we agree on the changes to the CI.
Thank you @jannikmi for bringing this to my attention.