Add automated unit testing on pull requests with GitHub Actions #356
Conversation
Force-pushed 3bf0152 → 376a227
Getting an error about missing …
Looks like a dependency problem with the decorator package. In general, I think we'd do best to set up a proper miniconda environment here, rather than relying on pip.
Force-pushed bc98222 → c216467
Can definitely update this to follow https://github.com/librosa/librosa/blob/main/.github/workflows/ci.yml instead, but I think there's value in getting a standard …
yield-based tests are skipped by pytest: https://github.com/craffel/mir_eval/actions/runs/5236939084/jobs/9454797085#step:5:215
Distributing via conda (convincing people to use it) and using conda for internal testing environments are two different things. I was only advocating for the latter here, and the librosa example you point to actually uses pip to install the package being tested.
Yes - migrating from nose to pytest is going to be a pretty hefty endeavor for this reason, and I think it should be handled in a separate PR.
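For reference, this is roughly what the migration looks like: nose collected yield-based generator tests, while pytest silently skips them and expects `pytest.mark.parametrize` instead. The metric check below is a hypothetical illustration, not one of mir_eval's actual tests.

```python
import pytest

# Hypothetical check used for illustration; mir_eval's real tests
# iterate over many (reference, estimate) pairs in a similar way.
def check_score_in_range(score):
    assert 0.0 <= score <= 1.0

# nose-style yield test (collected by nose, silently skipped by pytest):
#
#     def test_scores():
#         for score in [0.0, 0.5, 1.0]:
#             yield check_score_in_range, score

# pytest equivalent: each parameter value becomes its own test case.
@pytest.mark.parametrize("score", [0.0, 0.5, 1.0])
def test_scores(score):
    check_score_in_range(score)
```

Each `yield`-style loop in the old tests would need to be lifted into a decorator like this, which is why it is a sizable change.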
Got it! I'll update this PR to use miniconda accordingly.
Force-pushed 1b69d70 → 0c0f6c1
Force-pushed 0c0f6c1 → 41d42c0
Initial conda done! 🐍 Changing from …
Questions:
Or, at a higher level: how closely should mir_eval follow librosa? Identically? Loosely? This package doesn't use Numba or Cython, so it might be less susceptible to breaking. I'd consider using …
Awesome, thanks for the effort on this!
In general, cross-platform tests are a good idea. However, I also see your point that there's not a lot going on in mir_eval that could break on different platforms (eg cython builds or numba/llvmlite dependencies). There are some points in the tests that might need a little tweaking to run on windows, mainly to do with file globbing and the like, but I'm not too worried about it. My vote for now would be to just test on linux until we have sufficient cause to expand. @craffel what do you think?
To be clear, are you referring to the different Python versions (3.7, 3.8, etc.) and the minimal environment here? In general, I think it's good to have at least two test builds: one using pinned minimum versions of the dependencies, and one using the latest stable releases (no upper pin on versions).

In librosa, we also fan this out across Python versions, but that has more to do with the fact that librosa ships conda packages as well as PyPI packages; because conda packages are explicitly linked to a Python version, we need to ensure that the entire dependency stack works in each version, hence separate tests for 3.7, 3.8, 3.9, etc. If mir_eval is not shipping conda packages, this is less important. (That said, I think mir_eval should ship conda packages, but I digress.)

We also have a handful of other envs and actions for things like doc builds (we can't use RTD for various reasons), release deployment, etc. These are mostly unrelated to testing, with the exception of the linting action. I made this one a separate action in librosa because it doesn't require installing librosa (and all of its dependencies), only the tools to do static analysis on the code, so the environment is much lighter than the full test env.

mir_eval has linting rolled into the main test action (i.e. in the same env). I think this step should probably be split off into its own job (if not a separate action entirely) so that we can disentangle style checking from correctness checking, but it's not a big deal to leave it as is. I'm happy to go with whatever's easiest for now and refine the workflow later on.
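A rough sketch of the two-build matrix described above (the environment file names and version numbers are assumptions for illustration, not mir_eval's actual configuration):

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        include:
          # Oldest supported stack: dependencies pinned at minimum versions.
          - python-version: "3.7"
            envfile: ".github/environment-minimal.yml"
          # Latest stable stack: no upper pins, catches upstream breakage early.
          - python-version: "3.11"
            envfile: ".github/environment.yml"
    steps:
      - uses: actions/checkout@v3
      - uses: conda-incubator/setup-miniconda@v2
        with:
          python-version: ${{ matrix.python-version }}
          environment-file: ${{ matrix.envfile }}
```

The `include` form keeps each environment file explicitly tied to a Python version, which is the property the conda-packaging discussion above hinges on.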
Yeah, I'm comfortable with only testing on Linux. OTOH I don't even think we have any filesystem access stuff in the core library (just in the tests).
Actually, I'm going to reverse myself on this. We should definitely include cross-platform tests. While mir_eval itself doesn't have any platform-specific funny business, there have been incidents in the past where platform changes surfaced some nasty reproducibility issues that took me quite a bit of time to pin down. These mainly had to do with differences in the underlying BLAS package between Linux and Mac (and maybe architecture dependent), causing slight deviations in numerical stability for bss eval. We eventually resolved this by relaxing the tests to the point where these deviations didn't matter - remember kids, never trust decibel measurements past the first decimal place - but having cross-platform tests in place is handy for detecting this kind of thing quickly.
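The "relax the tests" fix described above can be sketched like this; the SDR values and the 0.1 dB tolerance are made-up illustrations, not mir_eval's actual numbers.

```python
import numpy as np

# Hypothetical SDR values (in dB) computed on two BLAS backends.
# Tiny numerical differences past the first decimal place are expected.
sdr_linux = np.array([12.3456, 8.9012])
sdr_macos = np.array([12.3461, 8.9007])

# Absolute tolerance absorbs cross-platform numerical noise,
# following the "never trust dB past the first decimal place" rule.
np.testing.assert_allclose(sdr_linux, sdr_macos, atol=0.1)
```

An exact-equality assertion here would pass on one runner and fail on another, which is exactly the class of flakiness a cross-platform matrix surfaces quickly.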
Force-pushed 9221f9d → bb19bff
1. Use both flexible and strict channel priority for the conda environment.
2. Include macOS and Windows for the latest Python version.
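For context, channel priority is an input on the setup-miniconda action; a minimal sketch of where it would be set (the channel list here is an assumption):

```yaml
- uses: conda-incubator/setup-miniconda@v2
  with:
    channel-priority: strict   # or "flexible"
    channels: conda-forge,defaults
```

Strict priority resolves every package from the highest-priority channel that carries it, which avoids accidental mixing of conda-forge and defaults builds at the cost of occasionally failing to solve.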
SciPy is already part of install_requires, so it's not necessary to have it in extras_require as well.
Force-pushed fd2c383 → 4ba2070
Well, that might have been a good call, actually, because two unit tests fail with OOM on Windows despite running similar code with similar memory availability (hardware specs according to github.com). It's not obvious to me that this isn't an actual error. We could consider tweaking the sizes here and here to something even shorter, but 2147042648 is curiously large, so this needs more digging. Aiming to look more into this tomorrow!
That looks pretty obviously like an integer overflow, e.g. triggered here:
These look like sample indices to me. It may also be related to the issue identified in #355, where the …
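The overflow hypothesis is easy to reproduce: 2147042648 sits just below the int32 maximum of 2147483647, and on Windows NumPy's default integer type is 32-bit (C `long`) even on a 64-bit interpreter, while Linux and macOS default to 64-bit. The sample counts below are illustrative, not the ones from the failing tests.

```python
import numpy as np

# Sample-index arithmetic that silently wraps in 32-bit integers.
# An explicit int32 dtype reproduces on any platform what Windows
# runners get by default.
idx = np.array([60 * 44100], dtype=np.int32)  # one minute at 44.1 kHz
total = idx * 1000                            # exceeds int32 range and wraps

print(total[0])                 # -1648967296 after wraparound
print(np.iinfo(np.int32).max)   # 2147483647, just above 2147042648 from the log
```

A wrapped index like this can turn into a huge allocation size once it is reinterpreted, which would explain an OOM that only appears on Windows.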
But why would this only fail on Windows runners? I didn't think conda packages for NumPy/SciPy could behave that differently. I'm assuming some OpenBLAS vs MKL or gcc vs MSVC differences that a curious soul could dig into. 🤔 I guess this PR is blocked on figuring this out properly first.
Is it possible the Windows build is running 32-bit while the others are 64-bit, and the integer overflow we're hitting just happens to land in between?
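A quick way to check this from a CI step, as a debugging sketch:

```python
import struct
import sys

# Pointer width of the interpreter: 64 on a 64-bit build, 32 on 32-bit.
print(struct.calcsize("P") * 8)

# Equivalent check via sys.maxsize.
print(sys.maxsize > 2**32)   # True on 64-bit builds
```

Note that even a 64-bit Python on Windows doesn't rule out the overflow: NumPy's default integer there is still 32-bit (C `long`), so the interpreter's bitness and the array dtype can disagree.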
Hmm, perhaps but supposedly |
Since miniconda hasn't released Python 3.11 support yet as per conda/conda#11170 (comment)
Fishing for ideas in conda-incubator/setup-miniconda#287
This might be easier to debug if you add a … It might also help to add a step that dumps the full BLAS / LAPACK setup into the test log.
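The exact command was stripped from the comment, but a plausible version of such a step (an assumption, not the original) would be:

```shell
# Print the BLAS/LAPACK configuration NumPy and SciPy were built against,
# so the CI log shows which backend (OpenBLAS, MKL, ...) each runner uses.
python -c "import numpy; numpy.show_config()"
python -c "import scipy; scipy.show_config()"
```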
Hmm, I didn't spot anything obviously wrong in the debug logging. How would it feel to mark the Windows runner in the build matrix as excluded for now, with a TODO, and still proceed with merging this PR? Not optimal, but it would be nice to make some incremental progress.
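A sketch of what that exclusion could look like in the matrix (versions here are illustrative):

```yaml
strategy:
  matrix:
    os: [ubuntu-latest, macos-latest, windows-latest]
    python-version: ["3.10"]
    exclude:
      # TODO(#355): re-enable once the suspected int32 overflow is fixed.
      - os: windows-latest
```

An `exclude` entry removes matching combinations after the matrix is expanded, so macOS and Linux coverage is unaffected.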
We could also just leave it in (failing) and merge a fix (#355) later. |
Sounds good! I'm happy enough with this PR for now so if you're up for it feel free to merge.
Thank you Carl! |
What?
Introduce automated testing on PRs via GitHub Actions.
Why?
To make PR reviews easier by automatically checking that unit tests pass on fresh installations of mir_eval, as discussed in #354.