audio metrics: SNR, SI_SDR, SI_SNR (#292)

* add snr, si_sdr, si_snr
* format
* add noqa: F401 to __init__.py
* remove types in doc, change estimate to preds, remove EPS
* update functional.rst
* update CHANGELOG.md
* switch preds and target
* switch preds and target in Example
* add SNR, SI_SNR, SI_SDR module implementation
* add test
* add module doc
* use _check_same_shape
* to alphabetical order
* update test
* move Base to the top of Audio
* add soundfile
* gcc
* fix mocking
* image
* doctest
* mypy
* fix requirements
* fix dtype
* something
* update
* adjust
* Apply suggestions from code review
* update test_snr
* update test_si_snr
* new snr: use torch.finfo(preds.dtype).eps
* update test_snr.py
* new si_sdr imp
* update test_si_sdr
* update test_si_snr
* remove pb_bss_eval
* add museval
* update test files
* remove museval
* add funcs update return None annotation
* add 'Setup ffmpeg'
* update "Setup ffmpeg"
* use setup-conda@v1
* multi-OS
* update atol to 1e-5
* Apply suggestions from code review
* change atol to 1e-2
* update
* fix 'Setup Linux' not activated
* add sudo
* reduce Time to 100 to reduce the test time
* increase timeoutInMinutes to 40
* install ffmpeg
* timeout-minutes to 55
* +git
* show-error-codes
* .detach().cpu().numpy() first
* add numpy
* numpy
* ignore_errors torchmetrics.audio.*
* solve mypy no-redef error
* remove --quiet
* pypesq
* apt
* add # type: ignore
* try without test_si_snr & test_si_sdr
* test_import_speechmetrics
* test_speechmetrics_si_sdr
* test_si_sdr_functional
* test audio only
* install libsndfile1
* add sisnr sisdr test
* test all & add quiet & remove test_speechmetrics
* remove sudo & install libsndfile1
* add test
* update
* fix tests
* typing
* fix typing
* fix bus error
* SRMRpy
* pesq
* gcc
* comment -u root cuda 10.2 whoami
* env

Co-authored-by: quancs <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nicki Skafte <[email protected]>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Jirka <[email protected]>
Co-authored-by: Jirka Borovec <[email protected]>
Co-authored-by: Justus Schock <[email protected]>
Parent: a75445b
Commit: fe03f3a
Showing 22 changed files with 986 additions and 9 deletions.
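
For orientation, here is a minimal usage sketch of the new metric as exercised by the test file below (torchmetrics.functional.si_sdr and torchmetrics.audio.SI_SDR). The SNR and SI_SNR additions from this commit presumably follow the same call pattern, but that is an assumption; only SI_SDR appears in this file.

    # Usage sketch only: assumes the API exercised by the test file shown below.
    import torch
    from torchmetrics.audio import SI_SDR
    from torchmetrics.functional import si_sdr

    preds = torch.randn(8, 1, 100)   # [batch, channel, time], e.g. an enhanced signal
    target = torch.randn(8, 1, 100)  # clean reference with the same shape

    # Functional form: one SI-SDR value per sample/channel,
    # which is what the test compares against speechmetrics per item.
    per_sample = si_sdr(preds, target, zero_mean=True)

    # Module form: accumulates state across batches, then reduces on compute().
    metric = SI_SDR(zero_mean=True)
    metric.update(preds, target)
    print(metric.compute())

Passing preds and target with mismatched shapes raises a RuntimeError, as test_error_on_different_shape at the end of the file checks.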
@@ -0,0 +1,131 @@
# Copyright The PyTorch Lightning team.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from collections import namedtuple
from functools import partial

import pytest
import speechmetrics
import torch
from torch import Tensor

from tests.helpers import seed_all
from tests.helpers.testers import BATCH_SIZE, NUM_BATCHES, MetricTester
from torchmetrics.audio import SI_SDR
from torchmetrics.functional import si_sdr
from torchmetrics.utilities.imports import _TORCH_GREATER_EQUAL_1_6

seed_all(42)

Time = 100

Input = namedtuple('Input', ["preds", "target"])

inputs = Input(
    preds=torch.rand(NUM_BATCHES, BATCH_SIZE, 1, Time),
    target=torch.rand(NUM_BATCHES, BATCH_SIZE, 1, Time),
)

speechmetrics_sisdr = speechmetrics.load('sisdr')


def speechmetrics_si_sdr(preds: Tensor, target: Tensor, zero_mean: bool):
    # shape: preds [BATCH_SIZE, 1, Time] , target [BATCH_SIZE, 1, Time]
    # or shape: preds [NUM_BATCHES*BATCH_SIZE, 1, Time] , target [NUM_BATCHES*BATCH_SIZE, 1, Time]
    if zero_mean:
        preds = preds - preds.mean(dim=2, keepdim=True)
        target = target - target.mean(dim=2, keepdim=True)
    target = target.detach().cpu().numpy()
    preds = preds.detach().cpu().numpy()
    mss = []
    for i in range(preds.shape[0]):
        ms = []
        for j in range(preds.shape[1]):
            metric = speechmetrics_sisdr(preds[i, j], target[i, j], rate=16000)
            ms.append(metric['sisdr'][0])
        mss.append(ms)
    return torch.tensor(mss)


def average_metric(preds, target, metric_func):
    # shape: preds [BATCH_SIZE, 1, Time] , target [BATCH_SIZE, 1, Time]
    # or shape: preds [NUM_BATCHES*BATCH_SIZE, 1, Time] , target [NUM_BATCHES*BATCH_SIZE, 1, Time]
    return metric_func(preds, target).mean()


speechmetrics_si_sdr_zero_mean = partial(speechmetrics_si_sdr, zero_mean=True)
speechmetrics_si_sdr_no_zero_mean = partial(speechmetrics_si_sdr, zero_mean=False)


@pytest.mark.parametrize(
    "preds, target, sk_metric, zero_mean",
    [
        (inputs.preds, inputs.target, speechmetrics_si_sdr_zero_mean, True),
        (inputs.preds, inputs.target, speechmetrics_si_sdr_no_zero_mean, False),
    ],
)
class TestSISDR(MetricTester):
    atol = 1e-2

    @pytest.mark.parametrize("ddp", [True, False])
    @pytest.mark.parametrize("dist_sync_on_step", [True, False])
    def test_si_sdr(self, preds, target, sk_metric, zero_mean, ddp, dist_sync_on_step):
        self.run_class_metric_test(
            ddp,
            preds,
            target,
            SI_SDR,
            sk_metric=partial(average_metric, metric_func=sk_metric),
            dist_sync_on_step=dist_sync_on_step,
            metric_args=dict(zero_mean=zero_mean),
        )

    def test_si_sdr_functional(self, preds, target, sk_metric, zero_mean):
        self.run_functional_metric_test(
            preds,
            target,
            si_sdr,
            sk_metric,
            metric_args=dict(zero_mean=zero_mean),
        )

    def test_si_sdr_differentiability(self, preds, target, sk_metric, zero_mean):
        self.run_differentiability_test(
            preds=preds,
            target=target,
            metric_module=SI_SDR,
            metric_functional=si_sdr,
            metric_args={'zero_mean': zero_mean}
        )

    @pytest.mark.skipif(
        not _TORCH_GREATER_EQUAL_1_6, reason='half support of core operations not available before pytorch v1.6'
    )
    def test_si_sdr_half_cpu(self, preds, target, sk_metric, zero_mean):
        pytest.xfail("SI-SDR metric does not support cpu + half precision")

    @pytest.mark.skipif(not torch.cuda.is_available(), reason='test requires cuda')
    def test_si_sdr_half_gpu(self, preds, target, sk_metric, zero_mean):
        self.run_precision_test_gpu(
            preds=preds,
            target=target,
            metric_module=SI_SDR,
            metric_functional=si_sdr,
            metric_args={'zero_mean': zero_mean}
        )


def test_error_on_different_shape(metric_class=SI_SDR):
    metric = metric_class()
    with pytest.raises(RuntimeError, match='Predictions and targets are expected to have the same shape'):
        metric(torch.randn(100, ), torch.randn(50, ))
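
To make explicit what the speechmetrics comparison above is checking, the sketch below computes SI-SDR from its commonly cited definition (Le Roux et al., 2019): rescale the target by alpha = <preds, target> / ||target||^2 and report the ratio of scaled-target energy to residual energy in dB. This is an illustration under those assumptions, not the committed torchmetrics implementation; the eps term and the helper name si_sdr_reference are choices made here for the example.

    # Illustrative SI-SDR from the standard definition; not the library's exact code.
    import torch


    def si_sdr_reference(preds: torch.Tensor, target: torch.Tensor, zero_mean: bool = False) -> torch.Tensor:
        """Scale-invariant SDR over the last (time) dimension, one value per leading index."""
        if zero_mean:
            preds = preds - preds.mean(dim=-1, keepdim=True)
            target = target - target.mean(dim=-1, keepdim=True)
        eps = torch.finfo(preds.dtype).eps  # numerical safety only
        # Optimal scaling of the target onto the prediction.
        alpha = (preds * target).sum(dim=-1, keepdim=True) / (target.pow(2).sum(dim=-1, keepdim=True) + eps)
        scaled_target = alpha * target
        noise = preds - scaled_target
        return 10 * torch.log10(scaled_target.pow(2).sum(dim=-1) / (noise.pow(2).sum(dim=-1) + eps))


    # e.g. si_sdr_reference(torch.randn(4, 1, 100), torch.randn(4, 1, 100), zero_mean=True) -> shape [4, 1]

The test's atol of 1e-2 reflects that the speechmetrics reference runs in numpy at a fixed rate of 16000 Hz, so results agree only approximately with the torch implementation.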