
Faster module for resampling time series. #911

Closed
wants to merge 3 commits

Conversation


@small-yellow-duck commented Sep 15, 2020

The proposed Resampler routine is more efficient than the existing Resample
module for resampling time series signals. Speed improvements are obtained by splitting
the signal into blocks where there are 'input_sr' input samples and 'output_sr' output
samples. Each block is treated with a convolution mapping 'input_sr' input channels to
'output_sr' output channels per time step. The existing Resample module uses a for loop
to iterate over the first index where each filter is applied, but this implementation is
fully convolutional.

The module is based on

https://github.com/danpovey/filtering/blob/master/lilfilter/resampler.py

with improvements to include additional filter types and input parameters that align
with the librosa api.
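
In outline, the block trick looks like this (a minimal sketch with illustrative names; the actual weight construction, windowing and padding in the PR are omitted):

```python
import torch

def block_resample(data, input_sr, output_sr, weights):
    # data: (batch, seq_len), with seq_len a multiple of input_sr.
    # weights: (output_sr, input_sr, kernel_width) filter bank; each of
    # the output_sr filters produces one output phase per block.
    # kernel_width is assumed odd so that padding preserves num_blocks.
    batch, seq_len = data.shape
    num_blocks = seq_len // input_sr
    # Split the signal into blocks of input_sr samples and move the
    # within-block axis to the channel dimension expected by conv1d.
    blocks = data.view(batch, num_blocks, input_sr).transpose(1, 2)
    # A single convolution over the block axis maps input_sr input
    # channels to output_sr output channels per time step, replacing
    # the per-filter Python loop of the existing Resample module.
    out = torch.nn.functional.conv1d(
        blocks, weights, padding=weights.shape[-1] // 2)
    # Interleave the output phases back into a flat signal:
    # (batch, output_sr, num_blocks) -> (batch, num_blocks * output_sr).
    return out.transpose(1, 2).reshape(batch, -1)
```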

#908

  • inputs where the sequence length is not a whole number multiple of input_sr should be padded with zeros
  • adapt module to deal with multi-channel input
  • adapt to deal with non-integer sample rates?

Should tests try to reproduce signals resampled with scipy.signal.resample?

@@ -9,6 +9,9 @@
from torchaudio import functional as F
from torchaudio.compliance import kaldi

import numpy as np
from scipy import special
@mthrok (Collaborator)

Hi @small-yellow-duck

Thanks for opening a PR. I will try this out and think about how to test it. I have not looked into the details yet, but numpy and scipy are not mandatory dependencies of torchaudio.

We have a mechanism to defer loading optional dependencies, but it would be nice if we could implement this without them. A quick googling suggests that torch has a heaviside function, so removing numpy appears plausible. We need to think about what to do for the scipy dependency.

@small-yellow-duck (Author) commented Sep 18, 2020

scipy.special.i0 is used to compute the kaiser window for kaiser, kaiser_best and kaiser_fast. If the scipy dependency is not desirable, then one option would be to allow only the hann window mode. (The original goal of including kaiser windows was to try to duplicate the behaviour of librosa, but I don't think the results of librosa and this proposed routine are identical to numerical precision, so there is likely some other detail about how the librosa filters were generated that I haven't managed to duplicate.) If the routine only needs to support hann windows, then I agree that it should be possible to remove scipy as a dependency.

I also see that there is an issue suggesting that pytorch add the modified bessel functions, although there doesn't seem to be a PR to merge the code into the pytorch codebase: pytorch/pytorch#7815 . If the code to calculate the equivalent of scipy.special.i0 is added to pytorch, then it should also be possible to include kaiser windows.

edit: the proposed resample routine also uses np.sinc, which might need a definition in pytorch as sinc(x) = sin(pi*x)/(pi*x), and dangerous things happen when x -> 0. One other possibility that would be OK for this resample routine (where in theory nobody is taking any derivatives...) is to say sinc(x) = sin(pi*x)/(pi*(x + epsilon)). Any thoughts? Scipy/numpy docs: https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.sinc.html
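
For reference, a minimal sketch of a sinc that handles x -> 0 by masking the denominator instead of shifting it by epsilon (not the PR's code):

```python
import math
import torch

def sinc(x):
    # Normalized sinc, sin(pi*x) / (pi*x), with the 0/0 at x = 0
    # avoided by substituting a safe denominator under a mask.
    px = math.pi * x
    near_zero = px.abs() < 1e-8
    safe = torch.where(near_zero, torch.ones_like(px), px)
    return torch.where(near_zero, torch.ones_like(px), torch.sin(safe) / safe)
```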

@small-yellow-duck (Author)

I realized that sinc can be written in terms of the gamma function - and pytorch has a routine to compute the log of the gamma function (torch.lgamma)! So I think it should be possible to excise numpy as a dependency as well.... (see output 10 in https://mathworld.wolfram.com/SincFunction.html)

- removed numpy and scipy as dependencies
- the sinc function uses torch.exp(-torch.lgamma(1-x) - torch.lgamma(1+x)), but a better solution is to have a pytorch definition based on a Taylor series expansion
- in 'general' mode the .forward function no longer crops out the portion of the signal longer than (seq_len//input_sr) * input_sr
small-yellow-duck commented Sep 22, 2020

Any comments on the following changes? @mthrok

  • Removing numpy and scipy as dependencies means that the resampler will no longer support kaiser windows. Kaiser windows can be supported once there is a pytorch definition for scipy.special.i0.

  • In order to remove the numpy dependency, the sinc function is now defined as torch.exp(-torch.lgamma(1-x) - torch.lgamma(1+x)) (see the sketch after this list). This is not a good solution because the exp introduces numerical errors. A better solution would be a pytorch definition based on a Taylor series expansion of sinc.

  • The general mode in forward no longer crops out the portion of the signal that is longer than (seq_len//input_sr) * input_sr. (YAY)
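
A sketch of the lgamma-based definition, which follows from the reflection formula Gamma(1+x)*Gamma(1-x) = pi*x / sin(pi*x). Note that torch.lgamma returns log|Gamma|, so this form recovers |sinc(x)| and loses the sign for |x| > 1:

```python
import math
import torch

def sinc_lgamma(x):
    # 1 / (Gamma(1+x) * Gamma(1-x)) == sin(pi*x) / (pi*x) for |x| < 1.
    return torch.exp(-torch.lgamma(1 - x) - torch.lgamma(1 + x))

x = torch.tensor([0.0, 0.5])
print(sinc_lgamma(x))                              # tensor([1.0000, 0.6366])
print(math.sin(math.pi * 0.5) / (math.pi * 0.5))   # 0.6366...
```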

I am still thinking about how to support non-integer sample rates.
Further modifications to support multi-channel signals are also desirable.

@mthrok (Collaborator) left a comment

Hi @small-yellow-duck

Sorry for the late reply. I took a look at your PR and it was surprisingly straightforward to follow. I have not got to the point of thinking about the testing phase, but I made a couple of comments.

Specifically, torch seems to already have kaiser_window and hann_window. Did you look at them? sinc is also under development (pytorch/pytorch#44713), so it might become available rather soon.

On the interface design aspect, since this module has four distinctive operation modes (trivial, integer_downsample, integer_upsample and general), it might be better to split the resampling class into four different classes and introduce a factory function. cc @cpuhrsch what do you think?
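
Something along these lines, perhaps (the class names are hypothetical, just to illustrate the factory idea):

```python
import math

# Hypothetical classes, named only to illustrate the dispatch.
class ResampleTrivial: ...
class ResampleIntegerDownsample:
    def __init__(self, input_sr, **kwargs): ...
class ResampleIntegerUpsample:
    def __init__(self, output_sr, **kwargs): ...
class ResampleGeneral:
    def __init__(self, input_sr, output_sr, **kwargs): ...

def make_resampler(input_sr, output_sr, **kwargs):
    # Reduce the ratio, then dispatch to one of the four operation modes.
    d = math.gcd(input_sr, output_sr)
    input_sr, output_sr = input_sr // d, output_sr // d
    if input_sr == output_sr:
        return ResampleTrivial()
    if output_sr == 1:
        return ResampleIntegerDownsample(input_sr, **kwargs)
    if input_sr == 1:
        return ResampleIntegerUpsample(output_sr, **kwargs)
    return ResampleGeneral(input_sr, output_sr, **kwargs)
```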

torch.arange(input_sr, dtype=dtype).reshape((1, input_sr, 1)) / input_sr -
(torch.arange(kernel_width, dtype=dtype).reshape((1, 1, kernel_width)) - blocks_per_side))

def hann_window(a):
@small-yellow-duck (Author) commented Sep 24, 2020

torch.hann_window doesn't have quite the same functionality as the hann_window function in the resample routine: the pytorch hann_window computes the filter for a fixed window size (the only input is the window size) without the ability to scale the period of the filter (i.e. to change the argument of cos(2*pi*n/N) to cos(2*pi*n*a/N)). Compare this to the behaviour of the hann_window function in the resampler, which takes the 3d times array and computes the value of the filter for each element (cos(2*pi*a)) using the array element value a[i, j, k] instead of just the array index.
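
To make the contrast concrete, a small sketch (illustrative only):

```python
import math
import torch

# torch.hann_window only evaluates 0.5 - 0.5*cos(2*pi*n/(N-1)) at the
# integer indices n = 0 .. N-1 of a fixed-size window.
fixed = torch.hann_window(8, periodic=False)

# The resampler instead evaluates the window at arbitrary, already
# scaled time offsets a, zeroing everything outside |a| < 1.
def hann_at(a):
    return (a.abs() < 1) * (0.5 + 0.5 * torch.cos(math.pi * a))

times = torch.linspace(-1.5, 1.5, 7)
print(hann_at(times))
```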

return torch.heaviside(1 - torch.abs(a), 0.0) * (0.5 + 0.5 * torch.cos(a * pi))


def kaiser_window(a, beta):
@small-yellow-duck (Author)

Same issue as with hann_window above... but the good news is that pytorch now has the i0 function, so I can rewrite the resampler's kaiser_window routine with that. https://pytorch.org/docs/master/generated/torch.i0.html#torch.i0
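
A possible rewrite along those lines (a sketch, not tested against the PR's current definition):

```python
import torch

def kaiser_at(a, beta):
    # Kaiser window evaluated at arbitrary offsets a in (-1, 1), zero
    # outside; torch.i0 is the zeroth-order modified Bessel function.
    b = torch.as_tensor(beta, dtype=a.dtype)
    inside = a.abs() < 1
    w = torch.i0(b * torch.sqrt(torch.clamp(1 - a * a, min=0))) / torch.i0(b)
    return inside * w
```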

There must be 2 axes, interpreted as (minibatch_size, sequence_length)...
the minibatch_size may in practice be the number of channels.

TODO: make default input dim (minibatch_size, channels, seq_len)?
@mthrok (Collaborator)

The typical way we do this in torchaudio is to accept Tensors with a broad range of dimensions.
In this case, we can generalize it to at least two dimensions where the time axis is the last dimension; then inside the forward function we can reshape the input Tensor to 3D. If the input is 2D, add an extra axis at 0 or 1, and if it's more than 3D, reshape it to [batch, -1, seq_len] (and then back to [batch, ..., new_seq_len]).
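
A minimal sketch of that pattern, where resample_3d stands in for a hypothetical core that expects (batch, channels, time):

```python
import torch

def resample_any_dim(waveform, resample_3d):
    # Normalize any input to 3D, run the core, then restore the shape.
    shape = waveform.shape
    if waveform.dim() <= 2:
        x = waveform.reshape(-1, 1, shape[-1])        # add batch/channel axes
    else:
        x = waveform.reshape(shape[0], -1, shape[-1])  # fold extra dims
    out = resample_3d(x)
    return out.reshape(*shape[:-1], out.shape[-1])
```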

@small-yellow-duck (Author)

I also like the flexibility of having 2d or 3d inputs.

# num_blocks = 1

# data = data[:, 0:(num_blocks*self.input_sr)] # Truncate input
data = data[:, 0:(num_blocks * self.input_sr)].view(minibatch_size, num_blocks, self.input_sr)
@mthrok (Collaborator)

If I understand correctly, data is padded so that its length is a multiple of input_sr. Do you still need to truncate it?

@small-yellow-duck (Author)

Nope! Let me tidy up this piece of code a bit more...

self.resample_type = 'trivial'
return

def gcd(a, b):
@mthrok (Collaborator)

Kind of a surprise to me, but torch has gcd. https://pytorch.org/docs/master/generated/torch.gcd.html#torch-gcd

@small-yellow-duck (Author)

gcd is also in the standard python library. I'm fine with switching to the pytorch gcd, but is there some underlying reason why it would be compelling to do so?

return None


def sinc(x):
@mthrok (Collaborator) commented Sep 23, 2020

Looks like a sinc function is under development: pytorch/pytorch#44713

@small-yellow-duck (Author)

Oh good. I'll leave the definition of the sinc function as it is for now, but the new function should be used when it's available.

@small-yellow-duck (Author)

Your idea is to have a base class resample that does the weight generation, and have resample_trivial, resample_integer_upsample, resample_integer_downsample and resample_general inherit from it? I am pretty agnostic on whether to split the resampler into those four classes: I'm happy to do it if people think it makes the code easier to understand.

What are your thoughts on what the goal of tests should be? This issue has some discussion about trying to duplicate the behaviour of audio processing tools in scipy with pytorch.
#452

Another issue is that the resampler hann_window function needs to have a value for pi, but pytorch doesn't have a definition for pi. I've used a suggested workaround (pi = 2.0*torch.acos(torch.zeros(1)).item()), but there is also an open issue to make pytorch include pi: pytorch/pytorch#19172
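
For the record, the workaround next to the plain stdlib constant:

```python
import math
import torch

pi = 2.0 * torch.acos(torch.zeros(1)).item()  # the acos workaround
assert abs(pi - math.pi) < 1e-6               # agrees with math.pi
```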


small-yellow-duck commented Sep 25, 2020

The latest PR now has the ability to accept 2d or 3d inputs.

  • Is there a compelling reason to switch from using gcd to using torch.gcd?
  • Is there a better workaround for a definition of pi?
  • The existing implementation of Resample uses different names for input_sr and output_sr, while this implementation tries to align with the librosa resample routine. I think it's better to try to reproduce the librosa interface. Any comments?
  • Is it desirable to split the resample class into resample_trivial, resample_integer_upsample, resample_integer_downsample and resample_general, and have each class inherit the weight initializations from resample?
  • What should the goals of unit tests be? (With the existing definitions of sinc and pi, we can't expect numeric agreement with other packages.) Perhaps one option is to downsample a signal and then upsample it again, and vice versa? (See the sketch below.)
    @mthrok
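
For instance, a round-trip sketch along those lines. Resampler and its argument names are taken from this PR's draft interface, so treat them as assumptions; only loose agreement away from the edges is expected:

```python
import math
import torch

# Down-then-up round trip on a band-limited test tone.
# Resampler: the module proposed in this PR (interface assumed).
t = torch.arange(8000, dtype=torch.float32) / 8000.0
x = torch.sin(2 * math.pi * 40.0 * t)[None, :]   # (batch=1, seq_len)

down = Resampler(input_sr=2, output_sr=1)
up = Resampler(input_sr=1, output_sr=2)
y = up(down(x))

# Compare only the interior, away from filter edge effects.
err = (y[:, 200:-200] - x[:, 200:-200]).abs().max()
assert err < 1e-2
```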

I think there is also a minor issue with general mode padding the input with 0's so that the seq_len is an integer multiple of input_sr. When the kernel width is small, this is pretty safe to do, but if the kernel width is large, then the convolution will involve a lot of the padded zeros and the output of the convolution is not going to be scaled correctly (the output will be too small). It should be possible to see this by comparing a signal padded with zeros at the end (resample(torch.cat((data, zeros), dim=-1))[:, 0:new_seq_len]) to the same signal padded with zeros at the beginning (resample(torch.cat((zeros, data), dim=-1))[:, -new_seq_len:]). Maybe there is a smart way to construct an appropriately normalized kernel to take care of the last segment of the input, data[:, :, (seq_len//self.input_sr)*self.input_sr:], or the case where the seq_len is shorter than self.input_sr. (However, this kernel would have to be constructed during the .forward call, as its size depends on the data seq_len.)

Finally, I think it would be good for somebody to really scrutinize the definition I have for kaiser_window to make sure that it is correct. :-)

@small-yellow-duck (Author)

Any comments on the latest revision or the questions in the previous post? @mthrok


mthrok commented Oct 6, 2020

Hi @small-yellow-duck

Sorry for the late reply. I have not forgotten you and this PR, but I got caught up with some work related to the upcoming release. I will try my best to get back to you before the end of the week.


mthrok commented Oct 12, 2020

Hi @small-yellow-duck

Sorry for the delayed response. I am still working on release-related tasks; please give me some more time before I can give feedback.

@small-yellow-duck (Author)

No worries, thanks for keeping this PR in mind. :-)


mthrok commented Nov 3, 2020

@small-yellow-duck

Sorry for the late reply. torchaudio 0.7.0 is released and I was reviewing the project ideas. I posted an RFC #1000 regarding the faster I/O, which I have been working on, and realized that in that work we can provide a resampling algorithm faster than the current Python implementation. Therefore I would like to put this PR on hold. I am very sorry given the efforts you have made based on the interaction here.

@small-yellow-duck (Author)

I expect most people want to resample data as they load it, and I agree that faster resampling with C++ on the CPU is the functionality most worth focusing on.

@faroit (Contributor) commented Nov 4, 2020

Also, maybe it's worth comparing to https://github.com/adefossez/julius, which was just released.


mthrok commented Aug 3, 2021

Closing the issue as it is mostly addressed by #1487.
Please comment in #1487 if we are missing something.
Thank you for the contribution, @small-yellow-duck.

@mthrok closed this Aug 3, 2021