Noise suppression with DSP+DNN, WebNN and Web Audio API feature gaps #100
Labels: Developer's Perspective, Discussion topic, User's Perspective
The talk "RNNoise, Neural Speech Enhancement, and the Browser" by @jmvalin -- which, by the way, has superb audio quality in its recording :) -- explains that the complexity of RNNoise (for a 48 kHz mono input signal) is around 40 megaflops, with the following top 3:
@jmvalin concludes:
The WebNN API recently added the Gated Recurrent Unit (GRU) and corresponding operators (webmachinelearning/webnn#83). This fills an operator gap and enables hardware acceleration of models that use GRUs, such as RNNoise.
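For readers less familiar with what a GRU operator has to compute, here is a minimal userland sketch of a single GRU time step in plain TypeScript. It is not the WebNN API and not RNNoise's code: the names `gruStep` and `GruWeights` are hypothetical, the activations (sigmoid and tanh) are the textbook defaults, and the gate convention used here (`h' = (1 - z) * n + z * h`, reset gate applied before the recurrent matrix) is one of several in common use, so real models may differ.

```typescript
type Vec = number[];
type Mat = number[][]; // row-major: m[row][col]

const sigmoid = (v: number): number => 1 / (1 + Math.exp(-v));

// Dense matrix-vector product y = m * v.
function matVec(m: Mat, v: Vec): Vec {
  return m.map((row) => row.reduce((acc, w, j) => acc + w * v[j], 0));
}

// Hypothetical container for the nine parameter tensors of one GRU layer.
// w* act on the input x, u* act on the previous hidden state h.
interface GruWeights {
  wz: Mat; uz: Mat; bz: Vec; // update gate
  wr: Mat; ur: Mat; br: Vec; // reset gate
  wn: Mat; un: Mat; bn: Vec; // candidate state
}

// One GRU time step: returns the new hidden state.
function gruStep(x: Vec, h: Vec, p: GruWeights): Vec {
  const wzx = matVec(p.wz, x);
  const uzh = matVec(p.uz, h);
  const wrx = matVec(p.wr, x);
  const urh = matVec(p.ur, h);

  // Update and reset gates.
  const z = h.map((_, i) => sigmoid(wzx[i] + uzh[i] + p.bz[i]));
  const r = h.map((_, i) => sigmoid(wrx[i] + urh[i] + p.br[i]));

  // Candidate state, computed from the reset-scaled previous state.
  const rh = h.map((hi, i) => r[i] * hi);
  const wnx = matVec(p.wn, x);
  const unrh = matVec(p.un, rh);
  const n = h.map((_, i) => Math.tanh(wnx[i] + unrh[i] + p.bn[i]));

  // Blend the candidate and the previous state (assumed convention).
  return h.map((hi, i) => (1 - z[i]) * n[i] + z[i] * hi);
}
```

Each step is dominated by six matrix-vector products, which is the part a hardware-accelerated GRU operator would take over from JavaScript.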
In earlier related discussions @jmvalin noted:
The WebNN API also recently added general matrix multiplication (gemm), a Level 3 operation of the Basic Linear Algebra Subprograms (BLAS).
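As a reference point for what gemm computes, here is a naive userland sketch in TypeScript of the BLAS Level 3 operation `alpha * A * B + beta * C` on row-major `Float32Array` buffers. The function name `gemm` and its parameter order are assumptions made for this illustration. It is not the WebNN signature, and it omits the transpose options that BLAS gemm also has.

```typescript
// Naive reference gemm: returns alpha * (A x B) + beta * C.
// A is m x k, B is k x n, C is m x n, all stored row-major.
function gemm(
  a: Float32Array,
  b: Float32Array,
  c: Float32Array,
  m: number,
  k: number,
  n: number,
  alpha = 1,
  beta = 1,
): Float32Array {
  if (a.length !== m * k || b.length !== k * n || c.length !== m * n) {
    throw new Error("buffer sizes do not match the given dimensions");
  }
  const out = new Float32Array(m * n);
  for (let i = 0; i < m; i++) {
    for (let j = 0; j < n; j++) {
      let acc = 0;
      for (let p = 0; p < k; p++) {
        acc += a[i * k + p] * b[p * n + j];
      }
      out[i * n + j] = alpha * acc + beta * c[i * n + j];
    }
  }
  return out;
}
```

The triple loop performs roughly `2 * m * k * n` floating-point operations, which is why offloading it to optimized native or hardware kernels matters for larger models.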
A couple of questions or discussion points in the context of the workshop:
What areas of the web platform need further focus to ensure that future noise suppression models (DSP/DNN hybrids, or pure DNNs that may be 100-1000x bigger) also keep performing well?
What is the state of real-time (or near real-time) analysis of waveforms in pure userland JavaScript with libraries such as DSP.js in comparison with the Web Audio API primitives (e.g. AnalyserNode)? What are the gaps and issues that need to be filled in with a JS library or hand-rolled code?
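To make the userland side of that comparison concrete, here is a rough sketch in TypeScript of the kind of hand-rolled analysis code the question refers to: a magnitude spectrum for a single frame of samples. It makes several assumptions. The function names `magnitudeSpectrum` and `peakBin` are hypothetical, the Hann window is an arbitrary choice, and the naive O(N²) DFT is used only for readability (a real library would use an FFT). AnalyserNode applies its own windowing and time smoothing, so the numbers will not match its output.

```typescript
// Magnitude spectrum of one frame using a Hann window and a naive DFT.
// Returns frame.length / 2 bins; bin k corresponds to k * sampleRate / frame.length Hz.
function magnitudeSpectrum(frame: Float32Array): Float32Array {
  const n = frame.length;
  const bins = Math.floor(n / 2);
  const mags = new Float32Array(bins);
  for (let k = 0; k < bins; k++) {
    let re = 0;
    let im = 0;
    for (let t = 0; t < n; t++) {
      // Periodic Hann window to reduce spectral leakage.
      const w = 0.5 - 0.5 * Math.cos((2 * Math.PI * t) / n);
      const phase = (2 * Math.PI * k * t) / n;
      re += w * frame[t] * Math.cos(phase);
      im -= w * frame[t] * Math.sin(phase);
    }
    mags[k] = Math.hypot(re, im) / n;
  }
  return mags;
}

// Index of the strongest bin in a spectrum.
function peakBin(mags: Float32Array): number {
  let best = 0;
  for (let k = 1; k < mags.length; k++) {
    if (mags[k] > mags[best]) best = k;
  }
  return best;
}
```

In a page, the frames would typically come from an AudioWorklet or from `AnalyserNode.getFloatTimeDomainData()`. Running code like this per frame on the main thread is exactly where latency and scheduling gaps tend to appear.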
I suspect @teropa might have perspectives and input to this discussion, so looping him in.
@padenot for the Web Audio API expertise.
@huningxin for feedback on noise suppression hardware perspectives.