Python bindings

Featurization

The featurization module provides classes for standard feature extraction from audio data: Ceplifter, Dct, Derivatives, Dither, Mfcc, Mfsc, PowerSpectrum, PreEmphasis, TriFilterbank, Windowing.

Each class has an apply method that transforms the input data. For example:

# imports
from wav2letter.feature import FeatureParams, Mfcc
import itertools

# read the wave
# (wavinput must hold the audio samples as a flat sequence of floats)

# create params struct
params = FeatureParams()
params.sampling_freq = 16000
params.low_freq_filterbank = 0
params.high_freq_filterbank = 8000
params.num_filterbank_chans = 20
params.num_cepstral_coeffs = 13
params.use_energy = False
params.zero_mean_frame = False
params.use_power = False
# define transformation and apply to the wave
mfcc = Mfcc(params)
features = mfcc.apply(wavinput)
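
The example above relies on a wavinput variable holding the audio samples. As a minimal sketch of one way to obtain it, the hypothetical helper below (read_wav_as_floats is not part of wav2letter) reads a 16-bit mono PCM WAV file with the Python standard library and returns the samples as a flat list of floats. It assumes unscaled sample values; whether the samples should instead be normalized depends on how the features are used downstream.

```python
import struct
import wave


def read_wav_as_floats(path):
    """Read a 16-bit mono PCM WAV file into (samples, sample_rate).

    Hypothetical helper, not part of wav2letter. Samples are returned as
    unscaled floats (for example 1000.0, not 1000 / 32768).
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    # WAV stores samples as little-endian signed 16-bit integers.
    ints = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return [float(s) for s in ints], sample_rate
```

The returned list could then be passed as wavinput, and the returned sample rate checked against params.sampling_freq.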

ASG Loss

ASG loss is a PyTorch module (nn.Module) that supports CPU and CUDA backends. It can be defined as follows:

from wav2letter.criterion import ASGLoss
asg_loss = ASGLoss(ntokens, scale_mode).to(device)

where ntokens is the number of tokens predicted for each frame (the number of classes) and scale_mode selects how the loss is scaled. It can be one of:

NONE = 0, # no scaling
INPUT_SZ = 1, # scale to the input size
INPUT_SZ_SQRT = 2, # scale to the sqrt of the input size
TARGET_SZ = 3, # scale to the target size
TARGET_SZ_SQRT = 4, # scale to the sqrt of the target size
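
To make the scaling modes concrete, here is a small sketch of the factor each mode might apply to the loss. It assumes that "scale to the input size" means dividing the loss by the number of input frames (and likewise for the target length and the square-root variants); the exact formula is not stated here. ScaleMode and loss_scale are hypothetical names defined locally for illustration, not part of the bindings.

```python
import math
from enum import IntEnum


class ScaleMode(IntEnum):
    """Local stand-in mirroring the values listed above (hypothetical name)."""
    NONE = 0
    INPUT_SZ = 1
    INPUT_SZ_SQRT = 2
    TARGET_SZ = 3
    TARGET_SZ_SQRT = 4


def loss_scale(mode, input_size, target_size):
    """Return the assumed multiplicative factor applied to the loss."""
    if mode == ScaleMode.NONE:
        return 1.0
    uses_input = mode in (ScaleMode.INPUT_SZ, ScaleMode.INPUT_SZ_SQRT)
    size = input_size if uses_input else target_size
    if size <= 0:
        return 1.0  # avoid dividing by zero for empty inputs or targets
    if mode in (ScaleMode.INPUT_SZ, ScaleMode.TARGET_SZ):
        return 1.0 / size
    return 1.0 / math.sqrt(size)
```

Under this assumption, an utterance with 100 input frames and 25 target tokens would have its loss multiplied by 1/100 in INPUT_SZ mode and by 1/5 in TARGET_SZ_SQRT mode.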

Beam-search decoder

Example of how to define your own language model state:

from wav2letter.decoder import LMState


class LMStateNew(LMState):
    some_helpful_var = 1

    def __init__(self, some_helpful_var):
        super().__init__()
        self.some_helpful_var = some_helpful_var
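
The sketch below shows how such a state behaves once instantiated. So that it runs without the bindings installed, it uses a stand-in LMState base class (hypothetical, with no decoder logic); only the subclass mirrors the example above. The value assigned in __init__ is stored per instance and shadows the class-level default, so each state can carry its own data.

```python
class LMState:
    """Stand-in for the LMState base class (hypothetical, no decoder logic)."""

    def __init__(self):
        self.base_initialized = True


class LMStateNew(LMState):
    some_helpful_var = 1  # class-level default, shared by all instances

    def __init__(self, some_helpful_var):
        super().__init__()  # run the base-class initializer first
        self.some_helpful_var = some_helpful_var  # per-instance value


state_a = LMStateNew(5)
state_b = LMStateNew(7)
```

Calling super().__init__() matters when the base class is implemented in an extension module, since the base object is generally not set up until its own initializer has run.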
