Note: For a regular MATLAB implementation of the WCE, please see the folder of this repository. This root
folder contains a standalone MATLAB Runtime implementation of that algorithm,
to be used in conjunction with the ACLEW DiViMe virtual machine ( 


Word count estimator (WCE) version 0.11 for DiViMe virtual machine (

By Okko Räsänen & Shreyas Seshadri, ([email protected], [email protected])

See configs/config_default.txt for configuration options.

NOTE: This is a very preliminary release that has not been extensively tested for stability. Use at your own risk.

If you use this code or its derivatives in a publication or other software, remember to cite the following document:

Rasanen, O., Seshadri, S., Karadayi, J., Riebling, E., Bunce, J., Cristia, A., Metze, F., Casillas, M., Rosemberg, C., Bergelson, E.,
& Soderstrom, M. (submitted): Automatic word count estimation from daylong child-centered recordings in various language environments
using language-independent syllabification of speech

How to operate WCE on VM

To prepare ACLEW-format data for training and cross-validation, place your .wav files into the data/ folder of the VM
(e.g., data/wavs/), and the daylong annotation .eaf files into another folder (e.g., data/eafs/). Then

    1) Call

    /utils/ to carry out SAD on the data, and to derive the SAD-segment specific word counts.

    and then either


    /launcher/ to carry out leave-one-subject-out cross-validation on the provided data (depending on the
        dataset size, this might take some time)

    or

    /launcher/ to first adapt the WCE module to all provided and prepared data

    and then

    /launcher/ <filenames.txt> to apply the adapted model to get word counts on new signals, where
        <filenames.txt> is an ASCII .txt file with one signal path per row.
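
A file list of this kind can be generated rather than typed by hand. The following is a minimal Python sketch, assuming the recordings sit under data/wavs/ as in the example above. The function name `write_file_list` and the output name filenames.txt are illustrative and not part of WCE, and whether the launcher expects absolute or relative paths is not stated here, so adjust the directory argument accordingly.

```python
from pathlib import Path


def write_file_list(wav_dir, list_path):
    """Write one .wav path per row to list_path and return the paths written."""
    # Sort so the row order is reproducible between runs.
    wavs = sorted(str(p) for p in Path(wav_dir).rglob("*.wav"))
    # ASCII text, one signal path per row. Encoding as ASCII raises
    # UnicodeEncodeError if a path contains non-ASCII characters.
    Path(list_path).write_text("".join(w + "\n" for w in wavs), encoding="ascii")
    return wavs


if __name__ == "__main__":
    # Hypothetical locations; adjust to where the data live on your VM.
    paths = write_file_list("data/wavs", "filenames.txt")
    print(f"Wrote {len(paths)} paths to filenames.txt")
```

The paths are written in the form in which the directory is given, so passing an absolute directory yields absolute paths.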

You can also call the WCE training and testing functions directly from the ~/repos/WCE_VM folder. To train a model:

./ /usr/local/MATLAB/MATLAB_Runtime/v93/ <inputs.txt> <inputcounts.txt> <mymodelfile.mat> <configfile.txt>


    inputs.txt = a .txt or .csv file containing training signal .wav paths to be processed
                (one .wav per line)

    inputcounts.txt = a .txt or .csv file containing word count in each of the training .wavs
                (one per line)

    mymodelfile.mat = where to store the WCE model resulting from the training (a .mat file)

    configfile.txt = an ASCII file (e.g., .txt) containing parameter settings for the WCE, see
            configs/config_default.txt for examples
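
Because the signal list and the word count list are matched row by row, it can help to verify them before training. The sketch below is an illustrative Python check, not part of WCE. It assumes each row of the counts file holds a single number and that blank rows can be ignored; the function name `check_training_lists` is hypothetical.

```python
def check_training_lists(inputs_path, counts_path):
    """Return (wav_path, word_count) pairs, or raise ValueError on a mismatch."""
    with open(inputs_path) as f:
        wavs = [line.strip() for line in f if line.strip()]
    with open(counts_path) as f:
        counts = [line.strip() for line in f if line.strip()]
    # The two files must describe the same signals, one per row.
    if len(wavs) != len(counts):
        raise ValueError(f"{len(wavs)} signals but {len(counts)} word counts")
    pairs = []
    for wav, count in zip(wavs, counts):
        # float() raises ValueError if a row is not a plain number.
        pairs.append((wav, float(count)))
    return pairs


if __name__ == "__main__":
    # File names follow the placeholders used in the training call above.
    for wav, count in check_training_lists("inputs.txt", "inputcounts.txt"):
        print(wav, count)
```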


To apply a trained model:

./ /usr/local/MATLAB/MATLAB_Runtime/v93/ <inputs.txt> <mymodelfile.mat> <output.csv>

where inputs.txt and mymodelfile.mat are as in training, and output.csv is the location where estimated word counts are stored.


DEMO scripts:

./ /usr/local/MATLAB/MATLAB_Runtime/v93/ demofiles.txt democounts.txt models/mymodel.mat configs/config_default.txt

./ /usr/local/MATLAB/MATLAB_Runtime/v93/ demofiles.txt models/mymodel.mat outputs/output.csv

Other notes:

1) The current software uses (and includes) the Voicebox toolbox for MATLAB by Mike Brookes,
as distributed under the GNU General Public License.

No modifications to the original voicebox have been made.

MCR installation for standalone use outside DiViMe

(note that the correct MCR is already included on the DiViMe)

This is a stand-alone MATLAB binary that requires MATLAB Runtime Environment version v9.3.

Step 1: download

Step 2: unzip and run
	sudo ./install -mode silent -agreeToLicense yes
in the unpacked MCR folder.
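
After installation, a quick check that the runtime folder exists can save a failed run. This is a minimal Python sketch that assumes the runtime ends up at the path passed to the commands in this README (/usr/local/MATLAB/MATLAB_Runtime/v93/); the function name `mcr_available` is illustrative, and a custom install location would need a different argument.

```python
import os

# Runtime location as used in the training and testing calls of this README.
MCR_ROOT = "/usr/local/MATLAB/MATLAB_Runtime/v93/"


def mcr_available(mcr_root=MCR_ROOT):
    """Return True if the MATLAB Runtime folder exists at mcr_root."""
    return os.path.isdir(mcr_root)


if __name__ == "__main__":
    if mcr_available():
        print("MATLAB Runtime folder found:", MCR_ROOT)
    else:
        print("MATLAB Runtime folder not found:", MCR_ROOT)
```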

All source code (mostly MATLAB and Python) is included in the WCE/ folder and can be re-compiled for new platforms if needed.

