Home
Koe is open-source web-based software for analysing animal vocalisations. Read the Methods in Ecology and Evolution paper here. Please cite this paper if you publish using Koe.
It features tools for visualising, segmenting, measuring, classifying, data-filtering and exporting acoustic units, and analysing sequence structure (syntax). It is an end-to-end acoustic database solution, especially suitable for animal species with distinct acoustic units.
You can use it on any device at koe.io.ac.nz, which makes it ideal for collaboration, education, and citizen science. There is also a standalone desktop version for download: https://github.com/fzyukio/koe-docker.
KNOWN ISSUES
- The Koe email system appears to have stopped working (perhaps due to high volumes exceeding usage limits). To get in touch, use Messenger at https://Facebook.com/KoeSoftware rather than the "contact us" button on the website.
- Some users are having trouble running the standalone version; stay tuned for a fix.
- If a user has all permissions removed on the last database they viewed, they can't get into Koe. This is because Koe automatically directs to the last database viewed, which the user is now 'locked out' of. To prevent this, make sure you navigate to another database before your permissions are removed. If this happens to you, ask the collaborator to restore your access so that you can navigate to another database, then have them remove permissions again.
- High/low/bandpass filtering when creating a dataset doesn't appear to be working. For now, please pre-filter your audio in other software prior to import, if required.
- Some extracted features may not perform as expected:
  - 'spectral_flatness' and 'entropy' should yield the same measurements, but do not. The direction of effects is correct (comparing sine waves to white noise, for example), but the discrepancy needs investigating.
  - Fundamental frequency measurements appear accurate up to about 1000 Hz, but seem to fail at higher values. We suggest using dominant_frequency instead.
OVERVIEW VIDEO
For a quick overview and demonstration of program features, see this video:
Click here to download NZ bellbird song recordings to follow along with the user manual
KOE NEEDS YOUR CODING SKILLS!
Koe was designed as an open-source gift to the bioacoustics community. We are currently 1 programmer and 1 biologist, who work on Koe for free in our (limited) spare time. This tool is now being used around the world (over 1000 databases) and there are lots of exciting ways that Koe could be developed for bioacoustics research—if only coders like you (or your coder friend) would come on board!
If you can code in Python and/or JavaScript, please help develop Koe! Together let’s make this open-source software really rock.
- Why use Koe?
- Overview of Koe workflow
- Navigating Koe
- Create an account
- Create a new database
- Add collaborators to a database and set permission levels
- Set database restore points
- Upload songs
- Upload raw recordings and divide into songs
- Recordings too big to upload? Use this tool to automatically split your wav files
- Segment songs into units, or...
- Alternatively, import unit segmentation from csv
- Extract unit features
- Construct ordination
- View interactive ordination and classify units
- Calculate similarity
- View interactive unit table and classify units
- Exemplars
- Filtering data
- View all songs
- Mine sequence structure
- Visualise sequence structure as a network
- Exporting your data
- Copy songs to another database
- Make database subsets (collections)
- Keyboard shortcuts
- Analysis Tutorial: Case Study of NZ bellbird song
- Frequently Asked Questions
Table of contents generated with markdown-toc
Acoustic communication is fundamental to the behaviour of many species. If we want to understand animal behaviour we need tools that allow us to objectively examine the details of vocalisations. What information are they sharing, and how is it encoded?
Often, acoustic communication is structured as a temporal sequence of distinct acoustic units (e.g. syllables), where information is encoded in the types of units and sometimes their temporal arrangement (syntax). In such cases, this is the flowchart for acoustic analysis:
Classification is a key step, because once you have a dataset of labelled units, you can analyse repertoire size and sequence structure and compare between individuals, sexes, sites, and seasons.
Manual classification by human eye and ear remains the primary and most reliable method for most species, but is hindered by a lack of tools, especially for large and diverse datasets.
That’s where Koe comes in. By facilitating large-scale, high-resolution classification and analysis of acoustic units, Koe opens up many possibilities for bioacoustics research.
We designed the Koe workflow to be intuitive and flexible. Here is a suggested workflow:
- Import raw recordings and divide into "songs" (vocalisation bouts), then segment songs into their constituent acoustic units.
- You can extract acoustic features from units. These features are used to calculate unit similarity and expedite classification in two ways: through interactive ordination, and similarity indices.
- Interactive ordination is a major time saver for classification. The user can encircle groups of points on the ordination to see spectrograms, hear playback, and bulk-label groups of units. Harnessing the interactive ordination in conjunction with human audio-visual perception, this technique offers a major advance in both speed and robustness over existing acoustic classification methods.
- Units can also be viewed as an interactive unit table. A key feature of the unit table is the similarity index, which lets you sort by acoustic similarity. Because similar-sounding units are grouped together, units can easily be selected in bulk and classified.
- A catalogue of class exemplars is generated automatically, displaying up to 10 randomly chosen exemplars for each class, which serves as a useful reference during classification.
- Koe gives you full control of your databases, allowing you to add collaborators and set permission levels. This makes it straightforward to conduct a classification validation experiment: grant judges labelling access to your database, and once they have independently classified the units, compile their labels to examine concordance.
- Once units have been classified, songs can be visualised as sequences of unit labels; you can filter by sequence to identify all instances of a specific song type, for example. You can also mine sequence structure in detail with association rule algorithms and network visualisations.
- Export data from any program view as csv.
In Koe, on the left of the screen is a sidebar with program controls. At the top of the sidebar are user account controls for logging out / switching user; beneath that are controls specific to the active view; and beneath that is the workflow navigation pane, with buttons to access different program views, labelled according to their function.
Go to koe.io.ac.nz and click on Register to register an account.
Choose a username, input your name and email address for purposes of Koe support correspondence, and choose a password. (We will never share your information with anyone).
Under Manage your database, click New database to create a new database.
You can add collaborators at any stage.
1. Click on Manage your database.
2. Select the round checkbox of the database you want to add collaborators to.
3. Click Add collaborator and type the Koe username or Koe account email address of your intended collaborator.
4. Once added, you can set their permission level by double-clicking the Permission cell next to their username.
From lowest to highest permission level: View, Annotate, Import data, Copy files, Add files, Modify segmentation, Delete files, Assign user. Each subsequent permission level extends the permissions of the previous level as per the table below:
Level | Permissions |
---|---|
View | User can view data |
Annotate | User can view and annotate data |
Import data | User can import, view and annotate data |
Copy files | User can copy files, import, view and annotate data |
Add files | User can add files, copy files, import, view and annotate data |
Modify segmentation | User can modify segmentation, add files, copy files, import, view and annotate data |
Delete files | User can delete files, modify segmentation, add files, copy files, import, view and annotate data |
Assign user | User can assign additional users, delete files, modify segmentation, add files, copy files, import, view and annotate data |
Changes are saved to the database as soon as they happen. If you want to be able to revert to a previous state you will want to save database restore points.
1. Click on Manage your database.
2. Select the round checkbox of the database you want to set a restore point for.
3. Click Quick save or Full save at the bottom of the window to create a restore point. Quick save saves label data associated with each unit. Full save saves both label data and segmentation start/end points.
Please note that restore points do not affect which songs are in the database or which units are segmented. For example, if you make a save and then delete songs, reverting will not restore the songs to the database. Similarly, if you add songs, reverting will not remove the added songs. In the same way, if you add/delete units, reverting will not remove/restore the segmentation. What reverting will do is restore the label data and (for a full save) the segmentation start/end points for currently existing units. If you want to preserve all aspects of a database at a certain point, like a traditional 'Save As', then copy everything to a new database, as described in Copy songs to another database below.
Click this link to download NZ bellbird songs to try out the Upload songs feature and subsequent steps.
If you have already-divided song files on your computer, you can upload these directly. On the navigation pane, click Upload songs and select up to 100 WAV files at once to upload.
For raw recordings that need to be divided into songs, see next step.
To upload a raw recording and divide it into song selections, do the following:

1. On the navigation pane, click Upload & split raw recording.
2. In the Upload raw recording dialogue box, click Upload (WAV only).
3. Select the recording you want to upload.
4. Click Open.

After selecting a file to upload, an Upload raw recording dialogue appears with Track name and Record date fields. The track name is automatically filled in based on the wav filename. For example, `Raw_Recording_001.wav` will by default appear as `Raw_Recording_001`.
1. Click in the Track name field to modify the track name, if you wish.
2. Click in the Record date field to modify the date. By default, the record date is set as the date the recording was uploaded to Koe.
3. Click Submit track info.
4. For recordings with more than one channel, select the desired channel.
5. Play the recording with the playback controls, looking for songs to segment. To play from the beginning, click the Play button.
6. You can also play from a specific timepoint by clicking that point, then clicking Play.
With the controls beneath the spectrogram, you can:

- Adjust spectrogram zoom.
- Adjust spectrogram colour map.
- Adjust playback speed.
- Adjust spectrogram contrast.
- To create a song selection, simply drag over the spectrogram to create a selection box. Click the selection box to play it. Adjust the selection box endpoints by holding `Shift` and dragging the box handles that appear. For each selection you make, a row appears in the table beneath.
- Fill in annotation columns as desired by double-clicking in those cells.
- You can set an automatic naming scheme for segmented songs. Click the Naming pattern box to create a naming scheme with any combination of the following components: `Track name`, `Year`, `Month`, `Day`, `Order`. You can also add custom components to the scheme, such as text strings, if you wish. Click Rename all to apply the naming scheme to all existing song selections.
- Once you've finished making song selections, click Save to upload the songs to the database. The raw recording will not be saved, only the songs – so make sure you finish partitioning all songs in one go.
Some bioacoustics researchers work with looong recordings (e.g. passive acoustic monitoring). Very large files can be cumbersome. But don't worry! Yukio Fukuzawa has heard your cries and has come galloping to our rescue with a simple tool for automatically splitting large wav files into smaller wav files. You specify how long you want the audio sections to be, and how many seconds of overlap between sections.
To run the tool you will need to install Python. After that it's just a simple line of code that you type into your command prompt.
The tool and instructions are found here: https://github.com/fzyukio/split-songs
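If you'd rather see the idea in plain code first, the same splitting logic can be sketched with Python's standard library alone. This is only an illustration (the split-songs tool above is the supported route); it assumes a PCM wav file, and the section length and overlap values are hypothetical:

```python
# Split a long PCM wav file into fixed-length, overlapping sections.
# Illustrative sketch using only the Python standard library.
import wave

def split_wav(path, section_s=600, overlap_s=10):
    with wave.open(path, "rb") as src:
        rate = src.getframerate()
        params = src.getparams()
        size = section_s * rate                 # frames per output section
        step = (section_s - overlap_s) * rate   # frames to advance each time
        start, index = 0, 0
        while start < src.getnframes():
            src.setpos(start)
            frames = src.readframes(size)
            out = f"{path.rsplit('.', 1)[0]}_{index:03d}.wav"
            with wave.open(out, "wb") as dst:
                dst.setparams(params)           # header is fixed up on close
                dst.writeframes(frames)
            start += step
            index += 1

split_wav("Raw_Recording_001.wav")  # 10-minute sections, 10 s overlap
```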
This section describes how to segment songs into acoustic units. However, if you have already segmented units using other software, jump to the next section.
On the navigation pane, click View all songs. You will see your list of songs; click on a song filename to open the song.
Much like the previous program view, you can play the song and adjust spectrogram appearance and playback speed. Drag over units on the spectrogram to segment. Mouse over a selection box to highlight that unit’s row in the table. Click a selection box for playback. Fill in annotation columns as desired. Then click Save segmentation to save to the database.
You can return to this view to alter segmentation at any time.
Note: a Koe user interested in comparing entire songs as units for analysis can still use the software by selecting songs rather than syllables as units and classifying them using interactive ordination / unit tables. In a future update, we hope to offer both unit- and song-level comparisons simultaneously, so that a user does not have to choose.
If you have already segmented units using other software, you can import the unit segmentation start/end point data into Koe, and avoid doing it over again in Koe. You're welcome!
1. Upload all the song wav files that you have start/end-point data for (see Upload songs).
2. Navigate to Classify and manage units > Unit table on the left-hand pane.
3. Click the Export current table as csv button.
4. Open the resulting csv file (in Excel, for example).
5. Leaving the column headers unchanged, input your desired filename and unit start/end time information into the table, with one row per unit. (Note that start/end times are measured in ms from the start of the file. E.g. a unit with a start column value of 680 and an end column value of 750 begins 680 ms and ends 750 ms from the start of the file. Koe will use this information to construct syllable boundaries.)
6. Excel may have reformatted values in the date column. Make sure the date format is yyyy-mm-dd.
7. Save the updated csv file.
8. In Koe, go to Classify and manage units > Unit table, and click Import data (csv) to table. Browse to select your csv file.
Your data should appear, and if everything worked properly you will see spectrograms of segmented units.
What to do if you get this error: "Error: Your CSV contains the exact same data as current on the table. The table will not be updated." This seems to be a quirk of importing to a completely blank unit table. Make one dummy unit in Koe so that the table is not completely blank, labelled something obvious like 'DUMMY'. After importing the csv, you can delete the dummy unit.
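Step 6 above (Excel mangling the date column) can also be fixed in one go with pandas. A minimal sketch, assuming your export is named `unit_table_export.csv` and the date column header is `date` (check your own export for the exact names):

```python
# Rewrite the date column in ISO yyyy-mm-dd format after editing in Excel.
import pandas as pd

df = pd.read_csv("unit_table_export.csv")
df["date"] = pd.to_datetime(df["date"]).dt.strftime("%Y-%m-%d")
df.to_csv("unit_table_export.csv", index=False)
```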
You can extract a wide array of acoustic features from units. These features are used to calculate unit similarity, which is used for constructing ordinations and calculating similarity indices. You can also export extracted features as a data matrix (`csv` file; see Exporting your data).
Go to Extract features and compare > Extract unit features, then:

1. Select the desired set of features (if in doubt, tick all of them to start with).
2. Select the desired set of aggregation methods (see below for explanation).
3. Type a name for the data matrix (e.g. "All features").
4. Click Submit.
The Koe server will process your request and send a notification email when your extraction job finishes. (Typically this will take only a minute or two). You can see what sets of features you have previously extracted under the Existing data matrices dropdown in this view.
Currently, all extracted features are equally weighted — the user cannot alter the weightings.
Features fall into three main categories: frequency domain features, perceptual features, and time domain features.
The time domain represents changes in the signal over time: more precisely, the changes in air pressure when the sound reaches a microphone, measured as a voltage. Features in this category are extracted directly from the oscillogram without any transformation; they therefore tend to have excellent time resolution and low computational cost.
Here are the features offered, with descriptions of each.
Also known as Wiener entropy. A measure of the noisiness of the spectrum, where higher values correspond to a more uniform distribution of power over all frequencies. White noise will have a high entropy, whereas a pure-tone note will have a low entropy.
It is computed as the ratio of the geometric mean to the arithmetic mean of the spectrum:

$$\text{flatness} = \frac{\left(\prod_{k=1}^{N} s_k\right)^{1/N}}{\frac{1}{N}\sum_{k=1}^{N} s_k}$$

where $s_k$ is the magnitude of the spectrum at bin $k$ and $N$ is the number of frequency bins.
See also https://www.mathworks.com/help/audio/ug/spectral-descriptors.html
Also known as Spectral Spread. It is usually defined as the magnitude-weighted average of the differences between the spectral components and the spectral centroid.
The centre of gravity of the spectrum, calculated as the weighted mean of the frequencies using their normalised magnitude as weights:

$$\mu = \sum_{k=1}^{N} f_k\, s_k$$

where $N$ is the total number of frequency bins, $s_k$ is the normalised magnitude of the spectrum at bin $k$, and $f_k$ is the centre frequency of bin $k$. The spectral centroid is the frequency where most of the energy is concentrated, which is usually where the dominant frequency is found.
Spectral contrast is defined as the decibel difference between peaks and valleys in the spectrum.
Each frame of a spectrogram is divided into sub-bands. For each sub-band, the energy contrast is estimated by comparing the mean energy in the top quantile (peak energy) to that of the bottom quantile (valley energy). High contrast values generally correspond to clear, narrow-band signals, while low contrast values correspond to broad-band noise.
Jiang, Dan-Ning, Lie Lu, Hong-Jiang Zhang, Jian-Hua Tao, and Lian-Hong Cai. "Music type classification by spectral contrast feature." In Multimedia and Expo, 2002. ICME'02. Proceedings. 2002 IEEE International Conference on, vol. 1, pp. 113-116. IEEE, 2002.
The roll-off frequency is defined for each frame as the center frequency for a spectrogram bin such that at least roll_percent (0.85 by default) of the energy of the spectrum in this frame is contained in this bin and the bins below. This can be used to, e.g., approximate the maximum (or minimum) frequency by setting roll_percent to a value close to 1 (or 0).
MFCC = Mel-frequency cepstral coefficient. See this wikipedia article for an explanation.
Defined as the number of zero crossings per second. An efficient way to approximate the dominant frequency.
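For readers who want to sanity-check descriptors like these outside Koe, several have close counterparts in the librosa Python library. This is a comparison sketch only; Koe's own implementation may use different framing and windowing parameters, so values will not match exactly:

```python
# Frame-wise spectral descriptors with librosa, for comparison with Koe.
import librosa

y, sr = librosa.load("Raw_Recording_001.wav", sr=None)  # keep native rate

centroid = librosa.feature.spectral_centroid(y=y, sr=sr)   # Hz, per frame
flatness = librosa.feature.spectral_flatness(y=y)          # 0..1, per frame
contrast = librosa.feature.spectral_contrast(y=y, sr=sr)   # dB, per sub-band
rolloff = librosa.feature.spectral_rolloff(y=y, sr=sr, roll_percent=0.85)
zcr = librosa.feature.zero_crossing_rate(y)                # per frame
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)         # 13 coefficients
```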
The total energy within the unit selection bounds. For a spectrogram, the energy is calculated as

$$E = \Delta f \sum_{t=t_1}^{t_2} \sum_{f=f_1}^{f_2} 10^{S(t,f)/10}$$

where $f_1$ and $f_2$ are the lower and upper frequency limits of the selection (in Koe these are fixed), $t_1$ and $t_2$ are the beginning and ending frame numbers of the selection, $P_{ref}$ is the power dB reference value, $S(t,f)$ is the spectrogram power spectral density in frame $t$ at frequency $f$ (in dB relative to $P_{ref}$), and $\Delta f$ is the frequency bin size (which is equal to the sample rate divided by the DFT size).
The aggregate entropy measures the disorder in a sound by analysing the energy distribution within a selection. Higher entropy values correspond to greater disorder in the sound, whereas a pure tone with energy in only one frequency bin would have zero entropy. By treating the fraction of energy in a selection present in a given frequency bin as a probability, Koe calculates the entropy using:

$$H = -\sum_{f=f_1}^{f_2} \frac{E_f}{E} \log_2\!\left(\frac{E_f}{E}\right)$$

where $H$ is the aggregate entropy in the selection, $f_1$ and $f_2$ are the lower and upper frequency bounds of the selection, $E_f$ is the energy in frequency bin $f$ over the full time span of the selection, and $E$ is the total energy summed over all frequency bins in the selection. The size of a frequency bin is determined by the spectrogram parameters.
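As a concrete illustration of the formula, here is a short sketch assuming a spectrogram array in linear power units with shape [frequency bins, frames]:

```python
# Aggregate entropy of a selection, following the formula above.
import numpy as np

def aggregate_entropy(S):
    """S: linear-power spectrogram of the selection [freq_bins, frames]."""
    e_f = S.sum(axis=1)        # energy per frequency bin over the selection
    p = e_f / e_f.sum()        # fraction of total energy per bin
    p = p[p > 0]               # drop empty bins (0 * log2(0) taken as 0)
    return float(-(p * np.log2(p)).sum())
```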
The average entropy in a selection is calculated by finding the entropy for each frame in the unit and then taking the average of these values. Unlike the aggregate entropy which uses the total energy in a frequency bin over the full time span, the average entropy calculates an entropy value for each slice in time and then averages. As a result, the average entropy measurement describes the amount of disorder for a typical spectrum within the selection, whereas the aggregate entropy corresponds to the overall disorder in the sound.
To calculate average power, the values of the spectrogram’s power spectral density are summed, and the result is then divided by the number of time-frequency bins in the selection. Units: dB.
The maximum power within the unit selection bounds. In a grayscale spectrogram, the maximum power in a selection is the power at the darkest point in the selection. Units: dB.
The frequency at which max_power occurs within the selection. (Returns one value per acoustic unit). Units: Hz.
Goodness of Pitch is an estimate of harmonic pitch periodicity. High goodness of pitch can be used as a detector of harmonic stack, modulated or not. Formally, it is defined as the peak of the derivative-cepstrum calculated for harmonic pitch. See http://soundanalysispro.com/manual-1/chapter-4-the-song-features-of-sap2/goodness-of-pitch
See http://soundanalysispro.com/manual-1/chapter-4-the-song-features-of-sap2/amplitude
This is Wiener entropy, the same quantity as spectral_flatness, but its calculation is less prone to precision loss.
See http://soundanalysispro.com/manual-1/chapter-4-the-song-features-of-sap2/frequency-amplitude-modulation
See http://soundanalysispro.com/manual-1/chapter-4-the-song-features-of-sap2/mean-frequency
See http://soundanalysispro.com/manual-1/chapter-4-the-song-features-of-sap2/spectral-continuity
The duration of the acoustic unit (ms).
The average power is calculated for each frame (time slice), then the entropy of these averages is calculated. (Returns a value per acoustic unit).
Sum of power of each frame, divided by the number of time-frequency bins in each frame. (Returns a value per frame).
The power of the darkest pixel (time-frequency bin) on the spectrogram, for each frame. (Returns a value per frame).
The same as max_frequency except calculated per frame. (Returns a value per frame).
See https://arxiv.org/pdf/1306.0103.pdf
A measure of the changes in the shape of the spectrum over time. Spectral flux (SF) is computed as the Euclidean distance between two consecutive spectral frames with normalised PSD (power spectral density).
See https://www.mathworks.com/help/audio/ug/spectral-descriptors.html
(Spectral Crest Factor) is the opposite of spectral_flatness (i.e. the opposite of Wiener entropy). It measures how peaky the spectrum is. A noisy spectrum has a low SCF compared to a clear tonal spectrum. It is calculated as the ratio of the peak to the mean value of the spectrum:

$$\text{SCF} = \frac{\max_k s_k}{\frac{1}{N}\sum_{k=1}^{N} s_k}$$
See https://www.mathworks.com/help/audio/ug/spectral-descriptors.html
The statistical skewness of the PSD (power spectral density).
The skewness is a measure of how much the shape of the spectrum below the centre of gravity differs from the shape above the mean frequency. For white noise, the skewness is zero.
See https://www.mathworks.com/help/audio/ug/spectral-descriptors.html
The statistical kurtosis of the PSD (power spectral density).
See https://www.mathworks.com/help/audio/ug/spectral-descriptors.html for equation.
"The spectral kurtosis measures the flatness, or non-Gaussianity, of the spectrum around its centroid. Conversely, it is used to indicate the peakiness of a spectrum. For example, as the white noise is increased on the speech signal, the kurtosis decreases, indicating a less peaky spectrum."
See Antoni, J. (2006) "The spectral kurtosis: a useful tool for characterising non-stationary signals"
See https://www.mathworks.com/help/audio/ug/spectral-descriptors.html for equation.
"Spectral decrease represents the amount of decrease of the spectrum, while emphasizing the slopes of the lower frequencies."
The ratio of harmonic power to total power. It is computed in the autocorrelation domain and defined as the maximum value of the normalised autocorrelation function (after ignoring the zero-lag peak) within a frame:

$$HR = \max_{m \ge 1} \frac{\sum_{n} x(n)\, x(n+m)}{\sum_{n} x(n)^2}$$

where $x(n)$ is the sound pressure signal in the time domain.
See the book Matlab Audio Analysis Library https://au.mathworks.com/matlabcentral/fileexchange/45831-matlab-audio-analysis-library?focused=3812669&tab=function
The lowest frequency of a harmonic series. It corresponds to the rate of oscillation of the sound source. The FF is the same as the dominant frequency in the case of pure-tone signals, which makes it relatively straightforward to determine. However, this is not the case with speech and many bird vocalisations, which are complex sounds (having harmonic structure). The fundamental frequency of a harmonic series can be determined either as the lowest harmonic or, more reliably (in the case of a missing fundamental), by finding the greatest common divisor of the harmonic frequencies.
Mel-frequency cepstrum. See this wikipedia article for an explanation.
The slope in mfcc between the current audio frame and the next frame.
MFCC = Mel-frequency cepstral coefficient. See this wikipedia article for an explanation.
The slope in mfcc between the current audio frame and two frames ahead.
MFCC = Mel-frequency cepstral coefficient. See this wikipedia article for an explanation.
Log Attack Time is the logarithm of the time from the beginning of the sound to the point where the envelope reaches its first maximum. It characterises the attack of a sound, which can be smooth or sudden.
Many of the available features are frame-based, with the result that units of different lengths are represented by unequal-length vectors of measurement values. It is therefore necessary to aggregate these vectors to arrive at the same dimensional representation for each unit, so that units can be compared. Thus, when extracting features, you must select at least one aggregation method. Aggregation methods will be applied to all the features being extracted. For example, if you extract 10 features and tick only the Median aggregation method, the output will be the median values for each of the 10 features for every unit in the analysis.
Aggregation methods offered in Koe:
This will average the measurements over each unit.
This will take the median measurements of each unit.
This will take the standard deviation of the measurements of each unit.
'Divcon' stands for 'Divide and Conquer'. In this case, the spectrogram of the acoustic unit is being subdivided into time bins. Divcon_3 divides the spectrogram into three equal-sized time bins, Divcon_5 divides the spectrogram into five equal-sized time bins, and so on. Thus, for example, Divcon_3_mean will take the means (of whatever features have been selected) within the first third, middle third, and final third of the spectrogram, resulting in three values per selected feature. This could be of use when there is a particular region of the unit that is of interest to measure (for example, if you are particularly interested in the final third of units, or the second fifth of units, or first seventh of units, etc.)
Takes the minimum measurement for each unit.
Takes the maximum measurement for each unit.
Takes the variance of the measurements of each unit.
Takes the beginning measurement for each unit (i.e. for frame-based measures, from the first frame of each unit).
Takes the end measurement for each unit (i.e. for frame-based measures, from the final frame of each unit).
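To make the idea concrete, here is a toy sketch (plain numpy, not Koe's internal code) of how one unit's per-frame feature vector collapses to fixed-length summaries under a few of these methods:

```python
# Aggregating one unit's frame-based feature values to fixed-length summaries.
import numpy as np

def divcon_mean(values, n_bins):
    """Mean within each of n_bins equal time bins (cf. Divcon_3_mean)."""
    return [float(chunk.mean()) for chunk in np.array_split(values, n_bins)]

frames = np.random.rand(47)     # e.g. per-frame spectral centroid of one unit

aggregated = {
    "mean": float(frames.mean()),
    "median": float(np.median(frames)),
    "std": float(frames.std()),
    "min": float(frames.min()),
    "max": float(frames.max()),
    "begin": float(frames[0]),
    "end": float(frames[-1]),
    "divcon_3_mean": divcon_mean(frames, 3),   # three values, one per third
}
```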
To export extracted feature matrices as `csv` files for external use, click Download.
If your dataset is very big, you will need a lot of free RAM for this to work. For example, with 21k units, you will need at least 8 GB of free RAM.
Koe can produce interactive two- or three-dimensional ordination plots, based on the unit features extracted in the previous step. Notably, two-dimensional plots can be used to classify units directly on the plot, which greatly expedites the classification process (see next section). To construct an ordination, go to Extract features and compare > Construct ordination and choose a data matrix from the Existing data matrices dropdown list (1).
Choose an ordination method (2): PCA (Principal Components Analysis), ICA (Independent Component Analysis) or t-SNE (t-distributed Stochastic Neighbor Embedding) with PCA preprocessing. Since t-SNE preserves local structure in the data, it is particularly effective for defining and discriminating between different clusters.
Note: t-SNE is controlled by two parameters: 'perplexity', which can be set to any value between 5 and 50, and the number of iterations. By default, perplexity is set to 10 and the number of iterations (n_iter) is set to 4000. There is no 'correct' value of perplexity -- you must experiment with different values to achieve the best results. If the perplexity value you choose is too high or too low, points on the plot will not cluster well by similarity and will form convoluted banding patterns. To change the value, type e.g. `perplexity=30` in the Extra parameters box. The default of 4000 iterations should be adequate in most cases, but you can change this by typing e.g. `n_iter=5000` in the Extra parameters box. To control both parameters, separate them with a comma, as in `perplexity=30, n_iter=5000`.
Choose (3) the desired number of dimensions: two or three. Then click Submit (4). The Koe server will process your request and send a notification email when your ordination construction job finishes. (Typically this will take only a few minutes).
For more tips on optimising your t-SNE ordination in Koe, see this FAQ
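If you want to reproduce a comparable embedding outside Koe (for example, on an exported feature matrix), the equivalent pipeline in scikit-learn looks roughly like this. The file name, PCA dimensionality and parameter values here are illustrative, and Koe's exact preprocessing may differ:

```python
# PCA preprocessing followed by t-SNE, mirroring the options described above.
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

features = pd.read_csv("all_features.csv")        # hypothetical Koe export
X = features.select_dtypes("number").to_numpy()   # numeric feature columns

X_pca = PCA(n_components=50).fit_transform(X)     # assumed PCA dimension
embedding = TSNE(
    n_components=2,        # 2D ordination
    perplexity=30,
    n_iter=4000,           # called max_iter in newer scikit-learn releases
).fit_transform(X_pca)
```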
Click Classify & manage units > Ordination. If you have constructed several ordinations, choose the one you want from the Current ordination dropdown list in the active view controls on the side bar. For 2D ordinations, a plot will be generated like the below.
The plot is controlled by a toolbar at the top left of the plot:
With the lasso tool (selected by default), encircle groups of points on the plot to see their spectrograms. Mousing over a point in a selection highlights the corresponding spectrogram in the left-hand panel. Click a point on the plot or its spectrogram for audio.
Classify selections in bulk by clicking Label beneath the spectrograms pane. You can label at up to three levels of granularity.
As you add more classes, each acquires a distinctive point type. Toggle class visibility by clicking the legend, and zoom the plot (with the zoom tool) for more refined detail.
You can also view selections as an interactive unit table by clicking View table. More information on the unit table below.
A second method for expediting classification is to use unit similarity indices. In the unit table (below) the similarity index is used to sort your units by spectral similarity, assisting rapid manual classification by grouping syllables in clusters that can be labelled in bulk. The index is produced as follows: from the raw feature measurements or from the ordination, Koe calculates pairwise Euclidean distance between each pair of units, then constructs a ladderized dendrogram using agglomerative hierarchical clustering (UPGMA) (Sokal, 1958). The order of the dendrogram leaf nodes becomes the similarity index. Sorting by the similarity index column orders the table so that similar units arrange together, allowing them to be selected and labelled in large batches.
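For the curious, the same kind of index can be sketched with scipy. This illustrates the approach described above, not Koe's actual code:

```python
# Pairwise distances -> UPGMA dendrogram -> leaf order as a similarity index.
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import pdist

X = np.random.rand(20, 2)                 # e.g. 2D ordination coordinates
dist = pdist(X, metric="euclidean")       # condensed pairwise distance matrix
tree = linkage(dist, method="average")    # UPGMA = average linkage
order = leaves_list(tree)                 # dendrogram leaf order (unit ids)
similarity_index = np.argsort(order)      # each unit's rank in that order
```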
You have two input options for calculating a similarity index: one is to use an extracted feature set, and the other is to use the coordinates of a constructed ordination. Go to Extract features and compare > Calculate similarity and select an option from Existing data matrices (for the former option) or Existing ordinations (for the latter). For optimal results we recommend using an existing t-SNE ordination as the input for the similarity index. Then click Submit. The Koe server will process your request and send a notification email when your similarity calculation job finishes. (Typically this will take only a few minutes).
Click on Classify & manage units > Unit table to see all your units as an interactive unit table.
Much like in Excel, you can click on a cell to select it, or move around the grid with the arrow keys. `Page Up` and `Page Down` will page through one screen-height at a time. To hear a syllable, simply click on its spectrogram, or press `spacebar` with the spectrogram cell selected. There is a speed slider at the top of the sidebar for slowing down playback; this helps our human ears hear details not apparent at full speed. Single-click on a cell to select cell contents (e.g. to copy to the clipboard), or double-click on a cell to alter cell content, if it is editable (editable cells contain a pencil icon).
Unit table columns (re-order columns simply by dragging their headers):
Column name | Description |
---|---|
ID | Unique syllable ID |
Start | Start time of the syllable (milliseconds from song start). |
End | End time of the syllable (milliseconds from song start). |
Duration | Duration of the syllable (in milliseconds) |
Individual | The ID of the vocalising individual |
Song | Name of the song file the syllable comes from |
Checkbox | This is the column you will use for selecting rows. |
Spectrogram | Shows a spectrogram of the raw signal. Click to hear audio. |
Label columns (Label, Subfamily, Family) | These are where you label your units. There are three independent columns for different levels of label granularity. Label is for the most fine-scale labelling, Subfamily is broader, and Family is the broadest. |
Sex | The sex of the vocalising individual |
Quality | The quality of the song. For example, you could use EX = Excellent, VG = Very Good, G = Good, OK = Ok, B = Bad. |
Similarity Index | Sort by this column to group syllables by acoustic similarity (provided you have calculated similarity index as above). |
Note | Write notes about a row (free format) |
Date | The date of the raw recording |
Added | The date the song was added to the database |
If you change 'sex' or 'species' column info associated with an individual animal, the changes will be reflected for all rows in the table belonging to that individual (you may need to refresh your browser to see the change propagate). This prevents mistakes, such as the same individual being ascribed male sex in some rows and female sex in other rows. It also means you only have to enter sex and species information for an individual in one row.
A future Koe update will allow a user to add custom annotation columns, such as a species column. In the meantime a user can simply co-opt one of the existing annotation columns for their purposes, such as labelling species.
Sort by a column by clicking on its header (e.g. Duration). To sort by multiple columns at once, `Shift`+click subsequent headers.
Units can have three sets of labels at once, using the three label columns: Label, Subfamily and Family. You can use these to label at different scales, e.g. fine-scale, intermediate-scale and broad-scale classification. Or you may choose to use these columns for other purposes.
Select multiple rows of the same type by clicking their checkboxes or pressing spacebar with the checkbox cell selected. When you have finished selecting rows, press `Ctrl`+`Shift`+`L` to bulk label all selected rows at the fine-scale level. `Ctrl`+`Shift`+`F` bulk labels at the broad-scale "Family" level, and `Ctrl`+`Shift`+`S` bulk labels at the intermediate "Subfamily" level.
Alternatively, click the menu button on the right-hand side of the Label, Subfamily or Family column header and select Bulk set value:
A dialogue box will appear:
If you are creating a new class, type any new name you like. If you are adding a group of units to a pre-existing class, then start typing the name of the class and it will appear in the dropdown list. The numbers beside each label class in the dropdown list indicate how many instances of that class there are in the dataset. Click the correct class name on the list, then click Set. All selected rows will now have this label.
Note that after labelling, all selected rows will remain selected. You can un-select all rows quickly with `Ctrl`+`` ` ``.
You can also classify an individual unit by double-clicking the label, subfamily or family cell and typing a string. If the label class already exists you will see it in the dropdown list. Click the list item you want, or use `Alt`+`Down` to navigate down the list and `Enter` to select:
Every new label class you create gets automatically added to the Exemplars view, which functions as a class catalogue to reference during classification. This view has one row per type, with up to 10 randomly-selected exemplars for each row. Go to Classify & manage units > Exemplars to see your exemplars. Click on spectrograms to play the sound, and change playback speed with the speed slider, just as in other views. Click Next 10 random exemplars to randomly select the next batch.
You can have Koe open in as many tabs or windows as you want, so it’s easy to have exemplars in one tab and unit table in another, to help you label by comparison. If you make changes in one tab, refresh your other tabs to see the changes.
The filter box at the top of each program view is a quick and powerful way to filter your data. It takes regular expressions (see https://www.computerhope.com/jargon/r/regex.htm for a crash course on using regex), but it can also be used simply with no technical expertise. Simply mouse over a column header, click the dropdown arrow that appears, and select Filter.
Then type the string you want to filter by. For example, in the unit table, to filter rows with a certain label, click the Label column dropdown and click Filter, then type a string, e.g.

`label: crazysqueak`

You can also filter by multiple columns at once, separated by semicolons:

`label: crazysqueak; song: BobtheBat_1`
To filter date columns (i.e. Date of record and Added columns), use the format `from(yyyy-mm-dd)to(yyyy-mm-dd)`, like this: `date: from(2015-01-01)to(2018-12-31)`

To filter numerical columns, such as ID, use `==` for 'exactly', and `>` and `<` for 'greater than' and 'less than', respectively.

`^` indicates the start of a string, and `$` indicates the end; for example:
- `label: squeak` will return rows with 'squeak' anywhere in the label, such as 'crazysqueak', 'squeaky', 'upsqueak'.
- `label: ^squeak` will return rows beginning with 'squeak', such as 'squeaky'.
- `label: squeak$` will return rows ending with 'squeak', such as 'crazysqueak', 'upsqueak'.
- `label: ^squeak$` will return rows with exactly 'squeak' (nothing before or after).
- `label: ^$` is a handy way of finding rows with blank labels.
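These anchors behave the same as in any regex engine. If you want to convince yourself, here is a quick check in Python:

```python
# Sanity-checking the ^ and $ anchor examples above with Python's re module.
import re

labels = ["crazysqueak", "squeaky", "upsqueak", "squeak", ""]

print([s for s in labels if re.search("squeak", s)])    # anywhere in label
print([s for s in labels if re.search("^squeak", s)])   # begins with squeak
print([s for s in labels if re.search("squeak$", s)])   # ends with squeak
print([s for s in labels if re.search("^squeak$", s)])  # exactly squeak
print([s for s in labels if re.search("^$", s)])        # blank labels
```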
Click on View all songs. Here you can see all your songs, with one row per song. You will see the song sequence structure as a sequence of unit labels.
To illustrate, the figure below shows a song (upper panel) and how it is represented in View all songs as a string of labels (lower panel).
Click on a unit to play it and see its spectrogram:
Click the play button at the start of a sequence to play the entire song. Use the filter to search for songs of a particular unit sequence, e.g. `sequence:"trill"-"dragon"` will filter for all songs with a `trill` unit immediately followed by a `dragon` unit. This is useful for finding songs with a shared pattern and exploring syntax.
As one way of exploring rules that may govern song structure (i.e. syntax), Koe uses the cSPADE (constrained Sequential Pattern Discovery using Equivalence classes) algorithm (Zaki, 2001) to discover commonly-occurring sequences in a set of songs.
We consider a sequence to be an ordered list of acoustic units, denoted A⇒B⇒C⇒...⇒x.

A sequence rule has two parts: a left side (what comes before) and a right side (what follows). The rule states that when the left side occurs, the right side follows:

[Left side] ⇒ [Right side]

For example, take A⇒B ⇒ C. This rule states that when the sequence A⇒B occurs, C comes next.

The left side can be a sequence of any length, but in our current implementation the right side is always one unit long. For example:

A ⇒ B (when A occurs, B comes next)

or

A⇒B⇒C⇒D⇒E⇒F⇒G⇒H⇒I ⇒ J (when A⇒B⇒C⇒D⇒E⇒F⇒G⇒H⇒I occurs, J comes next).
We plan to relax this restriction in a future update to allow chains of units on the right side.
The cSPADE algorithm calculates the credibility of sequence rules. Credible rules have a large confidence factor, a large level of support and a value of lift greater than one (as defined below). Let's take the example rule A⇒B ⇒ C.

Support: The level of support is the proportion of songs in the database that contain the entire sequence A⇒B⇒C at least once.

Confidence: The strength of an association is defined by its confidence factor (hereafter 'confidence'), which is the proportion of those songs containing A⇒B that also contain A⇒B⇒C.

Lift: Lift is a measure of the strength of the association relative to chance. It is equal to the confidence of the rule, divided by the proportion of songs containing the right side. Thus it gives the ratio of (i) the proportion of songs the transition A⇒B ⇒C occurs in, versus (ii) the proportion of songs expected to contain A⇒B ⇒C by chance association.
Think of our example rule, A⇒B ⇒C. Imagine a dataset with 100 songs, where 10 of those songs contain A⇒B, 7 contain A⇒B⇒C, and 20 songs contain C.

The support is 0.07 (i.e. A⇒B⇒C occurs in 7/100 songs). The confidence is 0.7 (i.e. A⇒B⇒C occurs in 7/10 songs containing A⇒B). The lift is the confidence (0.7) divided by the proportion of songs containing C (0.2), which is 3.5. In other words, the association A⇒B ⇒C occurs in 3.5 times as many songs as expected by chance association of A⇒B and C.
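The arithmetic can be checked mechanically. Below is a plain-Python toy (not the cSPADE implementation Koe uses) that reproduces the numbers from the example above:

```python
# Support / confidence / lift of the rule A=>B => C on a toy song population.
def contains(song, pattern):
    """True if pattern occurs as consecutive units somewhere in song."""
    n = len(pattern)
    return any(song[i:i + n] == pattern for i in range(len(song) - n + 1))

# 100 toy songs: 7 contain A,B,C; 3 contain A,B without C;
# 13 more contain C; the remaining 77 contain neither.
songs = ([["A", "B", "C"]] * 7 + [["A", "B", "X"]] * 3
         + [["C", "X"]] * 13 + [["X", "Y"]] * 77)

left, right = ["A", "B"], ["C"]
n = len(songs)
n_full = sum(contains(s, left + right) for s in songs)   # 7
n_left = sum(contains(s, left) for s in songs)           # 10
n_right = sum(contains(s, right) for s in songs)         # 20

support = n_full / n                 # 0.07
confidence = n_full / n_left         # 0.7
lift = confidence / (n_right / n)    # 0.7 / 0.2 = 3.5
print(support, confidence, lift)
```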
To demonstrate cSPADE with real data, consider the ten songs shown in the figure below to be a population of songs.
The rule Stutter⇒Waah(magpie) ⇒ Pipe(B6) has a support of 0.4, since the entire sequence occurs in four of the 10 songs. The rule has a confidence of 0.8, because in four of the five songs that contain Stutter⇒Waah(magpie), the transition to Pipe(B6) occurs. The proportion of songs with Pipe(B6) is 0.6, so the lift of this rule is 0.8/0.6 = 1.33. That is, the association Stutter⇒Waah(magpie) ⇒ Pipe(B6) occurs in 1.33 times as many songs as expected by chance association of Stutter⇒Waah(magpie) and Pipe(B6).
Two-unit associations from cSPADE can be visualised using a directed network. The network models the direction and strength of association between pairs of units, across a population of songs. Units are represented by nodes (vertices) which are joined by lines (edges) if the units occur consecutively (by default Koe shows only those sequences that occur in at least 1% of songs). The order of units is indicated by arrow direction, and strength of association between units (lift) is represented by edge thickness. Visually cluttered networks can be simplified using the filter, e.g. to show only associations with high lift.
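Koe renders this network interactively in the browser, but the underlying idea is easy to approximate offline. A sketch with networkx, using hypothetical association values:

```python
# Directed network of two-unit associations; edge thickness scales with lift.
import networkx as nx
import matplotlib.pyplot as plt

# Hypothetical (from_unit, to_unit, lift) associations.
edges = [
    ("Stutter", "Waah(magpie)", 2.0),
    ("Waah(magpie)", "Pipe(B6)", 1.33),
    ("Pipe(B6)", "Stutter", 1.1),
]

G = nx.DiGraph()
for a, b, lift in edges:
    G.add_edge(a, b, weight=lift)

pos = nx.spring_layout(G, seed=1)   # force-directed, like the repulsion/distance controls
widths = [2 * G[a][b]["weight"] for a, b in G.edges]
nx.draw(G, pos, with_labels=True, width=widths, node_color="lightblue")
plt.show()
```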
Control network appearance with the following controls:
Network control | Description |
---|---|
Separate units by actual time (on/off) | Activate Max/Min gap controls |
Max gap (10ms-1000ms) | The maximum gap length between units before they are no longer considered to be in the same sequence |
Min gap (1ms-100ms) | The minimum gap length between units below which they are no longer considered to be in the same sequence |
Show pseudo begin and end (on/off) | Displays start and end nodes, to visualise which classes tend to start and/or end songs |
Show node labels (on/off) | Toggle visibility of node labels |
Reactive pseudo start (on/off) | Pulls starting units towards the start node |
Reactive pseudo end (on/off) | Pulls ending units towards the end node |
Centering (on/off) | Prevents the network clumping in a corner of the screen |
Repulsion (1-50) | Determines the extent to which nodes actively repel each other |
Distance (10%-400%) | Determines how ‘spread out’ the nodes are |
Nodes can also be dragged around with the mouse to optimise network appearance.
You can export the data from each program view as a CSV file by clicking Export entire table (csv) or Export filtered subset (csv) on the active controls pane. The format of exported tables matches how the data appears in Koe; for example, data exported from the Unit table view will match the column and row order of the unit table as it appears in Koe.
You can export the csv of extracted feature measurements by going to Extract features and compare > Extract unit features. Choose your previously-extracted matrix from the dropdown list, then click the download button at the bottom of the screen.
First make sure you have created the database which you want to copy songs to. Then, under View all songs, select the songs you want to copy (1), and click Copy to another database (2). Type the name of the database to copy to (3), then click Yes (4).
You can make collections of units as convenient subsets for analysis. Under Classify & manage units > Unit table, select all the units you want to be in the collection, then click Make a Collection from selected units on the active controls pane. Type a name for the collection, then click Set. Note that the collection name must be unique; if the name has already been used (by any user, not just you), it will not 'take'. Also note that a Collection is not separate from the parent database – any changes you make to the Collection will be reflected in the parent database. Therefore you should take care when naming the collection so that you will remember which parent it belongs to. You can rename and delete collections under Manage your database.
If you want to create a subset that you can modify independently from the parent database, go to View all songs and copy songs to another database. Then you can edit freely without affecting the original database. To achieve this, do the following:
1. Create a new database under Manage your database.
2. Go to View all songs.
3. Under the Current database dropdown list on the controls pane, choose the database you want to copy songs from.
4. Select the songs you want to copy by checking their checkboxes.
5. Click Copy to another database on the active controls pane.
6. Type the name of the newly created database you want to copy songs to.
7. Click Copy.
In Chrome:
Hotkey | Action |
---|---|
`Ctrl`+`Page Up` / `Ctrl`+`Page Down` | Toggle between browser tabs |
`Alt`+`D`, then `Alt`+`Enter` | Duplicate current tab |
`F5` | Refresh current tab |
`F12` (brings up developer panel) + long-click browser refresh button > Empty Cache and Hard Reload | Can resolve Koe glitches. Try it—if you don't mind emptying your cache |
These hotkeys are common to all views with data tables. Additional view-specific hotkeys are given in the following sections.
Hotkey | Action |
---|---|
`←` `↑` `↓` `→` | Navigate the table grid |
`Home` | Jump to first cell in row |
`End` | Jump to end cell in row |
`Page Up` / `Page Down` | Page the table one screen-height at a time |
Click on table header, `Shift`+click on subsequent headers | Sort by multiple columns, in the priority order headers were clicked |
Hotkey | Action |
---|---|
`Spacebar` | Check/uncheck checkbox when checkbox cell is active |
`Shift`+click on checkbox | Select a range (i.e. selects all rows between the last selected checkbox and the `Shift`+clicked checkbox) |
Hotkey | Action |
---|---|
`Shift` while mousing over segmented unit box | Reveal unit selection box start/end handles; click and drag handles while holding `Shift` to adjust |
`←` `→` | Scroll spectrogram backwards and forwards (click in spectrogram area first to activate) |
`Spacebar` | Check/uncheck checkbox when checkbox cell is active |
`Shift`+click on checkbox | Select a range (i.e. selects all rows between the last selected checkbox and the `Shift`+clicked checkbox) |
Hotkey | Action |
---|---|
`Spacebar` | Play audio (when spectrogram cell is active); check/uncheck row selection checkbox (when checkbox cell is active) |
`Shift`+click on checkbox | Select a range (i.e. selects all rows between the last selected checkbox and the `Shift`+clicked checkbox) |
`Ctrl`+`Shift`+`L` | Bulk-label all selected rows in Label column (e.g. fine-scale classification) |
`Ctrl`+`Shift`+`S` | Bulk-label all selected rows in Subfamily column (e.g. intermediate-scale classification) |
`Ctrl`+`Shift`+`F` | Bulk-label all selected rows in Family column (e.g. broad-scale classification) |
`Ctrl`+`` ` `` | Deselect all currently selected rows |
Suggestions for other hotkeys? Let us know!
=================================================
JUMP TO: Analysis Tutorial: Case Study of NZ bellbird song.
JUMP TO: Frequently Asked Questions
JUMP TO: What's new in Koe?
Information lacking? Questions? Comments? Live chat with us at www.facebook.com/KoeSoftware or send us an email ([email protected])