
MF-TFA_SD-MS

Introduction

The official implementation of "A Singing Melody Extraction Network Via Self-Distillation and Multi-Level Supervision." Our paper has been accepted by ICASSP 2025.

We propose a singing melody extraction network consisting of five stacked multi-scale feature time-frequency aggregation (MF-TFA) modules. Within a network, deeper layers generally contain more contextual information than shallower layers. To help the shallower layers extract task-relevant features, we propose a self-distillation and multi-level supervision (SD-MS) method, which distills features from the deepest layer to the shallower ones and applies multi-level supervision to guide network training.
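The sketch below illustrates, at a high level, how stacked MF-TFA blocks, self-distillation from the deepest layer, and multi-level supervision could fit together during training. It is a minimal PyTorch sketch based only on the description above; the block internals, auxiliary heads, pitch-bin count, and loss weights (`MFTFABlock`, `MelodyExtractor`, `sd_ms_loss`, `n_pitch_bins`, `distill_w`, `level_w`) are hypothetical placeholders, not the released implementation.

```python
# Minimal sketch of the SD-MS training idea described above.
# Module internals, tensor shapes, and loss weights are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MFTFABlock(nn.Module):
    """Stand-in for one multi-scale feature time-frequency aggregation module."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)

class MelodyExtractor(nn.Module):
    """Five stacked MF-TFA blocks; every block feeds an auxiliary pitch head."""
    def __init__(self, channels=32, n_pitch_bins=360, n_blocks=5):
        super().__init__()
        self.stem = nn.Conv2d(1, channels, kernel_size=3, padding=1)
        self.blocks = nn.ModuleList([MFTFABlock(channels) for _ in range(n_blocks)])
        # One lightweight head per level for multi-level supervision.
        self.heads = nn.ModuleList(
            [nn.Conv2d(channels, n_pitch_bins, kernel_size=1) for _ in range(n_blocks)]
        )

    def forward(self, spec):                      # spec: (B, 1, freq_bins, frames)
        feats, logits = [], []
        x = self.stem(spec)
        for block, head in zip(self.blocks, self.heads):
            x = block(x)
            feats.append(x)
            logits.append(head(x).mean(dim=2))    # collapse frequency -> (B, bins, frames)
        return feats, logits

def sd_ms_loss(feats, logits, target, distill_w=1.0, level_w=0.5):
    """Self-distillation (deepest -> shallower features) plus multi-level supervision."""
    teacher = feats[-1].detach()                  # deepest features act as the teacher
    distill = sum(F.mse_loss(f, teacher) for f in feats[:-1])
    supervise = sum(F.cross_entropy(l, target) for l in logits[:-1])
    main = F.cross_entropy(logits[-1], target)    # loss on the final prediction
    return main + level_w * supervise + distill_w * distill

# Example usage (shapes only):
# spec   = torch.randn(2, 1, 128, 256)           # (batch, 1, freq_bins, frames)
# target = torch.randint(0, 360, (2, 256))       # per-frame pitch-bin labels
# feats, logits = MelodyExtractor()(spec)
# loss = sd_ms_loss(feats, logits, target)
```

The key point of the sketch is that the deepest block's features are detached and used as a teacher for the shallower blocks, while every level's auxiliary head is supervised by the ground-truth pitch labels.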


Getting Started

Download Datasets

Results

Prediction results

The visualization illustrates that our proposed method reduces octave errors and melody detection errors.


Comprehensive results

The bold values indicate the best performance for a specific metric.


Ablation study result 1

Results of ablation experiments that introduce the self-distillation and multi-level supervision method into several existing singing melody extraction models. SD-MS indicates that self-distillation and multi-level supervision are used.


Ablation study result 2

Ablation study of the loss function on three datasets.


Important update

The full code will be made public once licensing is finalized.

Special thanks
