INTERSPEECH-2023-Papers

Speech Recognition: Technologies and Systems for New Applications

| 🆔 | Title | Repo |
|---|---|---|
| 2044 | Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model | |
| 2032 | Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization | |
| 235 | Progress and Prospects for Spoken Language Technology: Results from Five Sexennial Surveys | ➖ |
| 268 | Acoustic Word Embeddings for Untranscribed Target Languages with Continued Pretraining and Learned Pooling | ➖ |
| 601 | CASA-ASR: Context-Aware Speaker-Attributed ASR | ➖ |
| 1321 | Unsupervised Learning of Discrete Latent Representations with Data-Adaptive Dimensionality from Continuous Speech Streams | ➖ |
| 1167 | AD-TUNING: An Adaptive CHILD-TUNING Approach to Efficient Hyperparameter Optimization of Child Networks for Speech Processing Tasks in the SUPERB Benchmark | |
| 190 | Distilling Knowledge from Gaussian Process Teacher to Neural Network Student | ➖ |
| 135 | Segmental SpeechCLIP: Utilizing Pretrained Image-Text Models for Audio-Visual Learning | ➖ |
| 421 | Towards Hate Speech Detection in Low-Resource Languages: Comparing ASR to Acoustic Word Embeddings on Wolof and Swahili | ➖ |
| 385 | Mitigating Catastrophic Forgetting for Few-Shot Spoken Word Classification through Meta-Learning | |
| 664 | Online Punctuation Restoration using ELECTRA Model for Streaming ASR Systems | ➖ |
| 2066 | Language Agnostic Data-Driven Inverse Text Normalization | ➖ |
| 1079 | How to Estimate Model Transferability of Pre-trained Speech Models? | ➖ |
| 1655 | Transcribing Speech as Spoken and Written Dual Text using an Autoregressive Model | ➖ |
| 587 | Phonetic and Prosody-aware Self-Supervised Learning Approach for Non-Native Fluency Scoring | ➖ |
| 380 | Disentangling the Contribution of Non-Native Speech in Automated Pronunciation Assessment | ➖ |
| 337 | A Joint Model for Pronunciation Assessment and Mispronunciation Detection and Diagnosis with Multi-task Learning | ➖ |
| 1635 | Assessing Intelligibility in Non-Native Speech: Comparing Measures Obtained at Different Levels | ➖ |
| 585 | End-to-End Word-Level Pronunciation Assessment with MASK Pre-training | |
| 550 | A Hierarchical Context-aware Modeling Approach for Multi-Aspect and Multi-Granular Pronunciation Assessment | ➖ |
| 2541 | Automatic Prediction of Language Learners' Listenability using Speech and Text Features Extracted from Listening Drills | ➖ |
| 2371 | Assessment of Non-Native Speech Intelligibility using Wav2vec2-based Mispronunciation Detection and Multi-Level Goodness of Pronunciation Transformer | ➖ |
| 1899 | Adapting an Unadaptable ASR System | ➖ |
| 533 | Addressing Cold Start Problem for End-to-End Automatic Speech Scoring | ➖ |
| 816 | Improving Grapheme-to-Phoneme Conversion by Learning Pronunciations from Speech Recordings | ➖ |
| 2577 | Orthography-based Pronunciation Scoring for Better CAPT Feedback | ➖ |
| 1592 | Zero-Shot Automatic Pronunciation Assessment | ➖ |
| 364 | Mispronunciation Detection and Diagnosis Model for Tonal Language, Applied to Vietnamese | |
| 793 | An Efficient and Noise-Robust Audiovisual Encoder for Audiovisual Speech Recognition | ➖ |
| 540 | A Novel Self-training Approach for Low-Resource Speech Recognition | ➖ |
| 1428 | FunASR: A Fundamental End-to-End Speech Recognition Toolkit | |
| 487 | Streaming Audio-Visual Speech Recognition with Alignment Regularization | ➖ |
| 462 | SparseVSR: Lightweight and Noise Robust Visual Speech Recognition | ➖ |
| 2262 | Multimodal Speech Recognition for Language-Guided Embodied Agents | |
