AI-S2-Lab

49 repositories

RDD-ADD
Public
0•0•0•0•Updated Feb 7, 2025
NE-PADD
Public
0•0•0•0•Updated Feb 7, 2025
MEIJU2025-website
Public
MEIJU2025-website
Rich Text Format
•0•1•0•0•Updated Jan 15, 2025
FluentEditor2
Public
FluentEditor2: Text-based Speech Editing by Modeling Multi-Scale Acoustic and Prosody Consistency
HTML
•1•5•0•0•Updated Jan 3, 2025
MS2KU-VTTS
Public
[ICASSP'2025] Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech
0•4•0•0•Updated Dec 20, 2024
M2CI-Dubber
Public
[ICASSP'2025] Towards Expressive Video Dubbing with Multiscale Multimodal Context Interaction
0•5•0•0•Updated Dec 20, 2024
I3CSS
Public
[ICASSP'2025] Intra- and Inter-modal Context Interaction Modeling for Conversational Speech Synthesis
0•0•2•0•Updated Dec 20, 2024
ARF-MSA
Public
[IEEE TAFFC'2024] Connecting Cross-Modal Representations for Compact and Robust Multimodal Sentiment Analysis With Sentiment Word Substitution Error
0•1•1•0•Updated Dec 16, 2024
M2SE-VTTS
Public
[AAAI'2025] Multi-modal and Multi-scale Spatial Environment Understanding for Immersive Visual Text-to-Speech
0•1•0•0•Updated Dec 16, 2024
MC-EIU-main
Public
Python
•0•7•0•0•Updated Nov 22, 2024
RADKA-CSS
Public
[Information Fusion'2025] Retrieval-Augmented Dialogue Knowledge Aggregation for Expressive Conversational Speech Synthesis
2•1•0•0•Updated Nov 16, 2024
NCE-TTS
Public
[IEEE TASLP'2025] Noise Robust Cross-Speaker Emotion Transfer in TTS Through Knowledge Distillation and Orthogonal Constraint
2•0•0•0•Updated Nov 14, 2024
GPT-Talker
Public
[ACMMM'2024] Generative Expressive Conversational Speech Synthesis
2•32•1•0•Updated Oct 28, 2024
NCSSD
Public
[ACMMM'2024] Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)
Python
•2•2•0•0•Updated Oct 28, 2024
F5-TTS
Public
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Python
•
MIT License
•1.4k•1•0•0•Updated Oct 23, 2024
FluentEditor
Public
[InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency
Python
•2•51•3•0•Updated Oct 23, 2024
MEIJU2025-baseline
Public
Python
•
MIT License
•1•20•0•0•Updated Oct 23, 2024
ECap-Spoken
Public
1•1•0•0•Updated Oct 11, 2024
MC-EIU
Public
Emotion and Intent Joint Understanding in Multimodal Conversation: A Benchmarking Dataset
0•0•0•0•Updated Sep 26, 2024
ICASSP-2020
Public
[ICASSP'2020] Teacher-Student Training for Robust Tacotron-based TTS
HTML
•1•0•0•0•Updated Sep 24, 2024
Expressive-TTS-Training-with-Frame-and-Style-Reconstruction-Loss
Public
[IEEE/ACM TASLP'2021] Expressive TTS Training with Frame and Style Reconstruction Loss
HTML
•1•1•0•0•Updated Sep 24, 2024
IOT
Public
[IEEE Internet of Things Journal (IEEE-IoTJ)'2022] Multi-Stage Deep Transfer Learning for EmIoT-enabled Human-Computer Interaction
HTML
•1•0•0•0•Updated Sep 24, 2024
SPL2020
Public
[IEEE SPL'2020] Modeling Prosodic Phrasing With Multi-Task Learning in Tacotron-Based TTS
HTML
•1•0•0•0•Updated Sep 24, 2024
FastTalker
Public
[Neural Networks'2021] FastTalker: A neural text-to-speech architecture with shallow and group autoregression
HTML
•1•0•0•0•Updated Sep 24, 2024
MT-KD
Public
[IEEE/ACM-TASLP'2022] Decoding Knowledge Transfer for Neural Text-to-Speech Training
HTML
•1•0•0•0•Updated Sep 24, 2024
Ai-TTS
Public
[InterSpeech'2023] Explicit Intensity Control for Accented Text-to-speech
HTML
•1•0•0•0•Updated Sep 24, 2024
MAM-BERT
Public
[IEEE/ACM-TASLP'2024] Text-to-Speech for Low-Resource Agglutinative Language With Morphology-Aware Language Model Pre-Training
HTML
•1•0•0•0•Updated Sep 24, 2024
CTA-TTS
Public
[IEEE/ACM-TASLP'2024] Controllable Accented Text-to-Speech Synthesis with Fine and Coarse-Grained Intensity Rendering
HTML
•1•0•0•0•Updated Sep 24, 2024
GraphSpeech
Public
[ICASSP'2021] GraphSpeech: Syntax-aware Graph Attention Network For Neural Speech Synthesis
HTML
•1•0•0•0•Updated Sep 24, 2024
i-ETTS
Public
[InterSpeech'2021] Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability
HTML
•1•0•0•0•Updated Sep 24, 2024