    Repositories list

    • RDD-ADD

      Public
      0000 Updated Feb 7, 2025
    • NE-PADD

      Public
      0000 Updated Feb 7, 2025
    • MEIJU2025-website
      Rich Text Format
      0100 Updated Jan 15, 2025
    • FluentEditor2: Text-based Speech Editing by Modeling Multi-Scale Acoustic and Prosody Consistency
      HTML
      1500 Updated Jan 3, 2025
    • [ICASSP'2025] Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech
      0400 Updated Dec 20, 2024
    • [ICASSP'2025] Towards Expressive Video Dubbing with Multiscale Multimodal Context Interaction
      0500 Updated Dec 20, 2024
    • I3CSS

      Public
      [ICASSP'2025] Intra- and Inter-modal Context Interaction Modeling for Conversational Speech Synthesis
      0020 Updated Dec 20, 2024
    • ARF-MSA

      Public
      [IEEE TAFFC'2024] Connecting Cross-Modal Representations for Compact and Robust Multimodal Sentiment Analysis With Sentiment Word Substitution Error
      0110 Updated Dec 16, 2024
    • M2SE-VTTS

      Public
      [AAAI'2025] Multi-modal and Multi-scale Spatial Environment Understanding for Immersive Visual Text-to-Speech
      0100 Updated Dec 16, 2024
    • Python
      0700 Updated Nov 22, 2024
    • RADKA-CSS

      Public
      [Information Fusion'2025] Retrieval-Augmented Dialogue Knowledge Aggregation for Expressive Conversational Speech Synthesis
      2100 Updated Nov 16, 2024
    • NCE-TTS

      Public
      [IEEE TASLP'2025] Noise Robust Cross-Speaker Emotion Transfer in TTS Through Knowledge Distillation and Orthogonal Constraint
      2000 Updated Nov 14, 2024
    • [ACMMM'2024] Generative Expressive Conversational Speech Synthesis
      23210 Updated Oct 28, 2024
    • NCSSD

      Public
      [ACMMM'2024] Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)
      Python
      2200 Updated Oct 28, 2024
    • F5-TTS

      Public
      Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
      Python
      MIT License
      1.4k100 Updated Oct 23, 2024
    • [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency
      Python
      25130 Updated Oct 23, 2024
    • Python
      MIT License
      12000 Updated Oct 23, 2024
    • 1100 Updated Oct 11, 2024
    • MC-EIU

      Public
      Emotion and Intent Joint Understanding in Multimodal Conversation: A Benchmarking Dataset
      0000 Updated Sep 26, 2024
    • [ICASSP'2020] Teacher-Student Training for Robust Tacotron-based TTS
      HTML
      1000 Updated Sep 24, 2024
    • [IEEE/ACM TASLP'2021] Expressive TTS Training with Frame and Style Reconstruction Loss
      HTML
      1100 Updated Sep 24, 2024
    • IOT

      Public
      [IEEE Internet of Things Journal (IEEE-IoTJ)'2022] Multi-Stage Deep Transfer Learning for EmIoT-enabled Human-Computer Interaction
      HTML
      1000 Updated Sep 24, 2024
    • SPL2020

      Public
      [IEEE SPL'2020] Modeling Prosodic Phrasing With Multi-Task Learning in Tacotron-Based TTS
      HTML
      1000 Updated Sep 24, 2024
    • [Neural Networks'2021] FastTalker: A neural text-to-speech architecture with shallow and group autoregression
      HTML
      1000 Updated Sep 24, 2024
    • MT-KD

      Public
      [IEEE/ACM-TASLP'2022] Decoding Knowledge Transfer for Neural Text-to-Speech Training
      HTML
      1000 Updated Sep 24, 2024
    • Ai-TTS

      Public
      [InterSpeech'2023] Explicit Intensity Control for Accented Text-to-speech
      HTML
      1000 Updated Sep 24, 2024
    • MAM-BERT

      Public
      [IEEE/ACM-TASLP'2024] Text-to-Speech for Low-Resource Agglutinative Language With Morphology-Aware Language Model Pre-Training
      HTML
      1000 Updated Sep 24, 2024
    • CTA-TTS

      Public
      [IEEE/ACM-TASLP'2024] Controllable Accented Text-to-Speech Synthesis with Fine and Coarse-Grained Intensity Rendering
      HTML
      1000 Updated Sep 24, 2024
    • [ICASSP'2021] GraphSpeech: Syntax-aware Graph Attention Network For Neural Speech Synthesis
      HTML
      1000 Updated Sep 24, 2024
    • i-ETTS

      Public
      [InterSpeech'2021] Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability
      HTML
      1000 Updated Sep 24, 2024