PengchengZhu zpcoftts

TTS/SVS/VC/Talking Avatar

91 followers · 153 following

NetEase
Hangzhou, China

Starred repositories

ai-dynamo / dynamo

A Datacenter Scale Distributed Inference Serving Framework

Rust 3,004 197 Updated Mar 23, 2025

ahujasid / blender-mcp

Python 8,061 674 Updated Mar 22, 2025

codezjx / netease-cloud-music-dl

Netease cloud music song downloader, with full ID3 metadata, eg: front cover image, artist name, album name, song title and so on.

Python 530 87 Updated Jun 7, 2024

cisnlp / MaskLID

💬 MaskLID: Code-Switching Language Identification through Iterative Masking -- ACL 2024

Python 8 1 Updated Jun 11, 2024

mannaandpoem / OpenManus

No fortress, purely open ground. OpenManus is Coming.

Python 39,031 6,469 Updated Mar 22, 2025

OpenRLHF / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 5,831 568 Updated Mar 23, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 23,180 2,111 Updated Mar 23, 2025

facebookresearch / sam2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 14,647 1,557 Updated Dec 25, 2024

ASLP-lab / DiffRhythm

Di♪♪Rhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion

Python 1,251 121 Updated Mar 23, 2025

SesameAILabs / csm

A Conversational Speech Generation Model

Python 11,037 856 Updated Mar 22, 2025

webui-dev / webui

Use any web browser or WebView as GUI, with your preferred language in the backend and modern web technologies in the frontend, all in a lightweight portable library.

C 3,451 221 Updated Mar 19, 2025

index-tts / index-tts

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

276 7 Updated Feb 11, 2025

deepseek-ai / 3FS

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 8,250 785 Updated Mar 20, 2025

Wan-Video / Wan2.1

Wan: Open and Advanced Large-Scale Video Generative Models

Python 8,951 957 Updated Mar 21, 2025

astral-sh / uv

An extremely fast Python package and project manager, written in Rust.

Rust 45,805 1,291 Updated Mar 23, 2025

deepseek-ai / DualPipe

A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.

Python 2,646 279 Updated Mar 10, 2025

SparkAudio / Spark-TTS

Spark-TTS Inference Code

Python 5,904 605 Updated Mar 21, 2025

k2-fsa / sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS…

C++ 5,330 604 Updated Mar 22, 2025

deepseek-ai / FlashMLA

FlashMLA: Efficient MLA decoding kernels

C++ 11,360 807 Updated Mar 1, 2025

om-ai-lab / VLM-R1

Solve Visual Understanding with Reinforced VLMs

Python 4,273 264 Updated Mar 23, 2025

hiyouga / EasyR1

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 1,651 107 Updated Mar 21, 2025

roudimit / whisper-flamingo

[Interspeech 2024] Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation

Jupyter Notebook 145 9 Updated Feb 12, 2025

langgenius / dify-docs

The open-source repo for docs.dify.ai

Shell 443 306 Updated Mar 21, 2025

PKU-Alignment / align-anything

Align Anything: Training All-modality Model with Feedback

Python 2,975 386 Updated Mar 23, 2025

SkyworkAI / SkyReels-V1

SkyReels V1: The first and most advanced open-source human-centric video foundation model

Python 1,869 174 Updated Mar 10, 2025

Seed3D / MagicArticulate

[CVPR 2025] Official repository for “MagicArticulate: Make Your 3D Models Articulation-Ready”

Python 237 4 Updated Mar 22, 2025

v-iashin / SpecVQGAN

Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)

Jupyter Notebook 358 39 Updated Jul 12, 2024

ZJU-LLMs / Foundations-of-LLMs

9,191 786 Updated Jan 14, 2025

microsoft / OmniParser

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook 20,957 1,713 Updated Mar 17, 2025

ASLP-lab / OSUM

OSUM: Open Speech Understanding Model, open-sourced by ASLP@NPU.

Python 342 20 Updated Mar 18, 2025

Starred topics

reinforcement-learning