
Starred repositories
Integrate the DeepSeek API into popular software
Generation of diagrams like flowcharts or sequence diagrams from text, in a similar manner to Markdown
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
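A curated collection of open-source Chinese large language models, focusing on smaller-scale models that can be privately deployed and are cheaper to train, including base models, domain-specific fine-tuning and applications, datasets, and tutorials.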
✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
✨✨Latest Advances on Multimodal Large Language Models
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Open-source offline translation library written in Python
Free and Open Source Machine Translation API. Self-hosted, offline capable and easy to set up.
Robust Speech Recognition via Large-Scale Weak Supervision
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Refine high-quality datasets and visual AI models
Reinforcement learning code for the SPA-VL dataset
OpenChat: Advancing Open-source Language Models with Imperfect Data
A list of recent work on human video generation methods. This repository focuses on half/full-body human video generation methods; NeRF, Gaussian splatting, motion pose, and talking head/portrait is n…
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
[AAAI 2025] EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
ModelScope: bring the notion of Model-as-a-Service to life.
Generative Models by Stability AI
Open-Sora: Democratizing Efficient Video Production for All
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
[ICLR 2024] The official implementation of the paper "VDT: General-purpose Video Diffusion Transformers via Mask Modeling", by Haoyu Lu, Guoxing Yang, Nanyi Fei, Yuqi Huo, Zhiwu Lu, Ping Luo, Mingyu Ding.
Official implementation of "DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion"
A programming framework for agentic AI 🤖 PyPi: autogen-agentchat Discord: https://aka.ms/autogen-discord Office Hour: https://aka.ms/autogen-officehour
[TMLR 2025] Latte: Latent Diffusion Transformer for Video Generation.
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
[CSUR] A Survey on Video Diffusion Models