Awesome-LLMOps Awesome

🎉 An awesome & curated list of the best LLMOps tools.

New projects are more than welcome; please add them in alphabetical order.

Table of Contents

Agent

Framework

  • Agno: Build Multimodal AI Agents with memory, knowledge and tools. Simple, fast and model-agnostic. Stars Contributors LastCommit
  • AutoGPT: AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters. Stars Contributors LastCommit
  • kagent: kagent is a Kubernetes-native framework for building AI agents. Stars Contributors LastCommit
  • LangGraph: Build resilient language agents as graphs. Stars Contributors LastCommit
  • MetaGPT: 🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming. Stars Contributors LastCommit
  • OpenAI Agents SDK: A lightweight, powerful framework for multi-agent workflows. Stars Contributors LastCommit
  • OpenManus: No fortress, purely open ground. OpenManus is Coming. Stars Contributors LastCommit
  • PydanticAI: Agent Framework / shim to use Pydantic with LLMs. Stars Contributors LastCommit
  • Swarm: Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team. Stars Contributors LastCommit Tag
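
Most of the frameworks above share the same loop: define an agent, give it tools, run it against an input. As a rough, non-authoritative sketch of that loop, here is a minimal example with the OpenAI Agents SDK listed above (assumes the `openai-agents` package and an `OPENAI_API_KEY` in the environment; the weather tool is a made-up placeholder):

```python
# Minimal agent sketch with the OpenAI Agents SDK (package: openai-agents).
# Assumes OPENAI_API_KEY is set; get_weather is a hypothetical example tool.
from agents import Agent, Runner, function_tool

@function_tool
def get_weather(city: str) -> str:
    """Return a canned weather string for the given city."""
    return f"The weather in {city} is sunny."

agent = Agent(
    name="Weather assistant",
    instructions="Answer questions about the weather using the provided tool.",
    tools=[get_weather],
)

result = Runner.run_sync(agent, "What's the weather in Berlin?")
print(result.final_output)
```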

Tools

  • Browser Use: Make websites accessible for AI agents. Stars Contributors LastCommit
  • Mem0: The Memory layer for AI Agents. Stars Contributors LastCommit
  • OpenAI CUA: Computer Using Agent Sample App. Stars Contributors LastCommit

Alignment

  • OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT). Stars Contributors LastCommit
  • Self-RLHF: Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback. Stars Contributors LastCommit

Application Orchestration Framework

  • Dify: Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production. Stars Contributors LastCommit
  • Flowise: Drag & drop UI to build your customized LLM flow. Stars Contributors LastCommit
  • Haystack: AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots. Stars Contributors LastCommit
  • Inference: Turn any computer or edge device into a command center for your computer vision projects. Stars Contributors LastCommit Tag
  • LangChain: 🦜🔗 Build context-aware reasoning applications. Stars Contributors LastCommit
  • LightRAG: "LightRAG: Simple and Fast Retrieval-Augmented Generation" Stars Contributors LastCommit
  • LlamaIndex: LlamaIndex is the leading framework for building LLM-powered agents over your data. Stars Contributors LastCommit
  • Semantic Kernel: An open-source integration framework for integrating LLMs into your applications, featuring plugin integration, memory management, planners, and multi-modal capabilities. Stars Contributors LastCommit
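
These frameworks differ in scope, but the common building block is wiring a prompt, a model, and an output parser into a pipeline. A minimal sketch in LangChain's LCEL style (assumes the `langchain-core` and `langchain-openai` packages and an OpenAI API key; the model name is illustrative):

```python
# Minimal prompt -> model -> parser chain with LangChain (LCEL style).
# Assumes langchain-core and langchain-openai are installed and OPENAI_API_KEY is set.
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template("Summarize in one sentence: {text}")
llm = ChatOpenAI(model="gpt-4o-mini")  # illustrative model name
chain = prompt | llm | StrOutputParser()

print(chain.invoke({"text": "LLMOps covers serving, routing, observability, and evaluation."}))
```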

Chat Framework

  • 5ire: 5ire is a cross-platform desktop AI assistant and MCP client. It is compatible with major service providers and supports local knowledge bases and tools via Model Context Protocol servers. Stars Contributors LastCommit
  • Chatbot UI: AI chat for any model. Stars Contributors LastCommit
  • Cherry Studio: Cherry Studio is a desktop client that supports multiple LLM providers, including DeepSeek-R1. Stars Contributors LastCommit
  • FastChat: An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. Stars Contributors LastCommit
  • Gradio: Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work! Stars Contributors LastCommit
  • Jan: Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. Stars Contributors LastCommit
  • Lobe Chat: 🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / DeepSeek / Qwen), knowledge base (file upload / knowledge management / RAG), multi-modals (plugins / artifacts), and thinking. One-click FREE deployment of your private ChatGPT / Claude / DeepSeek application. Stars Contributors LastCommit
  • NextChat: ✨ Light and Fast AI Assistant. Supports Web | iOS | macOS | Android | Linux | Windows. Stars Contributors LastCommit
  • Open WebUI: User-friendly AI Interface (Supports Ollama, OpenAI API, ...). Stars Contributors LastCommit
  • PrivateGPT: Interact with your documents using the power of GPT, 100% privately, no data leaks. Stars Contributors LastCommit
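
Several of the projects above are full desktop or web applications; Gradio is the quickest way to put a chat UI in front of your own model from Python. A minimal sketch (the respond function is a placeholder standing in for a real model call):

```python
# Minimal chat UI with Gradio's ChatInterface.
# The respond() function is a placeholder; a real app would call an LLM here.
import gradio as gr

def respond(message, history):
    return f"You said: {message}"

gr.ChatInterface(respond, title="Demo chat").launch()
```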

Code Assistant

  • Auto-dev: 🧙‍AutoDev: The AI-powered coding wizard with multilingual support 🌐, auto code generation 🏗️, and a helpful bug-slaying assistant 🐞! Customizable prompts 🎨 and a magic Auto Dev/Testing/Document/Agent feature 🧪 included! 🚀 Stars Contributors LastCommit
  • Codefuse-chatbot: An intelligent assistant serving the entire software development lifecycle, powered by a Multi-Agent Framework, working with DevOps Toolkits, Code&Doc Repo RAG, etc. Stars Contributors LastCommit
  • Cody: Type less, code more: Cody is an AI code assistant that uses advanced search and codebase context to help you write and fix code. Stars Contributors LastCommit
  • Continue: ⏩ Create, share, and use custom AI code assistants with our open-source IDE extensions and hub of models, rules, prompts, docs, and other building blocks. Stars Contributors LastCommit
  • Sweep: AI coding assistant for JetBrains. Stars Contributors LastCommit
  • Tabby: Self-hosted AI coding assistant. Stars Contributors LastCommit

Database

  • chroma: The AI-native open-source embedding database. Stars Contributors LastCommit
  • deeplake: Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. Stars Contributors LastCommit
  • Faiss: A library for efficient similarity search and clustering of dense vectors. Stars Contributors LastCommit
  • milvus: Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search. Stars Contributors LastCommit
  • weaviate: Weaviate is an open-source vector database that stores both objects and vectors, combining vector search with structured filtering and the fault tolerance and scalability of a cloud-native database. Stars Contributors LastCommit
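
Under the hood these vector stores serve the same primitive: index embeddings, then return nearest neighbours for a query vector. Faiss exposes that primitive directly; in this sketch random vectors stand in for real embeddings:

```python
# Nearest-neighbour search over random vectors with Faiss.
import faiss
import numpy as np

d = 128                                              # embedding dimension
xb = np.random.random((1000, d)).astype("float32")   # "database" vectors
xq = np.random.random((5, d)).astype("float32")      # query vectors

index = faiss.IndexFlatL2(d)                         # exact L2 search
index.add(xb)
distances, ids = index.search(xq, 4)                 # 4 nearest neighbours per query
print(ids)
```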

Evaluation

  • AgentBench: A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24). Stars Contributors LastCommit
  • lm-evaluation-harness: A framework for few-shot evaluation of language models. Stars Contributors LastCommit
  • LongBench: LongBench v2 and LongBench (ACL 2024). Stars Contributors LastCommit
  • OpenCompass: OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) over 100+ datasets. Stars Contributors LastCommit
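
The harness-style evaluators above follow one pattern: choose a model backend, choose task names, get back a table of scores. A sketch of that pattern using lm-evaluation-harness's Python entry point (model and task names are illustrative; the same run is usually driven via its `lm_eval` CLI):

```python
# Programmatic evaluation with lm-evaluation-harness (package: lm_eval).
# Model and task names are illustrative; this downloads the model and dataset.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                    # Hugging Face backend
    model_args="pretrained=EleutherAI/pythia-160m",
    tasks=["hellaswag"],
    num_fewshot=0,
)
print(results["results"]["hellaswag"])
```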

FineTune

  • Axolotl: Go ahead and axolotl questions. Stars Contributors LastCommit
  • EasyLM: Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, fine-tuning, evaluating, and serving LLMs in JAX/Flax. Stars Contributors LastCommit
  • LLaMa-Factory: Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024). Stars Contributors LastCommit
  • LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All. Stars Contributors LastCommit
  • maestro: Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL. Stars Contributors LastCommit
  • MLX-VLM: MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX. Stars Contributors LastCommit
  • Swift: Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2.5, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL2, Phi3.5-Vision, GOT-OCR2, ...). Stars Contributors LastCommit
  • torchtune: PyTorch native post-training library. Stars Contributors LastCommit
  • Transformer Lab: Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer. Stars Contributors LastCommit
  • unsloth: Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥 Stars Contributors LastCommit
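
Most of these toolkits wrap a similar recipe: load a base model, attach LoRA adapters, then hand the model to a trainer. A rough sketch of the first two steps with unsloth, under the assumption that its `FastLanguageModel` API is used as in its README (model name and hyperparameters are illustrative):

```python
# Load a 4-bit base model and attach LoRA adapters with unsloth.
# Model name and LoRA hyperparameters are illustrative.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,                # LoRA rank
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
# `model` is now ready to pass to an SFT trainer (e.g. TRL's SFTTrainer).
```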

Gateway

LLM Router

  • AI Gateway: A blazing fast AI Gateway with integrated guardrails. Route to 200+ LLMs, 50+ AI Guardrails with 1 fast & friendly API. Stars Contributors LastCommit
  • LiteLLM: Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]. Stars Contributors LastCommit
  • RouteLLM: A framework for serving and evaluating LLM routers - save LLM costs without compromising quality. Stars Contributors LastCommit
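
Routers such as LiteLLM put many providers behind one OpenAI-style call, so switching models becomes a one-string change. A minimal sketch (assumes the relevant provider API key, e.g. `OPENAI_API_KEY`, is set in the environment; model names are illustrative):

```python
# Call different providers through one OpenAI-style interface with LiteLLM.
# Assumes the provider's API key (e.g. OPENAI_API_KEY) is set in the environment.
from litellm import completion

response = completion(
    model="gpt-4o-mini",   # swap to e.g. "claude-3-haiku-20240307" without other code changes
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(response.choices[0].message.content)
```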

API Gateway

  • APISIX: The Cloud-Native API Gateway and AI Gateway with extensive plugin system and AI capabilities. Stars Contributors LastCommit
  • Envoy AI Gateway: Envoy AI Gateway is an open source project for using Envoy Gateway to handle request traffic from application clients to Generative AI services. Stars Contributors LastCommit
  • Higress: 🤖 AI Gateway | AI Native API Gateway. Stars Contributors LastCommit
  • kgateway: The Cloud-Native API Gateway and AI Gateway. Stars Contributors LastCommit
  • Kong: 🦍 The Cloud-Native API Gateway and AI Gateway. Stars Contributors LastCommit
  • gateway-api-inference-extension: Gateway API Inference Extension. Stars Contributors LastCommit

Inference

Inference Engine

  • Cortex.cpp: Local AI API Platform. Stars Contributors LastCommit
  • DeepSpeed-MII: MII makes low-latency and high-throughput inference possible, powered by DeepSpeed. Stars Contributors LastCommit
  • Nvidia Dynamo: A Datacenter Scale Distributed Inference Serving Framework. Stars Contributors LastCommit
  • ipex-llm: Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc. Stars Contributors LastCommit
  • LMDeploy: LMDeploy is a toolkit for compressing, deploying, and serving LLMs. Stars Contributors LastCommit
  • LoRAX: Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs. Stars Contributors LastCommit Tag
  • llama.cpp: LLM inference in C/C++. Stars Contributors LastCommit
  • Llumnix: Efficient and easy multi-instance LLM serving. Stars Contributors LastCommit
  • MInference: [NeurIPS'24 Spotlight, ICLR'25] Speeds up long-context LLM inference by computing attention with approximate, dynamic sparsity, reducing pre-filling latency by up to 10x on an A100 while maintaining accuracy. Stars Contributors LastCommit Tag
  • MLC LLM: Universal LLM Deployment Engine with ML Compilation. Stars Contributors LastCommit
  • MLServer: An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more. Stars Contributors LastCommit
  • Ollama: Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, and other large language models. Stars Contributors LastCommit
  • OpenVINO: OpenVINO™ is an open source toolkit for optimizing and deploying AI inference. Stars Contributors LastCommit
  • Ratchet: A cross-platform browser ML framework. Stars Contributors LastCommit Tag
  • SGLang: SGLang is a fast serving framework for large language models and vision language models. Stars Contributors LastCommit
  • transformers.js: State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server! Stars Contributors LastCommit Tag
  • Triton Inference Server: The Triton Inference Server provides an optimized cloud and edge inferencing solution. Stars Contributors LastCommit
  • Text Generation Inference: Large Language Model Text Generation Inference. Stars Contributors LastCommit
  • vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs. Stars Contributors LastCommit
  • web-llm: High-performance In-browser LLM Inference Engine. Stars Contributors LastCommit Tag
  • zml: Any model. Any hardware. Zero compromise. Built with @ziglang / @openxla / MLIR / @bazelbuild. Stars Contributors LastCommit
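
The engines above split roughly into offline batch libraries and OpenAI-compatible servers; vLLM covers both. A minimal offline sketch (the model name is illustrative and is downloaded from the Hugging Face Hub on first run):

```python
# Offline batched generation with vLLM.
# The model name is illustrative; vLLM also ships an OpenAI-compatible server
# (`vllm serve <model>`) for online use.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, max_tokens=64)

outputs = llm.generate(["The key pieces of an LLMOps stack are"], params)
for out in outputs:
    print(out.outputs[0].text)
```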

Inference Platform

  • AIBrix: Cost-efficient and pluggable Infrastructure components for GenAI inference. Stars Contributors LastCommit
  • Kaito: Kubernetes operator for large-model inference and fine-tuning, with GPU auto-provisioning, container-based hosting, and CRD-based orchestration. Stars Contributors LastCommit
  • Kserve: Standardized Serverless ML Inference Platform on Kubernetes. Stars Contributors LastCommit
  • KubeAI: AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-text. Stars Contributors LastCommit
  • llmaz: ☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work! Stars Contributors LastCommit
  • LMCache: 10x Faster Long-Context LLM By Smart KV Cache Optimizations. Stars Contributors LastCommit Tag
  • Mooncake: Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI. Stars Contributors LastCommit
  • OpenLLM: Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud. Stars Contributors LastCommit

MCP

MCP Server

  • awesome-mcp-servers: A curated list of awesome Model Context Protocol (MCP) servers. Stars Contributors LastCommit
  • mcp-directory: A directory for Awesome MCP Servers. Stars Contributors LastCommit
  • Smithery: Smithery is a platform to help developers find and ship language model extensions compatible with the Model Context Protocol Specification.
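
An MCP server is simply a process that advertises tools and resources over the Model Context Protocol so that any MCP client (IDE, chat app, agent) can call them. A minimal sketch with the official Python SDK's FastMCP helper (assumes the `mcp` package; the add tool is a toy example):

```python
# Minimal MCP server exposing one tool, using the official MCP Python SDK.
# The add() tool is a toy example; run this file and point an MCP client at it.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

if __name__ == "__main__":
    mcp.run()   # serves over stdio by default
```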

MCP Client

MLOps

  • BentoML: The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more! Stars Contributors LastCommit
  • Flyte: Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks. Stars Contributors LastCommit
  • Kubeflow: Machine Learning Toolkit for Kubernetes. Stars Contributors LastCommit
  • Metaflow: Build, Deploy and Manage AI/ML Systems. Stars Contributors LastCommit
  • MLflow: Open source platform for the machine learning lifecycle. Stars Contributors LastCommit
  • Polyaxon: MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle. Stars Contributors LastCommit
  • Ray: Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. Stars Contributors LastCommit
  • Seldon-Core: An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models. Stars Contributors LastCommit
  • ZenML: ZenML 🐙: The bridge between ML and Ops. https://zenml.io. Stars Contributors LastCommit
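
Whatever the platform, the common denominator is tracking runs, parameters, metrics, and artifacts. A minimal sketch with MLflow's tracking API (experiment name and values are illustrative; without a tracking server the logs land in a local `mlruns/` directory):

```python
# Log a run's parameters and metrics with MLflow tracking.
# Without a configured tracking server this writes to the local ./mlruns directory.
import mlflow

mlflow.set_experiment("llmops-demo")   # illustrative experiment name

with mlflow.start_run():
    mlflow.log_param("base_model", "llama-3-8b")
    mlflow.log_param("learning_rate", 2e-5)
    mlflow.log_metric("eval_loss", 1.23)
```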

Observation

  • OpenLLMetry: Open-source observability for your LLM application, based on OpenTelemetry. Stars Contributors LastCommit
  • Helicone: 🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓 Stars Contributors LastCommit
  • phoenix: AI Observability & Evaluation. Stars Contributors LastCommit
  • wandb: The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production. Stars Contributors LastCommit
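
These tools differ in whether they instrument LLM traces (OpenLLMetry, Helicone, Phoenix) or experiments (Weights & Biases), but the logging pattern looks similar. A minimal sketch with wandb (project name and metric values are illustrative; requires a W&B login or anonymous mode):

```python
# Log training metrics to Weights & Biases.
# Project name and metric values are illustrative; requires `wandb login` (or anonymous mode).
import wandb

run = wandb.init(project="llmops-demo", config={"base_model": "llama-3-8b"})
for step in range(3):
    wandb.log({"loss": 1.0 / (step + 1)}, step=step)
run.finish()
```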

Output

Training

  • Candle: Minimalist ML framework for Rust. Stars Contributors LastCommit
  • ColossalAI: Making large AI models cheaper, faster and more accessible. Stars Contributors LastCommit
  • Ludwig: Low-code framework for building custom LLMs, neural networks, and other AI models. Stars Contributors LastCommit
  • MaxText: A simple, performant and scalable Jax LLM! Stars Contributors LastCommit
  • MLX: MLX: An array framework for Apple silicon. Stars Contributors LastCommit