diff --git a/docs/docs/llms/index.mdx b/docs/docs/llms/index.mdx
index a870c7fc6a44a..70592e07f2ce8 100644
--- a/docs/docs/llms/index.mdx
+++ b/docs/docs/llms/index.mdx
@@ -2,6 +2,354 @@ sidebar_position: 5
---

import { CardGroup, Card } from "@site/src/components/Card";
import { APILink } from "@site/src/components/APILink";

# LLMs

-{/* WIP */}

LLMs, or Large Language Models, have rapidly become a cornerstone in the machine learning domain, offering
immense capabilities ranging from natural language understanding to code generation and more.
However, harnessing the full potential of LLMs often involves intricate processes, from interfacing with
multiple providers to fine-tuning specific models to achieve desired outcomes.

Such complexities can easily become a bottleneck for developers and data scientists aiming to integrate LLM
capabilities into their applications.

**MLflow's Support for LLMs** aims to alleviate these challenges by introducing a suite of features and tools designed with the end user in mind.

## Tutorials and Use Case Guides for GenAI applications in MLflow

Interested in learning how to leverage MLflow for your GenAI projects?

Look through the tutorials and guides below to learn more about interesting use cases that could help to make your journey into leveraging GenAI a bit easier!

Note that there are additional tutorials within the ["Explore the Native LLM Flavors" section below](#explore-the-native-llm-flavors), so be sure to check those out as well!

## MLflow Tracing

:::note
MLflow Tracing is currently in **Experimental Status** and is subject to change without deprecation warning or notification.
:::

MLflow offers comprehensive tracing capabilities to monitor and analyze the execution of GenAI applications. This includes automated tracing of GenAI frameworks such as
LangChain, OpenAI, and LlamaIndex; manual trace instrumentation using high-level fluent APIs; and low-level client APIs for fine-grained control. This functionality
allows you to capture detailed trace data, enabling better debugging, performance monitoring, and insights into complex workflows.
Whether through decorators, context managers, or explicit API calls, MLflow provides the flexibility needed to trace and optimize the operations
of your GenAI models and retain your traced data within the tracking server for further analysis.

- [Automated tracing with GenAI libraries](tracing#automatic-tracing): Seamless integration with libraries such as LangChain, OpenAI, LlamaIndex, and AutoGen for automatic trace data collection.
- [Manual trace instrumentation with high-level fluent APIs](tracing#tracing-fluent-apis): Easy-to-use decorators and context managers for adding tracing with minimal code changes.
- [Low-level client APIs for tracing](tracing#tracing-client-apis): Thread-safe methods for detailed and explicit control over trace data management.

To learn more about what tracing is, see our [Tracing Concepts Overview](tracing/overview) guide. For an in-depth exploration into the structure of
MLflow traces and their schema, see the [Tracing Schema](tracing/tracing-schema) guide.

If you're interested in contributing to the development of MLflow Tracing, please refer to the [Contribute to MLflow Tracing](tracing/contribute) guide.
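To make the options above concrete, here is a minimal sketch of the two most common entry points: enabling automatic tracing for an integrated library and decorating your own function with the fluent API (the experiment and function names are illustrative):

```python
import mlflow

mlflow.set_experiment("Tracing Quickstart")

# Automatic tracing: one call per integrated library
# (OpenAI shown here; requires the `openai` package to be installed)
mlflow.openai.autolog()


# Manual instrumentation: the decorator records inputs, outputs, and latency as a span
@mlflow.trace
def summarize(text: str) -> str:
    return text[:100]


summarize("MLflow Tracing captures each step of a GenAI request.")
```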
+ +## MLflow AI Gateway for LLMs + +Serving as a unified interface, the [MLflow AI Gateway](deployments) +simplifies interactions with multiple LLM providers. In addition to supporting the most popular SaaS LLM providers, the MLflow AI Gateway +provides an integration to MLflow model serving, allowing you to serve your own LLM or a fine-tuned foundation model within your own serving infrastructure. + +:::note +The MLflow AI Gateway is in active development and has been marked as **Experimental**. +APIs may change as this new feature is refined and its functionality is expanded based on feedback. +::: + +### Benefits of the MLflow AI Gateway + +- **Unified Endpoint**: No more juggling between multiple provider APIs. + +- **Simplified Integrations**: One-time setup, no repeated complex integrations. + +- **Secure Credential Management**: + + - Centralized storage prevents scattered API keys. + - No hardcoding or user-handled keys. + +- **Consistent API Experience**: + + - Uniform API across all providers. + - Easy-to-use REST endpoints and Client API. + +- **Seamless Provider Swapping**: + + - Swap providers without touching your code. + - Zero downtime provider, model, or route swapping. + +### Explore the Native Providers of the MLflow AI Gateway + +The MLflow AI Gateway supports a large range of foundational models from popular SaaS model vendors, as well as providing a means of self-hosting your +own open source model via an integration with MLflow model serving. + +Please refer to [Supported Provider Models](deployments#providers) for the full list of supported providers and models. + +If you're interested in learning about how to set up the MLflow AI Gateway for a specific provider, follow the links below for our up-to-date +documentation on GitHub. Each link will take you to a README file that will explain how to set up a route for the provider. In the same directory as +the README, you will find a runnable example of how to query the routes that the example creates, providing you with a quick reference for getting started +with your favorite provider! + +
+
+ +
![OpenAI Logo](/images/logos/openai-logo.png)
+
+ +
+ ![MosaicML Logo](/images/logos/mosaicml-logo.svg) +
+
+ +
+ ![Anthropic Logo](/images/logos/anthropic-logo.svg) +
+
+ +
![Cohere Logo](/images/logos/cohere-logo.png)
+
+ +
![MLflow Logo](/images/logos/mlflow-logo.svg)
+
+ +
![AWS Bedrock Logo](/images/logos/aws-logo.svg)
+
+ +
![PaLM Logo](/images/logos/PaLM-logo.png)
+
+ +
![AI21 Labs Logo](/images/logos/ai21labs-logo.svg)
+
+ +
+ ![Azure OpenAI Logo](/images/logos/azure-ml-logo.png) +
+
+ +
+ ![Hugging Face Logo](/images/logos/huggingface-logo.svg) +
+
+
+
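Regardless of which provider backs an endpoint, querying it looks the same from the client side. Below is a minimal sketch using the MLflow deployments client; the gateway URL and the endpoint name (`chat`) are assumptions for illustration and depend on your own gateway configuration:

```python
from mlflow.deployments import get_deploy_client

# Point the client at a locally running MLflow AI Gateway (assumed URL)
client = get_deploy_client("http://localhost:5000")

# Query a configured chat endpoint; the endpoint name must match your gateway config
response = client.predict(
    endpoint="chat",
    inputs={"messages": [{"role": "user", "content": "What is MLflow?"}]},
)
print(response)
```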
:::note
The **MLflow** and **Hugging Face TGI** providers are for self-hosted LLM serving of open-source foundation models, fine-tuned open-source
models, or your own custom LLM. The example documentation for these providers will show you how to get started, using free-to-use open-source
models from the [Hugging Face Hub](https://huggingface.co/).
:::

## LLM Evaluation

Navigating the vast landscape of Large Language Models (LLMs) can be daunting. Determining the right model, prompt, or service that aligns
with a project's needs is no small feat. Traditional machine learning evaluation metrics often fall short when it comes to assessing the
nuanced performance of generative models.

Enter [MLflow LLM Evaluation](llm-evaluate). This feature is designed to simplify the evaluation process,
offering a streamlined approach to compare foundational models, providers, and prompts.

### Benefits of MLflow's LLM Evaluation

- **Simplified Evaluation**: Navigate the LLM space with ease, ensuring the best fit for your project with standard metrics that can be used to compare generated text.

- **Use-Case Specific Metrics**: Leverage MLflow's `mlflow.evaluate()` API for a high-level, frictionless evaluation experience.

- **Customizable Metrics**: Beyond the provided metrics, MLflow supports a plugin-style mechanism for custom scoring, enhancing the evaluation's flexibility.

- **Comparative Analysis**: Effortlessly compare foundational models, providers, and prompts to make informed decisions.

- **Deep Insights**: Dive into the intricacies of generative models with a comprehensive suite of LLM-relevant metrics.

MLflow's LLM Evaluation is designed to bridge the gap between traditional machine learning evaluation and the unique challenges posed by LLMs.

## Prompt Engineering UI

Effective utilization of LLMs often hinges on crafting the right prompts.
The development of a high-quality prompt is an iterative process of trial and error, where subsequent experimentation is not guaranteed to
result in cumulative quality improvements. With the volume and speed of iteration through prompt experimentation, it can quickly become
overwhelming to remember or keep a history of the different prompts that were tried.

Serving as a powerful tool for prompt engineering, the [MLflow Prompt Engineering UI](prompt-engineering) revolutionizes the
way developers interact with and refine LLM prompts.

### Benefits of the MLflow Prompt Engineering UI

- **Iterative Development**: Streamlined process for trial and error without the overwhelming complexity.

- **UI-Based Prototyping**: Prototype, iterate, and refine prompts without diving deep into code.

- **Accessible Engineering**: Makes prompt engineering more user-friendly, speeding up experimentation.

- **Optimized Configurations**: Quickly hone in on the best model configurations for tasks like question answering or document summarization.

- **Transparent Tracking**:

  - Every model iteration and configuration is meticulously tracked.
  - Ensures reproducibility and transparency in your development process.

:::note
The MLflow Prompt Engineering UI is in active development and has been marked as **Experimental**.
Features and interfaces may evolve as feedback is gathered and the tool is refined.
:::

## Native MLflow Flavors for LLMs

Harnessing the power of LLMs becomes effortless with flavors designed specifically for working with LLM libraries and frameworks.
+ +- **Native Support for Popular Packages**: Standardized interfaces for tasks like saving, logging, and managing inference configurations. + +- **PyFunc Compatibility**: + + - Load models as PyFuncs for broad compatibility across serving infrastructures. + - Strengthens the MLOps process for LLMs, ensuring smooth deployments. + - Utilize the [Models From Code feature](../model/models-from-code) for simplified GenAI application development. + +- **Cohesive Ecosystem**: + + - All essential tools and functionalities consolidated under MLflow. + - Focus on deriving value from LLMs without getting bogged down by interfacing and optimization intricacies. + +### Explore the Native LLM Flavors + +Select the integration below to read the documentation on how to leverage MLflow's native integration with these popular libraries: + +
+
+ + + + + + +
+
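As a concrete illustration of the PyFunc compatibility described above, the sketch below logs an OpenAI chat model with its native flavor and loads it back as a generic PyFunc for inference. It assumes the `openai` package is installed and a valid `OPENAI_API_KEY` is set; the model choice and prompt template are illustrative:

```python
import openai
import pandas as pd

import mlflow

with mlflow.start_run():
    model_info = mlflow.openai.log_model(
        model="gpt-4o",
        task=openai.chat.completions,
        artifact_path="model",
        # The template variable is filled in from the prediction input
        messages=[{"role": "user", "content": "Tell me a fun fact about {topic}."}],
    )

# Load the logged model as a PyFunc and query it like any other MLflow model
loaded = mlflow.pyfunc.load_model(model_info.model_uri)
print(loaded.predict(pd.DataFrame({"topic": ["MLflow"]})))
```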
+ +## LLM Tracking in MLflow + +Empowering developers with advanced tracking capabilities, the [MLflow LLM Tracking System](llm-tracking) stands out as the +premier solution for managing and analyzing interactions with Large Language Models (LLMs). + +### Benefits of the MLflow LLM Tracking System + +- **Robust Interaction Management**: Comprehensive tracking of every LLM interaction for maximum insight. + +- **Tailor-Made for LLMs**: + + - Unique features specifically designed for LLMs. + - From logging prompts to tracking dynamic data, MLflow has it covered. + +- **Deep Model Insight**: + + - Introduces 'predictions' as a core entity, alongside the existing artifacts, parameters, and metrics. + - Gain unparalleled understanding of text-generating model behavior and performance. + +- **Clarity and Repeatability**: + + - Ensures consistent and transparent tracking across all LLM interactions. + - Facilitates informed decision-making and optimization in LLM deployment and utilization. diff --git a/docs/docs/llms/tracing/contribute.mdx b/docs/docs/llms/tracing/contribute.mdx new file mode 100644 index 0000000000000..01dc480dd32b4 --- /dev/null +++ b/docs/docs/llms/tracing/contribute.mdx @@ -0,0 +1,147 @@ +--- +sidebar_position: 4 +--- + +import { APILink } from "@site/src/components/APILink"; + +# Contributing to MLflow Tracing + +Welcome to the MLflow Tracing contribution guide! This step-by-step resource will assist you in implementing additional GenAI library integrations for tracing into MLflow. + +:::tip +If you have any questions during the process, try the **“Ask AI”** feature in the bottom-right corner. It can provide both reference documentation and quick answers to common questions about MLflow. +::: + +## Step 1. Set up Your Environment + +Set up a dev environment following the [CONTRIBUTING.md](https://github.com/mlflow/mlflow/blob/master/CONTRIBUTING.md). After setup, +verify the environment is ready for tracing development by running the unit tests with the `pytest tests/tracing` command and ensure that all tests pass. + +## Step 2. Familiarize Yourself with MLflow Tracing + +First, get a solid understanding of what MLflow Tracing does and how it works. Check out these docs to get up to speed: + +- [Tracing Concepts](./overview/) - Understand what tracing is and the specific benefits for MLflow users. +- [Tracing Schema Guide](./tracing-schema) - Details on trace data structure and the information stored. +- [MLflow Tracing API Guide](../) - Practical guide to auto-instrumentation and APIs for manually creating traces. + +📝 **Quick Quiz**: Before moving on to the next step, let's challenge your understanding with a few questions. +If you are not sure about the answers, revisit the docs for a quick refresh. + +
+
+ Q. What is the difference between a Trace and a Span? +

A. A Trace is the main object holding multiple Spans, with each Span capturing a different part of an operation. A Trace has metadata (TraceInfo) and a list of Spans (TraceData).\
  Reference: [Tracing Schema Guide](./tracing-schema/)

+
+ +
+ Q. What is the easiest way to create a span for a function call? +

A. Use the `@mlflow.trace` decorator to capture inputs, outputs, and execution duration automatically.\ + Reference: [MLflow Tracing API Guide](../#trace-decorator)
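For instance, a minimal sketch (the function name is illustrative):

```python
import mlflow


@mlflow.trace
def generate_answer(question: str) -> str:
    # The span records the input question, the return value, and the duration
    return f"Answer to: {question}"


generate_answer("What is MLflow Tracing?")
```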

+
+ +
+ Q. When would you use the MLflow Client for creating traces? +

A. The Client API is useful when you want fine-grained control over how you start and end a trace. For example, you can specify a parent span ID when starting a span.\ + Reference: [MLflow Tracing API Guide](../#tracing-client-apis)
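A minimal sketch of that pattern (span names are illustrative):

```python
from mlflow import MlflowClient

client = MlflowClient()

# Start the trace (root span), then attach a child span via the parent span ID
root_span = client.start_trace("my_trace")
child_span = client.start_span(
    name="child",
    request_id=root_span.request_id,
    parent_id=root_span.span_id,
)
client.end_span(request_id=root_span.request_id, span_id=child_span.span_id)
client.end_trace(request_id=root_span.request_id)
```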

+
+ +
+ Q. How do you log input data to a span? +

A. You can log input data with the `span.set_inputs()` method on a span object returned by the `mlflow.start_span` context manager or the Client APIs.\
  Reference: [Tracing Schema Guide](../tracing-schema/)

+
+ +
+ Q. Where is exception information stored in a Trace? +

A. Exceptions are recorded in the `events` attribute of the span, including details such as exception type, message, and stack trace.\ + Reference: [MLflow Tracing API Guide](../#q-how-can-i-see-the-stack-trace-of-a-span-that-captured-an-exception)

+ +
+
## Step 3. Understand the Integration Library

From a tracing perspective, GenAI libraries can be categorized into two types:

**🧠 LLM Providers or Wrappers**

Libraries like **OpenAI**, **Anthropic**, **Ollama**, and **LiteLLM** focus on providing access to LLMs. These libraries often have simple client SDKs, so we often simply use ad-hoc patching to trace those APIs.

For this type of library, start by listing the core APIs to instrument. For example, in OpenAI auto-tracing, we patch the `create()` method of the `ChatCompletion`, `Completion`, and `Embedding` classes.
Refer to the [OpenAI auto-tracing implementation](https://github.com/mlflow/mlflow/blob/master/mlflow/openai/_openai_autolog.py#L123-L178) as an example.

**⚙️ Orchestration Frameworks**

Libraries such as **LangChain**, **LlamaIndex**, and **DSPy** offer higher-level workflows, integrating LLMs, embeddings, retrievers, and tools into complex applications.
Since these libraries require trace data from multiple components, we do not want to rely on ad-hoc patching. Therefore, auto-tracing for these libraries often leverages available callbacks
(e.g., [LangChain Callbacks](https://python.langchain.com/docs/how_to/#callbacks)) for more reliable integration.

For this type of library, first check whether the library provides a callback mechanism you can make use of. If there isn't one, consider filing a feature request with the library to have this functionality added by the project maintainers,
providing comprehensive justification for the request. Having a callback mechanism also benefits other users of the library by providing flexibility and allowing integration with many other tools.
If there is a specific reason the library cannot provide callbacks, consult with the MLflow maintainers. We are unlikely to proceed with a design that relies on ad-hoc patching, but we can discuss alternative approaches if there are any to be had.

## Step 4. Write a Design Document

Draft a design document for your integration plan, using the [design template](https://docs.google.com/document/d/1AQGgJk-hTkUo0lTkGqCGQOMelQmz05kQz_OA4bJWaJE/edit#heading=h.4cz970y1mk93). Here are some important considerations:

- **Integration Method**: Describe whether you'll use callbacks, API hooks, or patching. If there are multiple methods, list them as options and explain your choice.
- **Maintainability**: LLM frameworks evolve quickly, so avoid relying on internal methods as much as possible. Prefer public APIs such as callbacks.
- **Standardization**: Ensure consistency with other MLflow integrations for usability and downstream tasks. For example, retrieval spans should follow the [Retriever Schema](./tracing-schema.html#retriever-schema) for UI compatibility.

Include a brief overview of the library's core functionality and use cases to provide context for reviewers. Once the draft is ready, share your design with MLflow maintainers, and if time allows, create a proof of concept to highlight potential challenges early.

## Step 5. Begin Implementation

With the design approved, start implementation:

1. **Create a New Module**: If the library isn't already integrated with MLflow, create a new directory under `mlflow/` (e.g., `mlflow/llama_index`). Add an `__init__.py` file to initialize the module.
2. **Develop the Tracing Hook**: Implement your chosen method (patch, callback, or decorator) for tracing.
   If you go with the patching approach, use the `safe_patch` function to ensure stable patching (see [example](https://github.com/mlflow/mlflow/blob/master/mlflow/openai/__init__.py#L905)).
3. **Define the `mlflow.xxx.autolog()` function**: This function is the main entry point for the integration and enables tracing when called.
4. **Write Tests**: Cover edge cases like asynchronous calls, custom data types, and streaming outputs if the library supports them.

:::warning attention
There are a few gotchas to watch out for when integrating with MLflow Tracing:

- **Error Handling**: Ensure exceptions are captured and logged to spans with type, message, and stack trace.
- **Streaming Outputs**: For streaming (iterators), hook into the iterator to assemble and log the full output to the span. Directly logging the iterator object is not only unhelpful but can also cause unexpected behavior, such as exhausting the iterator during serialization.
- **Serialization**: MLflow serializes traces to JSON via the custom `TraceJsonEncoder` implementation, which supports common objects and Pydantic models. If your library uses custom objects, consider extending the serializer, as unsupported types are stringified and may lose useful detail.
- **Timestamp Handling**: When using timestamps provided by the library, validate the unit and timezone. MLflow requires timestamps in _nanoseconds since the UNIX epoch_; incorrect timestamps will produce incorrect span durations.
:::

## Step 6. Test the Integration

Once implementation is complete, run end-to-end tests in a notebook to verify functionality. Ensure:

◻︎ Traces appear correctly in the MLflow Experiment.

◻︎ Traces are properly rendered in the MLflow UI.

◻︎ Errors from MLflow trace creation do not interrupt the original execution of the library.

◻︎ Edge cases such as asynchronous and streaming calls function as expected.

In addition to the local tests, there are a few Databricks services that are integrated with MLflow Tracing. Consult with an MLflow maintainer for guidance on how to test those integrations.

When you are confident that the implementation works correctly, open a PR with the test results pasted in the PR description.

## Step 7. Document the Integration

Documentation is a prerequisite for release. Follow these steps to complete the documentation:

1. Add the integrated library icon and example in the [main Tracing documentation](../).
2. If the library is already present in an existing MLflow model flavor, add a Tracing section in the flavor documentation ([example page](../../llama-index/index.html#enable-tracing)).
3. Add a notebook tutorial to demonstrate the integration ([example notebook](https://github.com/mlflow/mlflow/blob/master/docs/source/llms/llama-index/notebooks/llama_index_quickstart.ipynb)).

Documentation sources are located in the `docs/` folder. Refer to [Writing Docs](https://github.com/mlflow/mlflow/blob/master/CONTRIBUTING.md#writing-docs) for more details on how to build and preview the documentation.

## Step 8. Release 🚀

Congratulations! Now you've completed the journey of adding a new tracing integration to MLflow. The release notes will feature your name, and we will write a social media post and/or a blog post to highlight your contribution.
+ +Thank you so much for helping improve MLflow Tracing, and we look forward to working with you again!😊 + +## Contact + +If you have any questions or need help, feel free to reach out to the maintainers (POC: @B-Step62, @BenWilson2) for further guidance. diff --git a/docs/docs/llms/tracing/index.mdx b/docs/docs/llms/tracing/index.mdx new file mode 100644 index 0000000000000..950afe473503a --- /dev/null +++ b/docs/docs/llms/tracing/index.mdx @@ -0,0 +1,1137 @@ +--- +description: MLflow Tracing is a feature that enables LLM observability in your apps. MLflow automatically logs traces for LangChain, LlamaIndex, and more. +sidebar_position: 1 +--- + +import { APILink } from "@site/src/components/APILink"; +import TOCInline from "@theme/TOCInline"; +import Tabs from "@theme/Tabs"; +import TabItem from "@theme/TabItem"; + +# MLflow Tracing + +:::note +MLflow Tracing is currently in **Experimental Status** and is subject to change without deprecation warning or notification. +::: + +
+ +
+
MLflow Tracing is a feature that enhances LLM observability in your Generative
AI (GenAI) applications by capturing detailed information about the execution of
your application's services. Tracing provides a way to record the inputs,
outputs, and metadata associated with each intermediate step of a request,
enabling you to easily pinpoint the source of bugs and unexpected behaviors.

MLflow offers a number of different options to enable tracing of your GenAI applications.

- **Automated tracing**: MLflow provides fully automated integrations with various GenAI libraries such as LangChain, OpenAI, LlamaIndex, DSPy, AutoGen, and more that can be activated by simply enabling `mlflow.<library>.autolog()`.
- **Manual trace instrumentation with high-level fluent APIs**: Decorators, function wrappers, and context managers via the fluent API allow you to add tracing functionality with minor code modifications.
- **Low-level client APIs for tracing**: The MLflow client API provides a thread-safe way to handle trace implementations, even in asynchronous modes of operation.

To learn more about what tracing is, see our [Tracing Concepts Overview](./overview) guide.

To explore the structure and schema of MLflow Tracing, please see the [Tracing Schema](./tracing-schema) guide.

:::note
MLflow Tracing support is available with the **MLflow 2.14.0** release. Versions of MLflow prior to this release
do not contain the full set of features that are required for trace logging support.
:::

## Automatic Tracing

:::info Hint
Is your favorite library missing from the list? Consider [contributing to MLflow Tracing](./contribute) or [submitting a feature request](https://github.com/mlflow/mlflow/issues/new?assignees=&labels=enhancement&projects=&template=feature_request_template.yaml&title=%5BFR%5D) to our GitHub repository.
:::

The easiest way to get started with MLflow Tracing is to leverage the built-in capabilities with MLflow's integrated libraries. MLflow provides automatic tracing capabilities for some of the integrated libraries such as
LangChain, OpenAI, LlamaIndex, and AutoGen. For these libraries, you can instrument your code with
just a single command `mlflow.<library>.autolog()` and MLflow will automatically log traces
for model/API invocations to the active MLflow Experiment.

  ### LangChain Automatic Tracing

  As part of the LangChain autologging integration, traces are logged to the active MLflow Experiment when calling invocation APIs on chains. You can enable tracing
  for LangChain by calling the `mlflow.langchain.autolog()` function.

  ```python
  import mlflow

  mlflow.langchain.autolog()
  ```

  In the full example below, the model and its associated metadata will be logged as a run, while the traces are logged separately to the active experiment. To learn more, please visit [LangChain Autologging documentation](../langchain/autologging).

  :::note
  This example has been confirmed working with the following requirement versions:
  ```shell
  pip install openai==1.30.5 langchain==0.2.1 langchain-openai==0.1.8 langchain-community==0.2.1 mlflow==2.14.0 tiktoken==0.7.0
  ```
  :::

  ```python
  import os

  from langchain.prompts import PromptTemplate
  from langchain_openai import OpenAI

  import mlflow

  assert (
      "OPENAI_API_KEY" in os.environ
  ), "Please set your OPENAI_API_KEY environment variable."
+ + # Using a local MLflow tracking server + mlflow.set_tracking_uri("http://localhost:5000") + + # Create a new experiment that the model and the traces will be logged to + mlflow.set_experiment("LangChain Tracing") + + # Enable LangChain autologging + # Note that models and examples are not required to be logged in order to log traces. + # Simply enabling autolog for LangChain via mlflow.langchain.autolog() will enable trace logging. + mlflow.langchain.autolog(log_models=True, log_input_examples=True) + + llm = OpenAI(temperature=0.7, max_tokens=1000) + + prompt_template = ( + "Imagine that you are {person}, and you are embodying their manner of answering questions posed to them. " + "While answering, attempt to mirror their conversational style, their wit, and the habits of their speech " + "and prose. You will emulate them as best that you can, attempting to distill their quirks, personality, " + "and habits of engagement to the best of your ability. Feel free to fully embrace their personality, whether " + "aspects of it are not guaranteed to be productive or entirely constructive or inoffensive." + "The question you are asked, to which you will reply as that person, is: {question}" + ) + + chain = prompt_template | llm + + # Test the chain + chain.invoke( + { + "person": "Richard Feynman", + "question": "Why should we colonize Mars instead of Venus?", + } + ) + + # Let's test another call + chain.invoke( + { + "person": "Linus Torvalds", + "question": "Can I just set everyone's access to sudo to make things easier?", + } + ) + ``` + + If we navigate to the MLflow UI, we can see not only the model that has been auto-logged, but the traces as well, as shown in the below video: + + ![LangChain Tracing via autolog](/images/llms/tracing/langchain-tracing.gif) + + :::note + The example above is purposely simple (a simple chat completions demonstration) for purposes of brevity. In real-world scenarios involving complex + RAG chains, the trace that is recorded by MLflow will be significantly more complex and verbose. + ::: + + + + ### OpenAI Automatic Tracing + + The MLflow OpenAI flavor's autologging feature has a direct integration with MLflow tracing. When OpenAI autologging is enabled with , + usage of the OpenAI SDK will automatically record generated traces during interactive development. + + ```python + import mlflow + + mlflow.openai.autolog() + ``` + + For example, the code below will log traces to the currently active experiment (in this case, the activated experiment `"OpenAI"`, set through the use of the API). + To learn more about OpenAI autologging, you can [view the documentation here](../openai/autologging). + + ```python + import os + import openai + import mlflow + + # Calling the autolog API will enable trace logging by default. 
+ mlflow.openai.autolog() + + mlflow.set_experiment("OpenAI") + + openai_client = openai.OpenAI(api_key=os.environ.get("OPENAI_API_KEY")) + + messages = [ + { + "role": "user", + "content": "How can I improve my resting metabolic rate most effectively?", + } + ] + + response = openai_client.chat.completions.create( + model="gpt-4o", + messages=messages, + temperature=0.99, + ) + + print(response) + ``` + + The logged trace, associated with the `OpenAI` experiment, can be seen in the MLflow UI, as shown below: + + ![OpenAI Tracing](/images/llms/tracing/openai-tracing.png) + + + + ### OpenAI Swarm Automatic Tracing + + The MLflow OpenAI flavor supports automatic tracing for [Swarm](https://github.com/openai/swarm), a multi-agent orchestration + framework from OpenAI. To enable tracing for **Swarm**, just call + before running your multi-agent interactions. MLflow will trace all LLM interactions, tool calls, and agent operations automatically. + + ```python + import mlflow + + mlflow.openai.autolog() + ``` + + For example, the code below will run the simplest example of multi-agent interaction using OpenAI Swarm. + + ```python + import mlflow + from swarm import Swarm, Agent + + # Calling the autolog API will enable trace logging by default. + mlflow.openai.autolog() + + mlflow.set_experiment("OpenAI Swarm") + + client = Swarm() + + + def transfer_to_agent_b(): + return agent_b + + + agent_a = Agent( + name="Agent A", + instructions="You are a helpful agent.", + functions=[transfer_to_agent_b], + ) + + agent_b = Agent( + name="Agent B", + instructions="Only speak in Haikus.", + ) + + response = client.run( + agent=agent_a, + messages=[{"role": "user", "content": "I want to talk to agent B."}], + ) + print(response) + ``` + + The logged trace, associated with the `OpenAI Swarm` experiment, can be seen in the MLflow UI, as shown below: + + ![OpenAI Swarm Tracing](/images/llms/tracing/openai-swarm-tracing.png) + + + + ### LlamaIndex Automatic Tracing + + The MLflow LlamaIndex flavor's autologging feature has a direct integration with MLflow tracing. When LlamaIndex autologging is enabled with , invocation of components + such as LLMs, agents, and query/chat engines will automatically record generated traces during interactive development. + + ```python + import mlflow + + mlflow.llama_index.autolog() + ``` + + To see the full example of tracing LlamaIndex, please visit [LLamaIndex Tracing documentation](../llama-index). + + ![LlamaIndex Tracing](/images/llms/llama-index/llama-index-trace.png) + + + + ### AutoGen Automatic Tracing + + MLflow Tracing ensures observability for your AutoGen application that involves complex multi-agent interactions. You can enable auto-tracing by calling , + then the internal steps of the agents chat session will be logged to the active MLflow Experiment. + + ```python + import mlflow + + mlflow.autogen.autolog() + ``` + + To see the full example of tracing AutoGen, please refer to the [AutoGen Tracing example](https://github.com/mlflow/mlflow/tree/master/examples/autogen/tracing.py). + + ![AutoGen Tracing](/images/llms/autogen/autogen-trace.png) + + + + +## Tracing Fluent APIs + +MLflow's `fluent APIs` provide a straightforward way to add tracing to your functions and code blocks. +By using decorators, function wrappers, and context managers, you can easily capture detailed trace data with minimal code changes. 
As a comparison between the fluent and the client APIs for tracing, the figure below illustrates the difference in complexity between the two.
The fluent API is more concise and is the recommended approach whenever your tracing use case can be handled by the higher-level APIs.
+ ![Fluent vs Client APIs](/images/llms/tracing/fluent-vs-client-tracing.png) +
+ +This section will cover how to initiate traces using these fluent APIs. + +### Initiating a Trace + +In this section, we will explore different methods to initiate a trace using MLflow's fluent APIs. These methods allow you to add tracing +functionality to your code with minimal modifications, enabling you to capture detailed information about the execution of your functions and workflows. + +#### Trace Decorator + +The trace decorator allows you to automatically capture the inputs and outputs of a function by simply adding the `@mlflow.trace` decorator +to its definition. This approach is ideal for quickly adding tracing to individual functions without significant changes to your existing code. + +```python +import mlflow + +# Create a new experiment to log the trace to +mlflow.set_experiment("Tracing Demo") + + +# Mark any function with the trace decorator to automatically capture input(s) and output(s) +@mlflow.trace +def some_function(x, y, z=2): + return x + (y - z) + + +# Invoking the function will generate a trace that is logged to the active experiment +some_function(2, 4) +``` + +You can add additional metadata to the tracing decorator as follows: + +```python +@mlflow.trace(name="My Span", span_type="func", attributes={"a": 1, "b": 2}) +def my_func(x, y): + return x + y +``` + +When adding additional metadata to the trace decorator constructor, these additional components will be logged along with the span entry within +the trace that is stored within the active MLflow experiment. + +Since MLflow 2.16.0, the trace decorator also supports async functions: + +```python +from openai import AsyncOpenAI + +client = AsyncOpenAI() + + +@mlflow.trace +async def async_func(message: str): + return await client.chat.completion.create( + model="gpt-4o", messages=[{"role": "user", "content": message}] + ) + + +await async_func("What is MLflow Tracing?") +``` + +#### What is captured? + +If we navigate to the MLflow UI, we can see that the trace decorator automatically captured the following information, in addition to the basic +metadata associated with any span (start time, end time, status, etc): + +- **Inputs**: In the case of our decorated function, this includes the state of all input arguments (including the default `z` value that is applied). +- **Response**: The output of the function is also captured, in this case the result of the addition and subtraction operations. +- **Trace Name**: The name of the decorated function. + +![Trace UI - simple use case](/images/llms/tracing/trace-demo-1.png) + +#### Error Handling with Traces + +If an `Exception` is raised during processing of a trace-instrumented operation, an indication will be shown within the UI that the invocation was not +successful and a partial capture of data will be available to aid in debugging. Additionally, details about the Exception that was raised will be included +within the `events` attribute of the partially completed span, further aiding the identification of where issues are occuring within your code. + +An example of a trace that has been recorded from code that raised an Exception is shown below: + +```python +# This will raise an AttributeError exception +do_math(3, 2, "multiply") +``` + +![Trace Error](/images/llms/tracing/trace-error.png) + +#### How to handle parent-child relationships + +When using the trace decorator, each decorated function will be treated as a separate span within the trace. 
The relationship between dependent function calls
is handled directly through the native call execution order within Python. For example, the following code will introduce two "child" spans to the main
parent span, all using decorators.

```python
import mlflow


@mlflow.trace(span_type="func", attributes={"key": "value"})
def add_1(x):
    return x + 1


@mlflow.trace(span_type="func", attributes={"key1": "value1"})
def minus_1(x):
    return x - 1


@mlflow.trace(name="Trace Test")
def trace_test(x):
    step1 = add_1(x)
    return minus_1(step1)


trace_test(4)
```

If we look at this trace from within the MLflow UI, we can see the relationship of the call order shown in the structure of the trace.

![Trace Decorator](/images/llms/tracing/trace-decorator.gif)

#### Span Type

Span types are a way to categorize spans within a trace. By default, the span type is set to `"UNKNOWN"` when using the trace decorator. MLflow provides a set of predefined span types for common use cases, while also allowing you to set custom span types.

The following span types are available:
| Span Type | Description |
| --- | --- |
| `"LLM"` | Represents a call to an LLM endpoint or a local model. |
| `"CHAT_MODEL"` | Represents a query to a chat model. This is a special case of an LLM interaction. |
| `"CHAIN"` | Represents a chain of operations. |
| `"AGENT"` | Represents an autonomous agent operation. |
| `"TOOL"` | Represents a tool execution (typically by an agent), such as querying a search engine. |
| `"EMBEDDING"` | Represents a text embedding operation. |
| `"RETRIEVER"` | Represents a context retrieval operation, such as querying a vector database. |
| `"PARSER"` | Represents a parsing operation, transforming text into a structured format. |
| `"RERANKER"` | Represents a re-ranking operation, ordering the retrieved contexts based on relevance. |
| `"UNKNOWN"` | A default span type that is used when no other span type is specified. |
+ +To set a span type, you can pass the `span_type` parameter to the `@mlflow.trace` decorator or +context manager. When you are using [automatic tracing](#automatic-tracing), the span type is automatically set by MLflow. + +```python +import mlflow +from mlflow.entities import SpanType + + +# Using a built-in span type +@mlflow.trace(span_type=SpanType.RETRIEVER) +def retrieve_documents(query: str): + ... + + +# Setting a custom span type +with mlflow.start_span(name="add", span_type="MATH") as span: + span.set_inputs({"x": z, "y": y}) + z = x + y + span.set_outputs({"z": z}) + + print(span.span_type) + # Output: MATH +``` + +#### Context Handler + +The context handler provides a way to create nested traces or spans, which can be useful for capturing complex interactions within your code. +By using the context manager, you can group multiple traced functions under a single parent span, making it easier to understand +the relationships between different parts of your code. + +The context handler is recommended when you need to refine the scope of data capture for a given span. If your code is logically constructed such that +individual calls to services or models are contained within functions or methods, on the other hand, using the decorator approach is more straight-forward +and less complex. + +```python +import mlflow + + +@mlflow.trace +def first_func(x, y=2): + return x + y + + +@mlflow.trace +def second_func(a, b=3): + return a * b + + +def do_math(a, x, operation="add"): + # Use the fluent API context handler to create a new span + with mlflow.start_span(name="Math") as span: + # Specify the inputs and attributes that will be associated with the span + span.set_inputs({"a": a, "x": x}) + span.set_attributes({"mode": operation}) + + # Both of these functions are decorated for tracing and will be associated + # as 'children' of the parent 'span' defined with the context handler + first = first_func(x) + second = second_func(a) + + result = None + + if operation == "add": + result = first + second + elif operation == "subtract": + result = first - second + else: + raise ValueError(f"Unsupported Operation Mode: {operation}") + + # Specify the output result to the span + span.set_outputs({"result": result}) + + return result +``` + +When calling the `do_math` function, a trace will be generated that has the root span (parent) defined as the +context handler `with mlflow.start_span():` call. The `first_func` and `second_func` calls will be associated as child spans +to this parent span due to the fact that they are both decorated functions (having `@mlflow.trace` decorated on the function definition). + +Running the following code will generate a trace. + +```python +do_math(8, 3, "add") +``` + +This trace can be seen within the MLflow UI: + +![Trace within the MLflow UI](/images/llms/tracing/trace-view.png) + +#### Function wrapping + +Function wrapping provides a flexible way to add tracing to existing functions without modifying their definitions. This is particularly useful when +you want to add tracing to third-party functions or functions defined outside of your control. By wrapping an external function with , you can +capture its inputs, outputs, and execution context. 
+ +```python +import math + +import mlflow + +mlflow.set_experiment("External Function Tracing") + + +def invocation(x, y=4, exp=2): + # Initiate a context handler for parent logging + with mlflow.start_span(name="Parent") as span: + span.set_attributes({"level": "parent", "override": y == 4}) + span.set_inputs({"x": x, "y": y, "exp": exp}) + + # Wrap an external function instead of modifying + traced_pow = mlflow.trace(math.pow) + + # Call the wrapped function as you would call it directly + raised = traced_pow(x, exp) + + # Wrap another external function + traced_factorial = mlflow.trace(math.factorial) + + factorial = traced_factorial(int(raised)) + + # Wrap another and call it directly + response = mlflow.trace(math.sqrt)(factorial) + + # Set the outputs to the parent span prior to returning + span.set_outputs({"result": response}) + + return response + + +for i in range(8): + invocation(i) +``` + +The video below shows our external function wrapping runs within the MLflow UI. Note that + +![External Function tracing](/images/llms/tracing/external-trace.gif) + +## Tracing Client APIs + +The MLflow client API provides a comprehensive set of thread-safe methods for manually managing traces. These APIs allow for fine-grained +control over tracing, enabling you to create, manipulate, and retrieve traces programmatically. This section will cover how to use these APIs +to manually trace a model, providing step-by-step instructions and examples. + +### Starting a Trace + +Unlike with the fluent API, the MLflow Trace Client API requires that you explicitly start a trace before adding child spans. This initial API call +starts the root span for the trace, providing a context request_id that is used for associating subsequent spans to the root span. + +To start a new trace, use the method. This method creates a new trace and returns the root span object. + +```python +from mlflow import MlflowClient + +client = MlflowClient() + +# Start a new trace +root_span = client.start_trace("my_trace") + +# The request_id is used for creating additional spans that have a hierarchical association to this root span +request_id = root_span.request_id +``` + +### Adding a Child Span + +Once a trace is started, you can add child spans to it with the API. Child spans allow you to break down the trace into smaller, more manageable segments, +each representing a specific operation or step within the overall process. + +```python +# Create a child span +child_span = client.start_span( + name="child_span", + request_id=request_id, + parent_id=root_span.span_id, + inputs={"input_key": "input_value"}, + attributes={"attribute_key": "attribute_value"}, +) +``` + +### Ending a Span + +After performing the operations associated with a span, you must end the span explicitly using the method. Make note of the two required fields +that are in the API signature: + +- **request_id**: The identifier associated with the root span +- **span_id**: The identifier associated with the span that is being ended + +In order to effectively end a particular span, both the root span (returned from calling `start_trace`) and the targeted span (returned from calling `start_span`) +need to be identified when calling the `end_span` API. +The initiating `request_id` can be accessed from any parent span object's properties. + +:::note +Spans created via the Client API will need to be terminated manually. Ensure that all spans that have been started with the `start_span` API +have been ended with the `end_span` API. 
+::: + +```python +# End the child span +client.end_span( + request_id=child_span.request_id, + span_id=child_span.span_id, + outputs={"output_key": "output_value"}, + attributes={"custom_attribute": "value"}, +) +``` + +### Ending a Trace + +To complete the trace, end the root span using the method. This will also ensure that all associated child +spans are properly ended. + +```python +# End the root span (trace) +client.end_trace( + request_id=request_id, + outputs={"final_output_key": "final_output_value"}, + attributes={"token_usage": "1174"}, +) +``` + +## Searching and Retrieving Traces + +### Searching for Traces + +You can search for traces based on various criteria using the method. This method allows you to filter traces by experiment IDs, +filter strings, and other parameters. + +```python +# Search for traces in specific experiments +traces = client.search_traces( + experiment_ids=["1", "2"], + filter_string="attributes.status = 'OK'", + max_results=5, +) +``` + +Alternatively, you can use fluent API to search for traces, which returns a pandas DataFrame with each row containing a trace. +This method allows you to specify fields to extract from traces using the format `"span_name.[inputs|outputs]"` or `"span_name.[inputs|outputs].field_name"`. +The extracted fields are included as extra columns in the pandas DataFrame. This feature can be used to build evaluation datasets to further improve model and agent performance. + +```python +import mlflow + +with mlflow.start_span(name="span1") as span: + span.set_inputs({"a": 1, "b": 2}) + span.set_outputs({"c": 3, "d": 4}) + +# Search for traces with specific fields extracted +traces = mlflow.search_traces( + extract_fields=["span1.inputs", "span1.outputs.c"], +) + +print(traces) +``` + +This outputs: + +```text + request_id ... span1.inputs span1.outputs.c +0 tr-97c4ef97c21f4348a5698f069c1320f1 ... {'a': 1, 'b': 2} 3.0 +1 tr-4dc3cd5567764499b5532e3af61b9f78 ... {'a': 1, 'b': 2} 3.0 +``` + +### Retrieving a Specific Trace + +To retrieve a specific trace by its request ID, use the method. This method returns the trace object corresponding to the given request ID. + +```python +# Retrieve a trace by request ID +trace = client.get_trace(request_id="12345678") +``` + +## Managing Trace Data + +### Deleting Traces + +You can delete traces based on specific criteria using the method. This method allows you to delete traces by **experiment ID**, +**maximum timestamp**, or **request IDs**. + +:::tip +Deleting a trace is an irreversible process. Ensure that the setting provided within the `delete_traces` API meet the intended range for deletion. +::: + +```python +import time + +# Get the current timestamp in milliseconds +current_time = int(time.time() * 1000) + +# Delete traces older than a specific timestamp +deleted_count = client.delete_traces( + experiment_id="1", max_timestamp_millis=current_time, max_traces=10 +) +``` + +### Setting and Deleting Trace Tags + +Tags can be added to traces to provide additional metadata. Use the method to set a tag on a trace, +and the method to remove a tag from a trace. + +```python +# Set a tag on a trace +client.set_trace_tag(request_id="12345678", key="tag_key", value="tag_value") + +# Delete a tag from a trace +client.delete_trace_tag(request_id="12345678", key="tag_key") +``` + +## Async Logging + +By default, MLflow Traces are logged synchronously. This may introduce a performance overhead when logging Traces, especially when your MLflow Tracking Server is running on a remote server. 
If the performance overhead is a concern for you, you can enable **asynchronous logging** for tracing in MLflow 2.16.0 and later. + +To enable async logging for tracing, call in your code. This will make the trace logging operation non-blocking and reduce the performance overhead. + +```python +import mlflow + +mlflow.config.enable_async_logging() + +# Traces will be logged asynchronously +with mlflow.start_span(name="foo") as span: + span.set_inputs({"a": 1}) + span.set_outputs({"b": 2}) + +# If you don't see the traces in the UI after waiting for a while, you can manually flush the traces +# mlflow.flush_trace_async_logging() +``` + +Note that the async logging does not fully eliminate the performance overhead. Some backend calls still need to be made synchronously and there are other factors such as data serialization. However, async logging can significantly reduce the overall overhead of logging traces, empirically about ~80% for typical workloads. + +## Using OpenTelemetry Collector for Exporting Traces + +Traces generated by MLflow are compatible with the [OpenTelemetry trace specs](https://opentelemetry.io/docs/specs/otel/trace/api/#span). +Therefore, MLflow Tracing supports exporting traces to an OpenTelemetry Collector, which can then be used to export traces to various backends such as Jaeger, Zipkin, and AWS X-Ray. + +By default, MLflow exports traces to the MLflow Tracking Server. To enable exporting traces to an OpenTelemetry Collector, set the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable (or `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT`) to the target URL of the OpenTelemetry Collector **before starting any trace**. + +```python +import mlflow +import os + +# Set the endpoint of the OpenTelemetry Collector +os.environ["OTEL_EXPORTER_OTLP_TRACES_ENDPOINT"] = "http://localhost:4317/v1/traces" +# Optionally, set the service name to group traces +os.environ["OTEL_SERVICE_NAME"] = "" + +# Trace will be exported to the OTel collector at http://localhost:4317/v1/traces +with mlflow.start_span(name="foo") as span: + span.set_inputs({"a": 1}) + span.set_outputs({"b": 2}) +``` + +:::warning +MLflow only exports traces to a single destination. When the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable is configured, MLflow will **not** export traces to the MLflow Tracking Server and you will not see traces in the MLflow UI. + +Similarly, if you deploy the model to the [Databricks Model Serving with tracing enabled](https://docs.databricks.com/en/mlflow/mlflow-tracing.html#use-mlflow-tracing-in-production), using the OpenTelemetry Collector will result in traces not being recorded in the Inference Table. +::: + +### Configurations + +MLflow uses the standard OTLP Exporter for exporting traces to OpenTelemetry Collector instances. Thereby, you can use [all of the configurations](https://opentelemetry.io/docs/languages/sdk-configuration/otlp-exporter/) supported by OpenTelemetry. The following example configures the OTLP Exporter to use HTTP protocol instead of the default gRPC and sets custom headers: + +```bash +export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="http://localhost:4317/v1/traces" +export OTEL_EXPORTER_OTLP_TRACES_PROTOCOL="http/protobuf" +export OTEL_EXPORTER_OTLP_TRACES_HEADERS="api_key=12345" +``` + +## FAQ + +### Q: Can I disable and re-enable tracing globally? + +Yes. 
+ +There are two fluent APIs that are used for blanket enablement or disablement of the MLflow Tracing feature in order to support +users who may not wish to record interactions with their trace-enabled models for a brief period, or if they have concerns about long-term storage +of data that was sent along with a request payload to a model in interactive mode. + +To **disable** tracing, the API will cease the collection of trace data from within MLflow and will not log +any data to the MLflow Tracking service regarding traces. + +To **enable** tracing (if it had been temporarily disabled), the API will re-enable tracing functionality for instrumented models +that are invoked. + +### Q: How can I associate a trace with an MLflow Run? + +If a trace is generated within a run context, the recorded traces to an active Experiment will be associated with the active Run. + +For example, in the following code, the traces are generated within the `start_run` context. + +```python +import mlflow + +# Create and activate an Experiment +mlflow.set_experiment("Run Associated Tracing") + +# Start a new MLflow Run +with mlflow.start_run() as run: + # Initiate a trace by starting a Span context from within the Run context + with mlflow.start_span(name="Run Span") as parent_span: + parent_span.set_inputs({"input": "a"}) + parent_span.set_outputs({"response": "b"}) + parent_span.set_attribute("a", "b") + # Initiate a child span from within the parent Span's context + with mlflow.start_span(name="Child Span") as child_span: + child_span.set_inputs({"input": "b"}) + child_span.set_outputs({"response": "c"}) + child_span.set_attributes({"b": "c", "c": "d"}) +``` + +When navigating to the MLflow UI and selecting the active Experiment, the trace display view will show the run that is associated with the trace, as +well as providing a link to navigate to the run within the MLflow UI. See the below video for an example of this in action. + +![Tracing within a Run Context](/images/llms/tracing/run-trace.gif) + +You can also programmatically retrieve the traces associated to a particular Run by using the method. + +```python +from mlflow import MlflowClient + +client = MlflowClient() + +# Retrieve traces associated with a specific Run +traces = client.search_traces(run_id=run.info.run_id) + +print(traces) +``` + +### Q: Can I use the fluent API and the client API together? + +You definitely can. However, the Client API is much more verbose than the fluent API and is designed for more complex use cases where you need +to control asynchronous tasks for which a context manager will not have the ability to handle an appropriate closure over the context. + +Mixing the two, while entirely possible, is not generally recommended. 
+ +For example, the following will work: + +```python +import mlflow + +# Initiate a fluent span creation context +with mlflow.start_span(name="Testing!") as span: + # Use the client API to start a child span + child_span = client.start_span( + name="Child Span From Client", + request_id=span.request_id, + parent_id=span.span_id, + inputs={"request": "test input"}, + attributes={"attribute1": "value1"}, + ) + + # End the child span + client.end_span( + request_id=span.request_id, + span_id=child_span.span_id, + outputs={"response": "test output"}, + attributes={"attribute2": "value2"}, + ) +``` + +![Using Client APIs within fluent context](/images/llms/tracing/client-with-fluent.png) + +:::warning +Using the fluent API to manage a child span of a client-initiated root span or child span is not possible. +Attempting to start a `start_span` context handler while using the client API will result in two traces being created, +one for the fluent API and one for the client API. +::: + +### Q: How can I add custom metadata to a span? + +There are several ways. + +#### Fluent API + +1. Within the constructor itself. + +```python +with mlflow.start_span( + name="Parent", attributes={"attribute1": "value1", "attribute2": "value2"} +) as span: + span.set_inputs({"input1": "value1", "input2": "value2"}) + span.set_outputs({"output1": "value1", "output2": "value2"}) +``` + +2. Using the `set_attribute` or `set_attributes` methods on the `span` object returned from the `start_span` returned object. + +```python +with mlflow.start_span(name="Parent") as span: + # Set multiple attributes + span.set_attributes({"attribute1": "value1", "attribute2": "value2"}) + # Set a single attribute + span.set_attribute("attribute3", "value3") +``` + +#### Client API + +1. When starting a span, you can pass in the attributes as part of the `start_trace` and `start_span` method calls. + +```python +parent_span = client.start_trace( + name="Parent Span", + attributes={"attribute1": "value1", "attribute2": "value2"} +) + +child_span = client.start_span( + name="Child Span", + request_id=parent_span.request_id, + parent_id=parent_span.span_id, + attributes={"attribute1": "value1", "attribute2": "value2"} +) +``` + +2. Utilize the `set_attribute` or `set_attributes` APIs directly on the `Span` objects. + +```python +parent_span = client.start_trace( + name="Parent Span", attributes={"attribute1": "value1", "attribute2": "value2"} +) + +# Set a single attribute +parent_span.set_attribute("attribute3", "value3") +# Set multiple attributes +parent_span.set_attributes({"attribute4": "value4", "attribute5": "value5"}) +``` + +3. Set attributes when ending a span or the entire trace. + +```python +client.end_span( + request_id=parent_span.request_id, + span_id=child_span.span_id, + attributes={"attribute1": "value1", "attribute2": "value2"}, +) + +client.end_trace( + request_id=parent_span.request_id, + attributes={"attribute3": "value3", "attribute4": "value4"}, +) +``` + +### Q: How can I see the stack trace of a Span that captured an Exception? + +The MLflow UI does not display Exception types, messages, or stacktraces if faults occur while logging a trace. +However, the trace does contain this critical debugging information as part of the Span objects that comprise the Trace. + +The simplest way to retrieve a particular stack trace information from a span that endured an exception is to retrieve the trace directly in +an interactive environment (such as a Jupyter Notebook). 
+ +Here is an example of intentionally throwing an Exception while a trace is being collected and a simple way to view the exception details: + +```python +import mlflow + +experiment = mlflow.set_experiment("Intentional Exception") + +with mlflow.start_span(name="A Problematic Span") as span: + span.set_inputs({"input": "Exception should log as event"}) + span.set_attribute("a", "b") + raise Exception("Intentionally throwing!") + span.set_outputs({"This": "should not be recorded"}) +``` + +When running this, an Exception will be thrown, as expected. However, a trace is still logged to the active experiment and can be retrieved as follows: + +```python +from pprint import pprint + +trace = mlflow.get_trace(span.request_id) +trace_data = trace.data +pprint(trace_data.to_dict(), indent=1) # Minimum indent due to depth of Span object +``` + +In an interactive environment, such as a Jupyter Notebook, the `stdout` return will render an output like this: + +```text +{'spans': [{'name': 'A Span', + 'context': {'span_id': '0x896ff177c0942903', + 'trace_id': '0xcae9cb08ec0a273f4c0aab36c484fe87'}, + 'parent_id': None, + 'start_time': 1718063629190062000, + 'end_time': 1718063629190595000, + 'status_code': 'ERROR', + 'status_message': 'Exception: Intentionally throwing!', + 'attributes': {'mlflow.traceRequestId': '"7d418211df5945fa94e5e39b8009039e"', + 'mlflow.spanType': '"UNKNOWN"', + 'mlflow.spanInputs': '{"input": "Exception should log as event"}', + 'a': '"b"'}, + 'events': [{'name': 'exception', + 'timestamp': 1718063629190527000, + 'attributes': {'exception.type': 'Exception', + 'exception.message': 'Intentionally throwing!', + 'exception.stacktrace': 'Traceback (most recent call last):\n + File "/usr/local/lib/python3.8/site-packages/opentelemetry/trace/__init__.py", + line 573, in use_span\n + yield span\n File "/usr/local/mlflow/mlflow/tracing/fluent.py", + line 241, in start_span\n + yield mlflow_span\n File "/var/folders/cd/n8n0rm2x53l_s0xv_j_xklb00000gp/T/ipykernel_9875/4089093747.py", + line 4, in \n + raise Exception("Intentionally throwing!")\nException: Intentionally throwing!\n', + 'exception.escaped': 'False'}}]}], + 'request': '{"input": "Exception should log as event"}', + 'response': None +} +``` + +The `exception.stacktrace` attribute contains the full stack trace of the Exception that was raised during the span's execution. + +Alternatively, if you were to use the MLflowClient API to search traces, the access to retrieve the span's event data from the failure would be +slightly different (due to the return value being a `pandas` DataFrame). To use the `search_traces` API to access the same exception data would +be as follows: + +```python +import mlflow + +client = mlflow.MlflowClient() + +traces = client.search_traces( + experiment_ids=[experiment.experiment_id] +) # This returns a pandas DataFrame +pprint(traces["trace"][0].data.spans[0].to_dict(), indent=1) +``` + +The stdout values that will be rendered from this call are identical to those from the example span data above. diff --git a/docs/docs/llms/tracing/overview.mdx b/docs/docs/llms/tracing/overview.mdx new file mode 100644 index 0000000000000..af2ab89e81afb --- /dev/null +++ b/docs/docs/llms/tracing/overview.mdx @@ -0,0 +1,132 @@ +--- +sidebar_position: 2 +--- + +# Tracing Concepts + +In this guide, you can learn about what tracing is as it applies to Generative AI (GenAI) applications and what the main components of tracing are. + +
+ ![MLflow Tracing](/images/intro/tracing-ui.gif) +
+ +A good companion to the explanations in this guide is the [Tracing Schema](./tracing-schema) guide which will show how MLflow Tracing constructs the +concepts discussed here. + +## What is tracing? + +Tracing in the context of machine learning (ML) refers to the detailed tracking and recording of the data flow and processing steps during the execution of an ML model. +It provides transparency and insights into each stage of the model's operation, from data input to prediction output. This detailed tracking is crucial for debugging, +optimizing, and understanding the performance of ML models. + +### Traditional Machine Learning + +In traditional ML, the inference process is relatively straightforward. When a request is made, the input data is fed into the model, which processes the data and generates a prediction. + +The diagram below illustrates the relationship between the input data, the model serving interface, and the model itself. + +
+ ![Traditional ML Inference + Architecture](/images/llms/tracing/tracing-traditional-ml.png) +
+ +This process is wholly visible, meaning both the input and output are clearly defined and understandable to the end-user. For example, in a spam detection model, the input is an email, +and the output is a binary label indicating whether the email is spam or not. The entire inference process is transparent, making it easy to determine what data was sent and what prediction was returned, +rendering full tracing a largely irrelevant process within the context of qualitative model performance. + +However, tracing might be included as part of a deployment configuration to provide additional insights into the nature of processing the requests made to the server, the latency of the model's prediction, +and for logging API access to the system. For this classical form of trace logging, in which metadata associated with the inference requests from a latency and performance perspective are monitored and logged, these logs +are not typically used by model developers or data scientists to understand the model's operation. + +### Concept of a Span + +In the context of tracing, a span represents a single operation within the system. It captures metadata such as the start time, end time, and other contextual information about the operation. Along with the metadata, the +inputs that are provided to a unit of work (such as a call to a GenAI model, a retrieval query from a vector store, or a function call), as well as the output from the operation, are recorded. + +The diagram below illustrates a call to a GenAI model and the collection of relevant information within a span. The span includes metadata such as the start time, end time, and the request arguments, as well as the input and output of the invocation call. + +
+ ![Span Structure](/images/llms/tracing/span-anatomy.png) +
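+
+To make this concrete, the following is a minimal sketch of how a single span can record these elements using MLflow's fluent `mlflow.start_span` API.
+The span name, attribute values, and placeholder response below are illustrative only:
+
+```python
+import mlflow
+
+# Each start_span context produces one span: a named unit of work with a
+# start and end time, inputs, outputs, and arbitrary metadata attributes.
+with mlflow.start_span(name="query_llm") as span:
+    span.set_inputs({"prompt": "What is MLflow Tracing?"})
+    span.set_attributes({"model": "example-model", "temperature": 0.1})
+
+    # ... the call to the GenAI model would happen here ...
+    response = "MLflow Tracing records the execution of GenAI applications."
+
+    span.set_outputs({"response": response})
+```
+
+Nesting additional `start_span` contexts within this one would produce child spans, which together form the trace described next.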
+ +### Concept of a Trace + +A trace in the context of GenAI tracing is a collection of Directed Acyclic Graph (DAG)-like Span events that are asynchronously called and recorded in a processor. Each span represents a single operation within +the system and includes metadata such as start time, end time, and other contextual information. These spans are linked together to form a trace, which provides a comprehensive view of the end-to-end process. + +- **DAG-like Structure**: The DAG structure ensures that there are no cycles in the sequence of operations, making it easier to understand the flow of execution. +- **Span Information**: Each span captures a discrete unit of work, such as a function call, a database query, or an API request. Spans include metadata that provides context about the operation. +- **Hierarchical Association**: Spans mirror the structure of your applications, allowing you to see how different components interact and depend on each other. + +By collecting and analyzing these spans, one can trace the execution path, identify bottlenecks, and understand the dependencies and interactions between different components of the system. This level of +visibility is crucial for diagnosing issues, optimizing performance, and ensuring the robustness of GenAI applications. + +To illustrate what an entire trace can capture in a RAG application, see the illustration below. + +
+ ![Tracing in a nutshell](/images/llms/tracing/trace-concept.png) +
+
+The subsystems involved in this application are critical to the quality and relevance of the system. Having no visibility into the paths that data follows on its way to the final-stage LLM
+creates an application whose quality can only be assured through a high degree of monotonous, tedious, and expensive manual validation of each piece in isolation.
+
+### GenAI ChatCompletions Use Case
+
+In Generative AI (GenAI) applications, such as chat completions, tracing becomes far more important for the developers of models and GenAI-powered applications. These use cases involve generating human-like text
+based on input prompts. While not nearly as complex as GenAI applications that involve agents or information retrieval to augment a GenAI model, a chat interface can still benefit from tracing. Enabling tracing on each interaction
+with a GenAI model via a chat session allows for evaluating the entire contextual history, prompt, input, and configuration parameters along with the output, encapsulating the full context of the request payload that has been
+submitted to the GenAI model.
+
+As an example, the illustration below shows the nature of a ChatCompletions interface used for connecting a model, hosted in a deployment server, to an external GenAI service.
+
+ ![GenAI ChatCompletions + Architecture](/images/llms/tracing/chat-completions-architecture.png) +
+ +Additional metadata surrounding the inference process is useful for various reasons, including billing, performance evaluation, relevance, evaluation of hallucinations, and general debugging. Key metadata includes: + +- **Token Counts**: The number of tokens processed, which affects billing. +- **Model Name**: The specific model used for inference. +- **Provider Type**: The service or platform providing the model. +- **Query Parameters**: Settings such as temperature and top-k that influence the generation process. +- **Query Input**: The request input (user question). +- **Query Response**: The system-generated response to the input query, utilizing the query parameters to adjust generation. + +This metadata helps in understanding how different settings affect the quality and performance of the generated responses, aiding in fine-tuning and optimization. + +### Advanced Retrieval-Augmented Generation (RAG) Applications + +In more complex applications like Retrieval-Augmented Generation (RAG), tracing is essential for effective debugging and optimization. RAG involves multiple stages, including document retrieval and interaction with GenAI models. +When only the input and output are visible, it becomes challenging to identify the source of issues or opportunities for improvement. + +For example, if a GenAI system generates an unsatisfactory response, the problem might lie in: + +- **Vector Store Optimization**: The efficiency and accuracy of the document retrieval process. +- **Embedding Model**: The quality of the model used to encode and search for relevant documents. +- **Reference Material**: The content and quality of the documents being queried. + +Tracing allows each step within the RAG pipeline to be investigated and adjudicated for quality. By providing visibility into every stage, tracing helps pinpoint where adjustments are needed, whether in the +retrieval process, the embedding model, or the content of the reference material. + +For example, the diagram below illustrates the complex interactions that form a simple RAG application, wherein the GenAI model is called repeatedly with additional retrieved data that guides the final output generation response. + +
+ ![RAG Architecture](/images/llms/tracing/rag-architecture.png) +
+ +Without tracing enabled on such a complex system, it is challenging to identify the root cause of issues or bottlenecks. The following steps would effectively be a "black box": + +1. **Embedding of the input query** +2. **The return of the encoded query vector** +3. **The vector search input** +4. **The retrieved document chunks from the Vector Database** +5. **The final input to the GenAI model** + +Diagnosing correctness issues with responses in such a system without these 5 critical steps having instrumentation configured to capture the inputs, outputs, and metadata associated with each request +creates a challenging scenario to debug, improve, or refine such an application. When considering performance tuning for responsiveness or cost, not having the visibility into latencies for each of these +steps presents an entirely different challenge that would require the configuration and manual instrumentation of each of these services. + +## Getting Started with Tracing in MLflow + +To learn how to utilize tracing in MLflow, see the [MLflow Tracing Guide](./). diff --git a/docs/docs/llms/tracing/tracing-schema.mdx b/docs/docs/llms/tracing/tracing-schema.mdx new file mode 100644 index 0000000000000..33624edef8af0 --- /dev/null +++ b/docs/docs/llms/tracing/tracing-schema.mdx @@ -0,0 +1,516 @@ +--- +sidebar_position: 3 +--- + +import { APILink } from "@site/src/components/APILink"; +import Tabs from "@theme/Tabs"; +import TabItem from "@theme/TabItem"; + +# MLflow Tracing Schema + +This document provides a detailed view of the schema for traces and its ingredients. MLflow traces are **compatible to OpenTelemetry specs**, +but we also define a few additional layers of structure upon the OpenTelemetry Spans to provide additional metadata about the trace. + +## Structure of Traces + +**TL;DR**: `Trace = TraceInfo + TraceData` where `TraceData = List[Span]` + + + + #### Trace Structure + + A `Trace` in MLflow consists of two components: + `Trace Info` and + `Trace Data`. + + The metadata that aids in explaining the origination + of the trace, the status of the trace, and the information about the total execution time is stored within the Trace Info. The Trace + Data is comprised entirely of the instrumented `Span` + objects that make up the core of the trace. + +
+ ![Trace Architecture](/images/llms/tracing/schema/trace_architecture.png) +
+ +
+ + #### Trace Info Structure + + The Trace Info within MLflow's tracing feature aims to provide a lightweight snapshot of critical data about the overall trace. + This includes the logistical information about the trace, such as the experiment_id, providing the storage location for the trace, + as well as trace-level data such as start time and total execution time. The Trace Info also includes tags and status information for + the trace as a whole. + +
+ ![Trace Info Architecture](/images/llms/tracing/schema/trace_info_architecture.png) +
+ +
+ + #### Trace Data Structure + + The Trace Data within MLflow's tracing feature provides the core of the trace information. Within this object is a list of + `Span` objects that represent the individual steps of the trace. + These spans are associated with one another in a hierarchical relationship, providing a clear order-of-operations linkage of what + happened within your application during the trace. + +
+ ![Trace Data Architecture](/images/llms/tracing/schema/trace_data_architecture.png) +
+ +
+ + #### Span Structure + + The Span object within MLflow's tracing feature provides detailed information about the individual steps of the trace. + It complies to the [OpenTelemetry Span spec](https://opentelemetry.io/docs/concepts/signals/traces/#spans). + Each Span object contains information about the step being instrumented, including the span_id, name, start_time, parent_id, status, + inputs, outputs, attributes, and events. + +
+ ![Span Architecture](/images/llms/tracing/schema/span_architecture.png) +
+ +
+
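+
+As a quick illustration of this structure, a logged trace can be retrieved and its two components inspected directly.
+This is a minimal sketch; the span name and example values are illustrative only:
+
+```python
+import mlflow
+
+# Generate a simple single-span trace
+with mlflow.start_span(name="example") as span:
+    span.set_inputs({"question": "hello"})
+    span.set_outputs({"answer": "world"})
+
+# Fetch the logged trace and inspect its two components
+trace = mlflow.get_trace(span.request_id)
+
+trace_info = trace.info  # TraceInfo: request_id, timestamps, status, tags
+trace_data = trace.data  # TraceData: the list of Span objects plus request/response
+
+print(trace_info.status)
+print(len(trace_data.spans))
+```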
+ +## Trace + +A trace is a root object composed of two components: + +- `TraceInfo` +- `TraceData` + +:::tip +Check the API documentation for helper methods on these dataclass objects for more information on how to convert or extract data from them. +::: + +## Trace Info + +Trace Info is a dataclass object that contains metadata about the trace. This metadata includes information about the trace's origin, status, and +various other data that aids in retrieving and filtering traces when used with and for +navigation of traces within the MLflow UI. + +To learn more about how `TraceInfo` metadata is used for searching, you can see examples [here](./#searching-and-retrieving-traces). + +The data that is contained in the `TraceInfo` object is used to populate the trace view page within the MLflow tracking UI, as shown below. + +![TraceInfo as it is used in the MLflow UI](/images/llms/tracing/schema/trace_info_in_ui.png) + +The primary components of MLflow `TraceInfo` objects are listed below. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
**Property****Description****Note**
**request_id** + A unique identifier for the trace. The identifier is used within MLflow + and integrated system to resolve the event being captured and to provide + associations for external systems to map the logged trace to the + originating caller. + + { + "This value is generated by the tracing backend and is immutable. Within the tracing client APIs, you will need to deliberately pass this value to the " + } + + {"span creation API"} + + {" to ensure that a given span is associated with a trace."} +
**experiment_id** + The ID of the experiment in which the trace was logged. All logged + traces are associated with the current active experiment when the trace + is generated (during invocation of an instrumented object). + + { + "This value is immutable and is set by the tracing backend. It is a system-controlled value that is very useful when using the " + } + + {"Search Traces"} + + {" API."} +
**timestamp_ms**
      The time that marks the moment when the root span of the trace was
      created. This is a Unix timestamp in milliseconds.
    
      The time reflected in this property is the time at which the trace was
      created, not the time at which a request to your application was made.
      As such, it does not take into account the time it took for the request
      to reach the environment in which your application is being served,
      which may introduce additional latency to the total round trip time,
      depending on network configurations.
**execution_time_ms** + The time that marks the moment when the call to end the trace is made. + This is a Unix timestamp in milliseconds. + + This time does not include the networking time associated with sending + the response from the environment that generates the trace to the + environment that is consuming the application’s invocation result. +
**status**An enumerated value that denotes the status of the trace. + `TraceStatus` values are one of: + + - **OK** - The trace and the + instrumented call were successful. + - **ERROR** - An error occurred + while an application was being instrumented. The error can be seen + within the span data for the trace. + - **IN_PROGRESS** - The trace has + started and is currently running. Temporary state that will update while + spans are being logged to a trace. + - **TRACE_STATUS_UNSPECIFIED** - + internal default state that should not be seen in logged traces. +
**request_metadata** + The request metadata are additional key-value pairs of information that + are associated with the Trace, set and modified by the tracing backend. + + These are not open for addition or modification by the user, but can + provide additional context about the trace, such as an MLflow `run_id` + that is associated with the trace. +
**tags** + User-defined key-value pairs that can be applied to a trace for applying + additional context, aid in [search + functionality](./#searching-and-retrieving-traces), or to + provide additional information during the creation or after the + successful logging of a trace. + + These tags are fully mutable and can be changed at any time, even long + after a trace has been logged to an experiment. +
+ +## Trace Data + +The MLflow `TraceData` object is a dataclass object that holds the core of the trace data. +This object contains the following elements: + + + + + + + + + + + + + + + + + + + + + + + + + + +
**Property****Description****Note**
**request** + The `request` property is the input data for the entire trace. The input + `str` is a JSON-serialized string that contains the input data for the + trace, typically the end-user request that was submitted as a call to + the application. + + Due to the varied structures of inputs that could go to a given + application that is being instrumented by MLflow Tracing, all inputs are + JSON serialized for compatibility’s sake. This allows for the input data + to be stored in a consistent format, regardless of the input data’s + structure. +
**spans** + {"This property is a list of "} + {"Span"} + {" objects that represent the individual steps of the trace."} + + For further information on the structure of Span objects, see the + section below. +
**response** + The response property is the final output data that will be returned to + the caller of the invocation of the application. + + Similar to the request property, this value is a JSON-serialized string + to maximize compatibility of disparate formats. +
+
+## Span Schema
+
+Spans are the core of the trace data. They record the critical data about each of the steps within your GenAI application.
+
+When you view your traces within the MLflow UI, you're looking at a collection of spans, as shown below.
+
+![Spans within the MLflow UI](/images/llms/tracing/schema/spans_in_mlflow_ui.png)
+
+The sections below provide a detailed view of the structure of a span.
+
**Property****Description****Note**
**inputs** + The inputs are stored as JSON-serialized strings, representing the + input data that is passed into the particular stage (step) of your + application. Due to the wide variety of input data that can be passed + between specific stages of a GenAI application, this data may be + extremely large (such as when using the output of a vector store + retrieval step). + + Reviewing the Inputs, along with the Outputs, of individual stages can + dramatically increase the ability to diagnose and debug issues that + exist with responses coming from your application. +
**outputs** + The outputs are stored as JSON-serialized strings, representing the + output data that is passed out of the particular stage (step) of your + application. + + Just as with the Inputs, the Outputs can be quite large, depending on + the complexity of the data that is being passed between stages. +
**attributes**
      Attributes are metadata associated with a given step within your
      application. These key-value pairs record the parameters that shaped a
      function or method call, giving insight into how modifying them can
      affect the performance of your application.
    
      Common examples of attributes that could be associated with a given
      span include:

      - **model**
      - **temperature**
      - **document_count**

      These attributes provide additional context and insight into the results that are present in the outputs property for the span.
**events** + Events are a system-level property that is optionally applied to a span + only if there was an issue during the execution of the span. These + events contain information about exceptions that were thrown in the + instrumented call, as well as the stack trace. + + {"This data is structured within a "} + {"SpanEvent"} + {" object, containing the properties:"} + - **name** + - **timestamp** + - **attributes** + + **The attributes** property contains the stack trace of the exception that was thrown during the execution of the span if such an error occurred during execution. +
**parent_id**
      The `parent_id` property is an identifier that establishes the
      hierarchical association of a given span with its parent span. This is
      used to establish the chain of spans, helping to determine which step
      followed another step in the execution of the application.
    
      Every span other than the root span **must** have a `parent_id` set; the root span's `parent_id` is `None`.
**span_id** + The `span_id` is a unique identifier that is generated for each span + within a trace. This identifier is used to disambiguate spans from one + another and allow for proper association of the span within the + sequential execution of other spans within a trace. + A `span_id` is set when a span is created and is immutable.
**request_id**
      The `request_id` property is a unique identifier that is generated for
      each **trace** and is propagated to each span that is a member of that trace.
      The `request_id` is a system-generated property and is immutable.
**name**
      The name of the span is either user-defined (optionally, when using the
      fluent and client APIs) or automatically generated through callback
      integrations or when the name argument is omitted in calls to the fluent
      or client APIs. If the name is not overridden, it will be generated based
      on the name of the function or method that is being instrumented.
    
      It is recommended to provide a name for your span that is unique and
      relevant to the functionality that is being executed when using manual
      instrumentation via the client or fluent APIs. Generic or confusing span
      names can make it difficult to diagnose issues when reviewing traces.
**status**
      The status of a span is reflected in a value from the enumeration object
      `SpanStatusCode`. The span status object contains an optional description
      if the `status_code` reflects an error that occurred. The values that
      the status may have are:

      - **OK** - The span and the underlying instrumented
      call were successful.
      - **UNSET** - The status of the span hasn't been set yet
      (this is the default value and should not be seen in logged trace
      events).
      - **ERROR** - An issue happened within the call being instrumented.
      The `description` property will contain additional information about the
      error that occurred.
    
      Evaluating the status of spans can greatly reduce the amount of time and
      effort required to diagnose issues with your applications.
**start_time_ns**The unix timestamp (in nanoseconds) when the span was started. + The precision of this property is higher than that of the trace start + time, allowing for more granular analysis of the execution time of very + short-lived spans. +
**end_time_ns**The unix timestamp (in nanoseconds) when the span was ended. + This precision is higher than the trace timestamps, similar to the + `start_time_ns` timestamp above. +
+ +## Schema for specific span types + +MLflow has a set of 10 predefined types of spans (see mlflow.entities.SpanType), and +certain span types have properties that are required in order to enable additional functionality +within the UI and downstream tasks such as evaluation. + +### Retriever Spans + +The `RETRIEVER` span type is used for operations involving retrieving data from a data store (for example, querying +documents from a vector store). The `RETRIEVER` span type has the following schema: + + + + + + + + + + + + + + + + + + + + + + + + + + + +
**Property****Description****Note**
**Input**There are no restrictions on the span inputs
**Output** + {"The output must be of type "} + + {"List["} + + {"mlflow.entities.Document"} + + {"]"} + + { + ", or a dict matching the structure of the dataclass*. The dataclass contains the following properties:" + } + + - **id** (`Optional[str]`) - An optional unique identifier for the document. + - **page_content** (`str`) - The text content of the document. + - **metadata** (`Optional[Dict[str,any]]`) - The metadata associated with the document. There are two important metadata keys that are reserved for the MLflow UI and evaluation metrics: + - `"doc_uri" (str)`: The URI for the document. This is used for rendering a link in the UI. + - `"chunk_id" (str)`: If your document is broken up into chunks in your data store, this key can be used to + identify the chunk that the document is a part of. This is used by some evaluation metrics. + + This output structure is guaranteed to be provided if the traces are generated via MLflow autologging for the LangChain and LlamaIndex flavors. + By conforming to this specification, `RETRIEVER` spans will be rendered in a more user-friendly manner in the MLflow UI, and downstream tasks + such as evaluation will function as expected. +
**Attributes**There are no restrictions on the span attributes
+ +\* For example, both `[Document(page_content="Hello world", metadata={"doc_uri": "https://example.com"})]` and +`[{"page_content": "Hello world", "metadata": {"doc_uri": "https://example.com"}}]` are valid outputs for a `RETRIEVER` span. diff --git a/docs/docs/plugins/index.mdx b/docs/docs/plugins/index.mdx index 3d55a219a4906..2737c8109ae85 100644 --- a/docs/docs/plugins/index.mdx +++ b/docs/docs/plugins/index.mdx @@ -2,6 +2,6 @@ sidebar_position: 17 --- -# MLFlow Plugins +# MLflow Plugins {/* WIP */} diff --git a/docs/src/css/custom.css b/docs/src/css/custom.css index 2ba86b02f201b..760850948b0d0 100644 --- a/docs/src/css/custom.css +++ b/docs/src/css/custom.css @@ -120,3 +120,29 @@ .padding-md { padding: var(--padding-md); } + +/* Most of our images are PNGs with transparent +backgrounds so they look bad in dark mode */ +main img { + background-color: white; +} + +.center-div { + margin-inline: auto; +} + +/* We need to this class to limit image size using max-height. The class +must be applied to the wrapping div along with the desired max-height */ +.max-height-img-container { + display: flex; + + p { + display: flex; + justify-content: center; + } + + img { + max-height: 100%; + width: auto; + } +}