Distributed tracing in MLflow
We're excited to announce that MLflow 3.9.0 brings distributed tracing, which lets you connect spans from multiple services into a single trace.
What's Distributed Tracing?
When your application spans multiple services, you may want to connect the spans recorded by those services into a single trace so that you can track the end-to-end execution in one place. MLflow supports this via distributed tracing: it propagates the active trace context over HTTP so that spans recorded in different services can be connected into a single trace.
Typical use cases:
- Multi-agent orchestration: Track how multiple specialized agents (planner, researcher, coder) collaborate and identify which agent caused a failure or bottleneck.
- Remote tool call chain observation: Track spans generated by nested remote tool calls, and identify which remote tool caused a failure or bottleneck.
Using MLflow distributed tracing
The following is a simple example of distributed tracing with an LLM application made up of two services: a client and a server. The client creates the trace and the parent span, while the server adds a nested span. To achieve this, the trace context (including the trace ID and the parent span ID) is formatted according to the W3C Trace Context specification and passed in the request headers from the client to the server. MLflow provides two APIs that simplify fetching the headers on the client and ingesting them on the server (an illustrative header is shown after the list):
- Use the mlflow.tracing.get_tracing_context_headers_for_http_request API in the client to fetch the headers.
- Use the mlflow.tracing.set_tracing_context_from_http_request_headers API in the server to extract the trace and span information from the request headers and set it as the current trace context.
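Under the hood, the context rides in the headers defined by the W3C Trace Context specification. As a rough illustration (the values below are made up, and MLflow may attach additional headers of its own), the propagated headers look like this:

```python
# Shape of the propagated context. The traceparent header packs a version,
# the trace ID, the parent span ID, and trace flags into one hex string:
#   traceparent: <version>-<trace-id>-<parent-span-id>-<trace-flags>
headers = {
    "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
}
```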
Client example
```python
import requests

import mlflow
from mlflow.tracing import get_tracing_context_headers_for_http_request

with mlflow.start_span("client-root"):
    # Serialize the active trace context (trace ID and parent span ID)
    # into W3C Trace Context request headers.
    headers = get_tracing_context_headers_for_http_request()
    # Pass the headers along so the server can join this trace.
    requests.post(
        "https://your.service/handle", headers=headers, json={"input": "hello"}
    )
```
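Note that the headers are derived from the currently active span, so get_tracing_context_headers_for_http_request must be called inside the mlflow.start_span block; outside of an active trace there is no context to propagate.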
Server handler example
```python
import mlflow
from flask import Flask, request

from mlflow.tracing import set_tracing_context_from_http_request_headers

app = Flask(__name__)


@app.post("/handle")
def handle():
    headers = dict(request.headers)
    # Restore the caller's trace context from the incoming headers so that
    # spans created below attach to the client's trace.
    with set_tracing_context_from_http_request_headers(headers):
        # This span becomes a child of the client's "client-root" span.
        with mlflow.start_span("server-handler") as span:
            # ... your logic ...
            span.set_attribute("status", "ok")
    return {"ok": True}
```
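One assumption worth calling out: both services must log to the same tracking backend, or the server's span cannot be stitched onto the client's trace. A minimal sketch, assuming a local tracking server and a placeholder experiment name:

```python
import mlflow

# Run this in both the client and the server process so that spans from
# each service land in the same backend and join the same trace.
# The URI and experiment name are placeholders for this sketch.
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("distributed-tracing-demo")
```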
Databricks configuration
If your MLflow tracking is set up with Databricks, the trace destination must be set to Unity Catalog for distributed tracing to work. Please refer to Store MLflow traces in Unity Catalog for details.
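As a rough sketch (the exact setup depends on your workspace; see the linked guide for configuring the Unity Catalog trace destination), pointing both services at Databricks might look like:

```python
import mlflow

# Authentication is assumed to be configured via the DATABRICKS_HOST and
# DATABRICKS_TOKEN environment variables; the experiment path is a placeholder.
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/distributed-tracing-demo")
# The trace destination must additionally point to Unity Catalog; see the
# "Store MLflow traces in Unity Catalog" guide for that configuration.
```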
Learn More
Ready to get started with distributed tracing in MLflow? Check out the MLflow Tracing documentation to learn more.
Join the Community
We're excited about the possibilities this feature opens up and would love to hear your feedback and contributions.
For those interested in sharing knowledge, we invite you to collaborate on the MLflow website. Whether it's writing tutorials, sharing use cases, or providing feedback, every contribution enriches the MLflow community.
