Zero-Code LLM Instrumentation on Kubernetes with the OpenLIT Operator

TL;DR: Add the label openlit.io/instrument: "true" to your Kubernetes pods and the OpenLIT Operator automatically injects LLM instrumentation — no code changes, no redeployments, no SDK installation. Every OpenAI, Anthropic, or LangChain call in your Python services starts producing OpenTelemetry traces.


The Problem: Instrumenting Dozens of Services Is Tedious

You have 15 microservices. Seven of them make LLM calls. Three use OpenAI, two use Anthropic, one uses LangChain, and one uses LiteLLM.

The normal approach: go into each service, run pip install openlit, add openlit.init() to the entry point, configure the OTLP endpoint, test it, deploy it. Seven PRs, seven code reviews, seven deployments.
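For scale, the boilerplate each of those seven services would need looks roughly like this (a sketch: the endpoint and names are placeholders, and the import is guarded so the snippet degrades gracefully where the SDK isn't installed):

```python
# Entry-point boilerplate the operator approach removes (sketch;
# endpoint and application_name are placeholder values).
try:
    import openlit

    openlit.init(
        otlp_endpoint="http://openlit.monitoring.svc.cluster.local:4318",
        application_name="my-chatbot",
        environment="production",
    )
    instrumented = True
except Exception:
    instrumented = False  # no SDK installed: run uninstrumented
```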

The operator approach: deploy the operator once, add a label to each service's pod spec, done.

How the Operator Works

The OpenLIT Operator runs as a Kubernetes controller with a mutating admission webhook. When a pod with the openlit.io/instrument: "true" label is created, the operator:

  1. Injects an init container that installs the OpenLIT Python SDK into a shared volume

  2. Sets up sitecustomize.py so that Python automatically runs openlit.init() when the application starts

  3. Configures environment variables for the OTLP endpoint, application name, and environment

Your application code doesn't change. It doesn't even know it's being instrumented.
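The sitecustomize.py trick is standard CPython behavior: at interpreter startup, the site module imports a module named sitecustomize if one is found on sys.path. A minimal sketch of such an injected file, assuming the operator passes the endpoint via an env var like OTEL_EXPORTER_OTLP_ENDPOINT (the real injected file may differ):

```python
# sitecustomize.py — auto-imported by CPython's `site` module at startup
# when its directory is on PYTHONPATH. Sketch only; the operator's real
# injected file may look different.
import os

endpoint = os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318")

try:
    import openlit  # installed into the shared volume by the init container

    openlit.init(otlp_endpoint=endpoint)
except Exception:
    # Instrumentation must never prevent the application from booting.
    pass
```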

┌─────────────────────────────────────────────────┐
│                  Pod Creation                   │
│                                                 │
│  kubectl apply deployment.yaml                  │
│       ↓                                         │
│  Admission Webhook intercepts                   │
│       ↓                                         │
│  OpenLIT Operator mutates pod spec:             │
│    - Adds init container (installs SDK)         │
│    - Adds shared volume                         │
│    - Sets PYTHONPATH + env vars                 │
│    - Sets sitecustomize.py                      │
│       ↓                                         │
│  Pod starts → Python auto-imports openlit       │
│       ↓                                         │
│  All LLM calls are traced via OTLP              │
└─────────────────────────────────────────────────┘

Step-by-Step Setup

1. Deploy the Operator

kubectl apply -f https://raw.githubusercontent.com/openlit/openlit/main/operator/deploy/openlit-operator.yaml

This creates:

  • The operator deployment in the openlit-system namespace

  • A MutatingWebhookConfiguration that watches for labeled pods

  • The necessary RBAC roles and service accounts

  • TLS certificates for webhook communication

2. Create an AutoInstrumentation Resource

The AutoInstrumentation custom resource tells the operator what to instrument and where to send the data:

apiVersion: openlit.io/v1alpha1
kind: AutoInstrumentation
metadata:
  name: openlit-instrumentation
  namespace: default
spec:
  selector:
    labels:
      openlit.io/instrument: "true"
  otlp:
    endpoint: "http://openlit.monitoring.svc.cluster.local:4318"
  python:
    image: ghcr.io/openlit/openlit-python-init:latest
  resource:
    environment: "production"

3. Label Your Deployments

Add the label to any deployment that makes LLM calls:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-chatbot
spec:
  template:
    metadata:
      labels:
        app: my-chatbot
        openlit.io/instrument: "true"   # <-- this is all you add
    spec:
      containers:
        - name: chatbot
          image: my-chatbot:latest

Apply it:

kubectl apply -f deployment.yaml

The webhook only mutates pods at creation time, so already-running pods are unaffected. Trigger a rollout (kubectl rollout restart deployment/my-chatbot) and the new pods come up auto-instrumented.

4. Verify It's Working

Check the pod's init containers:

kubectl describe pod -l app=my-chatbot

You should see an init container like openlit-init that ran successfully. Then check your OpenLIT dashboard or OTLP backend for incoming traces.

What Gets Instrumented

The operator injects the full OpenLIT Python SDK, which means everything that openlit.init() normally instruments gets covered:

  • LLM Providers: OpenAI, Anthropic, Cohere, Mistral, Groq, Bedrock, Ollama, vLLM, and 30+ more

  • Agent Frameworks: LangChain, LangGraph, LlamaIndex, CrewAI, Pydantic AI, OpenAI Agents

  • Vector Databases: Pinecone, Chroma, Qdrant, Milvus

  • Web Frameworks: FastAPI, Flask, Django (if your service uses them)

The injected SDK detects which supported libraries are actually importable at runtime and instruments only what's present.
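A presence check like that can be sketched with importlib (hypothetical helper; the real SDK's detection logic is more involved):

```python
import importlib.util

# Hypothetical sketch: instrument only the candidate libraries that are
# actually installed in the container image.
CANDIDATES = ["openai", "anthropic", "langchain", "pinecone"]


def detect_installed(candidates):
    """Return the candidates whose packages can be found on sys.path."""
    return [name for name in candidates
            if importlib.util.find_spec(name) is not None]


to_instrument = detect_installed(CANDIDATES)
```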

Configuring the Operator

The operator supports several configuration options via environment variables:

| Variable                    | Description                                        | Default               |
|-----------------------------|----------------------------------------------------|-----------------------|
| OPENLIT_OTLP_ENDPOINT       | Where to send OTLP data                            | http://localhost:4318 |
| OPENLIT_DEFAULT_ENVIRONMENT | Default environment label                          | default               |
| WEBHOOK_PORT                | Webhook server port                                | 9443                  |
| WEBHOOK_FAILURE_POLICY      | What happens if the webhook fails (Fail or Ignore) | Ignore                |
| CERT_VALIDITY_DAYS          | TLS cert validity (days)                           | 365                   |

Leaving WEBHOOK_FAILURE_POLICY at its default of Ignore is recommended for production: if the operator is down, pods still start normally without instrumentation rather than failing to schedule.

Excluding Specific Services

Maybe you want to instrument everything in a namespace except one sensitive service. Use the ignore field in your AutoInstrumentation resource:

spec:
  selector:
    labels:
      openlit.io/instrument: "true"
  ignore:
    labels:
      openlit.io/instrument: "false"

Or simply don't add the label to services you want to skip.

Running OpenLIT on Kubernetes Too

If you're deploying the OpenLIT platform itself on Kubernetes, the natural setup is:

┌─────────────────────────────────────┐
│          Your K8s Cluster           │
│                                     │
│  ┌──────────┐   ┌───────────────┐   │
│  │ OpenLIT  │◄──│  ClickHouse   │   │
│  │ Platform │   │  (storage)    │   │
│  │ :3000    │   └───────────────┘   │
│  │ :4317/18 │                       │
│  └──────────┘                       │
│       ▲                             │
│       │ OTLP                        │
│       │                             │
│  ┌────┴──────┐   ┌───────────┐      │
│  │ Service A │   │ Service B │      │
│  │ (labeled) │   │ (labeled) │      │
│  └───────────┘   └───────────┘      │
└─────────────────────────────────────┘

The OpenLIT Helm chart makes this straightforward:

helm repo add openlit https://openlit.github.io/helm-chart
helm repo update
helm install openlit openlit/openlit

Then point your AutoInstrumentation resource's OTLP endpoint to the OpenLIT service.

Real-World Example: Instrumenting a FastAPI + LangChain Service

Here's a typical scenario. You have a FastAPI service that uses LangChain for RAG:

# app.py — no OpenLIT code anywhere
from fastapi import FastAPI
from langchain_openai import ChatOpenAI
from langchain.chains import RetrievalQA

app = FastAPI()
llm = ChatOpenAI(model="gpt-4o")
# my_retriever: your vector-store retriever, defined elsewhere in the service

@app.post("/ask")
async def ask(question: str):
    chain = RetrievalQA.from_chain_type(llm=llm, retriever=my_retriever)
    result = chain.invoke({"query": question})  # RetrievalQA expects a "query" key
    return {"answer": result["result"]}

Deploy it on Kubernetes with the label:

metadata:
  labels:
    openlit.io/instrument: "true"

Without touching app.py, you now get:

  • Full trace of the /ask HTTP request (via FastAPI instrumentation)

  • LangChain chain execution spans

  • Individual OpenAI chat completion spans with tokens, cost, latency

  • Retriever spans showing what documents were fetched

All stitched together in a single distributed trace.
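For a sense of what those spans carry, here is an illustrative sketch of one chat-completion span's attributes. The keys follow the OpenTelemetry gen-ai semantic conventions; the exact attribute names OpenLIT emits may differ, and the numbers are invented:

```python
# Illustrative attributes on a single chat-completion span (values invented;
# key names follow the OTel gen-ai semantic conventions, not necessarily
# OpenLIT's exact output).
span_attributes = {
    "gen_ai.system": "openai",          # provider detected by the SDK
    "gen_ai.request.model": "gpt-4o",   # model from the ChatOpenAI call
    "gen_ai.usage.input_tokens": 412,   # prompt tokens
    "gen_ai.usage.output_tokens": 96,   # completion tokens
}

# Cost reporting can be derived from token counts plus a per-model price table.
total_tokens = (span_attributes["gen_ai.usage.input_tokens"]
                + span_attributes["gen_ai.usage.output_tokens"])
```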

When to Use the Operator vs. the SDK

| Scenario                                        | Use the Operator | Use the SDK |
|-------------------------------------------------|------------------|-------------|
| Many services, don't want to modify code        | Yes              |             |
| Need fine-grained control over what's traced    |                  | Yes         |
| Platform team managing observability centrally  | Yes              |             |
| Single service, quick setup                     |                  | Yes         |
| Non-Python services (TypeScript, Go)            |                  | Yes         |
| Brownfield codebase you can't easily modify     | Yes              |             |

The operator currently supports Python services. For TypeScript or Go, use the SDK directly.


FAQ

Does it work with non-Python services?

The operator currently supports Python auto-instrumentation. TypeScript and Go support is on the roadmap. For those languages, use the OpenLIT SDK directly.

Can I configure which providers to instrument?

Yes. You can set disabled_instrumentors via environment variables in the AutoInstrumentation resource to exclude specific libraries from instrumentation.

What happens if the operator is down?

With the default Ignore failure policy, pods continue to start normally — they just won't be instrumented. No impact on your application availability.

Does it add latency to pod startup?

The init container adds a few seconds to pod startup (it installs the SDK). Once the pod is running, the instrumentation overhead is negligible — the same as running openlit.init() directly.

Can I use it with an existing OTel Collector?

Yes. Point the otlp.endpoint in the AutoInstrumentation resource to your existing OTel Collector. The data format is standard OTLP.

openlit
kubernetes
operator
opentelemetry
auto-instrumentation
llm