Integrations
Framework-specific notes for popular stacks.
OpenAI
Supported out of the box via `beval.wrap(OpenAI())`. See Usage → wrap.
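A minimal setup sketch (the helper name `make_logged_client` is illustrative, not part of the SDK; assumes `OPENAI_API_KEY` is set in the environment):

```python
def make_logged_client():
    """Initialize beval and return a wrapped OpenAI client.

    Chat-completions calls made through the returned client are
    logged automatically; no per-call code is needed.
    """
    import beval
    from openai import OpenAI

    beval.init()
    return beval.wrap(OpenAI())
```

After this, `make_logged_client().chat.completions.create(...)` behaves exactly like the unwrapped client, with logging as a side effect.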
Covers:
- `client.chat.completions.create` (sync)
- Vision / multimodal content — image parts are auto-promoted to `kind="vlm"`
Not yet covered in 0.1:
- Streaming (`stream=True`)
- The new `client.responses.create` API
- Async client (`AsyncOpenAI`)
Workaround for uncovered cases — log manually around the call.
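For streaming, for example, you can consume the stream first and log the assembled text afterwards. A sketch under 0.1's constraints; `accumulate_stream_text` and `stream_and_log` are illustrative helpers, not SDK APIs:

```python
import time

def accumulate_stream_text(chunks) -> str:
    """Join the text deltas from a chat-completions stream."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content if chunk.choices else None
        if delta:
            parts.append(delta)
    return "".join(parts)

def stream_and_log(client, messages, model="gpt-4o-mini") -> str:
    import beval  # assumes beval.init() already ran

    t0 = time.perf_counter()
    stream = client.chat.completions.create(
        model=model, messages=messages, stream=True
    )
    answer = accumulate_stream_text(stream)
    beval.log(
        kind="llm",
        input=str(messages),
        output=answer,
        model_id=model,
        latency_ms=int((time.perf_counter() - t0) * 1000),
    )
    return answer
```

Token counts are omitted here; the OpenAI API only includes usage on streams when you request it via `stream_options`.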
Anthropic
Supported via `beval.wrap(Anthropic())`. See Usage → wrap.
Covers:
- `client.messages.create` (sync)
- System prompts
- Image content blocks (base64 + URL sources)
Not yet covered in 0.1:
- Streaming
- Async client
- Tool-use metadata (coming in 0.2)
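Until then, the uncovered cases can be logged manually, same as with OpenAI. A sketch for the async client; `anthropic_text` and `chat_and_log` are illustrative helpers, not SDK APIs:

```python
import time

def anthropic_text(response) -> str:
    """Join the text blocks of a Messages API response."""
    return "".join(
        block.text
        for block in response.content
        if getattr(block, "type", None) == "text"
    )

async def chat_and_log(client, messages, model: str):
    import beval  # assumes beval.init() already ran

    t0 = time.perf_counter()
    response = await client.messages.create(
        model=model, max_tokens=512, messages=messages
    )
    usage = getattr(response, "usage", None)
    beval.log(
        kind="llm",
        input=str(messages),
        output=anthropic_text(response),
        model_id=model,
        latency_ms=int((time.perf_counter() - t0) * 1000),
        tokens_in=getattr(usage, "input_tokens", None),
        tokens_out=getattr(usage, "output_tokens", None),
    )
    return response
```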
LiteLLM
Third-party projects can use the SDK around LiteLLM’s unified API. Example from bolder-fit-agent:
```python
import json
from time import perf_counter

import beval
import litellm

beval.init()

# litellm.acompletion is async, so the call runs inside a coroutine.
async def complete(messages: list[dict]):
    t0 = perf_counter()
    try:
        response = await litellm.acompletion(
            model="gpt-4o-mini",
            messages=messages,
        )
        latency = int((perf_counter() - t0) * 1000)
        usage = getattr(response, "usage", None)
        beval.log(
            kind="llm",
            input=json.dumps(messages),
            output=response.choices[0].message.content,
            model_id="gpt-4o-mini",
            latency_ms=latency,
            tokens_in=getattr(usage, "prompt_tokens", None),
            tokens_out=getattr(usage, "completion_tokens", None),
            cost_usd=float(litellm.completion_cost(completion_response=response) or 0.0),
        )
        return response
    except Exception as e:
        # log failure...
        raise
```

A first-class LiteLLM CustomLogger integration is on the roadmap.
LangChain / LlamaIndex
No official wrapper in 0.1. Two workable patterns:
Pattern A — at the call site. Log with beval.log() inside the chain’s final step or a custom callback handler.
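A minimal sketch of Pattern A as a call-site wrapper (`invoke_and_log` is an illustrative helper, not an SDK API; it works for any object with an `invoke` method):

```python
import time

def invoke_and_log(chain, query: str, model_id: str = "unknown"):
    """Run a chain and log its final input/output pair."""
    import beval  # assumes beval.init() already ran

    t0 = time.perf_counter()
    result = chain.invoke(query)
    beval.log(
        kind="llm",
        input=query,
        output=str(result),
        model_id=model_id,
        latency_ms=int((time.perf_counter() - t0) * 1000),
    )
    return result
```

Like Pattern B, this loses intermediate steps, but it keeps the log at the call site rather than the entry point.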
Pattern B — @beval.trace on your entry point. Captures the whole chain as a single agent log. You lose per-step visibility but gain a working integration in one line.
```python
@beval.trace(name="rag-chain", kind="agent")
def run_rag(query: str) -> str:
    return chain.invoke(query)
```

Proper callback handlers for both frameworks are on the roadmap.
FastAPI
The SDK is thread-safe and async-safe. Initialize once at app startup:
```python
from fastapi import FastAPI

import beval

app = FastAPI()

@app.on_event("startup")
def _init_beval():
    beval.init()

@app.post("/chat")
async def chat(query: str):
    # ... your logic ...
    beval.log(kind="llm", input=query, output=answer, ...)
    return {"answer": answer}
```

Call `beval.flush()` from a shutdown hook if you want to block on queue drain before the process exits:
```python
@app.on_event("shutdown")
def _flush_beval():
    beval.flush(timeout=5.0)
```

(Usually unnecessary — atexit handles it.)
Celery
Important: call init() per worker child, not at module import. Celery’s prefork forks children after import, and fork does not carry threads across — the SDK’s background queue worker thread dies in each child.
```python
from celery import Celery
from celery.signals import worker_process_init

import beval

celery_app = Celery(...)

@worker_process_init.connect
def _init_beval_in_worker(**_kwargs):
    beval.init()  # runs once per forked worker child
```

If you skip this and initialize at import, logs will silently queue into a dead queue and never ship.
Django
```python
# settings.py
import beval

beval.init()
```

Or in an `AppConfig.ready()` hook for more control. Use `@beval.trace` on views or service functions.
Jupyter / scripts
```python
import beval

beval.init()
# ... your code ...
beval.flush()  # wait for queue to drain before the notebook ends
```

In notebooks the SDK’s atexit may not fire (kernel restart, cell interrupt). An explicit `flush()` before you finish is good hygiene.