Validating LLM Tool Calls with Pydantic v2

LLMs hallucinate parameters. Schema validation is your safety net.

The fundamental problem

When an LLM “calls a tool,” it generates JSON that looks like a function call. The platform parses it and hands you the dict. You execute the function.

But the JSON is generated by a probabilistic model. Sometimes it:

Skips required fields
Sends the wrong type (string "true" instead of boolean true)
Includes extra fields you didn’t ask for
Hallucinates enum values ("location": "Mars")
Constructs malformed nested objects

If you read params["email"] without validating, you crash on KeyError when the field is missing. If you call int(params["age"]) without validating, you crash on ValueError when the value is not numeric. Either way the call ends abruptly and the caller hears silence.
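To make the two failure modes concrete, here is a minimal sketch of a handler that trusts the raw dict. The payloads and the names unsafe_handler and failure_kind are hypothetical, invented only for this illustration.

```python
# Hypothetical raw payload from a model: "email" is missing, "age" is not numeric.
raw_params = {"name": "John", "age": "thirty"}


def unsafe_handler(params: dict) -> str:
    """Reads fields directly, with no validation."""
    email = params["email"]   # KeyError if the model skipped the field
    age = int(params["age"])  # ValueError if the value is not numeric
    return f"{email} is {age}"


def failure_kind(params: dict) -> str:
    """Return the name of the exception unsafe_handler raises, or 'ok'."""
    try:
        unsafe_handler(params)
    except (KeyError, ValueError) as exc:
        return type(exc).__name__
    return "ok"


print(failure_kind(raw_params))                                   # KeyError
print(failure_kind({"email": "j@example.com", "age": "thirty"}))  # ValueError
print(failure_kind({"email": "j@example.com", "age": "30"}))      # ok
```

In a real handler nothing catches these exceptions unless you add the try/except yourself, which is the gap the schemas below are meant to close.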

The safety net: Pydantic v2 schemas

For every tool, declare a Pydantic model:

from pydantic import BaseModel, Field, ValidationError  # ValidationError is caught in the dispatcher

class ScheduleMeetingParams(BaseModel):
    name: str = Field(...)
    email: str = Field(...)
    purpose: str = Field(...)
    datetime: str = Field(...)
    location: str = Field(..., pattern=r"^(Downtown|Uptown|Westside)$")  # only known offices

class VerifyParams(BaseModel):
    full_name: str = ""
    phone_number: str = ""

class QueryCorpusParams(BaseModel):
    question: str | None = None

Required vs optional is explicit. Defaults absorb missing fields. Types coerce when sensible.

The dispatcher pattern

A single entry point validates and routes:

TOOL_HANDLERS: dict[str, tuple[type[BaseModel], ToolHandler]] = {
    "queryCorpus":        (QueryCorpusParams,       handle_queryCorpus),
    "verify":             (VerifyParams,            handle_verify),
    "schedule_meeting":   (ScheduleMeetingParams,   handle_schedule_meeting),
    "move_to_main_convo": (MoveToMainConvoParams,   handle_move_to_main_convo),
    "hangUp":             (HangUpParams,            handle_hangUp),
}

async def handle_tool_invocation(uv_ws, tool_name, invocation_id, raw_params):
    entry = TOOL_HANDLERS.get(tool_name)
    if not entry:
        return await _send_tool_error(uv_ws, invocation_id, f"Unknown tool: {tool_name}")

    ParamModel, handler = entry
    try:
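        # model_validate reports a non-dict payload (None, a list) as a
        # ValidationError; ParamModel(**raw_params) would raise TypeError instead.
        params = ParamModel.model_validate(raw_params)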
    except ValidationError as e:
        return await _send_tool_error(uv_ws, invocation_id, f"Invalid params: {e}")

    try:
        await handler(uv_ws, invocation_id, params)
    except Exception:
        logger.exception("Tool %s failed", tool_name)
        await _send_tool_error(uv_ws, invocation_id, "Internal tool error")

Three guards in one short function:

Unknown tool → polite error to the model
Bad params → validation error sent back as a tool result, model can retry with corrected args
Handler crash → caught, logged, generic error to the model

The conversation never dies because of one bad tool call.
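The three guards can be exercised end to end without a live call. The sketch below is self-contained and rests on several assumptions: FakeSocket, its send_json method, the message shapes in _send_tool_result and _send_tool_error, and the echo tool are all hypothetical stand-ins, since the real platform defines its own wire format.

```python
import asyncio
import logging

from pydantic import BaseModel, Field, ValidationError

logger = logging.getLogger(__name__)


class FakeSocket:
    """Stand-in for the real websocket; records whatever would be sent."""
    def __init__(self) -> None:
        self.sent: list[dict] = []

    async def send_json(self, message: dict) -> None:
        self.sent.append(message)


# Hypothetical message shapes; the real platform defines its own.
async def _send_tool_result(ws, invocation_id, result):
    await ws.send_json({"id": invocation_id, "ok": True, "result": result})


async def _send_tool_error(ws, invocation_id, error):
    await ws.send_json({"id": invocation_id, "ok": False, "error": error})


class EchoParams(BaseModel):
    text: str = Field(...)


async def handle_echo(ws, invocation_id, params: EchoParams):
    if params.text == "boom":
        raise RuntimeError("simulated handler bug")
    await _send_tool_result(ws, invocation_id, params.text)


TOOL_HANDLERS = {"echo": (EchoParams, handle_echo)}


async def handle_tool_invocation(ws, tool_name, invocation_id, raw_params):
    entry = TOOL_HANDLERS.get(tool_name)
    if not entry:
        return await _send_tool_error(ws, invocation_id, f"Unknown tool: {tool_name}")
    param_model, handler = entry
    try:
        params = param_model.model_validate(raw_params)
    except ValidationError as e:
        return await _send_tool_error(ws, invocation_id, f"Invalid params: {e}")
    try:
        await handler(ws, invocation_id, params)
    except Exception:
        logger.exception("Tool %s failed", tool_name)
        await _send_tool_error(ws, invocation_id, "Internal tool error")


async def demo() -> list[dict]:
    ws = FakeSocket()
    await handle_tool_invocation(ws, "nope", "1", {})                # unknown tool
    await handle_tool_invocation(ws, "echo", "2", {})                # bad params
    await handle_tool_invocation(ws, "echo", "3", {"text": "boom"})  # handler crash
    await handle_tool_invocation(ws, "echo", "4", {"text": "hi"})    # happy path
    return ws.sent


results = asyncio.run(demo())
for message in results:
    print(message["id"], message["ok"])
```

Every invocation produces exactly one message back, including the three failing ones, which is the property that keeps the conversation alive.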

Adding a new tool: two steps

# 1. Define the schema
class TransferToHumanParams(BaseModel):
    reason: str = Field(...)
    department: str = Field(..., pattern="^(billing|tech|sales)$")

# 2. Write the handler and register
async def handle_transfer_to_human(uv_ws, invocation_id, params: TransferToHumanParams):
    # ... do the transfer
    await _send_tool_result(uv_ws, invocation_id, "Transferred")

TOOL_HANDLERS["transfer_to_human"] = (TransferToHumanParams, handle_transfer_to_human)

No if/elif chain to grow. No new error handling to write. The dispatcher already covers it.

Why Pydantic v2 specifically

Speed: the validation core is written in Rust, and the maintainers report it as roughly 5x to 50x faster than v1 depending on the model. That matters on a latency-critical voice call.
Field(pattern=...): regex validation is built in (it replaces v1’s regex= argument).
Better error messages: they can be sent verbatim back to the model, which often self-corrects.
Type coercion: "5" → 5 for int fields. Helpful because LLMs are inconsistent about quoting numbers.

What good error messages look like

When validation fails, you get something like:

1 validation error for ScheduleMeetingParams
location
  String should match pattern '^(Downtown|Uptown|Westside)$' [type=string_pattern_mismatch, input_value='Mars', input_type=str]

Send this verbatim to the model in the tool result. Modern models usually read this and try again with a valid value, so you get retry behavior without writing retry logic.
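If the verbatim text is longer than you want to feed back (current v2 releases also append a documentation URL to each error), the same information is available in structured form through ValidationError.errors(). The sketch below is one possible way to build a shorter message; LocationParams and summarize are hypothetical names, and the pattern is copied from the error shown above.

```python
from pydantic import BaseModel, Field, ValidationError


class LocationParams(BaseModel):
    """Hypothetical schema with the same pattern as the error shown above."""
    location: str = Field(..., pattern=r"^(Downtown|Uptown|Westside)$")


def summarize(exc: ValidationError) -> str:
    """One line per problem: dotted field path, then Pydantic's message."""
    lines = []
    for err in exc.errors():
        path = ".".join(str(part) for part in err["loc"]) or "(root)"
        lines.append(f"{path}: {err['msg']}")
    return "\n".join(lines)


try:
    LocationParams.model_validate({"location": "Mars"})
except ValidationError as exc:
    verbatim = str(exc)       # the multi-line text, as Pydantic formats it
    compact = summarize(exc)  # shorter alternative, one line per field
    print(compact)
```

Either form works as a tool result. The compact one mainly saves tokens when a call fails on several fields at once.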

A trap: don’t validate happy-path responses

Validation belongs at the boundary, where data enters your code from outside (LLM, HTTP request, user input). Don’t re-validate when passing the validated Pydantic object to internal functions. You already know the data is good; a second validation pass is pure overhead.
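A minimal sketch of that split, with hypothetical names throughout (ReminderParams, handle_raw_call, build_reminder): the raw dict is validated once on the way in, and everything downstream takes the typed object.

```python
from pydantic import BaseModel, Field


class ReminderParams(BaseModel):
    """Hypothetical tool schema for illustration."""
    email: str = Field(...)
    minutes_before: int = Field(..., ge=1)


def build_reminder(params: ReminderParams) -> str:
    # Internal code: trusts the typed object, no second validation pass.
    return f"Remind {params.email} {params.minutes_before} min before"


def handle_raw_call(raw: dict) -> str:
    # Boundary: the only place raw model output is validated.
    params = ReminderParams.model_validate(raw)
    return build_reminder(params)


message = handle_raw_call({"email": "ana@example.com", "minutes_before": "15"})
print(message)  # Remind ana@example.com 15 min before
```

The type hint on build_reminder does the documenting: any function that accepts ReminderParams rather than dict is telling you the data has already been checked.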

Testing the schemas

These tests run in milliseconds and don’t need Twilio or Ultravox:

import pytest
from pydantic import ValidationError

def test_schedule_meeting_requires_all_fields():
    with pytest.raises(ValidationError):
        ScheduleMeetingParams(name="John")  # missing 4 required fields

def test_schedule_meeting_accepts_valid_input():
    p = ScheduleMeetingParams(
        name="John Smith",
        email="john@example.com",
        purpose="Routine checkup",
        datetime="2026-06-01 10:00",
        location="Downtown",
    )
    assert p.location == "Downtown"

You’re testing the contract, not the LLM. If the contract is right, the LLM either complies or gets a useful error.

Takeaway

LLM tool calls are untrusted input. Treat them like form submissions from the internet. Pydantic v2 + a registry-based dispatcher = small code, big safety net, easy to extend.