Audit template

ai-llm

Audit your LLM features the way an attacker and an honest user both would.

Maps to: OWASP LLM Top 10 · NIST AI RMF

View the full prompt

Your codebase

specialists, in parallel

Prompt injectionjailbreaksoutput handlingagent/tool safetyRAGhallucination

Priority-sorted issues

Each finding is evidence-bound and survives ≥2-of-3 adversarial skeptics.

How this audit works

Provider- and framework-agnostic, this audit first maps every place an LLM is called and every trust boundary where untrusted input — user text, retrieved documents, tool results — enters a prompt. Twelve specialists then probe prompt injection, jailbreaks, system-prompt and secret leakage, insecure output handling, tool/agent agency, RAG grounding, hallucination, evals, and cost, each finding mapped to the OWASP LLM Top 10 and severity-scored P0–P3. Every claim is traced to a concrete artifact and survives independent skeptics before it ships.

Use it when

Shipping a RAG support bot

Your assistant answers from internal docs and a shared knowledge base. The audit traces whether retrieval respects per-user permissions, whether the bot abstains or hallucinates when there is no good context, and whether instructions hidden in a retrieved document can override the system prompt (indirect injection).

Giving an agent real tools

Your agent can email, query the database, or call internal APIs on model say-so. The audit checks each tool's blast radius — can it take a destructive or irreversible action without a human gate? — validates the arguments the model generates, and traces every output sink for XSS, SQL, or eval injection.

Before scaling a public LLM endpoint

You are about to open an AI feature to untrusted traffic. The audit looks for the per-user and global cost caps that stop runaway spend and abuse, the evals that guard your high-stakes paths, and the provider retention and PII handling you need before user data leaves your machine.

What you get

A per-dimension scorecard, a trust-boundary and data-flow map, and verified findings filed as priority-sorted GitHub issues — each with OWASP LLM mapping, a redacted repro, and a concrete before/after fix.

Explore the other audits