13 audits · MIT · v0.8.0

Turn any AI agent into a swarm of specialist auditors

Copy one master prompt into your AI coding agent. It runs many specialists in parallel, has independent skeptics try to refute every finding, and files only what survives as priority-sorted GitHub issues.

Not another "audit my code" prompt — these make your agent prove every finding before it files it.

See a real run

Copies a template — swap in your repo URL before running.

evidence-bound · ≥2-of-3 adversarial verification · 30/60/90 roadmap

Run it

Point your AI agent at auditor.rapold.io

No install: it works with any agent that can fetch a URL and run tools (it runs on Claude Code). Tell it to audit a repo using auditor.rapold.io. It fetches the audit prompt, a single entry point that asks for your output language and which audits to run, then files the issues.

Audit github.com/your/repo using auditor.rapold.io

Copy it, swap in your repo, paste it to your agent.

It then asks: output language (Deutsch / English), which of the 13 audits (or the full repo), and whether to create issues — then runs them and opens a priority-sorted tracker.

AI agents: fetch auditor.rapold.io/llms.txt and act as the audit orchestrator (ask language + scope, then run).

Honest answers to the obvious questions

Won't it hallucinate evidence?: Every file:line is re-checked against the actual file in adversarial verify (Phase 3); unverifiable claims are dropped.
Does my code leave my machine?: No — your agent fetches the prompt; your repo never touches our domain.
What does a run cost?: Just your agent's tokens — scope it to a few audits to control spend.

See the 13 audits

Why it is different

Most audit prompts return unverified opinions. These enforce discipline.

Evidence or it didn't happen

Every finding cites a concrete artifact — file:line, a query plan, a request, a config value. No evidence means it is discarded.
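
The re-check of a file:line citation can be pictured as a small function. This is a minimal sketch, not the suite's actual verifier: the `Citation` record and the `citation_holds` helper are hypothetical names, and it assumes a citation carries a short quoted snippet that should appear on the cited line.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Citation:
    """Hypothetical evidence record: file path, 1-based line number, quoted text."""
    path: str
    line: int
    snippet: str


def citation_holds(repo_root: Path, citation: Citation) -> bool:
    """Return True only if the cited line exists and contains the quoted snippet."""
    target = repo_root / citation.path
    if not target.is_file():
        return False
    lines = target.read_text(encoding="utf-8", errors="replace").splitlines()
    if not 1 <= citation.line <= len(lines):
        return False
    snippet = citation.snippet.strip()
    # An empty snippet would match any line, so it never counts as evidence.
    return bool(snippet) and snippet in lines[citation.line - 1]
```

A finding whose citation fails this kind of check would be discarded rather than filed.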

Adversarial self-challenge

No finding survives until independent skeptic agents have tried to refute it — it must clear at least two of three, or it is dropped. If it cannot survive a hostile reading, it is not a finding.
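
The two-of-three rule amounts to a simple vote. A minimal sketch, assuming each skeptic can be modeled as a callable that returns True when it fails to refute the finding; `Skeptic` and `survives` are illustrative names, not part of the prompts.

```python
from typing import Callable, Sequence

# Hypothetical model: a skeptic returns True when it FAILS to refute the finding.
Skeptic = Callable[[str], bool]


def survives(finding: str, skeptics: Sequence[Skeptic], required: int = 2) -> bool:
    """Keep a finding only if at least `required` skeptics could not refute it."""
    upheld = sum(1 for skeptic in skeptics if skeptic(finding))
    return upheld >= required


# Stub skeptics for illustration; real ones would be independent agent runs.
def could_not_refute(finding: str) -> bool:
    return True


def refuted(finding: str) -> bool:
    return False


print(survives("example finding", [could_not_refute, could_not_refute, refuted]))  # True
print(survives("example finding", [could_not_refute, refuted, refuted]))           # False
```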

Blind-spot hunting

A completeness critic asks each round which surface, use-case, or assumption went unexamined. Gaps are declared, never hidden.

Actionable issue tracker

Output is GitHub issues — German or English — led by a single priority-sorted tracking issue, each with a management summary and a before/after fix. A finding you cannot act on is just an opinion.

Proof, not claims

We pointed the whole suite at our own repo

The library is dogfooded on itself — pointed at this very repo, with the backlog filed as public GitHub issues. This page got its own content audit; across the full repo, every lens scored A− to A. Below: the real backlog, the grades, and one finding in full.

github.com/marcelrapold/auditor/issues

is:issue label:content · 23 findings

The real backlog from auditing this very page — 23 findings, every one now fixed.

documentation A 94 · performance A 92 · security A 91 · 0 P0 · 1 P1

Open any issue to check the evidence on GitHub.

/de served <html lang="en">

Evidence: web/app/layout.tsx:71
Before: / and /de → <html lang="en">
After: /de → <html lang="de">

Issue #81

See the full run (#97)

The library

13 stack-agnostic audits, one shared methodology

Each is a self-contained master prompt: a multi-phase spec, not a one-shot "review my code" question. Run several against the same repo and their findings merge into one priority-sorted tracker — no duplicates — because every audit emits the same issue contract.
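
A shared issue contract is what makes merging mechanical. The sketch below is one way to picture it, under stated assumptions: the `Finding` fields are hypothetical, duplicates are identified by their evidence location, and priority is an integer where 0 stands for P0. The real contract is defined in ISSUE-OUTPUT-STANDARD.md and may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical shape of one finding under a shared contract."""
    audit: str      # which audit emitted it, e.g. "security"
    priority: int   # 0 for P0, 1 for P1, and so on
    evidence: str   # location string such as "src/a.py:10" (illustrative path)
    title: str


def merge_findings(*runs: list[Finding]) -> list[Finding]:
    """Merge several audit runs, keeping one finding per evidence location."""
    best: dict[str, Finding] = {}
    for run in runs:
        for finding in run:
            kept = best.get(finding.evidence)
            # On a duplicate, keep the more urgent (lower-numbered) priority.
            if kept is None or finding.priority < kept.priority:
                best[finding.evidence] = finding
    # Most urgent first; evidence string breaks ties so the order is stable.
    return sorted(best.values(), key=lambda f: (f.priority, f.evidence))
```

Keying on the evidence location is a deliberate simplification: two audits that flag the same line collapse into one tracker entry at the higher urgency.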

security

14 domains, including injection, authN/Z, secrets, supply chain, IaC, CI/CD, business logic, privacy, LLM.

OWASP · CWE · MITRE · CIS

repo

Whole-repo engineering: architecture, stack consistency, docs, tests, deps, CI/CD, git hygiene.

Google Eng · SRE · SLSA

frontend

16-agent sweep: usability, psychology, visual design, a11y, performance, SEO, copy, CRO.

Nielsen · WCAG · CWV

api

Resource modeling, HTTP semantics, error model, versioning, idempotency, rate limits, DX.

RFC 9110/9457 · OpenAPI

performance

Hotspots, N+1, caching, concurrency, leaks, load behavior, resilience, FinOps.

SRE · DORA · SLOs

data

Schema and modeling, constraints, migration safety, transactions, integrity, backup/DR.

ACID/CAP · RLS

infrastructure

IaC, cloud security, IAM, secrets, containers, k8s, CI/CD, HA, DR, observability, cost.

CIS · Well-Architected · DORA

ai-llm

Prompt injection, jailbreaks, output handling, agent/tool safety, RAG, hallucination, evals.

OWASP LLM Top 10 · NIST AI RMF

compliance-privacy

Lawful basis, consent/cookies, data-subject rights, retention, transfers, breach readiness.

GDPR · ePrivacy · EU AI Act

accessibility

Semantics, keyboard, focus, screen reader, contrast, forms, zoom, motor, motion, cognitive.

WCAG 2.2 · EAA · ADA/508

documentation

Docs quality vs the standard: head-matter, onboarding, doc–code drift, writing, Diátaxis.

DOCUMENTATION-STANDARD · Diátaxis

content

Content & messaging: thesis challenge, audience fit, evidence & originality, structure, voice, concrete rewrites.

E-E-A-T · BLUF · rhetoric

lean

Bloat, redundancy & dependency transparency: dead code, unused/phantom deps, duplication, AI slop — a safe strip-down that never over-deletes.

Google Eng · OWASP · YAGNI

The method

Six phases, every time

Recon, a parallel specialist swarm, then adversarial verification before anything reaches the report.

Audit using auditor.rapold.io

You type — one line, any capable agent.

0 · Reconnaissance: Factual inventory and a surface map. No opinions yet.
1 · Specialist swarm: Many domain experts run in parallel, each evidence-bound (security, accessibility, performance, data …).
2 · Cross-pollination: Merge, dedupe, and surface compound findings.
3 · Adversarial verify: Independent skeptics try to refute each P0/P1; ≥2 of 3 to survive.
4 · Benchmark: Compare against named best-in-class references and standards.
5 · Synthesis: Report, scorecard, issues, and a 30/60/90 roadmap.
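
The parallel step of the specialist swarm can be sketched with a thread pool. This is only an illustration of the fan-out shape: in a real run the specialists are agent sub-tasks, whereas here `Specialist` is a hypothetical plain callable that takes the reconnaissance inventory and returns finding titles.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical: a specialist takes the recon inventory and returns finding titles.
Specialist = Callable[[dict], list[str]]


def run_swarm(inventory: dict, specialists: dict[str, Specialist]) -> dict[str, list[str]]:
    """Run every specialist on the same inventory in parallel and collect results."""
    with ThreadPoolExecutor(max_workers=max(1, len(specialists))) as pool:
        futures = {name: pool.submit(fn, inventory) for name, fn in specialists.items()}
        # result() blocks until each specialist finishes and re-raises its errors.
        return {name: future.result() for name, future in futures.items()}
```

Every specialist receives the same inventory, so none of them depends on another's output until the later merge step.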

You get — priority-sorted GitHub issues.

The yardsticks

Two yardsticks. Every report measured against them.

Reusable on their own. One scores 0–100 against a rubric; the other fixes the exact issue shape — so two runs stay comparable even when the generated prose differs.

Documentation standard

A documentation standard with five repo profiles and a 0–100 scoring rubric — the same yardstick the documentation audit scores against.
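
One plausible way to arrive at a 0–100 score is a weighted average over rubric criteria. The sketch below assumes exactly that; the criterion names and weights are invented for illustration, and the actual rubric lives in DOCUMENTATION-STANDARD.md.

```python
def rubric_score(ratings: dict[str, float], weights: dict[str, float]) -> int:
    """Combine per-criterion ratings in [0, 1] into a weighted 0-100 score."""
    if set(ratings) != set(weights):
        raise ValueError("ratings and weights must cover the same criteria")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    for name, rating in ratings.items():
        if not 0.0 <= rating <= 1.0:
            raise ValueError(f"rating for {name!r} must be within [0, 1]")
    weighted = sum(ratings[name] * weights[name] for name in weights)
    return round(100 * weighted / total_weight)


# Hypothetical criteria and equal weights, purely for illustration.
print(rubric_score({"onboarding": 1.0, "doc_code_drift": 0.5},
                   {"onboarding": 1.0, "doc_code_drift": 1.0}))  # 75
```

Fixing the weights up front is what keeps two runs comparable even when the surrounding prose differs.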

DOCUMENTATION-STANDARD.md

Issue-output standard

The mandatory contract every audit follows: a priority-sorted tracking issue first, then one issue per finding in the chosen output language (German or English), each with its own management summary.

ISSUE-OUTPUT-STANDARD.md

See what your AI agent finds when it has to prove every claim.

It is free and MIT-licensed. Run it on a throwaway branch, read findings that had to survive 2-of-3 skeptics, and keep only the fixes you agree with.

Every claim here survived the same audit — read the public backlog (#97).

Browse the prompts · Quickstart