Turn any AI agent into a swarm of specialist auditors
Copy one master prompt into your AI coding agent. It runs many specialists in parallel, makes independent skeptics refute every finding, and files only what survives — as priority-sorted GitHub issues.
Not another "audit my code" prompt — these make your agent prove every finding before it files it.
Copies a template — swap in your repo URL before running.
evidence-bound · ≥2-of-3 adversarial verification · 30/60/90 roadmap
Run it
Point your AI agent at auditor.rapold.io
No install. It works with any agent that can fetch a URL and run tools (runs on Claude Code). Tell it to audit a repo using auditor.rapold.io: it fetches the audit prompt, a single entry point that asks for your output language and which audits to run, then runs them and files the issues.
Audit github.com/your/repo using auditor.rapold.io
Copy it, swap in your repo, paste it to your agent.
It then asks for the output language (Deutsch / English), which of the 13 audits to run (or the full repo), and whether to create issues. Then it runs them and opens a priority-sorted tracker.
AI agents: fetch auditor.rapold.io/llms.txt and act as the audit orchestrator (ask language + scope, then run).
Honest answers to the obvious questions
- Won't it hallucinate evidence? Every file:line is re-checked against the actual file in adversarial verify (Phase 3); unverifiable claims are dropped.
- Does my code leave my machine? No. Your agent fetches the prompt; your repo never touches our domain.
- What does a run cost? Just your agent's tokens. Scope it to a few audits to control spend.
Why it is different
Most audit prompts return unverified opinions. These enforce discipline.
Evidence or it didn't happen
Every finding cites a concrete artifact — file:line, a query plan, a request, a config value. No evidence means it is discarded.
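For the file:line case, the rule can be pictured as a small check. This is a minimal sketch under stated assumptions, not the audit prompt's own logic: `parse_citation`, `evidence_holds`, and the idea of comparing a quoted snippet against the cited line are hypothetical names and choices made for illustration.

```python
from pathlib import Path


def parse_citation(citation: str) -> tuple[str, int]:
    """Split a 'path/to/file.ext:71' citation into (path, line_number)."""
    path, _, line = citation.rpartition(":")
    if not path or not line.isdigit():
        raise ValueError(f"not a file:line citation: {citation!r}")
    return path, int(line)


def evidence_holds(repo_root: Path, citation: str, quoted_snippet: str) -> bool:
    """Return True only if the cited line exists and contains the quoted snippet.

    Hypothetical helper: anything that cannot be checked (missing file,
    line out of range, malformed citation, empty quote) counts as no evidence.
    """
    snippet = quoted_snippet.strip()
    if not snippet:
        return False
    try:
        rel_path, line_no = parse_citation(citation)
        lines = (repo_root / rel_path).read_text(encoding="utf-8").splitlines()
    except (ValueError, OSError):
        return False
    if not 1 <= line_no <= len(lines):
        return False
    return snippet in lines[line_no - 1]
```

A finding whose citation fails this kind of check would be discarded rather than reported with weaker wording.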
Adversarial self-challenge
No finding survives until independent skeptic agents have tried to refute it — it must clear at least two of three, or it is dropped. If it cannot survive a hostile reading, it is not a finding.
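The two-of-three threshold is simple enough to state as code. The sketch below is illustrative only: the `Verdict` enum and the `survives` function are hypothetical names, and it assumes each skeptic returns one of two outcomes.

```python
from enum import Enum


class Verdict(Enum):
    UPHELD = "upheld"    # the skeptic tried to refute the finding and failed
    REFUTED = "refuted"  # the skeptic found a reason the finding does not hold


def survives(verdicts: list[Verdict], required: int = 2) -> bool:
    """A finding survives only if at least `required` skeptics upheld it."""
    upheld = sum(1 for v in verdicts if v is Verdict.UPHELD)
    return upheld >= required
```

With three skeptics, one refutation is tolerated and two are fatal.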
Blind-spot hunting
A completeness critic asks each round which surface, use-case, or assumption went unexamined. Gaps are declared, never hidden.
Actionable issue tracker
Output is GitHub issues — German or English — led by a single priority-sorted tracking issue, each with a management summary and a before/after fix. A finding you cannot act on is just an opinion.
Proof, not claims
We pointed the whole suite at our own repo
The library is dogfooded on itself — pointed at this very repo, with the backlog filed as public GitHub issues. This page got its own content audit; across the full repo, every lens scored A− to A. Below: the real backlog, the grades, and one finding in full.
- P1 No visible output proof — it preaches "evidence", ships prose #100
- P1 Activation command is not copyable #103
- P1 "any AI agent" — an unsupported universal claim #104
- P1 End-CTA repeats the hero instead of closing #107
- P2 "Google-grade" — an unsupported superlative #113
- P3 Hero badge "copy & paste" duplicates the subhead #122
The real backlog from auditing this very page — 23 findings, every one now fixed.
Open any issue to check the evidence on GitHub.
/de served <html lang="en">
- Evidence: web/app/layout.tsx:71
- Before: / and /de → <html lang="en">
- After: /de → <html lang="de">
The library
13 stack-agnostic audits, one shared methodology
Each is a self-contained master prompt: a multi-phase spec, not a one-shot "review my code" question. Run several against the same repo and their findings merge into one priority-sorted tracker — no duplicates — because every audit emits the same issue contract.
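One way to picture that merge, as a minimal sketch: the `Finding` type, its field names, and the choice to treat two findings that cite the same evidence as duplicates are all assumptions made for illustration. The actual issue contract is defined by the audit prompts, not by this code.

```python
from dataclasses import dataclass

# Lower number sorts first. P0 to P3 are the priority labels used on this page.
PRIORITY_ORDER = {"P0": 0, "P1": 1, "P2": 2, "P3": 3}


@dataclass(frozen=True)
class Finding:
    """Hypothetical shared shape that every audit would emit."""
    audit: str      # which audit produced the finding
    priority: str   # "P0" .. "P3"
    title: str
    evidence: str   # e.g. a file:line citation


def merge_findings(*runs: list[Finding]) -> list[Finding]:
    """Merge several audit runs into one deduplicated, priority-sorted list.

    Assumption: findings citing the same evidence are duplicates, and the
    copy with the more urgent priority is the one that is kept.
    """
    best: dict[str, Finding] = {}
    for run in runs:
        for finding in run:
            kept = best.get(finding.evidence)
            if kept is None or PRIORITY_ORDER[finding.priority] < PRIORITY_ORDER[kept.priority]:
                best[finding.evidence] = finding
    return sorted(best.values(), key=lambda f: (PRIORITY_ORDER[f.priority], f.title))
```

Because every run produces the same shape, merging needs no per-audit adapters: the runs are simply passed in together, as in `merge_findings(run_a, run_b)`.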
The method
Six phases, every time
Recon, a parallel specialist swarm, then adversarial verification before anything reaches the report.
You type (one line, any capable agent): Audit using auditor.rapold.io
- 0 · Reconnaissance: Factual inventory and a surface map. No opinions yet.
- 1 · Specialist swarm: Many domain experts (security, accessibility, performance, data, …) run in parallel, each evidence-bound.
- 2 · Cross-pollination: Merge, dedupe, and surface compound findings.
- 3 · Adversarial verify: Independent skeptics try to refute each P0/P1; a finding needs ≥2 of 3 to survive.
- 4 · Benchmark: Compare against named best-in-class references and standards.
- 5 · Synthesis: Report, scorecard, issues, and a 30/60/90 roadmap.
You get: priority-sorted GitHub issues.
The yardsticks
Two yardsticks. Every report measured against them.
Reusable on their own. One scores 0–100 against a rubric; the other fixes the exact issue shape — so two runs stay comparable even when the generated prose differs.
See what your AI agent finds when it has to prove every claim.
It is free and MIT-licensed. Run it on a throwaway branch, read findings that had to survive 2-of-3 skeptics, and keep only the fixes you agree with.
Every claim here survived the same audit — read the public backlog (#97).