Planner-Generator-Evaluator pattern

Overview

The Planner-Generator-Evaluator pattern is an agentic design pattern for LLM-assisted software development. It decomposes a development workflow into three specialised agent roles with distinct scopes and responsibilities, preventing any single agent from holding too much context or making unreviewable decisions in one pass.

The three roles

Planner

Input: a user brief or feature request
Output: a full specification (not code)
Constraint: plan-level only — no implementation details
In Claude Code: uses plan mode to generate the spec

The Planner’s scope restriction is deliberate: keeping it at spec-level means its output is reviewable by a human before any code is generated.
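The human review step described above can be sketched as a small gate between Planner and Generator. This is a minimal illustration, not part of the pattern as presented in the source: the `Spec` class, the `review` and `ready_for_generation` helpers, and the sample requirements are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Spec:
    """Hypothetical container for Planner output: requirements only, no code."""
    title: str
    requirements: list[str]
    approved: bool = False  # set by a human reviewer, never by the Planner


def review(spec: Spec, reviewer_accepts: bool) -> Spec:
    """Record the human decision on a copy, leaving the draft untouched."""
    return Spec(spec.title, list(spec.requirements), approved=reviewer_accepts)


def ready_for_generation(spec: Spec) -> bool:
    """The Generator may start only on an approved, non-empty spec."""
    return spec.approved and len(spec.requirements) > 0


draft = Spec("CSV export", ["Export button on the reports page", "UTF-8 output"])
print(ready_for_generation(draft))                # False
print(ready_for_generation(review(draft, True)))  # True
```

Keeping approval as an explicit field means the hand-off to the Generator can be checked mechanically rather than assumed.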

Generator

Input: the spec from the Planner
Output: implemented features, sprint by sprint
Method: builds incrementally, one sprint (or feature slice) at a time, rather than generating the entire codebase in a single pass

The incremental approach limits blast radius: if a sprint’s output is wrong, the damage is bounded and the Evaluator can catch it before the next sprint begins.
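One way to picture the bounded blast radius is a loop that builds a single sprint, evaluates it, and stops at the first failure. The sketch below is an assumption-laden illustration: `run_sprints` is a hypothetical helper, and the `generate` and `evaluate` callables are stand-ins for the Generator and Evaluator agents, not real Claude Code APIs.

```python
from typing import Callable, Optional


def run_sprints(
    sprints: list[str],
    generate: Callable[[str], str],
    evaluate: Callable[[str], bool],
) -> tuple[list[str], Optional[str]]:
    """Build one sprint at a time; stop at the first one that fails evaluation.

    Returns (names of accepted sprints, name of the failed sprint or None).
    """
    accepted: list[str] = []
    for sprint in sprints:
        artifact = generate(sprint)   # stand-in for the Generator agent
        if not evaluate(artifact):    # stand-in for the Evaluator agent
            return accepted, sprint   # later sprints are never started
        accepted.append(sprint)
    return accepted, None


# Toy run: the second sprint fails, so the third is never generated.
accepted, failed = run_sprints(
    ["auth", "export", "billing"],
    generate=lambda name: f"build:{name}",
    evaluate=lambda artifact: artifact != "build:export",
)
print(accepted, failed)  # ['auth'] export
```

Because the loop returns as soon as a sprint fails, any rework is limited to that sprint rather than to everything built after it.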

Evaluator

Input: the live running application (not just the code)
Output: a grade (pass/fail) against hard-coded thresholds
Method: interacts with the live app — not a code review, but behavioural verification
Implementation options:
- A /goal or /code-review skill (slash command)
- A dedicated verification subagent

Grading against hard thresholds makes the Evaluator the quality gate: it prevents Generator drift from accumulating undetected across sprints.
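A pass/fail grade against fixed thresholds might look like the sketch below. The source says only that the thresholds are hard-coded, so the two metrics used here (ratio of behavioural checks passed, p95 latency) and their values are assumptions picked for illustration, as are the `Observation` and `grade` names.

```python
from dataclasses import dataclass

# Hypothetical thresholds, fixed in code so the Evaluator cannot relax them.
MIN_CHECK_PASS_RATIO = 1.0    # every behavioural check must pass
MAX_P95_LATENCY_MS = 500.0    # assumed latency budget for the live app


@dataclass(frozen=True)
class Observation:
    """Measurements taken by interacting with the running application."""
    checks_passed: int
    checks_total: int
    p95_latency_ms: float


def grade(obs: Observation) -> bool:
    """Return True (pass) only if every threshold is met."""
    if obs.checks_total == 0:
        return False  # nothing was verified, so the sprint cannot pass
    ratio = obs.checks_passed / obs.checks_total
    return ratio >= MIN_CHECK_PASS_RATIO and obs.p95_latency_ms <= MAX_P95_LATENCY_MS


print(grade(Observation(10, 10, 320.0)))  # True
print(grade(Observation(9, 10, 320.0)))   # False
```

The grade is a plain boolean by design: there is no "mostly passed" outcome that a later sprint could build on.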

Why this pattern matters

Single-agent coding loops have two failure modes: (1) the agent drifts from the original intent across a long context window, and (2) the agent self-evaluates optimistically. The Planner-Generator-Evaluator pattern addresses both:

Separating planning from generation forces explicit scope definition upfront
Sprint-by-sprint generation bounds context length per step
An independent Evaluator role means the Generator never grades its own work, which counters self-evaluation bias

The pattern mirrors human engineering team structure: architect (Planner), engineers (Generator), QA (Evaluator).

Related

Claude Code: the tool that provides plan mode for the Planner role and slash commands for the Evaluator
LLM wiki: another multi-role LLM pattern (ingest/query/lint) that separates concerns similarly
Design Patterns: general design-pattern context; this is an agentic/workflow pattern rather than a structural one
Software Engineering: QA, spec-first development, and sprint-based delivery as background practices

Resources

2026-06-03 ◦ Claude Code: Team Infrastructure and Agentic Patterns (talk slides) — slide introducing the pattern with Planner (brief → spec), Generator (sprint-by-sprint), and Evaluator (live-app grading) roles
2026-06-26 ◦ nano-analyzer (GitHub, weareaisle) — LLM-powered vulnerability scanner illustrating multi-stage context injection: Stage 1 (context generation) → Stage 2 (analysis) → Stage 3 (triage/verification), with confidence-based escalation to stronger models for ambiguous cases; see Prompt engineering patterns