A single agent working alone is easy. A system where multiple agents coordinate work is where most people get lost. Three patterns cover the large majority of cases.
The problem with scaling agents
A single agent with access to all tools and all context degrades quickly. The context window fills up, instructions bleed into one another, and errors propagate unchecked. The solution isn’t bigger prompts; it’s smaller, well-coordinated systems.
Pattern 1: Orchestrator + Workers
The orchestrator receives the high-level task, breaks it down, and delegates to specialized workers. Each worker has a limited scope and specific tools.
Orchestrator: "Analyze this PR and write the report"
→ Worker A: reviews code changes (tools: Read, Bash/git)
→ Worker B: verifies tests (tools: Bash/npm test)
→ Worker C: drafts report (tools: Write)
The orchestrator receives the outputs and assembles them. No worker needs to know what the others are doing.
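The delegation described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the Worker class, the orchestrate function, and the lambda bodies are hypothetical stand-ins for real model calls, and the tools field only records each worker's allowed tools rather than enforcing them.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Worker:
    name: str
    tools: tuple[str, ...]     # tools this worker is allowed to use (recorded, not enforced)
    run: Callable[[str], str]  # hypothetical stand-in for a real model call


def orchestrate(task: str, workers: list[Worker]) -> str:
    """Delegate the task to each worker and assemble the outputs."""
    sections = []
    for worker in workers:
        # Each worker sees only the task, never another worker's output.
        output = worker.run(task)
        sections.append(f"[{worker.name}] {output}")
    return "\n".join(sections)


workers = [
    Worker("code-review", ("Read", "Bash/git"), lambda t: f"reviewed changes for: {t}"),
    Worker("tests", ("Bash/npm test",), lambda t: f"ran tests for: {t}"),
    Worker("report", ("Write",), lambda t: f"drafted report for: {t}"),
]

report = orchestrate("example PR", workers)
print(report)
```

Keeping run as a plain callable means a worker can be swapped for a real agent later without touching the orchestrator.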
Pattern 2: Sequential pipeline
Use this for processes where each step depends on the previous one. Data flows from agent to agent, each transforming the previous agent’s output.
This pattern is simple but powerful for ETL processes, content pipelines, or any workflow where order matters. The risk is that an error in one step blocks the entire pipeline, so design each handoff with explicit validation.
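One way to make handoff validation concrete is to pair every stage with a check that runs before the next stage starts. The sketch below assumes each stage is a plain function from string to string; run_pipeline, HandoffError, and the three example stages are hypothetical names chosen for illustration.

```python
from typing import Callable

Step = Callable[[str], str]    # transforms the previous stage's output
Check = Callable[[str], bool]  # validates a stage's output before handoff


class HandoffError(Exception):
    """Raised when a stage's output fails validation before the handoff."""


def run_pipeline(data: str, stages: list[tuple[str, Step, Check]]) -> str:
    for name, step, is_valid in stages:
        data = step(data)
        # Fail fast and name the stage, instead of passing bad data downstream.
        if not is_valid(data):
            raise HandoffError(f"stage '{name}' produced invalid output")
    return data


stages = [
    ("extract", lambda s: s.strip(), lambda s: len(s) > 0),
    ("transform", lambda s: s.upper(), lambda s: s.isupper()),
    ("load", lambda s: f"stored:{s}", lambda s: s.startswith("stored:")),
]

result = run_pipeline("  hello  ", stages)
print(result)
```

Because the error names the failing stage, a blocked pipeline at least tells you where to look.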
Pattern 3: Parallel fan-out
Use this when you have independent tasks that can run simultaneously. The orchestrator launches them in parallel and waits for all results before continuing.
The implementation with the Claude API is straightforward: multiple async calls with asyncio.gather() in Python or Promise.all() in JS. Each agent’s context is independent, so there’s no interference risk.
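A minimal Python sketch of the fan-out, using asyncio.gather() with a simulated agent call. The run_agent and fan_out functions and the task names are hypothetical; in a real system run_agent would await an async API client instead of sleeping.

```python
import asyncio


async def run_agent(name: str, task: str) -> str:
    # Hypothetical placeholder for an async model call; sleep simulates latency.
    await asyncio.sleep(0.01)
    return f"{name}: done {task}"


async def fan_out(tasks: dict[str, str]) -> list[str]:
    # gather() runs the coroutines concurrently and returns results in input order.
    # return_exceptions=True collects failures instead of raising on the first one.
    results = await asyncio.gather(
        *(run_agent(name, task) for name, task in tasks.items()),
        return_exceptions=True,
    )
    failures = [r for r in results if isinstance(r, BaseException)]
    if failures:
        raise RuntimeError(f"{len(failures)} agent(s) failed")
    return list(results)


results = asyncio.run(
    fan_out({"security": "scan deps", "style": "lint", "docs": "check links"})
)
print(results)
```

Passing return_exceptions=True is a design choice: the orchestrator sees every result before deciding how to handle failures, rather than stopping at the first exception.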
The practical rule
Start with one agent. Split only when you notice the context window filling up or the agent mixing distinct concerns. Each split should have a concrete reason; avoid premature abstraction.