nikolas.sapa
April 16, 2026

06 — Custom Agents

An agent is a specialist Claude spawned inside your session — its own system prompt, its own tool list, its own model. The main session delegates; the agent executes; results come back. Use them to parallelize, to isolate context, or to run a cheaper model on work that doesn't need the best one.


The Core Idea

Without agents, every task runs in one long context window. Claude sees everything you've ever said this session, and complexity compounds. Agents break that.

Each agent gets:

  • A focused system prompt (what it knows and how it behaves)
  • A scoped tool list (only what it needs)
  • Its own model (matched to the task's cost)
  • An isolated context (it doesn't see your full history)

When the agent finishes, it returns output to the calling session. Clean handoff.


Three Reasons to Use Agents

1. Parallelism. If two tasks don't depend on each other, run them simultaneously. A security review and a database migration can happen at the same time — you don't wait for one to finish before starting the other.

2. Context isolation. Some tasks are better with a fresh slate. A code reviewer that only sees the diff (not your 40-message session history) will give cleaner output. Agents let you control exactly what context a task gets.

3. Specialization. A single well-prompted specialist outperforms a generalist on its home turf. A git agent that knows every merge strategy cold is more reliable than asking the main session to figure it out mid-task.


Model Routing

This is where agents pay for themselves. Not every task needs the same model.

Work typeModelWhy
Exploration, file reads, lookupshaikuFast, cheap, capable enough
Implementation, writing, editssonnetDefault. Strong across the board.
Architecture, hard decisionsopus via advisor()Expensive — use sparingly

The rule: use haiku for anything read-only or repetitive. Default to sonnet for anything that writes. Opus is a deliberate choice, not a default.

On a moderately complex task (explore → plan → implement → review), routing correctly can cut costs by 40–60% with no quality loss.


How Agents Are Defined

Agents live in ~/.claude/agents/. Each is a Markdown file with YAML frontmatter followed by the system prompt.

---
name: my-agent
description: What this agent does and when Claude should use it
model: claude-haiku-4-5
tools: Read, Glob, Grep
color: blue
---

You are a specialist in X. When given Y, do Z.
Focus only on [scope]. Do not [out-of-scope thing].

The description field is what Claude reads when deciding whether to spawn this agent. Write it as a clear trigger: "Use this agent when reviewing security implications of auth changes." Vague descriptions lead to wrong-agent calls.


Parallel vs Sequential

Parallel — tasks that don't depend on each other's output:

  • Lint check + type check + test run
  • Security audit + dependency audit
  • Frontend implementation + backend implementation

Sequential — when B needs A's output:

  • Plan → implement (need the plan before writing code)
  • Write → review (need the draft before reviewing it)
  • Migrate schema → update queries (need new schema first)

In practice, most real workflows are a mix: a parallel exploration phase, then a sequential implementation phase, then a parallel verification phase.


A Worked Example

Task: add a new API endpoint, including tests and docs.

Instead of one long session, you might structure it as:

  1. Parallel: spawn an explorer agent (haiku) to map the existing route structure, and a reviewer agent (haiku) to read the current test patterns.
  2. Sequential: feed both outputs to an implementation agent (sonnet) that writes the endpoint + tests.
  3. Sequential: pass the implementation to a docs agent (sonnet) that writes the API reference.
  4. Parallel: run a security agent (sonnet) and a lint agent (haiku) against the finished code simultaneously.

Total wall time drops. Each agent works with only the context it needs. The main session orchestrates without doing the work itself.


A Starter Roster

A starter set worth having. Add specialists as your work patterns emerge.

  • Explore — built-in. Maps files, finds patterns, understands structure. Always haiku; it's read-only work.
  • Plan — built-in. Designs implementation strategy before any code gets written. Feed it output from Explore, then hand its plan to an implementation agent.
  • general-purpose — built-in. Open-ended research, multi-step lookups, anything that doesn't fit a specialist. Last resort, not first.
  • reviewer — spawn after implementation. Reads the diff, checks for security issues, wrong types, missed edge cases. Give it no write tools — read-only by design.
  • debugger — diagnoses failing tests and applies the minimal fix. Useful because it forces a hypothesis before touching code.
  • frontend / backend / database — domain specialists worth separating so each gets the right tool scope and model. A frontend agent with database write tools is an accident waiting to happen.

Building Your Own

Start with one specialist for your most repeated task — the thing you ask Claude to do the same way every session. Write a tight system prompt, give it only the tools it needs, pick the cheapest model that can do the job well.

Good agents are narrow. An agent that does three different things is three agents waiting to happen.