Prompt Chaining

Break complex tasks into reliable steps — 2 real production pipelines with copy-paste prompts for Claude, ChatGPT, and any LLM.

One Prompt vs Chain — When Each Wins

ApproachBest forMain riskDebuggability
Single promptSimple, well-defined tasks (<500 word output)Model loses focus on complex tasksLow — can't isolate which part failed
Prompt chain ✓Multi-step tasks, long outputs, tasks needing validationMore API calls; slightly more latencyHigh — inspect and retry each step independently

Blog Post Pipeline (4-step chain)

Writing a long-form blog post in one prompt produces inconsistent structure and rambling sections. A 4-step chain produces structured, consistent output that's easy to validate and fix at each stage.

Step 1

Outline

Generate a 6-section outline for a 1,500-word blog post titled: "How to Cut Your Claude API Costs by 80% with Prompt Caching" Target audience: backend developers using the Anthropic API in production. Return: a JSON array with {section_title, key_points: string[]} for each section.
Structured JSON output makes the outline machine-readable — you can validate it, display it for human review, or feed specific sections to the next step.

Step 2

Section drafts (loop)

Write the following section of a blog post for backend developers. Section title: {section_title} Key points to cover: {key_points} Target length: 200–250 words Tone: technical, direct, no marketing fluff Return: the section text only, no preamble.
Run this prompt once per section (6 calls). Each call is small, focused, and retryable independently. Parallel execution cuts wall-clock time by ~5×.

Step 3

Edit pass

Edit the following draft blog post for clarity and consistency. Rules: fix passive voice, remove filler phrases, ensure technical terms are used consistently, ensure examples use the same API version throughout. Draft: {assembled_sections} Return: the edited post only.
Editing is a separate cognitive task from drafting. Separating them produces better results than asking the model to write and edit simultaneously.

Step 4

SEO metadata

Given this blog post, generate: 1. A title tag (55–60 chars) optimized for "claude api cost" 2. A meta description (150–160 chars) with primary CTA 3. 5 semantic keyword variations for H2 headers Blog post: {edited_post} Return: JSON with {title, meta_description, keywords: string[]}
SEO metadata requires reading the final post. Running this step last ensures it reflects actual content rather than the original brief.

Code Review Pipeline (3-step chain)

Asking an LLM to simultaneously find bugs, check style, and generate tests produces mediocre results on all three. Chaining produces thorough, actionable output for each concern.

Step 1

Bug and security scan

You are a senior security engineer. Review this code for bugs and security vulnerabilities only. For each issue: severity (Critical/High/Medium/Low), vulnerability class, line number, exploit scenario, fix. Return: JSON array {severity, class, line, exploit, fix}. Code: {code}
Constraining to bugs/security only prevents the model from mixing in style feedback, which dilutes severity judgments.

Step 2

Style and maintainability

Review this code for style and maintainability issues only. Ignore security (already reviewed). Focus: naming, function length, comments, error handling, test coverage gaps. Return: JSON array {category, description, suggested_fix}. Code: {code}
Style issues from step 2 are kept separate from security issues from step 1. The author sees two distinct, actionable lists — not a mixed jumble.

Step 3

Test generation

Generate unit tests for this code. Base tests on the bugs found in the security review. Code: {code} Security bugs found: {bugs_json} Return: complete test file in pytest format. Include one test per security finding that would catch the bug.
Step 3 has access to step 1 output — tests are written specifically to catch the bugs found, not generic happy-path tests.

Prompt Chaining Best Practices

  • Force structured output at each step — JSON responses are easier to validate and inject into the next prompt
  • Add a validation gate between steps — check format and required fields before advancing; retry the step if it fails
  • Parallelize independent steps — steps that don't depend on each other (e.g. multiple section drafts) can run simultaneously
  • Cache the shared prefix — if your system prompt appears in multiple steps, enable prompt caching (Claude) to save tokens
  • Keep each step's context minimal — only pass the fields the next step actually needs, not the entire previous output
  • Name your placeholders consistently{output}, {bugs_json}, {section_title} make chains readable and maintainable
→ Estimate chain token cost→ Improve a chain step prompt

Frequently Asked Questions

What is prompt chaining?
Prompt chaining is a technique where you break a complex task into a sequence of smaller prompts, passing the output of each step as input to the next. Instead of asking an LLM to "write a complete blog post from scratch," a chain might be: (1) outline generation → (2) section-by-section drafting → (3) editing pass → (4) SEO title generation. Each prompt is simple and targeted; the chain handles complexity. Prompt chaining is the foundation of most production AI agents and workflows.
Why use prompt chaining instead of one big prompt?
One big prompt has three problems: (1) LLMs lose focus and make more errors as prompts get longer. (2) You can't inspect or fix intermediate steps — if the output is wrong, you don't know where it went wrong. (3) Retrying the whole prompt is expensive. Chaining solves all three: each step is simple enough for the model to do well, you can inspect and validate each output, and you can retry a single broken step without rerunning the entire pipeline.
What is the difference between prompt chaining and an AI agent?
Prompt chaining is a sequence of prompts where the next step is determined by you (the developer) in advance. An AI agent is a system where the model itself decides what to do next — it reads a task, picks a tool, executes it, reads the result, and decides the next action. Agents use prompt chaining internally (each reasoning step is a prompt), but the chain is dynamic rather than fixed. Chaining is simpler, more predictable, and easier to debug. Agents are more flexible for open-ended tasks.
How do I pass context between steps in a prompt chain?
Three patterns: (1) Direct injection — the raw output of step N becomes the input to step N+1. Simple but can bloat later prompts. (2) Summarization — after each step, add a compression prompt that extracts only the fields the next step needs. (3) Structured extraction — force each step to return JSON, then inject only the relevant keys. For production chains, pattern 3 is most robust: structured output prevents formatting bugs and makes it easy to validate each step before proceeding.
Does prompt chaining work with Claude?
Yes — Claude is particularly effective in chains because it follows structured output instructions reliably (critical for passing data between steps), handles large contexts well for step N+1 inputs, and its strong instruction-following reduces step-level errors. For production Claude chains, use the Messages API, enable prompt caching on the shared prefix (saves tokens on the system prompt repeated across steps), and build in validation gates that check output format before advancing to the next step.

Individual step not working? Let the optimizer fix it.

Paste any step prompt into Prompt Improver. It rewrites the instruction, adds the right output format constraint, and shows every change. Free, BYO Anthropic API key.

→ Improve My Chain Step
🔥 Tonight: Claude Code Power Prompts · £5 £3 first 10Get PDF →