Six prompt shapes — generate, refactor, debug, explain, orchestrate, and review — cover most useful Claude Code interactions, and each one rewards a different prompt structure. This post is part 2 of a four-part series on whether Claude Code is a 5th-generation programming language. The pillar post made the argument that “describe what, not how” works at general-purpose programming scale. This post covers the practical layer underneath that argument: what the prompts that actually work look like, and the anti-patterns that produce plausible-but-wrong code I end up rewriting.

I have been using Claude Code as my daily driver for a while now. Some of these patterns I picked up on the first day. Most of them came from prompting badly, watching the agent produce something that compiled and did the wrong thing, and figuring out what part of my specification was missing.

Quick summary

  • Generate — write code that does X. Best when the desired outcome is concrete and the constraints are explicit.
  • Refactor — change code Y to behave differently while preserving Z. Best with named constraints and a passing test suite.
  • Debug — find why test or behavior X is failing. Best with the actual failure output included.
  • Explain — describe what code Y does. Best with a specific file or symbol pointed at.
  • Orchestrate — plan and execute a multi-step task. Best with the steps, dependencies, and acceptance criteria stated.
  • Review — audit for a named concern (security, perf, style). Best with the concern named explicitly, not “anything wrong.”

The anti-patterns: vague performance prompts, no-constraints refactors, unbounded scope, missing acceptance criteria, no file pointers in large repos, and “write code for X” without first asking whether X should exist.

Generate: write code that does X

Generation prompts produce new code from a description of desired behavior. They work best when the description includes the surrounding context (where the code goes, what language and conventions to use), the input and output shape, and at least one acceptance criterion the agent can verify by running.

A weak generate prompt:

Write a function that retries failing requests.

A specific one:

In src/Http/RetryClient.cs, add a method SendWithRetryAsync(HttpRequestMessage request, CancellationToken ct) that wraps _inner.SendAsync with up to three retries on HttpRequestException or 5xx responses, using exponential backoff starting at 200ms. The existing tests in RetryClient.Tests.cs should still pass; add a new test that verifies the retry count on a transient 503.

The first prompt is going to produce a method that probably retries something. The second prompt produces a method I can ship. The difference is not “more words” — it is that the second specifies the file, the signature, the failure modes that count as retryable, the backoff strategy, and the acceptance test.
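For reference, here is roughly the shape the second prompt tends to produce. This is my own sketch, not a verbatim agent transcript: the surrounding class, the _inner field, and the exact backoff arithmetic are assumptions about a made-up codebase.

    // Sketch only: RetryClient's real fields and conventions are assumed here.
    public sealed class RetryClient
    {
        private readonly HttpClient _inner;
        public RetryClient(HttpClient inner) => _inner = inner;

        public async Task<HttpResponseMessage> SendWithRetryAsync(
            HttpRequestMessage request, CancellationToken ct)
        {
            const int maxRetries = 3;
            var delay = TimeSpan.FromMilliseconds(200);

            for (var attempt = 0; ; attempt++)
            {
                try
                {
                    var response = await _inner.SendAsync(request, ct);
                    if ((int)response.StatusCode < 500 || attempt == maxRetries)
                        return response;
                }
                catch (HttpRequestException) when (attempt < maxRetries)
                {
                    // transient failure: fall through to the backoff below
                }

                // In real code the request likely needs cloning before a resend,
                // since an HttpRequestMessage cannot be sent twice.
                await Task.Delay(delay, ct);
                delay *= 2; // 200ms, 400ms, 800ms
            }
        }
    }

The point of writing it out is that every line traces back to a phrase in the prompt: the signature, the retry count, the retryable conditions, the backoff, and the test that pins the behavior down.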

Refactor: change Y while preserving Z

Refactor prompts change existing code in a constrained way. They work best when the prompt names the constraint that must hold across the change. Without the constraint, the agent will happily change the public API, tighten or loosen types, or rename things in ways that break callers you forgot existed.

A weak refactor prompt:

Clean up OrderService.cs.

A specific one:

In OrderService.cs, extract the validation logic in PlaceOrder (lines 47-92) into a separate OrderValidator class with a single Validate(Order) method that returns Result<Unit, OrderValidationError>. The public signature of PlaceOrder must not change, and all existing tests must still pass.

The constraint “the public signature of PlaceOrder must not change” is the part that makes this safe to merge. Without it, the agent will sometimes decide that returning a different shape is “cleaner” and change the public API along the way.
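To make the constraint concrete, here is roughly the shape of the extraction. The Order, Unit, and Result definitions are stand-ins I added so the sketch compiles on its own; a real repo would use its existing types and validation rules.

    // Stand-ins so the sketch is self-contained; the real codebase's
    // Order, Result, and error types would be used instead.
    public sealed record Order(IReadOnlyList<string> Lines, decimal Total);
    public enum OrderValidationError { EmptyOrder, NonPositiveTotal }
    public readonly record struct Unit;
    public sealed record Result<TOk, TErr>(bool IsSuccess, TOk? Value, TErr? Error)
    {
        public static Result<TOk, TErr> Ok(TOk value) => new(true, value, default);
        public static Result<TOk, TErr> Fail(TErr error) => new(false, default, error);
    }

    // The extracted class: one method, no side effects, trivially unit-testable.
    public sealed class OrderValidator
    {
        public Result<Unit, OrderValidationError> Validate(Order order)
        {
            if (order.Lines.Count == 0)
                return Result<Unit, OrderValidationError>.Fail(OrderValidationError.EmptyOrder);
            if (order.Total <= 0)
                return Result<Unit, OrderValidationError>.Fail(OrderValidationError.NonPositiveTotal);
            return Result<Unit, OrderValidationError>.Ok(default);
        }
    }

PlaceOrder keeps its signature and delegates to the validator, so callers and existing tests never notice the change.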

Debug: find why X is failing

Debug prompts ask the agent to find the cause of a specific failure. They work best when the actual failure output (stack trace, test name, observed vs expected behavior) is included in the prompt. The agent reading the code without the failure context is just guessing.

A weak debug prompt:

The order endpoint is broken, please fix it.

A specific one:

The test OrderControllerTests.PlaceOrder_PersistsOrder is failing on main with this output:

Expected order.Id to be > 0 but was 0
at OrderControllerTests.PlaceOrder_PersistsOrder line 34

The test was passing as of commit abc123. Find what changed between then and now that would cause this, and propose a fix that does not change the test.

The agent now has a specific failure, a specific point in history, and a constraint (do not change the test). The investigation surface is narrow enough to actually solve.

Explain: what does this code do

Explain prompts ask the agent to describe what existing code does. They work best when pointed at a specific file, symbol, or line range, with a stated audience for the explanation. “Explain this codebase” is too broad to produce anything useful.

Explain the role of IIncrementalGenerator.Initialize in src/Generators/JsonModelGenerator.cs:18-86. Audience: a C# developer who has not written a Roslyn source generator before. Cover what each registered callback does and in what order.

The audience constraint matters more than people expect. “Explain to a beginner” produces a different output than “explain to someone who already knows Roslyn but not source generators specifically.” The agent will calibrate the depth to whatever audience you name.
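For a sense of what that explanation has to cover, the skeleton of an incremental generator looks something like this. The predicate and the generated output below are placeholders I am assuming, not the real JsonModelGenerator.

    using Microsoft.CodeAnalysis;
    using Microsoft.CodeAnalysis.CSharp.Syntax;

    // Sketch of the general shape, not the actual generator being explained.
    [Generator]
    public sealed class JsonModelGenerator : IIncrementalGenerator
    {
        public void Initialize(IncrementalGeneratorInitializationContext context)
        {
            // First registered callback: a cheap syntactic filter that runs
            // constantly as the user types, so it should do almost no work.
            var candidates = context.SyntaxProvider.CreateSyntaxProvider(
                predicate: static (node, _) => node is ClassDeclarationSyntax,
                transform: static (ctx, _) => (ClassDeclarationSyntax)ctx.Node);

            // Second registered callback: runs only for nodes that survived
            // the predicate and emits the generated source.
            context.RegisterSourceOutput(candidates, static (spc, cls) =>
                spc.AddSource($"{cls.Identifier.Text}.g.cs", "// generated code here"));
        }
    }

Much of the explanation's job is to convey that ordering: Initialize only wires up a pipeline, and Roslyn invokes the callbacks later, the cheap ones far more often than the expensive ones.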

Orchestrate: plan and execute a multi-step task

Orchestration prompts ask the agent to break a larger task into steps and execute them in order. They work best when the steps are stated explicitly, the dependencies between them are clear, and each step has a verifiable outcome. Without that structure, the agent invents its own decomposition, which is sometimes great and sometimes not what you wanted.

A weak orchestration prompt:

Add a new column archived to the orders table.

A specific one:

Add an archived: bool column to the Orders table, default false. Steps:

  1. Create a new EF Core migration named AddArchivedToOrders.
  2. Update the Order model in src/Models/Order.cs with the new property.
  3. Update OrderDto and the JSON contract in src/Api/OrderDto.cs so the new field is exposed but optional in the response.
  4. Update existing tests that construct Order directly so they still pass.
  5. Run dotnet test and confirm everything is green before reporting back.

That sequence is mostly mechanical, but it covers four files across three layers, and one of them is a database migration. Without the explicit decomposition, the agent will pick its own ordering, often skip the response contract, and call the work done before running the test suite. With the decomposition, the result is checkable.
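For scale, steps 1 and 2 amount to roughly the following, assuming conventional EF Core naming. In practice the migration comes out of dotnet ef migrations add AddArchivedToOrders rather than being written by hand.

    // Step 1 (sketch): the core of the scaffolded migration file.
    using Microsoft.EntityFrameworkCore.Migrations;

    public partial class AddArchivedToOrders : Migration
    {
        protected override void Up(MigrationBuilder migrationBuilder) =>
            migrationBuilder.AddColumn<bool>(
                name: "Archived",
                table: "Orders",
                nullable: false,
                defaultValue: false);

        protected override void Down(MigrationBuilder migrationBuilder) =>
            migrationBuilder.DropColumn(name: "Archived", table: "Orders");
    }

    // Step 2 (sketch): src/Models/Order.cs gains the property;
    // the rest of the model is unchanged.
    public class Order
    {
        public bool Archived { get; set; } = false;
    }

The value of the explicit step list is less in these two fragments than in steps 3 through 5, which are the ones the agent tends to skip when left to its own decomposition.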

Review: audit for a named concern

Review prompts ask the agent to audit existing code for a specific concern. They work best when the concern is named — security, performance, accessibility, error handling — and the scope is bounded. “Review this code” produces filler. “Review this code for X” produces actual findings.

Review src/Http/RetryClient.cs for retry-loop bugs:

  • Is the backoff actually exponential, or does it accidentally reset between retries?
  • Are cancellation tokens propagated correctly through every retry?
  • Is the maximum retry count enforced even when the underlying call throws?

For each finding, point to the specific line and propose a fix.

The named concern keeps the review focused. The bounded scope keeps the response short enough to act on.
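The first bullet is in that prompt because the bug is easy to write. A hypothetical example of the pattern the review should flag:

    // Hypothetical buggy fragment of the kind the first bullet targets:
    // the delay is re-created inside the loop, so the "exponential" backoff
    // waits 200ms on every attempt instead of 200, 400, 800.
    static async Task BackoffWithResetBugAsync(int maxRetries, CancellationToken ct)
    {
        for (var attempt = 0; attempt < maxRetries; attempt++)
        {
            var delay = TimeSpan.FromMilliseconds(200); // bug: resets each iteration
            await Task.Delay(delay, ct);
            delay *= 2; // doubles a value that is about to be discarded
        }
    }

The fix to propose is hoisting the delay out of the loop so the doubling carries across attempts, which is exactly what the generate sketch earlier in the post does.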

Anti-patterns: prompts that waste tokens

These are the patterns I have used and regretted, in roughly the order I learned to stop using them:

  • Vague performance prompts. “Make it faster” without a baseline, target, or hot path produces a refactor that may or may not address the actual bottleneck. The agent has no way to know what “fast enough” means.
  • No-constraints refactors. “Clean this up” or “improve this” gives the agent permission to change anything it does not like, which often includes the public API of the thing you asked it to clean.
  • Unbounded scope. “Rewrite the order module” is a project, not a prompt. Break it into orchestrated steps with verifiable outcomes per step.
  • Missing acceptance criteria. Without a test or observable success condition, the agent’s stopping point is “the code looks reasonable,” which is not the same thing as “the code is correct.”
  • No file pointers in a large repo. The agent can search, but you saying “in src/Foo.cs around line 87” is faster, cheaper, and more accurate than letting the agent grep first.
  • “Write code for X” without asking whether X should exist. Some prompts are architecture decisions disguised as implementation requests. The agent will happily implement something that should not have been built. That is a human-judgment failure, not a tool failure.
  • Stack Overflow-style prompts. “How do I do X in C#” is what a search engine is for. Claude Code’s value is in working on your code; if the prompt does not reference your repo, you are paying agent prices for a generic answer.

Meta-pattern: write the prompt as if to a senior engineer who just walked in

The single mental model that has helped me most is to write the prompt as if I am briefing a senior engineer who joined the team this morning. They are smart, they are skilled, they have not seen the code, and they have one hour. Tell them what we are doing, where the code lives, what the constraints are, and what done looks like. That mental model produces prompts that the agent can act on and that I can review.

Most of the bad prompts I have written break this rule in some specific way: they assume context the agent does not have, they leave done undefined, or they smuggle in an architecture decision the agent should not be making.

My take

The Claude Code agent is good. The bottleneck is usually upstream of the agent — at the human writing the prompt. That is the part of the workflow that has the most leverage to improve. Once specifying the work clearly becomes a habit, the velocity gain is real.

The other three posts in this series cover where the agentic frame fits, where authorship goes when 200 words of prompt produce 500 lines of code, and what gets built when the marginal cost of writing code drops:

  • Pillar: Is Claude Code a 5th-Generation Language? — the definitional argument
  • Philosophical: Who Wrote This Code? (coming soon) — authorship and the specification crisis
  • Future-looking: The Long Tail of Bespoke Software (coming soon) — what gets built when code gets cheap

If you have been getting plausible-but-wrong code out of Claude Code, the fix is almost always one layer up — in the specification you handed it.