Prompting best practices

A great flow lives or dies by the prompts you give its actors. The model is capable; the prompt is what tells it how to be capable at your problem. This page distils Anthropic’s official Prompting best practices for Claude Opus 4.7, Sonnet 4.6, and Haiku 4.5, and translates each technique into how you’d apply it inside CogniAgent — actor system prompts, the Ask AI node, Call AI Agent, and step-builder configurations.

Most of these techniques are additive — you can mix and match. Start with Be clear and direct, Use examples, and Add context, and reach for the more specialized ones (XML structure, role, prefill replacements, thinking guidance) when the basics don’t fully solve your problem.

The golden rule

Show your prompt to a colleague with minimal context on the task and ask them to follow it. If they’d be confused, the model will be too.

Think of the model as a brilliant new hire who has no idea about your norms, customer base, edge cases, or internal jargon. The more precisely you explain what you want, the better the result.

Foundations

1. Be clear and direct

State exactly what you want. If you want “above and beyond” behavior, ask for it. If the order of steps matters, number them.

Less effective:

Help users with their support requests.

More effective:

You are a support agent for Acme. Your job is to:

1. Greet the user by name if they introduce themselves.
2. Ask one clarifying question if the issue isn't yet specific enough to act on.
3. Resolve the request by either: (a) answering directly from the knowledge base,
   (b) handing off to a human when the user asks for an agent, or (c) creating
   a ticket via the `create_ticket` capability if the issue can't be resolved live.
4. End by confirming what you did and asking if there's anything else.

This applies to actor system prompts, Ask AI node prompts, and the Task field in step-builder definitions.

2. Add context to improve performance

Telling the model why a rule matters lets it generalize sensibly to edge cases you didn’t anticipate.

Less effective:

NEVER use ellipses.

More effective:

This actor's responses are read aloud by a text-to-speech engine in our phone
channel, so never use ellipses since the TTS engine won't know how to pronounce them.

The second version also covers em-dashes, ASCII art, and other written-only quirks the model will infer are problematic for the same reason.

When you write an actor system prompt, briefly tell the model what channel it lives on (widget, email, Telegram, phone), who the user is (paying customer, internal employee, lead), and what success looks like (booked meeting, ticket created, question answered). Three sentences of context outperform three pages of rules.

3. Use examples (few-shot prompting)

Examples are the most reliable way to steer output format, tone, and structure. A few well-chosen examples often beat a paragraph of instructions. When adding examples to an actor or Ask AI prompt, make them:

Relevant — Mirror your real cases, including the awkward ones.
Diverse — Cover edge cases. Vary enough that the model doesn’t pick up unintended patterns.
Structured — Wrap them in <example> tags so the model can tell them apart from instructions.

Here are examples of how to triage an inbound lead message:

<examples>
  <example>
    <user_message>Hey, do you guys do annual plans?</user_message>
    <triage>billing_inquiry</triage>
    <response>Yes — our annual plan is 20% off the monthly rate. Want me to walk you through the pricing?</response>
  </example>
  <example>
    <user_message>I keep getting "401 Unauthorized" when I call /v1/users.</user_message>
    <triage>technical_support</triage>
    <response>That's almost always an expired API key. Can you check Settings → API Keys and try regenerating?</response>
  </example>
  <example>
    <user_message>cancel my subscription</user_message>
    <triage>retention_risk</triage>
    <response>I'm sorry to hear that. Before we cancel, can I ask what's not working for you?</response>
  </example>
</examples>

Three to five examples is the sweet spot. You can also ask the model itself (“evaluate these examples for diversity”) or have it generate new variants from your initial set.
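
If the examples live in a spreadsheet or a list rather than in the prompt itself, a small helper can render them into the tagged format shown above. This is a minimal sketch in Python, not a CogniAgent feature: `build_examples_block` is a hypothetical name, and the three field names are simply the ones used in the lead-triage example.

```python
def build_examples_block(examples):
    """Render a list of example dicts as an <examples> block.

    Each dict is assumed to carry 'user_message', 'triage', and 'response'
    keys, mirroring the lead-triage prompt (an assumption of this sketch).
    """
    lines = ["<examples>"]
    for example in examples:
        lines.append("  <example>")
        for field in ("user_message", "triage", "response"):
            lines.append(f"    <{field}>{example[field]}</{field}>")
        lines.append("  </example>")
    lines.append("</examples>")
    return "\n".join(lines)


triage_examples = [
    {
        "user_message": "Hey, do you guys do annual plans?",
        "triage": "billing_inquiry",
        "response": "Yes, our annual plan is 20% off the monthly rate.",
    },
    {
        "user_message": "cancel my subscription",
        "triage": "retention_risk",
        "response": "I'm sorry to hear that. Can I ask what's not working for you?",
    },
]

print(build_examples_block(triage_examples))
```

Keeping the examples as data makes it easier to add, remove, or reorder them while checking that the set stays diverse.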

4. Structure prompts with XML tags

When a prompt mixes instructions, context, examples, and variable inputs, wrap each block in its own XML tag. The model parses these reliably and won’t confuse the example tone for an instruction.

<role>
You are a senior account executive at Acme.
</role>

<context>
The user is a free-tier customer who has been on the platform for 14 days.
They have not yet invited a teammate.
</context>

<instructions>
1. Recommend they invite a teammate to unlock collaboration features.
2. Use a warm, peer-to-peer tone — not a sales pitch.
3. Never make claims about features that aren't in <feature_list>.
</instructions>

<feature_list>
{{features_csv}}
</feature_list>

<conversation>
{{conversation_so_far}}
</conversation>

Use consistent tag names across the flow’s actors so the model can pick up patterns. Nest tags when there’s a natural hierarchy (e.g. <documents> containing <document index="n">).
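
The `{{features_csv}}` and `{{conversation_so_far}}` placeholders above are filled in before the prompt reaches the model. How CogniAgent does this internally is not covered here; the sketch below only illustrates the idea in Python, using a hypothetical `render_prompt` helper that fails loudly when a variable is missing, so that an unfilled placeholder never reaches the model.

```python
import re

# Matches {{name}} with optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template, variables):
    """Replace {{name}} placeholders; raise KeyError if a variable is missing."""
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"missing prompt variable: {name}")
        return str(variables[name])

    return PLACEHOLDER.sub(substitute, template)


template = """<feature_list>
{{features_csv}}
</feature_list>

<conversation>
{{conversation_so_far}}
</conversation>"""

prompt = render_prompt(template, {
    "features_csv": "shared inbox,team analytics",
    "conversation_so_far": "User: How do I add a colleague?",
})
print(prompt)
```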

5. Give the actor a role

Setting a role focuses behavior and tone fast. Even one sentence helps.

You are a calm, factual technical support agent who specializes in API authentication issues.
You explain things in concrete terms and never speculate about features you can't verify.

In CogniAgent, the actor’s Name and Description fields combine with the system prompt to set role. Spend a minute on them — they show up in the conversation UI and in handoff messages between actors, so they’re not just metadata.

6. Long-context prompting

For Ask AI, Call AI Agent, or actors that read long documents (20k+ tokens), structure matters:

Put the long content at the top, queries at the bottom. Queries placed after the documents can improve quality by up to 30% in Anthropic’s tests.
Wrap each document in <document> tags with <source> and <document_content> subtags.
Ask the model to quote first. For long-doc tasks, prompt it to extract relevant quotes into <quotes> tags before reasoning. This cuts through noise.

<documents>
  <document index="1">
    <source>annual_report_2024.pdf</source>
    <document_content>
      {{ANNUAL_REPORT}}
    </document_content>
  </document>
  <document index="2">
    <source>competitor_analysis_q2.xlsx</source>
    <document_content>
      {{COMPETITOR_ANALYSIS}}
    </document_content>
  </document>
</documents>

Find quotes from the documents relevant to Q3 strategy and place them in <quotes>
tags. Then, based on those quotes, recommend three Q3 focus areas in <recommendations> tags.
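
The layout above (documents first, query last) is easy to get wrong when a prompt is assembled by hand. As a hedged illustration, the Python sketch below builds it from a list of `(source, content)` pairs; `build_long_context_prompt` is a hypothetical helper name, not part of CogniAgent.

```python
def build_long_context_prompt(documents, query):
    """Put every document at the top and the query at the very end.

    `documents` is a list of (source, content) pairs; indexes start at 1.
    """
    parts = ["<documents>"]
    for index, (source, content) in enumerate(documents, start=1):
        parts.append(f'  <document index="{index}">')
        parts.append(f"    <source>{source}</source>")
        parts.append("    <document_content>")
        parts.append(content)
        parts.append("    </document_content>")
        parts.append("  </document>")
    parts.append("</documents>")
    parts.append("")  # blank line between the documents and the query
    parts.append(query)
    return "\n".join(parts)


prompt = build_long_context_prompt(
    [
        ("annual_report_2024.pdf", "(report text goes here)"),
        ("competitor_analysis_q2.xlsx", "(analysis text goes here)"),
    ],
    "Find quotes relevant to Q3 strategy and place them in <quotes> tags.",
)
print(prompt)
```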

Output and formatting

Control verbosity

Claude Opus 4.7 calibrates response length to perceived task complexity. If your channel needs a specific shape, tell it.

To make outputs shorter:

Provide concise, focused responses. Skip non-essential context, and keep examples minimal.

To make outputs longer / more thorough:

Provide a complete answer with relevant context. If there are multiple aspects to the
question, address each one. Don't truncate explanations to be brief.

Tell the model what to do, not what not to do

This generalizes everywhere. Positive instructions are followed more reliably than negative ones.

Instead of	Try
"Do not use markdown"	"Write in flowing prose paragraphs with no bullet points or headings"
"Don’t be overly formal"	"Use a warm, conversational tone, like a teammate explaining over coffee"
"Avoid hallucinating data"	"Cite the specific knowledge-base article or tool result you used. If you don’t have a source, say so"

Channel-shaped output

CogniAgent already injects channel-aware steering for some surfaces: phone actors get TTS-friendly guidance, and Telegram actors get a MarkdownV2 nudge (see Telegram channel). For other channels, lean on the same pattern:

This actor replies in our embeddable Web Widget. Keep responses to 1–3 short
paragraphs. Markdown is supported (links, bold, italics). Long bulleted lists
look cluttered in the widget — use them only when comparing 3+ items.

Match the prompt style to the desired output style

If you write the prompt in dense markdown with deeply nested bullets, the output drifts that way. If you want flowing prose out, write the prompt in flowing prose.

Tool use, capabilities, and proactive behavior

CogniAgent actors get capabilities (workflow apps, knowledge bases, hand-off, etc.). The model’s decision to call a capability vs. just talk about it is steerable.

Be explicit when you want action

“Can you suggest some changes?” → the actor suggests. “Make these changes.” → the actor acts. For an actor whose job is to do things on the user’s behalf, set the bar in the system prompt:

<default_to_action>
When the user describes a clear intent, complete it using the available capabilities
rather than only describing what could be done. If intent is ambiguous, ask one
clarifying question and then act.
</default_to_action>

For an actor whose job is to advise (compliance, security, financial planning), bias the other way:

<do_not_act_before_instructions>
Do not invoke a capability that changes state (book, cancel, send, charge) unless
the user has explicitly confirmed they want that action taken. Default to providing
information and recommendations until you receive an unambiguous go-ahead.
</do_not_act_before_instructions>

Avoid over-aggressive language

Older prompts often used phrasing like “CRITICAL: You MUST always...”. With current models, that can cause overtriggering: calling tools or running checks when a simpler answer would do. Normal prompting language (“Use this capability when…”) works better.

Encourage parallel tool calls when independent

For research-style actors that may need to consult multiple sources:

<use_parallel_tool_calls>
When you need to call multiple capabilities and the calls don't depend on each
other's results, invoke them in parallel. For example, when looking up three
different KB articles, call all three searches at once. If a call's parameters
depend on a previous call's result, run them sequentially.
</use_parallel_tool_calls>

Thinking and reasoning

Claude’s latest models use adaptive thinking — they decide when and how much to deliberate. You don’t typically configure thinking from a flow prompt, but you can nudge it.

Ask for self-verification

For accuracy-critical actors (medical triage, legal research, code review):

Before you finalize your answer, verify it against the source material you consulted.
If you find a contradiction, revise your answer and note what changed. If you don't
have enough source material, ask for it instead of guessing.

Manual chain-of-thought when thinking is off

For lightweight actors running on Haiku 4.5 with no extended thinking:

For multi-step questions, work through the problem in <reasoning> tags first,
then give your final user-facing answer in <answer> tags. The reasoning is for
you; only the <answer> content is shown to the user.

The flow’s renderer can strip the <reasoning> block, leaving only the answer.
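
To make the stripping step concrete, here is a minimal Python sketch of what such post-processing could look like. It is not the renderer's actual implementation; `user_facing_text` is a hypothetical helper, and the fallback behavior when the tags are missing is an assumption of this sketch.

```python
import re

REASONING = re.compile(r"<reasoning>.*?</reasoning>", re.DOTALL)
ANSWER = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def user_facing_text(raw):
    """Return only the text that should be shown to the user."""
    match = ANSWER.search(raw)
    if match:
        return match.group(1).strip()
    # Assumed fallback: no <answer> tags, so drop any reasoning block
    # and show whatever remains.
    return REASONING.sub("", raw).strip()


raw_reply = (
    "<reasoning>The user bought 2 seats at $10 each, so 2 * 10 = 20.</reasoning>\n"
    "<answer>Your total is $20.</answer>"
)
print(user_facing_text(raw_reply))
```

Handling the missing-tag case matters because lightweight models occasionally skip the requested structure on very short replies.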

Don’t over-prompt reasoning

A short, general nudge (“think this through carefully”) often beats a prescriptive step-by-step plan. The model’s natural reasoning frequently exceeds what a human would script. Avoid filling the system prompt with a 12-step decision tree unless you genuinely need that exact tree.

When extended thinking is disabled, Claude Opus 4.5 is sensitive to the literal word “think.” If you see unexpected behavior, swap “think” for “consider,” “evaluate,” or “reason through.”

Agentic patterns

These apply most when you’re using Call AI Agent inside a workflow, or when an actor runs many turns autonomously.

State tracking and incremental progress

For long-running agentic tasks, ask the actor to keep structured state and emphasize incremental progress over heroic one-shot attempts:

This is a multi-step task. Track your progress in `progress.json` with this shape:

{
  "completed": ["step-1", "step-2"],
  "in_progress": "step-3",
  "blocked": [],
  "next": "step-4"
}

After each step, update `progress.json` before moving on. If you get stuck, write
the obstacle to `blocked` and continue with the next unblocked step rather than
spinning on the current one.
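
The state transitions described in that prompt can be sketched in Python as follows. The function names are hypothetical, and the shape of a `blocked` entry (a step plus its obstacle) is an assumption, since the example above only shows an empty list.

```python
import json


def advance(state, following_step=None):
    """Return a new state with the current step marked as completed.

    `following_step` is whatever comes after the current `next` step,
    or None when the plan ends there.
    """
    return {
        "completed": state["completed"] + [state["in_progress"]],
        "in_progress": state["next"],
        "blocked": list(state["blocked"]),
        "next": following_step,
    }


def block_current(state, obstacle, following_step=None):
    """Record an obstacle on the current step and move to the next one."""
    # Assumed entry shape: {"step": ..., "obstacle": ...}
    entry = {"step": state["in_progress"], "obstacle": obstacle}
    return {
        "completed": list(state["completed"]),
        "in_progress": state["next"],
        "blocked": state["blocked"] + [entry],
        "next": following_step,
    }


state = {
    "completed": ["step-1", "step-2"],
    "in_progress": "step-3",
    "blocked": [],
    "next": "step-4",
}
state = advance(state, following_step="step-5")
print(json.dumps(state, indent=2))
```

Both helpers return a new dictionary rather than mutating the old one, which keeps earlier snapshots available for debugging.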

Balance autonomy and safety

By default, capable agentic models may take irreversible actions (delete, force-push, send, charge) without confirmation. For agents that touch shared systems:

Consider the reversibility and blast radius of each action. Local, reversible
actions (drafts, reads, tests) you may take freely. For destructive, irreversible,
or externally-visible actions — sending messages, charging cards, deleting records,
modifying shared infrastructure — confirm with the user before proceeding.

Reduce overengineering

When a coding-style actor over-elaborates (adds files, abstractions, validators no one asked for):

Keep solutions minimal and focused on what was asked.

- Scope: Don't add features, refactor, or "improve" code that isn't part of the task.
- Documentation: Don't add comments to code you didn't change.
- Defensive coding: Don't add validation for scenarios that can't happen.
  Trust internal callers; only validate at system boundaries.
- Abstractions: Don't create helpers for one-time operations. Don't design for
  hypothetical future requirements.

Minimize hallucinations

For knowledge-base-backed actors that occasionally invent details:

<investigate_before_answering>
Never speculate about facts you haven't looked up. If the user references a specific
policy, document, or customer record, search for it first using the available
capabilities. If you can't find it, say so explicitly rather than guessing.
</investigate_before_answering>

Replacements for response prefilling

Older Claude models supported prefill — putting words into the assistant’s mouth to force a particular start. Claude 4.6+ no longer supports prefill on the last assistant turn. Use these alternatives instead:

Old prefill use	Modern replacement
Force JSON / YAML output	Use Structured Outputs with a schema, or ask for the structure and validate with a downstream Resolve Value node
Skip `"Here is the requested summary:"` preamble	Add: `Respond directly without preamble. Do not start with phrases like 'Here is...', 'Based on...', etc.`
Steer around unwanted refusals	Clear prompting in the user/system message — modern Claude refuses appropriately without prefill help
Continue a truncated response	Pass the previous response into a new user turn: `Your previous response was cut off. Continue from where you stopped: [previous_text]`
Inject mid-conversation reminders	Inject them as user-turn context or as tool results, not as fake assistant messages
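
Two of the replacements in the table lend themselves to a short illustration: validating requested JSON downstream, and building the continuation message for a truncated response. The Python sketch below is a stand-in for whatever validation your flow performs (for example in a Resolve Value node); both function names are hypothetical.

```python
import json


def parse_structured_reply(reply, required_keys):
    """Parse a model reply as JSON and check that the expected keys exist.

    Returns (data, None) on success or (None, error_message) on failure,
    so the caller can decide whether to retry.
    """
    try:
        data = json.loads(reply)
    except json.JSONDecodeError as exc:
        return None, f"not valid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return None, "expected a JSON object"
    missing = [key for key in required_keys if key not in data]
    if missing:
        return None, f"missing keys: {', '.join(missing)}"
    return data, None


def continuation_message(previous_text):
    """Build the user-turn text for resuming a truncated response."""
    return (
        "Your previous response was cut off. "
        f"Continue from where you stopped: {previous_text}"
    )


data, error = parse_structured_reply('{"triage": "billing_inquiry"}', ["triage"])
print(data, error)
```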

Chaining prompts

For complex tasks that span multiple decisions, splitting one big prompt into a chain of smaller prompts is often more reliable than asking one actor to do everything. In CogniAgent, you have two natural chaining patterns:

Multi-actor flows. Hand off between actors at well-defined boundaries. One actor triages, another resolves, a third confirms. Each actor gets a tightly scoped prompt instead of a single mega system prompt that does it all.
Multi-node workflows. Use Ask AI / Call AI Agent / Resolve Value nodes in sequence: generate a draft → critique it → refine. Each node’s output is structured input to the next.

The most common chain is self-correction: draft → review against criteria → refine. It is worth the latency cost for high-stakes outputs (customer-facing emails, generated SQL, anything legally consequential).
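
The draft, review, and refine chain can be sketched in a few lines of Python. This is only an illustration of the data flow between the three prompts: `call_model` is a hypothetical callable that takes a prompt string and returns the model's reply, standing in for an Ask AI or Call AI Agent node.

```python
def self_correct(task, criteria, call_model):
    """Run a draft -> review -> refine chain and return all three outputs."""
    draft = call_model(f"<task>\n{task}\n</task>\n\nWrite a first draft.")
    review = call_model(
        f"<draft>\n{draft}\n</draft>\n\n"
        f"<criteria>\n{criteria}\n</criteria>\n\n"
        "Review the draft against the criteria and list concrete problems."
    )
    final = call_model(
        f"<draft>\n{draft}\n</draft>\n\n"
        f"<review>\n{review}\n</review>\n\n"
        "Rewrite the draft so that every problem in the review is fixed."
    )
    return {"draft": draft, "review": review, "final": final}


def echo_model(prompt):
    """Stand-in model for local testing: reports the prompt length."""
    return f"(reply to a {len(prompt)}-character prompt)"


result = self_correct(
    "Write a refund confirmation email.",
    "Polite; under 100 words; states the refund amount.",
    echo_model,
)
print(result["final"])
```

Returning the intermediate draft and review alongside the final text makes it easier to see which stage introduced a problem.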

A re-usable actor prompt template

Use this as a starting skeleton when you’re configuring a new actor. Fill in the bracketed pieces.

<role>
You are [role: e.g. a senior customer success manager at Acme].
</role>

<context>
[Who the user typically is. What channel this actor lives on. What success looks like.]
[Any non-obvious constraints — language, jurisdiction, compliance, tone.]
</context>

<instructions>
1. [First thing to do — usually greet / acknowledge.]
2. [Core job — what this actor exists for.]
3. [When to use which capability or knowledge base.]
4. [When to hand off to another actor or to a human.]
5. [How to end the turn.]
</instructions>

<style>
- Tone: [warm / professional / playful / clinical]
- Length: [one paragraph / 1–3 short paragraphs / as long as needed]
- Format: [plain prose / markdown / structured fields]
- Do: [positive examples of style choices]
</style>

<examples>
  <example>
    <user_message>[realistic user input]</user_message>
    <response>[ideal actor response]</response>
  </example>
  <example>
    <user_message>[edge case]</user_message>
    <response>[ideal handling]</response>
  </example>
</examples>

<guardrails>
- Never [hard "no" — e.g. quote prices not in the official price sheet].
- If the user [trigger condition], [action — e.g. hand off to a human].
- For destructive or irreversible actions, confirm before acting.
</guardrails>

Common pitfalls (and what to do instead)

Pitfall	Symptom	Fix
Vague instructions	Inconsistent, drifty answers	Be specific. State the desired format, length, tone, and edge-case handling.
All negatives, no positives	Model finds creative ways to break the rule	Replace each “don’t X” with “do Y instead.”
Wall-of-text system prompt	Model ignores half of it; high latency	Split by purpose into XML-tagged sections. Cut anything not load-bearing.
Examples that all look the same	Model overfits and gives one-note answers	Diversify examples — different lengths, tones, edge cases.
Over-aggressive `CRITICAL: MUST`	Tool overtriggering, anxious behavior	Use calm, normal language. The model already takes instructions seriously.
Asking for action with “could you maybe”	Model suggests instead of doing	Use imperatives: “Do X.” “Make these changes.” “Send the email.”
Telling the model how to think step by step	Reasoning becomes brittle and shallow	Ask for thoroughness in general terms; trust the model’s own planning.
Long documents at the end of the prompt	Model misses key details	Put long content first; queries last.
No channel hint	Output looks fine in widget, broken on phone	Tell the actor which channel it’s on so it can shape output accordingly.

Iterating on a prompt

Treat prompts as software. The fastest improvement loop:

1. Pick 5–10 real conversations. Include the awkward ones: ambiguity, edge cases, the customer who never finishes a sentence.
2. Run them in test mode. Use Test a flow with the current prompt. Capture outputs.
3. Score each output. Tone, accuracy, format, action taken. Note the worst failure.
4. Change one thing. Add one example, tweak one instruction, clarify one ambiguity. Not five things at once.
5. Re-run and compare. If the worst failure improved without regressing the others, keep the change. If something else got worse, revert and try a different lever.

Save the test conversations in a file (or as a Conversation Flow snapshot) so you can re-run them after any prompt change. Prompts that look better often regress on cases you forgot about.
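
The keep-or-revert decision in the loop above can be written down as a small check. The Python sketch below assumes you have scored each saved conversation by hand on a numeric scale where higher is better; the function name and the score format are assumptions of this sketch.

```python
def compare_runs(baseline, candidate):
    """Compare per-conversation scores from two prompt versions.

    Both arguments map a conversation id to a numeric score. The change is
    kept only if the worst baseline conversation improved and nothing regressed.
    """
    improved = sorted(cid for cid in baseline if candidate[cid] > baseline[cid])
    regressed = sorted(cid for cid in baseline if candidate[cid] < baseline[cid])
    worst = min(baseline, key=baseline.get)
    keep_change = worst in improved and not regressed
    return {
        "improved": improved,
        "regressed": regressed,
        "worst": worst,
        "keep_change": keep_change,
    }


baseline_scores = {"conv-1": 4, "conv-2": 1, "conv-3": 3}
candidate_scores = {"conv-1": 4, "conv-2": 3, "conv-3": 3}
print(compare_runs(baseline_scores, candidate_scores))
```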

See also

Configure an actor

Where the system prompt lives in the actor configuration UI.

Capabilities

Give actors workflow apps, KB search, and hand-off tools.

Ask AI node

One-shot prompting inside a workflow.

Anthropic's full guide

The source material this guide is condensed from.
