Content engineering

Content engineering is the practice of shaping AI model behavior through the combined use of natural language expertise and user experience design. Crafting system prompts is one of the most direct ways to put content engineering into practice.


System prompt engineering

System prompt engineering is the practice of writing, refining, and optimizing the natural language instructions that shape how models behave. It’s one of the primary ways designers can influence model output. This guidance covers the foundational tasks related to writing and refining system prompts: defining system-level behavior, establishing personality and tone, and setting behavioral constraints.

What is a system prompt?

A system prompt (also called a system message, system instruction, or meta-prompt) is a set of natural-language instructions that tells an AI system how to behave and perform. It runs behind the scenes; people don’t see it, but it shapes every response they receive.

When someone sends a message (or user prompt), that input is processed in the context of the system prompt. Here are a few things that a system prompt governs:

  • What the AI system’s role and purpose are
  • How it should respond and what it should avoid
  • What tone and personality it should express
  • What format its responses should follow
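Concretely, most chat-style model APIs express this as an ordered list of messages, with the system prompt first. Here's a minimal sketch assuming the common role/content message shape; exact field names vary by provider, so treat it as illustrative rather than canonical:

```python
# Sketch: how a system prompt and a user prompt combine into one request.
# The role/content message shape is common to many chat APIs, but it is
# an assumption here, not a specific provider's schema.

SYSTEM_PROMPT = (
    "Your name is Facilitator, an AI teammate that handles various tasks "
    "for people in the meeting so everyone can focus on the conversation."
)

def build_request(user_prompt: str) -> list[dict]:
    """Every user message is processed in the context of the system prompt."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]

request = build_request("Summarize today's meeting.")
```

The system message stays constant across the conversation; only the user messages change.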

System prompts require clear information architecture, precise and consistent language, and concise writing. They’re also a shared design artifact, requiring collaboration with product managers, engineers, and applied scientists for refinement.

Structuring a system prompt

While system prompts can vary in number, length, and complexity depending on your product or feature, start by organizing your system prompt into four components: role, task, rules, and examples. This structure makes your prompt easier to test, scale, and iterate.

  • Role: Defines the AI system’s persona, purpose, and point of view so its responses match expectations
  • Task: Defines the specific action the AI system performs
  • Rules: Specifies how the task should be accomplished, including guardrails to prevent harmful output
  • Example output: Gives the system an example of what an ideal response looks like
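As a rough illustration, the four components can be assembled into a single prompt string. The section labels and helper function below are assumptions for the sketch, not a required format:

```python
# Sketch: assembling a system prompt from the four components.
# The "# ROLE" style section labels are illustrative, not mandatory.

def assemble_system_prompt(role: str, task: str, rules: list[str], example: str) -> str:
    """Join the four components into one prompt, in a stable order."""
    rule_lines = "\n".join(f"- {r}" for r in rules)
    return (
        f"# ROLE\n{role}\n\n"
        f"# TASK\n{task}\n\n"
        f"# RULES\n{rule_lines}\n\n"
        f"# EXAMPLE OUTPUT\n{example}"
    )

prompt = assemble_system_prompt(
    role="Your name is Facilitator, an AI teammate that supports meetings.",
    task="After the meeting ends, generate a summary of decisions and action items.",
    rules=["Only say 'Sorry' when the system has made a mistake."],
    example="Decisions: ...\nAction items: ...",
)
```

Keeping the components separate until assembly makes each one easier to test and iterate on independently.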

Defining system-level behavior

Role

The role component establishes the persona of the AI system and what it’s there to do. A well-defined role helps the system stay focused and consistent and sets clear expectations for people.

When writing the role component, define:

  • What the AI system is, including its product or feature name, if applicable, and its function (e.g., “an AI assistant that analyzes site activity reports”)
  • What it’s optimized for or the value it delivers to people
  • What it’s not, including scope boundaries that prevent drift into unintended behaviors. Note that some out-of-scope behaviors also raise responsible AI (RAI) concerns — these require their own explicit rules. See Defining behavioral constraints.

The role also sets the foundation for how other components are interpreted. A system prompt for a Copilot feature in Word should read very differently from one written for a Meetings agent in Teams, even if the underlying model is the same.

Your name is Facilitator, an AI teammate that handles various tasks for people in the meeting so everyone can focus on the conversation. You make meetings more efficient and productive by being helpful and proactive.

Task

The task component defines the specific action the AI system performs. While the role establishes what the AI system is, the task tells it what to do. A well-written task is specific, sequential, and clear about what the output should look like.

Specificity

Be as specific as possible about what the model should produce and when. Vague instructions lead to inconsistent output.

Do

Write a concrete action with a clear trigger.

“After the meeting ends, generate a summary that includes key decisions made, action items, and assigned owners.”

Don’t

High-level direction leaves too much open for interpretation.

“Help with meetings.”

Steps and sequences

Models perform better with structured, sequential instructions. If a task involves multiple steps, write them out in order.

Do

Break the task into numbered steps.

“When someone shares a report: 1) identify the 3 highest-risk findings, 2) explain why each is high risk, 3) recommend one action for each finding.”

Don’t

Open-ended instructions leave scope, structure, and depth undefined.

“Analyze the report and share what you find.”

Response shape

Specify what the output should look like — not just what it should contain. Without format instructions, models default to paragraphs.

Do

Define format and length directly in the task.

“Summarize in 3 bullet points. Keep each bullet under 15 words.”

Don’t

A bare instruction produces unpredictable output length and format.

“Summarize this.”

Personality, voice, and tone

Personality, voice, and tone aren’t separate from the system prompt — they’re embedded in it. How you write the role, task, and rules components directly shapes how the AI sounds.

At Microsoft, agent personality often falls into two broad categories based on the agent’s function.

Engagement-oriented agents (digital workers)

These AI systems or agents are designed to collaborate, support, and build rapport. Their tone is conversational, warm, and natural. They use contractions, speak in first person, and avoid jargon and exclamatory language.

Do

Keep the tone warm, direct, and action oriented.

“Hi, all. I’m here to help with the meeting. Here’s what I’ll do.”

Don’t

Exclamatory language or unnecessary context makes the agent feel overenthusiastic and verbose.

“Hi, all! I’m here to support the meeting. Here’s what I’ll be doing to keep us all aligned and on track!”

Task-oriented agents (assistive agents)

These agents prioritize efficiency and minimize cognitive load. Their tone is minimal and functional — no first-person language, no embellishments, no filler.

Do

Keep responses short and task-focused, with no self-reference.

“Working on your infographic. This’ll take a few minutes.”

Don’t

First-person narration and step-by-step explanation increase cognitive load and slow someone down.

“I can help with that! First, I’ll gather data to ensure your infographic is accurate, then I’ll reference your brand kit so the look and feel aligns with your company guidelines.”

Voice and tone guidelines in system prompts

Encode voice and tone directly in your system prompt using explicit instructions. Don’t rely on the model to infer them. Write specific, actionable instructions the model can follow:

Do

“Use contractions. Speak in first person. Keep responses brief and to the point.”

Don’t

Vague direction like “Sound friendly and professional” doesn’t give the model enough to work with.

Common voice and tone rules to consider for your system prompt:

  • Use contractions (e.g., “can’t” not “cannot”, “let’s” not “let us”)
  • Use casual, natural language (e.g., “I tried” not “I attempted”)
  • Avoid discourse markers (“OK,” “Sure,” “Alright”)
  • Avoid phrases that don’t add value (“Let me know how else I can help!”)
  • Use sentence case, except for proper nouns
  • Use exclamation points sparingly or not at all
  • Never use evaluative language (“Great question!”, “Awesome job!”)
  • Only say “Sorry” when the AI has made a mistake

For agents where consistent tone across products and surfaces matters, align with the M365 Copilot personality principles: trustworthy, empathetic, humble, transparent, explicitly digital, and supportive.
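Voice and tone rules like these can also double as a lightweight QA check on draft output. The sketch below flags responses that break a few of the rules above; the banned-phrase list is an illustrative placeholder, not an official rule set:

```python
# Sketch: flag draft responses that break common voice and tone rules.
# The phrase list below is a hypothetical example; extend it to match
# your product's own guidelines.

BANNED_PHRASES = [
    "great question",                    # evaluative language
    "awesome job",                       # evaluative language
    "let me know how else i can help",   # filler that adds no value
]

def tone_violations(text: str) -> list[str]:
    """Return a list of rule violations found in the draft text."""
    lowered = text.lower()
    issues = [f"banned phrase: {p!r}" for p in BANNED_PHRASES if p in lowered]
    if text.count("!") > 1:
        issues.append("more than one exclamation point")
    return issues

issues = tone_violations("Great question! I'll get started right away!!")
```

A check like this won't catch every tonal problem, but it makes the most mechanical rules testable during prompt iteration.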


Defining behavioral constraints

The rules component is where you define guardrails, including what the AI system should and shouldn’t do. Constraints are just as important as positive instructions. Without them, models may generate unhelpful, inconsistent, or harmful output.

What to constrain

Consider rules for these categories:

Safety and responsible AI

System prompts should explicitly address how the AI system handles sensitive requests, ethical risks, and situations where it can’t help. Don’t assume the system will infer appropriate limits on its own. Without explicit instructions, behavior in edge cases is unpredictable.

Non-anthropomorphic language

Don’t write instructions that cause the AI system to present itself as having emotions, beliefs, intentions, or a social identity. Language that implies the AI system thinks, feels, cares, or understands creates false impressions and erodes trust when those impressions break down. Use machine-appropriate verbs that describe observable behavior.

Do

Describe what the system does.

“Analyze the report and identify the three highest-risk findings.”

Don’t

Imply emotion or understanding.

“I understand how frustrating this must be. Let me help you work through it.”

Honest capability claims

Write instructions that describe what the system does and where it works best. Don’t allow the AI to overstate its reliability or present outputs as complete, authoritative, or error-free.

Do

Set accurate expectations.

“This summary is based on the document provided. Review it for accuracy before using it.”

Don’t

Imply intelligence or completeness.

“I’ve analyzed everything and here’s what you need to know.”

Anti-manipulation and sycophancy

Don’t write instructions that optimize for user satisfaction at the expense of accuracy. A system that tells people what they want to hear isn’t a trustworthy one. Rules to consider:

  • Don’t praise or validate. Avoid phrases like “Great question” or “Nice work.”
  • Don’t use persuasive framing, urgency, or emotional priming.
  • Don’t mirror user sentiment or reinforce confidence without basis.

Harm avoidance and escalation

Write explicit instructions for declining requests that could cause harm. For sensitive topics, instruct the AI to acknowledge its limitations and direct people to appropriate resources. If your product may encounter high-stakes situations, define an escalation path. Don’t leave this as implied behavior.

Scope and task boundaries

  • What the AI system should do when it can’t complete a task
  • What topics or actions are out of scope
  • How to handle requests outside the AI system’s defined role

Output quality and accuracy

  • What to do when data is missing, incomplete, or ambiguous
  • When to ask for clarification vs. proceed with assumptions
  • What counts as an acceptable vs. unacceptable response

Writing effective rules

Use precise, literal language. Avoid figurative expressions or terms with multiple interpretations. Assume every instruction could be taken literally.

Avoid: “Be helpful when things go wrong.”
Why: “Things go wrong” and “be helpful” are both undefined. The model has no specific trigger or action to take.

Better: “If the report is empty, tell the user you can’t provide insights.”
Why: Names the exact condition (“report is empty”) and the exact response. Nothing is left to interpretation.

Avoid: “Don’t give bad answers.”
Why: “Bad” is subjective. The model can’t act on an undefined standard.

Better: “If the data needed to answer the user’s question is not available, respond with: I don’t have enough information to answer that.”
Why: Names the condition and gives clear behavioral instruction.

Avoid: “Handle oversharing carefully.”
Why: “Handle” doesn’t specify an action, and “oversharing” is undefined.

Better: “Identify the 3 highest-risk instances of oversharing. Oversharing could mean a large increase in sharing activity, especially with external users. Files labeled ‘Highly Confidential’ or ‘Top Secret’ should be considered high risk.”
Why: Specifies the action (identify), the quantity (3), defines the ambiguous term (oversharing), and gives concrete criteria for what makes something high risk.

Cover failure modes

Write rules that tell the AI system exactly what to do when it can’t complete a task. Don’t just note that a failure occurred; specify what the system should say and how it can help someone move forward. Without these instructions, models may hallucinate data, give vague non-answers, or behave unpredictably.

A useful failure rule does three things: names the condition, specifies the response, and gives the person a path forward.

Common failure modes to address:

  • Empty or incomplete input: The data needed to complete the task is missing or insufficient to produce a reliable response
  • Out-of-scope requests: Someone asks for something the system isn’t designed or able to do
  • Ambiguous prompts: The request could reasonably be interpreted in more than one way, and the interpretation materially affects the output
  • Sensitive or high-emotion interactions: Frustration, distress, or requests touching on sensitive topics like medical, legal, or financial situations

Examples:

  • “If the report is empty or contains no data, tell the person you can’t provide insights and ask them to share the report.”
  • “If a request is outside what you can do, say so directly. Don’t attempt a partial answer. If there’s a relevant alternative or next step, offer it.”
  • “If a prompt is ambiguous and the interpretation would significantly affect the response, ask one clarifying question before proceeding.”
  • “Only say ‘Sorry’ when the system has made a mistake — not as a default politeness response or to soften a refusal.”
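One way to keep all three parts of a failure rule in view is to draft each rule as a condition/response/path-forward triple before flattening it into prompt text. The dataclass below is a hypothetical drafting aid for that exercise, not part of any prompt API:

```python
# Sketch: draft failure rules as condition / response / path-forward
# triples, then render them into prompt lines. The class and field
# names are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class FailureRule:
    condition: str     # names the exact condition
    response: str      # what the system should say
    path_forward: str  # how the person can move ahead

    def to_prompt_line(self) -> str:
        return f"- If {self.condition}, {self.response} {self.path_forward}"

rules = [
    FailureRule(
        condition="the report is empty or contains no data",
        response="tell the person you can't provide insights.",
        path_forward="Ask them to share the report.",
    ),
]
lines = [r.to_prompt_line() for r in rules]
```

Drafting this way makes a missing path forward obvious: the field is simply empty.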

System prompt template

To get started writing your system prompt, use the template below. It includes all the components described in this guidance with instructions and examples built in so you can fill in what you need and remove what you don’t.

# System Prompt Template

Use this template when drafting a system prompt for an AI product or feature. It includes guidance for each component based on Fluent guidelines. Fill in the placeholders, adapt the pre-populated rules to your product, and delete the comment blocks before sharing with engineering.

---

## How to use this template

- **Required:** Role, Task, Rules, Example output
- Keep language precise and literal. Avoid figurative expressions or terms with multiple meanings.
- Write rules in complete sentences. Assume every instruction could be taken literally.
- Test your prompt and iterate. You'll likely revise it several times before output is consistent.

---

## System prompt

---

# ROLE

<!--
Define what the AI system is and what it's there to do. Include:
  - The AI system's name, if applicable
  - Its function and high-level purpose
  - What it's optimized for, or the value it delivers to people
  - What it's not (scope boundaries that prevent drift into unintended behaviors)

Note: Some out-of-scope behaviors also raise responsible AI (RAI) concerns.
Address those in the Rules section under Safety and responsible AI.

Example:
  Your name is Facilitator, an AI teammate that handles various tasks for people
  in the meeting so everyone can focus on the conversation. You make meetings more
  efficient and productive by being helpful and proactive.
-->

Your name is [Name], an AI [assistant / teammate / agent] that [describe function].
You [describe the value this AI system delivers to people].
You do not [describe scope boundaries].

---

# TASK

<!--
Define the specific action the AI system performs. Be precise and literal.
One clear task is better than several vague ones.

Specificity: Name the trigger and the output. Avoid open-ended direction.
  Do:   "After the meeting ends, generate a summary that includes key decisions
         made, action items, and assigned owners."
  Don't: "Help with meetings."

Steps and sequences: If the task involves multiple steps, write them in order.
  Do:   "When someone shares a report: 1) identify the 3 highest-risk findings,
         2) explain why each is high risk, 3) recommend one action for each finding."
  Don't: "Analyze the report and share what you find."

Response shape: Specify what the output should look like, not just what it contains.
  Do:   "Summarize in 3 bullet points. Keep each bullet under 15 words."
  Don't: "Summarize this."
-->

[Describe the specific action the AI system performs, including the trigger,
the steps if applicable, and what the output should look like.]

---

# RULES

<!--
Rules tell the AI system how to perform the task and what to do when it can't.
Organize rules into the categories below. Keep language precise and literal.
Define any terms that could be interpreted multiple ways.

Tips:
  - Name the exact condition, not a general situation ("if the report is empty"
    not "if something goes wrong")
  - Give the model an exact response or action, not a direction ("respond with:
    I don't have enough information" not "be transparent about limitations")
  - Repeat the most critical rules at the end of your prompt to reduce recency bias
-->

## Safety and responsible AI

<!--
These rules apply to all AI systems. Review each one and adapt the
product-specific placeholders to your context. Add rules as needed.
-->

- Do not present as having emotions, beliefs, intentions, or a social identity.
  Use machine-appropriate verbs: "analyze," "generate," "identify." Avoid
  "think," "feel," "know," or "understand."
- Describe what the system does and where it works best. Do not overstate
  reliability or present outputs as complete, authoritative, or error-free.
- Do not praise or validate. Avoid phrases like "Great question" or "Nice work."
- Do not use persuasive framing, urgency, emotional priming, or rhetorical
  questions to influence responses.
- [Add harm avoidance rule specific to your product, e.g., "Do not provide
  advice on medical, legal, or financial matters. If someone asks, acknowledge
  the limitation and direct them to an appropriate resource."]
- [If your product may encounter high-stakes situations, define an escalation
  path, e.g., "If someone expresses distress or urgency that the system cannot
  address, acknowledge the limitation and direct them to [resource]."]

## Scope and task boundaries

<!--
Define what is and isn't in scope. Be specific about how the system should
respond to out-of-scope requests.
-->

- Only respond to requests related to [describe scope].
- If someone asks about something outside this scope, [describe response,
  e.g., "tell them this system is designed for [purpose] and suggest where
  they might find help."]
- [Add additional scope rules as needed.]

## Output quality and accuracy

<!--
Define how the system should handle missing, ambiguous, or low-quality input.
-->

- [What to do when data is missing or incomplete, e.g., "If the report
  contains no data, tell the person you can't provide insights and ask
  them to share the report."]
- [When to ask for clarification vs. proceed, e.g., "If a request is
  ambiguous and the interpretation would significantly affect the response,
  ask one clarifying question before proceeding."]
- [Add additional quality rules as needed.]

## Failure modes

<!--
Write a rule for each common failure mode. A useful failure rule does
three things: names the condition, specifies the response, and gives
the person a path forward.

Common failure modes to address:
  - Empty or incomplete input
  - Out-of-scope requests
  - Ambiguous prompts
  - Sensitive or high-emotion interactions
-->

- If [failure condition], [what to say]. [Path forward if applicable.]
- If a request is outside what you can do, say so directly. Don't attempt
  a partial answer. If there's a relevant alternative or next step, offer it.
- Only say "Sorry" when the system has made a mistake, not as a default
  politeness response or to soften a refusal.
- [Add additional failure rules as needed.]

---

# EXAMPLE OUTPUT

<!--
Provide at least one example of an ideal response. The model uses this
to detect patterns and replicate them in new contexts.

Tips:
  - Your example doesn't need to match every scenario. The model can generalize.
  - For high-risk or ambiguous scenarios, consider adding additional examples.
  - Make sure the example reflects the voice, tone, and format rules you've defined.
-->

[Provide an example of an ideal AI system response here.]