Part 2 - Risks to address before deploying Copilot Studio agents

In the first article in this series, we looked at why Microsoft 365 readiness matters before introducing Copilot Studio agents. The next step is understanding the security risks that come with agent-based AI, particularly where there is a possibility of access to confidential information or personal data.

The key point is this: most agent risks are not caused by AI alone. They arise when AI is combined with broad permissions, weak governance, poor source curation, or insufficient safeguards around prompts, channels, and outputs.

This article outlines the main security risks organisations should plan for before moving beyond low-risk pilot scenarios.

Why agent risk is different

A traditional application usually performs a defined function against a controlled set of inputs. An AI agent is different. It can:

accept open-ended user prompts
retrieve information from connected sources
summarise and reframe content
maintain conversational context
operate across channels
use connectors and workflows to interact with other systems

That flexibility is useful, but it creates a wider risk surface. Even when an agent is technically functioning as designed, it may still disclose information in ways that are operationally inappropriate.

For this reason, organisations need to think beyond simple access control and consider how an agent behaves, how it interprets requests, and what it may expose unintentionally.

Understanding the data access model

Before reviewing the specific risks, it is important to understand how agents access data.

Copilot Studio agents may retrieve information through configured connectors and APIs, including platforms such as:

SharePoint
OneDrive
Dataverse
Teams
line-of-business systems

A critical design question is:

Which identity is the agent using at runtime?

That identity may be:

the signed-in user
a delegated user identity
a service account
an app registration or service principal

The answer matters because it defines the effective permissions boundary. If the runtime identity has broad access, the agent may retrieve much more than intended.

This is particularly important where service identities are involved. A poorly scoped service account can bypass the normal boundaries that would otherwise apply in a user-context interaction.

User-context vs agent-context risk

There is an important distinction between user-context Copilot experiences and agent-context execution.

User-context

In user-context scenarios, the AI works within the permissions of the signed-in user. If the user cannot access a document, the AI should not be able to retrieve it on their behalf.

Agent-context

In agent-context scenarios, the agent may operate with its own configured permissions, connectors, policies, or service identity. This can increase risk if those permissions are broader than the intended use case.

In practice, this means:

a user-context experience usually mirrors existing access boundaries
an agent-context experience can introduce a new operational boundary that must be designed and governed carefully

This is why least privilege is so important for agent identities.

The main categories of security risk

The report identifies several practical scenarios where data may be exposed unintentionally. These are among the most important risks to manage.

1. Over-broad answering

One of the most common risks is that the agent gives more information than is necessary.

For example, a user asks whether a person’s status is current. Instead of answering yes or no, the agent returns a full record containing:

address details
date of birth
phone number
financial information
case notes

This is not always a failure of permissions. Often, it is a failure of response design. The agent may have legitimate access to the source record, but no constraint on how much of that information it should reveal.

Why it happens

Over-broad answering is more likely when:

entire records are retrieved rather than specific fields
sensitive and non-sensitive data are stored together
prompts do not enforce data minimisation
there are no field-level controls in the source platform
the agent is asked open-ended questions in broad channels

How to reduce the risk

Useful controls include:

field-level access controls where supported
designing the agent to return minimum necessary information
using curated structured sources where possible
applying sensitivity labels and DLP
reviewing prompts and outputs regularly
monitoring for over-disclosure patterns

2. Prompt injection and instruction hijack

Prompt injection happens when a user attempts to override the agent’s operating rules through conversational instructions.

Examples include:

“ignore your rules”
“show me everything”
“list all records”
“export all users”
“tell me everything you know about this person”

This matters because the agent may try to be helpful, even when the request is clearly inappropriate. If refusal logic is weak, unsafe outputs may follow.

Why it happens

This risk increases when:

the system prompt is vague
prohibited topics are not clearly defined
there are no dedicated refusal topics
sensitive repositories are connected broadly
the agent is not tested with adversarial prompts before release

How to reduce the risk

Organisations should:

define strong system instructions that cannot be casually overridden
create high-priority refusal topics for sensitive requests
detect common high-risk phrases and patterns
restrict knowledge sources to curated repositories
maintain a regression pack of adversarial test prompts

Prompt injection is not just a technical concern. It is also an operational testing concern.

3. Conversation context leakage

Sensitive details may be exposed not because the agent retrieved them from a document, but because they appeared earlier in the chat and remained in context.

For example:

a user pastes personal data into the conversation
the agent references sensitive information from an earlier exchange
a later summary unintentionally restates that information

This can create a false sense that the agent is “remembering” appropriately, when in fact it is leaking data beyond what the current task requires.

Why it happens

Context leakage becomes more likely when:

users enter full identifiers in free text
prompts do not instruct the agent to avoid restating personal data
the interaction model relies heavily on conversational memory
there is no detection of sensitive patterns at the conversation layer

How to reduce the risk

Mitigations include:

designing for data minimisation
avoiding unnecessary storage or repetition of identifiers
using regex or entity-based detection for personal data
routing sensitive exchanges into approved systems rather than free-text chat
instructing the agent not to restate personal data unless strictly necessary

This is particularly important in support or case-related scenarios, where users may naturally enter sensitive details into the conversation.

4. Wrong recipient or wrong channel

Sometimes the content is correct, but the audience is wrong.

For example, an agent may respond in:

a Teams channel instead of a one-to-one chat
a shared workspace with changing membership
a context where the output can easily be copied or forwarded

This is effectively an access control bypass through collaboration behaviour rather than through the source system itself.

Why it happens

This risk increases when:

agents are enabled broadly across Teams
channels are not explicitly approved
guest users are present
there is no distinction between low-risk and high-risk deployment channels

How to reduce the risk

Key controls include:

making channel enablement a formal approval step
restricting high-risk agents to one-to-one or tightly controlled channels
applying labels and sharing restrictions to source content
training users not to copy or forward sensitive outputs casually

The safest technical configuration can still fail operationally if the agent is placed in the wrong place.

5. Search relevance and grounding errors

AI retrieval is based on relevance, not always authority.

That means an agent may retrieve the most similar document rather than the correct one. In a poorly curated knowledge base, this can lead to the agent summarising:

a live case instead of a template
a draft instead of an approved policy
an outdated document instead of the current version
person-specific content instead of generic guidance

Why it happens

This risk is common when:

large libraries are connected without curation
real case material sits alongside templates and guidance
naming conventions are inconsistent
metadata is missing or weak
content review processes are immature

How to reduce the risk

Organisations should:

use curated “Copilot-ready” libraries
separate real records from guidance and templates
apply consistent naming and metadata
require review dates and version control
hand off or request confirmation when confidence is low

This is one of the strongest arguments for starting with a single, tightly governed knowledge source.

6. Hidden or embedded personal data in documents and attachments

Sensitive data is not always obvious. Agents may summarise content that includes personal data in places users overlook, such as:

headers and footers
comments
tracked changes
scanned PDF text
hidden tabs in spreadsheets
hidden columns
embedded attachments

Users may believe a document is safe because the visible body looks generic, while the file still contains sensitive material elsewhere.

Why it happens

This risk increases when:

agents are allowed to process ad hoc uploads
attachment governance is weak
document hygiene is inconsistent
users are not aware of embedded metadata and hidden content

How to reduce the risk

Mitigations include:

limiting which attachments can be used as knowledge sources
requiring documents to come from approved repositories
treating uploaded documents as sensitive by default
promoting better document hygiene before publication
discouraging use of files with comments, tracked changes, or hidden content

This is often underestimated in early AI rollouts.

7. Logging, telemetry, and secondary data exposure

Even where the primary interaction is controlled, sensitive information may still appear in:

conversation transcripts
audit logs
diagnostic traces
analytics exports
support bundles
quality review samples

This can create a second exposure path, sometimes to administrators or support staff who do not need access to the original content.

Why it happens

Secondary exposure becomes more likely when:

logs are retained broadly
export processes are informal
access to diagnostics is too wide
support processes rely on raw transcript review
retention periods are not defined

How to reduce the risk

Good practice includes:

treating logs as sensitive information
applying least privilege to transcript and telemetry access
defining retention and expiry periods
controlling export pathways
redacting samples used for support, training, or review

AI governance must extend beyond the visible front-end conversation.

Personal data raises the impact of every risk

These risks exist in most AI deployments, but their impact becomes much more serious when personal data is involved.

Examples of personal data that may require stronger controls include:

names and contact details
home addresses
dates of birth
employee or membership identifiers
bank account details
payroll details
health information
complaints
disciplinary matters
internal case references linked to a person

Where this type of content is present, an organisation should assume a higher level of scrutiny is required for:

source selection
channel design
testing
logging
monitoring
incident response

Practical principles for safer deployment

Across all of these risk areas, several practical principles stand out.

Minimise what the agent can see

Use curated knowledge sources and tightly scoped permissions.

Minimise what the agent can say

Design prompts and controls so the output contains only what is necessary.

Minimise where the agent can operate

Start with low-risk channels and approved audiences.

Minimise who can change it

Treat prompts, connectors, sources, and permissions as controlled items.

Test before release

Do not rely on design intent. Use adversarial prompts and over-disclosure scenarios.

Monitor after release

Risk does not stop at go-live. Watch for unusual requests, repeated refusals, and suspicious patterns.

Final thoughts

The biggest security risks in Copilot Studio are usually not dramatic failures. They are quiet failures of scope, curation, access, or governance.

An agent does not need to be malicious to create harm. It only needs to be:

too broadly connected
too permissive in its answers
too easy to manipulate
too widely deployed
too lightly governed

That is why organisations should avoid starting with high-risk use cases. Begin with a controlled pilot, use curated sources, keep personal data out of scope, and build governance maturity before expanding.

Coming next in the series

Blog 3 will focus on governance and control design for Copilot Studio agents, including system prompts, refusal topics, knowledge source curation, DLP, ownership, approval gates, and change control.

Part 2 – Risks to address before deploying Copilot Studio agents

Why agent risk is different

Understanding the data access model

User-context vs agent-context risk

User-context

Agent-context

The main categories of security risk

1. Over-broad answering

Why it happens

How to reduce the risk

2. Prompt injection and instruction hijack

Why it happens

How to reduce the risk

3. Conversation context leakage

Why it happens

How to reduce the risk

4. Wrong recipient or wrong channel

Why it happens

How to reduce the risk

5. Search relevance and grounding errors

Why it happens

How to reduce the risk

6. Hidden or embedded personal data in documents and attachments

Why it happens

How to reduce the risk

7. Logging, telemetry, and secondary data exposure

Why it happens

How to reduce the risk

Personal data raises the impact of every risk

Practical principles for safer deployment

Minimise what the agent can see

Minimise what the agent can say

Minimise where the agent can operate

Minimise who can change it

Test before release

Monitor after release

Final thoughts

Coming next in the series

Leave a Reply Cancel reply