Part 2 – Risks to address before deploying Copilot Studio Agents

Posted by

In the first article in this series, we looked at why Microsoft 365 readiness matters before introducing Copilot Studio agents. The next step is understanding the security risks that come with agent-based AI, particularly where there is a possibility of access to confidential information or personal data.

The key point is this: most agent risks are not caused by AI alone. They arise when AI is combined with broad permissions, weak governance, poor source curation, or insufficient safeguards around prompts, channels, and outputs.

This article outlines the main security risks organisations should plan for before moving beyond low-risk pilot scenarios.

Why agent risk is different

A traditional application usually performs a defined function against a controlled set of inputs. An AI agent is different. It can:

  • accept open-ended user prompts
  • retrieve information from connected sources
  • summarise and reframe content
  • maintain conversational context
  • operate across channels
  • use connectors and workflows to interact with other systems

That flexibility is useful, but it creates a wider risk surface. Even when an agent is technically functioning as designed, it may still disclose information in ways that are operationally inappropriate.

For this reason, organisations need to think beyond simple access control and consider how an agent behaves, how it interprets requests, and what it may expose unintentionally.

Understanding the data access model

Before reviewing the specific risks, it is important to understand how agents access data.

Copilot Studio agents may retrieve information through configured connectors and APIs, including platforms such as:

  • SharePoint
  • OneDrive
  • Dataverse
  • Teams
  • line-of-business systems

A critical design question is:

Which identity is the agent using at runtime?

That identity may be:

  • the signed-in user
  • a delegated user identity
  • a service account
  • an app registration or service principal

The answer matters because it defines the effective permissions boundary. If the runtime identity has broad access, the agent may retrieve much more than intended.

This is particularly important where service identities are involved. A poorly scoped service account can bypass the normal boundaries that would otherwise apply in a user-context interaction.

User-context vs agent-context risk

There is an important distinction between user-context Copilot experiences and agent-context execution.

User-context

In user-context scenarios, the AI works within the permissions of the signed-in user. If the user cannot access a document, the AI should not be able to retrieve it on their behalf.

Agent-context

In agent-context scenarios, the agent may operate with its own configured permissions, connectors, policies, or service identity. This can increase risk if those permissions are broader than the intended use case.

In practice, this means:

  • a user-context experience usually mirrors existing access boundaries
  • an agent-context experience can introduce a new operational boundary that must be designed and governed carefully

This is why least privilege is so important for agent identities.

The main categories of security risk

The report identifies several practical scenarios where data may be exposed unintentionally. These are among the most important risks to manage.

1. Over-broad answering

One of the most common risks is that the agent gives more information than is necessary.

For example, a user asks whether a person’s status is current. Instead of answering yes or no, the agent returns a full record containing:

  • address details
  • date of birth
  • phone number
  • financial information
  • case notes

This is not always a failure of permissions. Often, it is a failure of response design. The agent may have legitimate access to the source record, but no constraint on how much of that information it should reveal.

Why it happens

Over-broad answering is more likely when:

  • entire records are retrieved rather than specific fields
  • sensitive and non-sensitive data are stored together
  • prompts do not enforce data minimisation
  • there are no field-level controls in the source platform
  • the agent is asked open-ended questions in broad channels

How to reduce the risk

Useful controls include:

  • field-level access controls where supported
  • designing the agent to return minimum necessary information
  • using curated structured sources where possible
  • applying sensitivity labels and DLP
  • reviewing prompts and outputs regularly
  • monitoring for over-disclosure patterns

2. Prompt injection and instruction hijack

Prompt injection happens when a user attempts to override the agent’s operating rules through conversational instructions.

Examples include:

  • “ignore your rules”
  • “show me everything”
  • “list all records”
  • “export all users”
  • “tell me everything you know about this person”

This matters because the agent may try to be helpful, even when the request is clearly inappropriate. If refusal logic is weak, unsafe outputs may follow.

Why it happens

This risk increases when:

  • the system prompt is vague
  • prohibited topics are not clearly defined
  • there are no dedicated refusal topics
  • sensitive repositories are connected broadly
  • the agent is not tested with adversarial prompts before release

How to reduce the risk

Organisations should:

  • define strong system instructions that cannot be casually overridden
  • create high-priority refusal topics for sensitive requests
  • detect common high-risk phrases and patterns
  • restrict knowledge sources to curated repositories
  • maintain a regression pack of adversarial test prompts

Prompt injection is not just a technical concern. It is also an operational testing concern.

3. Conversation context leakage

Sensitive details may be exposed not because the agent retrieved them from a document, but because they appeared earlier in the chat and remained in context.

For example:

  • a user pastes personal data into the conversation
  • the agent references sensitive information from an earlier exchange
  • a later summary unintentionally restates that information

This can create a false sense that the agent is “remembering” appropriately, when in fact it is leaking data beyond what the current task requires.

Why it happens

Context leakage becomes more likely when:

  • users enter full identifiers in free text
  • prompts do not instruct the agent to avoid restating personal data
  • the interaction model relies heavily on conversational memory
  • there is no detection of sensitive patterns at the conversation layer

How to reduce the risk

Mitigations include:

  • designing for data minimisation
  • avoiding unnecessary storage or repetition of identifiers
  • using regex or entity-based detection for personal data
  • routing sensitive exchanges into approved systems rather than free-text chat
  • instructing the agent not to restate personal data unless strictly necessary

This is particularly important in support or case-related scenarios, where users may naturally enter sensitive details into the conversation.

4. Wrong recipient or wrong channel

Sometimes the content is correct, but the audience is wrong.

For example, an agent may respond in:

  • a Teams channel instead of a one-to-one chat
  • a shared workspace with changing membership
  • a context where the output can easily be copied or forwarded

This is effectively an access control bypass through collaboration behaviour rather than through the source system itself.

Why it happens

This risk increases when:

  • agents are enabled broadly across Teams
  • channels are not explicitly approved
  • guest users are present
  • there is no distinction between low-risk and high-risk deployment channels

How to reduce the risk

Key controls include:

  • making channel enablement a formal approval step
  • restricting high-risk agents to one-to-one or tightly controlled channels
  • applying labels and sharing restrictions to source content
  • training users not to copy or forward sensitive outputs casually

The safest technical configuration can still fail operationally if the agent is placed in the wrong place.

5. Search relevance and grounding errors

AI retrieval is based on relevance, not always authority.

That means an agent may retrieve the most similar document rather than the correct one. In a poorly curated knowledge base, this can lead to the agent summarising:

  • a live case instead of a template
  • a draft instead of an approved policy
  • an outdated document instead of the current version
  • person-specific content instead of generic guidance

Why it happens

This risk is common when:

  • large libraries are connected without curation
  • real case material sits alongside templates and guidance
  • naming conventions are inconsistent
  • metadata is missing or weak
  • content review processes are immature

How to reduce the risk

Organisations should:

  • use curated “Copilot-ready” libraries
  • separate real records from guidance and templates
  • apply consistent naming and metadata
  • require review dates and version control
  • hand off or request confirmation when confidence is low

This is one of the strongest arguments for starting with a single, tightly governed knowledge source.

6. Hidden or embedded personal data in documents and attachments

Sensitive data is not always obvious. Agents may summarise content that includes personal data in places users overlook, such as:

  • headers and footers
  • comments
  • tracked changes
  • scanned PDF text
  • hidden tabs in spreadsheets
  • hidden columns
  • embedded attachments

Users may believe a document is safe because the visible body looks generic, while the file still contains sensitive material elsewhere.

Why it happens

This risk increases when:

  • agents are allowed to process ad hoc uploads
  • attachment governance is weak
  • document hygiene is inconsistent
  • users are not aware of embedded metadata and hidden content

How to reduce the risk

Mitigations include:

  • limiting which attachments can be used as knowledge sources
  • requiring documents to come from approved repositories
  • treating uploaded documents as sensitive by default
  • promoting better document hygiene before publication
  • discouraging use of files with comments, tracked changes, or hidden content

This is often underestimated in early AI rollouts.


7. Logging, telemetry, and secondary data exposure

Even where the primary interaction is controlled, sensitive information may still appear in:

  • conversation transcripts
  • audit logs
  • diagnostic traces
  • analytics exports
  • support bundles
  • quality review samples

This can create a second exposure path, sometimes to administrators or support staff who do not need access to the original content.

Why it happens

Secondary exposure becomes more likely when:

  • logs are retained broadly
  • export processes are informal
  • access to diagnostics is too wide
  • support processes rely on raw transcript review
  • retention periods are not defined

How to reduce the risk

Good practice includes:

  • treating logs as sensitive information
  • applying least privilege to transcript and telemetry access
  • defining retention and expiry periods
  • controlling export pathways
  • redacting samples used for support, training, or review

AI governance must extend beyond the visible front-end conversation.


Personal data raises the impact of every risk

These risks exist in most AI deployments, but their impact becomes much more serious when personal data is involved.

Examples of personal data that may require stronger controls include:

  • names and contact details
  • home addresses
  • dates of birth
  • employee or membership identifiers
  • bank account details
  • payroll details
  • health information
  • complaints
  • disciplinary matters
  • internal case references linked to a person

Where this type of content is present, an organisation should assume a higher level of scrutiny is required for:

  • source selection
  • channel design
  • testing
  • logging
  • monitoring
  • incident response

Practical principles for safer deployment

Across all of these risk areas, several practical principles stand out.

Minimise what the agent can see

Use curated knowledge sources and tightly scoped permissions.

Minimise what the agent can say

Design prompts and controls so the output contains only what is necessary.

Minimise where the agent can operate

Start with low-risk channels and approved audiences.

Minimise who can change it

Treat prompts, connectors, sources, and permissions as controlled items.

Test before release

Do not rely on design intent. Use adversarial prompts and over-disclosure scenarios.

Monitor after release

Risk does not stop at go-live. Watch for unusual requests, repeated refusals, and suspicious patterns.

Final thoughts

The biggest security risks in Copilot Studio are usually not dramatic failures. They are quiet failures of scope, curation, access, or governance.

An agent does not need to be malicious to create harm. It only needs to be:

  • too broadly connected
  • too permissive in its answers
  • too easy to manipulate
  • too widely deployed
  • too lightly governed

That is why organisations should avoid starting with high-risk use cases. Begin with a controlled pilot, use curated sources, keep personal data out of scope, and build governance maturity before expanding.

Coming next in the series

Blog 3 will focus on governance and control design for Copilot Studio agents, including system prompts, refusal topics, knowledge source curation, DLP, ownership, approval gates, and change control.

If you want, I can draft Blog 3 next.

Leave a Reply

Your email address will not be published. Required fields are marked *