Microsoft 365 Copilot data security and privacy

"What does Copilot do with our data?" is the most common question from leadership and security teams when Microsoft 365 Copilot is on the table. The honest answer requires understanding the architecture, not just Microsoft's marketing.

The data flow, simplified

When a user asks Copilot something:

The prompt is sent from the Office app to the Microsoft 365 Copilot service in your tenant region.
The service queries the Microsoft Graph for relevant content the user has access to — files, emails, chats, meetings — using semantic retrieval.
Relevant chunks are added to the prompt as grounding context.
The augmented prompt is sent to a Microsoft-hosted LLM (currently a mix of Microsoft and OpenAI models).
The model generates a response.
The response, with citations to the grounded content, is sent back to the user.

The whole flow happens inside the Microsoft 365 service boundary. The model sees only the prompt and the grounding chunks retrieved for it, not the rest of the tenant's content.
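The steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Copilot's implementation: `Document`, `retrieve_for_user`, and `build_grounded_prompt` are hypothetical names, naive word overlap stands in for semantic retrieval, and the model call itself is left out. The point it shows is the ordering: content is trimmed to what the user can open before anything is ranked or added to the prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical stand-in for a file, email, or chat in the tenant."""
    doc_id: str
    text: str
    allowed_users: frozenset


def retrieve_for_user(user, prompt, corpus, top_k=3):
    """Return up to top_k documents the user may open, ranked by word overlap.

    Word overlap is a placeholder for semantic retrieval (an assumption).
    """
    prompt_words = set(prompt.lower().split())
    # Security trimming comes first: unreadable documents are never scored.
    visible = [d for d in corpus if user in d.allowed_users]
    scored = [(len(prompt_words & set(d.text.lower().split())), d) for d in visible]
    relevant = [pair for pair in scored if pair[0] > 0]
    ranked = sorted(relevant, key=lambda pair: pair[0], reverse=True)
    return [d for _, d in ranked[:top_k]]


def build_grounded_prompt(prompt, grounding):
    """Prepend the retrieved chunks, tagged by id so a response can cite them."""
    context = "\n".join(f"[{d.doc_id}] {d.text}" for d in grounding)
    return f"Context:\n{context}\n\nQuestion: {prompt}"


if __name__ == "__main__":
    corpus = [
        Document("budget", "Q3 budget forecast for marketing", frozenset({"alice"})),
        Document("salary", "salary budget review for executives", frozenset({"bob"})),
    ]
    question = "what is the budget forecast"
    hits = retrieve_for_user("alice", question, corpus)
    print(build_grounded_prompt(question, hits))
```

In this sketch the only text the model would receive is the string returned by `build_grounded_prompt`, which is why the quality of the trimming step matters more than the model.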

Where the data lives

Prompts and responses are stored in a hidden folder in the user's mailbox, as part of their Microsoft 365 substrate. They're searchable through eDiscovery, covered by Purview audit, and subject to retention policies.
Grounding content stays within the Microsoft 365 service boundary: it is retrieved and included in the prompt, but no copy is stored outside the tenant.
Foundation models are hosted in Azure, in the tenant's region, and are not fine-tuned on any tenant's data, so nothing learned from one tenant can surface in another.

What Microsoft commits to

Microsoft's contractual commitments for Microsoft 365 Copilot:

Customer data is not used to train foundation models.
Customer data stays within the Microsoft 365 service boundary.
Customer data follows existing Microsoft 365 data residency, retention, and compliance commitments.
EU Data Boundary applies for EU tenants — data processing stays in the EU.
Customer Lockbox applies to Copilot interactions for tenants on E5.

These commitments are stronger than those offered by most third-party AI assistants. They're the reason regulated industries are deploying Copilot at all.

What permissions and labels affect

Copilot strictly respects:

SharePoint and OneDrive permissions: anything the user can't open, Copilot won't surface.
Sensitivity labels: encrypted files the user doesn't have rights to are excluded.
DLP policies: items covered by a DLP policy scoped to Copilot can be excluded from grounding.
Information barriers: users in segmented groups don't surface each other's content.
Retention policies: content that has been deleted but is still held for retention is excluded.

In practice, overly broad permissions produce overly broad Copilot answers. The product is doing the right thing; the source data is the problem.
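A minimal sketch of how the checks listed above combine, assuming a simplified item model. `Item`, `is_eligible`, and every field name here are hypothetical and do not correspond to a Microsoft API; the real evaluation happens inside the service. The sketch is meant to show one thing: each control can only remove content, so an item that is shared too widely passes every check.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Item:
    """Hypothetical, simplified view of a piece of tenant content."""
    name: str
    readers: frozenset                       # users who can open the item
    encrypted: bool = False                  # a sensitivity label applies encryption
    rights_holders: frozenset = frozenset()  # users with rights on the encrypted item
    dlp_excluded: bool = False               # a DLP policy excludes the item
    segment: Optional[str] = None            # information-barrier segment, if any


def is_eligible(item, user, user_segment, blocked_pairs):
    """Return True only if every control allows the item to be used as grounding."""
    if user not in item.readers:
        return False  # permissions
    if item.encrypted and user not in item.rights_holders:
        return False  # sensitivity label with encryption
    if item.dlp_excluded:
        return False  # DLP
    if item.segment and user_segment:
        if frozenset({item.segment, user_segment}) in blocked_pairs:
            return False  # information barrier between the two segments
    return True


if __name__ == "__main__":
    blocked = {frozenset({"research", "trading"})}
    overshared = Item("salaries.xlsx", readers=frozenset({"alice", "bob"}))
    # bob was never meant to see this file, but the permission says otherwise.
    print(is_eligible(overshared, "bob", None, blocked))
```

None of the checks can tell that `salaries.xlsx` was shared with the wrong person. That gap is what oversharing remediation has to close before rollout.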

What's audited

Purview audit logs Copilot interactions in detail:

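References to the prompt and response; the text itself is stored in the user's mailbox and is retrieved through eDiscovery rather than read from the audit record.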
User and timestamp.
Files and content surfaced as grounding.
Apps and surfaces (Word, Teams, Outlook, etc.).

Compliance officers can search Copilot activity in eDiscovery the same way they search email or chat.
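For a quick view of adoption, an exported audit log can be summarized offline. The sketch below assumes each exported row has been loaded as a dictionary with `Operation`, `UserId`, and a JSON string in `AuditData`, and that Copilot rows carry a `CopilotEventData` object with an `AppHost` field. Treat those field names as assumptions and check them against your own export; `summarize_copilot_records` is a hypothetical helper.

```python
import json
from collections import Counter


def summarize_copilot_records(rows):
    """Count Copilot interactions per (user, app) from exported audit rows.

    The field names used here are assumptions about the export format.
    """
    counts = Counter()
    for row in rows:
        if row.get("Operation") != "CopilotInteraction":
            continue  # skip audit events that are not Copilot interactions
        data = json.loads(row.get("AuditData") or "{}")
        app = data.get("CopilotEventData", {}).get("AppHost", "unknown")
        counts[(row.get("UserId", "unknown"), app)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        {
            "Operation": "CopilotInteraction",
            "UserId": "alice@contoso.com",
            "AuditData": json.dumps({"CopilotEventData": {"AppHost": "Word"}}),
        },
        {"Operation": "FileAccessed", "UserId": "alice@contoso.com", "AuditData": "{}"},
    ]
    for (user, app), n in summarize_copilot_records(sample).items():
        print(user, app, n)
```

The same loop can be extended to list the resources referenced in each record, provided the export includes them.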

What still needs human judgement

Copilot is not a compliance shield. It will:

Surface oversharing that exists.
Generate content that may not be accurate.
Let content reach outbound channels when users copy, send, or share what it generates.

Sensitivity labels, DLP, and oversharing remediation remain your organization's responsibility. Microsoft provides the controls; the configuration is yours.