Microsoft Copilot Bypassed Confidentiality Controls Twice in Eight Months
Microsoft's Copilot AI assistant has twice circumvented customer organizations' own security controls designed to prevent access to sensitive information, raising significant questions about the reliability of enterprise AI systems and data loss prevention infrastructure.
The most recent incident ran for four weeks beginning January 21, during which Copilot read and summarized confidential emails despite sensitivity labels and data loss prevention (DLP) policies that explicitly prohibited such access. The enforcement mechanisms within Microsoft's own pipeline broke down completely, and, critically, no security tool in the DLP stack flagged the violation.
Among the affected organizations was the United Kingdom's National Health Service, which logged the breach as incident INC46740412. The involvement of a government healthcare system underscores the stakes when enterprise security controls fail silently.
The Architecture Problem
What distinguishes these incidents from typical security vulnerabilities is their systemic nature. DLP solutions operate across multiple enforcement points—at the network layer, the application layer, and within individual services. When Copilot bypassed sensitivity labels, it did not merely exploit a single weakness. Instead, it appears to have operated outside or around the expected control architecture entirely.
The fact that "no security tool in the stack flagged it" suggests the violations occurred in a manner orthogonal to how organizations believed their defenses were structured. This is particularly problematic in enterprise environments, where security teams operate under the assumption that multiple layered controls provide defense in depth. If an AI system can access data in ways that circumvent these layered defenses, the entire security model becomes questionable.
The Broader Implications
This incident reflects a fundamental tension in modern enterprise AI deployment. Large language models like Copilot are designed to be helpful, contextually aware, and responsive to user requests. Sensitivity labels and DLP policies, by contrast, are meant to create hard boundaries around data access. The integration of these two systems has proven technically fragile.
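A hard boundary of the kind sensitivity labels are meant to provide would have to be enforced before content ever reaches the model, rather than relying on the model to decline. The sketch below is a minimal illustration of that idea, with an invented ALLOWED_LABELS policy and invented types; it is not how Copilot or Microsoft Purview actually enforce labels.

```python
# Illustrative sketch only: a hypothetical retrieval gate, not Microsoft's implementation.
from dataclasses import dataclass

# Labels the assistant may include in a prompt, per a hypothetical policy.
ALLOWED_LABELS = {"Public", "General"}


@dataclass
class Email:
    subject: str
    sensitivity_label: str
    body: str


def build_prompt_context(emails: list[Email]) -> str:
    """Builds the context an assistant would summarize.

    The hard boundary lives here: labeled content is filtered out before it
    reaches the model, instead of trusting the model to refuse.
    """
    permitted = [e for e in emails if e.sensitivity_label in ALLOWED_LABELS]
    return "\n\n".join(f"Subject: {e.subject}\n{e.body}" for e in permitted)


if __name__ == "__main__":
    inbox = [
        Email("Team lunch", "Public", "Friday at noon."),
        Email("Merger terms", "Highly Confidential", "Do not forward."),
    ]
    # Only the public email makes it into the summarization context.
    print(build_prompt_context(inbox))
```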
Microsoft is not the first organization to discover that AI systems and traditional security controls operate under different assumptions. However, the scale and duration of this particular failure, four weeks across multiple organizations, suggest this was not a minor edge case but a systematic breakdown.
The company has not provided a detailed technical explanation of how the bypasses occurred, which limits the ability of security professionals across the industry to assess their own risk exposure. Given that many enterprises run similar Microsoft 365 configurations, the incident likely affects a substantial portion of Fortune 500 companies and government agencies.

