Three tiers of shadow AI in 2026: the browser tab, the in-SaaS toggle, the OAuth-scoped agent. IBM puts the breach delta at $670K, Article 4 enforcement starts 2 August 2026, and a register beats a ban.
Most articles about shadow AI start with the shadow IT analogy. That comparison was useful in 2023 and is misleading now, because three pieces of data from 2025 reframe the problem.
The first is from IBM's 2025 Cost of a Data Breach report. 20% of organisations in the study reported a breach where shadow AI was a contributing factor. The average breach cost when shadow AI was involved was $4.63 million, which is $670,000 above the global mean of $4.44 million. 97% of the organisations that reported an AI-related breach did not have basic AI access controls. 63% had no AI governance policy at all, or were still drafting one. The mean detection time for shadow-AI-involved breaches was 247 days against the 241-day global baseline. Six days is not a huge gap; the more interesting number is the cost delta.
The second is from the LayerX Enterprise AI and SaaS Data Security Report 2025. 45% of enterprise employees use generative AI tools at work. Of those users, 77% paste data into chatbot prompts. 67% of ChatGPT access in the enterprise happens through unmanaged personal accounts. 92% of all enterprise AI usage funnels through one tool, ChatGPT. SSO adoption for AI tools is, in LayerX's words, "effectively zero."
The third is from Deloitte's State of Generative AI in the Enterprise (Q4 2025). Worker access to AI tools rose by 50% in 2025 alone, while only one in five organisations has a governance model that survives audit.
Read together, those datasets say something specific. Shadow AI is not a fringe behaviour by power users. It is the dominant mode of enterprise AI consumption, and the gap between adoption and governance widened, not narrowed, through 2025. The discovery problem is the security problem.
Tier 1 is what most security teams already mean when they say shadow AI. An employee opens chatgpt.com on a personal account and pastes a customer email. A recruiter drops a CV into Claude.ai to draft a screening summary. A finance analyst uploads an unredacted spreadsheet. None of it touches the corporate identity provider. None of it lands in any audit log you own.
The discovery vector for tier 1 is network telemetry. If you run a CASB or a forward proxy, your DNS logs are the answer. The domains worth watching, as of April 2026:
- chat.openai.com, chatgpt.com, api.openai.com
- claude.ai, api.anthropic.com
- gemini.google.com, aistudio.google.com, generativelanguage.googleapis.com
- copilot.microsoft.com
- jasper.ai, writesonic.com, copy.ai, grammarly.com
- midjourney.com, stability.ai, runwayml.com, pika.art
- otter.ai, fireflies.ai, read.ai
- cursor.sh, codeium.com, tabnine.com, windsurf.com

The list is incomplete on purpose. New tools launch every week, and the only durable solution is a CASB feed that maintains the catalogue rather than a static block list you forget to update. What network telemetry will not catch is anyone using a personal device on a mobile network. For that population, tier 1 stays invisible.
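As a concrete sketch of what that sweep can look like: the snippet below flags DNS or proxy log lines that hit a domain watchlist. The log line format, the trimmed `AI_DOMAINS` set, and the function names are illustrative assumptions, not a product; feed it whatever your CASB or forward proxy actually exports.

```python
# Sketch: flag DNS/proxy log lines that hit known AI tool domains.
# AI_DOMAINS is deliberately trimmed; in practice this set comes from
# a maintained CASB feed, not a hand-kept list.
AI_DOMAINS = {
    "chatgpt.com", "chat.openai.com", "api.openai.com",
    "claude.ai", "api.anthropic.com",
    "gemini.google.com", "aistudio.google.com",
    "copilot.microsoft.com",
}

def domain_matches(queried: str, watchlist: set[str]) -> bool:
    """True if the queried name equals or is a subdomain of a watched domain."""
    queried = queried.rstrip(".").lower()
    return any(queried == d or queried.endswith("." + d) for d in watchlist)

def sweep(log_lines: list[str]) -> dict[str, int]:
    """Count hits per domain. Assumes 'timestamp client domain' per line."""
    hits: dict[str, int] = {}
    for line in log_lines:
        parts = line.split()
        if len(parts) < 3:
            continue
        domain = parts[2]
        if domain_matches(domain, AI_DOMAINS):
            hits[domain] = hits.get(domain, 0) + 1
    return hits
```

The subdomain check matters: blocking or flagging only exact matches misses `chat.openai.com`-style hosts, and a naive substring match would false-positive on unrelated domains.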
I think tier 1 is the easy tier to fix, and the wrong one to obsess over. The metrics from LayerX confirm what discovery exercises tend to find. Most of tier 1 is one tool, ChatGPT, on personal accounts, used for tasks that already have an enterprise alternative. The fix is straightforward: an enterprise tier with SSO, named alternatives in the policy, and an amnesty window when the policy launches. Companies that get stuck on tier 1 miss what is happening one layer down.
Tier 2 is the one that broke the old model.
Through 2025, every major productivity SaaS turned into an AI vendor. Microsoft 365 Copilot reads email, calendar, OneDrive, SharePoint, and Teams chat. Slack AI summarises channels and threads. Notion AI rewrites pages. Google Gemini sits inside Workspace and reads Gmail and Drive. Zoom AI Companion writes meeting summaries. Atlassian Rovo crawls Jira and Confluence. Salesforce Einstein touches CRM records. None of these required your team to install a new tool. They activated through admin toggles, free trials, and per-user opt-ins inside platforms you already pay for.
The tier 2 trap is that none of this shows up in your network telemetry. The egress is to microsoft.com and slack.com. You already allow that traffic. Tier 1 detection methods fail completely.
The tier 2 discovery vector is the audit endpoint of each SaaS, and it is uneven. Microsoft Purview's Unified Audit Log captures Copilot interactions (who, when, which document was used as context) but it does not capture the prompt or response text. Slack enterprise plans expose AI usage through the audit logs API. Notion's audit log is enterprise-tier only. Google Workspace logs Gemini activity through the Admin SDK. Each platform has a different format, a different retention default, and a different gap between "we log it" and "you can search it."
For Microsoft 365 Copilot, the Purview audit log records the interaction event but not the prompt text. If you need actual content for a discovery request or a breach investigation, you need to enable Microsoft Purview eDiscovery and run a content search against the user's CopilotInteractionHistory mailbox folder. Most teams discover this on day one of an investigation, not day one of deployment. Turn it on before you need it.
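Once you have an audit log export, the filtering step is simple. The sketch below pulls Copilot interaction events out of a JSON-lines export; the field names (`CreationTime`, `UserId`, `Operation`) follow the general shape of M365 unified audit records, but treat them as assumptions and check them against your own export.

```python
import json

# Sketch: pull Copilot interaction events out of a unified-audit-log
# export in JSON-lines form. Field names are assumptions based on the
# general shape of M365 audit records; verify against a real export.
def copilot_interactions(export_lines):
    """Yield (timestamp, user) for records whose Operation mentions Copilot."""
    for line in export_lines:
        record = json.loads(line)
        if "copilot" in record.get("Operation", "").lower():
            yield record.get("CreationTime"), record.get("UserId")
```

Remember what this gives you: who and when, not the prompt text. Content searches still go through eDiscovery.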
The harder problem with tier 2 is that the AI feature inherits the user's existing permissions. If the user can read a document, the AI feature can read it. The result is overpermissioned summarisation. A Copilot prompt that asks for "this quarter's deal pipeline" returns deal data from every Salesforce object the user can technically reach, including ones they would never normally open. The recurring Copilot rollout incident through 2024 and 2025 was a sales rep discovering they had inherited access to the executive team's compensation review folder because the SharePoint permissions had drifted years earlier.
I think tier 2 is now the dominant form of shadow AI in any company that has an M365 or Workspace tenant. The LayerX 92% concentration figure is correct for the unmanaged channel and misleading for the whole picture. Once you count the volume of in-SaaS AI features that read customer data daily, the long tail is bigger than ChatGPT.
Tier 3 is small in 2025 and structural in 2026.
An AI agent is an LLM with tools. Some of those tools are read-only API calls. Some are write operations. The way agents get access to enterprise data is through OAuth: a developer or an employee clicks "Connect to Google Drive" or "Connect to Slack" inside an agent platform, grants a scope, and the agent inherits a long-lived token that does not require the user to be online. The agent then runs on its own schedule against your data.
Gartner forecasts that 40% of enterprise applications will ship task-specific AI agents by the end of 2026, up from under 5% in 2025. That is the pace you are governing against. Most of those agents will be installed by a developer or a department lead, configured against a personal Google or Microsoft account, and granted scopes that nobody cleared.
The tier 3 discovery vector is the OAuth grant log of your identity provider. In Google Workspace, the path is Admin Console → Security → API controls → App access control → Manage third-party app access. In Microsoft Entra, it is Enterprise applications → All applications, filtered by OAuth permissions. In Okta, the OAuth grants report. The list is usually longer than the security team expects, because OAuth grants accumulate across years and survive employee offboarding by default.
OAuth grants for AI agents survive offboarding unless you revoke them explicitly. When you disable the user account, the agent's refresh token usually keeps working until the token's natural expiry, and many refresh tokens last 90 days, six months, or until revoked. An agent installed by a former employee can keep reading a Drive folder for weeks after their last day. Add "revoke all OAuth grants" to the offboarding runbook and to the quarterly review of the OAuth grants report.
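The quarterly review step above reduces to a cross-reference: grants whose installing user no longer has an active account. A minimal sketch, with invented record shapes; Google's tokens report, Entra's OAuth permissions export, and Okta's grants report all differ in their actual fields.

```python
# Sketch: cross-reference an OAuth grants export against active accounts.
# Record shapes are illustrative assumptions; each IdP exports differently.
def orphaned_grants(grants, active_users):
    """Return grants whose installing user is no longer an active account."""
    return [g for g in grants if g["granted_by"] not in active_users]

grants = [
    {"app": "agent-platform-x", "scope": "drive.readonly", "granted_by": "alice@example.com"},
    {"app": "meeting-notes-bot", "scope": "calendar.read", "granted_by": "bob@example.com"},
]
active = {"alice@example.com"}

for g in orphaned_grants(grants, active):
    # Flags the former employee's grant: a revocation candidate, and a
    # candidate for the offboarding runbook step described above.
    print(g["app"], g["granted_by"])
```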
The other tier 3 problem is identity. When the agent acts, the audit log shows the agent token, not the human who installed it. Your incident response process needs a way to map agent tokens back to the human or service account that authorised them, and that mapping is not in the audit log itself. It has to be maintained separately, in the register described below.
A real discovery exercise across the three tiers usually surfaces the same shape regardless of company.
Tier 1 dominates the headcount but not the data exposure. Most tier 1 usage is individual employees in marketing, HR, and support pasting content into ChatGPT. The fix is an enterprise tier and an amnesty window.
Tier 2 dominates the data flow. The audit logs from Microsoft Purview, Slack Enterprise Grid, Notion enterprise, and Google Workspace will reveal that Copilot, Slack AI, and similar features are summarising customer data, employee data, and internal documents at a much higher volume than any tier 1 tool. The fix is to scope the AI feature's permissions to what the user actually needs, not what the user inherits, and to turn on eDiscovery before you need it.
Tier 3 dominates the long tail of risk per OAuth grant. Most companies have between 10 and 50 OAuth grants for AI tools that nobody recognises. A handful are critical. The fix is the OAuth grants report on a calendar.
A discovery sweep across all three tiers usually takes a competent security engineer two weeks and surfaces two or three immediate-action items. A tier 1 user who pasted client PII. A tier 2 permission inheritance bug. A tier 3 OAuth grant from a former employee. Those three findings are usually what justifies the budget for the rest of the program.
Anonymous self-report surveys still have a place. They are the only way to surface tier 1 usage on personal devices on mobile networks, and they capture intent and motivation that telemetry cannot. Run the survey alongside the technical sweep, frame it as amnesty rather than audit, and treat the gap between survey and telemetry as your real visibility number. If the survey says 60% of staff use AI tools and your telemetry shows 12%, the gap is your tier 1 blind spot.
EU AI Act Article 4 was the easiest provision to dismiss when the Act passed. It is two sentences. It says providers and deployers must take measures to ensure "a sufficient level of AI literacy" for staff who deal with AI systems. There is no specific curriculum, no certification standard, and no defined penalty.
It is also already in force. Article 4 took effect on 2 February 2025 alongside the Article 5 prohibitions. National market surveillance authorities start enforcing on 2 August 2026.
The reason this matters for shadow AI is the inversion of the burden. Before Article 4, an employer's defence to a customer data leak through ChatGPT was usually "the employee acted outside our policy." After Article 4, the question is whether the employer ensured the staff member had the AI literacy to know not to do that, and whether the employer can produce any evidence of literacy measures at all. "We did not know our staff were using AI" cannot survive Article 4, because Article 4 creates an affirmative obligation to know.
The case law here is genuinely fuzzy. There has been no Article 4 enforcement action as of April 2026, and the standard for "sufficient" is unlitigated. France's CNIL and Germany's BNetzA have both indicated Article 4 will be assessed as part of broader AI Act compliance reviews; neither has put down a specific threshold. I am not sure how aggressive the first round of enforcement will be, and the regulators may give a long grace period. But the discovery layer for Article 4 is the same as the discovery layer for shadow AI: you cannot evidence literacy for AI uses you cannot enumerate. The register is the artefact that ties literacy to use.
The other lever is GDPR Article 5(1)(a). Personal data processing through an unauthorised AI tool is processing without a lawful basis, processing without a DPA where required by Article 28, and a likely transfer outside the EEA without an Article 46 mechanism. The Italian Garante, the French CNIL, and the Spanish AEPD have all opened enforcement files against employers for tier 1 incidents through 2025. Article 4 is the new lever. Article 5(1)(a) was always the old one.
Most shadow AI advice ends in "write an acceptable use policy." The acceptable use policy is necessary and not sufficient. The artefact that actually carries weight is the register.
A shadow AI register is one living document with one row per AI use you can name. The minimum columns:
| Column | Content |
|---|---|
| Use ID | Short slug |
| Tier | 1 / 2 / 3 (browser tab / in-SaaS toggle / OAuth agent) |
| Tool | Vendor name and product |
| Owner | Named individual |
| Personal data | Yes / no, and what categories |
| Lawful basis | The GDPR Art. 6 basis, with a note if special category |
| DPA in place | Yes / no, link to signed copy |
| Sub-processor route | Direct or via existing SaaS contract |
| Risk class | Low / medium / high, mapped to AI Act if applicable |
| Article 4 literacy | Training reference for the user group |
| Last review | Date |
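If you want to lint the register rather than just read it, the table above maps naturally onto a small data class. A minimal sketch: the field names mirror the columns, a few columns are folded together for brevity, and the 90-day staleness threshold is an assumption you would tune to your own cadence.

```python
from dataclasses import dataclass
from datetime import date

# Sketch of a register row as code. Fields mirror the table above;
# the 90-day staleness default is an assumption, not a standard.
@dataclass
class RegisterRow:
    use_id: str
    tier: int                 # 1 browser tab / 2 in-SaaS toggle / 3 OAuth agent
    tool: str
    owner: str                # named individual
    personal_data: bool
    lawful_basis: str         # GDPR Art. 6 basis
    dpa_in_place: bool
    risk_class: str           # low / medium / high
    literacy_ref: str         # Article 4 training reference
    last_review: date

    def is_stale(self, today: date, max_age_days: int = 90) -> bool:
        """True when the row has missed its review window."""
        return (today - self.last_review).days > max_age_days
```

A one-line staleness check per row is what turns "reviewed on a calendar" from an intention into something a script can nag about.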
The register replaces the approved-tools list because the approved-tools list goes stale within a quarter and stops being read. The register also gives you a single artefact to hand to a market surveillance authority asking how you complied with Article 4. It is the unit that ties the discovery exercise to the policy to the regulatory obligation.
The review cadence is monthly for the first six months and quarterly after that. The monthly cadence catches the speed of vendor change in 2025-2026, since every major SaaS shipped a material AI feature update in the last twelve months and any cadence longer than monthly will miss most of them. The quarterly cadence after the program matures is a compromise between the cost of the review and the velocity of change.
I think the register is the only artefact in this space that survives both audit and onboarding turnover. Policies rot. Approved-tools lists rot faster. A register kept short, owned by a named person, and reviewed on a calendar is the one piece of the program I would defend in front of a DPA. Everything else is supporting material.
If your acceptable use policy went out last quarter and you cannot tell me how many AI uses your staff have today, your program is at zero. If you have a register with twelve rows and one named owner, you are ahead of most of the field. The metric that matters is the gap between the AI uses your register names and the AI uses your discovery sweep finds. Drive that gap toward zero, monthly, and you have a defensible Article 4 program.
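That gap metric is a set difference. The sketch below uses invented tool slugs; the point is only that "register names" minus "sweep findings" is computable, and the remainder is the number to drive toward zero.

```python
# Sketch: the visibility gap from the paragraph above. Tool slugs are
# invented; in practice both sets come from the register and the sweep.
def register_gap(register_tools: set[str], discovered_tools: set[str]) -> set[str]:
    """AI uses the sweep found that the register does not yet name."""
    return discovered_tools - register_tools

registered = {"chatgpt-enterprise", "m365-copilot", "slack-ai"}
discovered = {"chatgpt-enterprise", "m365-copilot", "slack-ai", "otter-ai", "cursor"}

gap = register_gap(registered, discovered)
# gap == {"otter-ai", "cursor"}: two rows to add, or two uses to shut down
```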
The 2 August 2026 enforcement date is the only deadline that matters in this space. Two things to do before then.
First, run the discovery sweep across all three tiers. Two weeks of a security engineer's time. The output is a draft register with maybe twenty rows and three or four findings worth immediate action. If you cannot get to a draft register in two weeks, you have surfaced a different problem: your audit logs and OAuth telemetry are not where you thought they were, and that is the first thing to fix.
Second, write the literacy reference for each row. Not a course. Not a certificate. A short page per AI use that names the tool, the categories of data it can and cannot see, the escalation path for an incident, and the owner. The combined literacy references for all rows in your register are your Article 4 evidence package on the day a market surveillance authority asks.
Do not start with a policy, do not start with a ban, and do not start with a tool inventory in a spreadsheet. Start with the register and the discovery sweep that fills it. The rest of the program (policy, training, vendor evaluation) has somewhere to attach once the register exists.