Five concentric rings of agent blast radius (read, write, OAuth reach, external input, memory) anchored on the AEPD's 18 February 2026 agentic AI guidance and EchoLeak (CVE-2025-32711).
The AEPD's 81-page guidance on agentic AI, published on 18 February 2026, is the document that EU privacy lawyers and DPAs are now reading for agentic AI cases. It is more granular than anything the EDPB has shipped on agents specifically, and it names six risk categories that map directly to the technical controls a security team can build: lack of accountability, poor user data access management, inadequate governance of multiple processing purposes, insufficient human oversight, prompt injection attacks, and unauthorised memory access.
The guidance does two things that make it load-bearing. It treats agent memory as a regulated processing surface, distinguishing working memory (the context window during execution) from management memory (long-term storage and learned context). And it names the "shadow-leak" threat: the silent, gradual leakage of personal data through repeated queries and pattern inference from long-term storage, a class of exfiltration that does not look like a data breach until you reconstruct the query log months later.
The rest of this article walks the agent's reachable state space as five concentric rings of blast radius. Each ring has its own AEPD risk category, its own technical control, and its own typical failure mode. The point is that "least privilege" for a reasoning agent is not one decision at deployment time. It is five decisions at five different layers, and each layer fails differently when you skip it.
The first ring is the read scope. Which databases, files, and APIs can the agent retrieve from, and which records inside each one? This is the ring everyone audits first and the ring most teams still get wrong, because access added for a demo three months ago is rarely removed.
The Gravitee 2026 survey found that only 24.4% of enterprises have full visibility into which AI agents are communicating with which systems, and that more than half of all agents in production run with no logging of tool invocations at the execution layer. The default agent in 2026 is a system prompt plus a fetched OAuth token plus broad read access to a database the team forgot to row-level-secure. The audit question to start with is not "what could the agent read in theory" but "what did the agent actually read in the last seven days," and most teams cannot answer it.
The fix is two layers. Row-level security and column-level masking on the database itself, so a misconfigured query cannot return columns the agent's task does not need. And task-scoped credentials at the application layer, so the credential the agent presents to the database is bounded to the task that requested it, not to the agent's deployment identity. The first layer defends against drift in the application code. The second defends against drift in the agent's reasoning.
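The application-layer half of this can be sketched as a short-lived, task-scoped credential object. Everything here is hypothetical (the names `TaskCredential`, `mint_task_credential`, and the column sets are illustrative, not any particular product's API); the point is that the credential is bounded to what one task declared it needs, not to the agent's deployment identity:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class TaskCredential:
    """Credential bounded to one task, not to the agent's deployment identity."""
    task_id: str
    allowed: dict        # table name -> set of readable columns
    expires: datetime

    def check_read(self, table: str, columns: list) -> None:
        """Raise if the task tries to read outside its declared scope."""
        if datetime.now(timezone.utc) > self.expires:
            raise PermissionError(f"credential for task {self.task_id} expired")
        granted = self.allowed.get(table, set())
        denied = set(columns) - granted
        if denied:
            raise PermissionError(
                f"task {self.task_id} may not read {sorted(denied)} from {table}")

def mint_task_credential(task_id: str, needs: dict, ttl_minutes: int = 15) -> TaskCredential:
    """Issue a short-lived credential scoped to exactly what the task declared."""
    return TaskCredential(
        task_id=task_id,
        allowed={table: set(cols) for table, cols in needs.items()},
        expires=datetime.now(timezone.utc) + timedelta(minutes=ttl_minutes),
    )
```

Even if the agent's reasoning drifts and it issues a broader query, the credential refuses it at the application layer, and row-level security on the database refuses whatever slips past that.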
The AEPD makes a sharp distinction between operational logs (for performance and debugging) and the compliance audit trail (for Article 30 records of processing). Operational logs do not need to capture the specific personal data the agent touched. The compliance trail does. The retention windows should also be different: operational logs short, compliance trail aligned with your overall GDPR retention policy. Most teams build one log and bolt the compliance use onto it; the AEPD's framing is that they should be two logs with different sanitisation and access rules.
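The two-log split can be made concrete in a few lines. This is a minimal sketch, not the AEPD's prescribed implementation: `record_tool_call` and its field names are invented for illustration. One event, two entries, different sanitisation:

```python
import hashlib
import json
import logging

operational = logging.getLogger("agent.ops")         # debugging; short retention
compliance = logging.getLogger("agent.compliance")   # Article 30 trail; GDPR-aligned retention

def record_tool_call(task_id, tool, record_ids, data_categories):
    """Write one event to two logs with different sanitisation rules."""
    # Operational entry: enough to debug, no personal data -- record IDs are hashed.
    ops_entry = {
        "task": task_id,
        "tool": tool,
        "n_records": len(record_ids),
        "records_digest": hashlib.sha256(
            ",".join(sorted(record_ids)).encode()).hexdigest()[:12],
    }
    # Compliance entry: which records and data categories were actually touched.
    audit_entry = {
        "task": task_id,
        "tool": tool,
        "record_ids": list(record_ids),
        "data_categories": list(data_categories),
    }
    operational.info(json.dumps(ops_entry))
    compliance.info(json.dumps(audit_entry))
    return ops_entry, audit_entry
```

The access rules follow from the split: engineers read the operational log freely; the compliance trail sits behind the same controls as any other store of personal data.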
The second ring is the write scope. Updates, inserts, deletes, status changes, file modifications, and any irreversible operation. This ring is smaller than the read ring in most agents but it carries more incident risk per operation, because a write that should not have happened is harder to roll back than a read that should not have happened.
The OWASP AI Agent Security Cheat Sheet recommends a risk-based approval framework that maps action sensitivity to a human-in-the-loop threshold:
| Risk level | Example | Approval |
|---|---|---|
| Low | Read operations on non-sensitive data | Auto-approved |
| Medium | Write operations, status updates | Queued for review |
| High | Financial transactions, external comms | Human approval required |
| Critical | Irreversible operations, bulk deletes | Mandatory human review with two reviewers |
The pattern that fails is "approval at deployment time, free hand at runtime." The agent gets blanket write access in the deployment IAM role, and the only check is whether the deployment passed a security review six months ago. The runtime layer where the agent decides which write to issue has no policy at all.
The fix is an authorisation gateway between the agent and the data plane: every write the agent attempts is evaluated against the current task's approved risk level, and high-risk writes are queued for a named human reviewer rather than auto-approved. This gateway is where you implement the AEPD's "automatic circuit breakers for anomalous activity." A burst of 200 deletions in 30 seconds from one task scope is an anomaly the gateway should refuse, regardless of whether the credentials are valid.
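A gateway of this shape can be sketched in one class. The action-to-risk mapping, thresholds, and return strings below are all hypothetical; the two behaviours that matter are (a) writes above the task's approved risk level are queued rather than executed, and (b) a burst of operations trips the breaker even with valid credentials:

```python
import time
from collections import deque

# Hypothetical mapping from action type to risk level (see the table above).
RISK = {"read": "low", "update": "medium", "payment": "high", "bulk_delete": "critical"}
LEVELS = ["low", "medium", "high", "critical"]

class WriteGateway:
    """Evaluate each attempted write against the task's approved risk level,
    and trip a circuit breaker on anomalous bursts."""

    def __init__(self, max_ops=50, window_s=30):
        self.recent = deque()            # timestamps of recent operations
        self.max_ops = max_ops
        self.window_s = window_s

    def evaluate(self, action: str, approved_level: str, now=None) -> str:
        now = time.monotonic() if now is None else now
        # Circuit breaker: refuse bursts regardless of credential validity.
        while self.recent and now - self.recent[0] > self.window_s:
            self.recent.popleft()
        if len(self.recent) >= self.max_ops:
            return "refused:circuit_breaker"
        self.recent.append(now)
        # Unknown actions default to the most restrictive level.
        level = RISK.get(action, "critical")
        if LEVELS.index(level) > LEVELS.index(approved_level):
            return "queued_for_human_review"
        return "auto_approved"
```

The gateway, not the model, is where the 200-deletions-in-30-seconds anomaly gets refused.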
I think the read-write separation is the most under-built control in agent stacks today. Teams that have built it, with row-level security on reads and a gateway on writes, usually built it after their first real incident. The cost of building it before is much smaller than the cost of building it after.
The third ring is the one most security teams underestimate, because it is the ring where the agent's reachable state is bigger than the database it is supposedly bounded to.
When an agent is granted OAuth access to a SaaS system (Google Drive, Slack, GitHub, Salesforce, Notion, M365), it usually inherits the full scope of the human user who authorised the connection. This is the ambient authority problem: the agent can reach everything that user can reach, even when the task at hand only needs a tiny slice. A 2025 audit of public MCP server implementations found that 43% had OAuth flow flaws, 43% had command injection vulnerabilities, 33% allowed unrestricted network access, and 22% allowed file access outside the intended data sources. CVE-2025-6514, an OAuth proxy remote-code-execution issue in MCP, affected over 500,000 developers when it landed.
The ambient authority problem compounds across MCP servers. An agent connected to two MCP servers, each granting "the same scope as the user," can synthesise actions across both that neither server's authorisation policy was designed to permit. The agent can read a private channel from one server and write a public file in the other. The cross-server policy that would catch this pattern lives at the agent gateway, not in any individual MCP server. If you have not built that policy layer, the security review of each MCP server in isolation will miss the cross-server attack class entirely.
The fix in this ring is the same shape as ring 2 but with a different set of controls. Mint short-lived task-scoped tokens at the agent gateway rather than passing through the user's OAuth refresh token. Bound the scope of each token to the specific MCP servers and capabilities the task needs. Revoke on task completion. Audit OAuth grants quarterly and treat any agent OAuth grant that survives offboarding as a finding.
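The broker pattern can be sketched as follows. This is a toy model, not a real OAuth 2.0 token-exchange implementation: `TaskTokenBroker` and the scope strings are invented, and a production version would sit behind your identity provider. The invariant it illustrates is that the agent never holds the user's grant, only an intersection of it, with a short TTL and task-level revocation:

```python
import secrets
import time

class TaskTokenBroker:
    """Gateway-side broker: holds the user's long-lived grant, hands agents
    only short-lived tokens bounded to the scopes one task declared."""

    def __init__(self):
        self._live = {}   # token -> (task_id, scopes, expiry timestamp)

    def mint(self, task_id, requested_scopes, granted_scopes, ttl_s=300):
        # Never exceed the human grant; usually much less than it.
        scopes = set(requested_scopes) & set(granted_scopes)
        token = secrets.token_urlsafe(24)
        self._live[token] = (task_id, scopes, time.time() + ttl_s)
        return token

    def authorise(self, token, scope):
        entry = self._live.get(token)
        if entry is None or time.time() > entry[2]:
            return False
        return scope in entry[1]

    def revoke_task(self, task_id):
        # Revoke on task completion: ring 3 shrinks back to zero.
        self._live = {t: e for t, e in self._live.items() if e[0] != task_id}
```

The cross-server policy from the previous paragraph also lives here: because every MCP-bound token passes through one broker, the broker is the first place that can see "read from server A, then write to server B" as a single pattern.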
The depth of work here is larger than a configuration change. It is an architectural decision: are agents principals in their own right inside your identity provider, or are they impersonators of human users? The Gravitee survey number on this is bracing: only 21.9% of organisations treat agents as independent identities and 45.6% rely on shared API keys. The default is impersonation. Impersonation is what makes the third ring large.
The fourth ring is the one where the threat is not the agent's permissions but the agent's inputs. Anything the agent reads from an untrusted source (an email, a customer message, a web page, a document, a file from a partner) can contain instructions that override the agent's intended task. Prompt injection is the AEPD's fifth named risk category and it is the single most-cited attack class in production agent failures from 2025-2026.
EchoLeak (CVE-2025-32711, CVSS 9.3, disclosed June 2025 by Aim Security) was the first published real-world zero-click prompt injection in a production LLM system. Aim Security's research, later published as arXiv 2509.10540, chained four bypasses: evading Microsoft's XPIA cross-prompt-injection classifier with specific phrasings, circumventing link redaction with reference-style Markdown, exploiting auto-fetched images, and abusing a Microsoft Teams proxy that the content security policy allowed. The result was data exfiltrated from Outlook and SharePoint with no user interaction, in a fully patched M365 Copilot deployment, on a default configuration.
EchoLeak shows that classifier-based prompt-injection defences are necessary but not sufficient. The XPIA classifier was designed exactly to catch the pattern that EchoLeak exploited; Aim Security found phrasings that slipped past it. If your agent's prompt-injection defence is "the model's safety training plus a classifier," your defence has the same shape as the one EchoLeak bypassed. The architectural fix is to treat untrusted input as data, not as instructions: separate the user's question from the retrieved content at the prompt boundary, enforce that the agent cannot follow instructions found inside retrieved content, and isolate the egress channels so a prompt injection cannot exfiltrate to an attacker-controlled domain even if it succeeds at instruction injection.
The defensive pattern that works is layered. Treat retrieved content as data, lock egress to a small allowlist of approved destinations, add a content-type-aware classifier as a tripwire (knowing it can be bypassed), and log every action the agent takes after reading untrusted content. The honest framing is that prompt injection cannot be eliminated against a determined attacker today. The realistic goal is to make a successful injection unable to exfiltrate or to perform irreversible writes, by constraining the rings around it.
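The egress layer is the one piece of this that is mechanically simple. A minimal sketch, with a hypothetical allowlist (the hostnames below are placeholders for your own approved destinations):

```python
from urllib.parse import urlparse

# Deny-by-default: only destinations your agent legitimately needs.
EGRESS_ALLOWLIST = {"api.internal.example.com", "graph.microsoft.com"}

def egress_allowed(url: str) -> bool:
    """A successful prompt injection that tries to exfiltrate to an
    attacker domain fails here, not at the model."""
    host = urlparse(url).hostname or ""
    # Exact hostname match: "graph.microsoft.com.evil.net" does not pass.
    return host.lower() in EGRESS_ALLOWLIST
```

EchoLeak is the argument for the exact-match comment above: the exfiltration path ran through a Teams proxy that the content security policy happened to allow, which is an allowlist entry that was broader than anyone intended.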
The fifth ring is the one the AEPD spent the most pages on and the one most teams have not thought about at all. The agent's memory is its own processing surface.
The AEPD distinguishes working memory (the context window during execution, which contains the prompt, the retrieved documents, the tool call results, and the partial reasoning chain) from management memory (long-term storage that persists across sessions: vector stores of past conversations, learned user preferences, cached tool outputs, fine-tuning datasets harvested from production). Both are personal data processing surfaces under GDPR if the contents include identifiable individuals, and both are subject to Article 5 minimisation, Article 17 erasure, and Article 30 records of processing.
The shadow-leak threat lives here. A user asks the agent twenty questions across two months, each question containing one fragment of personal data. None of the individual questions look sensitive. The aggregate, sitting in management memory, reconstructs a profile that the user never consented to. This is not a hypothetical from the AEPD. It is how the AEPD describes the dominant exfiltration pattern for agents in deployment: not a single dramatic breach but the slow assembly of a profile through legitimate-looking queries, stored in a memory layer the controller did not realise was a processing surface.
The fix is a memory policy that the AEPD calls memory compartmentalisation: separation of organisational memory from per-user memory, retention limits per memory type, sanitisation of obsolete entries, and explicit segregation between memory used for task execution and memory used for model improvement. Most agent platforms in 2026 ship with a single undifferentiated vector store as memory. That default fails the AEPD's test. The work is to add policy on top of the default.
I am still not sure how aggressive EU DPAs will be in enforcing the AEPD's memory framing in the next 12 months, because no test case has landed yet and the AEPD itself has not announced an enforcement priority for agentic AI. But the 81-page document is a published interpretive position and it is the position your DPIA will be measured against if a complaint lands. Treat it as the floor.
Two things change for agentic AI in the EU between now and the end of 2026.
First, the AI Act Article 6 high-risk classification becomes enforceable on 2 August 2026. If your agent makes or influences decisions about people in any of the Annex III categories (employment, education, access to essential services, biometrics, law enforcement, critical infrastructure), the full obligations apply: data governance, technical documentation, human oversight, post-market monitoring. Your agent's blast radius assessment is the input to most of these obligations, not a separate exercise. The DPIA you already owe under GDPR Article 35 carries through to the FRIA (Fundamental Rights Impact Assessment) under AI Act Article 27(4), and the agent register is the artefact that ties them together.
Second, the AEPD guidance is the first detailed national DPA reading of agentic AI under GDPR, and other DPAs (CNIL, BfDI, the Garante) tend to converge with each other within 6-12 months on documents this granular. The AEPD framing of memory as a regulated surface, of shadow-leak as a named threat, and of the six risk categories will likely show up in the next CNIL or EDPB document on the same topic. Building to the AEPD now is buying optionality on the next round of EU guidance.
The operational artefact the article keeps pointing back to is an agent register: one row per deployed agent, with the five blast-radius rings (read scope, write scope, OAuth scope, input sources, memory layout) named explicitly and a named owner. The register replaces the IAM policy as the single source of truth for "what can this agent do" because the IAM policy answers ring 2 and ring 3 only. The register has to span all five rings or it does not match the AEPD's expected documentation surface.
For each agent in production, can you draw the five rings on a single page? Ring 1 (what it reads), ring 2 (what it writes), ring 3 (what it can reach through OAuth), ring 4 (what untrusted inputs it consumes), ring 5 (what it stores in memory across sessions). If you cannot, the agent's blast radius is implicit, and "least privilege" is a slogan rather than a control. If you can, you have a register entry that maps to the AEPD's six risk categories and to your DPIA. That single page is the deliverable.
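One register row, sketched as data. Every field name and value here is hypothetical (a made-up support-triage agent), and the five-ring key layout is this article's convention rather than a schema the AEPD publishes; the check at the bottom is the whole point, because a row with a missing ring is an implicit blast radius:

```python
# Hypothetical register entry: one row per deployed agent, five rings, one owner.
AGENT_REGISTER_ENTRY = {
    "agent": "support-triage-agent",
    "owner": "jane.doe@example.com",
    "ring_1_read": ["tickets.subject", "tickets.status", "kb.articles"],
    "ring_2_write": ["tickets.status (medium risk, queued for review)"],
    "ring_3_oauth": ["slack chat.write -- task-scoped token, 5 min TTL"],
    "ring_4_inputs": ["inbound customer email (untrusted)"],
    "ring_5_memory": ["per-user vector store, 30-day retention"],
}

REQUIRED_RINGS = ["read", "write", "oauth", "inputs", "memory"]

def register_entry_complete(entry: dict) -> bool:
    """A register row is usable only if all five rings and an owner are named."""
    rings = all(f"ring_{i + 1}_{r}" in entry for i, r in enumerate(REQUIRED_RINGS))
    return rings and bool(entry.get("owner"))
```

Run `register_entry_complete` across the whole register in CI and a new agent cannot ship with an unnamed ring.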
Two things to do before the next code change to your agent lands in production.
First, draw the five rings for the agent as it exists today. Two hours of work for an engineer who built it. The output is a single-page diagram and a register entry. If any ring is too big to draw, that ring is the first thing to scope down.
Second, point your DPIA at the AEPD's 81-page document and the six risk categories explicitly. Map each risk category to the ring that contains its mitigation. The DPIA is your evidence package for August 2026, and the AEPD framing is what makes it readable to a DPA reviewer who has read the same document.
If you only do one of these, do the diagram. The DPIA is the artefact a regulator reads. The diagram is the artefact your team uses to refuse the next "just give it write access for the demo" request, which is where most blast-radius growth comes from.