EU AI data residency in 2026: the seven layers where data lives, the CLOUD Act mechanic, the OpenAI in-region GPU launch, and when sovereignty beats residency.
On 16 January 2026, OpenAI extended its EU data residency programme to in-region GPU inference. For eligible ChatGPT Enterprise, Edu, Healthcare, and API customers, model execution on European customer content now happens on European GPUs. Before that update, "EU residency" with OpenAI meant prompts could be stored in Europe, but the inference itself (the part where the prompt becomes a response) still ran on US hardware. The fine print on the OpenAI help centre is the part that matters: even with inference residency turned on, "non-GPU processing may still occur globally", and "activities such as authentication, routing, and analytics may still occur outside the selected region".
That sentence is the entire article in miniature. Residency is not a binary. It is a stack of layers, each with its own answer, and the marketing version usually only talks about one of them.
The word "residency" travels with three different meanings, and most disagreements about whether a vendor "is EU" come from two people using different ones in the same conversation.
Physical residency is where the bytes sit. A datacentre in Frankfurt holding your prompts, with no copy elsewhere. This is the easiest to verify and the easiest to satisfy. Every major hyperscaler has European regions; every frontier model provider can keep at-rest data in Europe if you ask for it.
Jurisdictional residency is whose courts can compel access to those bytes. This is the question the CLOUD Act answers in a way Europeans tend to find unwelcome. The bytes can be physically in Frankfurt. If the operator is a US corporation or a subsidiary controlled by one, a US warrant can still reach them.
Operational residency is who has admin control, who holds the keys, and who can read the running infrastructure. A perfectly EU-resident dataset behind a perfectly EU-jurisdictional contract is still operationally exposed if the support engineer reading your incident logs sits in Bangalore or Seattle. This is the layer GDPR Article 28(2) and the Article 28(3)(b) confidentiality obligation point at, and it is the layer that the EUCS High level was supposed to standardise before the sovereignty requirements were quietly removed in March 2024.
When a vendor says "EU data residency", they almost always mean the first one. When a procurement team asks "is this EU-resident?", they almost always mean the second or the third. That mismatch is why the sales call ends in agreement and the legal review ends in a fight.
A single AI API call fans out into more places than the architecture diagram shows. Each is a separate residency question.
| Layer | What sits here | Typical residency story |
|---|---|---|
| Inference (GPU execution) | Prompt + model weights, milliseconds | The marquee "EU residency" claim |
| Storage at rest | Conversation history, files, fine-tunes | Usually covered, sometimes separately |
| Cache (KV / prompt cache) | Recent prompt prefixes, attention state | Rarely mentioned; ephemeral but real processing |
| Logs (request / response) | Full payloads for abuse + debugging | Default-on at most providers; opt-out exists |
| Metadata | Timestamps, user IDs, token counts, IP | Usually flows through global telemetry |
| Sub-processors | Hyperscaler, monitoring, analytics, payments | Each one a separate residency question |
| Orchestration | Auth, routing, abuse detection, support | The layer the OpenAI footnote refers to |
The pattern that bites teams is this. Inference and storage are the two layers vendors compete on, so those are the two layers buyers check. The other five are where the residency promise quietly leaks. The CNIL's €42M decision against Free and Free Mobile on 13 January 2026 was largely about the log retention layer, not about anything the customer-facing application did wrong. Logs are processing under Article 4(2). Caches are processing. Metadata is processing. Sub-processors are sub-processors, and Article 28(2) requires authorisation per change.
The honest version of "we are EU-resident" is a row-by-row answer to this table. If a vendor cannot give you one, you have a layer audit to do.
The auth and routing layer is usually the leakiest one. Read OpenAI's inference-residency footnote one more time. Authentication, routing, and analytics may still occur outside the selected region. That is how internet-scale infrastructure works, not a bug. The same is true for Azure, Vertex, and Bedrock. If your threat model includes the metadata stream around the prompt (who asked, when, from where, how often), inference residency does not cover it. Treat it as a separate audit row.
Jurisdiction is the layer where the conversation gets uncomfortable, and it is the layer most data residency content treats as an aside. It is the load-bearing question.
The US CLOUD Act (Clarifying Lawful Overseas Use of Data Act, 2018) gives US courts the authority to compel any entity subject to US jurisdiction to produce data in its "possession, custody, or control", regardless of where the data physically sits. The legal mechanism is corporate, not territorial. The question a US warrant turns on is not "where are the bytes" but "is the entity holding the bytes a US corporation or controlled by one".
Stack that against GDPR Article 48: a third-country court order is only enforceable in the EU through a mutual legal assistance treaty or another international agreement. The two regimes do not agree. When a US warrant lands on Microsoft Corporation for data physically held by Microsoft Ireland Limited in Frankfurt, there is no clean answer that satisfies both. This is the gap European cloud sovereignty initiatives have been trying to close since 2021.
In practice this gives you four jurisdictional postures, in increasing order of distance from the CLOUD Act: a US-parented provider on global infrastructure; a US-parented provider with an EU region or data boundary (the bytes stay, the corporate control does not); an EU-operated entity licensing US technology, which is the Bleu and Delos model; and an EU-native provider or self-hosted deployment with no US entity in the control chain.
Why Bleu and Delos are structurally different from Azure. The corporate structure matters in a specific way. Bleu's commercial entity is jointly held by Capgemini and Orange (both French, neither owned by Microsoft). Microsoft licenses the technology stack but does not operate the infrastructure or hold the data. Under the standard CLOUD Act test (does the US entity have "possession, custody, or control"?), the answer for a Bleu customer is no, because Microsoft Corporation does not control the operator. That is the structural difference from Microsoft's "EU Data Boundary", which is a technical commitment but leaves the operator inside the US corporate group. Same software, different jurisdictional posture.
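The corporate mechanic can be sketched as a toy control-chain walk. This is an illustrative simplification of the legal test, not legal analysis: real "possession, custody, or control" questions turn on contracts and corporate facts, not a parent pointer, and the entity names and chain model below are assumptions for the sketch.

```python
# Toy model of the CLOUD Act "possession, custody, or control" test:
# exposure follows the corporate control chain, not the datacentre.
# Entity names and the single-parent model are illustrative only.

PARENT = {
    # operator entity -> controlling parent (None = no US controller)
    "Microsoft Ireland": "Microsoft Corporation (US)",
    "Bleu SAS": None,          # Capgemini/Orange JV; Microsoft licenses tech only
    "Mistral AI SAS": None,    # French SAS, no US parent
    "OpenAI Ireland": "OpenAI, L.L.C. (US)",
}

def us_jurisdiction(entity: str) -> bool:
    """Walk the (toy) control chain; reachable if any controller is US."""
    while entity is not None:
        if "(US)" in entity:
            return True
        entity = PARENT.get(entity)
    return False

for operator in PARENT:
    verdict = "CLOUD Act reachable" if us_jurisdiction(operator) else "outside US corporate control"
    print(f"{operator}: {verdict}")
```

The point the sketch makes is the one from the Bleu paragraph: the same software stack produces different answers depending solely on who sits at the top of the operator's control chain.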
The numbers and fine print here move every quarter. This table reflects the state on 11 April 2026.
| Provider | EU inference | EU storage | Sub-processor disclosure | CLOUD Act exposure |
|---|---|---|---|---|
| OpenAI direct API | Yes (in-region GPU since 16 Jan 2026, eligible tiers) | Yes (since 5 Feb 2025) | List published, updated April 2025 | Yes — OpenAI, L.L.C. is Delaware |
| Azure OpenAI | Yes (Sweden, Germany via Data Zone EUR) | Yes | Microsoft sub-processor list | Yes — Microsoft Corporation parent |
| Anthropic direct API | Yes (since August 2025) | Yes | Trust Center listing | Yes — Anthropic PBC is Delaware |
| Anthropic via Vertex / Bedrock | Yes (EU regions) | Yes | Hyperscaler chain + Anthropic | Yes — both layers US-parented |
| Google Vertex AI | Yes (multiple EU regions) | Yes | Google Cloud sub-processor list | Yes — Alphabet parent |
| Mistral La Plateforme | Yes (EU-only default) | Yes (EU-only) | Listed; small chain | No — French SAS, no US parent |
| Microsoft Bleu | Yes (FR, SecNumCloud) | Yes | Bleu chain, no US operator | No — operator is Capgemini/Orange JV |
| Self-hosted on EU cloud | Full control | Full control | Your own | Depends on the cloud entity |
A few things have moved that the older guides have not caught up to.
The OpenAI residency programme jumped twice in 2025-2026. Storage in Europe arrived 5 February 2025; Asia and other regions followed in May; worldwide rollout in October; and in-region GPU inference for Europe on 16 January 2026. The 16 January update is the one most teams missed because it was filed under "expansion" rather than "launch". It is materially different. Before it, your prompt physically left the EU on every API call. After it, eligible accounts can keep the GPU step in-region too, with the auth/routing/analytics caveat above.
The Anthropic story is more complicated than it looks. Anthropic's direct API offers EU processing and storage since August 2025. But Anthropic also serves Claude models via AWS Bedrock, Vertex, Azure, and (since 7 January 2026) as a sub-processor inside Microsoft 365 Copilot for most commercial Microsoft cloud regions, off by default for EU/EFTA/UK customers. The 23 October 2025 expansion of the Anthropic-Google Cloud TPU partnership added another sub-processor layer below the model: when you call Claude in certain configurations, your prompt now potentially traverses Google's TPU infrastructure even if you bought through AWS. Each of those layers carries its own residency answer. Our Anthropic vs OpenAI vs Google comparison walks the deltas in detail.
Mistral remains the only frontier-class commercial provider with EU-only as the default rather than as an add-on. The corporate posture is the differentiator, not the model quality. Mistral's largest models trail the frontier on benchmark tasks, but for the workloads where jurisdictional residency is load-bearing, the gap is acceptable in exchange for not having a CLOUD Act conversation.
The OpenAI EU-residency switch turns on Zero Data Retention as a side effect. Enabling the European data residency option for the API automatically activates Zero Data Retention without a separate application or contract. This is the cheapest, fastest reduction in your sub-processor cascade you can make today: one toggle in the OpenAI dashboard, no negotiation. We covered the full mechanic in OpenAI's data processing agreement: what it actually says.
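In code, the switch mostly amounts to pointing the client at the regional endpoint for a project with EU residency enabled. A minimal sketch, with two caveats: the regional base URL below follows OpenAI's residency documentation at the time of writing and should be verified against the current docs, and the helper function is an assumption of this article, not part of any SDK.

```python
# Hedged sketch: select the API base URL for an EU-residency project.
# The eu.api.openai.com endpoint is taken from OpenAI's residency docs
# at the time of writing -- verify before relying on it.

REGIONAL_BASE_URLS = {
    "eu": "https://eu.api.openai.com/v1",    # EU residency project (ZDR auto-enabled)
    "default": "https://api.openai.com/v1",  # global routing
}

def base_url_for(region: str) -> str:
    """Return the base URL for a region, falling back to global routing."""
    return REGIONAL_BASE_URLS.get(region, REGIONAL_BASE_URLS["default"])

# With the official SDK this would be wired as (not executed here):
#   client = OpenAI(base_url=base_url_for("eu"), api_key=...)
print(base_url_for("eu"))
```

Remember the footnote from the top of the article: even with this endpoint, auth, routing, and analytics may still occur globally.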
Most teams do not need sovereignty. They need residency. The line between the two is the threat model.
You need physical residency if your data is personal data of EU individuals, or if a customer contract demands EU processing, or if you operate in a sector where the supervisory authority is touchy about transfers (healthcare, finance, government). This is the bar most teams should aim for. It is achievable with any of the major hyperscaler regions or any frontier model API with an EU residency option.
You need jurisdictional sovereignty when the threat model includes a foreign-government legal order: a US warrant, a national security letter, a CLOUD Act subpoena. This is rare. It applies to a narrow set of workloads: sectoral classified data, certain defence contracts, some judicial and law-enforcement records, and (increasingly) data covered by national "trusted cloud" doctrines like SecNumCloud in France or BSI C5 in Germany. If your threat model does not name a foreign court, you almost certainly do not need sovereignty. It is a different kind of investment with a different cost shape.
You need operational sovereignty when the threat model includes the cloud operator's own personnel as a risk surface. This is even rarer. It applies in a few intelligence-adjacent and competitive-intelligence-sensitive workloads. If you genuinely need it, the question stops being about cloud regions and starts being about HSMs, customer-managed keys, and bring-your-own-encryption schemes that the cloud operator cannot decrypt even on demand.
The honest version of the decision tree: physical residency for almost everyone, jurisdictional sovereignty for a known short list of regulated workloads, operational sovereignty only when something else has already told you that you need it. Mistaking the first for the second is the most expensive thing teams in this space do.
Even teams that get the inference and storage layers right often miss four others. These are the audit rows that catch teams out at the next supervisory authority visit.
Observability and telemetry. Sentry, Datadog, LangSmith, Helicone, Phoenix, Langfuse Cloud are all sub-processors under Article 28, and most of them ship event payloads to US infrastructure by default. Sentry with send_default_pii=True and the default capture settings of Datadog LLM Observability route raw prompts to US-resident telemetry storage unless you explicitly redact upstream. The CNIL Free / Free Mobile decision named log retention as a primary failing, and the same logic applies to LLM observability stacks the moment they hold customer content. We walk the specific architectures in logging and monitoring AI outputs.
Embedding and retrieval pipelines. Pinecone, Weaviate, Qdrant Cloud, and the managed vector tiers from the hyperscalers all have residency stories that are independent of the model API you call to generate the embeddings in the first place. A team can run inference inside the EU and still ship the resulting vector representation to a US-only Pinecone region, where the EDPB's Opinion 28/2024 anonymity test says you have probably just shipped personal data again, because embeddings are reversible. Our vector embeddings as personal data piece walks the mechanic.
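A cheap guardrail is to refuse upserts into any vector store whose region is not on an EU allow-list. A sketch, assuming illustrative region codes (check your provider's actual region identifiers before copying the set):

```python
# Guardrail sketch: block embedding writes to non-EU vector store regions.
# Region codes below are illustrative assumptions, not a canonical list.

EU_REGIONS = {"eu-west-1", "eu-central-1", "europe-west4", "francecentral"}

def assert_eu_resident(store_name: str, region: str) -> None:
    """Raise before an embedding upsert would leave the EU."""
    if region not in EU_REGIONS:
        raise ValueError(
            f"{store_name} is in {region}: if embeddings are personal data "
            "(per the EDPB Opinion 28/2024 reasoning), this is a transfer."
        )

assert_eu_resident("qdrant-prod", "eu-central-1")    # passes silently
# assert_eu_resident("pinecone-legacy", "us-east-1") # would raise
```

Calling this once in the pipeline's write path turns the "EU inference, US vectors" mistake from a silent leak into a loud failure.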
Caches and dead-letter queues. Prompt caches at the model layer are usually ephemeral and intra-region, but application-side caches (Redis, Memcached, CloudFront edge caches) and retry buffers (SQS, Pub/Sub, dead-letter queues) often live somewhere completely different from the model API. A failed call that hits the DLQ and waits six hours for a manual replay is processing during those six hours, in whatever region the queue lives in.
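The DLQ problem is easy to audit mechanically: a message sitting in a non-EU queue is processing in that region for as long as it sits there. A sketch of the check, with illustrative message and region shapes:

```python
# Sketch: flag dead-letter messages that have been resident in a non-EU
# queue region longer than a tolerated window. Message shape, region
# codes, and the one-hour threshold are illustrative assumptions.

from datetime import datetime, timedelta, timezone

EU_QUEUE_REGIONS = {"eu-west-1", "eu-central-1"}

def dlq_findings(messages, queue_region, max_dwell=timedelta(hours=1)):
    """Return IDs of messages dwelling out-of-region past the threshold."""
    if queue_region in EU_QUEUE_REGIONS:
        return []
    now = datetime.now(timezone.utc)
    return [m["id"] for m in messages
            if now - m["enqueued_at"] > max_dwell]
```

Run the same check against application caches and retry buffers: the question is always "where does this component live, and how long does content dwell there".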
The metadata stream. This is the one that gets dismissed the most often. User IDs, IP addresses, timestamps, token counts, request IDs, tracing spans: all of it travels with the prompt and almost none of it stops at the EU border. Metadata is personal data when it can identify a natural person; the CJEU has been clear on this point since Breyer (C-582/14) in 2016. The OpenAI inference-residency footnote about "authentication, routing, and analytics" is a polite way of saying the metadata stream is global by design.
Audit each of these as a separate row in your sub-processor register. The pattern in the supervisory authority decisions of the last twelve months has not been "the inference layer leaked"; it has been "we found a sub-processor nobody had logged".
The single most useful artefact for the work above is a residency map: one row per data layer, one column per provider, one cell per residency answer. Inference, storage, cache, logs, metadata, sub-processors, orchestration. For each cell, the answer is one of: EU only, EU+US (controlled), global, or unknown. The "unknown" cells are the ones to chase. The "global" cells are the ones to either accept consciously or fix.
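The map is small enough to live as data in the repo. A sketch with one illustrative provider row (the cell values are placeholders, not audited facts); the audit output is exactly the two lists the paragraph describes: cells to chase and cells to consciously accept or fix.

```python
# The residency map as data: one row per layer, cell values drawn from a
# closed vocabulary. The example provider row is illustrative, not audited.

LAYERS = ["inference", "storage", "cache", "logs",
          "metadata", "sub-processors", "orchestration"]
ANSWERS = {"eu-only", "eu-us-controlled", "global", "unknown"}

residency_map = {
    "example-provider": {
        "inference": "eu-only", "storage": "eu-only", "cache": "unknown",
        "logs": "eu-only", "metadata": "global",
        "sub-processors": "unknown", "orchestration": "global",
    },
}

def audit(rmap):
    """Split the map into cells to chase (unknown) and cells to decide (global)."""
    chase, decide = [], []
    for provider, rows in rmap.items():
        for layer in LAYERS:
            answer = rows.get(layer, "unknown")
            if answer not in ANSWERS:
                raise ValueError(f"bad cell: {provider}/{layer} = {answer}")
            if answer == "unknown":
                chase.append((provider, layer))
            elif answer == "global":
                decide.append((provider, layer))
    return chase, decide

chase, decide = audit(residency_map)
```

A missing layer defaults to "unknown" on purpose: a layer nobody filled in is a layer nobody audited.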
For most teams, the work after building that map is small: turn on EU residency where it exists and is free, redact PII upstream of the observability stack, move the embedding store to an EU region, and accept the metadata layer as a residual risk to document in the DPIA. For a small minority of teams the work is larger and points at Bleu, Delos, Mistral, or self-hosting. The EU Cloud and AI Development Act, expected as a proposal on 27 May 2026, will reshape some of this picture later in the year. It includes EU-wide eligibility requirements for cloud providers and a single EU procurement framework, and it is the most consequential regulatory move on this question since the DPF. The layer-by-layer map will still be the prerequisite for any decision after CADA exists.
The residency test that survives contact with reality. Pick one production AI feature. Walk the seven layers above and assign each a residency answer. If you cannot answer one of them in less than thirty seconds, that is the layer to audit first. The map is the deliverable; the cleanup is downstream of having one.