AI PrivacyApril 4, 2026· 11 min read

AI memory and persistent context: what your chatbot remembers about users

AI memory is profiling, and the deletion story is broken. The two layers, the NYT court order, the CIMemories benchmark, memory poisoning, and what survives a supervisory audit.

TLDR

Persistent memory in AI assistants is profiling under GDPR Article 4(4). That triggers Article 13/14 transparency, Article 15 access, Article 17 erasure, Article 21 objection, and an Article 35 DPIA in most cases.
Memory exists in two layers and only one is deletable. Memory-layer data (ChatGPT saved memories, CLAUDE.md, Gemini Personal Intelligence connections) is structured and removable. Model-weight knowledge is encoded in parameters and effectively cannot be removed without retraining.
"Delete" promises do not always survive contact with US litigation. The 13 May 2025 preservation order in NYT v OpenAI requires OpenAI to retain deleted ChatGPT conversations for Free, Plus, Pro, and Team accounts indefinitely. Enterprise is excluded.
Memory poisoning is now a documented attack class. Microsoft Defender's 10 February 2026 report named 50 distinct examples from 31 companies across 14 industries injecting persistent instructions via "Summarize with AI" buttons.
The CIMemories benchmark (arXiv 2511.14937, 18 November 2025) found frontier models leak up to 69% of remembered attributes into wrong contexts. GPT-5's leak rate climbs from 0.1% at one task to 25.1% under repeated prompts. Memory drift is architectural, not a configuration issue.

The cleanest way to think about persistent AI memory is through Helen Nissenbaum's framework of contextual integrity: privacy is the appropriate flow of information for the context, not the absence of information sharing. The CIMemories benchmark, published on arXiv on 18 November 2025, applied that framework to the new generation of memory-equipped chatbots. It built synthetic user profiles with over 100 attributes and tested whether frontier models would surface each attribute only in tasks where it belonged. The headline number is 69% leakage on the worst-performing model, but the more disturbing finding is the drift: GPT-5 leaks 0.1% of attributes in single-task settings, 9.6% across 40 tasks, and 25.1% when the same prompt is run five times. Memory generalises beyond what it explicitly remembers, and the generalisation is unstable.

That instability is the product of the architecture, not of any specific configuration. It is the part most builders of memory features have not internalised. Below is the rest of what changes when your AI feature crosses the line from "stateless inference" to "remembers".

Two layers, two completely different deletion stories

AI memory exists in two layers. The distinction is the single most important thing to internalise about this topic, because the GDPR rights and the technical reality split along this line.

Memory-layer data. Explicit, structured information stored in your provider's infrastructure as a queryable record. ChatGPT's saved memories under Settings → Personalization → Manage memories. Claude's CLAUDE.md files per project, and the auto-memory that accumulates preferences and architecture notes. Gemini Personal Intelligence's connections to Gmail, Photos, YouTube, and Docs (launched January 2026, opt-in by default). Custom instructions and system prompts at the application layer. This data is viewable, editable, and deletable. It is the layer Article 15 access requests can actually answer.

Model-weight knowledge. Patterns learned during training or fine-tuning. If a user's data was part of the training corpus, fragments of it may be encoded in the model's parameters. This layer is not viewable, not editable, and effectively cannot be removed without retraining the model from scratch. The EDPB's Opinion 28/2024 was explicit on this point: "whenever memorisation may happen, the model may not be anonymous". The corollary is that the provider, not the deployer, owns the deletion problem at this layer, and the honest answer to a customer who asks "did the model learn anything from my data" is usually "we cannot tell, and probably not, but the provider is on the hook for ensuring it".

For most product builders, the memory-layer data is what you control. It is also where almost every Article 17 deletion request will be answered. Walk our right to erasure article for the cluster context: the same three-tier deletability map applies here, with memory-layer in tier 1 and model-weight in tier 3.

Why persistent memory is profiling under Article 4(4)

GDPR Article 4(4) defines profiling as "any form of automated processing of personal data consisting of the use of personal data to evaluate certain personal aspects relating to a natural person, in particular to analyse or predict aspects concerning that natural person's performance at work, economic situation, health, personal preferences, interests, reliability, behaviour, location or movements".

Read the list. AI memory that tracks a user's preferences, interests, work patterns, and communication style across sessions hits "personal preferences", "interests", and "behaviour" as a matter of plain text. It is automated. It processes personal data. It evaluates personal aspects. The question is not whether persistent memory is profiling. The question is which of the resulting obligations apply.

The cascade:

Article 13/14 transparency. Tell users that their data is being used for profiling, what is remembered, why, and for how long. The "saved memories" panel inside ChatGPT is part of this, but it is not enough on its own. The privacy notice has to name the practice in advance, in the language of the regulation.
Article 15 access. Users can ask what the AI remembers about them. Memory-layer data is straightforward to surface. Inferred preferences (the implicit knowledge the system built rather than explicitly stored) are harder, and the EDPB CEF 2024 enforcement report found this is one of the most-failed parts of subject access requests across the EU. Our Article 15 walkthrough covers the failure modes.
Article 17 erasure. Memory-layer data must be deletable on request. Model-weight knowledge is the provider's problem. The next section explains why "deletable" is more contested than it sounds.
Article 21 objection. Users can object to profiling based on legitimate interest. You must stop unless you can demonstrate compelling grounds that override their interests. For most consumer assistant use cases, "compelling grounds" is hard to argue.
Article 22 automated decisions. If the memory feeds a decision that produces legal or similarly significant effects (a hiring screen, an insurance quote, a loan), CJEU SCHUFA C-634/21 (December 2023) and C-203/22 Dun & Bradstreet (27 February 2025) have expanded Article 22 to cover decisions that "draw strongly" on the automated processing, not just fully-automated ones. Memory-fed scoring is in scope.
Article 35 DPIA. Systematic profiling at scale, especially with sensitive categories, triggers a mandatory Data Protection Impact Assessment. Our DPIA practical check walks the criteria.

If you are building an AI feature with persistent memory, treat it as profiling from the architecture stage. Retrofitting compliance after launch is the most expensive failure mode in this entire category.

"Delete" doesn't mean delete: the NYT court order

The single most important fact about ChatGPT memory in 2026 is that OpenAI is currently under a 13 May 2025 court order requiring it to retain deleted ChatGPT conversations indefinitely. The order arose from the New York Times v OpenAI copyright lawsuit, where the NYT successfully argued that OpenAI's normal 30-day deletion of consumer chats would destroy evidence relevant to the litigation. OpenAI has been preserving deleted and temporary chats since mid-May 2025.

The scope: ChatGPT Free, Plus, Pro, and Team subscriptions. Enterprise is excluded. The preservation applies even when the user clicks delete, even when the user used Temporary Chat mode, even when the user opted out of training. OpenAI's stated position is that the data sits under legal hold, accessible only to a small audited legal and security team, and that any actual disclosure to plaintiffs would require court-approved discovery. The company is challenging the order. As of April 2026, the order is in force.

For European deployers this matters in two ways. First, the order is on the face of it incompatible with the Article 17 right to erasure for any user data that ends up inside an affected ChatGPT consumer account, which is the entire point of the privacy controls the user thinks they have. Second, the contractual position changes: an EU customer who trusted "we delete after 30 days" in the consumer terms now has a documented gap between the contract and the operational reality, and that gap is itself a compliance finding.

Memory-layer deletion is contingent on litigation status, not on the user's intent. Read your AI vendor's DPA carefully and ask the explicit question: are any of my users' data subject to litigation hold? For ChatGPT Free / Plus / Pro / Team accounts in April 2026, the answer is yes. OpenAI has been preserving deleted conversations under the May 2025 NYT preservation order. Enterprise is excluded by the order. The same situation could arise tomorrow against any other vendor and the only operational answer is to know which tier your users are on and to be prepared to adjust your privacy notice and DPIA when the situation changes. The "deletion happens within 30 days" line in your privacy notice should not survive this kind of order without an asterisk.

For model-weight knowledge, the deletion story is different but equally bleak. Machine unlearning research has progressed in 2025 (RMU, SimNPO, LoRA-based variants), and the October 2025 systematic survey on arXiv (2510.25117) gives the field a common vocabulary, but no method has cleared the production bar of "verifiably deleted, with no measurable utility loss, on a deployed model with adversarial probing". The honest answer to "can you remove a person's data from the model weights" is still no in April 2026. Plan around the assumption that anything that landed in the training data is permanent. The Italian Garante's €15M Decision 755 against OpenAI, partially annulled by the Court of Rome on 18 March 2026 on procedural grounds (the substantive analysis was not vindicated, only the Garante's process), is the warning shot.

The attack surface: how memory becomes a backdoor

Memory is an attack surface. The 2025-2026 research and enforcement record names three concrete attack classes that builders need to defend against.

Indirect injection into long-term memory. Palo Alto Unit42 demonstrated the canonical version: an adversary embeds hidden instructions in a webpage. The user asks the assistant to summarise the page. The injected payload manipulates the summary and gets stored in persistent memory. In future sessions, the poisoned memory causes the assistant to silently exfiltrate conversation data to an attacker's server. Memory contents injected into system instructions carry high priority, which is precisely what makes poisoned memories more dangerous than transient prompt injection. Standard injection only affects the current session; poisoned memory persists across all future sessions until the user manually removes the entry.

Real-world recommendation poisoning. Microsoft Defender's Security Research team published a report on 10 February 2026 documenting 50 distinct examples from 31 companies across 14 industries that had embedded hidden instructions in "Summarize with AI" buttons on their own websites. The instructions injected persistence commands into the assistant's memory, which then biased the assistant's recommendations on health, financial, and security questions in future sessions. The companies were not all malicious actors in the traditional sense; some were marketing firms, some were content optimisation services. The pattern is the same. The XPIA (cross-prompt injection attack) shape is now a commodity tactic, and the surface area is the entire web your assistant might fetch.

Research-grade memory injection. The 2024-2025 NeurIPS papers describe AgentPoison (knowledge base poisoning with trigger tokens), MINJA (memory injection through query-only interaction), and MemoryGraft (implanting fake experiences). The defence frameworks proposed so far are not complete. A-MemGuard, one of the more cited approaches, misses around 66% of poisoned memory entries in adversarial evaluation. The arms race is in the early innings and the defenders are losing.

Practical defences for now:

Sanitise any content that could enter memory (user inputs, retrieved documents, external content fetched via tools)
Do not automatically persist information from untrusted sources into long-term memory. Require an explicit user action or a heuristic that flags content from untrusted domains
Monitor for anomalous memory entries (instructions, URLs, base64 strings, suspiciously imperative language in what should be factual notes)
Provide users with tools to review what is in their memory and flag suspicious entries. The user is the second line of defence and the only one with the context to spot a poisoned entry that looks plausible to the model

What the named providers actually store in April 2026

The vendor mechanics matter because the deployer's compliance posture inherits from the provider's design. Three providers, three different approaches.

ChatGPT runs two memory systems. Saved memories are user-requested ("remember that I prefer concise answers"). Chat history insights are gathered automatically across sessions and used to personalise future responses. Both are visible and deletable in Settings → Personalization → Manage memories. OpenAI explicitly steers the system away from proactively storing health details, but there is no hard prohibition. The May 2025 NYT court order applies on top of all of this for Free, Plus, Pro, and Team accounts: deletions are preserved, not removed.

Claude uses two related mechanisms. Markdown-based memory files (CLAUDE.md) are per-project and version-controlled by the developer. Auto-memory accumulates preferences, architecture notes, and workflow habits across conversations within a project. Memory is siloed per project, which limits cross-context contamination compared to ChatGPT's global memory model. Claude's Incognito Chat skips both history and memory entirely, giving users a clean ephemeral mode.

Gemini Personal Intelligence, launched January 2026, takes a different approach. Instead of building memory from conversations, it connects directly to the user's Google services: Gmail, Photos, YouTube history, and Docs. Memory is opt-in by default and the user picks which services to connect. The risk profile is therefore not about what the chatbot remembers but about how much of the user's existing personal data graph the assistant can access. From a profiling perspective the breadth is larger and the user's intuition about "the chatbot is just talking to me" is more dangerous, because it is silently reading the user's calendar.

The pattern across all three: the explicit memory layer is increasingly well-controlled and increasingly visible. The implicit inference layer (what does the model think it knows about the user, beyond what is in the saved memories panel) remains opaque. Article 15 access requests will encounter this gap first.

Building memory features that survive an audit

If you are designing AI features with persistent memory in 2026, the controls below are the operational subset of the GDPR cascade above. They are also the controls that will distinguish your DPIA from a checkbox exercise.

Opt-in by design, not opt-out. Memory should require explicit user action to enable. Gemini Personal Intelligence got this right in January 2026 by making the per-app connections explicit toggles. ChatGPT's automatic chat history insights are harder to defend as opt-in.

Granular visibility in plain language. Users must see what the system remembers in understandable form. Not raw JSON. Not token IDs. Plain-language summaries of what is stored and why each memory exists. The CIMemories result on contextual leakage is the engineering case for this: if your model is leaking 9.6% of attributes into wrong contexts after 40 tasks, the user is the only fail-safe.

Individual deletion alongside clear-all. A user who wants to remove a single health-related memory should not have to wipe everything. The "clear all" option is a backstop. Per-memory deletion is the feature.

Genuinely non-persistent sessions. Offer a mode where nothing is remembered. Claude Incognito and ChatGPT Temporary Chat are the right shape, modulo the NYT court order caveat for ChatGPT consumer tiers. Document the mode in the privacy notice.

Retention limits with documented justification. Indefinite retention is hard to justify under the storage limitation principle in GDPR Article 5(1)(e). Set a default (30 days, 90 days, whatever fits your use case) and let users extend or reduce it. The CNIL Free / Free Mobile €42M decision (€27M Free Mobile / €15M Free, 13 January 2026) named log retention as a primary failing, and the same logic will apply when supervisory authorities start auditing memory retention.

Purpose limitation, enforced. Memory collected for personalisation should be used for personalisation. Not for training. Not for analytics. Not for advertising. Document the purpose; enforce it in the data flow; record the enforcement mechanism in the DPIA.

The single most useful diagnostic: open the settings page as a real user and try to see your own memory. Most teams shipping memory features have never tested the deletion flow from the user's perspective. Sit at the screen as a non-engineer, navigate to "what does this AI remember about me?", try to delete a single memory, try to disable memory entirely, try to read what was inferred about you. If the experience is clunky or the inferred-knowledge layer is invisible, you have shipped a feature your users cannot meaningfully control. That is the gap supervisory authorities will pick up first when persistent-memory features start getting audited at scale in late 2026.

Memory and minors. If minors can use your AI feature, persistent memory creates heightened obligations under GDPR Article 8 and the EU Commission's 14 July 2025 Guidelines on the protection of minors. Avoid profiling minors through AI memory at all. The corpus walks the full childrens-data picture in AI and children's data.

Before launch

The four-question audit your memory feature must survive. Before shipping any feature with persistent memory: (1) can a user see exactly what the system remembers about them, in plain language, without raw IDs; (2) can a user delete a single memory without nuking everything; (3) does your privacy notice describe the profiling cascade explicitly, including any third-party court orders that affect deletion; (4) have you sanitised the inputs that can land in long-term memory against the 50 documented poisoning cases Microsoft Defender named on 10 February 2026. If any of those is no, the feature is not audit-ready, regardless of what the launch document says.

Sources

Continue reading

GDPR + AIApr 8, 2026

Right to erasure when your AI used the data: what's actually deletable in 2026

GDPR Article 17 applied to AI stacks after the EDPB's February 2026 CEF report. Three deletability tiers, what unlearning cannot do yet, and a response template.

11 min read

AI PrivacyMar 9, 2026

Do you need a DPIA for your AI feature? A practical check

The trigger question is settled. The harder question is which assessment, and when. EDPB Opinion 28/2024, CNIL July 2025, and the Article 27(4) FRIA carry-over.

12 min read

AI PrivacyMar 23, 2026

Fine-tuning vs RAG vs prompt engineering: privacy implications

The privacy tradeoffs between fine-tuning, RAG, and prompt engineering for AI customization. Erasure feasibility, EDPB Opinion 28/2024, and the production hybrid pattern.

10 min read

Free tool · live

AI Data Flow Checker

Map how personal data flows through your AI integrations and spot the privacy risks before they spot you.