The privacy tradeoffs between fine-tuning, RAG, and prompt engineering for AI customization. Erasure feasibility, EDPB Opinion 28/2024, and the production hybrid pattern.
The choice between fine-tuning, RAG, and prompt engineering looks like a quality-and-cost decision in the engineering documentation and a compliance decision in the legal review. It is the same decision under both lenses, but the legal lens dominates as soon as personal data enters the picture. The reason is straightforward: of the three approaches, only fine-tuning bakes a person's data into a form you cannot delete without retraining the model. The other two let you delete either trivially (prompt engineering) or with tractable engineering effort (RAG).
This article walks the five factors that distinguish them, using EDPB Opinion 28/2024 as the load-bearing legal anchor and the 2025-2026 unlearning research as the load-bearing technical anchor.
| | Fine-tuning | RAG | Prompt engineering |
|---|---|---|---|
| Where data lives | Encoded in model weights (permanent) | Stored in vector database (deletable) | Context window only (ephemeral) |
| Right to erasure | Research-grade unlearning, not production | Feasible with lineage tracking | Trivial (stop including it) |
| Training data extraction risk | High (membership inference, memorisation) | Low (but embedding inversion possible) | None |
| Data sent to provider | Entire training dataset upfront | Prompts with retrieved context per query | Prompts with context per query |
| Data minimisation | Hardest (full datasets processed) | Medium (minimise at ingestion + retrieval) | Easiest (context window naturally limits) |
| Best for | Domain language, behaviour patterns | Factual accuracy, current data, citations | General tasks, sensitive data |
Where the data lives is the fundamental distinction.
Fine-tuning encodes training data into model weight deltas through gradient descent. Personal data does not sit in a table you can query and delete. It is distributed across billions of parameters as statistical patterns. Three levels of memorisation happen: verbatim (exact text reproduction, especially from small or repeated datasets), fuzzy (statistical patterns enabling inference), and conceptual (learned facts without source attribution).
Membership inference attacks now reach high practical accuracy. SPV-MIA (NeurIPS 2024) achieves AUC of 0.9 at detecting whether a specific document was in the fine-tuning set. If you fine-tune on personal data and expose the model via API, an adversary can with reasonable confidence determine whether someone's data was used. Under GDPR, that detection is itself a form of personal data processing, and arguably itself a breach.
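The core intuition behind these attacks can be sketched with a toy loss-threshold classifier. This is not SPV-MIA itself, and every number below (document IDs, losses, the calibration rule) is invented for illustration; the real attacks are far more sophisticated, but they exploit the same signal: models assign unusually low loss to text they were trained on.

```python
def loss_threshold_mia(losses: dict[str, float], threshold: float) -> dict[str, bool]:
    """Flag documents whose loss falls below the threshold as likely training members."""
    return {doc: loss < threshold for doc, loss in losses.items()}

# Hypothetical per-document losses an attacker might derive from API logprobs.
observed = {
    "ticket_0042": 0.31,  # suspiciously fluent -> likely in the fine-tuning set
    "ticket_9999": 2.87,  # typical loss for unseen text
}

# Calibrate on reference documents known NOT to be in the training set.
reference_losses = [2.6, 3.1, 2.9, 2.75]
threshold = min(reference_losses) * 0.5

print(loss_threshold_mia(observed, threshold))
# -> {'ticket_0042': True, 'ticket_9999': False}
```

A naive threshold like this is easy to fool; calibrating against a reference model is what pushes modern attacks into the AUC 0.9 range. The point for the privacy analysis is that the only input the attacker needs is API access.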
RAG stores data in a vector database as embeddings. The data is separate from the model. You can delete specific vectors and their source documents. But embeddings are likely personal data under GDPR, as our piece on vector embeddings as personal data walks through in detail. The Vec2Text family of attacks recovers around 92% of original text from embeddings in some configurations, and zero-shot inversion methods from 2025 can attack any embedding model without per-model training.
Prompt engineering is ephemeral. Data exists in the context window during the API call. No model weights are updated. No persistent storage on your side. When the call ends, the data is gone from the model's perspective. The provider may log the request, subject to their retention policy and to the zero-data-retention controls in their DPA.
GDPR Article 17 gives individuals the right to deletion. Your AI customisation approach determines whether you can comply.
Fine-tuning: The honest answer in April 2026 is that you probably cannot. The only reliably complete way to remove a person's data from a fine-tuned model is to retrain from scratch without that data. Machine unlearning research has progressed substantially in 2025. RMU (Representation Misdirection for Unlearning), SimNPO (Simple Negative Preference Optimisation), and LoRA-based variants like LIBU all show measurable forget rates on benchmark datasets, and the SemEval-2025 Task 4 evaluations gave the field a common testbed. But the recent surveys (arXiv 2510.25117, the October 2025 systematic survey) are explicit that no method has yet cleared the production bar of "verifiably deleted, with no measurable utility loss, on a deployed model with adversarial probing". RMU and SimNPO are the strongest candidates and they still leave residual memorisation traces.
For the deletion-strategy decision, treat unlearning as research, not as compliance. Plan your retraining cadence and your data subject response process around the assumption that erasure of fine-tuned personal data means retraining or accepting non-compliance. The right to erasure article walks the EDPB CEF 2025 enforcement context and the three deletability tiers in detail.
RAG: Feasible but operationally complex. You need document-to-vector lineage from day one. Every vector ID maps to a source chunk, every chunk maps to a source document, every document maps to the data subjects it contains. When someone requests deletion, query the lineage table, delete the vectors, delete or redact the source documents. Without lineage tracking, you cannot fulfil the request. See building RAG with customer data for the implementation pattern.
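The lineage pattern can be sketched as a small relational schema plus one query. The table names, IDs, and in-memory SQLite backend below are illustrative assumptions; in production the same structure lives wherever your ingestion pipeline keeps metadata, and the returned vector IDs are handed to your vector store's delete API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents    (doc_id TEXT PRIMARY KEY);
    CREATE TABLE chunks       (chunk_id TEXT PRIMARY KEY, doc_id TEXT);
    CREATE TABLE vectors      (vector_id TEXT PRIMARY KEY, chunk_id TEXT);
    CREATE TABLE doc_subjects (doc_id TEXT, subject_id TEXT);
""")

# Ingestion: record lineage at the same moment the vectors are written.
conn.execute("INSERT INTO documents VALUES ('doc-1')")
conn.executemany("INSERT INTO chunks VALUES (?, ?)",
                 [("chunk-1a", "doc-1"), ("chunk-1b", "doc-1")])
conn.executemany("INSERT INTO vectors VALUES (?, ?)",
                 [("vec-001", "chunk-1a"), ("vec-002", "chunk-1b")])
conn.execute("INSERT INTO doc_subjects VALUES ('doc-1', 'subject-42')")

def vectors_for_subject(conn, subject_id: str) -> list[str]:
    """Walk the lineage: subject -> documents -> chunks -> vector IDs to purge."""
    rows = conn.execute("""
        SELECT v.vector_id
        FROM doc_subjects s
        JOIN chunks  c ON c.doc_id   = s.doc_id
        JOIN vectors v ON v.chunk_id = c.chunk_id
        WHERE s.subject_id = ?""", (subject_id,)).fetchall()
    return sorted(r[0] for r in rows)

print(vectors_for_subject(conn, "subject-42"))
# -> ['vec-001', 'vec-002']
```

The design point is that the mapping is written at ingestion time; trying to reconstruct subject-to-vector lineage after an erasure request arrives is usually impossible.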
Prompt engineering: Trivially achievable. Stop including the person's data in prompts. Check your provider's retention policy for any cached data. Done.
The Garante / Court of Rome story is not the absolution most articles read it as. The Italian Garante's Decision 755 against OpenAI (November 2024) imposed a €15M fine partly because OpenAI had not established a legal basis for processing personal data in training before training began. The Court of Rome annulled the decision on 18 March 2026, but the public reasoning so far is procedural (about the Garante's jurisdiction and process), not a substantive vindication of training-without-basis. The full reasoning has not yet been published. Treat the case as a warning that EU DPAs view training without a documented Article 6 basis as actionable, even if the first decision was overturned. The next case is going to be cleaner.
Each approach requires a legal basis under GDPR Article 6, but the requirements differ.
Fine-tuning requires a legal basis for training on personal data. The EDPB's Opinion 28/2024 (17 December 2024) is the key document here. It walks a three-step legitimate interest test for AI training and gives a concrete example that most articles skip: a generative model fine-tuned on the voice recordings of an individual to mimic their voice cannot be considered anonymous, because the model itself encodes recoverable personal data. The same logic applies to a model fine-tuned on a single customer's tickets, on an HR dataset of named employees, or on the writing style of an identified person.
The three steps the EDPB lays out (the standard Article 6(1)(f) legitimate interest test, applied to AI training):

1. Identify a legitimate interest that is lawful, clearly articulated, and real rather than speculative.
2. Show the processing is necessary for that interest, with no less intrusive way to achieve it.
3. Balance the interest against data subjects' rights, freedoms, and reasonable expectations, including whether they would expect their data to be used for model training.
Consent is a hard alternative. It must be specific to training and withdrawable, but withdrawal does not undo training. Purpose limitation applies: data collected for service delivery cannot automatically be repurposed for fine-tuning.
RAG requires a legal basis for storing personal data in the vector database and for retrieving it during queries. This typically aligns with the same basis as your core service (legitimate interest or contract performance). The key addition: address data minimisation at both ingestion and retrieval. A RAG system that retrieves irrelevant personal data because the semantic similarity threshold is too broad is over-processing, and the fix is a higher threshold or a retrieval filter, not a contractual workaround.
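That retrieval-layer minimisation can be sketched as a post-retrieval filter. The hit format, the 0.80 threshold, and the purpose tag below are illustrative assumptions; the real values come from your own relevance evaluation and your records of processing.

```python
def minimised_retrieval(hits: list[dict], threshold: float = 0.80,
                        allowed_purpose: str = "support") -> list[dict]:
    """Drop retrieved chunks that are only weakly similar or tagged for another purpose.

    Over-broad retrieval of personal data is over-processing: a similarity
    floor plus a purpose filter keeps only what the query actually needs.
    """
    return [h for h in hits
            if h["score"] >= threshold and h["purpose"] == allowed_purpose]

hits = [
    {"id": "vec-1", "score": 0.91, "purpose": "support"},  # relevant, in scope
    {"id": "vec-2", "score": 0.62, "purpose": "support"},  # weakly similar -> dropped
    {"id": "vec-3", "score": 0.88, "purpose": "hr"},       # wrong purpose -> dropped
]
print([h["id"] for h in minimised_retrieval(hits)])
# -> ['vec-1']
```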
Prompt engineering requires a legal basis only for the inference call itself. This is usually covered by the same basis as the service you are providing. Simplest of the three.
Fine-tuned models are vulnerable to three attack classes: membership inference (detecting if specific data was in the training set), training data extraction (generating verbatim training content), and model inversion (reconstructing approximate inputs from outputs). The two-stage pipeline (prompt with known prefixes, score candidates via membership inference) can extract specific training sequences from production models. The attack effectiveness has improved rapidly through 2024-2025, with SPV-MIA-style attacks now reaching the AUC 0.9 range that makes them practical for adversaries with API access.
RAG systems are vulnerable to prompt injection via retrieved documents (5 poisoned documents reach 90% attack success in the RAG poisoning literature), embedding inversion (recovering source text from vectors), and cross-boundary data leakage (retrieval that ignores access controls). The EDPS TechSonar page on RAG warns specifically that queries "specific enough to cause RAG systems to retrieve and disclose personal data" without identifiers can still enable individual identification. The singling-out problem reappears at the retrieval layer.
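The cross-boundary leakage risk is worth one concrete sketch: access control has to be enforced at the retrieval layer, not just in the UI. The tenant-tag metadata scheme below is an assumption for illustration; production systems usually push the same filter into the vector store query itself so unauthorised chunks never leave the database.

```python
def authorised_hits(hits: list[dict], user_tenants: set[str]) -> list[dict]:
    """Keep only retrieved chunks the requesting user is allowed to see.

    Retrieval that ignores document ACLs leaks data across tenant
    boundaries even when the chat frontend looks properly scoped.
    """
    return [h for h in hits if h["tenant"] in user_tenants]

hits = [
    {"id": "vec-a", "tenant": "acme"},
    {"id": "vec-b", "tenant": "globex"},  # filtered: user has no access
]
print([h["id"] for h in authorised_hits(hits, {"acme"})])
# -> ['vec-a']
```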
Prompt engineering has the smallest attack surface for data privacy. No training data to extract. No persistent store to invert. The risk is limited to the provider seeing your prompts (addressed by self-hosting or zero-retention agreements) and prompt injection attempting to extract context window contents.
Fine-tuning produces the highest domain specialisation: correct terminology, appropriate tone, understanding of domain-specific concepts. But it requires the most data and carries the most privacy risk. The 2025 industry pattern that has become clear: fine-tuning is over-used for tasks that frontier-model prompt engineering with a strong system message would handle equally well. The cost-benefit case for fine-tuning has narrowed as base models improved.
RAG provides the best factual accuracy for specific queries: it retrieves current data, reduces hallucination, and can cite sources. Privacy risk is medium and manageable with proper architecture. The production winners in 2025-2026 have pushed harder on retrieval quality (better chunking, hybrid search, reranking) and less on fine-tuning the underlying model.
Prompt engineering is good for general tasks but limited by context window size. Quality depends on how well you craft prompts and how much context fits. Privacy risk is lowest. Modern frontier models with 200K+ context windows make prompt engineering viable for use cases that would have required RAG two years ago.
Pseudonymisation before embedding is the cheapest high-impact RAG safeguard. Most teams skip this step because the embedding model can technically handle raw text, but the EDPB Opinion 28/2024 anonymity bar is high enough that the embeddings of unpseudonymised personal data are themselves personal data. Run the source documents through a Presidio-style PII detector (or a hybrid GLiNER pipeline) before chunking and embedding. The cleaned text loses very little semantic signal for retrieval, and the resulting vector store supports a much more honest answer to the "what personal data is in your AI system?" question.
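The ingestion-time shape can be sketched with a minimal regex stand-in for the detector. Real pipelines use NER models precisely because regexes miss names and free-text identifiers ("Maria" survives below); the patterns here are illustrative only. The step slots in between loading a document and chunking it.

```python
import re

# Regex stand-in for a Presidio-style detector: redact, THEN chunk and embed.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}

def pseudonymise(text: str) -> str:
    """Replace detected PII spans with typed placeholder tokens."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text

chunk = "Contact Maria at maria.rossi@example.com or +39 333 123 4567."
print(pseudonymise(chunk))
# -> Contact Maria at <EMAIL> or <PHONE>.
```

Typed placeholders (rather than blank redaction) keep the sentence structure intact, which is what preserves retrieval quality after embedding.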
Almost every team that ships AI features at scale ends up with all three approaches running together, used for what each is good at. The pattern that has settled in 2025-2026:
Prompt engineering for the high-sensitivity path. Customer-support tickets that contain account specifics, legal documents with privileged content, health records, anything that should not persist anywhere outside the immediate inference call. The data is in the context window for one call and then gone. This is also where the zero-data-retention contractual flag earns its keep.
RAG with pseudonymisation and lineage for the factual-accuracy path. Internal knowledge bases, customer history (with PII redacted before embedding), product documentation, anything where retrieval-time freshness matters and the dataset is large enough that prompt engineering would not fit. Build the document-to-vector lineage table from day one or you will not be able to honour erasure requests later.
Fine-tuning for non-personal domain behaviour only. Style, tone, format, domain vocabulary, structured-output patterns. None of the training data should be identifiable. None of it should be specific to a single customer or person. The fine-tuning step is the model's voice, not its memory.
The structural rule that pulls this together: keep personal data out of the fine-tuning layer entirely. Once it is in the weights, you cannot reliably get it out, and the erasure conversation with your supervisory authority becomes a different conversation. Every piece of personal data in your AI stack should sit in the RAG layer or in the prompt, where you can point to where it lives and how it leaves.
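A minimal sketch of the routing layer that enforces this rule, assuming a hypothetical request object with two pre-computed flags (in practice the sensitivity flag comes from a PII detector run on the incoming request):

```python
def route(request: dict) -> str:
    """Route a request to the layer where its data is allowed to live.

    Hybrid pattern: sensitive data -> prompt-only; factual lookups -> RAG;
    everything else -> the style-tuned model, which holds no personal data.
    """
    if request["contains_sensitive_pii"]:
        return "prompt-only"   # ephemeral context window, zero-retention call
    if request["needs_fresh_facts"]:
        return "rag"           # pseudonymised store with lineage tracking
    return "fine-tuned"        # domain voice, not memory

print(route({"contains_sensitive_pii": True,  "needs_fresh_facts": True}))   # prompt-only
print(route({"contains_sensitive_pii": False, "needs_fresh_facts": True}))   # rag
print(route({"contains_sensitive_pii": False, "needs_fresh_facts": False}))  # fine-tuned
```

Note the precedence: sensitivity wins over freshness, so sensitive data never reaches the persistent RAG store even when retrieval would improve the answer.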
The deletion test that decides the architecture. Pick the most-sensitive personal data your AI feature touches. Ask: if the data subject asks me to delete this tomorrow, how do I do it? If the answer is "delete a row from the vector store" or "stop sending it in prompts", your architecture is fine. If the answer is "I would need to retrain the model and that would take weeks", the architecture has put personal data in the wrong layer. Move it before the request arrives.
Vector embeddings of personal data are likely personal data under GDPR. Here is the legal test, the 2025 attack research, the regulator convergence, and how to document your position.
GDPR Article 17 applied to AI stacks after the EDPB's February 2026 CEF report. Three deletability tiers, what unlearning cannot do yet, and a response template.
A practical guide to building RAG systems with customer data while handling GDPR obligations. Lineage tables, retrieval authorization, embedding inversion, and erasure planning.