AI PrivacyFebruary 27, 2026· 12 min read

Your dev team uses Copilot with client code. What to know

Using GitHub Copilot on client repositories is a contract problem before it is a privacy problem. The 2026 sub-processor reality, what Copilot Business actually fixes, and the engagement-level conversation to have first.

TLDR

The contract is the load-bearing constraint, not the privacy policy. If your client agreement names sub-processors or restricts third-party processing of code, every layer of the Copilot stack has to fit inside that frame.
Copilot Business and Enterprise still do not train on customer code. Free, Pro, and Pro+ are different products on the same brand and behave differently. Treat them as different vendors when it comes to client work.
Sub-processor lists move. GitHub Copilot has routed prompts to Anthropic since October 2024 when Claude joined the model picker. Microsoft 365 Copilot then made Anthropic a default sub-processor outside the EU and UK on 7 January 2026. Most client DPAs do not list Anthropic anywhere.
Content exclusion works in IDE surfaces. It is not enforced in Copilot CLI or the Coding Agent as of late 2025. The org audit log records that a Copilot session happened, not what was in the prompt.
The right move is not to ban Copilot. It is to put the engagement-level question to your client before they put it to you.

Why this is a contract problem before it is a privacy problem

The framing most articles use is: does Copilot train on my code? That is the wrong question first. The right first question is whether your client knows you are using it, and whether the contract you signed with them allows it.

For most consulting and product-development engagements, the honest answer is "the contract is silent and the client does not know." Silent is not permissive. When a development contract gets drafted, AI coding assistants are usually not on the lawyer's mind. The confidentiality clause and the sub-processor clause were written for older threats: leaks, lost laptops, named cloud vendors. They still bind you when a new threat shows up.

GDPR Article 28(2) is the operative line for any client work that touches personal data. A processor cannot bring in a new sub-processor without the controller's prior specific or general written authorisation. Most client DPAs go further and name specific sub-processors in an annex, with an objection window for additions. The minute a developer turns on Copilot Business and starts working in a client's repository, GitHub becomes a sub-processor. Microsoft is a sub-sub-processor. And when the developer picks Claude in the model picker, Anthropic gets added to the chain.

Anthropic has been on the GitHub Copilot model picker since October 2024 when Claude 3.5 Sonnet first appeared. Sonnet 4 and Opus 4 hit general availability in June 2025. Sonnet 4.5 followed in October. Claude is now the default choice for many GitHub Copilot users who want strong reasoning on long-context code. Meanwhile Microsoft 365 Copilot ran a parallel migration and made Anthropic a default sub-processor for commercial-cloud customers outside the EU, EFTA, and UK on 7 January 2026. The pattern is the same in both places: the model providers your client never authorised are now in the chain.

(I am still not sure how the European Court of Justice will treat the GitHub-to-Microsoft-to-Anthropic chain when the model inference happens on US infrastructure. The cleanest reading is that you need to refresh the sub-processor annex. The messier reading is that you need a transfer impact assessment too. Both are work, neither has a clean test case yet.)

That is the load-bearing question. Everything else, the tier comparison, the configuration knobs, the security tradeoffs, only matters once you have an honest answer to whether your client agreed to any of this.

What actually moves between your IDE and Microsoft

A Copilot completion is not a single API call. It is a chain.

The IDE collects what GitHub calls "context": the active file, open tabs, sometimes related files imported by the current one. The context window goes to a Microsoft Azure-hosted proxy, passes through pre-model safety screening (toxic-language filtering, jailbreak detection, public-code matching), and then hits the model endpoint for the model you have selected. The completion comes back through the same pipe. For non-IDE surfaces (Copilot CLI, the Coding Agent, the github.com chat surface), prompts and suggestions are retained for 28 days for abuse monitoring. For IDE access, prompts and suggestions are processed and discarded. User engagement data is kept for two years on Business and Enterprise plans either way.

Two things sit awkwardly inside that chain.

The audit log does not capture what was in the prompt. The org-level audit log records that a Copilot session happened, which user it was attached to, and which content-exclusion rules changed when. It does not record what the developer typed or which files were in the context window. Microsoft documents this directly: "the audit log does not include client session data, such as the prompts a user sends to Copilot locally." If a client comes back six months after the engagement and asks what their code touched, you can answer who had Copilot enabled. You cannot answer which files those developers had open at the time.

The other awkward thing is that "context" is broader than the cursor's line. A developer working on a client's authentication module with three other client files open in tabs is not exposing one function. They are exposing whichever files Copilot decided were relevant context for the next suggestion. If those tabs include the session-token logic, the JWT signing flow, and the rate-limiter, that is what hit the proxy.

Neither qualifies as a leak in the moral sense. Both are the architecture working as designed. The point is that the architecture has consequences your client may not have priced in.

The four contract clauses that make this a problem

Pull the active client agreement. Read these four passages with the assumption that "third-party processing" includes "round-trip through Microsoft's Copilot proxy."

Confidentiality. Most consulting and development contracts include an obligation not to disclose the client's confidential information to third parties without consent. The obligation typically covers disclosure, full stop, not "disclosure for misuse." When source code is sent to a Copilot proxy and a model inside Microsoft's infrastructure produces a completion from it, that round trip is a disclosure even if no human at Microsoft ever reads it and no model is trained on it.

Sub-processor authorisation. Article 28(2) GDPR makes this the regulator's question, not just the contract drafter's. Many DPAs operationalise it with a frozen sub-processor list and an objection window, typically 30 days, for additions. The chain GitHub-Microsoft-Anthropic did not exist on most annexes a year ago. If your DPA Annex was last revised in 2024, it is out of date.

IP assignment and handling. Most consulting work assigns the IP of the deliverable to the client. Many contracts go further and restrict where the client's IP can be processed, stored, or accessed. A round trip through Microsoft Azure's Copilot proxy is "access" in any reading that does not bend the word for convenience. Even if Copilot Business does not train on the code, Microsoft's pre-model safety system does inspect the prompt before routing it.

Regulated industry obligations. Clients in healthcare (HIPAA), payments (PCI-DSS), finance (SOX), defence, and government often carry obligations that flow down to you contractually. Some are explicit about AI tools: federal contracts in the US increasingly forbid AI-assisted code generation on classified or controlled-unclassified work. Some are implicit. A HIPAA business-associate agreement that defines a sub-contractor as anyone with "access to PHI" arguably covers a Copilot proxy if any test fixture in the codebase contains real PHI.

If the contract is silent on all four, that is not implicit permission. It means AI coding assistants were not contemplated when the lawyer wrote it. The safe move is to tell the client and make it the client's call.

Copilot Business in 2026: what it fixes and what it does not

Let me draw the line carefully because the marketing makes the line blurrier than it is.

Copilot Free, Pro, and Pro+ are the consumer tiers. Copilot Free launched in December 2024 and gives every GitHub user 2,000 completions and 50 chat messages a month. Copilot Pro is $10 per month, Pro+ is $20. None of them sit under a Data Protection Agreement that you control. Code/prompt collection for service improvement is on by default on the consumer tiers; the developer has to know to opt out in Settings → Copilot → Data Sharing.

Copilot Business ($19 per user per month) and Copilot Enterprise ($39 per user per month) are different products on the same brand. They sit under the GitHub DPA, they do not train on customer code or prompts, prompts and suggestions from the IDE are not retained, IP indemnification is included, and an organisation administrator controls who can use them and which content rules apply to which repositories.

The IP indemnity covers unmodified suggestions only. Microsoft will defend you in court if you receive an unmodified Copilot suggestion and that exact suggestion turns out to infringe a third party's copyright. The minute your developer changes the variable names, refactors the loop, or fixes a bug in the suggestion, the indemnity arguably no longer applies. In normal coding work, almost every Copilot suggestion gets modified. The indemnity is real but its real-world coverage is narrower than the marketing suggests.

Two other gaps deserve names.

The first is the content-exclusion gap in CLI and Coding Agent. Content exclusion went generally available in IDE surfaces in November 2024 and works as advertised inside VS Code, Visual Studio, JetBrains, Eclipse, and Xcode. As of November 2025 Microsoft documents that content exclusion is not enforced in Copilot CLI or Copilot Coding Agent. If your developers are using the agent mode that walked into general availability through 2025 and 2026, the content rules you set up at the org level for the IDE may not apply when the same files are accessed through the agent. Read the GitHub changelog before assuming an exclusion list is enforced everywhere.

The second is the EU data residency gap. GitHub Enterprise Cloud has supported EU data residency for repository content and metadata since October 2024 and added Copilot metrics with data residency in public preview in January 2026. Copilot inference is a separate question. Completions are still routed through model endpoints that GitHub has not committed to keeping inside the EU. If you are selling to a client whose DPA names "EU only" data processing, GitHub Enterprise Cloud's EU residency does not by itself satisfy that clause for the Copilot completion path. Microsoft's parallel decision to disable Anthropic models by default for EU and UK Microsoft 365 Copilot customers from January 2026 is the corollary. Microsoft itself treats the EU as a separate compliance perimeter.

Configuration that actually moves the needle

Once the contract conversation has happened and the answer is "yes, with conditions", the configuration layer is where you operationalise the conditions.

At the organisation level on Business or Enterprise:

Turn on the public-code suggestion filter. This is the safety net for the indemnity. It blocks Copilot from suggesting strings that match known public repositories, which both reduces the IP-contamination risk and keeps you closer to the unmodified-suggestion box where the indemnity actually lives.
Set content exclusions for the directories that should never reach the proxy. Credentials directories, configuration files with embedded secrets, test fixtures that contain real client data, and anything under a regulated client's repository when the DPA requires it.
Disable Copilot organisation-wide for repositories under the most restrictive client contracts. Per-repository policy is finer-grained than per-developer.
Audit which accounts your developers are actually signed in with. The single most common mistake is a developer using a personal Copilot Pro subscription on a work machine. None of the Business protections apply. None of the audit logging applies. The DPA does not exist for that developer.

At the developer level:

Treat Copilot like a tab-aware tool, because it is one. Close client A's repository before starting work on client B. The cost is two seconds of annoyance. The benefit is that client A's authentication module is not in the context window when client B's completion request goes up the wire.
Never paste production data into Copilot Chat or the inline chat. Use synthetic fixtures for prompts that need data shape.
Keep secrets out of files Copilot can see. Environment variables, .env files in the exclusion list, secret managers for anything sensitive. This is good hygiene independently of Copilot. Copilot makes it more important.

# Example content exclusion entries (configured at org or repo level)
# Reference format: REPOSITORY-REFERENCE: list of paths

"*":
  - "/.env*"
  - "/config/secrets/**"
  - "/fixtures/production/**"
  - "/**/*.pem"
  - "/**/*.key"
  - "/**/credentials.*"

At the engagement level:

For a regulated client or one with a tight DPA, document which Copilot tier the team is on, which content exclusions apply to that client's repositories, and which developers have access. Put this in the engagement file alongside the signed DPA. When the client asks, and CEF reports have made clear regulators are starting to ask too, you have a one-page answer.

The cheapest fix is the proactive memo. You can spend a week tightening content-exclusion lists, writing IDE-by-IDE policies, and onboarding a secret scanner, and you still have not addressed the underlying Article 28(2) question. A two-paragraph email to the client, sent before they ask, costs ten minutes per engagement and resolves the question for the lifetime of the contract. Do it first. The technical configuration is the supporting work, not the primary fix.

A note on the productivity-vs-security trade-off the configuration cannot fix. Veracode's Spring 2026 GenAI Code Security update tracked over 100 large language models across Java, JavaScript, Python, and C#. Syntax pass rates have climbed from about 50% to 95% since 2023. Security pass rates have stayed flat between 45% and 55% across model generations. Java was the worst, with a 72% security failure rate. OpenAI's GPT-5 family was the leader at 70-72%. Every other major provider sat in the 50-59% band. The honest read is that AI assistants are getting much better at producing code that compiles and roughly the same at producing code that does not have an injection bug. This does not change the contract analysis. It does change how much of the productivity story you should believe before you decide it is worth the conversation with the client.

The conversation with the client

This is the part teams skip and it is the cheapest item on the list.

Frame it as proactive disclosure, not a confession. You are informing the client about your tooling and inviting them to either consent or set boundaries. Most clients appreciate the transparency. The few who do not are the ones you most want to find out about before an audit, not during one.

A short message works. Something like:

"Our development team uses GitHub Copilot Business for AI-assisted coding on your project. Code context (the active file, open tabs, related files) is sent to Microsoft's Copilot infrastructure for completion generation. Microsoft does not train on Business-tier code and prompts from the IDE are not retained. The model endpoint depends on which model the developer selects in the Copilot model picker, and Anthropic models have been part of that picker since October 2024. We wanted to flag this and confirm you are comfortable with it for [project], or discuss alternatives if you would prefer we disable it for your repositories."

The likely responses and how to handle them:

"That is fine." Get it in writing. An email reply is enough. Add it to the engagement file. Treat it as the sub-processor authorisation you needed under Article 28(2).

"We would rather you didn't." Disable Copilot at the repository policy level for that client's repositories. Per-org and per-repo policies have existed in the admin UI since early 2024. Document the decision in the engagement file and pin a note on the repository so a new joiner does not flip the toggle six months later.

"We need to check with our legal or compliance team." Normal for regulated clients. Send the GitHub Copilot Business privacy statement, the GitHub DPA, and a pointer to Microsoft's Service Trust Portal sub-processor list. Let the lawyers do their work. Build their answer back into your standard onboarding for that client family so you do not redo it next quarter.

"We did not know you were using this." That is exactly the reason for the message. Better now than during a customer audit, an internal review, or worse, an Article 33 incident notification.

If you find yourself wanting to skip this conversation because it feels awkward, that is the signal that it is the right one to have.

Before your next client engagement

Pull the active client list. Sort it by contract restrictiveness: regulated industries and government work first, generic SaaS development last. For the top quarter of the list, send the disclosure message this week. For the rest, fold it into the standard kick-off email for the next engagement.

Then check tiers. If anyone on the team is using Copilot Free, Pro, or Pro+ on a personal account for client work, that is the immediate fix. Migrate them to Business, or disable Copilot in those repositories until they are. Confirm the Anthropic relationship is named in your processor register with the October 2024 date for GitHub Copilot model selection (and 7 January 2026 for the parallel Microsoft 365 Copilot change if your team also uses it). Refresh the sub-processor annex on your own DPAs so the next client onboarding does not rediscover the same gap from the other side.

The engagement-level test. For each active client, you should be able to answer two questions in writing within five minutes. First: did the client authorise GitHub-Microsoft-Anthropic as sub-processors for code processing? Second: which of your developers has Copilot enabled for that client's repositories, and which content rules apply? If either answer requires more than five minutes of digging, the engagement file is not where it needs to be. That is the work for this week.

Sources

Continue reading

AI PrivacyApr 9, 2026

When you call OpenAI, who actually processes your data? The AI sub-processor cascade

A trace-walk of one OpenAI API call through every entity in the cascade, with the Article 28, CLOUD Act, Article 48, and DMA layers stacked on top.

13 min read

AI PrivacyMar 15, 2026

How to audit your codebase for AI data leakage

A practical, surface-by-surface audit recipe for finding personal data flowing to AI services. Covers prompt templates, observability defaults, embedding pipelines, and the limits of audit-by-grep in agent mode.