The Most Expensive Promise in Legal AI (And the Breach Nobody Sees Coming)
- Apr 11

I have been thinking about Zero Data Retention, and I think I've found the bit nobody wants to talk about. Two bits, actually. The first is that it's about to become ruinously expensive. The second is that even if you could afford it, it probably doesn't protect you from the thing you should actually be worried about.
Let me take those in order, because the second one is, in my opinion, considerably more important than the first.
I. The economics
The barrister who can't remember what he just read
The era of "ask a question, get an answer" is ending. Agentic AI, where the model doesn't just respond but actually does things on your behalf, works in a fundamentally different way. An AI agent reviewing a contract doesn't make one call to the model. It makes dozens. Sometimes hundreds. Each tool call sends the entire conversation history back to the model, plus the tool's response, plus the next instruction.
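To put rough numbers on it, here is a back-of-envelope sketch in Python. The figures are entirely illustrative, a 150,000-token bundle in context and fifty tool calls, each appending a 2,000-token result to the history:

```python
# Illustrative numbers only: a large document bundle held in context,
# fifty tool calls, each adding ~2,000 tokens to the running history.
context_tokens = 150_000
tool_calls = 50
tokens_per_result = 2_000

total_input_tokens = 0
history_tokens = context_tokens
for _ in range(tool_calls):
    total_input_tokens += history_tokens   # the entire history is resent each call
    history_tokens += tokens_per_result    # and it grows before the next one

print(f"{total_input_tokens:,} input tokens billed for one session")
# -> 9,950,000 tokens: nearly ten million, to review one contract once.
```

Without caching, every one of those ten million tokens is processed, and billed, at the full input rate.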
The best analogy I can find is a barrister who, every time he picks up a new exhibit, has to re-read the entire trial bundle from page one. Every page. Every time. Including the ones he read four seconds ago. You'd fire that barrister.
Prompt caching solves this. The model provider briefly holds your prompt in memory, typically for about five minutes, so that when the agent makes its next call 400 milliseconds later, it doesn't have to reprocess everything from scratch. Anthropic's own documentation puts cached token reads at roughly 10% of the standard input price. That's not a modest saving. That's the difference between a viable product and a very expensive hobby.
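Mechanically, opting in is a small change to the request. A minimal sketch against Anthropic's documented prompt-caching API; the model name, file handling, and prompt here are illustrative, not a recommendation:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

bundle_text = open("trial_bundle.txt").read()  # the large, stable prefix

response = client.messages.create(
    model="claude-sonnet-4-5",  # illustrative; any caching-capable model
    max_tokens=1024,
    system=[{
        "type": "text",
        "text": bundle_text,
        # Marks the prefix for ephemeral caching (roughly a five-minute window).
        "cache_control": {"type": "ephemeral"},
    }],
    messages=[{"role": "user", "content": "Summarise clause 14.3."}],
)

# Subsequent calls within the window report the hit in usage metadata:
# response.usage.cache_read_input_tokens, billed at ~10% of base input price.
```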
But caching is retention. Even five minutes of it. Even if it's ephemeral. And if you've committed to true ZDR, as many of the most prominent law firms have, you've switched it off. Every tool call, every agent loop, every sub-task pays full price. Your helpful contract review agent just got up to ten times more expensive on the input side. Which is, I think, the sort of thing a managing partner might want to know about before the invoice arrives.
The maths that should probably concern you
If your firm spends £50,000 a month on AI, which is arguably conservative for a firm doing meaningful agentic work, true ZDR could push that towards £500,000.¹ It's the sort of figure that turns a routine technology committee meeting into a rather tense conversation with the senior partner who just approved the new carpets.
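You can sanity-check that multiplier yourself. A rough sketch, assuming the published 10% read pricing and ignoring output tokens, cache-write premiums, and retries, so treat the output as directional rather than a quote:

```python
def zdr_multiplier(cached_share: float, read_price_ratio: float = 0.1) -> float:
    """How much the input-token bill grows when caching is switched off,
    given the share of input tokens previously served as cache reads."""
    with_cache = cached_share * read_price_ratio + (1 - cached_share)
    return 1.0 / with_cache

for share in (0.80, 0.95, 0.99):
    print(f"{share:.0%} cached -> {zdr_multiplier(share):.1f}x the input bill")
# 80% cached -> 3.6x, 95% cached -> 6.9x, 99% cached -> 9.2x
```

On pricing alone the multiplier tops out at 10x, and deep agentic workloads, where nearly every input token is a cache read, get uncomfortably close to that ceiling.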
So, whilst the intention was sound, the industry has, quite rationally, settled on a compromise. Near-ZDR. The provider promises not to train on your data. The provider promises short retention windows. Someone waves a SOC 2 badge. Everyone nods solemnly, signs the DPA, and moves on. It's "good enough."
And for the past two years, in a world of simple, stateless prompts, it probably has been.
I think that's about to change.
II. The exposure nobody's modelling
The Swiss cheese problem
In a single-prompt world, a client name passes through third-party infrastructure once. One transit, one disclosure event, vanishingly small probability of interception. The risk-reward calculation is straightforward and the GC sleeps fine.
In an agentic workflow, that same client name might transit 30, 40, 50 times in a single session. Each transit is, in the strict regulatory sense, a separate disclosure event. The probability of any single event causing harm remains tiny. But probability compounds in ways that human intuition is notoriously poor at grasping. Run thousands of agent loops, across hundreds of matters, across months of continuous usage, and the cumulative exposure starts to look like something a risk committee should probably have a view on.
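The compounding is easy to demonstrate. A toy calculation with an invented per-transit incident probability, chosen purely to show the shape of the curve, not to estimate the real risk:

```python
# p is an invented per-transit incident probability; the point is the
# shape of the curve, not the specific figure.
p = 1e-6

for n, label in [(1, "one prompt"), (50, "one agent session"),
                 (50_000, "a busy month"), (500_000, "a year of matters")]:
    at_least_one = 1 - (1 - p) ** n
    print(f"{label:>18} (n={n:>7,}): {at_least_one:.4%}")
# one prompt: 0.0001%    one agent session: 0.0050%
# a busy month: 4.8771%  a year of matters: 39.3469%
```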
This is the Swiss cheese model of data protection, and agentic AI is adding a lot more slices. Each slice has holes. Eventually, the holes line up.
The safety scan paradox
Here's something I find genuinely interesting, and which I think deserves more scrutiny than it currently receives.
Even the most rigorous ZDR packages in the market, including those offered by the leading legal AI platforms, still include abuse prevention scanning. That means prompts are being inspected, and in some cases flagged or logged, by systems the firm doesn't control and probably can't audit. Most of the time this is entirely harmless. Nobody is reading your contract review prompts for entertainment.
But consider what happens when a content filter flags something. Perhaps a prompt about a sanctions matter triggers a policy classifier. Perhaps a hostile takeover scenario matches a pattern for market manipulation. The prompt gets shunted into an exception queue. That queue sits on infrastructure the firm hasn't approved, in a jurisdiction the firm hasn't assessed, accessible to personnel the firm hasn't vetted. And the prompt contains real client names, real matter details, real privileged strategy.
Who has access to that exception queue? Under what legal framework? For how long? Under what circumstances might it be disclosed?
I don't think most firms have asked these questions. I think they probably should.
The subpoena in the cache
This is, I suspect, the scenario that should genuinely keep the relevant people up at night.
If a model provider holds cached data, even ephemerally, and receives a valid legal process request during that window, they are in a difficult position. They may be obliged to produce what they hold. The probability is low. But the consequence is catastrophic. You have just had privileged client strategy disclosed via a third party's infrastructure, through no fault of your own, in circumstances you could not have predicted and probably cannot remediate.
Try explaining that to the client. Try explaining it to the SRA. Try explaining it to the insurer.
The Heppner case demonstrated how easily privilege can be waived through carelessness rather than intent. The courts were unsentimental about it. The same courts are unlikely to be more forgiving when the carelessness involves routing privileged material through third-party AI infrastructure with an ambiguous retention posture.
The pattern nobody's watching
And then there's the risk that I think is most underappreciated, perhaps because it's the most abstract.
Individual prompts in isolation are probably low risk. A single query about a contract clause, stripped of context, tells you very little. But an adversary, a regulator, or an ambitious litigant armed with an e-discovery request, anyone who can observe the pattern of prompts across matters, can build a picture that is considerably more revealing than any individual prompt.
Which clients does the firm represent? On what issues? Against whom? What's the strategic posture? Where are the pressure points? If you can see that Firm X has been running 200 agent loops on pharmaceutical patent litigation involving Company Y over the past six weeks, you don't need to read the privileged content to draw some rather valuable inferences.
This is cross-matter aggregation risk, and near-ZDR does precisely nothing to address it. The metadata is the vulnerability. The content is almost beside the point.
The hacker's lens
I want to spend a moment thinking about this the way an attacker would, because I think it's instructive.
If you were trying to breach a law firm's AI workflow, you would not bother attacking the model itself. That's hard, well-defended, and the payload is mostly noise. What you'd target is the transit layer. The prompts in flight. The cached conversation histories. The safety scan exception queues. The logging infrastructure that sits between the firm and the model provider.
In a near-ZDR architecture, every one of those touchpoints contains real data. Real names, real matters, real privilege. You don't need to compromise the firm's internal network. You don't need to breach the model provider's core infrastructure. You just need to find a vulnerability somewhere in the middleware, and you have access to a real-time stream of the firm's most sensitive work product.
Now imagine the same architecture, but every prompt has been pseudonymised before it leaves the firm's network. Every client name replaced with a contextually plausible alias. Every matter reference substituted. Every identifying detail transformed. You breach the middleware and you find... fiction. Coherent, well-structured, beautifully reasoned fiction. About people who don't exist, working on matters that never happened, for clients you can't identify.
You have achieved the hacker's equivalent of breaking into a vault and finding it full of very convincing counterfeit notes. Technically impressive. Practically worthless.²
III. The architecture
The theology of zero, revisited
ZDR was the right instinct, wrong mechanism. The instinct was entirely sound: client data shouldn't be exposed to third-party infrastructure. The mechanism was to make the plumbing promise to forget.
The problem with promises is that they have exceptions. Abuse prevention scanning is an exception. Cache windows are an exception. Legal process obligations are an exception. Exception queues are, by definition, an exception. Each one is individually defensible. Collectively, they amount to a retention posture that is considerably less "zero" than the branding suggests.
The better mechanism, I'd argue, is to make the data itself worthless to anyone who intercepts it.
If your prompt contains "Emilia Chang" instead of your actual client's name, and every identifier, matter reference, and privileged detail has been replaced with a contextually plausible surrogate before it ever leaves your network, then the cache is just fiction. The safety scan inspects fiction. The exception queue holds fiction. The subpoena produces fiction. The hacker exfiltrates fiction.
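What that looks like in practice, reduced to its bones. A minimal sketch of deterministic pseudonymisation at the network edge; a production system needs entity recognition to find the identifiers, type-aware alias generation, and a properly secured mapping vault, all of which are elided here:

```python
class Pseudonymiser:
    """Replaces real identifiers with plausible surrogates before any
    prompt leaves the firm's network; the mapping never leaves with it."""

    # Invented alias pool for illustration; real systems generate
    # type-appropriate surrogates (person for person, company for company).
    ALIASES = ["Emilia Chang", "Rhys Okafor", "Mara Lindqvist"]

    def __init__(self) -> None:
        self._forward: dict[str, str] = {}  # real -> alias
        self._reverse: dict[str, str] = {}  # alias -> real, stays on-premises

    def mask(self, text: str, identifiers: list[str]) -> str:
        for real in identifiers:
            if real not in self._forward:
                alias = self.ALIASES[len(self._forward) % len(self.ALIASES)]
                self._forward[real], self._reverse[alias] = alias, real
            text = text.replace(real, self._forward[real])
        return text

    def unmask(self, text: str) -> str:
        for alias, real in self._reverse.items():
            text = text.replace(alias, real)
        return text

p = Pseudonymiser()
outbound = p.mask("Advise Margaret Holt on the contemplated disposal.",
                  ["Margaret Holt"])
print(outbound)            # "Advise Emilia Chang on the contemplated disposal."
print(p.unmask(outbound))  # original restored, inside the firm's perimeter
```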
The model still works. The agent still loops. The cache still saves you roughly 90% on cached input tokens. But the data sitting in every one of those exposure points has no connection to any real person, matter, or privilege.
The question I think the industry needs to ask
I have spent a long time serving the legal market, and it strikes me that since the arrival of mainstream AI, the industry has spent two years arguing about where models run and how long data is stored. Those are fine questions. But they are, in my opinion, incomplete. And the answers are becoming less reassuring as the workloads become more complex.
The question that probably matters more is: what data are you sending in the first place?
If the answer is "real client data, but we've made sure it's deleted quickly," you're in a foot race with your own threat model. And I suspect the threat model has been doing interval training.
If the answer is "nothing real ever left our building," you've already won. You can keep your caching. You can keep your cost efficiency. You can even keep your carpets.
The regulatory ratchet only turns one way. The DSG v ICO Court of Appeal ruling, the EU AI Act's enforcement posture, the GCC's data localisation requirements; none of these are getting more relaxed. What satisfies your regulator today may not satisfy the regulator who reviews your AI practices in eighteen months. And regulators tend not to be sympathetic to "well, everyone else was doing it too."
I'd rather be writing this piece a year too early than publishing it a week after the first incident. Though I suspect the timing might be closer than most people think.
¹ Your mileage will vary. But probably not by as much as you'd like. Anthropic's prompt caching documentation is worth reading here; cached reads cost 10% of standard input pricing, meaning the savings you forfeit by insisting on true ZDR are very real and very large.
² One imagines the attacker, sitting in a dimly lit room, surrounded by energy drinks and self-regard, slowly realising that the 400,000 privileged documents he's just exfiltrated are an elaborate work of fiction. Like breaking into the British Library and discovering it's been quietly replaced with the contents of a mid-tier creative writing MFA programme.
Source link: Anthropic Prompt Caching Documentation - https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching