How to contract around AI model dependency

Most of your AI vendors are running on a model they do not own, and your contract with them almost certainly does not say what happens when that model changes. That gap is one of the most underpriced risks in software procurement right now, and it is one of the few you can close with the paper already in front of you. The instinct, once you see it, is to reach for control: name the model, lock it down, require sign-off before anything underneath can move. I sit on the vendor side of these negotiations, and those are the clauses I push back on hardest. It is worth understanding why before you draft them, because the protection you actually want sits somewhere else.

Only a small number of providers train their own models, since the cost rarely clears the benefit and for most vendors it never will. Everyone else builds on a third-party foundation model, so the variety at the surface mostly disappears one level down. Five shortlisted vendors can resolve to one or two models underneath, and in legal AI that short list is dominated by Anthropic, with Google and OpenAI the other main names, all of them US companies and so all subject to US export law. The risk is no longer theoretical. A model can be pulled or restricted by government action with little warning: there has already been an attempt to bar one provider's model across the federal government, vendors have been excluded from Pentagon procurement, and export controls gate which models can be used in which jurisdictions. At the same time the labs the market depends on are filing to go public, and the subsidised pricing that cushioned heavy usage is ending, both of which put pressure on how these models are priced and supplied. A capable, solvent vendor can lose access to the model its product is built on for reasons that have nothing to do with you, and nothing it can control.

Why locking the model down is the wrong lever

A clause that names a specific model and forbids change without your approval looks protective. In practice it does two things, both of which work against you. It freezes the vendor onto a model that will be superseded, often within months, while better and cheaper options appear that your own contract has prevented them from adopting. And it moves the operational judgement about orchestration, which model to run for which step, how to route, when to upgrade, away from the party that understands it and onto the party that does not.

Vendors change their orchestration constantly, and most of those changes are what keep the service good. A contract that requires sign-off on each one either slows the product down or gets quietly ignored, and neither outcome helps the customer who insisted on it. The thing worth protecting is the standard of the work the vendor delivers, set high enough that any change underneath has to clear it. Get that right and the model question resolves itself, because the vendor absorbs the risk of every substitution. Get it wrong and no amount of model-naming will save you.

Set a quality threshold, and make it measurable

Define the standard the service has to meet in terms you can actually measure: accuracy against an agreed benchmark, first-pass acceptance rate, correction and escalation rates, turnaround time. Bind the vendor to maintain that standard for the life of the agreement. Most contracts never do, and it is the clause that carries the most weight.

A standard set this way is indifferent to what the vendor runs. If a model change, a re-routing, an upgrade, or a forced substitution degrades the output, the vendor has missed the standard, whatever happened under the hood. You never have to prove the cause. You measure the result against the number you agreed, and the obligation to put it right sits with the vendor. A contract-review agent might be held to a maximum correction rate across a defined document set. If a model swap pushes corrections above that line, the vendor is in breach, regardless of which model is now running and regardless of why they switched.

The practical way to set the number is to measure it during the pilot. Most enterprise deployments open with a paid proof of value on real documents, and that exercise produces exactly the figures you need: the acceptance rate the vendor actually achieved on your work, the escalation rate, the turnaround. Carry those into the contract as the floor, rather than signing up to a generic SLA the vendor drafted in the abstract. Then make the clause operable by specifying who measures performance, against which sample of work, how often it is reported to you, and the cure period the vendor has to bring a slipping metric back above the line before credits or termination rights come into play.

A threshold like this gives the vendor exactly the flexibility they should have. They can change orchestration freely, provided the work keeps clearing the bar. The accountability for whether it does stays with them throughout.

Make continuity of the service the obligation

The model-suspension scenario is real, and the cleanest way to handle it is to make the vendor responsible for continuity of the service rather than continuity of any particular model. The obligation reads roughly: the service remains available and to standard regardless of changes in the underlying model or its availability. That single sentence puts model risk where it belongs. If a provider is suspended by export order, building a fallback becomes the vendor's problem, because they are still on the hook for delivering the work to the agreed bar.

Asking about a vendor's architecture and failover plan in diligence is sensible, and you should. Dictating either in the contract is not necessary once continuity and quality are bound. With the vendor on the hook for both, building the multi-provider routing and the regression testing behind a switch is in its own interest.

Ask for notice of material change, not a veto over it

Where a change to the model or orchestration materially affects performance or the risk profile of the work, you want to be told, and you want the right to re-test against your acceptance criteria. Define what counts as material so the clause has teeth: a change of model provider, a change in the jurisdiction where the model runs, or any change that moves one of your tracked metrics beyond an agreed tolerance. Those are the changes worth a notification and a re-test. That is reasonable, and a serious vendor will agree to it. A right to approve every change before it ships is a different thing, and I would resist it from the vendor side because it penalises the routine improvements that make the service better. Notice plus a re-test right keeps you informed and protected by your standard, without turning your contract into a brake on the product.

Tie remedies and exit to the standard

Make the remedies follow the standard you have set. Service credits when the agreed quality or availability bar is missed, and a right to terminate without penalty, with fees abating, if material disruption to the service persists beyond a defined period. Tie the trigger to disruption of the service rather than the loss of a named model, which keeps the focus on the thing you actually care about.

Termination only helps if you can actually leave. Make sure the contract gives you your data back in a usable, documented format, and a defined window of transition assistance so the work keeps running while you stand up an alternative. The playbooks, templates, and configuration the vendor built around your work are the things you will most need on the way out, so specify that they come with you and are not treated as the vendor's property. An exit right with no transition mechanics behind it is one you cannot afford to use, which means in practice it does not protect you at all.

Then read force majeure against the same scenario. Standard language will often excuse the vendor for a government order that suspends its model, which can leave the event that stops your service also releasing the vendor from delivering it. The fix is to ensure that prolonged unavailability, however it is caused, still gives you abatement and an exit, while leaving the rest of the force majeure protection intact. The vendor keeps its shield against liability for genuinely unforeseeable events, and you keep a way out of a service that has gone quiet.

Follow the data to the model provider

For most European buyers this is a secondary concern, but it is worth understanding rather than skipping. A European vendor can be fully GDPR-compliant and EU-hosted and still run on a model operated by a US provider, and US instruments like the CLOUD Act and FISA reach data held by US providers wherever it physically sits. The relevant exposure follows the data rather than the vendor's incorporation, which means what counts is whether the documents reach a US model provider at all, in what form, and what could be compelled from that provider once they arrive.

So trace the path. Ask whether the documents leave the vendor's own environment to reach the model, or whether the model is hosted inside a boundary that keeps the data in the EU. If data does go to the provider, ask whether it goes as the raw document or a redacted version, whether there is a zero-retention and no-training arrangement with the provider in writing, and which region processes it. Ask the vendor to walk you through the whole data flow, sub-processors and processing region included, all the way down to the model provider itself. That is the level the answer has to reach to tell you anything useful.

For global teams, test export controls against staff access

An export decision can determine which of your people are permitted to use the tool, not only whether the model runs at all. A model restricted from a jurisdiction where you have a legal team can cut that team off while your other offices carry on. Before deploying to a team in a jurisdiction that could end up on the wrong side of a control, confirm whether export status affects access by location, and what the vendor would do if it changed. For multi-country functions this is the exposure most likely to land as an operational problem rather than a contractual one, which is the reason to raise it before signature.

A note on cost

While you are in the commercial terms, cap in-term price increases and require enough transparency in the pricing to see what you are paying for. The cost of running these models is rising as agentic work multiplies the calls per task and subsidised pricing ends, which gives vendors a reason to pass cost through, or to economise on the model. Your quality threshold already guards against the second, since a cheaper model that drops the output below the bar is a breach. The price cap guards against the first.

The frame underneath all of it

None of this is an argument against building on third-party models. The economics of training your own are punishing, and for almost every vendor, and therefore almost every customer, running on the frontier labs is the right decision. The model is a commodity input, and a volatile one.

What the contract is really deciding is who carries the risk when that input moves. Licence a tool and the dependency passes through to you, because you have bought software wrapped around a model you do not control. Hold a vendor to a standard for the work and the dependency becomes theirs, because they have to clear that bar whatever they run underneath. A binding quality threshold and a continuity obligation are how a customer makes the second one true.

I would take this position from either side of the table. As Flank's counsel I would far rather be held to the outcome than to a static model, because that is what lets us keep improving the orchestration while staying accountable for the work the customer receives. As a customer I would ask for precisely that, because it puts the burden of every model decision on the party making it. The model underneath can be substituted. The standard of the work should hold steady when it is.

No one can tell you which model gets suspended next, or when, or by which government. That uncertainty is not going anywhere, and a clause naming today's model does little about it. A standard the vendor has to meet, whatever they run, is what carries you through the next change you do not see coming.

Why locking the model down is the wrong lever

Set a quality threshold, and make it measurable

Make continuity of the service the obligation

Ask for notice of material change, not a veto over it

Tie remedies and exit to the standard

Follow the data to the model provider

For global teams, test export controls against staff access

A note on cost

The frame underneath all of it

Keep reading.

The Front Door Your Legal Team Already Has

Contract drafting and review were never separate jobs

The headless worker

Stay in the loop.