Why LLMs need a legal layer: the terms problem

Large language models are increasingly asked to act on behalf of users: pulling data, posting content, making purchases, summarizing documents, calling APIs. But the web they operate in was designed for humans, who can read terms of service and make judgment calls. An LLM working at machine speed can't do that reliably, and the consequences of getting it wrong are real.

This is the terms problem.

Terms drift, and nobody tells you

Terms of service are living documents. A service that permits automated access today may restrict it next quarter. A developer API agreement that allowed certain training use may be revised after a competitor incident. Most services don't notify downstream users when their terms change; they have no mechanism to do so.

For a human this is manageable. You re-read the terms when you're starting a new project. For an LLM-driven agent, there is no equivalent. It operates on whatever terms were in place when it was configured, or in the worst case, whatever terms it infers from context.

The result is agents that act on stale legal information and operators who don't know it's happening until something goes wrong.
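
One way to make drift visible is to fingerprint the terms an agent was configured against and compare that fingerprint with the live text before acting. The following is a minimal sketch under stated assumptions: the names TermsSnapshot, fingerprint, and has_drifted are hypothetical, and fetching the live terms is left out. A hash only says that something changed, not whether the change matters for a given action.

```python
import hashlib
from dataclasses import dataclass


def fingerprint(terms_text: str) -> str:
    """SHA-256 digest of the terms text, ignoring whitespace-only differences."""
    normalized = " ".join(terms_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class TermsSnapshot:
    """Hypothetical record of the terms an agent was configured against."""
    service: str
    fingerprint: str


def has_drifted(snapshot: TermsSnapshot, current_terms_text: str) -> bool:
    """True when the live terms no longer match the configured snapshot."""
    return fingerprint(current_terms_text) != snapshot.fingerprint


# Illustrative terms text, not taken from any real service.
configured = TermsSnapshot("example.com", fingerprint("Automated access is permitted."))

same_terms = has_drifted(configured, "Automated  access is permitted.")
changed_terms = has_drifted(configured, "Automated access is prohibited.")
print(same_terms)     # False: only whitespace differs
print(changed_terms)  # True: the wording changed
```

A check like this would run before the agent acts, so that a mismatch prompts a fresh review of the terms rather than a silent reliance on the old ones.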

No audit trail

When an agent reads a webpage, scrapes a dataset, or posts to a forum, it is making a legal decision — whether the source's terms permit that action. Most agents make this call without logging it. There is no record of which terms were consulted, what the agent concluded, or what action it took as a result.

This creates a gap for any organization operating agent systems at scale. When the question is "did this agent's action comply with the relevant terms?", the answer is usually "we don't know" — not because the team was careless, but because the infrastructure for recording it didn't exist.
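
The missing record can be quite small. As a rough sketch, assuming an in-memory list stands in for an append-only log, each decision could capture which terms were consulted, what the agent concluded, and what it did. The field names and the record_decision helper are illustrative assumptions, not an existing format.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class DecisionRecord:
    """Hypothetical audit entry for one terms-related decision."""
    agent_id: str
    service: str
    action: str
    terms_version: str          # which version of the terms was consulted
    conclusion: Optional[bool]  # True = permitted, False = not permitted, None = unclear
    action_taken: str           # what the agent did as a result
    decided_at: str             # UTC timestamp, ISO 8601


def record_decision(log_lines: List[str], **fields) -> DecisionRecord:
    """Build a record, stamp it with the current UTC time, and append it as one JSON line."""
    record = DecisionRecord(decided_at=datetime.now(timezone.utc).isoformat(), **fields)
    log_lines.append(json.dumps(asdict(record), sort_keys=True))
    return record


# Illustrative values only.
audit_log: List[str] = []
entry = record_decision(
    audit_log,
    agent_id="agent-1",
    service="example.com",
    action="scrape",
    terms_version="2024-01-01",
    conclusion=False,
    action_taken="skipped",
)
print(audit_log[0])
```

With entries like this, the question "did this agent's action comply with the relevant terms?" can at least be answered with what the agent believed at the time and which version it was looking at.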

Hallucinated policies

Agents sometimes generate plausible-sounding justifications for actions they want to take. When asked "can I scrape this site?", a model may respond with confident legal conclusions that have no basis in the site's actual terms. The model is predicting what a reasonable person would say, not reporting what the terms actually state.

This failure mode is distinct from hallucination in factual tasks. Legal hallucination is harder to catch because the output sounds plausible, and the errors are asymmetric: a false positive (concluding something is allowed when it isn't) is harder to detect than a factual error, because nothing visibly fails when the agent goes ahead.

What a legal layer does

A legal layer for LLMs is not a policy engine. It's a structured signal that answers one question: what do this service's terms actually say about this action?

The signal should be:

Current — tied to the service's active terms, not a cached version
Action-specific — not a document to read, but a permission value for the specific action the agent wants to take
Attributable — tied to a versioned source so the agent's reasoning can be reviewed later
Unambiguous — true, false, null, or a structured condition object, not prose

Without this, agent operators are making the same bet that web scrapers made a decade ago: that whatever isn't explicitly forbidden is implicitly allowed. That bet has always been risky. It's more costly now, because agents operate at scale and leave a more visible footprint.
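
The four properties listed above can be sketched as a small data type. This is an illustration under stated assumptions, not the OpenTerms schema: PermissionSignal, Condition, and the decide function are hypothetical names, and decide shows one possible operator policy in which a null value leads to escalation rather than to the "not forbidden, so allowed" bet.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Condition:
    """Hypothetical structured condition, e.g. permitted only with attribution."""
    requires: Tuple[str, ...]


# True = permitted, False = prohibited, None = the terms do not address the action.
PermissionValue = Union[bool, None, Condition]


@dataclass(frozen=True)
class PermissionSignal:
    service: str
    action: str             # action-specific: one value per action
    value: PermissionValue  # unambiguous: never prose
    terms_version: str      # attributable: ties the value to a versioned source
    retrieved_at: str       # current: when the active terms were last checked


def decide(signal: PermissionSignal, satisfied: FrozenSet[str] = frozenset()) -> str:
    """One possible operator policy (an assumption): silence is not treated as permission."""
    if signal.value is True:
        return "proceed"
    if signal.value is False:
        return "refuse"
    if signal.value is None:
        return "escalate"
    # Remaining case: a Condition. Proceed only if every requirement is met.
    missing = [r for r in signal.value.requires if r not in satisfied]
    return "proceed" if not missing else "escalate"


# Illustrative signals, not real registry data.
silent = PermissionSignal("example.com", "scrape", None, "v1", "2024-01-01T00:00:00Z")
conditional = PermissionSignal(
    "example.com", "republish", Condition(("attribution",)), "v1", "2024-01-01T00:00:00Z"
)
print(decide(silent))                                   # escalate
print(decide(conditional))                              # escalate
print(decide(conditional, frozenset({"attribution"})))  # proceed
```

Keeping the signal separate from the policy reflects the split described here: the signal reports what the terms appear to say, and the operator decides what to do with it.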

What OpenTerms is building

OpenTerms is a public registry of machine-readable permission records. Each entry reports what a service's terms appear to say about a defined set of actions. Entries are generated from publicly available terms and reviewed before publication.

The registry does not make legal conclusions. It reports what the terms state. Whether to act on that information is a decision the agent operator makes.

Whether this approach scales to a web with millions of services is an open question. The current alpha has 511 entries. The questions being tested are whether the schema captures the right signals, whether entries can be kept current at meaningful scale, and whether operators find the signal useful enough to build around.

OpenTerms is in public alpha. Registry entries are interpretations of publicly available terms, not verified legal facts. Do not rely on the registry for consequential decisions without independent review.