In software development, certain rules and principles are viewed as unbreakable. Take Hofstadter’s Law, for example, which dictates that things will always take longer than you expect…even when you account for Hofstadter’s Law. Or consider the Boy Scout Rule, that you should always leave code cleaner than you found it. API contracts fit into this category, too.
The best APIs are completely predictable, to the point that people will sometimes call them boring. In this context, boring should be considered a compliment. AI tools, on the other hand, are anything but boring — LLMs are powerful, but it’s difficult to predict their output. For API developers in the age of AI, this is a big problem.
Below, we’ll explore some ways LLMs and AI tools are disregarding API contracts, some of the risks associated with practices like these, and some techniques you can deploy to limit the risk of security issues or API misuse that these tools might cause.
What Is the API Contract Anyway?
We’ve written about API contracts and the best tools for contract testing before, so we won’t dwell on definitions for too long here. But it’s worth recapping their key tenets:
- Predictable inputs: APIs expect specific parameters and formats
- Predictable outputs: APIs generate the same fields and structure every time they're called
- Stability: Behaviour doesn’t change unexpectedly between releases
- Clear documentation: Functions and behaviours are described in specifications
- Compatibility: Existing integrations aren’t broken or disrupted by new features
- Error handling: Errors are reported clearly and explicitly using known formats
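To make those tenets a little more concrete, here's a minimal contract-checking sketch in Python. It uses the jsonschema library, and the order endpoint and its fields are purely hypothetical; the point is that successful responses and errors alike should match a documented shape:

```python
# A minimal contract-testing sketch: the schemas below are hypothetical
# examples, not a real specification, and jsonschema is assumed to be installed.
from jsonschema import validate, ValidationError

# Predictable outputs: the fields and types a successful GET /orders/{id}
# response is expected to contain, every time it is called.
ORDER_SCHEMA = {
    "type": "object",
    "required": ["id", "status", "total"],
    "properties": {
        "id": {"type": "string"},
        "status": {"type": "string", "enum": ["pending", "shipped", "delivered"]},
        "total": {"type": "number"},
    },
    "additionalProperties": False,
}

# Error handling: errors are reported in a known, documented format.
ERROR_SCHEMA = {
    "type": "object",
    "required": ["code", "message"],
    "properties": {
        "code": {"type": "integer"},
        "message": {"type": "string"},
    },
}


def check_contract(status_code: int, body: dict) -> bool:
    """Return True if the response body matches the documented contract."""
    schema = ORDER_SCHEMA if status_code == 200 else ERROR_SCHEMA
    try:
        validate(instance=body, schema=schema)
        return True
    except ValidationError:
        return False


if __name__ == "__main__":
    print(check_contract(200, {"id": "ord_1", "status": "shipped", "total": 9.99}))  # True
    print(check_contract(404, {"detail": "not found"}))  # False: undocumented error shape
```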
The API contract still exists even when it’s unspoken or unwritten, but AI tools lack the nuance to “understand” that. As a result, we’re seeing them regularly disregard specifications, ignore constraints around authentication and rate limits, generate insecure and invalid API calls, and more.
Gary Marshall, a leading voice in the AI space, calls LLMs "dishonest, unpredictable, and potentially dangerous." Whether or not you agree with Marshall, the way LLMs and gen AI interact with APIs has set off alarm bells for some members of the API community.
Because LLMs have proven, time and time again, to be API contract breakers…
The Dangers of API Consumption by LLMs
You don’t need to look very far to find examples of artificial intelligence posing security risks. Earlier this year, for instance, Invariant Labs shared a vulnerability in the GitHub MCP server that could be exploited to leak private repositories, with no questions asked.
Elsewhere, an AI bot that handles tech support for Cursor began telling users that they could no longer use the product on multiple machines, as reported by The New York Times. The policy change, it turned out, had been entirely fabricated by the AI bot. And, as per that article, hallucinations are on the rise.
If we can't trust the output of AI tools, do we really want them calling, documenting, or even creating APIs? In the study Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks, researchers demonstrate exactly what the title describes, and they specifically cite APIs as a factor:
“In real-world deployments, LLMs are often part of a larger agentic pipeline including memory systems, retrieval, web access, and API calling. Such additional components introduce vulnerabilities that make these LLM-powered agents much easier to attack than isolated LLMs.”
We’ve written previously about how AI-generated code could kill your API, and it’s worth considering a related question: could AI tools consuming your APIs be damaging too?
Vibe-Coding, Prompt Injection, and Slopsquatting
A recent TechRadar article talked at length about the risks of package hallucinations and slopsquatting. In a study referenced by that piece, almost 20% of the package references across 576,000 code samples generated by 16 LLMs were hallucinated, and 43% of those hallucinations were repeated across generations, making them exploitable.
Prompt injections are a concern here, too — OWASP recently ranked prompt injection vulnerabilities as the top risk in their top 10 list for AI and LLM security concerns — with unprotected APIs and endpoints representing possible attack vectors. LLMs operating “in the middle” could be manipulated into passing along dangerous requests or otherwise breaking API contracts.
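One defensive posture is to treat every request that arrives via an LLM "in the middle" as untrusted, and to reject anything that strays outside the documented contract. Below is a minimal sketch of that idea, assuming Pydantic v2; the refund endpoint, field names, and limits are all hypothetical:

```python
# A sketch of strict, provider-side input validation, assuming Pydantic v2.
# The endpoint and fields are hypothetical; the point is that a prompt-injected
# or hallucinated request that deviates from the contract is rejected outright.
from pydantic import BaseModel, ConfigDict, Field, ValidationError


class RefundRequest(BaseModel):
    # Reject any field the contract doesn't document, instead of ignoring it.
    model_config = ConfigDict(extra="forbid")

    order_id: str = Field(pattern=r"^ord_[a-z0-9]+$")
    amount_cents: int = Field(gt=0, le=50_000)  # hard ceiling, whatever the prompt asked for
    reason: str = Field(max_length=200)


def handle_refund(payload: dict) -> str:
    try:
        req = RefundRequest(**payload)
    except ValidationError as exc:
        # Errors use a known format, and nothing is partially processed.
        return f"400 rejected: {exc.error_count()} contract violation(s)"
    return f"202 accepted refund for {req.order_id}"


# An LLM manipulated by a prompt injection might tack on extra, dangerous fields:
print(handle_refund({"order_id": "ord_42", "amount_cents": 999,
                     "reason": "damaged", "escalate_to_admin": True}))
```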
Elsewhere, researchers in 2024 compiled “an extensive collection of 48 programming tasks for 5 widely used security APIs…[employing] both automated and manual approaches to effectively detect security API misuse in the code generated by ChatGPT for these tasks.” Their findings? 70% of code instances across 30 attempts per task contained security API misuse, with 20 distinct misuse types identified.
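The misuse types in that study relate to specific security APIs, but the broader pattern is easy to illustrate. The sketch below is not taken from the paper; it simply shows one classic misuse class that generated code is known to reproduce, reaching for a non-cryptographic random number generator where a cryptographically secure one is required:

```python
# An illustrative (not study-sourced) example of security API misuse:
# using a predictable RNG for a value that must be unguessable.
import random
import secrets


def insecure_reset_token() -> str:
    # Misuse: random is predictable and unsuitable for security tokens.
    return "".join(random.choice("abcdef0123456789") for _ in range(32))


def secure_reset_token() -> str:
    # Correct: secrets draws from the OS CSPRNG and is built for this purpose.
    return secrets.token_hex(16)


print(insecure_reset_token())
print(secure_reset_token())
```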
If these issues aren't concerning enough on their own, it's worth noting that their impact may be compounded by the rise of "vibe coding." While there's no doubt that tools like Lovable, Replit, Cursor, and Bolt can be extremely powerful, they open up the ability for non-technical people with no prior experience to write code and build apps. In that respect, they're great levelers.
However, they also put the ability to consume APIs into the hands of those who don’t know anything about rate limits, authentication methods, or other API security best practices. API providers should definitely be preparing for a wave of potentially reckless consumption.
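For a sense of what that discipline looks like on the consumer side, here's a small sketch of the kind of rate-limit handling an inexperienced or vibe-coded integration often omits. It assumes the requests library, and the endpoint and token are placeholders:

```python
# A minimal sketch of client-side discipline that naive integrations often skip:
# honouring 429 responses and the Retry-After header instead of hammering an API.
import time
import requests

API_URL = "https://api.example.com/v1/items"  # hypothetical endpoint
TOKEN = "replace-me"                          # never hard-code real credentials


def get_with_backoff(url: str, max_retries: int = 5) -> requests.Response:
    headers = {"Authorization": f"Bearer {TOKEN}"}
    for attempt in range(max_retries):
        resp = requests.get(url, headers=headers, timeout=10)
        if resp.status_code != 429:
            return resp
        # Respect the provider's rate limit rather than retrying immediately.
        retry_after = resp.headers.get("Retry-After", "")
        delay = int(retry_after) if retry_after.isdigit() else 2 ** attempt
        time.sleep(delay)
    raise RuntimeError("rate limit still exceeded after retries")
```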
The Future of API Consumption Is Agentic (Maybe…)
The good news is that the issues above are avoidable, to some extent. A substantial study conducted by Amazon revealed some of the ways API documentation can be used to mitigate code LLM hallucinations. A few of the best practices highlighted include:
- Structured documentation with clear examples and references to make it easier for LLMs (and retrieval systems) to ground their generated code
- Consistent formatting, concise descriptions, and standardized naming so that retrieval-augmented generation (RAG) approaches can find and use the right snippets
- Maintaining machine-readable documentation formats (such as OpenAPI specs) to facilitate retrieval and programmatic validation during code generation
- Using canonical naming conventions, with consistent, descriptive names, to reduce the chance of LLMs hallucinating synonyms or near-variants (like using `fetch_item` instead of `get_item`)
- Marking deprecated APIs clearly and providing migration notes to limit outdated information
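As a small illustration of those last two points, a client library can keep one canonical function and mark the legacy name as deprecated, with a pointer to migration notes. The sketch below is hypothetical and simply borrows the `fetch_item`/`get_item` example from above:

```python
# A hypothetical client-library sketch: one canonical name, plus an explicitly
# deprecated alias that points consumers (and LLMs) to the migration notes.
import warnings


def fetch_item(item_id: str) -> dict:
    """Fetch a single item by ID. Canonical name used throughout the docs."""
    # Placeholder implementation for the sketch.
    return {"id": item_id}


def get_item(item_id: str) -> dict:
    """Deprecated: use fetch_item(). See the v2 migration notes."""
    warnings.warn(
        "get_item() is deprecated; use fetch_item() instead (see migration notes)",
        DeprecationWarning,
        stacklevel=2,
    )
    return fetch_item(item_id)
```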
At our 2024 Austin API Summit, Paul Dumas asserted that the consumption (and creation) of APIs by AI and LLMs is the future. Just over a year later, it’s already happening, and we should expect much more of it — prepping for agentic consumption is no longer optional.
Another mitigation tactic worth highlighting relates to #6 on OWASP’s Top 10 list of AI and LLM security concerns: Excessive Agency. While the temptation might exist to grant the same autonomy and permissions to an LLM-based app that we would to a human API consumer, everything we’ve seen above suggests that this may not be the best course of action.
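A minimal way to rein in excessive agency is to deny by default and give an agent's tool layer an explicit allowlist that is narrower than what a human consumer would get. The sketch below is illustrative only; the endpoints and operations are hypothetical:

```python
# A sketch of limiting an agent's "agency": the tool layer may only call an
# explicit allowlist of read-only operations. Names and endpoints are hypothetical.
ALLOWED_CALLS = {
    ("GET", "/v1/orders"),
    ("GET", "/v1/orders/{id}"),
    # Deliberately absent: POST /v1/refunds, DELETE /v1/orders/{id}, ...
}


def agent_can_call(method: str, path_template: str) -> bool:
    """Deny by default; the agent gets less autonomy than a human operator."""
    return (method.upper(), path_template) in ALLOWED_CALLS


print(agent_can_call("GET", "/v1/orders"))          # True
print(agent_can_call("DELETE", "/v1/orders/{id}"))  # False: requires a human
```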
Preparing for, and Mitigating, the Impact of LLMs on APIs
In an ideal world, individuals doing the AI prompting would take precautions. These could include specific actions to prevent LLMs from breaking contracts, like explicit role definitions that prevent LLMs from going beyond the required scope, embedding denials for “risky” behavior, and encouraging the LLM to self-evaluate or ask for user confirmation before taking action. In practice, it may be up to API providers.
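For those doing the prompting, the precautions above can be made concrete with a pinned role definition, embedded denials, and a confirmation gate in front of anything destructive. The prompt wording and the `execute_tool_call` helper below are illustrative, not prescriptive:

```python
# A sketch of prompt-side precautions: a role-pinning system prompt with
# explicit denials, plus a wrapper that forces human confirmation before any
# risky, LLM-proposed call is executed.
SYSTEM_PROMPT = (
    "You are a read-only support assistant for the Orders API. "
    "Never call endpoints that create, modify, or delete data. "
    "If a task appears to require such an action, stop and ask the user to confirm."
)

RISKY_VERBS = {"POST", "PUT", "PATCH", "DELETE"}


def execute_tool_call(method: str, path: str, send_request) -> str:
    """Gate risky, LLM-proposed calls behind an explicit human confirmation."""
    if method.upper() in RISKY_VERBS:
        answer = input(f"The agent wants to {method} {path}. Allow? [y/N] ")
        if answer.strip().lower() != "y":
            return "denied by user"
    return send_request(method, path)
```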
For API providers, that might mean introducing role-based access control (RBAC) to enforce access limits for AI agents, adding friction to sensitive actions to encourage human-in-the-loop approval, adopting zero trust for plugins, or implementing audit logging and observability to track LLM-generated requests in real time.
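On the provider side, a couple of those measures can be combined in a few lines: role-scoped tokens for AI agents and an audit log of every agent-issued request. The roles, scopes, and token lookup below are hypothetical stand-ins for whatever identity system you already run:

```python
# A provider-side sketch combining RBAC for agent-issued tokens with audit
# logging of every LLM-generated request. Roles and scopes are hypothetical.
import logging

logging.basicConfig(level=logging.INFO)
audit = logging.getLogger("agent-audit")

# Tokens issued to AI agents carry a narrower role than human integrations.
TOKEN_ROLES = {"agent-token-123": "ai_agent", "human-token-456": "integration"}
ROLE_SCOPES = {
    "ai_agent": {"GET"},                      # read-only by default
    "integration": {"GET", "POST", "PATCH"},  # broader, human-managed scope
}


def authorize(token: str, method: str, path: str) -> bool:
    role = TOKEN_ROLES.get(token)
    allowed = role is not None and method in ROLE_SCOPES.get(role, set())
    # Observability: every agent request is logged, allowed or not.
    audit.info("role=%s method=%s path=%s allowed=%s", role, method, path, allowed)
    return allowed


print(authorize("agent-token-123", "DELETE", "/v1/orders/42"))  # False, and audited
```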
Increasingly, we should expect to see AI governance take a larger role in the API governance process. This could cover safe prompt engineering, operational controls, agent permissions, and so on. In the meantime, while threat modeling specifically for AI use cases might sound excessive, it’s a sensible idea to think of LLM agents as both software and decision-makers.
The aim has always been to create APIs in a way that prevents both bad actors from abusing them and inexperienced consumers from misusing them. All of those best practices continue to apply here. In certain circumstances — a perfect storm, so to speak — reckless agentic consumption can get out of hand just as quickly as, if not more quickly than, any human error or cyberattack can.
In that respect, LLMs and AI pose a potential risk that API developers can’t afford to ignore.