Why LLMs Can't Make Your Site Accessible
LLMs are trained on internet data, but the internet remains largely inaccessible.
Author: Mike Barton, VP of Corporate Communications & Content Marketing
Published: 04/10/2026
AI is changing how developers build. It writes code, catches bugs, and pushes new experiences faster than most engineers can type. So when it comes time to make a site accessible, the instinct is natural: ask the LLM.
Here’s where that logic quickly falls apart: the explosion of AI-assisted development is making the web less accessible, not more. The number of websites and apps has increased by 40-60% year over year, driven by the boom in AI coding tools. But more code written faster doesn't mean more accessible code. Research shows LLM-generated code routinely ships with accessibility violations baked in, particularly around keyboard navigation, contrast, and screen reader support: the exact issues that drive the majority of legal claims. The result: digital accessibility on the internet is not improving and may even be getting worse, which is driving a record number of digital accessibility lawsuits.
The problem lies in what the model learned, where it learned it from, and what it has never seen.
LLMs learned from an inaccessible internet
The 2026 WebAIM Million report tells the story in stark terms: 95.9% of the top one million homepages had detectable WCAG failures, an average of 56.1 errors per page, a 10.1% increase over the prior year. That reversed six consecutive years of gradual improvement. WebAIM attributes the decline to "broader shifts in web development, including increased reliance on 3rd party frameworks and libraries and automated or AI-assisted coding practices." The tools developers are using to build faster are actively making the web less accessible.
That's the internet LLMs trained on.
When a model learns to write code by ingesting billions of pages of existing code and content, it learns whatever patterns dominate that data. On the internet, the dominant pattern is inaccessible. Missing alt text, unlabeled form fields, broken heading hierarchies, inaccessible modals, focus traps that strand keyboard users: these aren't edge cases in the training data. They're the norm.
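One of those dominant patterns, images shipped without alt text, can be sketched as a naive check. This is an illustration only, not how any production scanner works: real tools parse the DOM and apply hundreds of rules, while this hypothetical `findMissingAlt` function uses a deliberately simple regex.

```typescript
// Naive sketch of one pattern automated scanners flag: <img> tags
// that carry no alt attribute at all. Illustrative only; real tools
// parse the DOM rather than matching tags with a regex.
function findMissingAlt(html: string): string[] {
  const imgTags = html.match(/<img\b[^>]*>/gi) ?? [];
  // Keep only tags where no alt attribute appears.
  return imgTags.filter((tag) => !/\balt\s*=/i.test(tag));
}

const page = `<img src="hero.png"><img src="logo.png" alt="Company logo">`;
console.log(findMissingAlt(page)); // ['<img src="hero.png">']
```

Because patterns like the first `<img>` above dominate the training data, a model learns them as the default way to write an image tag.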
As the New York City Bar Association put it in a recent analysis: "AI cannot solve for accessibility if it was never trained to recognize it. The vast majority of digital content used to train LLMs and other AI systems inherently lacks accessibility." The report goes further: if you feed inaccessible HTML into an LLM with a prompt like "make this accessible," the outputs will likely reflect the same barriers present in the original material.
What this looks like in practice
A developer asks an LLM to build an accessible navigation menu. The model produces something that looks reasonable in code review, but it adds aria-label values that conflict with visible text, creating confusing double-announcements for screen reader users. It structures heading levels based on visual size rather than semantic hierarchy. It handles focus management in a way that works for sighted mouse users but traps keyboard users inside the component.
None of these failures is obvious. They only surface when someone navigates the page with a screen reader, or tabs through with a keyboard, or uses voice control. The model has no concept of those interactions because it was never exposed to them during training.
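The aria-label conflict above can be sketched in code. The check below is a simplified illustration of the idea behind WCAG success criterion 2.5.3 (Label in Name), not AudioEye's implementation, and the function name `labelInName` is my own: when the accessible name omits the visible label, a voice-control user who speaks the label they see cannot activate the control.

```typescript
// Simplified sketch of the "Label in Name" idea (WCAG 2.5.3):
// the accessible name (e.g. aria-label) should contain the visible
// text, otherwise speaking the visible label fails to match.
function labelInName(visibleText: string, accessibleName: string): boolean {
  const normalize = (s: string) => s.trim().toLowerCase().replace(/\s+/g, " ");
  return normalize(accessibleName).includes(normalize(visibleText));
}

// An LLM-generated <button aria-label="navigation">Menu</button>
// fails: the user sees "Menu" but the screen reader says "navigation".
console.log(labelInName("Menu", "navigation")); // false: conflicting names
console.log(labelInName("Menu", "Main menu")); // true: visible text included
```

A string check like this looks trivial, which is exactly the point: the failure is invisible in ordinary code review and only surfaces when assistive technology computes the accessible name.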
The disability community has raised this concern directly. The American Foundation for the Blind warns of "automation bias," the belief that machines make fairer, more accurate decisions than humans, which "may lead to over-trust of AI systems." Research from Harvard notes that "AI relies on large-scale statistical learning, which tends to optimize for the 'average' user. People with disabilities, who form a highly diverse and frequently underrepresented group, are often excluded or mischaracterized."
When a developer trusts an LLM's output on accessibility, they're trusting a system that was optimized for patterns that excluded disabled users from the start.
LLMs don't just miss issues. They create new ones.
The risk isn't limited to what LLMs miss. It's also what happens when they try to fix what they find.
An LLM trained on flawed data generates fixes built on unreliable foundations. Automated scanning tools already produce false positives at rates of 25-35%. An LLM operating with the same gaps in understanding doesn't just miss high-impact issues. It also patches things that weren't broken and adds fixes that conflict with how assistive technology already interprets the page.
The result is a developer who ran code through an LLM, got clean-looking output, and now believes their site is accessible. That false confidence is arguably more dangerous than knowing you have a problem, because it removes the motivation to get expert help.
This is the fundamental difference between an LLM applying general knowledge to a specialized problem and a platform built on that specialization from the start. AudioEye's fixes aren't generated from patterns learned on an inaccessible internet. They're informed by billions of real-world assistive technology interactions across hundreds of thousands of sites, validated against how components actually interact with screen readers, keyboards, and other assistive devices. When an LLM guesses at a fix, it's working from the same flawed foundations as the code it's trying to repair. When AudioEye’s Digital Accessibility Platform applies a fix, it's drawing on data from the interactions that actually matter: real users, real devices, real outcomes.
What actual accessibility expertise looks like
The gap between what an LLM can do and what accessibility actually requires isn't closing with the next model release. It's structural.
AudioEye has years of proprietary data, accumulated through our automation and custom fixes across hundreds of thousands of sites and billions of unique visits. That context is the difference. An independent study by Adience found that detection quality varies by up to 253% across tools tested under identical conditions. At WCAG Level A alone, the gap between the highest-performing and lowest-performing tool exceeded 500%. Even among accessibility tools, most only cover the detectable surface. AudioEye goes further: not just finding issues but fixing them, informed by expertise and data that no LLM has access to.
The bottom line
Accessibility has never been a problem you can solve with general knowledge applied at scale. It's a problem that requires specialized knowledge, maintained continuously, by people and systems dedicated entirely to it.
LLMs don't have that. They have the internet's habits, and the internet's habits are inaccessible. The gap between what an LLM produces and what disabled users actually need isn't shrinking. It's compounding with every site built on AI-generated code that was never tested against real assistive technology.
The organizations closing the accessibility gap aren't prompting their way there. They're choosing platforms where decades of expertise, billions of real-world data points, and continuous investment in the problem are already built into every fix. That's not a shortcut. It's the only approach that's ever actually worked.