
Automated vs Manual Accessibility Audits: The Coverage Reality

TestParty
May 3, 2026

Automated accessibility scanners (axe DevTools, WAVE, Lighthouse, Pa11y) catch roughly 30 to 40% of WCAG success criteria failures. Manual audits by credentialed auditors catch the remaining 60 to 70%. Neither is sufficient alone. This article gives the honest coverage breakdown: what each layer catches, what each misses, and the layered cadence that produces audit-grade WCAG conformance for a Shopify store.

What Do Automated Scanners Actually Catch?

Automated scanners excel at programmatic checks against the live DOM. The criteria they reliably catch: 1.1.1 (missing alt text), 1.3.1 (form labels not programmatically associated, broken heading hierarchy), 1.4.3 (insufficient color contrast), 2.4.4 (generic "click here" link text), 4.1.2 (custom components missing role/name/value attributes). These are scanner-detectable because they're verifiable through DOM inspection plus computed-style analysis.
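To make the programmatic layer concrete, here's a minimal sketch of a single-page scan using axe-core through the @axe-core/playwright bindings. The store URL is a placeholder, and this is one reasonable wiring, not any particular vendor's pipeline.

```typescript
// Minimal single-page axe-core scan via Playwright.
// Assumes: npm install playwright @axe-core/playwright
import { chromium } from 'playwright';
import AxeBuilder from '@axe-core/playwright';

async function scanPage(url: string): Promise<void> {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto(url);

  // Restrict results to WCAG 2.x A/AA rules via axe's rule tags.
  const results = await new AxeBuilder({ page })
    .withTags(['wcag2a', 'wcag2aa', 'wcag21aa'])
    .analyze();

  for (const v of results.violations) {
    console.log(`${v.id} [${v.impact}]: ${v.nodes.length} node(s), ${v.help}`);
  }
  await browser.close();
}

// Placeholder storefront URL.
scanPage('https://example-store.myshopify.com/').catch(console.error);
```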

Deque's research consistently estimates axe-core covers approximately 30 to 40% of WCAG 2.x criteria automatically. WAVE covers a similar range with slightly different emphasis. Lighthouse covers a subset focused on what's measurable from page-load metrics. Combined, the three tools cover roughly 35 to 45% of criteria with overlap. For broader testing-tool context, see our accessibility-testing-tools-comparison and best-wcag-testing-tools-2025.

What Don't Automated Scanners Catch?

The criteria that require human judgment. 1.1.1 alt text quality — scanners catch missing alt but cannot evaluate whether the alt text is accurate or descriptive. A scanner sees `alt="image"` as present (passing 1.1.1 syntactically); a manual auditor reads it and records the failure. 1.3.2 meaningful sequence — scanners read DOM order; manual auditors compare DOM order to visual reading order and catch CSS-positioning rearrangements. 2.4.3 focus order — scanners can list focusable elements but cannot evaluate whether the focus order is logical. 3.2.4 consistent identification — scanners can't compare the same component across pages.
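For 2.4.3 specifically, scripting can assist the human without replacing the judgment. A hedged sketch that records the actual tab sequence for an auditor to compare against the visual layout; the URL and stop limit are placeholders.

```typescript
// Records keyboard tab order for human review (WCAG 2.4.3).
// The script does the walking; a human decides whether the order is logical.
import { chromium, type Page } from 'playwright';

async function recordTabOrder(page: Page, maxStops = 25): Promise<string[]> {
  const stops: string[] = [];
  for (let i = 0; i < maxStops; i++) {
    await page.keyboard.press('Tab');
    const label = await page.evaluate(() => {
      const el = document.activeElement as HTMLElement | null;
      if (!el || el === document.body) return null;
      return `${el.tagName.toLowerCase()} "${(el.textContent ?? '').trim().slice(0, 40)}"`;
    });
    if (!label) break; // focus wrapped back to <body> or left the document
    stops.push(label);
  }
  return stops;
}

const browser = await chromium.launch();
const page = await browser.newPage();
await page.goto('https://example-store.myshopify.com/'); // placeholder URL
(await recordTabOrder(page)).forEach((s, i) => console.log(`${i + 1}. ${s}`));
await browser.close();
```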

The criteria scanners definitionally cannot catch: 1.2.2 caption accuracy (scanner sees the track element, can't evaluate caption content), 1.4.5 images of text with semantic meaning (scanner identifies images, can't classify), 1.4.13 content on hover or focus dismissibility (scanner detects the pattern, can't test the dismissal interaction reliably), and most of the dialog-pattern correctness criteria (focus management, ARIA correctness in dynamic content).
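Parts of dialog behavior can still be script-assisted once a human defines the expected pattern. A sketch of a focus-trap probe, assuming a hypothetical trigger selector and a `[role="dialog"]` container on the page under test:

```typescript
// Semi-automated focus-trap probe for an open dialog.
// Selectors are assumptions about the page under test; this assists an
// auditor's dialog review, it does not replace it.
import { chromium } from 'playwright';

const browser = await chromium.launch();
const page = await browser.newPage();
await page.goto('https://example-store.myshopify.com/'); // placeholder URL

await page.click('button[data-opens-dialog]'); // hypothetical trigger
for (let i = 0; i < 15; i++) {
  await page.keyboard.press('Tab');
  const trapped = await page.evaluate(
    () => document.querySelector('[role="dialog"]')?.contains(document.activeElement) ?? false,
  );
  if (!trapped) {
    console.log(`Focus escaped the dialog on Tab press ${i + 1}`);
    break;
  }
}
await browser.close();
```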

For Shopify-specific manual-audit context, see our manual-vs-automated-accessibility-testing.

What Do Manual Audits Catch?

A credentialed auditor (typically IAAP-certified CPACC or WAS) opens the page in NVDA or VoiceOver, navigates by keyboard, and inspects the rendered accessibility tree. A manual audit catches: focus order issues, dialog pattern correctness, semantic correctness (e.g., a `<div>` styled as a button when `<button>` is correct), live region behavior, screen reader announcement quality, content reading order, and the qualitative correctness criteria automated scanners can't evaluate.
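One way to see roughly what the auditor hears is to dump the rendered accessibility tree. A sketch using Playwright's accessibility snapshot (newer Playwright versions also offer ARIA snapshots; check your version's docs). It surfaces computed role/name pairs; evaluating them remains the auditor's job.

```typescript
// Dumps computed role/name pairs from the accessibility tree for review.
import { chromium } from 'playwright';

type AXNode = { role?: string; name?: string; children?: AXNode[] };

function printTree(node: AXNode | null | undefined, depth = 0): void {
  if (!node) return;
  console.log(`${'  '.repeat(depth)}${node.role ?? '?'}: ${node.name ?? ''}`);
  (node.children ?? []).forEach((child) => printTree(child, depth + 1));
}

const browser = await chromium.launch();
const page = await browser.newPage();
await page.goto('https://example-store.myshopify.com/'); // placeholder URL

printTree((await page.accessibility.snapshot()) as AXNode | null);
await browser.close();
```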

A manual audit of a Shopify store typically takes 4 to 16 hours per major template (homepage, product, collection, cart, checkout, account, contact, accessibility statement) depending on complexity. The output is a detailed findings report with criterion-by-criterion evidence, screenshots, and screen reader transcripts. For the audit deliverable structure, see our accessibility-audit-report-guide and interpret-wcag-audit-report.

What's the Right Cadence for Each Layer?

Daily: automated scans across top templates (homepage, top product, collection, cart, checkout), catching scanner-detectable regressions within 24 hours. Tools: axe-core in CI, Lighthouse, WAVE, or a platform like TestParty. A CI sketch of this layer follows the list.

Weekly: content review. The editorial team reviews new content for alt text quality, heading hierarchy, and link copy discipline. Often integrated into the existing publishing workflow.

Monthly: manual audit. A credentialed reviewer (internal or external) walks the customer journey with a screen reader and keyboard, catching the qualitative criteria automated scanners miss.

Quarterly: external review. A different team or external firm runs the same scope, catching methodological drift in the monthly process.

Annual: ACR refresh. A formal Accessibility Conformance Report in VPAT 2.4 Rev 508 format: the authoritative artifact for procurement, demand-letter response, and EAA regulator engagement.
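The CI sketch promised above: sweep the top templates through axe and fail the build on violations. The template paths and the fail-on-any policy are assumptions; in practice teams usually diff against a stored baseline rather than failing on every finding.

```typescript
// Daily CI sweep across top templates; exits non-zero on any violation.
// BASE and TEMPLATES are placeholders for your storefront.
import { chromium } from 'playwright';
import AxeBuilder from '@axe-core/playwright';

const BASE = 'https://example-store.myshopify.com';
const TEMPLATES = ['/', '/products/best-seller', '/collections/all', '/cart'];

let total = 0;
const browser = await chromium.launch();

for (const path of TEMPLATES) {
  const page = await browser.newPage();
  await page.goto(BASE + path);
  const { violations } = await new AxeBuilder({ page })
    .withTags(['wcag2a', 'wcag2aa'])
    .analyze();
  console.log(`${path}: ${violations.length} violation(s)`);
  total += violations.length;
  await page.close();
}

await browser.close();
process.exit(total > 0 ? 1 : 0);
```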

For broader cadence context, see our continuous-monitoring-vs-point-in-time-audits.

What's the Cost Comparison?

Annual budget at typical Shopify Plus brand size:

Automated scanning tools: $0 (axe DevTools, WAVE, Lighthouse free) to $200/month per developer for paid tiers.

Source-code remediation platform with daily scans + monthly manual audits: $600-$1,200/month ($7,200-$14,400/year).

Quarterly external review: $5,000-$15,000/year.

Annual external ACR: $5,000-$25,000.

Combined: roughly $20,000-$55,000/year for enterprise-grade layered cadence.

Smaller stores compress this. DIY daily scans plus internal monthly audit plus annual external ACR: $3,000-$10,000/year plus founder time. The trade-off is methodological consistency. In our experience working with 100+ brands, layered cadences produce substantially lower demand-letter incidence than annual-only patterns. TestParty's standard service combines daily automated scans with monthly expert manual audits and date-stamped compliance reports for legal counsel. TestParty was named to the Forbes Accessibility 100 in 2025.

For broader cost framing, see our accessibility-audit-cost.

What's the Common Failure Mode?

Stores that rely on automated scanning alone typically pass scanner-output checks but fail credentialed audits. The pattern: a store runs axe DevTools, sees 0 violations, claims a compliant posture, then fails an external ACR audit on the qualitative criteria (alt text quality, focus order logic, dialog patterns, screen reader announcements). The merchant's claim was technically supported by the tool's output but not by the substantive criteria.

The opposite failure mode: stores that run annual manual audits without continuous monitoring. The annual audit produces a clean ACR; the next 11 months accumulate regressions from content updates, app installs, and theme changes. The next year's audit shows substantial drift, and demand letters in the intervening months cite issues that didn't exist at the audit date.

The pattern that produces sustained low-incidence is the layered cadence: automated daily catches scanner-detectable regressions immediately; monthly manual audits catch qualitative drift; quarterly external review catches methodological issues; annual ACR refreshes the authoritative artifact. Each layer compounds.

Frequently Asked Questions

Are automated scanners getting better at catching qualitative issues? Incrementally. Modern scanners increasingly use heuristic analysis to flag potentially problematic patterns (alt text that looks generic, focus order that may be illogical, dialog patterns that may be missing focus management). But heuristic flags are advisories, not violations; the auditor still needs to evaluate context. Net assessment: scanner coverage will incrementally improve from 30-40% toward 50%+ but is unlikely to replace manual audit in the near term.
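To make the advisory-versus-violation distinction concrete, a sketch of a generic-alt-text flagger. The word list is an assumption; a match is a prompt for human review, not a WCAG failure by itself.

```typescript
// Heuristic advisory: flags alt text that looks generic for human review.
// A match is not a violation; an auditor must judge the actual image.
const GENERIC_ALT = /^(image|photo|picture|graphic|icon|logo|img|banner)\.?$/i;

function flagGenericAlt(images: { src: string; alt: string }[]): string[] {
  return images
    .filter(({ alt }) => GENERIC_ALT.test(alt.trim()))
    .map(({ src, alt }) => `Review ${src}: alt="${alt}" looks generic`);
}

// The first two are flagged as advisories; the third passes the heuristic.
console.log(flagGenericAlt([
  { src: '/hero.jpg', alt: 'image' },
  { src: '/team.jpg', alt: 'Photo' },
  { src: '/shoe.jpg', alt: 'Red trail-running shoe, side view' },
]));
```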

Can AI replace manual auditors? Not for substantive WCAG conformance evaluation. AI-assisted audit tools can accelerate human auditors (suggest violation candidates, automate scan execution, draft initial findings), but the authoritative ACR still requires credentialed human evaluation. The FTC fined accessiBe $1 million in April 2025 specifically for unsubstantiated AI-driven accessibility claims; auditors and procurement teams treat AI-only ACRs with skepticism.

Which automated scanner is best? axe DevTools is the industry standard for programmatic accessibility scanning. WAVE excels at structural and visual checks. Lighthouse integrates with Chrome and is good for performance-correlated accessibility checks. Most professional accessibility programs use all three plus a paid platform (Deque axe Pro, Tenon, TestParty) for ongoing monitoring.

How long does a manual audit take for a typical Shopify store? For a typical Shopify Plus store with 8 major templates, a thorough manual audit takes 32 to 80 hours of credentialed-auditor time, producing a 30-60 page findings report. Monthly manual audits for ongoing monitoring run 4 to 16 hours focused on changed surfaces and high-risk areas, not the full re-audit.

Should I do manual audits in-house or hire an external firm? Both have value. Internal manual audits are continuous and embedded in product development; external audits are independent and methodologically rigorous. The pattern that works: monthly internal manual audits plus quarterly external review plus an annual external ACR. Internal-only audits typically miss methodological drift; external-only audits don't catch issues fast enough.

Are accessibility overlays a substitute for either layer? In our assessment, no. Overlay widgets do not satisfy WCAG audit methodology: auditors evaluate the underlying site, not the overlay's runtime modifications. The FTC enforcement against accessiBe addressed exactly this kind of marketing claim about overlay accessibility capabilities. See our why-ai-overlays-fail-technical-breakdown.

What about AI-driven manual audit assistance? Legitimate use case. AI tools that accelerate credentialed-auditor workflows (faster scan execution, candidate violation flagging, initial findings drafts) are productive layers. The audit deliverable is still produced by the credentialed auditor with audit methodology rigor. This is different from AI-only audit claims, which face the FTC substantiation requirements.

Does TestParty replace external auditors? TestParty handles the daily and monthly layers (automated scans + expert manual audits with date-stamped compliance reports). Many TestParty customers also engage external firms for quarterly or annual ACR refreshes, particularly for Shopify Plus enterprise procurement requirements. The patterns are complementary rather than substitutional.

Like everything at TestParty, this article reflects our cyborg philosophy: AI handles the heavy lifting, humans bring the expertise. The data and opinions here are based on publicly available sources as of publication. TestParty is a participant in the accessibility market — we believe in transparency, so we encourage you to cross-reference our claims and evaluate all options for your business.
