Video Captioning Requirements: WCAG 2.2 Media Accessibility Compliance

TestParty

March 27, 2025

Why Media Accessibility Matters
WCAG Requirements for Media
Caption Requirements
Transcript Requirements
Audio Description Requirements
Media Player Accessibility
Live Media Accessibility
Common Media Accessibility Failures
Captioning Workflows
Testing Media Accessibility
Taking Action
Related Resources

Video content presents accessibility barriers that text does not. Deaf users cannot hear audio. Blind users cannot see visual-only information. Users in sound-sensitive environments cannot play audio. Without captions, transcripts, and audio descriptions, video content excludes significant user populations.

WCAG establishes clear requirements for making video and audio content accessible. This guide covers captioning requirements, transcript standards, audio description guidelines, and media player accessibility—everything needed for compliant video content.

Why Media Accessibility Matters

Video accessibility affects multiple user groups.

Deaf and Hard of Hearing Users

Approximately 466 million people worldwide have disabling hearing loss. Without captions, deaf users cannot access audio content in videos—dialogue, narration, sound effects, music cues.

Blind and Low-Vision Users

Blind users cannot perceive visual-only information in videos. When important content appears only visually—on-screen text, demonstrations, actions, scene changes—audio descriptions make this content accessible.

Cognitive Disabilities

Captions benefit users with cognitive disabilities who process information more effectively when reading alongside listening. The dual-channel presentation reinforces comprehension.

Situational Limitations

Beyond permanent disabilities, captions serve:

Users in sound-sensitive environments (offices, public spaces)
Users with temporary hearing issues
Non-native language speakers
Users learning to read
Anyone in loud environments where audio isn't audible

Business Impact

Facebook has reported that captioned video ads see about 12% longer view time, and LinkedIn is cited as finding that captioned videos receive 26% more engagement. Accessibility and engagement align.

WCAG Requirements for Media

Multiple success criteria address audio and video accessibility.

1.2.1 Audio-only and Video-only (Prerecorded) — Level A

For prerecorded audio-only content:

Provide a text transcript

For prerecorded video-only content (no audio):

Provide either a text transcript OR audio track describing the video content

1.2.2 Captions (Prerecorded) — Level A

For prerecorded video with audio:

Captions are provided for all prerecorded audio content in synchronized media

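This is a Level A (minimum) requirement: any site aiming for WCAG conformance needs captions on prerecorded video that has audio. The one exception is media that is itself a clearly labeled alternative for text content.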

1.2.3 Audio Description or Media Alternative (Prerecorded) — Level A

For prerecorded video with audio:

Provide either audio description OR a full text alternative (transcript including visual descriptions)

1.2.4 Captions (Live) — Level AA

For live video with audio:

Captions are provided for all live audio content in synchronized media

Real-time captioning (CART services or automatic speech recognition) is required for live video.

1.2.5 Audio Description (Prerecorded) — Level AA

For prerecorded video with audio:

Audio description is provided for all prerecorded video content

At Level AA, audio description is required—not just a full transcript alternative.

1.2.6 Sign Language (Prerecorded) — Level AAA

For prerecorded video with audio:

Sign language interpretation is provided

1.2.7 Extended Audio Description (Prerecorded) — Level AAA

For prerecorded video:

Where pauses in audio are insufficient for audio descriptions, extended audio description is provided

Extended audio description pauses video to allow longer descriptions.

1.2.8 Media Alternative (Prerecorded) — Level AAA

For prerecorded synchronized media and video-only:

A full text alternative is provided

1.2.9 Audio-only (Live) — Level AAA

For live audio-only content:

A text alternative is provided
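The nine criteria above can be kept as a small lookup table when planning an audit. The following is a minimal Python sketch, not an official WCAG data set: the names Criterion, CRITERIA, and criteria_for are illustrative, and the entries simply restate the list in this section.

```python
from typing import NamedTuple


class Criterion(NamedTuple):
    number: str
    name: str
    level: str  # "A", "AA", or "AAA"


# WCAG 1.2.x success criteria, as listed in this section.
CRITERIA = [
    Criterion("1.2.1", "Audio-only and Video-only (Prerecorded)", "A"),
    Criterion("1.2.2", "Captions (Prerecorded)", "A"),
    Criterion("1.2.3", "Audio Description or Media Alternative (Prerecorded)", "A"),
    Criterion("1.2.4", "Captions (Live)", "AA"),
    Criterion("1.2.5", "Audio Description (Prerecorded)", "AA"),
    Criterion("1.2.6", "Sign Language (Prerecorded)", "AAA"),
    Criterion("1.2.7", "Extended Audio Description (Prerecorded)", "AAA"),
    Criterion("1.2.8", "Media Alternative (Prerecorded)", "AAA"),
    Criterion("1.2.9", "Audio-only (Live)", "AAA"),
]

LEVEL_ORDER = {"A": 1, "AA": 2, "AAA": 3}


def criteria_for(target_level: str) -> list[Criterion]:
    """Return every criterion at or below the target conformance level."""
    limit = LEVEL_ORDER[target_level]
    return [c for c in CRITERIA if LEVEL_ORDER[c.level] <= limit]
```

Calling criteria_for("AA") returns the five Level A and AA criteria, which is the set most compliance targets care about.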

Caption Requirements

Captions are the most fundamental video accessibility requirement.

What Captions Must Include

Captions must convey all audio content, not just dialogue.

Dialogue and Narration: All spoken words, accurately transcribed.

Speaker Identification: When multiple speakers appear, identify who is speaking.

[Sarah] The new design improves accessibility significantly.
[Tom] What specific changes did you make?

Sound Effects: Meaningful sounds that affect understanding.

[door slams]
[phone ringing]
[applause]

Music: When music conveys meaning or emotion.

[tense music playing]
[upbeat background music]
♪ Happy birthday to you ♪

Tone and Manner: When delivery affects meaning.

[sarcastically] Oh, that's just great.
[whispering] Don't tell anyone.

Caption Quality Standards

Accuracy: Captions must accurately represent spoken content. Aim for 99%+ accuracy. Auto-generated captions typically achieve 80-90% accuracy and require editing.
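This article does not say how caption accuracy is measured. One common approach is word error rate (WER): the word-level edit distance between a verified reference transcript and the captions, divided by the reference length. The sketch below assumes that definition; the function name word_accuracy is illustrative, and it ignores punctuation and capitalization conventions that a real style guide would also check.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - WER, where WER = word-level edit distance / reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # Dynamic-programming edit distance, keeping only one previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # word missing from the captions
                curr[j - 1] + 1,     # extra word in the captions
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return 1 - prev[-1] / len(ref)
```

One wrong word in a five-word sentence already drops the score to 0.8, which illustrates why unedited automatic captions fall short of a 99% target.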

Synchronization: Captions must appear in sync with audio. WCAG does not set a numeric tolerance; a commonly cited target is within 3 frames, or roughly 100 milliseconds, of the audio.

Readability:

Maximum 2 lines on screen at once
Maximum of 32 to 42 characters per line, depending on the style guide
Minimum 1 second display time per caption
Position to avoid covering important visuals

Completeness: All meaningful audio must be captioned. Do not write "[inaudible]" for speech that is merely hard to make out: research the word or name, and if it still cannot be confirmed, mark the uncertainty explicitly.
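The readability limits above are easy to check automatically. The following is a minimal Python sketch under stated assumptions: the Cue class and readability_issues function are illustrative names, the defaults use the upper end of the 32 to 42 character range, and positioning is not checked because it needs the video itself.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str     # caption text, lines separated by "\n"


def readability_issues(cue: Cue, max_lines: int = 2, max_chars: int = 42,
                       min_seconds: float = 1.0) -> list[str]:
    """Return human-readable problems for one cue (empty list means it passes)."""
    issues = []
    lines = cue.text.split("\n")
    if len(lines) > max_lines:
        issues.append(f"{len(lines)} lines (max {max_lines})")
    for line in lines:
        if len(line) > max_chars:
            issues.append(f"line has {len(line)} characters (max {max_chars})")
    duration = cue.end - cue.start
    if duration < min_seconds:
        issues.append(f"displayed {duration:.2f}s (min {min_seconds}s)")
    return issues
```

Running this over every cue in a file gives a quick first pass before a human review of accuracy and completeness.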

Caption File Formats

Common caption formats for web video:

WebVTT (.vtt): The web-native format. Recommended for HTML5 video.

WEBVTT

00:00:01.000 --> 00:00:04.000
Welcome to our accessibility tutorial.

00:00:04.500 --> 00:00:08.000
Today we'll cover captioning requirements.
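To run automated checks against a caption file, the cues first need to be read into a usable form. The following is a minimal Python sketch of a WebVTT reader, not a full parser: it skips blocks without a timing line (the header, NOTE, and STYLE blocks), ignores cue settings, and the name parse_vtt is illustrative.

```python
import re

# A WebVTT timing line; the hours part is optional in the format.
_TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


def _seconds(hours, minutes, seconds, millis) -> float:
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_vtt(text: str) -> list[tuple[float, float, str]]:
    """Return (start_seconds, end_seconds, cue_text) for each cue in the file."""
    cues = []
    blocks = re.split(r"\n\s*\n", text.replace("\r\n", "\n").strip())
    for block in blocks:
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = _TIMING.match(line)
            if match:
                groups = match.groups()
                start = _seconds(*groups[:4])
                end = _seconds(*groups[4:])
                # Everything after the timing line is the cue text.
                cues.append((start, end, "\n".join(lines[i + 1:])))
                break
    return cues
```

For the sample file above, this yields two cues: one from 1.0 to 4.0 seconds and one from 4.5 to 8.0 seconds.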

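SRT (.srt): Widely supported by video platforms and editing tools, and simpler than WebVTT. The HTML5 track element expects WebVTT, so convert SRT files before using them with native video.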

1
00:00:01,000 --> 00:00:04,000
Welcome to our accessibility tutorial.

2
00:00:04,500 --> 00:00:08,000
Today we'll cover captioning requirements.

TTML/DFXP: Used for broadcast and some streaming platforms.
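Comparing the two samples above, the SRT and WebVTT versions differ in only three ways: the WEBVTT header, the numeric counters, and a comma versus a period before the milliseconds. A conversion can therefore be sketched in a few lines of Python. This is a minimal sketch for simple files like the sample, with an illustrative function name; it does not handle SRT styling tags or malformed input.

```python
import re

# SRT timestamps use a comma before milliseconds; WebVTT uses a period.
_SRT_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert simple SRT text to WebVTT, dropping the numeric cue counters."""
    blocks = re.split(r"\n\s*\n", srt_text.replace("\r\n", "\n").strip())
    out = ["WEBVTT"]
    for block in blocks:
        if not block.strip():
            continue
        lines = block.split("\n")
        # Drop the leading counter line ("1", "2", ...) if present.
        if lines[0].strip().isdigit():
            lines = lines[1:]
        if not lines:
            continue
        lines[0] = _SRT_TIMING.sub(r"\1.\2 --> \3.\4", lines[0])
        out.append("\n".join(lines))
    return "\n\n".join(out) + "\n"
```

Feeding in the SRT sample from this section produces the WebVTT sample shown before it.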

Implementing Captions

HTML5 Video:

<video controls>
  <source src="tutorial.mp4" type="video/mp4">
  <track kind="captions"
         src="tutorial-captions.vtt"
         srclang="en"
         label="English"
         default>
  <track kind="captions"
         src="tutorial-captions-es.vtt"
         srclang="es"
         label="Spanish">
</video>

YouTube:

Upload .vtt or .srt files via YouTube Studio
Auto-generated captions available (require editing)
Caption settings accessible to viewers

Vimeo:

Upload caption files via video settings
Multiple language support
Styling options available

Caption Types

Closed Captions (CC): Can be turned on/off by viewers. Standard for web video.

Open Captions: Burned into video, always visible. Use when caption controls may be inaccessible or for social media autoplay.

SDH (Subtitles for Deaf and Hard of Hearing): Include speaker identification and sound descriptions. More comprehensive than standard subtitles.

Transcript Requirements

Transcripts provide text alternatives for audio and video content.

When Transcripts Are Required

Audio-only content (podcasts, audio recordings): Transcripts required at Level A.

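Video with audio: A descriptive transcript (one that includes visual descriptions) can satisfy 1.2.3 at Level A in place of audio description, but it does not replace captions, which 1.2.2 still requires. A full text alternative is required at Level AAA (1.2.8).

Video-only (no audio): A text description of the visual content, or an audio track describing it, is required at Level A.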

What Transcripts Must Include

For audio content:

All spoken dialogue and narration
Speaker identification
Relevant sound effects
Musical cues when meaningful

For video content (descriptive transcripts): All of the above, plus:

Description of visual actions
On-screen text
Scene changes
Visual information not in audio

Transcript Format

Transcripts should be:

HTML text (not PDF or image)
Located near the video/audio
Clearly labeled
Searchable and copyable

<details>
  <summary>Video Transcript</summary>
  <div class="transcript">
    <p><strong>Sarah:</strong> Welcome to our accessibility tutorial.</p>
    <p><em>[Screen shows WCAG logo]</em></p>
    <p><strong>Sarah:</strong> Today we'll cover the requirements for video captions...</p>
  </div>
</details>
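When transcripts are produced for many videos, the markup above can be generated rather than written by hand. The following is a minimal Python sketch, assuming the transcript is available as (speaker, text) pairs; the function name and the convention that an empty speaker marks a visual description are illustrative choices, not part of any standard.

```python
from html import escape


def transcript_html(entries: list[tuple[str, str]],
                    title: str = "Video Transcript") -> str:
    """Render (speaker, text) pairs; an empty speaker marks a visual description."""
    parts = [
        "<details>",
        f"  <summary>{escape(title)}</summary>",
        '  <div class="transcript">',
    ]
    for speaker, text in entries:
        if speaker:
            parts.append(
                f"    <p><strong>{escape(speaker)}:</strong> {escape(text)}</p>"
            )
        else:
            # Visual descriptions are bracketed and emphasized, as in the example.
            parts.append(f"    <p><em>[{escape(text)}]</em></p>")
    parts += ["  </div>", "</details>"]
    return "\n".join(parts)
```

Escaping every value keeps characters such as < and & in the spoken text from breaking the page.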

Interactive Transcripts

Enhanced transcripts allow users to click text to jump to that point in the video:

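<div class="interactive-transcript">
  <!-- Buttons are focusable and keyboard-operable; a script reads data-time and seeks the video -->
  <p><button type="button" data-time="0">Welcome to our accessibility tutorial.</button></p>
  <p><button type="button" data-time="4.5">Today we'll cover captioning requirements.</button></p>
</div>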
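Interactive transcripts often also work in the other direction, highlighting the line being spoken as the video plays. That behavior is an extension beyond what this article describes, but the lookup behind it is simple: given the sorted data-time values, find the last one that is not after the current playback time. A minimal Python sketch of that lookup, with an illustrative function name:

```python
from bisect import bisect_right
from typing import Optional


def active_index(start_times: list[float], current_time: float) -> Optional[int]:
    """Return the index of the transcript line being spoken at current_time.

    start_times must be sorted ascending (the data-time values in page order).
    Returns None before the first line starts.
    """
    i = bisect_right(start_times, current_time) - 1
    return i if i >= 0 else None
```

In a browser the same logic would run in the player's script on each time update; it is shown in Python here only to keep the examples in one language.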

Audio Description Requirements

Audio description narrates visual information for blind users.

What Audio Description Covers

Visual actions: "Sarah walks to the whiteboard and writes 'WCAG 2.2' in large letters."

On-screen text: "Text appears: 'Three levels of conformance: A, AA, AAA.'"

Scene changes: "The scene shifts to an office meeting room."

Character appearances: "A man in a blue suit enters the room."

Non-verbal communication: "Sarah nods in agreement."

When Audio Description Is Required

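Level A: Audio description OR full text alternative

Level AA: Audio description is required

Most compliance targets (for example, ADA-related requirements and the EAA) point to Level AA, so audio description is needed for prerecorded video whose visuals carry information that the audio does not.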

Creating Audio Description

Timing: Descriptions fit in natural pauses in dialogue/narration. If no pauses exist, extended audio description (Level AAA) pauses the video.

Content: Describe what's seen, not interpret. "Sarah points at the chart" not "Sarah seems excited about the data."

Voice: Distinct from main audio, clear and neutral.

Length: Brief and efficient—describe essential visual information.
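Finding the natural pauses mentioned under Timing can be partly automated when a caption file already exists, because the gaps between caption cues approximate the gaps in dialogue. The following is a minimal Python sketch under that assumption; the function name and the 2-second default threshold are illustrative, and a describer still has to judge whether each gap is usable.

```python
def description_gaps(cues: list[tuple[float, float]],
                     min_gap: float = 2.0) -> list[tuple[float, float]]:
    """Return (start, end) gaps of at least min_gap seconds between caption cues.

    cues holds (start_seconds, end_seconds) pairs and may overlap or be unsorted.
    """
    gaps = []
    latest_end = None
    for start, end in sorted(cues):
        if latest_end is not None and start - latest_end >= min_gap:
            gaps.append((latest_end, start))
        # Track the furthest end seen so overlapping cues do not create false gaps.
        latest_end = end if latest_end is None else max(latest_end, end)
    return gaps
```

If this returns few or no gaps for a video with a lot of visual information, that is a sign extended audio description or a descriptive transcript may be needed.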

Implementing Audio Description

Text description track (WebVTT):

<video controls>
  <source src="tutorial.mp4" type="video/mp4">
  <track kind="captions" src="captions.vtt" srclang="en" label="English">
  <track kind="descriptions"
         src="descriptions.vtt"
         srclang="en"
         label="Audio Descriptions">
</video>

Note: Browser support for description tracks is limited. Browsers generally do not voice them on their own, so a player or script has to. Alternative approaches include:

Separate video version: Provide a version with audio description mixed into the main audio track.

Audio description service: Platforms like YouDescribe allow community-contributed descriptions.

Media Player Accessibility

The player itself must be accessible.

Keyboard Accessibility

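Commonly expected keyboard controls (WCAG requires keyboard operability, not these specific keys; under 2.1.4 Character Key Shortcuts, single-key shortcuts such as M, F, and C should work only while the player has focus, or be remappable or possible to turn off):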

Play/Pause (Space or Enter)
Volume (arrow keys)
Mute (M)
Full screen (F)
Seek (arrow keys or number keys)
Caption toggle (C)
Exit full screen (Escape)

Focus management:

All controls focusable via Tab
Visible focus indicators
Logical focus order

Screen Reader Accessibility

Control labeling:

<button aria-label="Play video">
  <span class="icon-play" aria-hidden="true"></span>
</button>

<button aria-label="Mute audio">
  <span class="icon-volume" aria-hidden="true"></span>
</button>

<button aria-label="Enable captions">
  <span class="icon-cc" aria-hidden="true"></span>
</button>

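State communication: Either change the label as the state changes, or keep the label fixed and expose the state with aria-pressed. Do not combine both on one button, or screen readers announce contradictory information.

<!-- Label changes with state: "Play video" becomes "Pause video" while playing -->
<button aria-label="Pause video">
  <span class="icon-pause" aria-hidden="true"></span>
</button>

<!-- Label stays fixed: aria-pressed reports whether captions are on -->
<button aria-label="Captions" aria-pressed="false">
  <span class="icon-cc" aria-hidden="true"></span>
</button>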

Progress communication: Current playback position should be programmatically available.

Caption Control

Users must be able to:

Toggle captions on/off
Select caption language
Access caption settings (size, color, background) where available

Autoplay Restrictions

WCAG 1.4.2 Audio Control (Level A): Audio that plays automatically for more than 3 seconds must have controls to pause/stop or control volume independently of system volume.

Best practice: Never autoplay video with audio. If video autoplays, mute audio by default.
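The autoplay rule above, together with the captions track shown earlier, can be checked across many pages with a small script. The following is a minimal Python sketch using the standard library's html.parser; the class and function names are illustrative, and the checks are heuristics (for example, a video without audio does not need captions, and captions may be supplied by a script rather than a track element).

```python
from html.parser import HTMLParser


class VideoAudit(HTMLParser):
    """Collect simple findings about <video> elements in an HTML string."""

    def __init__(self) -> None:
        super().__init__()
        self.findings: list[str] = []
        self._in_video = False
        self._has_captions = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "video":
            self._in_video, self._has_captions = True, False
            # Boolean attributes such as autoplay are parsed with a value of None.
            if "autoplay" in attr and "muted" not in attr:
                self.findings.append("video autoplays with audio")
        elif tag == "track" and self._in_video and attr.get("kind") == "captions":
            self._has_captions = True

    def handle_endtag(self, tag):
        if tag == "video":
            if not self._has_captions:
                self.findings.append("video has no captions track")
            self._in_video = False


def audit_videos(html: str) -> list[str]:
    """Return a list of findings; an empty list means no issues were detected."""
    parser = VideoAudit()
    parser.feed(html)
    parser.close()
    return parser.findings
```

A page that passes this check still needs manual testing, since the script cannot tell whether the captions themselves are accurate.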

Platform-Specific Implementation

YouTube embeds:

<iframe
  src="https://www.youtube.com/embed/VIDEO_ID?cc_load_policy=1"
  title="Video title for screen readers"
  allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope"
  allowfullscreen>
</iframe>

Key parameters:

cc_load_policy=1: Shows captions by default
title attribute: Required for iframe accessibility

Vimeo embeds:

<iframe
  src="https://player.vimeo.com/video/VIDEO_ID?texttrack=en"
  title="Video title for screen readers"
  allow="autoplay; fullscreen; picture-in-picture"
  allowfullscreen>
</iframe>
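If embed codes are generated by a template or CMS, the caption parameters shown above can be added in one place so they are never forgotten. The following is a minimal Python sketch; the function names are illustrative, and the only query parameters used are the two that appear in the embeds above (cc_load_policy for YouTube and texttrack for Vimeo).

```python
from typing import Optional
from urllib.parse import quote, urlencode


def youtube_embed_url(video_id: str, captions_on: bool = True) -> str:
    """Build a YouTube embed URL, showing captions by default when requested."""
    base = f"https://www.youtube.com/embed/{quote(video_id, safe='')}"
    params = {"cc_load_policy": 1} if captions_on else {}
    return f"{base}?{urlencode(params)}" if params else base


def vimeo_embed_url(video_id: str, text_track: Optional[str] = "en") -> str:
    """Build a Vimeo player URL, preselecting a text track language if given."""
    base = f"https://player.vimeo.com/video/{quote(video_id, safe='')}"
    params = {"texttrack": text_track} if text_track else {}
    return f"{base}?{urlencode(params)}" if params else base
```

The iframe still needs a descriptive title attribute, which has to come from the content author rather than from code.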

Live Media Accessibility

Live video and audio have specific requirements.

Live Captions (Level AA)

Live events require real-time captioning:

CART (Communication Access Realtime Translation): Human stenographers provide real-time captioning with 98%+ accuracy.

Automatic Speech Recognition (ASR): AI-powered real-time captions. Accuracy varies (85-95%). Requires clean audio input.

Live caption platforms:

Zoom built-in captions
Google Meet captions
Microsoft Teams captions
StreamText
1CapApp

Live Audio Description

Not required by WCAG but beneficial for events with significant visual content (presentations, demonstrations).

Live Transcript (Level AAA)

Real-time text alternative for live audio-only content.

Common Media Accessibility Failures

Avoid these frequently encountered issues.

Auto-Generated Captions Without Editing

YouTube's automatic captions are a starting point, not a solution. They often fail on:

Technical terminology
Accents
Multiple speakers
Background noise
Proper nouns

Always edit auto-generated captions for accuracy.

Captions Missing Non-Speech Audio

Captions that only capture dialogue miss:

Sound effects important to understanding
Musical cues
Background sounds that convey information
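A quick way to spot dialogue-only caption files is to look for the bracketed annotations and music notes described earlier in this article. The following is a minimal Python sketch, assuming the bracket and ♪ conventions used in this article's examples; the function name is illustrative, and the known speaker names must be passed in because a speaker label such as [Sarah] looks the same as a sound effect.

```python
import re

# Captures the text inside square brackets, e.g. "door slams" from "[door slams]".
_BRACKET = re.compile(r"\[([^\]]+)\]")


def non_speech_cues(cue_texts: list[str],
                    speakers: frozenset[str] = frozenset()) -> list[str]:
    """Return the cues that contain a sound, music, or tone annotation."""
    found = []
    for text in cue_texts:
        labels = _BRACKET.findall(text)
        # A bracketed label counts as non-speech unless it is a known speaker name.
        if "♪" in text or any(label not in speakers for label in labels):
            found.append(text)
    return found
```

If a long video returns an empty list, the captions probably cover dialogue only and deserve a manual review for missing sound effects and music.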

No Caption Controls

Embedded videos without visible caption toggle buttons leave users unable to enable captions.

Inaccessible Video Players

Custom video players that:

Lack keyboard controls
Have no focus indicators
Use unlabeled icon buttons
Hide controls without keyboard access

Autoplay With Audio

Videos that autoplay with audio:

Startle users
Conflict with screen readers
Violate WCAG 1.4.2 if the audio lasts more than 3 seconds and there is no pause, stop, or volume control

No Transcript Provided

Captions help real-time viewing, but transcripts enable:

Full-text search
Reading at own pace
Copy/paste content
Offline access
SEO benefits

Captioning Workflows

Establish efficient processes for captioning video content.

DIY Captioning

Process:

Generate initial transcript (auto-transcription or manual)
Edit for accuracy
Add timing/synchronization
Include non-speech audio
Export to caption format
Test with video

Tools:

YouTube Studio (free, auto-generates starting point)
Descript (AI transcription with editing)
Aegisub (free, open-source caption editor)
Subtitle Edit (free, Windows)

Professional Captioning Services

For high accuracy or high volume:

Rev ($1.50+/minute)
3Play Media (enterprise)
Verbit (AI + human)
CaptionSync

Professional services achieve 99%+ accuracy and handle technical content reliably.

Caption Quality Assurance

Checklist:

[ ] Accuracy: Compare to audio, verify technical terms
[ ] Synchronization: Captions match audio timing
[ ] Speaker identification: Multiple speakers distinguished
[ ] Sound effects: Non-speech audio included
[ ] Readability: Line length and duration appropriate
[ ] Completeness: All meaningful audio captioned

Testing Media Accessibility

Verify media accessibility through systematic testing.

Caption Testing

Play video with captions enabled
Verify all dialogue is captured
Check speaker identification accuracy
Confirm sound effects are noted
Verify synchronization
Check formatting/readability

Player Accessibility Testing

Navigate to video using keyboard only
Test all controls via keyboard (play, pause, volume, seek, fullscreen, captions)
Verify focus indicators on all controls
Test with screen reader
Verify controls are properly labeled

Transcript Testing

Verify transcript exists and is linked
Compare transcript to audio/video content
Confirm visual descriptions are included (for video)
Test transcript searchability

Taking Action

Media accessibility requires investment in captioning, transcripts, and audio descriptions—but the legal requirements are clear, and the benefits extend to all users.

Start by auditing existing video content for captions and transcripts. Establish captioning workflows for new content. Verify media player accessibility across your platform.

Schedule a TestParty demo and get a 14-day compliance implementation plan.
