How to Add Captions to Videos: WCAG Media Accessibility Guide
Video captions are essential for deaf and hard-of-hearing users, but they benefit everyone: people in noisy environments, non-native speakers, and those who prefer reading. WCAG requires captions for prerecorded video with audio, and failing to provide them exposes organizations to legal risk while excluding millions of potential viewers.
This guide covers creating captions, choosing formats, and implementing accessible video players.
Understanding Caption Requirements
WCAG Requirements
1.2.2 Captions (Prerecorded) - Level A: Captions are required for all prerecorded audio content in synchronized media.
1.2.4 Captions (Live) - Level AA: Captions are required for all live audio content in synchronized media.
Caption vs Subtitle
Captions:
- Designed for deaf/hard-of-hearing viewers
- Include all audio information (dialogue + sounds)
- Indicate speaker changes
- Describe music and sound effects
- Same language as audio
Subtitles:
- Translation of dialogue to another language
- Assume viewer can hear other audio
- Typically dialogue only
For accessibility, you need captions, not just subtitles.
What Captions Must Include
Dialogue:
- All spoken words
- Speaker identification when not obvious
- Tone indicators when relevant (whispered, shouted)
Sound effects:
- [Door slams]
- [Phone ringing]
- [Footsteps approaching]
Music:
- [♪ Upbeat jazz music ♪]
- [♪ "Happy Birthday" plays ♪]
- [Dramatic music intensifies]
Other audio:
- [Applause]
- [Laughter]
- [Silence]
Creating Captions
Method 1: Manual Transcription
Most accurate but time-consuming:
Step 1: Transcribe audio
- Listen to video in short segments
- Type all spoken words exactly
- Note speaker changes
- Include non-speech audio cues
Step 2: Add timing
- Use caption editing software
- Set start/end times for each caption
- Keep captions on screen 1-6 seconds
- Sync with speaker's pace
Step 3: Format captions
- Maximum 2 lines per caption
- Keep each line to roughly 32-42 characters (the exact limit depends on the style guide and player)
- Break at natural pauses
- Don't split sentences awkwardly
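The line-length and line-count limits from Step 3 can be roughed out in code. The following is a minimal sketch, assuming a 42-character limit (the upper end of the range above) and a hypothetical helper name, `split_into_captions`. It only breaks at word boundaries, so the result still needs a human pass to move breaks to natural pauses.

```python
import textwrap

MAX_CHARS_PER_LINE = 42      # assumption: upper end of the 32-42 range
MAX_LINES_PER_CAPTION = 2


def split_into_captions(text, max_chars=MAX_CHARS_PER_LINE,
                        max_lines=MAX_LINES_PER_CAPTION):
    """Wrap text at word boundaries and group the lines into captions."""
    # break_long_words=False keeps words intact, so a single word longer
    # than max_chars would produce an over-length line.
    lines = textwrap.wrap(text, width=max_chars, break_long_words=False)
    return [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]


captions = split_into_captions(
    "Today we're covering caption creation for WCAG compliance, "
    "starting with the basics of timing and formatting."
)
for caption in captions:
    print("\n".join(caption))
    print()
```

Each inner list is one caption of at most two lines; timing still has to be assigned separately.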
Method 2: Auto-Generated + Edit
Faster but requires thorough review:
YouTube auto-captions:
- Upload video to YouTube
- Wait for auto-caption generation
- Open YouTube Studio > Subtitles
- Edit auto-generated captions
- Fix errors, add sounds, verify timing
- Download corrected file
Speech-to-text tools:
- Descript
- Otter.ai
- Rev (AI transcription)
- Adobe Premiere Pro
Important: Always review and edit auto-captions. They often get wrong or leave out:
- Technical terms
- Proper nouns
- Sound effects
- Speaker identification
- Accented speech, where accuracy tends to drop
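One way to speed up the review of technical terms and proper nouns is a glossary pass over the auto-generated text before manual editing. This is a minimal sketch: the `GLOSSARY` entries and the `apply_glossary` helper are hypothetical, and the misrecognized forms are illustrative rather than taken from any real tool's output.

```python
import re

# Hypothetical glossary: misrecognized form -> correct form.
GLOSSARY = {
    "wick ag": "WCAG",
    "web vtt": "WebVTT",
    "test party": "TestParty",
}


def apply_glossary(text, glossary=GLOSSARY):
    """Replace known misrecognitions, matching whole words case-insensitively."""
    for wrong, right in glossary.items():
        pattern = re.compile(r"\b" + re.escape(wrong) + r"\b", re.IGNORECASE)
        # A function replacement avoids backslash handling in the new text.
        text = pattern.sub(lambda match, right=right: right, text)
    return text


print(apply_glossary("Today we cover wick ag and Web VTT files."))
```

A pass like this only fixes errors you already know about; it does not replace reading the captions against the audio.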
Method 3: Professional Services
Most reliable for quality:
Services:
- Rev.com (human transcription)
- 3Play Media
- Verbit
- CaptionMax
When to use professional:
- Legal/compliance requirements
- High-visibility content
- Complex audio (multiple speakers, technical content)
- Limited time for review
Caption File Formats
WebVTT (Web Video Text Tracks)
Preferred for web video:
WEBVTT

00:00:01.000 --> 00:00:04.000
Welcome to our accessibility tutorial.

00:00:04.500 --> 00:00:08.000
Today we're covering caption creation
for WCAG compliance.

00:00:09.000 --> 00:00:12.000
[♪ Upbeat intro music ♪]

00:00:13.000 --> 00:00:16.500
SPEAKER 1: Let's start with
the basics of captioning.

00:00:17.000 --> 00:00:21.000
SPEAKER 2: First, you need to
understand the requirements.
Features:
- Native HTML5 support
- Supports positioning and styling
- Widely supported by video players
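If your captions live in a spreadsheet or transcript tool, a WebVTT file like the one above can be generated from timed text. The sketch below assumes cues are available as `(start_seconds, end_seconds, text)` tuples; `format_timestamp` and `build_webvtt` are hypothetical helper names.

```python
def format_timestamp(seconds):
    """Format seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_webvtt(cues):
    """Build WebVTT text from (start_seconds, end_seconds, text) tuples."""
    blocks = ["WEBVTT"]
    for start, end, text in cues:
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{timing}\n{text}")
    # A blank line separates the header and each cue.
    return "\n\n".join(blocks) + "\n"


print(build_webvtt([
    (1.0, 4.0, "Welcome to our accessibility tutorial."),
    (4.5, 8.0, "Today we're covering caption creation\nfor WCAG compliance."),
]))
```

The blank line after the `WEBVTT` header and between cues is part of the format, so it is worth generating files rather than assembling them by hand.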
SRT (SubRip)
Simple, widely compatible:
1
00:00:01,000 --> 00:00:04,000
Welcome to our accessibility tutorial.

2
00:00:04,500 --> 00:00:08,000
Today we're covering caption creation
for WCAG compliance.

3
00:00:09,000 --> 00:00:12,000
[Upbeat intro music]

4
00:00:13,000 --> 00:00:16,500
SPEAKER 1: Let's start with
the basics of captioning.
Features:
- Most widely supported
- Easy to edit manually
- Convertible to other formats
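Because SRT and WebVTT differ mainly in the header, the cue counters, and the millisecond separator, a basic conversion is short. This is a minimal sketch under the assumption that the SRT file is well-formed and uses no SRT-specific styling tags; `srt_to_webvtt` is a hypothetical helper name.

```python
import re

SRT_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_webvtt(srt_text):
    """Convert SRT text to WebVTT: add the header, drop counters, fix timestamps."""
    output = ["WEBVTT", ""]
    normalized = srt_text.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        # Drop the numeric counter that precedes each SRT timing line.
        if lines and lines[0].strip().isdigit():
            lines = lines[1:]
        if not lines:
            continue
        # SRT uses a comma before milliseconds; WebVTT uses a period.
        lines[0] = SRT_TIMING.sub(r"\1.\2 --> \3.\4", lines[0])
        output.extend(lines)
        output.append("")
    return "\n".join(output)


sample_srt = """1
00:00:01,000 --> 00:00:04,000
Welcome to our accessibility tutorial.

2
00:00:04,500 --> 00:00:08,000
Today we're covering caption creation
for WCAG compliance.
"""
print(srt_to_webvtt(sample_srt))
```

Dropping the counters is a design choice: WebVTT allows optional cue identifiers, so they could also be kept.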
Choosing a Format
| Platform | Recommended Format |
|--------------|--------------------|
| HTML5 video | WebVTT |
| YouTube | SRT or WebVTT |
| Vimeo | SRT or WebVTT |
| Social media | SRT |
| Broadcast | SCC or TTML |
Implementing Captions in HTML5
Basic Implementation
<video controls>
  <source src="video.mp4" type="video/mp4">
  <!-- Default English captions -->
  <track kind="captions"
         src="captions-en.vtt"
         srclang="en"
         label="English"
         default>
  <!-- Spanish captions -->
  <track kind="captions"
         src="captions-es.vtt"
         srclang="es"
         label="Español">
  <!-- Fallback text -->
  Your browser does not support the video element.
</video>
Track Element Attributes
| Attribute | Purpose |
|-----------|-----------------------------------------------------------------|
| `kind` | Type: captions, subtitles, descriptions, chapters, or metadata |
| `src` | Path to caption file |
| `srclang` | Language code (en, es, fr) |
| `label` | Display name in player menu |
| `default` | Auto-enable this track |
Caption Kinds
<!-- Captions (for deaf/HoH) -->
<track kind="captions" src="captions.vtt">
<!-- Subtitles (translations) -->
<track kind="subtitles" src="subtitles-fr.vtt">
<!-- Descriptions (for blind users) -->
<track kind="descriptions" src="descriptions.vtt">
<!-- Chapters (navigation) -->
<track kind="chapters" src="chapters.vtt">
Styling Captions
CSS for WebVTT
/* Style caption cues */
::cue {
  background-color: rgba(0, 0, 0, 0.8);
  color: white;
  font-size: 1.2em;
  font-family: Arial, sans-serif;
}
/* Style specific voice/class */
::cue(v[voice="Speaker 1"]) {
  color: #FFD700;
}
::cue(.important) {
  font-weight: bold;
}
Inline WebVTT Styling
WEBVTT

STYLE
::cue {
  background: rgba(0,0,0,0.8);
  color: white;
}
::cue(b) {
  color: yellow;
}

00:00:01.000 --> 00:00:04.000
Welcome to <b>important content</b>.
Positioning
00:00:01.000 --> 00:00:04.000 line:0 position:50% align:center
This caption appears at the top center.

00:00:05.000 --> 00:00:08.000 line:-1 position:10% align:left
This caption appears at the bottom left.
Platform-Specific Implementation
YouTube
Upload captions:
- Go to YouTube Studio
- Select video > Subtitles
- Click "Add language" > select language
- Click "Add" under Subtitles
- Upload file or type manually
- Review timing and accuracy
- Publish
Enable auto-captions (edit carefully):
- Wait for auto-generation (may take hours)
- Click "Duplicate and edit"
- Fix all errors
- Publish corrected version
Vimeo
Upload captions:
- Go to video settings
- Distribution > Subtitles
- Click "+" to add file
- Upload SRT or WebVTT
- Select language
- Save
Wistia
Upload captions:
- Open video in Wistia
- Click "Captions" tab
- Upload SRT file
- Or request Wistia transcription service
Social Media
Facebook:
- Auto-captions available (edit for accuracy)
- Upload SRT files
Instagram/TikTok:
- Burn in captions (hardcoded)
- Use apps like Kapwing, Clipomatic
LinkedIn:
- Upload SRT file with video
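For platforms that need burned-in captions, ffmpeg's `subtitles` filter is one alternative to the apps listed above. The sketch below only builds the command. It assumes an ffmpeg build with libass support and simple file names without characters that need escaping in filter arguments; `build_burn_in_command` and the file names are hypothetical.

```python
def build_burn_in_command(video_path, srt_path, output_path):
    """Build an ffmpeg command that renders SRT captions into the video frames."""
    return [
        "ffmpeg",
        "-i", video_path,
        # The subtitles filter draws the caption text onto each frame.
        "-vf", f"subtitles={srt_path}",
        # Audio is copied unchanged; only the video is re-encoded.
        "-c:a", "copy",
        output_path,
    ]


command = build_burn_in_command("video.mp4", "captions.srt", "video-captioned.mp4")
print(" ".join(command))
# To run it (requires ffmpeg on PATH):
#   import subprocess; subprocess.run(command, check=True)
```

Burned-in captions cannot be turned off or restyled by the viewer, so keep the original SRT file for platforms that accept uploads.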
Caption Quality Standards
Timing Guidelines
Minimum duration: 1 second
Maximum duration: 6 seconds
Reading speed: 3 words per second (180 wpm)
Gap between captions: 0.2 seconds minimum
Formatting Standards
Line breaks:
Bad:
"I think that we should go to the store
because"
Good:
"I think that we should go
to the store because"
Speaker identification:
JOHN: Did you finish the report?
MARY: Not yet, I'm still working on it.
[Both speaking over each other]
Sound descriptions:
[Door creaks open]
[Dramatic orchestral music]
[Crowd cheering]
[Silence]
Common Mistakes to Avoid
Don't:
- Paraphrase or summarize dialogue
- Censor profanity without indication
- Leave out important sounds
- Use ALL CAPS for entire captions
- Split words across lines
Do:
- Transcribe verbatim
- Indicate [Expletive] if censoring
- Include all relevant audio
- Use caps only for emphasis
- Break at natural phrase boundaries
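The timing guidelines in this section (1-6 second duration, 3 words per second, 0.2 second gap) lend themselves to an automated check. This is a minimal sketch, assuming cues are already parsed into `(start_seconds, end_seconds, text)` tuples sorted by start time; `check_timing` is a hypothetical helper, and counting whitespace-separated tokens is only an approximation of word count.

```python
MIN_DURATION = 1.0          # seconds
MAX_DURATION = 6.0          # seconds
MAX_WORDS_PER_SECOND = 3.0  # 180 wpm
MIN_GAP = 0.2               # seconds between consecutive captions


def check_timing(cues):
    """Return warnings for (start, end, text) cues sorted by start time."""
    warnings = []
    previous_end = None
    for index, (start, end, text) in enumerate(cues, start=1):
        duration = end - start
        if duration < MIN_DURATION:
            warnings.append(f"cue {index}: shorter than {MIN_DURATION}s")
        if duration > MAX_DURATION:
            warnings.append(f"cue {index}: longer than {MAX_DURATION}s")
        if duration > 0 and len(text.split()) / duration > MAX_WORDS_PER_SECOND:
            warnings.append(f"cue {index}: reading speed above "
                            f"{MAX_WORDS_PER_SECOND} words/s")
        if previous_end is not None and start - previous_end < MIN_GAP:
            warnings.append(f"cue {index}: gap to previous cue under {MIN_GAP}s")
        previous_end = end
    return warnings


for warning in check_timing([
    (1.0, 4.0, "Welcome to our accessibility tutorial."),
    (4.1, 4.6, "This cue is far too short to read."),
]):
    print(warning)
```

A check like this flags candidates for review; whether a flagged cue is actually a problem still depends on the content.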
Live Captioning
CART Services
Communication Access Realtime Translation:
- Human captioner types in real-time
- High accuracy
- Used for live events, webinars
Providers:
- StreamText
- CaptionFirst
- Alternative Communication Services
Automatic Live Captioning
Tools:
- Google Meet (built-in)
- Zoom (built-in or Otter.ai)
- Microsoft Teams (built-in)
- YouTube Live (auto-captions)
Limitations:
- Lower accuracy than human CART
- Miss technical terms
- Limited or no speaker identification, depending on the tool
- Delay in appearance
Testing Captions
Verification Checklist
□ All dialogue included
□ Speaker identification when needed
□ Sound effects described
□ Music described
□ Timing synced with audio
□ Readable duration (not too fast)
□ No spelling/grammar errors
□ Technical terms correct
□ Proper nouns spelled correctly
□ Works in video player
User Testing
Test with actual users:
- Deaf/hard-of-hearing viewers
- Non-native speakers
- Users in noisy environments
Gather feedback on:
- Readability
- Timing/pace
- Completeness
- Accuracy
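Alongside manual and user testing, a simple scan can catch pages where a caption track was never attached. The sketch below uses Python's standard `html.parser` to list `<video>` elements with no `<track kind="captions">`. `CaptionTrackChecker` and `find_videos_missing_captions` are hypothetical names, and the scan only sees static HTML: it will not detect captions added by JavaScript players or third-party embeds, and it says nothing about caption quality.

```python
from html.parser import HTMLParser


class CaptionTrackChecker(HTMLParser):
    """Collect <video> elements that have no <track kind="captions"> child."""

    def __init__(self):
        super().__init__()
        self.missing = []        # (line, column) of videos without captions
        self._open_videos = []   # stack of [position, has_captions]

    def handle_starttag(self, tag, attrs):
        if tag == "video":
            self._open_videos.append([self.getpos(), False])
        elif tag == "track" and self._open_videos:
            # A valueless or absent kind attribute is not a captions track.
            kind = (dict(attrs).get("kind") or "").lower()
            if kind == "captions":
                self._open_videos[-1][1] = True

    def handle_endtag(self, tag):
        if tag == "video" and self._open_videos:
            position, has_captions = self._open_videos.pop()
            if not has_captions:
                self.missing.append(position)


def find_videos_missing_captions(html):
    """Return positions of closed <video> elements lacking a captions track."""
    checker = CaptionTrackChecker()
    checker.feed(html)
    checker.close()
    return checker.missing


page = """
<video controls><source src="a.mp4"><track kind="captions" src="a.vtt"></video>
<video controls><source src="b.mp4"><track kind="subtitles" src="b-fr.vtt"></video>
"""
print(find_videos_missing_captions(page))
```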
Audio Descriptions
When Required
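WCAG 1.2.3 Audio Description or Media Alternative (Prerecorded) - Level A: When video conveys information visually that isn't in the audio, provide either an audio description or a full text alternative.
WCAG 1.2.5 Audio Description (Prerecorded) - Level AA: Audio description is required for all prerecorded video content in synchronized media. Extended audio description, which pauses the video to fit longer descriptions, is a separate Level AAA criterion (1.2.7).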
Implementation
<video controls>
  <source src="video.mp4" type="video/mp4">
  <track kind="captions"
         src="captions.vtt"
         srclang="en"
         label="English Captions">
  <track kind="descriptions"
         src="descriptions.vtt"
         srclang="en"
         label="Audio Descriptions">
</video>
Creating Audio Descriptions
Describe:
- Visual-only actions
- On-screen text
- Scene changes
- Character expressions (when relevant)
Example:
[Video shows woman walking through office]
Audio description: "Sarah walks through the open office,
passing several colleagues at their desks."
Taking Action
Captions are a fundamental accessibility requirement with broad benefits. Start with your highest-traffic videos, establish a captioning workflow, and maintain quality standards. Automated tools speed up creation but always require human review for accuracy.
TestParty identifies videos missing captions across your website as part of accessibility monitoring.
Schedule a TestParty demo and get a 14-day compliance implementation plan.