Captions are crucial for people who are Deaf, hard of hearing, or have auditory processing disorders. All videos at UC Berkeley must have accurate and edited captioning, regardless of the platform the video is hosted on.
How do I make sure my captions are accurate?
Manually checking a video’s captions is the best way to ensure that they are fully accurate. Many video hosting platforms, including YouTube, provide automatic closed captioning. Automatic captions are a great place to start when building your captions, but they are not enough to ensure that the captions are accurate and fully accessible.
Do:
- Use correct spelling, capitalization, and punctuation.
- Make sure that your captions accurately reflect the audio content.
- Identify your speakers every time a new person speaks, or when a speaker isn’t shown speaking. Display the speaker’s name in all caps with a colon.
Example: OBI-WAN: These aren't the droids you’re looking for.
- Describe non-verbal audio. This includes things like music or someone laughing. Include non-verbal audio in brackets. Only include background sounds if they’re important to the context or meaning of the video.
Example: [music], [laughter], [audience applause], [dog barking]
- Sync captions with the audio. Captions must appear when the audio is heard.
- Use no more than two lines of text per caption frame. Three or more lines are difficult to read and understand.
- Write your captions in the language of your video’s audio. For example, if your video is in English, write your captions in English. If your video is in Spanish, write your captions in Spanish.
- Keep captions on screen long enough for people to read them, but not too long. Aim for 2-7 seconds per caption frame.
Don’t:
- Don't rely entirely on auto-captioning. Auto-captions are a great way to get started captioning your video, but they will not be accurate without additional editing.
- Don't write captions in all caps. Only use all caps for YELLING or speaker identification.
- Don’t use more than 42 characters per line in a caption frame. Longer lines are hard to read. Even better, aim for a maximum of 32 characters per line.
- Don't caption background music if it will interfere with the dialog captions.
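The numeric guidelines above (at most two lines per frame, at most 42 characters per line with 32 preferred, and 2-7 seconds on screen) can be checked automatically. The following is a minimal sketch, not an official tool: the `CaptionFrame` class and `check_frame` function are hypothetical names introduced for illustration, and it assumes you already have each frame's start time, end time, and text.

```python
from dataclasses import dataclass
from typing import List

MAX_LINES = 2          # at most two lines per caption frame
MAX_CHARS = 42         # hard limit per line
PREFERRED_CHARS = 32   # preferred limit per line
MIN_SECONDS = 2.0      # shortest recommended time on screen
MAX_SECONDS = 7.0      # longest recommended time on screen


@dataclass
class CaptionFrame:
    """One caption frame (hypothetical structure for this sketch)."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str     # caption text, lines separated by "\n"


def check_frame(frame: CaptionFrame) -> List[str]:
    """Return a list of guideline problems found in one caption frame."""
    problems = []
    lines = frame.text.split("\n")
    if len(lines) > MAX_LINES:
        problems.append(f"{len(lines)} lines (limit {MAX_LINES})")
    for line in lines:
        if len(line) > MAX_CHARS:
            problems.append(f"line has {len(line)} characters (limit {MAX_CHARS})")
        elif len(line) > PREFERRED_CHARS:
            problems.append(f"line has {len(line)} characters (aim for {PREFERRED_CHARS})")
    duration = frame.end - frame.start
    if not MIN_SECONDS <= duration <= MAX_SECONDS:
        problems.append(f"on screen for {duration:.1f}s (aim for 2-7s)")
    return problems


frame = CaptionFrame(0.0, 3.0, "OBI-WAN: These aren't the droids\nyou're looking for.")
print(check_frame(frame))  # an empty list means no problems were found
```

A check like this only covers layout and timing. Accuracy, spelling, speaker identification, and sound descriptions still need a manual review.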
Best practices for captioning various situations
When in doubt, write what is easiest to read and understand. Two shorter lines are more readable than one very long line. Avoid putting the last word of a sentence on the next caption frame.
Animal Sounds
Sounds can be described and put in brackets, or spelled out (onomatopoeia), or both.
Example 1: [dog barking]
Example 2: Woof, woof!
Example 3: [dog barking] Woof, woof!
Music
Describe the background music style if needed, in brackets. If the background music plays for a long time, stop the caption frame after 4-5 seconds. If you include captions for song lyrics, add a musical note (♪) to the beginning and end of each line.
Example: [ethereal classical music]
Example: ♪ Take another little piece of my heart now, baby ♪
Punctuation
Use punctuation to indicate the speed or pace of speech or a sound effect. You can use an ellipsis for extended pauses, commas for brief breaks, and dashes for quick repetition.
Example: Oh... my... g-g-god. Oh, Em, Gee.
Lists
When listing out items in a series, use the Oxford comma.
Example: One, two, three, and four.
Speaker Tone
It may sometimes be appropriate to add a description of the speaker’s tone in brackets.
Example: [whisper], [aggravated]
Speaker Identification (Speaker ID)
If the speaker’s name is unknown, some alternatives are: STUDENT, AUDIENCE MEMBER, PROFESSOR. If there are multiple unknown speakers, use numbers: STUDENT #1, STUDENT #2.
Example:
STUDENT #1: Hello.
STUDENT #2: Good morning.
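If you generate captions from a script or transcript, the speaker ID convention (name or role in all caps, followed by a colon, with numbers for multiple unknown speakers) can be applied consistently in code. This is a small sketch only; `speaker_label` and `caption` are hypothetical helper names, and the default role `SPEAKER` is an assumption rather than part of the guidance.

```python
def speaker_label(name=None, role="SPEAKER", number=None):
    """Build a speaker ID such as 'OBI-WAN:' or 'STUDENT #1:'."""
    if name:
        # Known speaker: the name in all caps with a colon.
        return f"{name.upper()}:"
    if number is not None:
        # Several unknown speakers with the same role: number them.
        return f"{role.upper()} #{number}:"
    # A single unknown speaker: use the role alone.
    return f"{role.upper()}:"


def caption(text, **who):
    """Prefix caption text with the speaker ID."""
    return f"{speaker_label(**who)} {text}"


print(caption("Hello.", role="student", number=1))
print(caption("Good morning.", role="student", number=2))
```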
Math
If transcribing math content, use only numerals. For all other topics, write out the numbers one through ten (one, two), use numerals for numbers above ten (11, 53, 978), and use a combination for large, rounded numbers (3 million).
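The number rule can be expressed as a short function. The sketch below is illustrative only: `caption_number` is a hypothetical name, it treats "large, rounded" as an exact multiple of one million or one billion (an assumption, since the guidance does not define the term), and it leaves any value the guidance does not cover, such as zero, as a numeral.

```python
# Words for the numbers that are written out in non-math captions.
WORDS = {
    1: "one", 2: "two", 3: "three", 4: "four", 5: "five",
    6: "six", 7: "seven", 8: "eight", 9: "nine", 10: "ten",
}

# Assumption: only exact multiples of these sizes count as "large, rounded".
LARGE_UNITS = ((1_000_000_000, "billion"), (1_000_000, "million"))


def caption_number(n: int, math_content: bool = False) -> str:
    """Format a whole number for a caption."""
    if math_content:
        return str(n)               # math content: numerals only
    if n in WORDS:
        return WORDS[n]             # one through ten: written out
    for size, word in LARGE_UNITS:
        if n >= size and n % size == 0:
            return f"{n // size} {word}"   # large, rounded: combination
    return str(n)                   # everything else: numerals


print(caption_number(2))                     # two
print(caption_number(2, math_content=True))  # 2
print(caption_number(53))                    # 53
print(caption_number(3_000_000))             # 3 million
```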
FAQs
Who uses captions?
- People who are Deaf, or Deafblind
- People who are hard of hearing
- People with auditory processing disorders
- People who are visual learners
- People who want to improve retention of information
- People in quiet environments with the sound off
- People in noisy environments
- People who are learning to speak English
What is the difference between open (or burned-in) captions and closed captions?
Closed captions can be toggled on or off and are added to a video as a separate file. Closed captions can be edited later if errors are found.
Open captions are burned into the video, cannot be changed by the end users, and are much more difficult to update or change if errors are found.
My video already has open (or burned-in) captions. Is this enough?
Open captions may be compliant, but we recommend adding closed captions as well to improve the accessibility of your content. Closed captions give your end users more flexibility in turning the captions on and off, adjusting the text size and style, and moving the placement of the captions.
My video already has auto-captions. Why isn't that enough?
Auto-generated captions are not accurate enough to be considered accessible. They are a great place to start when you're building your captions, but additional editing is required.
You will need to ensure that your captions are fully accurate, identify speakers, use correct spelling, punctuation, and capitalization, and include all important background information.
Learn more about how to edit your auto-captions on YouTube.
What language should my captions be in?
Captions should be written in the same language as the audio of the video. For example, if your video is in English, write your captions in English. If your video is in Spanish, write your captions in Spanish.
Learn more about how to caption a multi-lingual video.
What is the difference between captions and subtitles?
Captions are a text representation of the audio content. Captions are presented in the same language as the audio content. For example, a lecture in English will be captioned in English.
Subtitles are a text translation of the audio content. Subtitles are translated into the language used by the intended audience. For example, a lecture in English might have Spanish subtitles for a Spanish-speaking audience.
What if my video has no sound?
All videos are required to have captions. Even if your video has no sound, you will still need to include a caption file that says [no sound]. This lets users know that no sound is available for the video.
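As an illustration, a caption file for a silent video can be very small. The sketch below writes one in WebVTT, a common caption file format; the choice of WebVTT, the function names, and the five-second cue length are assumptions for this example, so check which formats and timing your video platform expects.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def no_sound_vtt(duration: float = 5.0) -> str:
    """Return the text of a WebVTT file with a single [no sound] cue."""
    cue_timing = f"{format_timestamp(0)} --> {format_timestamp(duration)}"
    return f"WEBVTT\n\n{cue_timing}\n[no sound]\n"


# Save the caption file next to the video (hypothetical file name).
with open("silent-video.vtt", "w", encoding="utf-8") as f:
    f.write(no_sound_vtt())
```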
What if my video only has music?
All videos are required to have captions, even if there is only music playing. If this is the case, describe the music to your audience.
Examples: [Music fades in], [Loud drumming]
How come when I watch the news, the captions are delayed?
Live media has different requirements. When news shows are live, there’s no way for the transcriptionist to type fast enough for the text to be synchronized with the speech.
Should I include every "um" in the captions?
Usually not. Natural speech tends to be messier than scripted speech and may be difficult to follow if all filler words and false starts are transcribed. A caption style called "clean read" removes most of the filler words to improve comprehension. When captions include every um, ah, and you know, this is called "full verbatim." This approach is only used for scripted speech (plays, TV) and court reporting.
Who can I contact if I have more questions?
You can always email the Digital Accessibility team at improving-accessibility@berkeley.edu if you have any questions about your video captions.
More resources
- Captioning Key by DCMP (Described and Captioned Media Program)
- WCAG 2.0 success criteria for prerecorded video
- 3Play: Captioning Sound Effects in TV and Movies
- How to edit auto captions in YouTube