Working with HTML5 video: Adding captions and subtitles

In this content kit we'll teach the basics of adding captions and subtitles (and other text track types) to HTML5 <video> using the <track> element, and look at support possibilities in legacy browsers. Learners will gain new knowledge of HTML and JavaScript related to this area, and how subtitles and captions are defined and used on the Web.

Technology level: HTML5 <video> is well-supported across modern browsers (going back to Internet Explorer 9); <track> is also fairly well supported (all modern browsers, going back to Internet Explorer 10).

Please file any issues you find against this content kit's GitHub repo.

Versioning information

Content kit v0.9: last significant update 13th April 2015. This content kit is published under the Mozilla Public License, version 2.0.

What should the presenter have?

A Linux, Windows or Mac OSX computer.
1-3 modern browsers installed. This could include Firefox, Chrome, Opera, Safari, or Internet Explorer 10+.
A decent text editor installed. Examples include Sublime Text, gedit, or Notepad++.

It would be helpful to have a good level of HTML knowledge — including <video> and <track> elements, and HTML5 element fallbacks — and JavaScript/DOM knowledge.

What should the audience have?

A Linux, Windows or Mac OSX computer would be beneficial if the audience is going to code along with the presenter (this is optional).
- 1-3 modern browsers installed. This could include Firefox, Chrome, Opera, Safari, or Internet Explorer 10+
- A decent text editor installed. Examples include Sublime Text, gedit, or Notepad++.
It would be helpful to have a reasonable understanding of HTML — including the basics of HTML <video> — and JavaScript/DOM knowledge.

Learning objectives

After you present or teach this content kit, your audience will:

Know how to implement captions and subtitles in HTML videos using the <track> element.
Understand WebVTT syntax (including styling) and be able to write their own .vtt files.
Be able to implement custom UIs for controlling text track display using JavaScript and relevant API features.
Understand what other types of text track are used for.

Links to resources

Project resources overview

Source code
Live demo
Issue tracker/feedback: File issues against the content kit's GitHub repo
Slides for presenting on the subject

Supporting docs/references

Reference: The <video> element
Reference: The <track> element
Reference: The WebVTT format
Article: Adding captions and subtitles to HTML5 video
Tutorial for building up demo included in content kit

Presentation setup

Link to slides
Links to videos of the presentation materials being used (include this later, once content kit has been delivered/tested)

Presenting about HTML5 video subtitles and captions is fairly simple — you just need the slides and demo materials, downloaded locally if possible so network connectivity is not a problem. Just running the presentation without a code walkthrough or workshop should take about 45 minutes.

Demo setup

Link to demo
Demo script:
- In its simplest form, the final demo can be run simply by double clicking the index.html file in the source code's final directory, or navigating to the live demo.
- Show the demo from step 2 of the tutorial, illustrating how it will work across different browsers due to different sources and fallback mechanisms.
- Show in particular how the default browser implementations aren't very good, so JavaScript is needed to reliably display captions/subtitles
- Explain the purpose of different types of text track
- Explain the syntax of WebVTT files, including their (somewhat limited) styling information.
- Explore how JavaScript can be used to create a custom UI — refer to the demo from step 3 of the tutorial
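
The custom UI step above comes down to switching the mode of each text track. The following is a minimal sketch rather than the demo's actual code: the function name showTrackByLanguage is hypothetical, and it assumes it is passed a video element (or any object exposing a textTracks list whose items have language and mode properties).

```javascript
// Sketch: show the first text track in a given language and hide the rest.
// `video` is assumed to be an HTMLVideoElement, or any object with a
// `textTracks` list whose items expose `language` and `mode`.
function showTrackByLanguage(video, language) {
  let shown = null;
  // TextTrackList is array-like, so a plain index loop is enough.
  for (let i = 0; i < video.textTracks.length; i++) {
    const track = video.textTracks[i];
    if (shown === null && track.language === language) {
      track.mode = 'showing'; // render this track's cues over the video
      shown = track;
    } else {
      track.mode = 'hidden'; // keep the track loaded but do not display it
    }
  }
  return shown; // null when no track matched; every track is then hidden
}

// Browser usage (the element id 'video' is an assumption):
// showTrackByLanguage(document.getElementById('video'), 'de');
```

A subtitles menu can call this from each language button's click handler. Passing a language that matches no track hides all tracks, which works as an "Off" option.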
Troubleshooting:
- If the video doesn't play with the subtitles, you might be using a browser that doesn't support all the technologies (e.g. Internet Explorer 9). Try another browser.
- If the video doesn't play at all and you are trying the online demo, you might be experiencing a slow/bad internet connection. If possible, try an offline demo.
- If you built the demo up using the tutorial, retrace your steps; see if the browser developer console is giving you any useful clues. If in doubt, refer to the published final version of the demo code.
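
When troubleshooting browser support, a quick feature check in the developer console can help narrow things down. This is a rough sketch, not part of the content kit: the function name detectCaptionSupport is hypothetical, and the property checks are common feature-detection heuristics rather than a guarantee that captions will render correctly.

```javascript
// Sketch: report which caption-related features a browser exposes.
// `doc` is assumed to be the page's `document`; it is passed in so the
// function can also be exercised outside a browser.
function detectCaptionSupport(doc) {
  const video = doc.createElement('video');
  const track = doc.createElement('track');
  return {
    // Browsers without <video> create a generic element with no canPlayType().
    video: typeof video.canPlayType === 'function',
    // The text track API lives on the media element.
    textTracks: 'textTracks' in video,
    // A real <track> element exposes properties such as `kind`.
    trackElement: 'kind' in track,
  };
}

// Browser usage: console.log(detectCaptionSupport(document));
```

If textTracks or trackElement comes back false, the browser is likely one of the legacy cases mentioned above.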

Active learning

The slides include a marker — starting with "Code time" — that links through to the relevant code version at each point. These are good places to present the demo!

At these points you should click the link (and get any audience members who are following along on their own computers to do the same), and then pause briefly to allow everyone to play with the code and see what's happening for themselves. The markers also show you where to include each step of the tutorial walkthrough. The tutorial sections include notes to show which slide number the section corresponds to. The source code has a separate directory for each stage of the tutorial/code walkthrough that shows what the code should look like at each stage.

If you want to show a detailed code walkthrough, allow another 15-20 minutes, and follow the steps provided in the tutorial.

If your viewers have computers available and you want them to follow along with the tutorial in a workshop type situation, allow an additional 40 minutes for experimentation and Q&A/troubleshooting. At the end of this session it is a good idea to have a "sharing" session so that the audience can share anything interesting they've created, give each other feedback, and ask questions.

To make this process as seamless as possible, you (and your attendees) should have the slides open in one browser window, the code result and tutorial open in another browser window, and the code open in your text editor. This way you can easily switch between your slides and your coding environment, if/when the time comes to do some more live coding.

Frequently asked questions (FAQs)

Why do we need different sources? Why can't browsers just support the same video formats?: Because of patents and competition. Microsoft and Apple are part patent holders of the H.264 video format contained in the MP4 container format. Google, Mozilla and Opera aren't, so they have direct support for the non-patent-encumbered WebM format instead. You should have at least these two formats available, possibly more if you want to support older browsers. Read Media formats supported by the HTML audio and video elements for exhaustive detail.
Why was the older (better established) SRT format dropped in favour of WebVTT?: Because SRT was only really for subtitles, whereas video text tracks encompass subtitles plus a wide range of other uses. In addition, WebVTT allows you to add rudimentary styling to text tracks.
Why is JavaScript needed to reliably display captions?: Modern browsers implement the HTMLMediaElement textTracks property that provides access to all the text tracks associated with the video, and the associated APIs that allow further manipulation of these tracks. However, default browser styling of text tracks is not very reliable, so you are advised to handle it yourself.
Is there a reliable tool that speeds up subtitle/caption creation and translation?: Yes. There are a few services available, but we'd recommend Amara (Universal Subtitles).
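
On the SRT question above: existing SRT subtitle files are close enough to WebVTT that a basic conversion is short. The sketch below is an illustration only, with a hypothetical function name (srtToVtt). It handles just the two basic differences, the required WEBVTT header and the decimal separator in timestamps, and assumes well-formed SRT input with no SRT-specific formatting extensions.

```javascript
// Sketch: convert SubRip (SRT) text into WebVTT text.
// Only the basics are covered: add the "WEBVTT" header and change the
// millisecond separator on timing lines from a comma to a period.
function srtToVtt(srt) {
  const lines = srt
    .replace(/^\uFEFF/, '')   // drop a leading byte order mark, if present
    .replace(/\r\n?/g, '\n')  // normalize Windows/old Mac line endings
    .trim()
    .split('\n');

  const converted = lines.map((line) =>
    // Only touch timing lines, so commas in cue text are left alone.
    line.includes('-->')
      ? line.replace(/(\d{2}:\d{2}:\d{2}),(\d{3})/g, '$1.$2')
      : line
  );

  // A WebVTT file starts with "WEBVTT" followed by a blank line.
  return 'WEBVTT\n\n' + converted.join('\n') + '\n';
}
```

The numeric counters from the SRT file are kept as they are, since WebVTT allows an optional identifier line before each cue's timing line.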