This is the first post in a four-part series we’ll be posting this week. Check back tomorrow for part two.

With the recent release of Apple’s iOS 8 and the upcoming release of Safari 8 this fall, it’s a good time for an update on the state of video on the web: what’s possible, what’s reliable and what’s totally broken. Web video technology has come a long way since the HTML video element was first proposed in 2007. The fundamental playback features have become more reliable and more broadly supported by browsers over the years, and new technologies are continually being added to match the most advanced interactive capabilities of native applications while taking advantage of the unique networked aspects of the web. But with some browsers on rapid release schedules, varying approaches to standards and conflicting interests of the browser makers, support for these features can be inconsistent and often buggy. It can be difficult to keep track of which features we can rely on, which translates into increasing frustration and costs when building cutting-edge sites that also work on older browsers.

This series of blog posts is an attempt to catalog the advanced features that are working now and to call out some major pitfalls so we can work around them. But let’s start with the basics.

Basic Media Playback

The HTML media elements provide solid support for loading and playback of video and audio, with API access to important information about each element’s state. The most fundamental features are the loading, decoding and playback of media. There are methods and properties for pausing, playing, seeking, volume control and even playback rate for speeding up or slowing down, within reasonable limits. Although browsers optionally provide a basic user interface, the JavaScript API is complete enough for any web developer to build robust custom user controls or to drive playback and seeking automatically.
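To make the API concrete, here is a minimal sketch of driving playback with JavaScript. The element id "player" is hypothetical; any video or audio element works the same way, and the small `formatTime` helper is the kind of thing you would write for a custom control bar.

```javascript
// Render a time in seconds as m:ss for a custom control bar.
function formatTime(totalSeconds) {
  var minutes = Math.floor(totalSeconds / 60);
  var seconds = Math.floor(totalSeconds % 60);
  return minutes + ':' + (seconds < 10 ? '0' : '') + seconds;
}

// Browser-only portion, guarded so the helper above stays testable anywhere.
if (typeof document !== 'undefined') {
  var video = document.getElementById('player'); // hypothetical element id

  video.play();             // starts (or resumes) playback once data is ready
  video.pause();
  video.currentTime = 30;   // seek to 30 seconds in
  video.volume = 0.5;       // 0.0 to 1.0
  video.playbackRate = 1.5; // speed up, within the browser's limits

  console.log(formatTime(video.duration || 0));
}
```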

Unlike some video player APIs that assume the media file is on a local drive, this one assumes the file is accessed over an unreliable network connection, so loading is asynchronous. This is a healthy assumption, because if the file is slow to load, it doesn’t cause the application to block, and the interface can be updated to show the user which parts of the file are ready to play. Even if the browser loads from the local file system or browser cache, the file could still be on a network drive or other slow device. A video or audio file is loaded from a web server in chunks with HTTP byte serving, so only the needed portions of the video file are transferred over the potentially slow network, allowing for quick seeking. The “readyState”, “networkState” and “buffered” properties provide a lot of information about the loading state, along with events like “progress,” “waiting,” “stalled,” etc. that fire when the state is updated. See the MDN references on properties and methods and events for details on how it all works.
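As a sketch of how this looks in practice, the helper below summarizes how much of a file has loaded. It takes plain `[start, end]` pairs so the math is easy to follow and test; in the browser you would read those pairs off the element’s “buffered” TimeRanges inside a “progress” handler. The element id is hypothetical.

```javascript
// Given buffered ranges as [start, end] pairs (in seconds) and the media
// duration, return the fraction of the file that has been loaded.
function fractionBuffered(ranges, duration) {
  if (!duration) {
    return 0; // duration unknown (metadata not loaded yet)
  }
  var total = 0;
  for (var i = 0; i < ranges.length; i++) {
    total += ranges[i][1] - ranges[i][0];
  }
  return total / duration;
}

// Browser-only portion: read the TimeRanges off the element on "progress".
if (typeof document !== 'undefined') {
  var video = document.getElementById('player'); // hypothetical element id
  video.addEventListener('progress', function () {
    var ranges = [];
    for (var i = 0; i < video.buffered.length; i++) {
      ranges.push([video.buffered.start(i), video.buffered.end(i)]);
    }
    console.log(Math.round(fractionBuffered(ranges, video.duration) * 100) + '% buffered');
  }, false);
}
```

Note that “buffered” can contain several disjoint ranges after seeking, which is why the helper sums them rather than looking at a single end point.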

For the most part, these APIs work well everywhere, but there are a few things to look out for. Chrome does not always accurately set the “readyState” value or fire the “canplaythrough” event at the right time, which is an old bug still waiting to be fixed. This is especially true in Chrome for Android, which may report that loading is done well before it actually is. So you may have to check both “readyState” and “buffered”, though the “buffered” property has its own problem: it is broken in Firefox with WebM videos. That bug has been fixed, and the fix should reach a release soon.
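One way to work around the unreliable “canplaythrough” signal is to cross-check it against the buffered ranges, as in this sketch. The helper takes plain `[start, end]` pairs (what you would read off the element, as above) and only reports success when a single range actually spans the whole file; the tolerance value is an assumption, not anything specified by the API.

```javascript
// True when one buffered range covers the whole duration, give or take a
// small tolerance (in seconds) at the end. Use this as a sanity check on
// top of readyState / "canplaythrough", which some browsers report early.
function fullyBuffered(ranges, duration, tolerance) {
  tolerance = tolerance || 0.5; // assumed slack; tune for your content
  for (var i = 0; i < ranges.length; i++) {
    if (ranges[i][0] <= 0 && ranges[i][1] >= duration - tolerance) {
      return true;
    }
  }
  return false;
}
```

In the browser you would call this from a “progress” handler alongside a `readyState` check, and treat the media as truly ready only when both agree.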

Mobile Safari complicates things even further, as it won’t start playing or loading the video until there is a touch event from the user. Until the latest update, it wouldn’t even load the metadata (video duration and dimensions), but that has been fixed. It also blocks any control of the volume with the JavaScript API; video and audio volume can only be controlled by the hardware buttons. Mobile Safari won’t allow playing more than one piece of media at a time, and on the iPhone, a video can only be played at full screen. So layering other HTML elements or running CSS effects while the video plays is impossible.
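The practical consequence of the touch-event restriction is that the first `play()` call has to happen inside a user gesture handler. Here is a minimal sketch of that pattern; the element id is hypothetical, and the small `once` wrapper just makes sure the unlock only runs a single time.

```javascript
// Wrap a function so it can only ever run once.
function once(fn) {
  var called = false;
  return function () {
    if (!called) {
      called = true;
      return fn.apply(this, arguments);
    }
  };
}

// Browser-only portion: start playback from inside a touch handler so
// Mobile Safari treats it as user-initiated.
if (typeof document !== 'undefined') {
  var video = document.getElementById('player'); // hypothetical element id
  document.addEventListener('touchend', once(function () {
    video.play(); // allowed here because we are inside a user gesture
  }), false);
}
```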

Encoding and Decoding Video

Video and audio codec support has been a sticking point, but it is becoming increasingly standardized across browsers. Over the next few months, Firefox will gain support for H.264/MP4 on almost all platforms. This format has the advantage of being the most widely available in both playback and editing software, and it has pretty good quality in most cases. QuickTime has been known to render colors incorrectly, but it’s the only option for playback inside Safari. On the other hand, video exported by QuickTime doesn’t always play well in non-QuickTime players, like VLC, ffmpeg or Firefox and Chrome. For example, I’ve found that if you import H.264 video into Final Cut and export a shorter clip, the first few frames will be corrupted and show parts of the video from before your cut. If you can, it’s best to edit with video as close to the source as possible.

The WebM format, which supports either the VP8 or VP9 codec, works in Firefox, Chrome and Opera. It’s a much easier format to work with, and it’s both free and open, but it’s not as widely supported as H.264/MP4. Mozilla has helped create the high-quality and free Opus codec for audio, and they are working on Daala, which will hopefully surpass all the above codecs. But it will not be available for a while and may not be supported in Internet Explorer and Safari, which are less friendly to free/open software.
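Since no single format works everywhere, sites typically encode each video in both WebM and MP4 and pick at runtime with the element’s `canPlayType` method (browsers answer “probably”, “maybe” or an empty string). The sketch below takes `canPlayType` as an argument so the selection logic is testable outside a browser; in a page you would pass `video.canPlayType.bind(video)`. The file names are hypothetical.

```javascript
// Pick the best source: prefer a "probably" answer, fall back to the
// first "maybe", and return null if nothing is playable.
function pickSource(sources, canPlayType) {
  var fallback = null;
  for (var i = 0; i < sources.length; i++) {
    var answer = canPlayType(sources[i].type);
    if (answer === 'probably') {
      return sources[i].src;
    }
    if (answer === 'maybe' && !fallback) {
      fallback = sources[i].src;
    }
  }
  return fallback;
}

// Hypothetical encodings of the same video, listed in order of preference.
var sources = [
  { src: 'movie.webm', type: 'video/webm; codecs="vp8, vorbis"' },
  { src: 'movie.mp4',  type: 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"' }
];
```

This mirrors what the browser does on its own when you list multiple source elements inside a video element; doing it in JavaScript is useful when you need to know which file was chosen.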

This is enough to achieve or exceed feature parity with older methods for media playback like embedded Flash or QuickTime, plus the additional benefits of standardization and the ability to be styled with style sheets like any other HTML element. These features are mature and widely supported in modern browsers, apart from the bugs and mobile browser quirks I mentioned above.

For a technical look at the state of audio, see Mark Boas’s article, HTML5 Audio – The State of Play. It’s a couple years old but still relevant, and many of the issues he covers for audio also apply to video.


Published by

Brian Chirls is the Digital Technology Fellow at POV, developing digital tools for documentary filmmakers, journalists and other nonfiction media-makers. The position is a first for POV and is funded as part of a $250,000 grant from the John S. and James L. Knight Foundation. Follow him on Twitter @bchirls and GitHub.