This is the second post in a four-part series we’ll be posting this week. Read part one and check back tomorrow for part three.

Beyond simple audio and video playback, there’s a lot we can do to manipulate the content and truly integrate it with the rest of the web. Back when we relied on Flash and QuickTime plugins, video acted as a black box, impenetrable and separate from the web page. But with the HTML video element, we can style and transform video with CSS just like any other HTML element, even with 3D effects. CSS layout works well across all browsers, with the notable exception of the iPhone. As I mentioned in part one, video on the iPhone can only be played in full screen, not within the web page layout.

Image Processing

We can also get inside the video and directly manipulate the pixels, as we could with an image. This allows for all kinds of effects using the HTML canvas element. We could copy the video image into a 2D canvas, but this can be slow because it runs sequentially on the CPU and requires a lot of unnecessary memory copying. We can do better with WebGL, a JavaScript API for fast, hardware-accelerated graphics processing. With the latest iOS update, WebGL is now available on the iPad, and it will be in the upcoming desktop Safari version as well, making it available on all major platforms. WebGL was designed for making 3D graphics, but it can be used for high-resolution video and image effects, as in the video transitions demo I created and wrote about earlier this month.
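To make the 2D canvas approach concrete, here is a minimal sketch of a per-pixel effect. The function names (toGrayscale, drawGrayscaleFrame) and the choice of a grayscale filter with Rec. 601 luma weights are illustrative assumptions, not code from the demo mentioned above. The loop visits every pixel on the CPU, which is exactly the cost that makes this route slow for high-resolution video.

```javascript
// Convert RGBA pixel data to grayscale in place.
// `pixels` is laid out as [r, g, b, a, r, g, b, a, ...], the same layout
// as the `data` array of an ImageData object returned by getImageData().
function toGrayscale(pixels) {
  for (let i = 0; i < pixels.length; i += 4) {
    // Rec. 601 luma weights (an illustrative choice)
    const luma = 0.299 * pixels[i] + 0.587 * pixels[i + 1] + 0.114 * pixels[i + 2];
    pixels[i] = pixels[i + 1] = pixels[i + 2] = luma; // alpha is left untouched
  }
  return pixels;
}

// Browser-only: copy the current video frame into a 2D canvas,
// filter it on the CPU and write it back.
function drawGrayscaleFrame(video, canvas) {
  const ctx = canvas.getContext('2d');
  ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
  const frame = ctx.getImageData(0, 0, canvas.width, canvas.height);
  toGrayscale(frame.data);
  ctx.putImageData(frame, 0, 0);
}
```

In a page, drawGrayscaleFrame would typically be called once per animation frame while the video is playing.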

Although WebGL now has wide support, it remains broken for video processing in Internet Explorer and Safari, in both desktop and mobile versions. Microsoft has confirmed that it plans to fix this, and for now there is a workaround that draws the video to a 2D canvas, which is then passed to WebGL. It runs slower than drawing the video directly with WebGL, and it is very slow in mobile and desktop Safari.
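The workaround can be sketched roughly as follows. This is an assumption about its general shape, not the exact code used in the demo: the function name uploadVideoFrameViaCanvas and the scratchCanvas parameter are hypothetical, and the texture is assumed to have been created and configured elsewhere.

```javascript
// Copy the current video frame into a scratch 2D canvas, then upload that
// canvas (instead of the video element itself) as the WebGL texture source.
// Returns false if the video has no frame available yet.
function uploadVideoFrameViaCanvas(gl, texture, video, scratchCanvas) {
  // readyState 2 is HAVE_CURRENT_DATA: at least one frame can be drawn
  if (video.readyState < 2) {
    return false;
  }

  // Keep the scratch canvas the same size as the video's intrinsic size
  if (scratchCanvas.width !== video.videoWidth ||
      scratchCanvas.height !== video.videoHeight) {
    scratchCanvas.width = video.videoWidth;
    scratchCanvas.height = video.videoHeight;
  }

  // Extra copy on the CPU side; this is what makes the workaround slower
  const ctx = scratchCanvas.getContext('2d');
  ctx.drawImage(video, 0, 0, scratchCanvas.width, scratchCanvas.height);

  gl.bindTexture(gl.TEXTURE_2D, texture);
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, scratchCanvas);
  return true;
}
```

Where direct video textures work, the same texImage2D call can take the video element as its last argument and the scratch canvas is unnecessary.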

Because WebGL gives direct access to the graphics hardware, there is a lot that can go wrong. Every device and platform has different capabilities, and there are various methods and extensions on the WebGL context that give information on these capabilities. WebGL Stats lists many of the varying capabilities and shows how widely different devices support them. But it’s important to test WebGL on as many devices as possible, as I often see GPU (Graphics Processing Unit) shaders failing to compile or presenting incorrect output due to poorly documented quirks, especially in Internet Explorer.
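As a rough illustration of defensive WebGL code, the sketch below queries a few capabilities and checks whether a shader actually compiled. The helper names (describeCapabilities, compileShader) are hypothetical, and checking for the OES_texture_float extension is just one example of a capability that varies between devices.

```javascript
// Collect a few capabilities that differ between devices.
function describeCapabilities(gl) {
  return {
    maxTextureSize: gl.getParameter(gl.MAX_TEXTURE_SIZE),
    // getSupportedExtensions() returns null if the context has been lost
    extensions: gl.getSupportedExtensions() || [],
    // getExtension() returns null when the extension is unavailable
    floatTextures: gl.getExtension('OES_texture_float') !== null
  };
}

// Compile a shader and surface the driver's error log instead of failing silently.
// `type` is gl.VERTEX_SHADER or gl.FRAGMENT_SHADER.
function compileShader(gl, type, source) {
  const shader = gl.createShader(type);
  gl.shaderSource(shader, source);
  gl.compileShader(shader);
  if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
    const log = gl.getShaderInfoLog(shader);
    gl.deleteShader(shader);
    throw new Error('Shader failed to compile: ' + log);
  }
  return shader;
}
```

Logging the info string on failure is what makes device-specific shader bugs diagnosable, since a compile check cannot catch shaders that compile but render incorrectly.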

Audio Effects

WebGL has video covered, and with the Web Audio API, we can process and generate audio in real time. There are gain control and linear convolution nodes, ScriptProcessorNode (which passes audio data to a JavaScript function so we can create custom effects beyond the ones built into the browser) and AnalyserNode (which lets us create visualizations of an audio signal).
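As a small example of what AnalyserNode makes possible, this sketch computes a loudness level that could drive a simple meter. The function names (rmsLevel, readLevel) are hypothetical. It relies on getByteTimeDomainData, which fills a Uint8Array with waveform samples where 128 represents silence.

```javascript
// Root-mean-square level of byte waveform samples, from 0 (silence) to 1.
// Byte samples range from 0 to 255 and are centered on 128.
function rmsLevel(samples) {
  if (samples.length === 0) {
    return 0;
  }
  let sum = 0;
  for (let i = 0; i < samples.length; i++) {
    const v = (samples[i] - 128) / 128; // rescale to roughly -1..1
    sum += v * v;
  }
  return Math.sqrt(sum / samples.length);
}

// Browser-only: read the current waveform from an AnalyserNode.
function readLevel(analyser) {
  const samples = new Uint8Array(analyser.fftSize);
  analyser.getByteTimeDomainData(samples);
  return rmsLevel(samples);
}
```

Calling readLevel once per animation frame and using the result as the height of a bar gives a basic volume visualization.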

Audio can be loaded into the Web Audio API with XMLHttpRequest, but that approach is best for very short clips and doesn’t allow for pausing and seeking. There is a way to pull audio data from an audio or video element, which is much more useful, but it’s broken in Safari, and in Firefox when the file is loaded from a cross-origin source. The Web Audio API is available in all modern browsers except Internet Explorer, and Microsoft says support is in development.
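A minimal sketch of the element-based route, assuming the relevant call is createMediaElementSource on the audio context. The function name connectMediaElement and the effectNode parameter are hypothetical; effectNode stands in for any node, such as a gain node created with createGain.

```javascript
// Route the sound of an <audio> or <video> element through an effect node.
// Playback, pausing and seeking are still controlled on the element itself.
function connectMediaElement(context, mediaElement, effectNode) {
  const source = context.createMediaElementSource(mediaElement);
  source.connect(effectNode);
  effectNode.connect(context.destination);
  return source;
}
```

Because the element keeps handling buffering and seeking, this works for long files in a way that decoding a whole XMLHttpRequest response does not.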

As with video playback, Web Audio won’t work in iOS unless you trigger it inside a touch event handler.

Cross-Origin Resources

When directly manipulating audio, image or video data, we need to be concerned with cross-origin restrictions. For security reasons, the browser does not allow a script to read the contents of any of these resources unless they come from the same origin: the same protocol, domain name and port. For example, a malicious site might try to reference an image of a scanned check on your bank’s server. If you’re logged in to the bank site, the malicious site would be able to show you the check image, assuming it somehow had the URL, but it would not be able to read the actual contents of that image because of cross-origin restrictions.
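To show what counts as the same origin, here is a small sketch using the standard URL class. The function name isSameOrigin and the example addresses are made up for illustration. An origin is the combination of protocol, host and port, so changing any one of them makes a resource cross-origin.

```javascript
// Check whether a resource URL shares an origin with the page.
// Relative URLs are resolved against the page URL first.
function isSameOrigin(resourceUrl, pageUrl) {
  const resource = new URL(resourceUrl, pageUrl);
  const page = new URL(pageUrl);
  return resource.origin === page.origin;
}

// Hypothetical examples, for a page at https://example.com/watch:
//   /media/clip.mp4                     same origin (relative URL)
//   https://cdn.example.com/clip.mp4    cross-origin (different host)
//   http://example.com/clip.mp4         cross-origin (different protocol)
//   https://example.com:8443/clip.mp4   cross-origin (different port)
```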

There is an exception to these rules that comes in handy when we want to access media from another server. You might want to publish a video for people to remix, or more likely, you just want to serve a video from a content delivery network (CDN), which is a very good idea for performance. The exception requires setting the crossorigin attribute on the video (or img or audio) tag, and the server has to give permission to use the resource by setting the appropriate headers. Mozilla Hacks has an article on how to do this with images, and it works the same way for video.
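Both halves of that arrangement can be sketched briefly. The function names (loadCrossOriginVideo, corsHeadersFor) and the allow-list approach are illustrative assumptions, and the server half is only one of several reasonable ways to decide which header to send. The header in question is Access-Control-Allow-Origin.

```javascript
// Client side: request the video with CORS. The crossOrigin property mirrors
// the crossorigin attribute and should be set before the source is assigned.
function loadCrossOriginVideo(video, url) {
  video.crossOrigin = 'anonymous';
  video.src = url;
  return video;
}

// Server side (hypothetical helper): choose response headers for a media file.
// `requestOrigin` is the value of the request's Origin header.
function corsHeadersFor(requestOrigin, allowedOrigins) {
  if (allowedOrigins.includes(requestOrigin)) {
    return {
      'Access-Control-Allow-Origin': requestOrigin,
      // Tell caches that the response varies with the requesting origin
      'Vary': 'Origin'
    };
  }
  return {}; // no permission granted; scripts cannot read the media data
}
```

A server that is happy to share a file with everyone can instead send Access-Control-Allow-Origin with a value of * on every response.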

Unfortunately, neither Safari nor Internet Explorer supports this exception for video, though they do support it for images. So our workaround for video textures is limited to videos on the same origin. I filed a bug report for Internet Explorer, but Microsoft has not yet responded. Firefox has the aforementioned cross-origin audio bug, which will hopefully be fixed soon.

Tomorrow in part three, I’ll discuss upcoming video features such as streaming, camera access and recording.


Published by

Brian Chirls is the Digital Technology Fellow at POV, developing digital tools for documentary filmmakers, journalists and other nonfiction media-makers. The position is a first for POV and is funded as part of a $250,000 grant from the John S. and James L. Knight Foundation. Follow him on Twitter @bchirls and GitHub.