For all there is to love about Google Cardboard, it’s still a bare-bones experience. It’s barely even VR, really. But the cheap, smartphone-based viewer offers the VR-curious an easy window into 360-degree video. Pricier headsets like the Oculus Rift and Sony PlayStation VR, designed for gaming, deliver more than a cool stereoscopic viewing experience. In addition to the immersive visual eye-candy—users can explore virtual spaces, peek around corners, and, using hand-held controllers, interact with digital objects—these sophisticated VR rigs offer truly lifelike audio.

When a monster sneaks up on your left in a VR game, you’ll hear its slobbering tongue lashing at your left ear. When a shot comes at you from above you and slightly to the right, you know exactly where to return fire. When The Edge tears through the opening riff of “Mysterious Ways,” it reverberates around the stadium.

The high-priced headsets from Oculus, Sony, and HTC pack the processing punch to deliver “spatially oriented” audio experiences that consider direction, distance, and environmental factors when creating the soundtrack. Cardboard, powered by your smartphone, can’t do that yet. But earlier this week, the Cardboard team made it a little easier to give the audio in these apps a bit more realism. As a blog post from Google Cardboard product manager Nathan Martz outlines, the Cardboard software development kit for Android and the Unity game engine now support spatial audio. This platform update paves the way for Google Cardboard to become something more than a gateway drug to true VR.

It’s Gotta Be Real

As an emerging medium, VR is something content creators and viewers are trying to decode together. But in the world of traditional filmmaking, there’s an old mantra: You can get by with poor video quality, but poor audio almost always ruins a film. “Audio is as important as it has always been,” says Mark Bolas, director of the Mixed Reality Laboratory at USC’s Institute for Creative Technologies (ICT). “It helps drive the emotional arc of an experience, offering the ability to foreshadow and support the story without anyone knowing it was there.”

Audio design may be more important to VR than traditional film, because the viewer also is the director. The story changes depending upon where the viewer looks, and directional audio cues can help guide the experience. Plus, VR is intended to be as realistic and immersive as possible. Audio that doesn’t act the way it should—or the way the viewer expects it to—can ruin the experience.

“The sense of immersion requires a certain minimal level of support, for example a wide field of view, without which the experience just does not click,” Bolas explains. “After that threshold is crossed, then additional improvements to the medium provide additional richness of experience… Good audio combines with immersive graphics in an almost synesthetic way to enable transformational immersion.”

Cardboard’s new audio capabilities should be particularly handy for game developers. Beyond allowing developers to link specific sounds to where the viewer is looking, the technology will allow them to make the audio more realistic. For example, when a sound occurs to your right, there’s a slight delay between when your right ear hears it and when your left ear does. The sound also isn’t quite the same from one ear to the other. Using binaural tricks, Cardboard will be able to simulate those details.
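To put a rough number on that ear-to-ear delay, here is a minimal Python sketch using Woodworth's classic spherical-head approximation. It is an illustration only, not the Cardboard SDK's method, which this article does not detail. The head radius, the speed of sound, and the function names are all assumptions made for the example.

```python
import math

HEAD_RADIUS_M = 0.0875    # assumed average head radius, in meters
SPEED_OF_SOUND = 343.0    # assumed speed of sound in air, in meters per second


def interaural_time_difference(azimuth_deg):
    """Estimate the arrival-time gap between the ears, in seconds.

    azimuth_deg: 0 is straight ahead, 90 is directly to the right,
    -90 is directly to the left. A positive result means the right
    ear hears the sound first.
    """
    if not -90 <= azimuth_deg <= 90:
        raise ValueError("this simple model only covers -90 to 90 degrees")
    theta = math.radians(abs(azimuth_deg))
    # Woodworth's formula: (r / c) * (theta + sin(theta))
    itd = (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + math.sin(theta))
    return math.copysign(itd, azimuth_deg) if azimuth_deg else 0.0


def delay_in_samples(azimuth_deg, sample_rate=44100):
    """Convert the time gap into a whole number of audio samples."""
    return round(abs(interaural_time_difference(azimuth_deg)) * sample_rate)


if __name__ == "__main__":
    for angle in (0, 30, 90):
        ms = interaural_time_difference(angle) * 1000
        print(f"{angle:>3} degrees: {ms:.3f} ms, {delay_in_samples(angle)} samples")
```

Under these assumptions, a sound directly to one side reaches the far ear roughly two-thirds of a millisecond late. A binaural renderer could delay one headphone channel by about that much, alongside the subtler tone and volume differences between the ears.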

The new Cardboard spatial-audio support also will change how things sound based on the viewer’s virtual environment. For example, if the viewer is outdoors in the snow, sound will be muffled. If they’re in a tiled room, sound will echo. It gives the Cardboard platform something approaching the audio effects of Oculus Rift—and without the need to hook anything up to a pricey computer.
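The snow-versus-tile contrast can be caricatured with two textbook audio filters: a one-pole low-pass filter that dulls a signal, and a feedback delay that repeats it at a fading volume. The Python sketch below is a toy illustration under those assumptions, with hypothetical function names and parameter values. It says nothing about how Cardboard models rooms internally.

```python
def muffle(samples, alpha=0.2):
    """One-pole low-pass filter: smaller alpha means a duller, more muffled sound."""
    out = []
    previous = 0.0
    for x in samples:
        # Move only part of the way toward each new sample, smoothing sharp changes.
        previous = previous + alpha * (x - previous)
        out.append(previous)
    return out


def add_echo(samples, delay=3, decay=0.5):
    """Feedback delay: each sample is repeated `delay` samples later, scaled by `decay`."""
    out = [float(x) for x in samples]
    for i in range(delay, len(out)):
        # out[i - delay] already includes earlier echoes, so repeats keep fading.
        out[i] += decay * out[i - delay]
    return out


if __name__ == "__main__":
    click = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    print("snowy field:", [round(v, 3) for v in muffle(click)])
    print("tiled room: ", add_echo(click))
```

A single click fed through `add_echo` comes back as a string of progressively quieter clicks, the basic ingredient of a reverberant room. The same click through `muffle` is smeared into a softer, lower bump.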

True Surround Sound

All of this raises the question, “Why bother?” Although Cardboard is clearly intended for a more casual audience than high-end headsets, its low price could help drive VR into the mainstream. It’s so cheap that Google can give away the viewer, as it did last year in a partnership with The New York Times, providing many people’s first experience with immersive viewing.

But there’s a balancing act here. If consumers are underwhelmed by the experience, they may not give the next generation of VR headsets a chance. We’re already seeing some effort to make the experience nicer. Several comfortable and durable headsets for Cardboard are available, so you no longer have to hold the viewer, which is literally cardboard, to your face with your hands. A comfy, hands-free viewing experience makes things like spatial audio a much bigger deal.

And if VR really takes off, spatial audio will improve further due to fixed at-home setups. With positional-tracking headsets synced with multi-channel sound systems, the line between the virtual world and the real world will grow blurrier still. Sound will come from all around you, no headphones required.

Jeremy Bailenson, founding director of Stanford’s Virtual Human Interaction Lab, says spatial audio is key to creating an immersive virtual world. “In our lab, we have a 24-channel ambisonic sound system built by Worldviz that can spatialize sound and integrate the spatialization with tracking data,” he says. “We give demos to thousands of people per year in the lab, and the spatialized sound is a huge part of the high-presence experience.”

For now, with just a smartphone and a Cardboard viewer, you can experience convincing 3D audio, as long as you also plug in headphones. But this is a long-term play for Google, one that extends well beyond its entry-level VR viewer. Cardboard may be a baby step for mainstream VR, but the platform itself just took a substantial leap.


Why Spatial Audio Is Such a Big Deal for Google Cardboard