Can you duel a villain without sight like Marvel’s Daredevil? We’ll transport you right smack in the middle of any movie with spatial audio, that is, sound-wise. Spatial audio is almost the whole reason why we go to the cinema instead of watching a movie at home. Aside from the candy counter, ginormous screen, and the mumbling audience of course.
It’s all in the name of immersive entertainment. It’s about having a sense of audio presence and dimension within the realm of sound. No longer do we simply “watch” movies and “listen” to music. We experience them. The constant innovation in technology makes entertainment an audiophile’s wet dream. Submerge yourself in the fluidity of sound as we bring you into the sphere of spatial audio.
What is Spatial Audio?
Spatial audio uses binaural spatialization to simulate sound direction, distance, and height in a spherical way. Like adding a third dimension of depth to sound, the technology mimics the way we hear things in real life in a 360-degree environment. It is a departure from “flatter” mono audio, in which both ears hear exactly the same thing.
The Human Brain and Sound
The human ear is sensitive to the movement of molecules in the surrounding environment. This is also known as mechanosensation. Our auditory sense perceives sounds by detecting vibrations and changes in pressure. These vibrations are transduced into nerve impulses that are sent to the brain. The brain then interprets the sound in order to help us make decisions such as a fight or flight response, all in the interest of self-preservation. Besides such reflexes, sounds can also trigger an emotion that can be connected to a particular circumstance. Seth Horowitz, the author of “The Universal Sense”, says this works in the background so seamlessly we often take it for granted.
Just as with vision, we can also interpret spatial information from sound. It is almost like having a sense of touch from a distance. The brain picks up on low-frequency pulsations in amplitude, which often lets us pinpoint where a sound is coming from based on its intensity. The volume of a sound helps us judge its distance and movement, so we can tell whether the source is approaching or moving away. In addition, the difference in decibels received by each ear helps us pick up the direction. Sound bouncing off surfaces also helps us perceive the size of the space we are in.
The Evolution of Sound Technology
Once upon a time, mono audio was new technology. Then along came cinema surround sound and stereo systems, both forms of spatial audio technology. Make no mistake though, binaural audio was actually invented in the late 19th century. It just never took off until recent decades.
Today, we can enjoy a concert or movie from the comfort of our homes just by owning a good pair of headphones. Devices are shrinking in size and expanding in inventive capabilities. With spatial audio, one can experience the same realistic sounds as being physically present in an atmospheric scene or a concert venue.
To simulate this, the technology relies on attenuation over distance, micro-delays, and filtering as basic principles during recording. Many new techniques are being explored every day in addition to these. Flat movies have been upgraded to goggled 3D and 4D motion experiences. Likewise, audio has started from flat mono and progressed to 4D, 5D, 6D, 7D, 8D, and the latest trending 9D music. Many of these can trigger pleasurable tingles down the back, also known as Autonomous Sensory Meridian Response (ASMR). Let’s take a deeper dive into the nerdy tech sphere of spatial audio technology.
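The three principles named above (attenuation over distance, micro-delays, and filtering) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular product works: the inverse-distance law, the Woodworth approximation for the delay between the ears, the head radius, the speed of sound, and the one-pole filter are all choices made for this example, and the function names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second in air (assumed value)
HEAD_RADIUS = 0.0875    # metres, a commonly quoted average (assumed value)


def distance_gain(distance_m, reference_m=1.0):
    """Attenuation over distance: amplitude halves when distance doubles."""
    return reference_m / max(distance_m, reference_m)


def interaural_delay(azimuth_deg):
    """Micro-delay between the ears, in seconds (Woodworth approximation).

    Intended for sources between 0 degrees (straight ahead) and 90 degrees
    (directly to one side).
    """
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))


def one_pole_lowpass(samples, alpha):
    """Filtering: a very simple low-pass that dulls a signal.

    alpha is between 0 and 1; smaller values remove more high frequencies.
    """
    out = []
    prev = 0.0
    for s in samples:
        prev = prev + alpha * (s - prev)
        out.append(prev)
    return out
```

A source two metres away would be played at half amplitude (`distance_gain(2.0)`), and a source directly to one side would reach the far ear a little over half a millisecond late (`interaural_delay(90)`).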
Speaking Audio Geek
Radio Shack can be an audiophile’s paradise or a technophobe’s nightmare. Then come the never-ending comparisons of speaker system capabilities in alien-coded specifications. Essentially, most of it boils down to channels. Not radio channels but rather sound channels. A sound channel is defined as a single independent audio signal which is recorded and played back in a spatial position. To start you off, we’ll give you an earful of the basic differences between mono, stereo, and spatial sounds.
As the most basic form of audio, mono uses only one channel. This means that the number of speakers used during playback does not make a difference to the sound. Each speaker reproduces the same copy of the signal, so you hear the same sound at the same volume from each one. Monaural tech, therefore, does not render a realistic surround sound atmosphere, as both ears receive exactly the same audio. Sound engineers thus describe mono audio as being flat.
Unlike mono, stereo audio makes it possible to define the location of the reproduced sound. But this can only be done on a horizontal plane, by moving sounds anywhere between left and right. Stereo audio can be programmed to give listeners a surround sound experience to a certain extent. It is, however, not as realistic as sound reproduced in spatial formats.
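Placing a sound somewhere between left and right usually comes down to sending different amounts of the same signal to each channel. A minimal sketch follows, assuming the constant-power pan law (one common choice among several); the function name and the -1 to +1 pan range are conventions picked for this example.

```python
import math


def constant_power_pan(sample, pan):
    """Split one mono sample into (left, right).

    pan runs from -1.0 (hard left) through 0.0 (center) to +1.0 (hard right).
    The cosine/sine pair keeps the combined power constant as the sound moves.
    """
    pan = max(-1.0, min(1.0, pan))
    angle = (pan + 1.0) * math.pi / 4.0  # maps pan to 0..pi/2
    return sample * math.cos(angle), sample * math.sin(angle)
```

Calling `constant_power_pan(1.0, 0.0)` gives equal left and right values, while a pan of -1.0 sends everything to the left channel.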
Spatial audio takes sound technology a leap forward with expansive sounds that stereo systems cannot deliver. It allows placement of sounds not only on a horizontal plane, but also above, below, in front of, and behind the listener. The multi-channel technology allows producers to create a sphere of movement and distance around the listener. It does so by playing with the volume, point of origin, and direction of the sound. This creates a multi-dimensional and realistic experience for the listener.
Immersive Applications of Spatial Audio
When we think surround sound, we think blockbuster films and rock and roll concerts with decibels that will blast your roof off and gather disgruntled neighbors. But today, it’s not just about volume, but rather the quality of sound and how it can immerse a listener. It’s not just about listening but also triggering an emotional response. As such, spatial audio can be applied to a range of media to enhance the audience’s experience. They include:
- 360-degree AR and VR
Music Dimensions Explained
8D and 9D audio tracks have been popping up all over YouTube lately. But what is it about these novelty tracks that makes them super cool? This dimensional music is often described as ambisonic sound. The basic idea is to add effects that give the listener the sensation that the sound is rotating around him or her. The illusion is strengthened with reverberation, that is, soundwaves that interact with and reflect off surfaces. This allows the listener to experience music as if being in a large room. As the sound rotates only from left to right, the best way to experience this multi-dimensional audio is with headphones.
Today, there are 3D, 4D, 5D, 6D, 7D, 8D, and 9D sound technologies. Fancy as they may sound, the only contrast is the number of directions, or bearings, that the sound originates from. Multi-dimensional sound technology is no longer just used in cinemas but is now also trending on Spotify. It is also built into virtual reality (VR) content to immerse viewers in a realistic experience. With 8D and 9D mixes, creators use software to manipulate sound direction so that the listener feels as though the sound is moving in a 360-degree spherical space.
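The rotating sensation can be approximated by slowly sweeping a mono track between the left and right channels. The sketch below does only that sweep; it leaves out the reverberation and any head-related filtering that real mixes may add, and it is not taken from any specific 8D tool. The function name, the rotation rate, and the constant-power pan law are assumptions made for illustration.

```python
import math


def rotate_mono(samples, sample_rate, rotations_per_second=0.1):
    """Turn mono samples into (left, right) pairs that sweep side to side."""
    stereo = []
    for n, s in enumerate(samples):
        # Position in the rotation cycle at this sample.
        phase = 2.0 * math.pi * rotations_per_second * n / sample_rate
        pan = math.sin(phase)                # sweeps between -1 and +1
        angle = (pan + 1.0) * math.pi / 4.0  # constant-power pan angle
        stereo.append((s * math.cos(angle), s * math.sin(angle)))
    return stereo
```

With a rate of 0.1 rotations per second, the sound completes one full left-right-left cycle every ten seconds.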
Approaches to Spatial Sounds
In spatial audio, sound flows around the listener, creating an enhanced atmosphere that traditional technology cannot. The effect of spatial audio output is, however, heavily dependent on the number of speakers we have, as well as the rendering technology of the audio system. Here are two approaches to spatial audio recording you can consider.
Binaural channel-based audio is often recorded using two microphones. They create a 3-dimensional acoustic experience that allows the listener to feel like he’s in the presence of a live performer. An audio channel, as explained previously, refers to an independent audio signal that is recorded. Each audio channel is reproduced by the speaker at a defined output position. This is set by the producer in a digital audio workstation.
The downside to the channel-based approach is that little can be changed post-production. You can create good effects using software to steer mono sounds around your listener. But your audience will not be able to adapt the fixed audio mix to their needs, preferences, or speaker device. This type of audio is thus best experienced with the use of headphones. Typical multi-channel loudspeaker set-ups might not be able to translate the surround-sound effect as well.
If you’re an audiophile who owns a large number of speakers, audio mixing becomes a challenge with channel-based recording. This is when object-based audio comes into play. Metadata tags each audio object, allowing sound engineers to design its particular location, volume, and movement. This is the noteworthy difference from channel-based recording, which sends audio to pre-defined speakers (such as left or right). The metadata is used to assemble the objects into an overall enveloping sound experience. During playback, the audio is distributed between all the speakers, no matter the layout and number. This is the idea behind “Atmos” systems, widely used in cinemas today.
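To make the object-based idea concrete, here is a toy renderer: each sound carries its own position as metadata, and the speaker gains are worked out only at playback time for whatever layout is present. This is a sketch under stated assumptions, not the renderer used by Atmos or any other commercial system. The `AudioObject` class, the `speaker_gains` function, the angle convention, and the cosine weighting are all hypothetical choices for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class AudioObject:
    name: str
    azimuth_deg: float  # 0 = front, positive = listener's right (assumed)
    gain: float = 1.0


def speaker_gains(obj, speaker_azimuths_deg):
    """Return one gain per speaker for the given object.

    Speakers closer in angle to the object get more signal; speakers more
    than 90 degrees away get none. Gains are normalized so that total
    power equals obj.gain squared.
    """
    weights = []
    for spk in speaker_azimuths_deg:
        diff = math.radians(obj.azimuth_deg - spk)
        weights.append(max(0.0, math.cos(diff)) ** 2)
    norm = math.sqrt(sum(w * w for w in weights))
    if norm == 0.0:
        return [0.0] * len(weights)
    return [obj.gain * w / norm for w in weights]
```

The same `AudioObject("helicopter", azimuth_deg=30.0)` can be rendered to a two-speaker layout such as `[-30, 30]` or a five-speaker layout such as `[-110, -30, 0, 30, 110]` without remixing, which is the practical advantage over a fixed channel mix.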
Recording Spatial Audio
Looking to produce your own uber-cool spatial audio? Doing a DIY spatial audio recording certainly involves a learning curve and sometimes a hefty investment in equipment.
Depending on what you wish to accomplish and how much you are willing to invest, you can consider the options below:
First up is the binaural dummy head. No kidding, this equipment is literally shaped like a human head made of polystyrene. The microphones sit on both sides of the dummy head, approximately 18cm apart, to emulate the position of a human being’s ear canals. A typical binaural recording unit consists of two high-fidelity microphones. The equipment captures sound the same way it enters the outer and inner ear, all while flowing around the shape of a human head.
These are multi-mic field recorders that can capture ambisonic sounds. The compact and easily portable device pairs with 360-degree cameras to create a sound-rich VR experience. In post-production, a software plugin mixes and enhances the final playback result.
This specially shaped equipment is designed with four sub-cardioid microphones pointed away in tetrahedral directions. Sound is thus recorded from different directions. The use of additional software decodes and converts signals coming from each mic to create a final immersive mix.
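The decoding step for a four-capsule tetrahedral microphone typically begins by combining the raw capsule signals (often called A-format) into one omnidirectional signal and three directional ones (B-format). The sketch below shows the basic sum-and-difference arithmetic for a single sample. It assumes the capsules point front-left-up, front-right-down, back-left-down, and back-right-up, and it uses a 0.5 scaling factor chosen for this example; scaling conventions vary, and real converters also apply correction filters that are omitted here.

```python
def a_to_b_format(flu, frd, bld, bru):
    """Convert one sample from four tetrahedral capsules to (W, X, Y, Z).

    flu: front-left-up      frd: front-right-down
    bld: back-left-down     bru: back-right-up
    """
    w = 0.5 * (flu + frd + bld + bru)  # omnidirectional pressure
    x = 0.5 * (flu + frd - bld - bru)  # front minus back
    y = 0.5 * (flu - frd + bld - bru)  # left minus right
    z = 0.5 * (flu - frd - bld + bru)  # up minus down
    return w, x, y, z
```

A sound arriving equally at all four capsules ends up entirely in W, while a sound picked up only by the two front capsules shows up in W and X.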
Tips & Tricks
We’ll start you off with some quick fundamental tips on how you can record a realistic spatial audio project.
Keep your room small:
Contain sound in a smaller space. Don’t count on your mic to record distant sounds. Rather, record them separately and add them in during post-production.
Do a test recording:
Practice your sounds and do a test recording with your equipment. You can also do a quick clap test from each direction and announce the orientation verbally. For example, say “front”, “back”, “left”, or “right” after the clap. This will help you sync audio locations with the video at a later stage. Play back your test recordings and tweak the position of your rig to create the most realistic effect.
Tag your audio:
Rename each audio object and record notes. This is critical for easier post-production.
Record for at least 3 minutes:
Apply the 3-minute rule to main ambisonic recordings and additional source recordings. This will give you more options for editing, transitions, and manipulating the Ambi bed later on.
Plan and design sound distance:
For visual mediums, make sure you line up sounds with shots. Be sure to alter the volume of sounds moving nearer and away from the camera to mimic motion as seen on the screen.
An Alternative Solution
Recording spatial audio might be an unreasonably large investment for just a couple of projects. So why not leave it to the pros? There are reliable online platforms where you can outsource your audio production. You can even hire voice actors in different languages and accents. Full-stack video production is also available, delivering high-quality, ready-to-use recordings at a reasonable rate and with a quick turnaround time. Not convinced? Check out some client testimonials!
Your Time to Shine
Music is a celestial power that controls the universe. Its clash of ideas is the ultimate sound of freedom. With the right sounds, you can either remember everything or forget everything. It’s your turn to create an immersive experience for your audience with these tips!