3D Mixing FAQ

Here are some Frequently Asked Questions about 3D mixing and the O3A plugins:

I'm about to make an awesome binaural mix for a Virtual Reality film. Do I need your Binaural Surround plugins?

No, you don't. You'll get much better results with our O3A plugins, although you'll need to get your head around the basics of ambisonic B-Format.

Working with O3A, you will have a 16 channel "B-Format" mix containing a 3D audio scene which can be built up and manipulated with a wide range of plugins. When you want to listen to it in binaural, you can use a binaural "decoder" such as the ones in the O3A View or O3A Decoding plugin libraries. If you just want to get an idea of what it is like to work this way, you can probably get by with a stereo decoder from the free O3A Core, although that won't sound as good. Once you are more familiar with ambisonics, you might consider the O3A View library, which also provides Virtual Reality video support and in-VR panning! But, we are getting ahead of ourselves...

You could render out your binaural stereo mix and be done, but this is not a good solution for Virtual Reality, because the mix will not change when the user's head is turned. Many VR systems actually work with ambisonic audio directly and decode it binaurally themselves after rotating the 3D mix to take head movement into account. For best results with these systems, they will need the actual 16 channels of your O3A mix (or fewer channels for lower spatial resolution).

But, check what technology the Virtual Reality film is going to use. Some use something called "Quad Binaural". If you need this, there is an optimised decoder in the O3A Decoding library. However, this typically isn't as good as even first order ambisonics.

I've used an O3A panner, but the sound is always to the left. What's broken?

This typically happens because no decoder has been used.

Ambisonic panners produce "B-Format", which should not be listened to directly. This is because B-Format is a special way to capture a 3D audio scene that can be "decoded" to a wide variety of formats, not just stereo. You should never listen to ambisonics without an ambisonic decoder. Ambisonic decoders are sometimes known as "renderers".

If you are using the free O3A Core pack, the "O3A Decoder - Stereo" plugin can produce good stereo for speakers or basic stereo for headphones. Much better 3D binaural stereo for headphones can be produced using plugins from the O3A Decoding or O3A View packs, or Rapture3D Advanced. That all said, if you are working on a VR project, you should only use these for monitoring. In the final product, the B-Format should only be decoded in the VR device, after head tracking has been taken into account.

I've used an O3A panner and a decoder and the sound is to the left and right, but not very good. What's broken?

This typically happens because a track, bus or send does not have 16 channels.

The first two channels of third order, 16 channel SN3D B-Format, which is what the O3A plugins use, can loosely be thought of as the "M" and "S" of M/S (Mid/Side) stereo. Your O3A audio should always pass through tracks, busses and sends with 16 channels. If it ever tries to pass through a stereo bus, 14 of your channels will be lost and your scene will essentially collapse to a form of M/S stereo. The spatial definition you get here is a lot less precise than you should expect from O3A.

Working with O3A audio (third order ambisonics, encoded using SN3D), your tracks, busses and sends should always use at least 16 channels.
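
A quick rule of thumb: a full-sphere ambisonic mix of order N needs (N + 1) squared channels, so first order needs 4, second order needs 9 and third order (O3A) needs 16. The tiny Python sketch below (purely illustrative) just prints these counts:

    # Channel count for a full-sphere (periphonic) ambisonic mix of a given order.
    def ambisonic_channel_count(order):
        return (order + 1) ** 2

    for order in (1, 2, 3):
        print(f"Order {order}: {ambisonic_channel_count(order)} channels")
    # Order 1: 4 channels / Order 2: 9 channels / Order 3: 16 channels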

I can't get the plugins to work in Cubase/Nuendo/Logic. What's broken?

The O3A plugins need sixteen channel tracks, busses and sends and so can only run in certain DAWs. Further, our plugins use VST2 "shell plugin" support, which was removed from Cubase/Nuendo in version 8.

For now, we recommend you use Pro Tools (Ultimate or Studio), Reaper or Pyramix.

Why do you use 16 channel third order? Isn't four-channel first order enough?

First order has been around since the 1970s and captures an impressive level of spatial detail. However, third order captures a lot more. We really think it's worth the extra effort and modern machines are easily up to the task!

The quality difference is large, but you need a good decoder to hear it. Decoding well at higher orders is technically quite difficult and not all implementations do a good job. However, we hope that you will find that the Rapture3D decoders that are used throughout our product range give excellent results.

Third order can be converted to first order by simply taking the first four channels and ignoring the rest. This means that a third order mix can be used to produce first order material easily, but can also be used to produce high quality multichannel or binaural output, or ambisonic assets for higher order engines like Rapture3D.
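
If you need to do this outside your DAW, here is a minimal Python sketch of the idea. It assumes the third-party soundfile library and uses placeholder file names, so treat it as an illustration rather than a finished tool:

    import soundfile as sf  # third-party library: pip install soundfile

    def reduce_order(in_path, out_path, order):
        """Keep only the first (order + 1)**2 channels of an ACN/SN3D B-Format file."""
        audio, sample_rate = sf.read(in_path)  # audio shape: (frames, channels)
        keep = (order + 1) ** 2
        sf.write(out_path, audio[:, :keep], sample_rate)

    # For example, make a first order (4 channel) version of a third order (16 channel) mix:
    reduce_order("o3a_mix.wav", "o1a_mix.wav", order=1)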

After decoding 16 channel O3A to stereo, I still have 16 channels of audio. Shouldn't I have just two?

In Reaper and some other DAWs, plugins that have fewer outputs than inputs are handled by filling the unused output channels on the track with audio from the input. Putting this another way, the track channels that are not used by the plugin output are simply not overwritten.

In the case of a stereo decoder, this means that the plugin will write to the first two output channels on a 16 channel track or bus, but the other 14 channels will contain channels 3-16 of the O3A input. So, channels 3-16 are essentially junk, because they are meaningless without the first two channels. They can and should simply be ignored! Only the first two channels matter - these contain the decoded stereo.

In Reaper, if you need the decoded audio to end up as a stereo track for simplicity, you can create a stereo track and set it up as a parent track for the 16 channel track with the decoder. The master/parent send will then strip off the 14 meaningless channels.

I'm using an O3A panner and a binaural decoder. Why doesn't the mix sound real?

This is a complex topic. It takes quite a lot to fool our brains into thinking that sounds are where they aren't and there are many factors to consider on top of the binaural decoder itself. Really good binaural mixes require audio engineering skill and don't just happen on their own.

A very important factor here is acoustics. If sounds don't seem to reverberate in a natural way for the surroundings it can be hard to believe they are part of those surroundings.

Some binaural renderers always synthesise some reverberation, to help make the audio fit the surroundings. However, these typically use a reverberation appropriate for a living room. This may give excellent initial results, but it is not generally appropriate, for instance in a Virtual Reality experience set out in a desert.

We take a different approach. Our binaural decoders are anechoic (acoustically very dry), so they can sound pretty unreal on their own! But - you can record sound in an acoustic using an ambisonic microphone, or add appropriate reverb to your mix using the O3A Reverb plugin library, which provides a range of different reverb approaches. The Rapture3D game engines also include reverb modules. This is more work for you than if we had built the reverb into the binaural decoder itself, but gives you much more creative control.

There is more to worry about. For instance, close-miked recordings contain high frequencies that would sound wrong in real life coming from a distant object. But, this sort of material is commonly used in conventional mixes as it can make sounds more characterful. Care is needed here when mixing for binaural if the scene is to be believable. To help with this, some of our panners include low pass filters - don't be afraid to use them!

So, it's important to think about the elements of your mix and the acoustic they are in. When the mix is finally played back there are other things that can be done to make binaural more convincing. Head tracking (for instance in a VR system or from a separate head tracker) can help massively, as can visual cues (again, typically in VR). Using a decoder with filters that are calibrated to an individual's particular head shape (personalised HRTFs) can also make a difference. But these things can only help so much if the mix isn't right in the first place.

I'm mixing for cinema. Do I need Rapture3D Advanced?

As long as the O3A Decoding library covers the standard speaker layouts you need, it is generally more convenient to use this pack to produce deliverable mixes rather than Rapture3D Advanced.

However, you may still want to use Rapture3D Advanced for better monitoring while working on the mix, particularly if your studio speaker layout is not standard.

I'm working on an installation or theatre production. Do I need Rapture3D Advanced?

Rapture3D Advanced often works well in these contexts, as it can be tailored to the playback space.

Why is your +45 degree elevation mark not midway between "Front" and "Above" on the O3A Panner control surface?

These surfaces are designed to project the whole surface of a sphere onto a rectangle. This is actually quite a difficult problem because there are many solutions, none of which satisfy everyone. This has been discussed in great detail over the centuries in the world of Cartography, because paper maps of the world have the same issue.

If we gave the same amount of our rectangle to each elevation, that would put the +45 degree mark halfway between the horizon and top. This would then mean that we were giving half of our rectangle to sounds above +45 degrees elevation, or below -45. However, the elevations nearer the horizon cover much more of the sphere's surface than the higher or lower elevations (where the sphere is narrower), so this distorts the areas involved.

The projection we use is an "equal-area" projection. Although shapes get warped by the projection (which is necessarily true for all of these projections), they still cover the same relative area after the projection. So, if things are little, big, sparse or dense in the rectangular view, they will be on the sphere too.
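
To illustrate, here is a small Python sketch using the Lambert cylindrical equal-area projection (a stand-in for illustration only, not necessarily the exact projection our control surfaces use). An equal-area mapping of this kind places elevation e at a height proportional to sin(e), so +45 degrees lands about 71% of the way from the horizon to the top, and it is +30 degrees that sits halfway:

    import math

    # Lambert cylindrical equal-area projection, used here purely as an illustration:
    # elevation (degrees) -> fraction of the way from the horizon to the top of the view.
    def equal_area_height(elevation_degrees):
        return math.sin(math.radians(elevation_degrees))

    for elevation in (0, 30, 45, 60, 90):
        print(f"{elevation:3d} degrees -> {equal_area_height(elevation):.2f}")
    # 0 -> 0.00, 30 -> 0.50, 45 -> 0.71, 60 -> 0.87, 90 -> 1.00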

We think this is a good thing. If you'd like to investigate further, take a look at https://en.wikipedia.org/wiki/List_of_map_projections. We use an "Equal-area" projection. And if someone is telling you that "Equidistant" is better, ask them about rotations or diagonals!

That said, for whatever reasons, an "Equidistant" projection is commonly used as the raw video format for 360 video VR films. Our O3A View plugins come with a viewer that provides either that raw view or a correctly warped view consistent with the plugins.

When viewing a conventional video in View, using projection mode "2", why isn't the video rectangular?

In the View application, this view projects the whole surface of a sphere onto a rectangle. This is quite an odd warping and has some counter-intuitive effects!

To make more sense of this, turn on the "cube" grid by pressing the key "9" and see that the rectangular film image is being presented on a warped "wall" of the cube.

Or, thinking about this another way, consider the elevation of points along the top of a video screen. Because points in the middle are slightly closer than points at the edges, they have a higher elevation.
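
If you want to check the geometry yourself, here is a toy Python calculation (the distances are made up for illustration): for a screen whose top edge is 1 metre above eye level and 2 metres in front of you, the elevation of points along that edge drops as you move away from the centre, which is why the edge curves downwards in the projected view:

    import math

    distance, top_height = 2.0, 1.0  # metres; made-up example numbers
    for offset in (0.0, 1.0, 2.0):   # horizontal offset along the top edge, in metres
        elevation = math.degrees(math.atan2(top_height, math.hypot(distance, offset)))
        print(f"offset {offset:.1f} m -> elevation {elevation:.1f} degrees")
    # offset 0.0 m -> 26.6 degrees, 1.0 m -> 24.1 degrees, 2.0 m -> 19.5 degrees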

How do I export audio in the right format for YouTube 360 Video?

Currently, you can upload using four channel, first order SN3D.

Our O3A plugins use SN3D (ACN/AmbiX) encoding for their ambisonic audio, which is the same as YouTube's ambisonic encoding. Because we are directly compatible, not much should be required to export the audio in the right format.

So - make sure you do not decode the audio before you upload it to YouTube! The B-Format will be decoded later, during YouTube playback.

YouTube currently only supports first order, four channel SN3D. That means that only the first four channels of your sixteen channel O3A mix should be uploaded to YouTube; in Pro Tools (Ultimate or Studio), you can use the "O3A Decoder - O1A" plugin to extract them. Yes, a lot of spatial detail is thrown away by this. Hopefully, this is only a temporary situation - Google Jump Inspector can already handle a full third order, sixteen channel mix.

After this, you should have four channel, first order SN3D audio. The next thing you will probably want to do is to embed this into a YouTube video with the correct metadata. Please see the YouTube documentation for an explanation of this.

How do I export audio in the right format for Facebook 360 video?

Facebook's FB360 Encoder can import second order SN3D (AmbiX) audio. SN3D is what our O3A plugins use, so choose 'B-format ambiX 2nd order' in the 'Select format' dropdown. As the audio is already in the right format, do not decode the O3A audio before importing!

Our O3A plugins use third order SN3D, so you need to reduce the mix to second order by taking just the first 9 channels of the 16 channels in the full mix. If you are using the Reaper 'Render to file' window, you can generally arrange this by typing 9 into the 'Channels:' box before rendering. In Pro Tools (Ultimate or Studio), you can use the "O3A Decoder - O2A" plugin.

The Reaper FX list is only showing a few O3A plugins, with odd names like "O3ACore64". What's broken?

This usually happens because Reaper has been configured not to show the contents of VST2 shell plugins.

Shell plugins are plugin libraries that contain a number of plugins in a single file. We use these rather than litter your directories with large numbers of individual plugin files. There are other benefits.

To configure Reaper to show all the plugins in the shell plugin libraries, go to Reaper's "Options/Preferences/Plug-ins/VST" page. Please ensure that "Get VST names/types when scanning" is enabled and then click "Clear cache/re-scan". After that, you should be good to go.

Why don't the VST plugin dials work as I'd expect?

The dials in the user interfaces for the O3A plugins can be configured by the VST host so they are controlled with rotary or linear mouse movements. If these are not configured as you would like, behaviour can often be changed using the options pages of your DAW.

In Reaper, this can be found on Reaper's "Options/Preferences/Plug-ins/VST" page. Here, use the "Knob mode" drop-down menu to select the mode you would like.

I want to put some mono or stereo music into a VR mix but not give it a definite spatial location. What can I do?

Of course, the best approach is simply to carry the music in separate tracks that are treated differently for head tracking purposes, but this is often not possible.

There are a few things that can be done here. Just dropping audio into the first (omnidirectional) channel of an O3A mix will work to an extent, but may result in inconsistent levels during decoding and strange "in head" artefacts in some playback scenarios. The acoustic soundfield you are describing when you do this is quite unusual in nature!

You can improve things by applying the O3A Diffuser to the mix to break up the unusual coherence of the wavefront, but a better option is often the Mono or Stereo Ambience plugins. These break the audio up into many frequency bands and spread them over a wide region of space.
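
To give a rough idea of what "spreading" means here, the Python sketch below splits a mono signal into frequency bands and encodes each band from a different horizontal direction into first order ACN/SN3D B-Format. This is a toy illustration of the general idea only, not the algorithm the Ambience plugins actually use:

    import numpy as np

    def spread_mono(mono, bands=8):
        """Toy 'ambience' spread: each frequency band is encoded from its own direction.
        Returns first order ACN/SN3D B-Format with shape (4, samples)."""
        spectrum = np.fft.rfft(mono)
        edges = np.linspace(0, len(spectrum), bands + 1, dtype=int)
        output = np.zeros((4, len(mono)))
        for i in range(bands):
            band = np.zeros_like(spectrum)
            band[edges[i]:edges[i + 1]] = spectrum[edges[i]:edges[i + 1]]
            band_signal = np.fft.irfft(band, n=len(mono))
            azimuth = 2.0 * np.pi * i / bands                                # one direction per band
            gains = np.array([1.0, np.sin(azimuth), 0.0, np.cos(azimuth)])   # W, Y, Z, X (SN3D, elevation 0)
            output += np.outer(gains, band_signal)
        return output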

What's the difference between the TOA and O3A plugins?

The TOA and O3A plugins are essentially the same: they were renamed when we changed the ambisonic encoding convention the plugins use in December 2016. The old TOA plugins used third order FuMa, whereas the new O3A plugins (v2.0 and above) use third order SN3D.
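
For reference, the first order part of the conversion between these conventions is small: FuMa uses channel order W, X, Y, Z with W attenuated by a factor of 1/sqrt(2), while ACN/SN3D uses W, Y, Z, X with no attenuation on W. Here is a small Python sketch covering first order only (higher orders also need reordering and per-degree scaling, which are not shown):

    import numpy as np

    def fuma_to_sn3d_first_order(fuma):
        """Convert first order FuMa (W, X, Y, Z; W at -3dB) to ACN/SN3D (W, Y, Z, X).
        Expects fuma with shape (4, samples)."""
        w, x, y, z = fuma
        return np.stack([w * np.sqrt(2.0), y, z, x])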

We gave the plugins new names (and IDs) so that the old and new plugins could be used at the same time in Digital Audio Workstations like Reaper during migration and so old projects would not stop working.

The TOA and O3A plugins also look different. The old TOA plugins had wooden backgrounds, whereas the O3A plugins are more "modern".

Why isn't the Oculus Rift supported in ViewVR?

ViewVR is somewhat "bleeding edge" technology and should be considered experimental even on its supported headsets (Vive and Index). It does actually work for us with the consumer Rift if SteamVR is running, but we don't feel able to support this configuration just yet as there are a lot of "moving parts" involved that we might not be able to fix if they stopped working.

Do you have a plugin to handle A-Format?

A-Format normally refers to the raw audio recorded by a tetrahedral microphone array. We don't currently provide public software to handle this.

A-Format to B-Format conversion is quite a subtle art and highly dependent on the character of the microphone capsules and arrangement involved - and other factors. Almost all A-Format microphones have recommended conversion software. Some even have calibration files that are specific to the individual microphone, to take individual capsule performance into account. This software is integral to the sound of your microphone and you should use it!

That said, the term "A-Format" is sometimes used loosely to mean something different. Sometimes, it refers to a transformed version of B-Format that reorganises the audio into a number of mono channels, each relating to a particular direction. This can then be processed using a wide variety of techniques (more than is usual with B-Format) and then transformed back in a way that maintains some spatial integrity. This is similar to the concept of "Spatial PCM". The "O3A B->A20 Converter" and "O3A A20->B Converter" plugins from the O3A Manipulators pack convert standard 16 channel O3A to and from a 20 channel A-Format (in this sense).

What's "Spatial Audio"?

In the general sense, Ambisonics is a type of Spatial Audio. However, this term often refers to specific technologies like Dolby Atmos or Apple Spatial Audio. We have a technical note on this topic.

More Questions?

If you have more questions, please feel free to drop us a line at support@blueripplesound.com.