HOA Technical Notes - Decoding
In Higher Order Ambisonics, the main audio stream in use is B-Format. This contains an awful lot of information about sound directivity, and it is deliberately unaware of the speakers that you will use for playback. Because of this, it must be "decoded" for playback. Decoding is sometimes known as rendering.
There are lots of different ways to decode a Higher Order Ambisonic stream. Michael Gerzon did provide a relatively formal definition of "Ambisonic Decoding" for first order "Classic" Ambisonics, but many modern decoders do not work this way.
Rapture3D
One of the main features of our Rapture3D software is a high-quality Higher Order Ambisonic decoder. The same decoder core is used in the Rapture3D game engine, music playback engine and professional studio plugins. With Rapture3D, the idea is that you configure it once, and then everything "just works".
The "User" and "Game" editions of Rapture3D come with decoders for a collection of preset rigs that correspond to the standard Windows speaker layouts (such as stereo, 5.1, 7.1) and some variants (such as 3D7.1).
Rapture3D uses some very sophisticated techniques for decoding, taking into account soundfield reproduction (which is a form of "acoustic holography"), wavefront curvature, psychoacoustic cues, HRTFs and more. It also supports HRTF-based headphone decoding, output for surround processors and crosstalk-cancelled stereo.
The Studio - O3A Plugins
The O3A Core plugin library includes a number of simple studio decoder plugins to get you started with decoding.
For a richer set of decoders, you can try the O3A Decoding plugin library. Most of these decoders are actually generated using the Rapture3D Decoder Generator, which is at the heart of the "Advanced" edition of Rapture3D. This lets you set up decoders for arbitrary speaker layouts and personalised HRTFs.
Reference Decoders
Over the years, quite a few poor decoders have appeared, which have not always given ambisonics a good name. For instance, the "pseudo-inverse" approach is commonly misapplied to irregular speaker layouts. This gives something that looks right, but often won't sound right!
Here, we recommend some simple low-order decoders. We prefer the Rapture3D ones, but if you're writing a new decoder and it sounds worse than these, you're doing something wrong!
These decoders are represented as matrices, with one row per speaker and one column per B-Format input channel (identified by its ACN number). To apply them, take the relevant components from each sample frame of your B-Format and multiply them by the matrix to produce the corresponding sample frame to feed to the speakers. Except for the mono and hexagon decoders, these are based on the decoders provided by Csound's bformdec1 opcode, converted to SN3D. [Up-to-date as of 2017-01-05, any errors being our own.]
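As a rough illustration of that multiplication, here is a minimal Python sketch. The function name `decode_frame` and its arguments are hypothetical, chosen for this example only. It uses the Simple Stereo matrix given further down this page and assumes an SN3D-encoded source hard to the left (W=1, Y=1).

```python
def decode_frame(matrix, acn_inputs, frame):
    """Multiply one B-Format sample frame by a decoder matrix.

    matrix     -- one row per speaker, one column per entry in acn_inputs
    acn_inputs -- the ACN channel numbers that the matrix columns refer to
    frame      -- one sample per B-Format channel, indexed by ACN number
    """
    # Pick out only the B-Format channels this decoder actually uses.
    picked = [frame[acn] for acn in acn_inputs]
    # Each speaker feed is the dot product of its row with those channels.
    return [sum(c * s for c, s in zip(row, picked)) for row in matrix]


# Simple Stereo matrix from this page: columns are ACN 0 and ACN 1.
STEREO = [
    [0.5, 0.5],   # Left
    [0.5, -0.5],  # Right
]

# Assumed first order frame (ACN 0..3) for a source hard left: W=1, Y=1, Z=0, X=0.
frame = [1.0, 1.0, 0.0, 0.0]
left, right = decode_frame(STEREO, [0, 1], frame)
print(left, right)  # 1.0 0.0
```

In a real player the same multiplication runs once per sample frame, so you would normally do it on whole blocks of audio rather than one frame at a time.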
Mono
The first B-Format channel provides an omnidirectional response, so this can be used directly to provide a good mono output.
Simple Stereo
This is a simple decode equivalent to an M+S microphone array at the listening point. This works better than a front-facing arrangement for many (but not all) purposes.
Note that there are lots of ways to do stereo decodes of B-Format, including ones that generate "binaural" headphone stereo (for instance, see the notes on our amber HRTF decoder). Also, see more on "Synthetic Microphones" below (this decode is equivalent to two side-facing cardioids, though we've halved the gain).
ACN | 0 In | 1 In |
---|---|---|
Left | 0.5 | 0.5 |
Right | 0.5 | -0.5 |
First Order Quad
This is a first order "in-phase" decoder.
ACN | 0 In | 1 In | 3 In |
---|---|---|---|
Front Left | 0.2500 | 0.1768 | 0.1768 |
Back Left | 0.2500 | 0.1768 | -0.1768 |
Back Right | 0.2500 | -0.1768 | -0.1768 |
Front Right | 0.2500 | -0.1768 | 0.1768 |
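
One way to see what "in-phase" means in practice is to pan a test source around the circle and check that no speaker is ever driven with a negative gain. The sketch below does this for the quad matrix above. It assumes a horizontal unit source encoded in SN3D (W=1, Y=sin(azimuth), X=cos(azimuth), with azimuth measured anticlockwise from the front), and the function name `quad_gains` is made up for this example.

```python
import math

# First Order Quad matrix from the table above; columns are ACN 0, 1 and 3.
QUAD = {
    "Front Left":  (0.2500,  0.1768,  0.1768),
    "Back Left":   (0.2500,  0.1768, -0.1768),
    "Back Right":  (0.2500, -0.1768, -0.1768),
    "Front Right": (0.2500, -0.1768,  0.1768),
}


def quad_gains(azimuth_deg):
    """Speaker gains for a horizontal unit source at the given azimuth."""
    az = math.radians(azimuth_deg)
    # Assumed SN3D encoding of a horizontal source: W, Y (ACN 1), X (ACN 3).
    w, y, x = 1.0, math.sin(az), math.cos(az)
    return {name: cw * w + cy * y + cx * x
            for name, (cw, cy, cx) in QUAD.items()}


# Source placed exactly on the front left speaker (45 degrees to the left).
print(quad_gains(45))

# Lowest gain seen on any speaker over a full circle in 1 degree steps.
lowest = min(min(quad_gains(az).values()) for az in range(360))
print(lowest)
```

With the source on the front left speaker, that speaker gets a gain of about 0.5 and the diagonally opposite one gets about zero. The lowest gain over the whole circle lands within about 0.0001 of zero, the small residue being rounding in the four-decimal table.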
Second Order 5.0
This is a second order decoder provided by Bruce Wiggins, targeting an ITU 5.0 speaker layout, compatible with DVD 5.1 etc. [Converted from FuMa, any errors our own.]
ACN | 0 In | 1 In | 3 In | 4 In | 8 In |
---|---|---|---|---|---|
Front Left | 0.2864 | 0.3100 | 0.3200 | 0.1443 | 0.0981 |
Front Right | 0.2864 | -0.3100 | 0.3200 | -0.1443 | 0.0981 |
Front Centre | 0.0601 | 0.0000 | 0.0400 | 0.0000 | 0.0520 |
Surround Left | 0.4490 | 0.2800 | -0.3350 | 0.0924 | -0.0924 |
Surround Right | 0.4490 | -0.2800 | -0.3350 | -0.0924 | -0.0924 |
Second Order Hexagon
The speakers here are assumed to be set out anticlockwise, with the first speaker at 11 o'clock and the last one at 1 o'clock. This is an "in-phase" decoder.
ACN | 0 In | 1 In | 3 In | 4 In | 8 In |
---|---|---|---|---|---|
Front Left | 0.1667 | 0.1147 | 0.1987 | 0.0642 | 0.0371 |
Left | 0.1667 | 0.2294 | 0.0000 | 0.0000 | -0.0742 |
Back Left | 0.1667 | 0.1147 | -0.1987 | -0.0642 | 0.0371 |
Back Right | 0.1667 | -0.1147 | -0.1987 | 0.0642 | 0.0371 |
Right | 0.1667 | -0.2294 | 0.0000 | 0.0000 | -0.0742 |
Front Right | 0.1667 | -0.1147 | 0.1987 | -0.0642 | 0.0371 |
Third Order Octagon
The speakers here are assumed to be set out anticlockwise, with the first speaker roughly at 11 o'clock and the last one roughly at 1 o'clock. This is an "in-phase" decoder.
ACN | 0 In | 1 In | 3 In | 4 In | 8 In | 9 In | 15 In |
---|---|---|---|---|---|---|---|
NNW | 0.1250 | 0.0718 | 0.1732 | 0.0612 | 0.0612 | 0.0146 | 0.0061 |
WNW | 0.1250 | 0.1732 | 0.0718 | 0.0612 | -0.0612 | -0.0061 | -0.0146 |
WSW | 0.1250 | 0.1732 | -0.0718 | -0.0612 | -0.0612 | -0.0146 | 0.0061 |
SSW | 0.1250 | 0.0718 | -0.1732 | -0.0612 | 0.0612 | 0.0061 | -0.0146 |
SSE | 0.1250 | -0.0718 | -0.1732 | 0.0612 | 0.0612 | -0.0146 | -0.0061 |
ESE | 0.1250 | -0.1732 | -0.0718 | 0.0612 | -0.0612 | 0.0061 | 0.0146 |
ENE | 0.1250 | -0.1732 | 0.0718 | -0.0612 | -0.0612 | 0.0146 | -0.0061 |
NNE | 0.1250 | -0.0718 | 0.1732 | -0.0612 | 0.0612 | -0.0061 | 0.0146 |
First Order Cube
This is a first order "in-phase" decoder.
ACN | 0 In | 1 In | 2 In | 3 In |
---|---|---|---|---|
Front Lower Left | 0.1250 | 0.0722 | -0.0722 | 0.0722 |
Front Upper Left | 0.1250 | 0.0722 | 0.0722 | 0.0722 |
Back Lower Left | 0.1250 | 0.0722 | -0.0722 | -0.0722 |
Back Upper Left | 0.1250 | 0.0722 | 0.0722 | -0.0722 |
Back Lower Right | 0.1250 | -0.0722 | -0.0722 | -0.0722 |
Back Upper Right | 0.1250 | -0.0722 | 0.0722 | -0.0722 |
Front Lower Right | 0.1250 | -0.0722 | -0.0722 | 0.0722 |
Front Upper Right | 0.1250 | -0.0722 | 0.0722 | 0.0722 |
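
As a sanity check on the cube table, the numbers can be reproduced by scaling each speaker's unit direction vector by 1/8 and putting 1/8 in the ACN 0 column. This is only an observation about the tabulated values, not a description of how bformdec1 derives them. The function name `cube_row` is hypothetical.

```python
import math


def cube_row(left, up, front):
    """Matrix row for a cube corner given as +1 or -1 along Y, Z and X.

    Returns coefficients in the table's column order: ACN 0, 1 (Y), 2 (Z), 3 (X).
    """
    n = 8                  # number of speakers in the cube
    norm = math.sqrt(3.0)  # length of a (+-1, +-1, +-1) corner vector
    y, z, x = left / norm, up / norm, front / norm
    return [1 / n, y / n, z / n, x / n]


row = cube_row(left=+1, up=-1, front=+1)  # Front Lower Left
print([round(c, 4) for c in row])  # [0.125, 0.0722, -0.0722, 0.0722]
```

If you are transcribing the table by hand, regenerating the rows like this is a cheap way to catch a mistyped sign.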
Virtual Microphones
It's possible to extract simple microphone responses from the B-Format stream. This is particularly relevant when decoding for stereo.
- For an omnidirectional response, simply use the ACN0 (first) channel.
- For a figure-of-eight response, take the scalar product of the microphone direction vector written as <Y,Z,X> and the ACN1, ACN2 and ACN3 channels.
- For a cardioid response, add the omnidirectional and figure-of-eight responses together (you can vary this to produce hypercardioid responses etc.).
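The three responses above can be sketched in a few lines of Python. The function `virtual_mic` and its `pattern` blend parameter are hypothetical names for this example. The cardioid here is scaled by a half (0.5 omni plus 0.5 figure-of-eight), which matches the halved gain used in the Simple Stereo table. The test frame assumes an SN3D-encoded source hard to the left.

```python
def virtual_mic(frame, direction, pattern=0.5):
    """First order virtual microphone response for one sample frame.

    frame     -- B-Format samples in ACN order (at least ACN 0..3)
    direction -- unit vector (x, y, z): x to the front, y to the left, z up
    pattern   -- hypothetical blend: 1.0 omni, 0.5 cardioid, 0.0 figure-of-eight
    """
    x, y, z = direction
    omni = frame[0]
    # Scalar product of <Y,Z,X> with the ACN 1, ACN 2 and ACN 3 channels.
    figure8 = y * frame[1] + z * frame[2] + x * frame[3]
    return pattern * omni + (1.0 - pattern) * figure8


# Assumed frame for a source hard left: W=1, Y=1, Z=0, X=0.
frame = [1.0, 1.0, 0.0, 0.0]

print(virtual_mic(frame, (0.0, 1.0, 0.0)))        # left-facing cardioid: 1.0
print(virtual_mic(frame, (0.0, -1.0, 0.0)))       # right-facing cardioid: 0.0
print(virtual_mic(frame, (1.0, 0.0, 0.0), 0.0))   # front-facing figure-of-eight: 0.0
```

Values of `pattern` between 0.0 and 0.5 give the hypercardioid-style responses mentioned above.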
It is tempting to build multichannel decoders by feeding simple virtual microphone responses in the directions of the various speakers. This does not work well, particularly for speaker layouts that are not regular.
Doing It Properly
Or, if this all seems rather pedestrian, you can use custom layouts with Rapture3D. The decodes used there are typically not just matrices. Frequency-domain processing allows acoustic soundfield reconstruction and psychoacoustic processing for arrays with small or large numbers of speakers.