This document describes a soundfile information chunk that may be added to WAVE-EX soundfiles to indicate the file is in Ambisonic G-Format. The information in the chunk can be used to recover the B-Format components for subsequent decoding to a different speaker layout.
A G-Format file contains an Ambisonic soundfield pre-decoded to a set of speaker feeds. This allows listeners who do not own an Ambisonic decoder to enjoy Ambisonics. When producing the G-Format file, the sound engineer creates a set of speaker feeds for a particular number and arrangement of speakers. This is typically four speakers arranged in a square or five speakers arranged in a regular pentagon. However, other speaker arrangements are possible.
Because players always skip chunks they do not recognise, use of this chunk is benign: a player that does not recognise it will see a normal multi-channel WAVE-EX file. The dwChannelMask in the WAVE-EX file should be set to the value appropriate for the set of speaker feeds the file contains.
Use of a Speaker Position (SPOS) chunk, in addition to the AMBG chunk, is not required in G-Format files but is recommended. The SPOS chunk can give guidance to the listener on the speaker arrangement which was assumed during the production of the G-Format file (square, regular pentagon, etc). The SPOS and AMBG chunks are completely independent, and their relative order is unimportant.
The AMBG chunk contains conversion coefficients which can be used to recover the original B-Format channels. The recovered B-Format channels can be fed to a decoder in the listener's living room, and so accommodate a speaker arrangement different from the one used when the G-Format file was produced. Each B-Format channel is recovered using a weighted combination of the speaker feeds in the G-Format file. The conversion coefficients must be such that the recovered B-Format channels conform to the Furse-Malham set of weighting factors described in the ".amb" specification.
typedef struct {
    char          ID[4];          /* 'AMBG' */
    unsignedInt32 dataSize;       /* size of the chunk data in bytes (excludes ID and dataSize) */
    unsignedInt32 version;        /* version of the AMBG chunk */
    unsignedInt32 numBformat;     /* number of B-Format channels */
    unsignedInt32 decoderFlags;   /* UHJ, shelf filters, etc */
    BformatConv   bFormatChannels[numBformat];   /* conversion info */
} AMBGchunk;
The decoderFlags field is needed so that the B-Format decoder that subsequently decodes the recovered B-Format channels can behave appropriately.
/* G-Format chunk decoderFlags */
#define AMBG_FLAG_UHJ    0x00000001
#define AMBG_FLAG_PREF   0x00000002
#define AMBG_FLAG_SHELF  0x00000004
#define AMBG_FLAG_DIST   0x00000008
#define AMBG_FLAG_DOM    0x00000010
/* Bit flags up to 0x00080000 reserved for future use */
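A reader might test the flags along the following lines. This is a minimal sketch, not part of the specification: the helper names ambg_has_flag and ambg_unknown_flags and the AMBG_KNOWN_FLAGS mask are hypothetical, and the flag values are repeated (with an unsigned suffix) only so that the fragment compiles on its own.

```c
#include <stdint.h>

/* Flag values as listed for the AMBG chunk's decoderFlags field. */
#define AMBG_FLAG_UHJ    0x00000001u
#define AMBG_FLAG_PREF   0x00000002u
#define AMBG_FLAG_SHELF  0x00000004u
#define AMBG_FLAG_DIST   0x00000008u
#define AMBG_FLAG_DOM    0x00000010u

/* Hypothetical convenience mask: every flag this sketch knows about. */
#define AMBG_KNOWN_FLAGS (AMBG_FLAG_UHJ | AMBG_FLAG_PREF | AMBG_FLAG_SHELF | \
                          AMBG_FLAG_DIST | AMBG_FLAG_DOM)

/* Returns non-zero if the given flag bit is set in decoderFlags. */
static int ambg_has_flag(uint32_t decoderFlags, uint32_t flag)
{
    return (decoderFlags & flag) != 0;
}

/* Returns the bits that are set but are not among the flags listed above. */
static uint32_t ambg_unknown_flags(uint32_t decoderFlags)
{
    return decoderFlags & ~AMBG_KNOWN_FLAGS;
}
```

A decoder could, for example, call ambg_has_flag(flags, AMBG_FLAG_SHELF) before deciding how to treat shelf filtering, and use ambg_unknown_flags to notice bits it does not understand.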
typedef enum {
    W = 1,X,Y,Z,R,S,T,U,V,K,L,M,N,O,P,Q   /* enumerated B-Format channel label */
} bFormatLabel;

typedef struct {
    unsignedInt32 label;                          /* B-Format channel label */
    double64      coeffs[FormatChunk.nChannels];  /* the coefficients to recover one B-Format channel */
} BformatConv;
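The weighted combination that recovers each B-Format channel can be sketched as a small matrix-vector product applied to every sample frame. The function name ambg_recover_frame and the flat row-major layout of the coefficients are assumptions made for illustration; a real reader would first copy the coeffs arrays out of the chunk.

```c
#include <stddef.h>

/*
 * Recover B-Format samples from one frame of G-Format speaker feeds.
 *
 * coeffs  : numBformat rows of nChannels coefficients, row-major
 *           (row b holds the coeffs of the b-th BformatConv entry)
 * feeds   : nChannels speaker-feed samples, in file channel order
 * bformat : receives numBformat recovered B-Format samples
 */
static void ambg_recover_frame(const double *coeffs, size_t numBformat,
                               size_t nChannels, const double *feeds,
                               double *bformat)
{
    for (size_t b = 0; b < numBformat; ++b) {
        double sum = 0.0;
        for (size_t ch = 0; ch < nChannels; ++ch)
            sum += coeffs[b * nChannels + ch] * feeds[ch];
        bformat[b] = sum;
    }
}
```

The label of each BformatConv entry then says which B-Format channel (W, X, Y, ...) the corresponding output sample belongs to.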
A file containing the AMBG chunk will have the ".amg" file extension. This is to allow the operating system to route G-Format files to an Ambisonic decoder. Note that when creating files, software must use this file extension. However, when reading files, software should peek inside any WAVE-EX file, irrespective of its extension, to see if it contains the AMBG chunk. (This is an example of the robustness principle, "Be liberal in what you read, and conservative in what you write".)
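Peeking inside a file for the AMBG chunk might look like the sketch below. It relies only on the general RIFF layout (a 12-byte 'RIFF'/size/'WAVE' header followed by chunks, each with a 4-byte ID, a 4-byte little-endian size, and a pad byte after odd-sized data). The function names are hypothetical, and for brevity the sketch scans a file image already held in memory; a practical reader would more likely seek through the file.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian value (RIFF stores sizes little-endian). */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Returns 1 if buf (a RIFF/WAVE file image of len bytes) contains a
 * top-level 'AMBG' chunk, 0 otherwise. The file extension plays no part.
 */
static int wave_has_ambg_chunk(const unsigned char *buf, size_t len)
{
    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 ||
        memcmp(buf + 8, "WAVE", 4) != 0)
        return 0;

    size_t pos = 12;                      /* first chunk after the header */
    while (len - pos >= 8) {              /* room for an ID and a size */
        if (memcmp(buf + pos, "AMBG", 4) == 0)
            return 1;

        /* Skip header, data, and the pad byte that follows odd sizes. */
        uint32_t size = read_le32(buf + pos + 4);
        uint64_t advance = 8u + (uint64_t)size + (size & 1u);
        if (advance > (uint64_t)(len - pos))
            break;                        /* truncated or malformed file */
        pos += (size_t)advance;
    }
    return 0;
}
```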
Nimbus 4.0 presents the listener with speaker feeds decoded for speakers positioned at FrontLeft, FrontRight, BackLeft, BackRight. The samples in a WAVE-EX file must be interleaved in this order. The speaker feeds are for speakers arranged in a square.
The G-Format speaker feeds ("Energy" decode) are produced using:
FrontLeft = W + X/sqrt(2) + Y/sqrt(2)
FrontRight = W + X/sqrt(2) - Y/sqrt(2)
BackLeft = W - X/sqrt(2) + Y/sqrt(2)
BackRight = W - X/sqrt(2) - Y/sqrt(2)
Use of the Speaker Position chunk is optional, but recommended.
SPOSchunk.ID = {'S','P','O','S'};
SPOSchunk.dataSize = 4 + 4*4 + 4*4;
SPOSchunk.version = 1;
SPOSchunk.azimuths[] = {+45, -45, +135, -135};
SPOSchunk.elevations[] = {0, 0, 0, 0};
AMBGchunk.ID = {'A','M','B','G'};
AMBGchunk.dataSize = 4 + 4 + 4 + 3*(4 + 4*8);
AMBGchunk.version = 1;
AMBGchunk.numBformat = 3;
AMBGchunk.decoderFlags = AMBG_FLAG_UHJ | AMBG_FLAG_SHELF;   /* source was two-channel UHJ + shelf filters applied */
AMBGchunk.bFormatChannels[0] = {(unsignedInt32)W, +0.25, +0.25, +0.25, +0.25};
AMBGchunk.bFormatChannels[1] = {(unsignedInt32)X, +0.3536, +0.3536, -0.3536, -0.3536};
AMBGchunk.bFormatChannels[2] = {(unsignedInt32)Y, +0.3536, -0.3536, +0.3536, -0.3536};
/* 0.3536 = 0.25*sqrt(2) */
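The dataSize expression in this example follows a pattern: three 4-byte header fields (version, numBformat, decoderFlags), then for each B-Format channel a 4-byte label and one 8-byte coefficient per speaker feed. A sketch of that arithmetic, using a hypothetical helper name:

```c
#include <stdint.h>

/*
 * dataSize of an AMBG chunk, following the pattern of the examples:
 *   version + numBformat + decoderFlags            = 4 + 4 + 4 bytes
 *   per B-Format channel: label + nChannels coeffs = 4 + nChannels*8 bytes
 */
static uint32_t ambg_data_size(uint32_t numBformat, uint32_t nChannels)
{
    return 4u + 4u + 4u + numBformat * (4u + nChannels * 8u);
}
```

For three B-Format channels this gives 120 bytes with four speaker feeds and 144 bytes with five.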
The B-Format channels would be recovered using:
W = 0.25*(FrontLeft + FrontRight + BackLeft + BackRight)
X = 0.3536*(FrontLeft + FrontRight - BackLeft - BackRight)
Y = 0.3536*(FrontLeft - FrontRight + BackLeft - BackRight)
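As a check, the square decode and the recovery equations can be run back to back. The sketch below uses hypothetical function names and writes 1/sqrt(2) and 0.25*sqrt(2) at full double precision, whereas the chunk example stores the latter rounded to 0.3536.

```c
/* 1/sqrt(2) and 0.25*sqrt(2), written out so no maths library is needed. */
#define INV_SQRT2     0.70710678118654752440
#define QUARTER_SQRT2 0.35355339059327376220

/* Produce the four square speaker feeds from W, X, Y.
 * Feed order: FrontLeft, FrontRight, BackLeft, BackRight. */
static void square_encode(double w, double x, double y, double feeds[4])
{
    feeds[0] = w + x * INV_SQRT2 + y * INV_SQRT2;   /* FrontLeft  */
    feeds[1] = w + x * INV_SQRT2 - y * INV_SQRT2;   /* FrontRight */
    feeds[2] = w - x * INV_SQRT2 + y * INV_SQRT2;   /* BackLeft   */
    feeds[3] = w - x * INV_SQRT2 - y * INV_SQRT2;   /* BackRight  */
}

/* Recover W, X, Y from the four feeds using the AMBG coefficients. */
static void square_recover(const double feeds[4],
                           double *w, double *x, double *y)
{
    *w = 0.25          * (feeds[0] + feeds[1] + feeds[2] + feeds[3]);
    *x = QUARTER_SQRT2 * (feeds[0] + feeds[1] - feeds[2] - feeds[3]);
    *y = QUARTER_SQRT2 * (feeds[0] - feeds[1] + feeds[2] - feeds[3]);
}
```

Passing any W, X, Y through square_encode and then square_recover should return the original values to within floating-point rounding.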
Regular pentagon 5.0 presents the listener with speaker feeds decoded for speakers positioned at FrontLeft, FrontRight, FrontCentre, BackLeft, BackRight. The samples in a WAVE-EX file must be interleaved in this order. The speaker feeds are for speakers arranged in a regular pentagon with 72 degrees between adjacent speakers.
The G-Format speaker feeds ("Energy" decode) are produced using:
FrontLeft = W + X*cos(72°) + Y*sin(72°)
FrontRight = W + X*cos(72°) - Y*sin(72°)
FrontCentre = W + X
BackLeft = W - X*cos(36°) + Y*sin(36°)
BackRight = W - X*cos(36°) - Y*sin(36°)
Use of the Speaker Position chunk is optional, but recommended.
SPOSchunk.ID = {'S','P','O','S'};
SPOSchunk.dataSize = 4 + 5*4 + 5*4;
SPOSchunk.version = 1;
SPOSchunk.azimuths[] = {+72, -72, 0, +144, -144};
SPOSchunk.elevations[] = {0, 0, 0, 0, 0};
AMBGchunk.ID = {'A','M','B','G'};
AMBGchunk.dataSize = 4 + 4 + 4 + 3*(4 + 5*8);
AMBGchunk.version = 1;
AMBGchunk.numBformat = 3;
AMBGchunk.decoderFlags = 0;   /* source was three-channel UHJ + shelf filters not applied */
AMBGchunk.bFormatChannels[0] = {(unsignedInt32)W, +0.2, +0.2, +0.2, +0.2, +0.2};
AMBGchunk.bFormatChannels[1] = {(unsignedInt32)X, -0.2, -0.2, +0.8, -0.2, -0.2};
AMBGchunk.bFormatChannels[2] = {(unsignedInt32)Y, +0.2629, -0.2629, 0.0, +0.4253, -0.4253};
/* 0.2629 = 1/(sin(72 deg)*4), 0.4253 = 1/(sin(36 deg)*4) */
The B-Format channels would be recovered using:
W = 0.2*(FrontLeft + FrontRight + FrontCentre + BackLeft + BackRight)
X = 0.8*FrontCentre - 0.2*(FrontLeft + FrontRight + BackLeft + BackRight)
Y = 0.2629*(FrontLeft - FrontRight) + 0.4253*(BackLeft - BackRight)
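One way to check a set of conversion coefficients is to multiply the recovery matrix by the decode matrix and confirm the product is close to the identity, since recovering from the decoded feeds should give back W, X and Y unchanged. The sketch below does this for the pentagon example. The array and function names are hypothetical, the trigonometric values are written out to seven decimals, and the recovery coefficients are the four-decimal values from the chunk, so the product is only approximately the identity.

```c
#include <stddef.h>

/* Pentagon decode: rows are feeds FL, FR, FC, BL, BR; columns are W, X, Y.
 * cos(72) = 0.3090170, sin(72) = 0.9510565,
 * cos(36) = 0.8090170, sin(36) = 0.5877853 (angles in degrees). */
static const double pent_decode[5 * 3] = {
    1.0,  0.3090170,  0.9510565,
    1.0,  0.3090170, -0.9510565,
    1.0,  1.0,        0.0,
    1.0, -0.8090170,  0.5877853,
    1.0, -0.8090170, -0.5877853
};

/* Recovery: rows are W, X, Y; columns are FL, FR, FC, BL, BR. */
static const double pent_recover[3 * 5] = {
    0.2,     0.2,    0.2,  0.2,     0.2,
   -0.2,    -0.2,    0.8, -0.2,    -0.2,
    0.2629, -0.2629, 0.0,  0.4253, -0.4253
};

/* Largest absolute difference between (recover x decode) and the identity.
 * recover is numB x nCh, decode is nCh x numB, both row-major. */
static double max_identity_error(const double *recover, const double *decode,
                                 size_t numB, size_t nCh)
{
    double worst = 0.0;
    for (size_t i = 0; i < numB; ++i) {
        for (size_t j = 0; j < numB; ++j) {
            double sum = 0.0;
            for (size_t k = 0; k < nCh; ++k)
                sum += recover[i * nCh + k] * decode[k * numB + j];
            double err = sum - (i == j ? 1.0 : 0.0);
            if (err < 0.0)
                err = -err;
            if (err > worst)
                worst = err;
        }
    }
    return worst;
}
```

Calling max_identity_error(pent_recover, pent_decode, 3, 5) should return a value well below 0.001; the residual comes from rounding the coefficients to four decimals.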