A DEEP DIVE INTO DOLBY MAT

Published June 30 2022, by Tom Devine

Someday, perhaps...however, Dolby MAT is not a final Jeopardy! answer. And no, you won’t need a roll of quarters and liquid Tide, either. In an acronym-obsessed industry, MAT, as in Dolby MAT, stands for Metadata-enhanced Audio Transmission (fortunately, someone in Dolby’s marketing department avoided the obvious).

Just as Kleenex and Jacuzzi represent brand names which have evolved into generalized references for their respective product categories, in the early days of digital home theaters, novices and enthusiasts alike commonly used Dolby Digital as the moniker for 5.1 surround sound, though more accurately, it describes the digital audio encode/decode system developed by Dolby Laboratories. From the advent (a pun for all the audio old timers, explained at the bottom) of Ray Dolby’s engineering endeavors in 1965 with Dolby A-type Noise Reduction, Dolby’s “double D” symbol has evolved to embody professional audio/video recording and playback technologies we all experience on a daily basis, in a myriad of ways. For the purpose of answering what is Dolby MAT?, let’s define a few things for context…

In the digital domain, Dolby bit-rate reduction technologies include encoded metadata, which in turn, describes encoded multichannel audio with instructions for precise control of downstream encoders and decoders. Encoded, compressed audio and metadata are transported together as a data stream via two digital audio channels, either professional AES/EBU, or consumer S/PDIF (Sony/Philips Digital Interface). This process is utilized by Dolby Digital, Dolby Digital Plus, Dolby AC-4, Dolby E and Dolby ED2 codecs. Dolby TrueHD is compressed in transport (on a Blu-ray or UHD Blu-ray disc) but is unpacked as a lossless, multi-channel codec.

DOLBY DIGITAL
Formally known as AC-3, Dolby Digital was the worldwide audio standard for DVD, and for HD broadcasting in the United States. As the basis from which professional Dolby E evolved, Dolby AC-3 is a lossy audio compression algorithm, an ingenious adaptation of the Discrete Cosine Transform (DCT). Originally conceived for use with video images, DCT made the concept of “lossy” images such as JPEG and MPEG acceptable by rounding the values used to express 8x8 blocks of pixels into a smaller number of values, grouped together to avoid redundant bits, and discarding bits deemed perceptually non-essential. An update of the DCT, called the Modified Discrete Cosine Transform (MDCT), featured a data “overlap” between adjacent blocks rather than an abrupt block boundary, which worked to avoid artifacts. Dolby adapted the MDCT algorithm, augmenting it with perceptual coding, a psychoacoustic principle that discards what is determined inaudible in a digital audio recording but would otherwise remain in the recording, consuming data space. Sounds typically selected to be discarded were those masked by louder events, making them effectively inaudible. With development completed in 1991, Dolby Laboratories is credited with creating the earliest audio compression standard, fully two years prior to the introduction of MP3.
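For readers who like to see the “overlap” idea in action, here is a minimal sketch of a textbook MDCT with a sine window. It is not Dolby’s AC-3 implementation; the function names and the tiny block size are assumptions chosen for illustration. It shows that each block taken alone comes back aliased, yet adding the overlapping halves of neighboring blocks restores the original samples.

```python
import math

def sine_window(length):
    # Sine window: w[n]^2 + w[n + length/2]^2 == 1 (Princen-Bradley condition).
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

def mdct(block):
    # Textbook MDCT: 2N windowed input samples -> N coefficients.
    two_n = len(block)
    n_half = two_n // 2
    w = sine_window(two_n)
    return [
        sum(w[n] * block[n]
            * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n in range(two_n))
        for k in range(n_half)
    ]

def imdct(coeffs):
    # Inverse MDCT: N coefficients -> 2N windowed, time-aliased samples.
    n_half = len(coeffs)
    two_n = 2 * n_half
    w = sine_window(two_n)
    return [
        2.0 / n_half * w[n]
        * sum(coeffs[k]
              * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
              for k in range(n_half))
        for n in range(two_n)
    ]

def overlap_add_roundtrip(signal, n_half):
    # Transform blocks of 2N samples at a hop of N, then overlap-add the results.
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n_half + 1, n_half):
        restored = imdct(mdct(signal[start:start + 2 * n_half]))
        for i, value in enumerate(restored):
            out[start + i] += value
    return out

# Illustrative input only: 32 samples, blocks of 16 with a hop of 8.
signal = [math.sin(0.3 * i) + 0.1 * i for i in range(32)]
restored = overlap_add_roundtrip(signal, 8)
```

Only the samples covered by two overlapping blocks (indices 8 through 23 here) are fully reconstructed; the first and last half-blocks have no neighbor to cancel their aliasing. A lossy codec would quantize the coefficients between `mdct` and `imdct`, which this sketch deliberately leaves out.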

No stranger to tape hiss from research with analog Type-A, Type-B and Type-C Noise Reductions, Dolby turned their attention to film, considering alternatives to the analog magnetic strip on 35mm film for improvements to cinema sound, and arriving at an optical retrieval system. Encoded patterns, superficially similar to today’s QR codes, were placed within the space between the sprocket holes that pulled the film through the projector. To economize space and the total number of bits required, the AC-3 process performed analysis during post-production, removing inaudible data at a rate of approximately 13 to 1. On the film itself, the data representing the six discrete channels was retrieved as one channel and sent to the AC-3 processor for decoding, where it was restored to 18-bit data words, then separated into the individual left, center, right, subwoofer, and surround channels. Each 18-bit channel was converted into analog audio for playback. Dolby Digital provided Hollywood with content creation tools to steer sound to specific channels, introducing a new experience to theatrical presentations. Suspension of disbelief, the goal for any director, had the added dimension of audio “moving” in tandem with events on the screen, such as a locomotive traveling from left to right. The LFE channel helped underscore screen sequences with sound that could be felt, while the ambient rear channels ensured the sensation would be called “surround sound”.

The 1992 film Batman Returns was the first theatrical release in Dolby Spectral Recording-Digital (SR-D), or Dolby Digital/AC-3, heralding a turning point in cinema history. Movie-goers clamored to bring the experience into their homes, and Dolby was all too eager to oblige.

DOLBY E
Debuting in 1999, Dolby E was the professional evolution of the Dolby Digital system for theatrical content and multichannel audio creation with HDTV, designed into the ATSC 1.0 standard (in existence today but recommended to “twilight” in 2023 as ATSC 3.0 gains momentum). Dolby E was a lossy compression algorithm containing Professional Metadata (PMD) that also included SMPTE timecodes, plus configuration information for use by downstream decoders. Consumer metadata was also included, as the bitstream contained multi-format information, from simple Lt/Rt stereo up to Dolby Digital surround encoding or Dolby Digital Plus encoding. Dolby E accepted up to 8 channels (7.1, or 5.1 with Lt/Rt stereo) of baseband PCM audio with metadata, compressed onto a 20-bit, 48kHz AES channel pair, or 6 channels plus metadata fit into a 16-bit, 48kHz AES pair. The goal for Dolby E was development of a production algorithm capable of maintaining high audio quality while enduring multiple encode/decode cycles, for a minimum of ten generational passes. Dolby E was distributed by or embedded into Serial Digital Interface (SDI) when content was passed around within and between production facilities.
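The channel counts above imply a rough compression ratio, which a little arithmetic makes concrete. The sketch below assumes the source PCM has the same bit depth as the AES pair carrying it and ignores metadata overhead, so treat the results as ballpark figures rather than published Dolby E specifications. The function names are made up for illustration.

```python
SAMPLE_RATE_HZ = 48_000  # sample rate quoted above for the AES pair

def pcm_bit_rate(channels, bit_depth, sample_rate_hz=SAMPLE_RATE_HZ):
    # Raw bit rate of uncompressed PCM, in bits per second.
    return channels * bit_depth * sample_rate_hz

def aes_pair_capacity(bit_depth, sample_rate_hz=SAMPLE_RATE_HZ):
    # An AES pair carries two channels' worth of samples.
    return 2 * bit_depth * sample_rate_hz

def implied_ratio(channels, bit_depth):
    # How much the audio must shrink to fit the pair (assumption: same bit depth).
    return pcm_bit_rate(channels, bit_depth) / aes_pair_capacity(bit_depth)

ratio_8ch_20bit = implied_ratio(8, 20)  # 8 channels into a 20-bit pair
ratio_6ch_16bit = implied_ratio(6, 16)  # 6 channels into a 16-bit pair
```

Under these assumptions, 8 channels in a 20-bit pair works out to about 4:1 and 6 channels in a 16-bit pair to about 3:1, which is gentle compared with consumer codecs.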

Dolby E never directly reached the consumer as it was decoded downstream into PCM. Instead, Dolby E resided in what the broadcast and production community refer to as “Mezzanine” level. Mezzanine uses low level video compression for storage (Dolby E represented the embedded, accompanying soundtrack), capable of multiple use passes with minimal degradation, and used exclusively in cinema workflow applications or live HD/UHDTV production. Mezzanine level restrains aggressive compression, with ratios ranging from 2:1 to 8:1. The term “mezzanine” is a reference to its positioning in the production workflow, just below uncompressed signals but remaining above consumer level, while also an industry-insider homage to classic theater design.

Dolby E played a fundamental role in the trajectory of Dolby Laboratories technologies by serving as bedrock for content production worldwide and as a key to long-term aspirations.

DOLBY ED2
Dolby ED2 was the next generation extension to the production-level Dolby E mezzanine audio codec, designed for HD-SDI infrastructure transport of immersive audio with companion metadata. As mentioned above, Dolby E supported surround-sound audio as multiple streams; however, it lacked the extension provisions necessary for Dolby Atmos audio and key metadata, such as loudness information. Dolby ED2 introduced a revised and expanded bitstream supporting up to 16 audio channels, with 8 channels (each with two substreams) carried on a single AES3/SDI audio pair.

The new bitstream topology manages immersive audio plus rendering for existing formats. Functionality is expanded for higher channel count audio to be transported over bandwidth-constrained mediums like satellite or terrestrial fiber links, implemented into a compressed format that retains sample-accurate alignment at the reception point. As an example, Dolby Atmos may be delivered in 5.1.4 or 5.1.2 channel-based audio, as multiple audio “objects” with three-dimensional positions, or combinations of both channels and objects. Channel configuration combines channel pairs such as Ch 1/2 representing the Stereo Mix, with Ch 5/6 carrying Center and LFE information, plus music and effects. Those with Meridian Audio experience recognize similar channel parsing, though with Dolby ED2 the carriage is HD-SDI, not PCM. Dolby ED2 is backward compatible with Dolby E, such as downstream pass-through, and also decodes Dolby E-compatible metadata within an ED2 stream. ED2 transformed Dolby’s evolution into a revolution, with the creation of the 2012 Pixar/Walt Disney release, Brave, in Dolby Atmos and the most significant quantum leap forward in the audience cinematic experience since 1992’s Batman Returns.

DOLBY DIGITAL PLUS
Introduced at the 2005 Consumer Electronics Show, Dolby Digital Plus (also known as E-AC-3, or Enhanced AC-3) represented a superset of effective codec updates rather than any significant departure from Dolby Digital. Forward-looking transmission and decoding improvements propelled Dolby Digital into Dolby Digital Plus. It was the first Dolby codec with HDMI capability, and while pre-dating high definition disc formats and multichannel sound for HDTV by a few years, Dolby Digital Plus represented the forefront of Dolby’s roadmap and incremental march toward the future of consumer surround sound.

Implemented into Blu-ray Disc media, Dolby Digital Plus supported up to 7.1 discrete channels, using a 5.1 “core” plus expansion design, with future support for up to 15.1 discrete, full-bandwidth channels. While the bitstream itself was not backward compatible with legacy Dolby Digital devices, a mandatory 5.1 conversion compatibility was designed into the codec. Dolby Digital Plus upped the data rate range from an original 32 kbps to 640 kbps to a robust 32 kbps to 6 Mbps, as the new codec used less compression. While still a lossy codec, a key difference for Dolby Digital Plus was its ability to deliver the Dolby Atmos format to AVRs and Pre-Processors equipped with the Atmos feature. Optimized for digital transmission with a lower bit rate than Dolby TrueHD, Dolby Digital Plus compresses Atmos content to a 48kHz sampling rate, making it ideally suited for cable broadcast and online streaming.

DOLBY DIGITAL PLUS with ATMOS, also referred to as Dolby Digital Plus JOC (Joint Object Coding), renders the spatially coded objects to a backward compatible 5.1 or 7.1 core mix, with side metadata generated to extract individual objects from the mix. The bit rate of the encoding determines the number of elements: 384 kbps uses 12 elements, while bit rates of 448 kbps and above use 16 elements.
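As a quick illustration of the rule just stated, the bit rate to element count relationship can be written as a small lookup. This is only a sketch built from the two data points given here; the function name is hypothetical, and any bit rate the article does not mention is rejected rather than guessed.

```python
def joc_element_count(bit_rate_kbps):
    # Element count for Dolby Digital Plus with Atmos, per the figures quoted above.
    if bit_rate_kbps >= 448:
        return 16
    if bit_rate_kbps == 384:
        return 12
    # Assumption: behavior at other bit rates is not stated, so refuse to guess.
    raise ValueError(f"element count not documented for {bit_rate_kbps} kbps")

streaming_tiers = {rate: joc_element_count(rate) for rate in (384, 448, 768)}
```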

Dolby Digital Plus is the primary codec for streaming Dolby Atmos. This is the codec Apple has used since tvOS 12 and the introduction of the Apple TV 4K device.

DOLBY TrueHD
Like other Dolby Digital multichannel codecs, Dolby TrueHD uses compression for efficiency in signal transmission. Unlike other Dolby Digital multichannel codecs, Dolby TrueHD uses lossless compression, with decoding restoring the signal bit-for-bit to the master recording. Meridian Lossless Packing (MLP) is the mathematical basis Dolby TrueHD uses for compressing audio samples, originally conceived by digital pioneer Bob Stuart, co-founder of the United Kingdom’s Meridian Audio. Dolby TrueHD provides for 16 discrete, 24-bit/192kHz audio channels, with flexibility allowing for transmission and storage of up to 32 channels, with up to 24-bit precision from three sampling rates: 44.1kHz, 96kHz, and 192kHz. Dolby TrueHD outputs stereo, 6-channel 5.1, 8-channel 7.1, and “piggy-backed” Atmos metadata bitstreams. Blu-ray players with a build date of 2013 or later that support Dolby TrueHD should also support Dolby Atmos (with Dolby Atmos encoded discs), but always verify through product descriptions or specifications. Some current players, odd as it might seem, do not.

Though Dolby TrueHD was only an optional codec for Blu-ray discs, it was developed exclusively for Blu-ray, preparing for and being re-energized when UHD Blu-ray with Atmos debuted. DTS-HD Master Audio, a direct competitor to Dolby TrueHD, has not garnered much presence in the professional community. Dolby Atmos is the audio companion to Dolby Vision, and professionally the unified workflow is overwhelmingly favored by content creators and the post-production community worldwide. DTS, acquired in 2016 by Xperi, has a partnership with IMAX theatrical, and is the audio technology for the IMAX Enhanced consumer format.

DOLBY AC-4
Development for Dolby AC-4 was initiated in 2011 as successor to AC-3 with a focus on compression efficiency as an economical means to offset growing bandwidth consumption by enhanced video in UHD 4K streaming delivery. The intent was to emulate uncompressed audio performance, comparable to lossless Dolby TrueHD. Dolby states ‘AC-4 is an audio delivery system designed from a clean sheet’. Adopted by ATSC 3.0 in 2017 as the audio codec for terrestrial UHD 4K broadcasting, Dolby AC-4 has wide-ranging goals for streaming and mobile devices. To date, Dolby AC-4 has not revealed much of a presence industry-wide, likely due to a lack of backward compatibility to AC-3 technologies. As described above, Dolby Digital Plus with an AC-3, 5.1 core provided a means for the codec to fall back upon, for decoding compatibility with legacy Dolby Digital devices. Dolby AC-4 content may only be decoded by compatible AC-4 devices. This limits options for streaming services and broadcasters in comparison to Dolby Digital Plus which can be streamed, with or without Atmos metadata, and containing Dolby Digital 5.1/2.0 fallback in the same package. Now at the five-year mark from introduction, applications for AC-4 appear limited to ATSC 3.0 broadcasting, itself slow to take hold. It remains to be seen if content creators, content providers, and hardware manufacturers wholly embrace the codec outside of terrestrial broadcasting, though Dolby AC-4 shows some traction and is growing with Dolby Atmos Music and portable devices.

SO, WHAT ABOUT DOLBY MAT, JUST WHAT IS IT?
We see now what form content takes, from the former to the current paths by which it is processed for distribution. And now it appears folks are just starting to hear about Dolby MAT, Metadata-enhanced Audio Transmission. Although sometimes depicted otherwise elsewhere, Dolby MAT is neither a codec nor a format. While Dolby MAT utilizes an encode/decode algorithm, it is constrained within a closed environment, devoid of a final analog stage. Dolby MAT might be defined as an encode/conversion/transport/conversion/decode process, a “bridge” created between compatible Dolby MAT devices for any of the three codecs Dolby has designated to deliver Dolby Atmos content to the home: Dolby Digital Plus/JOC, Dolby TrueHD, and, with ATSC 3.0 tuners, Dolby AC-4. Dolby MAT takes advantage of the eight 16-bit, 192kHz audio carrier lanes in the HDMI standard, starting with HDMI 1.3, and aggregates the bandwidth of multiple lanes to establish a data transport layer.
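To put a number on those lanes, the aggregate capacity is simple multiplication. The sketch below uses only the lane count, word length, and sample rate quoted above; the constant and function names are made up for illustration, and real HDMI audio packets carry framing overhead that is ignored here.

```python
HDMI_AUDIO_LANES = 8          # lane count quoted above
LANE_BIT_DEPTH = 16           # bits per sample on each lane
LANE_SAMPLE_RATE_HZ = 192_000

def aggregate_capacity_bps(lanes=HDMI_AUDIO_LANES,
                           bit_depth=LANE_BIT_DEPTH,
                           sample_rate_hz=LANE_SAMPLE_RATE_HZ):
    # Raw payload capacity when every lane is pooled into one transport layer.
    return lanes * bit_depth * sample_rate_hz

capacity_mbps = aggregate_capacity_bps() / 1_000_000
```

Pooling all eight lanes gives roughly 24.6 Mbps of raw capacity, far more than a single 16-bit, 192kHz lane at about 3.1 Mbps.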

The explanations above for each Dolby codec depicted Dolby Lab’s expansion ambitions at each stage, unrealized when the technologies first made inroads into the marketplace. We now see how they unfolded to dramatically enhance the theatrical experience and why that technology was intended for and has made it into the home – including Dolby Vision.

Let’s illustrate Dolby MAT using a UHD Blu-ray player as an example. A Dolby MAT encoder resident onboard the player packs the variable bit-rate Dolby TrueHD bitstreams for output. Within the MAT encode process, the bitstreams are encapsulated into MAT frames, converted to LPCM, and ferried over HDMI 1.3 or later at a fixed bit rate into a compatible AVR/processor, where a Dolby MAT decoder unpacks the MAT frames and converts them back into the original Dolby TrueHD bitstreams. The set-up configuration of the AVR/processor determines which bitstream is selected for decoding into analog and routes signals to assigned channels.
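The general idea of carrying a variable bit-rate stream inside fixed-size frames can be sketched in a few lines. To be clear, this is a toy model and not the actual Dolby MAT frame layout, which is not described here: the 64-byte frame size, the 2-byte length header, and the function names are all assumptions made purely for illustration.

```python
FRAME_SIZE = 64   # hypothetical fixed frame size in bytes
HEADER_SIZE = 2   # hypothetical header holding the payload length

def pack_fixed_frames(chunks, frame_size=FRAME_SIZE):
    # Wrap each variable-size chunk in a fixed-size frame, padding with zeros.
    frames = []
    for chunk in chunks:
        if len(chunk) > frame_size - HEADER_SIZE:
            raise ValueError("chunk does not fit in one frame")
        header = len(chunk).to_bytes(HEADER_SIZE, "big")
        padding = bytes(frame_size - HEADER_SIZE - len(chunk))
        frames.append(header + chunk + padding)
    return frames

def unpack_fixed_frames(frames):
    # Read the length header and strip the padding to recover each chunk.
    chunks = []
    for frame in frames:
        length = int.from_bytes(frame[:HEADER_SIZE], "big")
        chunks.append(frame[HEADER_SIZE:HEADER_SIZE + length])
    return chunks

# Stand-ins for variable bit-rate access units of different sizes.
source_chunks = [b"a" * 10, b"b" * 40, b""]
frames = pack_fixed_frames(source_chunks)
recovered = unpack_fixed_frames(frames)
```

Every frame on the link is the same size regardless of how busy the soundtrack is, which is what lets the transport run at a fixed rate, and the receiving side gets back exactly the bytes that went in.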

At the introduction of Dolby Atmos, Dolby MAT technology was expanded to support encoding and decoding of Dolby Atmos metadata incorporated in lossless pulse-code modulation (PCM) audio. In this expanded form, dubbed Dolby MAT 2.0, a key benefit is that object-based audio information is dynamically encoded in real time by the source device, limiting latency and reducing processing complexity.

Due to bandwidth constraints and limited processing power, Atmos in consumer home theaters differs greatly from cinema. For the home, spatially-coded panning metadata is present in Dolby MAT 2.0 and is an efficient representation of the original object-based mastering mix. To reduce bit rate, Atmos “objects” positioned near the speakers intended to localize their acoustic presence are clustered together to form aggregate objects, which are then dynamically panned by the spatial coding. Content creators control object placement and clustering strength during the mastering process with the Dolby Atmos Production Suite tools.
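Clustering is easier to picture with a toy example. The sketch below is a simple greedy proximity grouping, not Dolby’s spatial coding algorithm: it uses distance between objects as a stand-in for “near the same speakers”, and the function names, coordinates, and radius are all assumptions for illustration.

```python
import math

def centroid(points):
    # Average position of a group of (x, y, z) points.
    count = len(points)
    return tuple(sum(p[axis] for p in points) / count for axis in range(3))

def cluster_objects(positions, radius):
    # Greedy grouping: join the first cluster whose centroid is within radius,
    # otherwise start a new cluster. Returns lists of object indices.
    clusters = []
    for index, position in enumerate(positions):
        for members in clusters:
            center = centroid([positions[i] for i in members])
            if math.dist(position, center) <= radius:
                members.append(index)
                break
        else:
            clusters.append([index])
    return clusters

# Four hypothetical objects in a normalized room: two near the front-left
# corner, two near the rear-right ceiling.
object_positions = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0),
                    (1.0, 1.0, 1.0), (0.9, 1.0, 1.0)]
clusters = cluster_objects(object_positions, radius=0.25)
aggregates = [centroid([object_positions[i] for i in group]) for group in clusters]
```

Here four objects collapse into two aggregate objects, each placed at the average position of its members. A larger radius plays the role of a stronger clustering setting: fewer aggregates to transmit, at the cost of positional precision.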

In a process nearly identical to the example above, let’s now use a 7.1 Dolby Atmos disc to describe Dolby MAT 2.0. To recreate the entire Atmos mix for the downstream Atmos-capable decoder, object elements must be removed to reveal the 7.1 “bed” channels. The Dolby MAT 2.0 Atmos-to-7.1 fold down process in the encoder packages the metadata separately from the bed channels and attaches it to the LPCM audio signal. After transport, once the metadata activates the Atmos decoder, the elements carried in the bitstream (from 12 to 16) are fully unpacked and routed to the Atmos rendering engine in the AVR/processor, for acoustic positioning per the speaker configuration entered during AVR/processor setup.

Currently available sources with Dolby MAT encoders include the streaming devices Apple TV 4K (tvOS 12 through current), Roku Ultra 4K, Amazon Fire TV Cube, Nvidia Shield (2019 and later), and Amazon Fire TV Stick 4K (through TVs with Dolby Atmos certification). Content providers supporting Dolby Atmos include Vudu, Netflix (the Netflix Premium tier is required), HBO Max, and Disney Plus. Game consoles include Xbox One X/S and Sony PS4 (PS4 for Blu-ray Disc playback only). Televisions with ATSC 3.0 tuners in markets with broadcasting use the Dolby AC-4 codec to receive Atmos 7.1.4, plus a dialog enhancement feature called Voice Plus. Volume leveling, called Real Time Loudness Leveler, is also included with the codec.

Dolby is rather emphatic that the last Dolby MAT 2.0 capable device in the signal chain should perform the conversion into analog.

DOLBY MAT AND AVPRO EDGE PRODUCTS
On the input configuration page for the newly introduced AVPro Edge switcher GUI, at the bottom, are two Dolby Labs technology-related checkboxes.

One is for allowing Low Latency Dolby Vision (also known as L5, Device-Led, or in HDMI 2.1, Source-Based Tone Mapping) to pass through to displays. Our engineering team recommends this be selected as some display manufacturers, such as Sony, use the Device-Led approach instead of using onboard Dolby Vision processing. The second checkbox, Support Dolby MAT, allows Dolby MAT pass-through when selected; when unchecked, it puts a metadata block in place. As described above with Blu-ray or UHD Blu-ray Disc content, Dolby TrueHD is the only Dolby codec to transport Dolby Atmos, via Dolby MAT, to a compatible AVR/processor. This codec will pass through any AVPro Edge switcher with the box unchecked.

For streaming devices supporting content providers that offer Dolby Atmos, Dolby Digital Plus uses Dolby MAT 2.0, and for these sources you will need the box checked to enable pass-through.

Dolby MAT encoded content will pass through all AVPro Edge extender kits and Bullet Train cables; however, at this time Dolby MAT will not pass through MXNET. Our forthcoming MXNET-10G products, slated to debut later this summer, will be compatible with Dolby MAT/Dolby MAT 2.0.

Current televisions supporting Dolby Atmos include selected series from LG, Philips, and TCL; however, check the manufacturers’ documentation for compliant models. TVs built in 2018 or later that support eARC are designed to support Dolby Atmos pass-through to a compatible AVR/processor (or compatible soundbar) for Dolby TrueHD and Dolby Digital Plus. Only Dolby Digital Plus may be passed through via ARC, depending on the capability of the ARC port and the TV manufacturer. The HDMI ARC standard can support 1-3 Mbps of data, whereas HDMI eARC can support up to 37 Mbps.
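Those two bandwidth figures explain why lossless audio needs eARC. The sketch below is a bit-rate comparison only, using the ARC and eARC limits quoted above; the function name and example streams are assumptions, and real-world compatibility also depends on which formats the TV and AVR advertise to each other.

```python
ARC_MAX_BPS = 3_000_000     # upper end of the 1-3 Mbps figure quoted above
EARC_MAX_BPS = 37_000_000   # eARC figure quoted above

def links_that_fit(bit_rate_bps):
    # Return which return-channel types have enough raw bandwidth.
    links = []
    if bit_rate_bps <= ARC_MAX_BPS:
        links.append("ARC")
    if bit_rate_bps <= EARC_MAX_BPS:
        links.append("eARC")
    return links

# Hypothetical example streams.
streaming_ddp_bps = 768_000               # a lossy Dolby Digital Plus stream
pcm_8ch_24bit_192k_bps = 8 * 24 * 192_000 # uncompressed 7.1 at 24-bit/192kHz

ddp_links = links_that_fit(streaming_ddp_bps)
pcm_links = links_that_fit(pcm_8ch_24bit_192k_bps)
```

The lossy stream fits either link, while eight channels of uncompressed 24-bit/192kHz audio come to about 36.9 Mbps and only just squeeze under the eARC ceiling.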

We used our Murideo 8K SEVEN generator to pull EDID information from LG and Sony OLEDs that are approximately two years old. Audio formats supported included 2-ch LPCM at 16-24 bit from 32kHz to 192kHz, Dolby AC-3 for 5.1 Dolby Digital at 32kHz-48kHz, 7.1 Dolby Digital Plus at 16 bit and 32kHz-48kHz, and 7.1 MAT (MLP) at 16-20 bit and 48kHz. Both displays are capable of taking signals from the birth of consumer digital sound through present day and beyond (Dolby MAT is capable of decoding ATSC 3.0 Dolby AC-4 7.1.4 audio information).

In the HDMI age, especially during the overlapping twilight period between one version and the dawn of its successor, configuration is critical for reliably transporting signals from all platforms to and through every intended product. AVPro Edge would love to provide easy-to-follow connection guides, sparing our customers and their clients a great deal of unnecessary anguish; however, in an era where so many technology caveats intersect, doing so is all but impossible due to potential version conflicts among the mix of products in the consumer home. Reading the fine print in all manufacturers’ instructions and specifications becomes more important than ever before.

Dolby is often unfairly criticized as having an unmatched level of influence over technology in the entertainment content creation and consumer playback industries. One response might be, nothing Dolby conceives proceeds forward without SMPTE and ITU consent and implementation. At one point in time, the Dolby name was synonymous with surround sound, either at the cinema or in the home. While the media we consume today is seldom held in our hands, like a cassette tape, or a shiny movie disc, siding with the consensus that Dolby Laboratories and its technologies have enriched all our lives beyond measure in all likelihood aligns you with the majority.

While there is no door prize for knowing the subject of the pun near the top, it’s a reference, if not homage, to Henry Kloss and his Advent Corporation, which was the first consumer audio cassette recorder manufacturer to incorporate Dolby Noise Reduction technology (Type-B). It was considered the first “hi-fi” cassette recorder, combining Dolby B Noise Reduction and chromium dioxide tape with a commercial-grade tape transport mechanism.
