FWD : Proposal for a Residency

Subject: FWD : Proposal for a Residency
Date: Wed Sep 15 2004 - 06:29:21 EDT

From BEK mailing list.

Begin forwarded message:

From: geert <geert@xs4all.nl>
Date: 12. september 2004 15.30.47 MET
To: jeremy welsh <jjw@khib.no>
Subject: proposal for a residency

from Monty xiphmont@xiph.org


I've been asked to write a quick document covering surround support in
Ogg (and by Ogg, we actually mean the Vorbis audio codec), hitting on
two basic points: Why surround is important to Vorbis, and How we add
that support.

Abstract: Implementation of tuned 5.1 surround modes in Ogg

The Ogg Vorbis digital audio codec is the only perceptual audio
compression technology capable of supporting multi-channel surround
sound encoding that is free for use without licensing, royalty or
other restrictions.  While the Ogg Vorbis format has always supported
surround audio, no currently available encoder implements tuned modes
for the popular 5.1 surround format. As a result, the size reduction
with 5.1 material is both less than Ogg Vorbis is capable of and not
competitive on technical grounds with the proprietary digital audio
formats already in use. This limits the usefulness of this important
free alternative in film, digital video, and their online
distribution, music, and video games.

Our organization wishes to add such tuned modes to the reference
encoder in remedy of this situation. However, this work requires two
important resources that we do not have access to: a high-quality
surround-equipped studio for testing, and access to artists and mixing
engineers working in the 5.1 format and their work. With these
resources, tuned modes can be quickly developed to vastly improve the
compression efficiency of Ogg Vorbis with surround material, creating
a format that is, as it already is with stereo material, not only
free but technically superior.

Surround support in Ogg Vorbis: Background

Ogg Vorbis was designed from the start to support surround encoding.
The reference tools and libraries from Xiph.Org all support several
surround encodings including 5.1 surround.  However, neither the
reference encoder from Xiph.Org nor any third-party encoders currently
exploit inter-channel redundancy when encoding 5.1 streams, and thus
the expected bitrate of a 5.1 Vorbis stream is two to four
times higher than would be possible by eliminating inter-channel
redundancy.

As a point of example, the current reference Ogg Vorbis encoder does
exploit inter-channel redundancy via channel coupling when encoding
stereo, resulting in a stereo (two channel) stream that is only 15-20%
larger than a monophonic (one channel) stream.  One reasonably expects
even greater savings when eliminating inter-channel redundancy in a
5.1 stream.
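
To make the comparison concrete, here is some back-of-envelope
arithmetic as a small Python sketch.  The 64 kbps mono rate is an
arbitrary placeholder, and treating every uncoupled channel (including
the LFE) as costing a full mono stream is a simplifying assumption, not
a measurement.  Only the 15-20% stereo overhead and the two-to-four-times
figure come from the text above; the function names are illustrative.

```python
# Back-of-envelope bitrate arithmetic. All names here are illustrative.
MONO_KBPS = 64.0  # assumed placeholder, not a measured Vorbis bitrate


def coupled_stereo_kbps(mono_kbps, overhead):
    """Coupled stereo costs one mono stream plus a fractional overhead."""
    return mono_kbps * (1.0 + overhead)


def uncoupled_kbps(mono_kbps, channels):
    """Assumption: each uncoupled channel costs one full mono stream."""
    return mono_kbps * channels


def coupled_51_range_kbps(mono_kbps):
    """If uncoupled 5.1 is 2x to 4x what is achievable, divide by 4 and 2."""
    uncoupled = uncoupled_kbps(mono_kbps, 6)
    return uncoupled / 4.0, uncoupled / 2.0


if __name__ == "__main__":
    low = coupled_stereo_kbps(MONO_KBPS, 0.15)
    high = coupled_stereo_kbps(MONO_KBPS, 0.20)
    print("coupled stereo (kbps):", low, "to", high)
    print("uncoupled stereo (kbps):", uncoupled_kbps(MONO_KBPS, 2))
    print("uncoupled 5.1 (kbps):", uncoupled_kbps(MONO_KBPS, 6))
    print("coupled 5.1 estimate (kbps):", coupled_51_range_kbps(MONO_KBPS))
```

Under these assumptions, a tuned 5.1 stream would land somewhere
between one half and one quarter of the uncoupled figure; the real
numbers depend entirely on the material and on the tuning work.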

The 5.1 improvements would affect only the encoder and make use of
preexisting mechanisms in the Ogg Vorbis specification; all pre-existing
Vorbis decoders are already fully capable of 5.1 surround decode.

Importance of 5.1 surround support in Ogg Vorbis

It is a given that Ogg Vorbis was designed with surround encoding in
mind but it is also important to understand why first-class surround
support is essential to Vorbis in the first place.

The popularity of 5.1 surround encoding begins at the movies; the vast
majority of public cinema theaters today use 5.1 or a descendant of the 5.1
format.  This success in the public theater translates directly to the
home theater. Companies strive to sell the 'big theater experience in
the comfort of your own home', and these companies market 5.1 surround
as an essential component of the big theater experience.  DVDs account
for most movie rentals today and the majority of DVDs offer 5.1
surround.  Technicians and users accustomed to relatively low bitrate
streaming video on the Internet often overlook the fact that DVDs are
inherently digital video too; from internet to DVD we see a range of
low to high bitrate video making use of mono, stereo and surround
soundtracks provided by a range of audio codecs working alongside the
video.

Although surround audio exists predominantly alongside the domain of
movie-grade video, it also plays a role in commercial sound
recordings.  Stereophonic imaging has held a technological monopoly on
sound recordings for about fifty years, but greater-than-two channel
encodings may be gaining prominence.  The growing popularity of home
theater systems and computer games designed to work with 5.1 surround
provides both a first-order demand for 5.1-capable codecs and
a second-order effect lowering the barrier to entry for sound
recordings with surround imaging.  Quadraphonic encoding, and to an
even greater extent, Ambisonic encoding, failed to catch on thirty
years ago primarily due to technical fragility and greater costs;
surround encoding today is relatively free of both problems.  Surround
encoding of sound recordings now has a second chance at widespread
adoption.

Vorbis's predominant original design consideration was scalability.
It is well proven in low and mid-bitrate audio only streams as well as
for providing a mono or stereo soundtrack to streamed video.  However,
Vorbis was equally intended for use alongside high bitrate video as
well as lossy music archival and playback in both professional and
home systems.  5.1 surround is an integral part of the home theater
and [currently] a lucrative novelty when applied to sound recordings.
Vorbis was intended to scale for use in these niches as well, and as
such, the original design goals of Vorbis dictate first class 5.1
surround support.

In addition to technical design goals, Vorbis also carries the
societal directive of being a viable alternative to patented
technologies.  Other modern but IP-encumbered technologies in Vorbis's
niche include Windows Media (which is actually a collection of
disparate codecs), MPEG I, II, 2.5 and IV (also a collection of many
disparate codecs), RealAudio (a collection of licensed third-party
codecs such as Sony's ATRAC3), Dolby AC3 (Dolby Digital) and DTS.
Aside from Vorbis, no codec capable of 5.1 surround is available for
independent or unlicensed use.  Vorbis provides a viable alternative
for individuals and organizations that seek to avoid prohibitive
licensing costs or restrictions.

We recall that Ogg Vorbis already supports 5.1 encoding, although the
current reference encoder implements this encoding inefficiently.  Why
then regard 5.1 surround as if it were currently unsupported?  Current
Vorbis 5.1 uses two to four times the realistically needed bitrate to
encode.  This large relative overhead renders the implementation
unviable when compared to the IP-encumbered options.
This makes widespread deployment of current Vorbis 5.1 unlikely.

Proposal: 5.1 surround implementation in Ogg Vorbis

Although our encoder already allows greater-than-stereo encoding, its
psychoacoustic model does not currently attempt to exploit redundancy
("channel coupling") in multichannel recordings of more than two
channels.  The ability in the encoder is there, but the psychoacoustic
model to make use of that ability for 5.1 is missing; that is, the
problem is in analysis, not mechanism.  Thus, 'extending' Vorbis to
handle surround is a task of enabling and tuning channel coupling in
the encoder alone.  This encoding improvement does not require decoder
changes.

The primary challenge of handling 5.1 channel coupling effectively and
transparently is due to the ironic problem that 5.1 imaging doesn't
actually work very well; the imaging from the sides and rear tends not
to be seamless; instead, sound is usually easily localizable to the
speaker from which it's coming.  This has resulted in a host of sound
engineering tricks during audio postproduction that either cover this
basic inadequacy or try to make it look intentional.  Unfortunately,
different sound engineers use different tricks to synthesize the
surround effect; by way of example I offer a partial list:

.) true multiple-point recording (rare)

.) use of side/rear only for synthetic ambience that has no signal in
    common with left/center/right.

.) matrixed rear ambience (phase delayed/shifted/mixed rear derived
    from front channels)

.) additional spatial expansion (mostly synthetic/mathematical
    timing/phase tricks) applied to front and back L/R.

.) Neglecting rear entirely in acoustically 'dull' passages

.) Neglecting center entirely, or matrixing it from L/R.

.) Neglecting L/R entirely or using it only for synthetic stereo ambience
    generated from center ('stereo reverb')

.) Exact use of LFE depends somewhat on codec; many depend on LFE to
    help eliminate inter-frame blocking noise in the codec itself by
    moving high energy near-DC signal to a channel that is encoded
    differently (the .1).

This list is only partial; it highlights the need for a scalable
surround encoding to handle the above list of techniques (and more)
gracefully and efficiently.

In addition there is the issue of access to studio masters for clear
analysis and treatment of the above techniques. Unlike CD audio, which
was used extensively in developing the stereo mode tunings for the Ogg
Vorbis reference encoder, the 5.1 audio widely available on DVD discs
is already compressed with a lossy, perceptual digital audio codec, and
disentangling real effects from artefacts of the other compression
technology is generally impossible. This is why access to material and
to artists/engineers working directly in the studio is so important.

Vorbis channel coupling mechanisms currently exist to handle all these
cases, but the mechanisms are unused.  The 5.1 coupling in Vorbis
would use the same mechanisms as are used in Vorbis stereo.  These
mechanisms are fixed by the specification.  Thus, the 5.1 surround
support proposal primarily involves applying mechanisms that already
exist within both the encoder and decoder.  The work involved in this
process is predominantly one of acoustic design, tuning and rigorous
listening testing rather than writing code.
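
For readers unfamiliar with the stereo mechanism being reused, the
following Python sketch shows the lossless form of channel coupling on
a single pair of integer values.  The uncouple() function follows the
inverse coupling rule as I understand it from the Vorbis I
specification; couple() is an assumed encoder-side counterpart written
for illustration and is not taken from the reference encoder.  Both
function names are hypothetical.

```python
def couple(left, right):
    """Assumed forward step: map (left, right) to (magnitude, angle).

    The value with the larger absolute size becomes the magnitude; the
    angle carries the signed difference needed to recover the other one.
    """
    if abs(left) > abs(right):
        mag = left
        ang = left - right if left > 0 else right - left
    else:
        mag = right
        ang = left - right if right > 0 else right - left
    return mag, ang


def uncouple(mag, ang):
    """Inverse step: map (magnitude, angle) back to (left, right)."""
    if mag > 0:
        if ang > 0:
            return mag, mag - ang
        return mag + ang, mag
    if ang > 0:
        return mag, mag + ang
    return mag - ang, mag


if __name__ == "__main__":
    for pair in [(100, 98), (-7, -7), (5, -5), (0, 3)]:
        coupled = couple(*pair)
        print(pair, "->", coupled, "->", uncouple(*coupled))
```

Pairs whose channels are nearly identical yield angle values close to
zero, which is what makes coupling cheap.  Deciding which 5.1 channels
to pair, and where exact coupling can give way to lossy approximations,
is the tuning and listening work described above.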

This process requires:

.) A variety of uncompressed surround-encoded and surround-recorded
    audio material, as well as material that has been synthetically
    mixed in a number of different ways.  This material may be canned,
    or it may be generated on-the-fly by an assisting mixdown engineer.

.) Access to a high-quality surround mixdown and listening environment
    for extensive testing.

.) Access to surround and mixdown engineers experienced in surround
    recording and post-production, both in order to facilitate testing
    processes as well as to render expert judgement on audition


----- End forwarded message -----


This archive was generated by hypermail 2b27 : Sat Dec 22 2007 - 01:46:03 EST