An Introduction to the Opus Codec
This is my third article on audio codecs. The first two covered the common codecs such as G.711 and G.729, along with the relative newcomers iLBC and Microsoft’s RTAudio. I felt that it was important for people to have a general understanding of the pros and cons of several different codecs.
Just when you thought that there were enough codecs to last us a lifetime, another one jumps into the spotlight for its 15 minutes of fame. That codec is Opus, and for the next page or so I hope to clue you in as to what it is and why it exists.
Compared to G.711, which has been around since 1972, Opus is still an infant. It was specified by the Internet Engineering Task Force (IETF) in 2012, and because of its association with WebRTC, there is a lot of interest in it. Some of that interest is based on the technical merits of Opus, some because it has been proclaimed to be open-source, and some, frankly, because Google is so enamored with it.
Before I delve into these three areas, let’s first take a look at the history of Opus.
Opus actually comes from two independent efforts: SILK and CELT. SILK comes from Skype and was created to be a wideband codec that supported a variety of sampling frequencies (8, 12, 16, and 24 kHz) and bit rates (6 to 40 kbit/s). SILK’s primary use was for human speech. CELT was developed by xiph.org as a way to compress and transmit music with very little delay.
Opus incorporated the best aspects of these two codecs as a way to transmit music and speech over the Internet.
The Technical Merits of Opus
I took these numbers directly from the opus-codec.org homepage:
- Bit-rates from 6 kb/s to 510 kb/s
- Sampling rates from 8 kHz (narrowband) to 48 kHz (fullband)
- Frame sizes from 2.5 ms to 60 ms
- Support for both constant bit-rate (CBR) and variable bit-rate (VBR)
- Audio bandwidth from narrowband to fullband
- Support for speech and music
- Support for mono and stereo
- Support for up to 255 channels (multistream frames)
- Dynamically adjustable bitrate, audio bandwidth, and frame size
- Good loss robustness and packet loss concealment (PLC)
- Floating point and fixed-point implementation
In English, Opus is an extremely flexible, lossy codec (some audio data is discarded during compression and cannot be recovered) that can be used for low bit rate VoIP, where it outperforms existing codecs such as G.729 and Speex. At the same time, it supports high fidelity music with a quality that surpasses MP3.
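To put a few of the numbers from that list in perspective, here is a quick back-of-the-envelope sketch in Python. It is plain arithmetic and does not call any Opus library. The function names are my own, and the 20 ms frame is simply an assumed, commonly used VoIP setting rather than anything the list above mandates.

```python
# Back-of-the-envelope arithmetic for the published Opus ranges.
# Nothing here touches a real Opus encoder; all names are illustrative.

def samples_per_frame(sample_rate_hz: int, frame_ms: float) -> int:
    """PCM samples (per channel) covered by one frame."""
    return round(sample_rate_hz * frame_ms / 1000)

def bytes_per_frame(bitrate_bps: int, frame_ms: float) -> float:
    """Average encoded payload size of one frame at a given bit rate."""
    return bitrate_bps * frame_ms / 1000 / 8

def packets_per_second(frame_ms: float) -> float:
    """How many frames are sent each second, assuming one frame per packet."""
    return 1000 / frame_ms

FRAME_MS = 20  # assumed frame size for this example

# Smallest and largest frames at the two ends of the sampling-rate range.
print(samples_per_frame(8_000, 2.5))    # narrowband, 2.5 ms frame
print(samples_per_frame(48_000, 60))    # fullband, 60 ms frame

# Payload per 20 ms frame at the lowest and highest advertised bit rates.
print(bytes_per_frame(6_000, FRAME_MS))    # 6 kb/s
print(bytes_per_frame(510_000, FRAME_MS))  # 510 kb/s

print(packets_per_second(FRAME_MS))
```

With those assumptions, a 20 ms frame works out to 15 bytes of payload at 6 kb/s and 1,275 bytes at 510 kb/s, sent 50 times per second. That spread is a good way to appreciate just how wide a range a single codec is covering.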
Opus was designed to be an IETF standard with algorithms that are openly documented and a published reference implementation. However, Broadcom and xiph.org own software patents on some of the CELT algorithms, and Skype (i.e. Microsoft) owns patents on some of the SILK algorithms. All three organizations have pledged to make them royalty-free for use with Opus, a pledge tied to the codec's acceptance as an IETF standard (which, as noted above, happened in 2012), but they also reserve the right to make use of their patents to defend against infringement suits.
So, open source and patent-free, to a point. I am no lawyer, but I also have great faith in the IETF, and if they are comfortable with this arrangement, then who am I to question it?
The fact that Google has hitched its wagon to Opus isn’t something we can ignore. As one of the major drivers of WebRTC, Google has a lot of sway in its development and adoption. Having Opus as the default codec in Chrome is a big deal.
Of course, Google isn’t the only force driving WebRTC. Mozilla supports Opus for WebRTC in Firefox, and since Chrome and Firefox are the two predominant WebRTC browsers, the Opus exposure is huge.
From everything I’ve read and heard, Opus is here to stay. In fact, I recently spoke with the WebRTC team at AudioCodes and they told me that they run native Opus in their 420HD and 440HD phones. This means that you can create a WebRTC call from a browser to one of their phones without having to transcode. I expect that other vendors will follow along that same path and Opus will make its way into aspects of VoIP that have nothing to do with WebRTC. And you know me, I love unified communications when it is truly unified.