Know Audio: Lossy Compression Algorithms And Distortion


In previous episodes of this long-running series on high-quality audio, we’ve stayed firmly in the real world of physical audio hardware. From the human ear to the loudspeaker, from the DAC to measuring distortion, this is all stuff that can happen on your bench or in your Hi-Fi rack.

We’re now going to diverge for the first time from the practical world of hardware into the theoretical world of mathematics, as we consider a very contentious topic in audio. It is now normal for audio to have some form of digital compression applied to it, and some of those schemes affect what is played back through our speakers and headphones. When a compression algorithm changes what we hear, that’s distortion in audio terms, but how much is it distorted, and how do we even measure that? It’s time to dive in and play with some audio files.

How Good A Copy Does A Copy Have To Be?

A reel-to-reel recorder from the famous Abbey Road studio in London. — Abbey Road’s tape recorders would have been about as good as it gets. Josephenus P. Riley, CC BY 2.0.

Were we to record some music with a good quality microphone and analogue tape recorder, we know that what came out of the speakers at playback would be a copy of what was heard by the microphone, subject to distortion from whatever non-linearities it has encountered in the audio path. But despite that distortion, the tape recorder is doing its best to faithfully record an exact copy of what it hears. It’s the same with a compression-free digital recording; record those musicians with a DAT machine or listen to them on a CD, and you’ll get back as good a copy as those media are capable of returning.

The trouble is that uncompressed audio takes up a lot of bandwidth, particularly when streaming over the Internet. Thus, just as with any other data format, it makes sense to compress it so that it takes up less space. There are plenty of compression algorithms to choose from, but with analogue sources there are more choices than there are with text or software. A Linux ISO has to decompress as a perfect copy of its original, otherwise it won’t run, while an image or an audio file simply has to decompress to something that looks or sounds like the original to our meaty brains.

Those extra compression options for analogue data take advantage of this; they use so-called lossy compression in that what you get out sounds just like what you put in, but isn’t the same. This difference can be viewed as distortion, and if you have ever saved an image containing text as a JPEG file, you’ll probably have seen it as artifacts around sharply defined edges.

So if lossy compression algorithms such as MP3 introduce distortion, how can we measure this? The analogue distortion analyser featured in our last installment is of little use, because the pure sine wave it uses is very easy for the compression algorithm to encode faithfully. Compression based on Fourier analysis is always going to do a good job on a single frequency. Another solution is required, and here the Internet is of little help. It’s time to set out on my own and figure out a way to measure the distortion inherent to an MP3 file.
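To see why a pure tone is such an easy test case, here is a minimal sketch that counts how many frequency bins are needed to hold 99% of a signal's energy. It uses a naive DFT from the Python standard library purely as an illustration. MP3 itself uses a filter bank, an MDCT, and a psychoacoustic model rather than a plain DFT, and the frame length, the tone, and the single-sample click are all assumptions chosen for the demonstration.

```python
import cmath
import math


def dft_energy(signal):
    """Return the energy in each DFT bin (naive O(n^2) transform)."""
    n = len(signal)
    energies = []
    for k in range(n):
        acc = sum(
            signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        energies.append(abs(acc) ** 2)
    return energies


def bins_for_energy(signal, fraction=0.99):
    """Count how many of the largest bins hold `fraction` of the total energy."""
    energies = sorted(dft_energy(signal), reverse=True)
    total = sum(energies)
    running = 0.0
    for count, energy in enumerate(energies, start=1):
        running += energy
        if running >= fraction * total:
            return count
    return len(energies)


N = 128  # illustrative frame length, not an MP3 frame size

# A sine with exactly 8 cycles in the frame: the analyser's kind of signal.
sine = [math.sin(2 * math.pi * 8 * t / N) for t in range(N)]

# A single-sample click: a crude stand-in for a percussive transient.
click = [1.0 if t == 0 else 0.0 for t in range(N)]

sine_bins = bins_for_energy(sine)
click_bins = bins_for_energy(click)
print("sine needs", sine_bins, "bins; click needs", click_bins, "bins")
```

The sine lands in just two bins (its positive and negative frequency), while the click spreads its energy across nearly the whole spectrum. A transform-based encoder with a limited bit budget has far less to throw away in the first case than in the second.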

Math Will Give Us The Right Answer!

A GNU Radio project for my analyser — GNU Radio is an extremely convenient way to perform these types of measurement.

At moments like these it’s great to be surrounded by other engineers, because you can mull it over and reach a solution. This distortion can’t be measured through my analogue instrumentation with a sine wave for the reasons discussed above, so it must instead be measured on a real-world sample. We came up with a plan: subtract the compressed version of a recording from the uncompressed one to get the difference between them, compute the RMS value of that difference, then calculate the ratio between that and the RMS of the uncompressed recording.
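The plan can be sketched in a few lines of Python. This is not the GNU Radio project itself, just the same arithmetic written out under some assumptions: both signals are lists of floats, they are the same length, and they are already time-aligned. That last point matters with real files, because an MP3 round trip typically adds some delay and padding that would need trimming first. The function names and the synthetic test signal are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def difference_ratio(original, processed):
    """RMS of (processed - original) divided by the RMS of the original."""
    if len(original) != len(processed):
        raise ValueError("signals must be the same length and time-aligned")
    difference = [p - o for o, p in zip(original, processed)]
    return rms(difference) / rms(original)


# Synthetic stand-in for real audio: a tone, and a copy with a 3% gain error.
original = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(1000)]
processed = [1.03 * s for s in original]

ratio = difference_ratio(original, processed)
print(f"difference ratio: {ratio:.4f}")
```

Note that the synthetic example produces its figure from nothing more than a small level change, which already hints that this ratio treats every kind of difference alike, audible or not.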

As is so often the case with this type of measurement, it’s relatively straightforward using GNU Radio as a DSP workshop. I created a GNU Radio project to do the job, and fed it an uncompressed and a compressed version of the same sample. I used a freely available recording of some bongo drums, and to make my compressed file I encoded it as a 128 kbit/s MP3, then decoded it back to a WAV file. You can find the project in my GitHub account, should you wish to play with it yourself.

Math Will Give Us The Wrong Answer!

The result it gives for my two bongo samples varies a little around 0.03, or 3%, depending upon where you are in the sample. What that in effect means is that the MP3 encoded version is around 3% different from the uncompressed one. If that were a figure measured on an analogue circuit using my trusty HP analyser I would say it wasn’t a very good quality circuit at all, and I would definitely be able to hear the distortion when listening to the audio. The fact that I can’t hear it raises a fundamental question as to what distortion really is, and the effect it has upon listeners.
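For readers who think in decibels rather than percentages, the same figure can be restated with the usual amplitude-ratio conversion. A small sketch, with hypothetical helper names:

```python
import math


def ratio_to_percent(ratio):
    """Express an amplitude ratio as a percentage."""
    return 100.0 * ratio


def ratio_to_db(ratio):
    """Express an amplitude (RMS) ratio in decibels."""
    return 20.0 * math.log10(ratio)


measured = 0.03  # the approximate figure from the bongo samples
print(f"{ratio_to_percent(measured):.1f} % is {ratio_to_db(measured):.1f} dB")
```

A ratio of 0.03 works out at roughly 30 dB below the level of the original signal.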

What I would understand as distortion due to non-linearities in the audio path is in reality harmonic distortion. Harmonics of the input signal are being created; if my audio path is a guitar pedal they are harmonics I want, while if it’s merely a very low quality piece of audio gear they’re an unwelcome degradation of the listening experience. This MP3 file has a measurable 3% distortion, yet I am not hearing it as such when I listen. The reason is that this is not harmonic distortion; instead it’s a very similar version of the same sound, which differs by only 3% from the original. People with an acute ear can hear it, but most listeners will not notice the difference.
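For comparison, here is a rough sketch of what roughly 3% harmonic distortion looks like in numbers. It passes a clean sine through a toy cubic nonlinearity and measures the third harmonic relative to the fundamental. The cubic transfer curve and its strength are assumptions picked to land near 3%; real equipment has more complicated curves and usually produces a whole series of harmonics.

```python
import math


def tone_amplitude(signal, cycles):
    """Amplitude of the component with `cycles` whole periods in the frame."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * cycles * t / n) for t, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * cycles * t / n) for t, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n


N = 1024
CUBIC = 0.11  # hypothetical strength of the toy nonlinearity

# A clean tone with exactly 4 cycles in the frame.
clean = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]

# The same tone after a gentle cubic "bend" of the transfer curve.
bent = [x - CUBIC * x ** 3 for x in clean]

fundamental = tone_amplitude(bent, 4)
third_harmonic = tone_amplitude(bent, 12)
thd = third_harmonic / fundamental

print(f"third harmonic relative to fundamental: {100 * thd:.2f} %")
print(f"third harmonic in the clean tone: {tone_amplitude(clean, 12):.2e}")
```

The clean tone has essentially nothing at three times its frequency, while the bent one has a new component there at about 3% of the fundamental. That new, harmonically related tone is what the ear picks out so readily.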

So In Summary: This Distortion Isn’t Distortion Like The Others

So in very simple terms, I’ve measured distortion, but not distortion in the same sense of the word. I’ve proven that an MP3 encoded audio source has a significant loss of information over its uncompressed ancestor, but noted that it is nowhere near as noticeable in the finished product as, for example, a 3% harmonic distortion would be. It’s thus safe to say that this exercise, while interesting, is a little bit pointless because it produces a misleading figure. I think I have achieved something though, by shining some light on the matter of audio compression and subsequent quality loss. In short: for most of you it won’t matter, while the rest of you are probably using a lossless algorithm such as FLAC anyway.