Audio Restoration – Getting Recordings to Work

In Facebook groups and on places like Upwork, it is really common to see people whose recordings have not turned out as expected and who hope to somehow get from a problematic recording to something “perfect”.

Sometimes this is very doable, sometimes not. It all depends on the raw recording.

Now before some Capt Helpful pipes up and says to use the Noise Reduction filter in Audacity and it will be amazing, or Major Overlord announces that one must use Klenatone NX-75.68-9 Pro as it is the Best, take care as magic and A.I. do not really exist (at least not in this way at this time).

Before we get into things, we need to understand the nature of recorded audio and how the human brain reacts to sound.

Examples are at the end of the article.

Recorded Audio

Recorded audio is like an omelet. Once the eggs, mushrooms, capsicum etc have been cooked as an omelet, they cannot be returned to their former state. The eggs cannot be returned to their shells as though nothing ever happened, they are scrambled for eternity.

While we could indeed pull the individual ingredients back out to some extent – assuming the mushroom & capsicum were chopped, not pureed – there will always be bits of egg stuck to the mushroom & capsicum. Things like cayenne powder probably dissolved so even an industrial separator is not likely to return egg and cayenne to the way they were at 7:13 AM Wed 23rd January 2021.

If you want an omelet without mushrooms, capsicum & cayenne, you need it cooked that way in the first place. Otherwise, we must accept that these other things are in our eggy brekkie. That or order scrambled eggs in the first place.

Human Brain & Sound

The way the human brain interacts with sound is absolutely amazing. We do not see and hear the world around us as it is, we perceive it. That means that everything we see & hear is based on how our brains process the input. The human brain tosses out a lot of the info that arrives as we honestly cannot handle all the info pouring in at once so we squish and process into a manageable form, often with emotive values bundled in.

This is very different from the way that a recording medium processes the same audio input. This means that we can be sitting in a room listening to a person speak and they seem really clear, yet if we pop a recorder (tape, digital, no real difference) on the chair next to us and listen in the car on the way home, suddenly that person seems “swamped” by all these other sounds in the room that we were not aware of at the time. The recording “heard” the room as it was based on where it was sitting, we heard what we wanted/needed based on a lot of processing.

Therefore the live event felt clear but the recording is messy. It is harder for the brain to process audio off a recording as it lacks a lot of the extra pieces of information that help it decide what to ignore, so we hear all the foot scraping, lolly wrappers, whispers, chair creaks etc. Worse, it feels uncomfortable so we tend to a) want them gone as they feel wrong, and b) focus on them more seeing they make us uncomfortable.

This is the point at which frustrated people go online looking for a magical solution.

Enter the Mix Engineer


And they so often get told things that do not help by people who do not know any better themselves (despite probably making claims, or at least implying expertise).

The info above is the foundation that any properly skilled Audio Engineer should know and understand as it is the central key to making recorded audio work well. Recorded sound is an illusion, just as a movie is all illusion. Capt Jack Sparrow is not actually a Pirate out of the Caribbean Port of Tortuga. He is some fellow called Johnny Depp with a fancy wig and the ability to make us feel piratey.

The Mix Engineer has access to certain tools and needs to know when to use them as much as when not to use them. Sometimes that does take trial & error. But it should always be decisions made based on the info above using the ear-brain connection.

The Case Against Spectral Cleaning

As noted, people will quickly leap to their preferred “magic” tool that will supposedly strip all “bad” things whilst leaving only good things.

The spectral tool splits the audio into lots of slices and tries to differentiate signal from noise based on the criteria it has been given. It then strips what seems to match more than it doesn’t match. This decision is being made by a process, not a human, so it can’t make informed choices, merely churn its way through till there is no more to process.
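To make the slicing idea concrete, here is a deliberately crude sketch of a spectral gate in plain Python. It is not how any particular product works (real tools are far more refined), and every name and number in it (`spectral_gate`, `noise_floor`, the 64-sample frame, the made-up "voice", "overtone" and "hiss") is an assumption chosen for illustration only.

```python
import cmath
import math
import random

def dft(frame):
    """Naive discrete Fourier transform: one complex value per frequency slice."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft(bins):
    """Inverse transform, from frequency slices back to (real) audio samples."""
    n = len(bins)
    return [
        sum(bins[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]

def spectral_gate(frame, noise_floor, threshold=1.5):
    """Silence every frequency slice quieter than threshold x noise_floor."""
    cutoff = threshold * noise_floor
    kept = [b if abs(b) >= cutoff else 0j for b in dft(frame)]
    return idft(kept)

# Hypothetical test material: one frame of 64 samples.
N = 64
rng = random.Random(0)  # fixed seed so the example is repeatable
voice = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]            # loud, wanted
overtone = [0.02 * math.sin(2 * math.pi * 9 * t / N) for t in range(N)]  # quiet, also wanted
hiss = [rng.uniform(-0.01, 0.01) for _ in range(N)]                      # unwanted
frame = [v + o + h for v, o, h in zip(voice, overtone, hiss)]

# noise_floor is an assumed setting, as if it had been taken from a noise-only passage.
cleaned = spectral_gate(frame, noise_floor=1.0)

print("overtone slice before:", abs(dft(frame)[9]))
print("overtone slice after: ", abs(dft(cleaned)[9]))
```

The hiss goes, but so does the quiet overtone, which was part of the performance. The process has no way of knowing that; it only sees a slice that falls under the cutoff.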

Remember what happens when we try to separate egg and mushroom above? We end up with bits of mushroom with egg stuck to them and egg with that fungal grey stain of mushroom. Nothing is clean. Everything is “tainted” and ragged. By now you probably have a mental picture of a “butchered” meal that you no longer feel good about eating.

This is why I refer to the processes used by these tools as “cheese grater”. The results sound like they were put through a cheese grater, or feel odd and spacey.

To be fair, I have sometimes used these tools when the material is pretty good. But the moment the material is anything less than pretty good, as in pretty not-good, the tools do more damage than good as they strip more of what we want to keep than of what we want to let go. It sounds broken, therefore it no longer works. Humanity or Sales Power is destroyed.

Framing

Now assuming that we have the option, we should always take the nature of recording versus listening into account before we even start to set up our gear to make the tape of the event. This is the role of the Recording Engineer, who should come with a) the skills & experience that come from knowing what I spoke of above and b) appropriate tools to help ensure workable results that the Mix Engineer can use to re-form the illusion that the listener is there at the event.

In photographic terms that is called Framing, which means that the photo not only contains the child and the flower but that overall it is arranged and lit in such a way that the feeling of that scene is conveyed to the viewer. If that photo is not well framed it is just another happy snap where half the child and half the flower somehow got cut off and the face is blurry anyway.

There isn’t really a matching term in audio recording but something close is Gain Staging, which means making sure that the levels of the signal we want to record are weighted far more toward the good bits than the bad: the speaker as opposed to the lolly wrappers up the back. In other words, a good Signal To Noise Ratio.
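Signal To Noise Ratio is usually quoted in decibels, and the arithmetic is simple enough to sketch. This is a minimal illustration, assuming the wanted signal and the unwanted noise can be measured separately; the function names and the stand-in "speaker" and "wrappers" tones are made up for the example.

```python
import math

def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels, from separate signal and noise measurements."""
    return 20 * math.log10(rms(signal) / rms(noise))

# Hypothetical levels: the speaker as heard from the back of the room,
# and the lolly wrappers sitting right next to the recorder.
N = 1000
speaker = [0.5 * math.sin(2 * math.pi * 5 * t / N) for t in range(N)]
wrappers = [0.05 * math.sin(2 * math.pi * 37 * t / N) for t in range(N)]

far = snr_db(speaker, wrappers)

# Suppose moving the microphone closer doubles the speaker's level
# while the wrappers stay where they were.
closer = snr_db([2 * s for s in speaker], wrappers)

print(f"from the back: {far:.1f} dB, moved closer: {closer:.1f} dB")
```

Doubling the wanted signal at the microphone buys roughly 6 dB of extra room between the speaker and the wrappers, before any processing has been touched.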

Accepting

We also have to accept that if it is a live recording, there will be noises, not only from the audience but from the players sticking their chewing gum under the chair as they get to within 16 bars of when they first get to toot their floot. To do otherwise is to invite disaster seeing the listener’s brain will hear the delusion which breaks the illusion.

Don’t make changes that do not have to be done as they will do more damage than good. The “trick” is to understand how the human brain processes sound and lead it to the conclusions that you want it to make, the illusion that you want to give. That will lead the listener to make similar choices to those they would have if they were actually there and their brain automatically filters a lot of the clutter. All with no digital noise or strange artifacts.

Restoring

Restoring audio becomes a completely different process from what just about everyone in easy reach on the internet will tell you – 99.9% of the time before even hearing your audio and issues.

When restoring audio, or fixing it so it sounds as good as possible within the situation, I take the path of the least amount of change. By accepting that some things are best left unchanged, like feet scraping when the recording was made in the middle of the audience, we can work on how we are going to lead the listener to get the outcome that both we and they want.

The listener’s part in this is important: if a person wants to sit there hearing record crackle, lolly wrappers, drunken teenagers throwing rocks at pigeons, this is exactly what they will hear, as their brain is set to filter out everything but these events. These people are not your audience. Focus on people who want to hear the speaker or singer and the performance that was special enough to justify recording it.

This is the workspace for preparing a recording of an orchestra made with a small digital recorder sitting in a backpack under a seat at the back of a church. Not ideal conditions, but it was a lovely performance and the recording had captured the feel and tone very nicely, although understandably with quite a lot of “room” that felt intrusive in the quieter sections.

You may note how simple the toolset is: setting overall levels, compressing lightly, tone adjustment, and a bit of saturation to keep it all feeling lively.
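For readers who like to see the shape of such a chain, here is a rough sketch of those four steps in plain Python. It is not the actual workspace, plug-ins or settings used on that recording. Every function name and number is an assumption, and the compressor is a memoryless simplification (a real one reacts over time with attack and release).

```python
import math

def apply_gain(samples, gain_db):
    """Set the overall level: scale every sample by a gain given in decibels."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]

def compress(samples, threshold=0.5, ratio=2.0):
    """Light compression: whatever rises above threshold is divided by ratio."""
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            level = threshold + (level - threshold) / ratio
        out.append(math.copysign(level, s))
    return out

def tilt_tone(samples, brightness=0.2, smoothing=0.9):
    """Tone adjustment: add back a little of the highs (input minus a one-pole low-pass)."""
    out, low = [], 0.0
    for s in samples:
        low = smoothing * low + (1 - smoothing) * s
        out.append(s + brightness * (s - low))
    return out

def saturate(samples, drive=1.5):
    """Gentle tanh saturation, scaled so that an input of 1.0 still comes out at 1.0."""
    return [math.tanh(drive * s) / math.tanh(drive) for s in samples]

# Hypothetical quiet passage: a low-level tone standing in for the orchestra.
quiet = [0.1 * math.sin(2 * math.pi * 3 * t / 200) for t in range(200)]

processed = saturate(tilt_tone(compress(apply_gain(quiet, gain_db=6.0))))

print("peak before:", max(abs(s) for s in quiet))
print("peak after: ", max(abs(s) for s in processed))
```

Nothing here tries to remove anything. Each step only nudges levels and tone, which is the point of the minimal approach.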

The aim here was to use the understanding of how the human brain processes sound and narrative and adjust a few key things to lead the listener to focus on the orchestra playing more than the room and people moving in it. With very little impact (and no weird digital artifacts) this recording now sounds very natural and pleasing. The room noise is all but inaudible (even to me) seeing the way the recording starts tells us it is not important.

Sorry, I cannot show that recording as it was in confidence.

This one however I can show. I do not know the orchestra, work performed, recording engineer etc, merely that there was an mp3 they wanted to sound/feel nicer. This is the snippet they wanted tested:

Here is another recording that comes from a handheld camera in the outdoors on a typical slightly windy New Zealand day. There is no way that we can realistically remove all the things that are “imperfect” in this recording to deliver a pristine studio version – which does not exist anyway and therefore would feel weird compared to the vision.

What could be done is to ask the listener to accept the wind noise, birds, crowd chatter, and general sense of being outside. We do this by putting the focus on the most important parts of the recording, which are the singer and guitar as they present the song. This approach transforms a video that was not very nice into one that is able to do the job that was always intended: to show a (very) young Jake & Pals play his award-winning song.

This was a Mastering situation where the Mix itself was really broken because the assumption was that more bass was better. As a result, the overall levels are too low and the kick punches a hole in the feel of the track. The improved track delivers stronger overall levels which bring that punchy urban feel along with much better clarity for the singer to sell his story.

Hire Me

If you have audio that needs restoration, I am happy to have a listen and let you know probable outcomes. Please fill in the form on the Home Page and make sure that I can hear the material and see how much of it there is, otherwise I cannot give a very useful answer, let alone price the work.