How to Develop a Great Podcast or Audio Program

Audio Programs are many and varied. Most go under the catch-all term Podcast these days. Making great audio material is not particularly hard, so long as you keep your focus on what matters.

Basically, there are three parts to developing a podcast or audio program:

1. Create the Material

I am amazed at how many people miss this step. Who are your audience? What do they want from you? These are the questions you need to answer as you develop your material. You can definitely wing it if you are speaking to a specialized audience and you don’t mind coming off as very casual.

See my sprawling video below. It is not tight and not for the faint of heart at all, but that wasn’t my concern. People will engage with me in this or not. I know most won’t because I didn’t make perfect, shiny 5-minute sections – which is, of course, ridiculous for 30+ years of composition training.

The moment you want to do better than this you need to prepare your material. Write it out in MS Word or Google Docs and separate each section out onto a new page. Save it as a PDF file and you only need to press the forward arrow to get to the next page (not whilst speaking, please).

2. Perform the Material

This is where many people trip up and think they can just wing it. See above. I am winging it and while I can kinda get away with it, because I have so much domain understanding, it is dull and unfocused. Most people want something with better pacing and much more sparkle.

Technology isn’t your solution. A great, lively, passionate performance is. I don’t mean like Christina Aguilera (as that is a terribly false performance) but you connecting with your soul and letting that shine on out as you speak to me.

It doesn’t take a lot. You just need to know your material, believe in it and then do a bit of practice so it comes out nicely with minimal mistakes. Let your flow find you. It will as soon as you stop pushing.

Then you record it. All you need is a quiet room and a bit of good microphone technique.

3. Prepare for release

Once you have recorded your program, you then need to have it prepared for release. This is Audio-Post Production or Audio-Post for short.

Audio-Post Production has become a big subject. Sadly though, like many things audio, people have come to associate Audio-Post with magic pieces of software that do all kinds of things to take poor performances and recast them as quality work.

Let me start by running through what Audio-Post commonly is and then we can look at the options for what can be done versus what should be done. Audio-Post is about preparing audio material for its final usage. Seeing as we are focusing on speech, it is about preparing Podcasts and similar Audio Programs for release in their intended environment.

Audio-Post Tasks

Tracy & Derrick from Tracy Riley Hypnosis have been kind enough to let me use material from their audio programs to help you understand what is done when preparing an audio program like a Podcast and why it is important to be done the way it is.

Assess the Material – the absolute first thing I do is assess the material. There is no point in letting a client publish rubbish. Rod Stewart is a great singer but he has his off moments. He expects his Producer to let him know when his performance isn’t up to the Rod The Mod vibe we know & love. If the material is faulty, I will have a chat with the client. If they want to go on, that is their call, but the good uns have another go as once this is done, it is done – and has to do its job well.

Prepare the voice – The speaker’s voice is the alpha & omega of this program. This should never be forgotten. The first thing I do once the performance is settled is to set up a couple of processors to make the most of this voice. It is easy to make the mistake of whacking on a few formula Plugins that make everything LOUD and strip out anything that isn’t. Ok, it sorta works but it also is like assuming all people are the same person. Or a Ferrari is a Daewoo. Every voice is unique and must be handled that way.

If the recording is anywhere near decent, I will start with a Noise Gate that is set to just above the static noise level of the room. No higher or I am chopping off word starts & ends which makes the speaker sound clipped.
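For readers who like to see the idea in code, here is a minimal Python sketch of a gate set just above the room's noise level. It is an illustration only and not the plugin used on these programs. The function names, the 2 dB margin and the hold length are all assumptions made up for the example, and samples are assumed to be plain floats between -1.0 and 1.0. A real gate follows a smoothed level rather than single samples and fades in and out instead of switching.

```python
def db_to_linear(db):
    """Convert a decibel change into a linear multiplier."""
    return 10 ** (db / 20)


def gate_threshold(room_tone, margin_db=2.0):
    """Place the threshold a small margin above the loudest room-tone sample.

    room_tone is a short stretch of recording with nobody speaking.
    margin_db is an assumed value; keep it small so word starts and
    ends are not chopped off.
    """
    noise_peak = max(abs(s) for s in room_tone)
    return noise_peak * db_to_linear(margin_db)


def noise_gate(samples, threshold, hold=3):
    """Mute samples below the threshold.

    The gate stays open for `hold` samples after the last loud sample,
    so quiet tails of words are kept rather than clipped.
    """
    out = []
    remaining = 0  # samples the gate stays open after the last loud one
    for s in samples:
        if abs(s) >= threshold:
            remaining = hold
            out.append(s)
        elif remaining > 0:
            remaining -= 1
            out.append(s)
        else:
            out.append(0.0)
    return out


# Usage: measure the room first, then gate the take.
room_tone = [0.01, -0.02, 0.015]
take = [0.01, 0.5, -0.4, 0.02, 0.01, 0.02, 0.01, 0.3]
gated = noise_gate(take, gate_threshold(room_tone), hold=2)
```

Raising margin_db in this sketch is the code version of setting the gate too high: more of the quiet word starts and ends fall under the threshold and get muted.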

I then add Equalization or Tone Control to the voice. I am very careful with this. I look to enhance the voice only a tiny amount. Too much is far too much. The aim is to have the voice sound natural as the presenter is speaking.

It is common practice to see people start to apply Compression and heavy Audio Leveling about now but I prefer not to as a general rule. The reason is that I find that compressors on spoken word make the speaker LOUD but also reduce dynamics or naturalness. Loud is not the aim; Persuasiveness is. I will use a Limiter to level off the loudest bits by a dB or so but I refuse to pummel a voice into submission – or the inability to sound intimate.
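As a rough illustration of what "level off the loudest bits by a dB or so" means in numbers, here is a small Python sketch of a peak limiter. It is a simplified stand-in and not the Limiter used on these programs: the function name, the 1 dB default and the release step are assumptions for the example, and a real limiter would add look-ahead and smoother gain changes.

```python
def limit_peaks(samples, reduce_db=1.0, release=0.001):
    """Hold peaks to a ceiling `reduce_db` below the loudest sample.

    Gain drops instantly when a sample would exceed the ceiling and
    then recovers by `release` per sample, so everything below the
    ceiling is left almost untouched.
    Returns the processed samples and the ceiling that was used.
    """
    peak = max(abs(s) for s in samples)
    ceiling = peak * 10 ** (-reduce_db / 20)
    gain = 1.0
    out = []
    for s in samples:
        # Gain needed to bring this sample down to the ceiling (1.0 if none).
        needed = ceiling / abs(s) if abs(s) > ceiling else 1.0
        # Attack instantly, recover slowly.
        gain = min(needed, gain + release)
        out.append(s * gain)
    return out, ceiling


# Usage: only the one loud sample is pulled down by about 1 dB.
limited, ceiling = limit_peaks([0.1, 1.0, 0.1], reduce_db=1.0)
```

A 1 dB reduction is a multiplier of roughly 0.89, which is why the quiet parts keep their dynamics: only samples in the top 11 percent or so of the peak level are touched.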

Edit the Performance – I then settle myself in and listen through the whole audio program from start to finish, making any adjustments that are necessary. At this point, I am chopping out breaths, lip smacks, ums (which should be few to none if it is a good performance), pops or other noises like mouse clicks as the program page gets turned. Soon the performance will look like this:

This is 6 minutes of voice from Dr. Tracy Riley, a good presenter. The gaps are where I removed anything that is not helping the performance sound clean. Every edit is by hand and checked. This is the way I choose. While it is clever to think that some piece of software could be set to do this, not all breaths or other events should be removed. Some breaths should stay or the performance sounds wrong. As a human I will assess and either leave the breath or adjust the volume so it is still there but not so dominant. The difference may be small but it is never insignificant.
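To make "adjust the volume so it is still there but not so dominant" concrete, here is a hedged Python sketch of turning down one hand-picked region, such as a breath. The choice of region stays with the person listening; the code only applies the gain change. The function name, the 6 dB default and the fade length are assumptions for illustration.

```python
def attenuate_region(samples, start, end, reduce_db=6.0, fade=4):
    """Turn down samples[start:end] by `reduce_db`.

    The gain ramps linearly over `fade` samples at each edge of the
    region so the change does not click.
    """
    target = 10 ** (-reduce_db / 20)
    out = list(samples)
    for i in range(start, end):
        edge = min(i - start, end - 1 - i)  # distance to the nearest edge
        ramp = min(1.0, (edge + 1) / (fade + 1))
        gain = 1.0 + (target - 1.0) * ramp
        out[i] = samples[i] * gain
    return out


# Usage: soften a breath found by ear at samples 5 to 14.
take = [1.0] * 20
softened = attenuate_region(take, 5, 15, reduce_db=6.0, fade=2)
```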

Sometimes there will be noises like pops, bangs or stray clicks in the middle of words. These are a bit of fun as I have to locate them and create a very fast edit that balances reducing the noise with keeping the voice intact.

Again, I know people say, “oh, just do it with software.” Sure, you can do that. But I am about preserving the speaker’s persuasiveness and not letting a program strip away the things that make a good performance: human tone & cadence.

As a human I can tell if there should be more or less space. If there is music, I can even choose to shuffle a phrase to match the music for a bit more power. This is cadence and only a person can feel this.

Sometimes a speaker really gets it wrong. I have Derrick & Tracy well trained so if they stuff up, they know to take a breath, clap or call out, take another breath and start again. This clears their tension and gives me a clear marker that something needs an axe. I generally chop these out first.

Occasionally a performer flubs a word. If we are all lucky, it is a word they have used a sentence or so on either side and I can “Frankenstein” the word back in. It is a bit of fun but not a thing anyone should strive for as the cadence of the line is altered. No matter how clever I feel about resolving the issue, the performance is compromised. So I assess and, even if I am chuffed with how clever I was, if it doesn’t flow, out it goes. Here is one that worked out ok:

Derrick was new to recording and had a few instances where his words didn’t come out quite right. He very kindly let me show this fix. The red parts are the error, the green parts the fix. The yellow blocks let you know where the tricksy word is/was.

Every single thing I do does damage so it is my job to work out the best balance of damage from processing versus damage from not processing. My job is not to make it easy on myself but to make the speaker sound good so they sell well.

Mixing & Finalizing – is the process of making sure all of the parts balance well. It is not about making things as loud as I can get away with before everything distorts. This isn’t EDM or Death Metal but a speaking program where a human voice needs all its persuasiveness. I make sure that the program is at a proper level overall but nothing is pushed above that lest the voice sound less dynamic & intimate – or lose Selling Power.

When there is music during spoken parts, I will often adjust the EQ or Tone of that music so that it is not too strident. It is there to support the speaker and not to compete. It may seem odd if you are used to mixing every part to sound loud & splashy but music commonly plays a supporting role so needs to show that in tone & levels alike.

Again, all this work is done by hand and by ear. Computer programs can’t make these sorts of decisions. A computer program can’t work out if it is damaging persuasiveness so it will plow on through regardless.

Templates – are great if you have recurring material. I prefer to work long-term with my speakers as this makes it more efficient for all. I get to know the voices and style. I also get to build advantage for all of us by creating Templates that contain Intros, Outros, Music etc. as well as the Process chain that suits that speaker’s voice. This means that we only have to prepare the actual content part of the program which is less cost for all.

Thanks again to Tracy & Derrick of Tracy Riley Hypnosis for being great sports and letting me use their work to show you what I do.

In my next article, I will walk you through what you need to get going if you are about to start an audio program. I know that Tracy uses relatively simple gear so this is proof that you don’t need fancy gear to get great results. You just need passion.

Common Issues & Solutions

Each voice should ideally have a separate process – if the client wants to do it on the cheap I can put a single process across all voices, but this isn’t bringing out the character of each individual voice or allowing me to balance levels properly. If the program contains several speakers and they are in one stereo file, I will go through and chop each speaker out into a separate track. That allows a process for each voice. It does add time but delivers better results. A better solution is if the client can deliver one track or Stem for each speaker. Each Stem should be the whole length of the program so when I pop them in my workspace they all line up. That adds minimal time to multi-speaker programs.
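If you are delivering Stems, a quick check before sending them can save a round trip. The following Python sketch is one possible way to confirm that every Stem runs the full length of the program. It assumes the Stems are uncompressed WAV files; the function names and the 0.01 second tolerance are made up for the example.

```python
import wave


def stem_lengths(paths):
    """Return {path: duration in seconds} for each WAV stem."""
    lengths = {}
    for path in paths:
        with wave.open(path, "rb") as wav:
            lengths[path] = wav.getnframes() / wav.getframerate()
    return lengths


def mismatched_stems(paths, tolerance=0.01):
    """List stems shorter than the longest stem by more than `tolerance` seconds.

    An empty list means the stems should line up when dropped into
    the same workspace at time zero.
    """
    lengths = stem_lengths(paths)
    longest = max(lengths.values())
    return [p for p, secs in lengths.items() if longest - secs > tolerance]


# Usage (file names are hypothetical):
#   short = mismatched_stems(["host.wav", "guest.wav", "music.wav"])
#   if short:
#       print("Re-export these at full program length:", short)
```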

The same applies to a Video Soundtrack. Send me the audio track (or I can record off the video but that is a waste of paid time & quality) and I can make things sound good, ready for you to put it all back together (which I can also do). So long as the audio track is the same length as the video, it all works out fine. I like to have a copy of the video too as that helps me make decisions about handling the material.

The companion article will help you with Tips for Setting Up for Podcast or Audio Program Recording.