You’ve sat down, set up your system, made a Skype call or an in-person recording, and now you have recorded audio. Excellent! Now what?
Depending on what you plan to do with the show and how you did the recording, that answer can range from “absolutely nothing, I’m done” to “a beautifully orchestrated and conceptual program that flows from beginning to end.” All right, that last part will definitely be subjective, but it points to a fact. The audio we have recorded is going to need some editing. There are many choices out there, ranging from simple WAV file editors all the way up to professional Digital Audio Workstations (DAW). I’m going to suggest a middle ground; it’s flexible, doesn’t cost anything, and has a lot of useful tools already included. Welcome to Audacity.
Hey, wait… don’t go. Granted, it’s been around a long time, and I’ll admit it’s not the sexiest of tools you could be using, and it has limitations as a real-time DAW (which can be overcome with some system tweaking, but that’s out of scope for this post). Still, as a multi-track waveform management tool, Audacity has a lot going for it, and once you get used to working with it, it’s remarkably fast, or at least, fast as audio editing tools go.
CAVEAT: There are a lot of wild and crazy things you could do with audio editing. There is an effects toolbox in the software that would make any gearhead musician of the 90’s envious, and many of the tools require some advanced knowledge of audio editing to be useful, but I’m not going to cover those this go around. What I will talk about are the tools that a new podcaster would want to master quickly and become comfortable with.
First things first. I am a fan of independent tracks, as many as you can effectively manage. As I mentioned in my first post, if possible, I would like to get local source recordings from everyone participating in the podcast. Skype Call Recorder lets you save the call as a .MOV file, and when imported into Audacity, it will appear as one stereo track. One side will be the local speaker, and the other side will be the other caller(s). Even if you can only get one recording, I recommend this approach, and doing the following:
1. Import the MOV file into Audacity, and confirm your stereo track does have the separation between local and remote callers.
2. Split the stereo track into two mono tracks.
3. Select Sync-Lock tracks. This way, any edit you make that inserts or subtracts time from the one track will be reflected in the other track.
4. Look for what should be silent spots. In between people talking, there should be a thin flat bar. If you have that thin flat bar, great; it means there are little to no artifacts. Unfortunately, what you are more likely to see are little bumps here and there. Fortunately, they are easy to clean up. Just highlight the area of the track you wish to silence (you can also use the keyboard arrow keys to widen or narrow the selected area), and then press Command-L (Ctrl+L on Windows and Linux). Any audio that was in that region is now silenced.
By doing this, it is possible to clean up a lot of audio artifacts. Do make sure to look at them, though, and ensure that they are just random audio captures and not your guest stepping away from the microphone but still speaking about something important. Granted, that’s usually handled at the time of recording, and as the producer, you need to be alert to that. If you receive a recording of a session you weren’t present for, then you don’t have that option, and you really have to make sure you have listened to those in-between spaces.
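For the programmatically curious, here is a minimal sketch of what steps 2 and 4 amount to under the hood. It is not how Audacity implements them; the function names `split_stereo` and `silence_region` are made up for illustration, and the audio is assumed to be a plain list of floating-point samples.

```python
def split_stereo(frames):
    """Split stereo frames [(left, right), ...] into two mono sample lists."""
    left = [l for l, _ in frames]
    right = [r for _, r in frames]
    return left, right


def silence_region(samples, start_sec, end_sec, sample_rate):
    """Return a copy of samples with the region [start_sec, end_sec) zeroed.

    The track keeps its original length, which is why silencing (unlike
    deleting) never knocks two tracks out of alignment.
    """
    start = max(0, round(start_sec * sample_rate))
    end = min(len(samples), round(end_sec * sample_rate))
    out = list(samples)
    for i in range(start, end):
        out[i] = 0.0
    return out


# Toy example: 4 samples per second, local speaker left, remote caller right.
frames = [(0.1, 0.5), (0.2, 0.6), (0.3, 0.7), (0.4, 0.8)]
local, remote = split_stereo(frames)
cleaned = silence_region(local, 0.5, 1.0, sample_rate=4)
```

The point to notice is that the cleaned track is exactly as long as the original. Nothing shifts in time, so the other speaker's track still lines up.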
Before we get too deep into the editing of the main podcast audio, I want to step back and talk about the “atmosphere” you set for your show. Most podcasts have little elements that help set the mood for the show, in the form of intros and outros, messages, and other items that will likely be mentioned frequently from episode to episode. You may choose to do this differently each time, or create a standard set of “audio beds” that can be reused. For the Testing Show, I do exactly that. I have what I call an “Assembly Line” project. It contains my show’s opener (theme music and opening words) as well as the show’s closer (again, theme music and parting words). These sections, for most episodes, are exactly the same. Therefore, it makes sense to have them together and synchronized. It’s possible that these could be mixed down into a single track, but that removes the ability to change the volume levels or make modifications. Unless I know something will always be used in the same way every time, I prefer not mixing them down into a single track. It’s easier to move a volume control or mute something one week than to have to recreate it.
GEEK TRICK: When you start getting multiple tracks on the same screen, it can be a pain to see what’s in need of editing and adjusting, and what’s already where it should be. Each track view can be collapsed so that just a sliver of the track view is visible. For me, anytime I collapse a track, that’s a cue that I don’t need to worry about that area, at least for now. It’s where it needs to be, both timing-wise and sequence-wise. It saves real estate, and frankly, you want as much visible real estate as possible when doing waveform editing.
|A typical edit flow, showing tracks that are situated and ready versus what I am actively examining/editing.|
In the first post in this series, I mentioned that I would silence audio first. Rather than delete sections outright, I’d highlight them and Silence Audio. I do this because it lets me do a rough shaping of the show quickly, and then I can handle removing all of the silence in one step. To do this, select “Truncate Silence” from the Effect Menu:
|One of my favorite tools, it saves a lot of time.|
The dialog box that appears lets you set a threshold level; Audacity will treat anything quieter than that level as “silence”. It also lets you set a duration, so that Audacity only acts on silence longer than the value entered and leaves shorter pauses alone. In my experience, natural conversation flow allows anywhere from half a second to a second for transitioning between speakers, so my default value is half a second (if it feels rushed, I can always generate silence to create extra space). The utility then takes any silent section longer than half a second and trims away the excess. That will leave you with a continuous stream of audio where the longest silence is half a second.
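If it helps to see the idea spelled out, here is a rough sketch of that logic. It is a simplified illustration, not Audacity's actual implementation. The function name `truncate_silence`, the linear `threshold` (Audacity's dialog works in dB), and the list-of-floats audio format are all assumptions for the example.

```python
def truncate_silence(samples, sample_rate, threshold=0.01, max_silence_sec=0.5):
    """Shorten every run of quiet samples to at most max_silence_sec.

    A sample counts as quiet when its absolute value is below threshold.
    Runs shorter than the limit are left untouched.
    """
    max_run = int(max_silence_sec * sample_rate)
    out = []
    run = 0  # length of the current run of quiet samples
    for s in samples:
        if abs(s) < threshold:
            run += 1
            if run <= max_run:
                out.append(s)  # still within the allowed pause length
            # otherwise drop the sample: this is the excess silence
        else:
            run = 0
            out.append(s)
    return out


# Toy example: 4 samples per second, so half a second is 2 samples.
track = [0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.4, 0.0, 0.3]
shortened = truncate_silence(track, sample_rate=4)
```

Note that this sketch works on a single track. With several sync-locked tracks, the same spans would have to be removed from every track to keep them aligned.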
GEEK TRICK: This comes from music, and specifically, it’s looking for the “musicality” of speech patterns. Everyone talks a little differently. Some are faster, some are slower. Some speak in quick bursts and then pause to reflect. Others will be fairly steady but keep talking without noticeable breaks. Nevertheless, most people tend to stick to a pattern when they speak. Most people generally pause about 0.2 seconds for where a comma would appear, 0.3 seconds for a period, and 0.5 seconds for a new paragraph (or to catch their breath). A friend of mine who used to work in radio production taught me this technique of “breathless read through”, which isn’t really breathless, but rather silencing breaths, but allowing for the time it would take for the breath to occur. In short, speech, like music, needs “rest notes” and different values of rest notes are appropriate. Try it out and see if it makes for a more natural sound.
No matter how well you try to edit between a speaker’s thoughts, you run the risk of cutting them off mid-vocalization. Left as is, those cuts produce noticeable clicks. They are distracting, so you want to smooth them out. Two utilities make that easy: Fade Out and Fade In. Simply highlight the end or beginning of the waveform, making sure to highlight right up to the end or start of the section where you want to perform the fade (these are in reality very short segments), and then apply the fade-out to the end of the previous word and the fade-in to the start of the following word. This will take a little practice to get to sound natural, and sometimes, no matter how hard you try, you will not be able to get a seamless transition, but most of the time it is effective.
|After highlighting an area to silence, you can shorten the space to flow with the conversation.|
|Select the ending of a waveform segment, and then choose Fade Out from the Effect menu.|
|Same goes for fading into a new waveform, but choose Fade In for that.|
This technique is often jokingly referred to as the “Pauper’s Cross Fade”.
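As a sketch of why the fades remove the click: a cut mid-sound makes the waveform jump abruptly to or from zero, and a fade replaces that jump with a ramp. The example below assumes a simple linear ramp over a list of floating-point samples; `fade_in` and `fade_out` are illustrative names, not Audacity's code.

```python
def fade_in(samples):
    """Scale samples by a linear ramp from 0.0 (first sample) to 1.0 (last)."""
    n = len(samples)
    if n < 2:
        return list(samples)  # nothing to ramp over
    return [s * i / (n - 1) for i, s in enumerate(samples)]


def fade_out(samples):
    """Scale samples by a linear ramp from 1.0 (first sample) to 0.0 (last)."""
    n = len(samples)
    if n < 2:
        return list(samples)
    return [s * (n - 1 - i) / (n - 1) for i, s in enumerate(samples)]


# Tail of the previous word and head of the next word (toy values).
tail = fade_out([1.0, 1.0, 1.0, 1.0, 1.0])  # [1.0, 0.75, 0.5, 0.25, 0.0]
head = fade_in([1.0, 1.0, 1.0, 1.0, 1.0])   # [0.0, 0.25, 0.5, 0.75, 1.0]
```

In real audio these ramps cover only a few milliseconds' worth of samples, which is why the selections are so short.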
GEEK TRICK: Use a running label track (or as many as you need) to remind you of things you have done that you feel may warrant follow-up or additional processing. Also, using multiple comment tracks can help you sync up sections later.
Sometimes you will have to amplify or quiet someone’s recording. I have experimented with a number of approaches to this over the years, and I have decided that using the Leveling effect, while helpful, messes with the source audio too much. The transitions between speakers will be noticeably more “hissy”. With separate tracks for each speaker, this isn’t an issue. Increasing or decreasing the track volume is sufficient. However, if your guests are all on the same track or channel, that’s not an option. My preferred method in these cases is to use “Normalization”, in which I set a peak threshold (usually +/- 3dB) and then select a section of a waveform and apply the Normalization to it. That will either increase or decrease the volume of that section, but it will do so with a minimum of added noise. Again, this is one of those areas where your ears are your friend, so listen and get a feel for what you personally like to hear. Caveat: this will not work on clipped audio. Unlike analog recording, where running a little hot can make a warm sound on tape, in digital recording you have headroom, and then you clip. If you clip, you will get distorted audio. Normalization or lowering the volume will not help. In short, if you hear someone speaking loud and hot, and you suspect they may be clipping the recording, ask them to move back from the microphone and repeat what they said.
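Here is a minimal sketch of what peak normalization does to a selection, plus a quick check for clipping. It is an illustration under stated assumptions rather than Audacity's implementation: samples are floats where 1.0 is digital full scale, the target is read as 3 dB below full scale, and the names `normalize_peak` and `count_clipped` are invented for the example.

```python
def normalize_peak(samples, target_db=-3.0):
    """Scale samples so the loudest one sits at target_db relative to full scale."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence: nothing to scale
    target = 10 ** (target_db / 20)  # convert dB to a linear amplitude
    gain = target / peak
    return [s * gain for s in samples]


def count_clipped(samples, full_scale=1.0):
    """Count samples at or beyond full scale, a hint that the take clipped."""
    return sum(1 for s in samples if abs(s) >= full_scale)


quiet_guest = [0.1, -0.25, 0.2]
louder = normalize_peak(quiet_guest)       # peak now sits at about 0.708
hot_take = [0.5, 1.0, -1.0, 0.9]
clipped = count_clipped(hot_take)          # 2 samples pinned at full scale
```

Every sample in the selection is multiplied by the same gain, so the shape of the waveform is preserved. That is also why it cannot rescue a clipped take: the flattened peaks are scaled down along with everything else, but the detail that was lost at recording time is not restored.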
OK, so there it is. Not too big a set of tools to learn, is it? You will note that I have covered these areas as individual steps, and as manual active editing. Can you automate steps? You can, but I’ve found that there are only a few things that make it worthwhile, and they need to be steps you would perform in sequence for a section or whole file. In Audacity, these sequences are called “Chains” and you can create and edit them by selecting “Edit Chains” from the File menu. I have found that there are a lot of unpredictables with audio. Thus, I encourage active listening rather than relying on the machine to process the audio directly. Once you get a handle on the things you know you will do a lot, and that you know will be effective with minimal chance of backfiring, go nuts!
Next time, I will talk about packaging your podcast, including tagging, formatting, art for episodes, show notes, transcripts and all the fun meta-data you may or may not want to keep track of with each episode.