How to Record and Edit a Multi-Track Podcast in Descript
Set up separate speaker tracks, label them, and edit each voice independently for a clean multi-person podcast.
When two or more people talk, having each voice on its own track makes editing far easier: you can clean up one mic without touching another, and crosstalk becomes manageable. Descript supports multi-track recording and per-speaker editing. This guide covers a two-host setup.
What you need
- Separate audio files for each speaker, or a multi-track recording session
- A Descript project (audio-only or video podcast)
- Speaker names ready so labels are clear
Step 1: Import each speaker as its own track
Add each person's audio file to the project. If they were recorded on separate mics or via a remote recording tool, you get isolated tracks. Descript stacks them so they stay in sync but remain individually editable.
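To make "in sync but individually editable" concrete, here is a minimal sketch of what lining up separate recordings on a shared timeline involves. It is an illustration only, not Descript's implementation: the function name `align_tracks`, the per-track start offsets, and the representation of audio as plain lists of samples are all assumptions made for the example.

```python
def align_tracks(tracks, offsets_s, sample_rate=48000):
    """Pad tracks with silence so they share one timeline and one length.

    tracks      -- list of mono tracks, each a list of float samples
    offsets_s   -- hypothetical start time of each track, in seconds
    sample_rate -- samples per second (assumed equal for every track)
    """
    padded = []
    for samples, offset in zip(tracks, offsets_s):
        # Leading silence moves a late-starting track to its real position.
        lead = round(offset * sample_rate)
        padded.append([0.0] * lead + list(samples))

    # Trailing silence makes every track the same length; each track
    # remains a separate list, so it can still be edited on its own.
    total = max((len(t) for t in padded), default=0)
    return [t + [0.0] * (total - len(t)) for t in padded]


# Tiny example at 4 samples per second: the guest starts 0.5 s late.
host = [1.0, 1.0]
guest = [0.5]
aligned = align_tracks([host, guest], [0.0, 0.5], sample_rate=4)
```

After alignment, both tracks have the same number of samples, which is what lets an edit at a given point in time refer to the same moment on every track.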
Step 2: Label the speakers
In the transcript, assign a name to each track so the document reads Host and Guest instead of Speaker 1 and Speaker 2. Correct any spots where Descript attributed a line to the wrong person.
Step 3: Clean each track separately
Apply Studio Sound and noise reduction per track. The guest who joined from a laptop mic might need more cleanup than the host on a USB mic. Per-track effects let you treat each voice on its own merits.
Step 4: Edit the conversation as text
Now edit the combined transcript normally. Delete tangents, remove filler words, and tighten pacing. Because tracks are separate, removing the host's sentence will not chop into the guest's audio underneath it.
Step 5: Export the mix
Export an audio file (MP3 or WAV) for your podcast hosting platform, or an MP4 if it is a video podcast. Descript mixes the tracks down into a single file with all your per-track effects baked in.
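Conceptually, a mixdown sums the tracks sample by sample into one signal. The sketch below shows that idea under simple assumptions: mono tracks at the same sample rate, stored as lists of floats between -1.0 and 1.0. The function `mix_down` and its `gains` parameter are hypothetical names for illustration and say nothing about how Descript performs its export internally.

```python
def mix_down(tracks, gains=None):
    """Sum mono tracks into a single track, sample by sample.

    tracks -- list of tracks, each a list of floats in [-1.0, 1.0]
    gains  -- optional linear gain per track (defaults to 1.0 each)
    """
    if gains is None:
        gains = [1.0] * len(tracks)

    length = max((len(t) for t in tracks), default=0)
    mixed = [0.0] * length
    for track, gain in zip(tracks, gains):
        for i, sample in enumerate(track):
            mixed[i] += sample * gain

    # Hard-clip anything that left the valid range. A real mixer would
    # more likely lower the gain or apply a limiter instead.
    return [max(-1.0, min(1.0, s)) for s in mixed]


host = [0.5, 0.25, 0.0, -0.25]
guest = [0.25, 0.25, 0.5]
mix = mix_down([host, guest])
```

Because the per-track processing happens before the sum, any cleanup applied to one voice is already part of that track's samples by the time the tracks are combined, which is what "baked in" means here.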
Format: WAV (for the podcast hosting platform) + MP3 (for distribution)
Mix: Host + Guest tracks, normalized to -16 LUFS
Effects: Studio Sound per track, baked in
Chapters: optional, from transcript headings
Result: a two-host episode where the remote guest's rough mic was cleaned independently, crosstalk was tamed by muting, and the whole conversation was tightened by editing text, exported as a polished MP3.
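The -16 LUFS target in the export settings comes down to simple gain arithmetic once the loudness of the mix is known. The sketch below assumes the integrated loudness has already been measured by a loudness meter; it does not measure anything itself, and `gain_to_target` is a hypothetical helper, not a Descript feature.

```python
def gain_to_target(measured_lufs, target_lufs=-16.0):
    """Return the (gain in dB, linear multiplier) needed to hit a target.

    measured_lufs -- integrated loudness of the mix, from a loudness meter
    target_lufs   -- desired loudness; -16 matches the settings above
    """
    gain_db = target_lufs - measured_lufs
    # Amplitude ratio from decibels: ratio = 10 ** (dB / 20)
    linear = 10 ** (gain_db / 20)
    return gain_db, linear


# A mix measured at -22 LUFS needs +6 dB, roughly doubling the amplitude.
gain_db, linear = gain_to_target(-22.0)
```

A positive gain this large can push peaks past full scale, so a normalizer typically pairs the gain with a limiter or a peak check.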