Remove Any Sound.
Just Describe It.

Pick any sound you want to keep or remove. Just tell us what it is, click on it, or show us where it is. We'll do the rest.

Try AudioSam Free

Four Ways to Pick Sounds

Tell us what sound you want. We can understand words, clicks, or time selections. Works with any audio or video.

Type What You Want

Just type what sound you want. Say "remove the drums" or "keep only the singing" and we'll find it for you.

Click to Pick

See someone or something in your video? Just click on it. We'll grab their sound for you.

Show Us When

Pick a part where you hear the sound you want. We'll learn what it sounds like and find it everywhere.

Mix All Three

Use words, clicks, and time picks together. Perfect for tricky audio with lots of sounds mixed in.

How It Works

Three easy steps to clean audio.

Add Your File

Drop in any video or audio file. We work with all the popular types.

Tell Us What Sound

Type something like "Get rid of the fan noise." We work out what you mean and remove it fast.

Download Clean Audio

Get your clean audio file. Save just the sound you want, or the whole thing minus the bad parts.

Built for Creators

From bedroom producers to Hollywood studios, AudioSam adapts to your workflow. Professional-grade audio separation for every creative field.

Vocal removal · Stem separation · Sample extraction

Music Producers

Stem separation & remix

Extract vocals, drums, bass, and instruments from any track. Create remixes, sample packs, or isolate elements for your productions.

Noise removal · Voice isolation · Audio cleanup

Podcasters

Crystal-clear dialogue

Remove background noise, isolate guest audio, and clean up recordings. Deliver professional-quality episodes every time.

Dialogue extraction · Sound design · ADR prep

Video Editors

Perfect audio post

Separate dialogue from ambient sound, remove unwanted audio, and create clean audio beds for your video projects.

Location audio · Foley separation · Mix prep

Filmmakers

Production & post audio

Isolate on-set dialogue, remove location noise, and prepare stems for scoring. Professional audio separation for cinema.

See It Work

Watch us pick out voices, remove background noise, and split songs into parts. Click a demo to try it.

Split any song into separate parts—singing, guitar, drums, and more. All in just a few seconds.

Understands Words

Just tell us what sound you want in normal words. We get it.

Works with Video

Click on someone in your video. We'll grab their sound.

Finds Sounds Everywhere

Show us a sound once. We'll find it through the whole file.

All Methods Together

Use words, clicks, and time picks at the same time for tricky sounds.

Super Fast

Hours of audio done in minutes. Our servers work really quickly.

Pro Quality

Sounds clean and professional. Good enough for real studios.

Pick a Plan

Start free. Pay only for what you use.

Hobbyist

$0/mo

10 Free Credits / mo
MP3 Export
Text Prompts only

Creator

$19/mo

500 Credits / mo
WAV & FLAC Export
Text, Visual & Span Prompts

Studio

$49/mo

Unlimited Credits
API Access
Batch Processing

Questions? We've Got Answers

Everything you need to know about AudioSam.

How accurate is SAM Audio for sound separation?

SAM Audio uses state-of-the-art transformer models trained on millions of hours of audio. It achieves industry-leading separation quality for vocals, instruments, speech, and environmental sounds. While no AI tool is 100% perfect for every scenario, AudioSam consistently delivers professional-grade results that rival dedicated hardware solutions.

What audio and video formats does AudioSam support?

AudioSam supports all major audio formats including MP3, WAV, FLAC, AAC, OGG, and M4A. For video files, we support MP4, MOV, AVI, MKV, and WebM. The maximum file size depends on your subscription tier, with Studio users enjoying unlimited file sizes and batch processing capabilities.

What is the difference between text, visual, and span prompts?

Text prompts let you describe sounds in natural language (e.g., "remove background traffic noise"). Visual prompts allow you to click on objects in video frames to isolate their associated sounds. Span prompts let you select specific time ranges where the target sound occurs, helping the AI identify and separate it throughout the entire file.
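
For developers, here is one way to picture the three prompt types as data. This is only an illustrative sketch: the class names, field names, and the `uses_only_text` helper are hypothetical and are not taken from the AudioSam API. The normalized click coordinates are also an assumption.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class TextPrompt:
    """A natural-language description of the target sound (hypothetical shape)."""
    description: str  # e.g. "remove background traffic noise"


@dataclass(frozen=True)
class VisualPrompt:
    """A click on an object in a video frame (hypothetical shape)."""
    time_s: float  # timestamp of the clicked frame, in seconds
    x: float       # horizontal click position, assumed normalized to 0..1
    y: float       # vertical click position, assumed normalized to 0..1


@dataclass(frozen=True)
class SpanPrompt:
    """A time range in which the target sound occurs (hypothetical shape)."""
    start_s: float
    end_s: float

    def __post_init__(self) -> None:
        # A span must be non-empty and must not start before the file does.
        if not 0 <= self.start_s < self.end_s:
            raise ValueError("span must satisfy 0 <= start_s < end_s")


Prompt = Union[TextPrompt, VisualPrompt, SpanPrompt]


def uses_only_text(prompts: List[Prompt]) -> bool:
    """Return True if every prompt in the list is a text prompt."""
    return all(isinstance(p, TextPrompt) for p in prompts)


# Mixing modes: one text prompt plus one span prompt for the same job.
mixed = [TextPrompt("remove background traffic noise"), SpanPrompt(12.0, 15.5)]
print(uses_only_text(mixed))  # a mixed list is not text-only
```

A check like `uses_only_text` could be used to tell whether a request fits a plan that is limited to text prompts.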

Is my uploaded audio and video data secure?

Absolutely. All files are encrypted with AES-256 during upload and processing. We automatically delete your original and processed files from our servers within 24 hours. Enterprise and Studio users can request immediate deletion or configure custom retention policies. We never use your content to train our models without explicit consent.

Can AudioSam remove vocals from music for karaoke or remixes?

Yes! AudioSam excels at vocal isolation and removal. You can extract clean vocals for remixes, create instrumental versions for karaoke, or separate individual instruments from mixed tracks. Simply use a text prompt like "isolate vocals" or "remove singing voice" to get started. The quality rivals dedicated stem separation tools.

How does the credit system work?

Credits are consumed based on the duration of your audio or video file. One credit equals approximately one minute of processed audio. Complex separations (like isolating multiple sounds simultaneously) may use slightly more credits. The free Hobbyist plan includes 10 credits per month, the Creator plan includes 500 credits per month, and the Studio plan offers unlimited processing.
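
As a rough guide, the arithmetic can be sketched like this. The rounding rule (one credit per started minute) is an assumption, since the page only says "approximately one minute", and the surcharge for complex separations is not modeled. The function names are hypothetical; the plan allowances come from the pricing table.

```python
import math
from typing import Dict, Optional

# Monthly credit allowances from the pricing table, cheapest plan first.
# None stands for "unlimited".
PLAN_CREDITS: Dict[str, Optional[int]] = {
    "Hobbyist": 10,
    "Creator": 500,
    "Studio": None,
}


def estimate_credits(duration_seconds: float) -> int:
    """Estimate credits for one file, assuming 1 credit per started minute."""
    if duration_seconds <= 0:
        return 0
    return math.ceil(duration_seconds / 60)


def smallest_plan_for(monthly_credits: int) -> str:
    """Return the first plan whose allowance covers the monthly estimate."""
    for name, allowance in PLAN_CREDITS.items():
        if allowance is None or monthly_credits <= allowance:
            return name
    return "Studio"


# Example: four 45-minute podcast episodes per month.
episode_credits = estimate_credits(45 * 60)
monthly = 4 * episode_credits
print(episode_credits, monthly, smallest_plan_for(monthly))
```

Treat the result as an estimate only: actual usage may be higher when several sounds are isolated at once.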

What is SAM Audio and how is it different from other AI audio tools?

SAM Audio (Segment Anything Model for Audio) is Meta's groundbreaking AI technology that brings the versatility of image segmentation to audio. Unlike traditional tools that only separate predefined categories (vocals, drums, bass), SAM Audio can isolate any sound you describe. It's the first model to support multi-modal prompting—combining text descriptions, visual cues from video, and temporal selection.

Can I use AudioSam for podcast editing and noise removal?

AudioSam is perfect for podcast production. Remove background noise like air conditioning hum, traffic sounds, or keyboard clicks while preserving crystal-clear speech. You can also isolate individual speakers from group recordings, remove "ums" and "ahs", or extract specific segments. Many professional podcasters use AudioSam as part of their standard workflow.

Is there an API available for developers?

Yes! Studio plan subscribers get full API access with comprehensive documentation, SDKs for Python, JavaScript, and other popular languages, and webhook support for async processing. The API supports all three prompting modes and includes batch processing endpoints for high-volume workflows. Rate limits and concurrent processing capabilities scale with your needs.

How long does audio processing take?

Processing time depends on file length and complexity. Most audio files under 5 minutes are processed in under 30 seconds. Longer files or complex multi-sound separations may take 1-2 minutes. Studio users benefit from priority processing queues, ensuring faster turnaround even during peak usage times. You'll receive a notification when your file is ready.
