SAM Audio API by Meta

SAM Audio API
Segment Any Sound

Integrate Meta's revolutionary SAM Audio technology into your applications. The most powerful audio separation API for developers, powered by the Segment Anything Model for audio.

What is SAM Audio?

SAM Audio (Segment Anything Model for Audio) is Meta's groundbreaking AI technology that brings the versatility of image segmentation to audio processing. Unlike traditional audio separation tools that only work with predefined categories, SAM Audio can isolate any sound you describe.

Text Prompts

Describe any sound in natural language. "Remove the dog barking" or "isolate the piano" - SAM Audio understands.

Visual Prompts

Click on objects in video frames. SAM Audio identifies and isolates the sound associated with that visual element.

Temporal Spans

Select a time range where the target sound occurs. SAM Audio learns the pattern and finds it throughout the file.
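As a rough sketch of how the three prompt types might be modeled in client code, the dataclasses below represent each one and serialize it to a JSON-ready dict. The class names and every field name (`type`, `text`, `frame`, `x`, `y`, `start_s`, `end_s`) are assumptions made for illustration, not the documented request schema.

```python
from dataclasses import dataclass, asdict
from typing import Union


@dataclass(frozen=True)
class TextPrompt:
    text: str  # natural-language description, e.g. "isolate the piano"


@dataclass(frozen=True)
class VisualPrompt:
    frame: int  # index of the video frame that was clicked (assumed field)
    x: float    # click position, assumed normalized to [0, 1]
    y: float


@dataclass(frozen=True)
class TemporalPrompt:
    start_s: float  # start of the example span, in seconds
    end_s: float    # end of the example span, in seconds

    def __post_init__(self):
        # A span must start at or after zero and have positive length.
        if not 0 <= self.start_s < self.end_s:
            raise ValueError("expected 0 <= start_s < end_s")


Prompt = Union[TextPrompt, VisualPrompt, TemporalPrompt]

# Hypothetical type tags; the real API may name these differently.
_TYPE_NAMES = {TextPrompt: "text", VisualPrompt: "visual", TemporalPrompt: "temporal"}


def to_payload(prompt: Prompt) -> dict:
    """Serialize a prompt to a dict tagged with its (assumed) type name."""
    return {"type": _TYPE_NAMES[type(prompt)], **asdict(prompt)}
```

Keeping each prompt type as its own small class makes it easy to validate inputs locally (for example, rejecting a reversed time span) before anything is sent over the network.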

SAM Audio API Features

Everything you need to integrate professional-grade audio separation into your applications.

Lightning Fast Processing

Process audio files in seconds, not minutes. Our optimized SAM Audio infrastructure accepts recordings up to 4 hours long and keeps processing latency low.

// Average processing time
5 min audio = ~8 seconds

RESTful API

Simple, well-documented REST endpoints. Integrate SAM Audio into any stack with our SDKs for Python, JavaScript/TypeScript, Go, and Ruby.

POST /v1/separate
GET /v1/jobs/{job_id}
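The two endpoints above suggest a submit-then-poll workflow. The sketch below shows one way that could look using only the Python standard library. Everything beyond the two paths is an assumption: the placeholder host `api.example.com`, the Bearer authorization header, the JSON fields `file_url` and `prompt`, and the `status` values `completed` and `failed`. The `fetch_job` argument is a caller-supplied function that performs the actual GET request and returns the decoded JSON.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "https://api.example.com"  # placeholder host, not the real endpoint


def build_separate_request(api_key: str, file_url: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/separate request."""
    # Body field names are assumptions for illustration.
    body = json.dumps({"file_url": file_url, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/separate",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


def wait_for_job(
    fetch_job: Callable[[str], dict],
    job_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
) -> dict:
    """Poll GET /v1/jobs/{job_id} through `fetch_job` until the job finishes."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job(job_id)
        if job.get("status") in ("completed", "failed"):  # assumed terminal states
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout_s}s")
        time.sleep(interval_s)
```

Passing the fetch function in as a parameter keeps the polling loop independent of any particular HTTP client and makes it straightforward to exercise with a stub.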

Enterprise Security

AES-256 encryption for all uploads. Files automatically deleted within 24 hours. SOC 2 compliant infrastructure.

  • End-to-end encryption
  • GDPR compliant
  • No data retention beyond 24 hours

All Formats Supported

Input: MP3, WAV, FLAC, AAC, OGG, M4A, MP4, MOV, AVI, MKV. Output in your preferred format with configurable quality settings.

MP3 · WAV · FLAC · MP4 · MOV
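A client can cheaply reject unsupported files before uploading them. The helper below is a minimal sketch that checks a file's extension against the input formats listed above; the function name is hypothetical, and an extension check says nothing about whether the file's contents are actually valid.

```python
from pathlib import Path

# Input formats as listed on this page, stored as lowercase extensions.
SUPPORTED_INPUT_EXTENSIONS = frozenset(
    {".mp3", ".wav", ".flac", ".aac", ".ogg", ".m4a", ".mp4", ".mov", ".avi", ".mkv"}
)


def is_supported_input(path: str) -> bool:
    """Return True if the file extension matches a listed input format."""
    return Path(path).suffix.lower() in SUPPORTED_INPUT_EXTENSIONS
```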

Simple Integration

Get started with SAM Audio API in minutes.

separate_audio.py
import audiosam

# Initialize the SAM Audio API client
client = audiosam.Client("your_api_key")

# Separate audio using text prompt
result = client.separate(
    file="podcast.mp3",
    prompt="remove background music"
)

# Download the separated audio
result.download("clean_podcast.mp3")

SAM Audio API Use Cases

From startups to enterprises, developers use SAM Audio API to build incredible audio experiences.

Music Production Apps

Build stem separation tools, remix creators, and karaoke generators powered by SAM Audio.

Podcast Platforms

Auto-clean uploads, remove background noise, and enhance voice quality at scale.

Video Editing Software

Separate dialogue from background, clean production audio, and create sound effects.

Transcription Services

Improve speech-to-text accuracy by isolating voice from noisy recordings.

Audio Forensics

Extract specific sounds from complex recordings for analysis and investigation.

Gaming & VR

Create dynamic audio environments with sound separation and processing; near-real-time use cases are served by the streaming API (currently in beta).

SAM Audio API Pricing

Flexible pricing for projects of any size.

Developer

$0/mo
  • 100 API calls / mo
  • Text prompts only
  • Community support

Pro

$49/mo
  • 5,000 API calls / mo
  • All prompt types
  • Priority processing
  • Email support

Enterprise

Custom
  • Unlimited API calls
  • Dedicated infrastructure
  • SLA guarantee
  • 24/7 support

SAM Audio API FAQ

Common questions about integrating SAM Audio into your projects.

What is SAM Audio and how does it differ from other audio APIs?
SAM Audio (Segment Anything Model for Audio) is Meta's revolutionary approach to audio separation. Unlike traditional APIs that only separate predefined categories like "vocals" or "drums", SAM Audio can isolate any sound you describe in natural language. It's the first audio API to support multi-modal prompting, which combines text descriptions, visual cues, and temporal selection for precise control.
What are the rate limits for the SAM Audio API?
Developer tier: 100 API calls/month, 10 concurrent requests. Pro tier: 5,000 API calls/month, 50 concurrent requests. Enterprise tier: custom limits based on your needs, with dedicated infrastructure. All tiers support files up to 2 GB and audio up to 4 hours in length.
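One way to stay under the concurrent-request limits is to cap the size of the worker pool on the client side. The sketch below does this with the standard library's `ThreadPoolExecutor`; the function name and the `tier` keys are illustrative, and `task` stands in for whatever function actually submits a request. It does not track the monthly call quota.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")

# Concurrent-request limits quoted in the answer above.
# Enterprise limits are custom, so they are not listed here.
CONCURRENCY_LIMITS = {"developer": 10, "pro": 50}


def run_bounded(task: Callable[[T], R], items: Iterable[T], tier: str = "developer") -> List[R]:
    """Run `task` over `items` with at most the tier's limit of calls in flight."""
    limit = CONCURRENCY_LIMITS[tier]
    with ThreadPoolExecutor(max_workers=limit) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(task, items))
```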
How accurate is SAM Audio for complex audio separation?
SAM Audio achieves state-of-the-art results across standard benchmarks. For vocal isolation, it achieves 8.5+ dB SDR (Signal-to-Distortion Ratio). For arbitrary sound separation via text prompts, accuracy depends on the specificity of your description and audio complexity. Combining multiple prompt types (text + visual + temporal) significantly improves results for challenging scenarios.
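For readers unfamiliar with the metric, the sketch below computes SDR in its most basic form: the ratio of reference-signal energy to error energy, expressed in decibels. Benchmark toolkits often use refined variants (for example, scale-invariant SDR), so this illustrates the idea and is not necessarily the exact procedure behind the figure quoted above.

```python
import math
from typing import Sequence


def sdr_db(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Basic SDR in dB: 10 * log10(signal energy / error energy)."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if signal_energy == 0:
        raise ValueError("reference is silent; SDR is undefined")
    if error_energy == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)
```

Under this definition, 8.5 dB means the target signal carries roughly seven times the energy of the residual error, since 10^0.85 is about 7.08.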
What SDKs and programming languages are supported?
We provide official SDKs for Python, JavaScript/TypeScript, Go, and Ruby. The REST API can be used from any language that can make HTTP requests. All SDKs include asynchronous request support, automatic retries, and webhook handling; the JavaScript/TypeScript SDK also ships with comprehensive TypeScript definitions.
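If you call the REST API directly rather than through an SDK, you would need to handle retries yourself. The sketch below shows a common pattern, exponential backoff with jitter. It is not the SDKs' actual retry policy: the function name, the default attempt count, the delays, and the choice of retryable exceptions are all assumptions.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    max_attempts: int = 4,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, retrying on transient errors with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt (0.5 s, 1 s, 2 s, ...) plus up to 10% jitter.
            delay = base_delay_s * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, 0.1 * delay))
    raise AssertionError("unreachable")
```

The jitter spreads out retries from many clients so they do not all hit the server at the same instant after an outage.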
Is my audio data secure with SAM Audio API?
Yes. All uploads are encrypted with AES-256. Files are automatically deleted within 24 hours (or immediately on request). We are SOC 2 Type II certified and GDPR compliant. Enterprise customers can request dedicated infrastructure with custom retention policies. We never use customer audio to train our models without explicit consent.
Can I use SAM Audio API for real-time audio processing?
The standard API is optimized for batch processing. For real-time or near-real-time use cases, contact our enterprise team about our streaming API (currently in beta), which supports chunk-based processing with sub-second latency for audio segments under 30 seconds.
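Chunk-based processing implies splitting a longer recording into short segments on the client side. The sketch below computes frame ranges for chunks that stay under the 30-second threshold mentioned above. The function name and the 29-second default are assumptions, and how chunks are actually submitted to the streaming API is not covered here.

```python
from typing import Iterator, Tuple


def chunk_spans(
    total_frames: int, sample_rate: int, max_chunk_s: float = 29.0
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) frame ranges, each at most `max_chunk_s` seconds long."""
    if total_frames < 0 or sample_rate <= 0 or max_chunk_s <= 0:
        raise ValueError("total_frames must be >= 0; sample_rate and max_chunk_s must be > 0")
    frames_per_chunk = max(1, int(sample_rate * max_chunk_s))
    for start in range(0, total_frames, frames_per_chunk):
        # The final chunk is truncated to the end of the recording.
        yield start, min(start + frames_per_chunk, total_frames)
```

For example, a 70-second recording at 44.1 kHz yields two 29-second chunks followed by one 12-second chunk.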

Ready to Build with SAM Audio API?

Join thousands of developers using SAM Audio to power next-generation audio applications.