Music & Message on Hold

Hold audio for a queue is chosen by queue.moh_mode. There are two modes, and the choice is about how they scale, not just how they sound.

| Mode | Mechanism | Best for | Cost model |
| --- | --- | --- | --- |
| stream (default) | A continuous, shared, always-decoding stream (stream://moh/<account>). Callers “tune in” at the current position. | Pure background music (shared platform tracks). | Fixed: one decoder per stream, shared by all listeners; gapless resume after an announcement. |
| file | Per-caller playback of the queue’s MOH file (from object storage via the media cache), looped from the start. | Custom uploads & message-on-hold (“your call is important to us…”, promos, heard in order). | Proportional to active holds: decodes only while someone is on hold; restarts from 0 on loop/announcement. |

A shared stream keeps a thread and decoder running 24/7, whether or not anyone is listening. That’s ideal for a handful of shared default tracks, since one decoder serves everyone. But it does not scale per-tenant: 100 customers each with their own uploaded track would mean 100 always-on decoders burning CPU and RAM at 3am with zero calls. Cost would scale with tenant count, not call volume.

So the rule of thumb:

  • Platform default music (a few shared options) → stream. One decoder, shared by everyone on it.
  • Per-customer custom MOH / message-on-hold → file. Cost scales with concurrent holds, which is what you’re provisioning for anyway. Zero calls means zero cost.
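The cost argument above can be made concrete with a rough model. This is an illustrative sketch only: the function name and its parameters are hypothetical, and it assumes one decoder per always-on stream and one decoder per caller on hold in file mode, which matches the description here but is not a statement about the actual implementation.

```python
def active_decoders(mode: str, streams: int, concurrent_holds: int) -> int:
    """Rough count of running decoders under each MOH mode (illustrative model)."""
    if mode == "stream":
        # One always-on decoder per stream, whether or not anyone is listening.
        return streams
    if mode == "file":
        # Assumed: one decoder per caller currently on hold, none when idle.
        return concurrent_holds
    raise ValueError(f"unknown moh_mode: {mode!r}")


# 100 tenants, each with their own track, at 3am with zero calls:
print(active_decoders("stream", streams=100, concurrent_holds=0))  # 100
print(active_decoders("file", streams=100, concurrent_holds=0))    # 0
```

Under this model, stream cost tracks the number of distinct tracks and file cost tracks the number of callers on hold, which is why the rule of thumb splits the way it does.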

file mode trades the gapless shared-stream resume for a small restart-from-zero on each loop or after an announcement. That is an acceptable trade for custom and message-on-hold audio, and it is the only model that scales to many tenants.

{ "id": "q_01j160r23gw5zkpfjme3fwzs5k", "name": "Support", "moh_mode": "file", "moh_media_id": "media_01j22nwntmjd633gxwfhk8zya1" }

| Field | Purpose |
| --- | --- |
| moh_mode | stream (shared) or file (per-caller). |
| moh_media_id | The media_file used for file mode (or a custom stream source). |

The worker passes the MOH source and mode to the edge in the enqueue command.
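As a rough illustration of that hand-off, the sketch below derives a mode and source from a queue record like the one above. The helper name, the `moh_source` key, the `account` argument, and the rule that file mode requires `moh_media_id` are all assumptions made for the example; the real enqueue command format is not documented here.

```python
from typing import TypedDict


class MohFields(TypedDict):
    moh_mode: str
    moh_source: str


def build_moh_fields(queue: dict, account: str) -> MohFields:
    """Pick the MOH mode and source for a queue (hypothetical payload shape)."""
    mode = queue.get("moh_mode", "stream")  # stream is the documented default
    media_id = queue.get("moh_media_id")

    if mode == "file":
        if not media_id:
            # Assumed rule: per-caller playback needs a media file to play.
            raise ValueError("file mode requires moh_media_id")
        source = media_id
    elif mode == "stream":
        # Assumed: a configured media id acts as a custom stream source,
        # otherwise fall back to the shared stream for the account.
        source = media_id or f"stream://moh/{account}"
    else:
        raise ValueError(f"unknown moh_mode: {mode!r}")

    return {"moh_mode": mode, "moh_source": source}


queue = {
    "id": "q_01j160r23gw5zkpfjme3fwzs5k",
    "name": "Support",
    "moh_mode": "file",
    "moh_media_id": "media_01j22nwntmjd633gxwfhk8zya1",
}
print(build_moh_fields(queue, account="acct_demo"))
```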

Position announcements break the hold audio to speak, then resume:

  • stream → rejoins the ongoing music near-gaplessly.
  • file → restarts the file.
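The difference in resume behaviour comes down to where playback picks up after the announcement. A minimal sketch, assuming the shared stream keeps advancing while the caller hears the announcement and loops over a track of known length; the function and parameter names are hypothetical:

```python
def resume_offset(mode: str, interrupted_at_s: float,
                  announcement_s: float, track_s: float) -> float:
    """Offset (in seconds) into the track at which hold audio resumes."""
    if mode == "stream":
        # The shared stream never paused, so the caller rejoins wherever
        # it has got to, wrapping around at the end of the track.
        return (interrupted_at_s + announcement_s) % track_s
    if mode == "file":
        # Per-caller playback restarts the file from the beginning.
        return 0.0
    raise ValueError(f"unknown moh_mode: {mode!r}")


# Interrupted 170 s into a 180 s track by a 15 s announcement:
print(resume_offset("stream", 170.0, 15.0, 180.0))  # 5.0
print(resume_offset("file", 170.0, 15.0, 180.0))    # 0.0
```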

Because the brain only interrupts on a real, changed announcement (the push model), there’s no recurring gap from announcements that would otherwise fire on a fixed timer.
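One way to picture the push model is a small change detector: an announcement is issued only when a caller's position differs from the last one announced to them. The class and method names below are hypothetical and the logic is a sketch of the idea, not the brain's actual code.

```python
class PositionAnnouncer:
    """Announce a queue position only when it has changed for that call."""

    def __init__(self) -> None:
        self._last_announced: dict[str, int] = {}

    def should_announce(self, call_id: str, position: int) -> bool:
        if self._last_announced.get(call_id) == position:
            # Nothing new to say, so the hold audio is left uninterrupted.
            return False
        self._last_announced[call_id] = position
        return True


announcer = PositionAnnouncer()
print(announcer.should_announce("call_1", 3))  # True: first announcement
print(announcer.should_announce("call_1", 3))  # False: position unchanged
print(announcer.should_announce("call_1", 2))  # True: caller moved up
```

Compared with a fixed timer, this interrupts a file-mode caller only when there is news, which limits how often the file restarts from the beginning.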

Streaming an external station (Shoutcast/Icecast) is a conceivable third mode but is deliberately deferred for three reasons: licensing (re-broadcasting a copyrighted station to callers is a legal liability), scale (per-caller connections and decoders, plus station rate-limits, would need a per-node relay), and reliability (an external dependency needs a fallback the instant it fails). For now, use stream for shared music and file for custom audio.