Voice AI

Twilio Media Streams

Bidirectional audio streaming for phone calls

Media Streams gives your server a WebSocket connection to a live phone call: you receive the caller's audio in real time and can send audio back to be played on the call.

Webhook Setup

When a call comes in, Twilio sends an HTTP request to your webhook. Respond with TwiML that connects the call to a stream:

// process.env.SERVER must hold this server's public hostname, without a scheme
app.post('/incoming', (req, res) => {
  const twimlResponse = `<?xml version="1.0" encoding="UTF-8"?>
    <Response>
      <Connect>
        <Stream url="wss://${process.env.SERVER}/connection" />
      </Connect>
    </Response>`;
  
  res.type('text/xml');
  res.send(twimlResponse);
});
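Twilio does not pass query-string parameters on the Stream URL through to your WebSocket server, so per-call data is usually sent with nested Parameter elements, which arrive in the start message under start.customParameters. The sketch below shows one way to build that TwiML; buildStreamTwiml, escapeXml, and the callerName parameter are hypothetical names chosen for illustration, not part of any Twilio library.

```javascript
// Escape characters that are not allowed inside an XML attribute value.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build a <Connect><Stream> TwiML document.
// `params` is a hypothetical map of custom key/value pairs.
function buildStreamTwiml(wsUrl, params = {}) {
  const paramTags = Object.entries(params)
    .map(([name, value]) =>
      `      <Parameter name="${escapeXml(name)}" value="${escapeXml(value)}" />`)
    .join('\n');

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<Response>',
    '  <Connect>',
    `    <Stream url="${escapeXml(wsUrl)}">`,
    paramTags, // empty string when there are no parameters
    '    </Stream>',
    '  </Connect>',
    '</Response>',
  ].filter(Boolean).join('\n');
}

// Usage: the result can be sent from the webhook handler with res.send(...)
const twiml = buildStreamTwiml('wss://example.com/connection', { callerName: 'A&B' });
console.log(twiml);
```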

WebSocket Connection

Handle the WebSocket connection for audio streaming:

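const WebSocket = require('ws');
// `server` is the Node HTTP server your app listens on.
// `path` matches the <Stream> URL from the TwiML above.
const wss = new WebSocket.Server({ server, path: '/connection' });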

wss.on('connection', (ws) => {
  let streamSid = null;
  
  ws.on('message', (message) => {
    const msg = JSON.parse(message);
    
    switch (msg.event) {
      case 'start':
        // Stream started, save the streamSid
        streamSid = msg.start.streamSid;
        break;
        
      case 'media': {
        // Audio data from the caller (base64-encoded mulaw)
        const audio = msg.media.payload;
        // Send to STT service
        break;
      }
        
      case 'stop':
        // Call ended
        break;
    }
  });
});
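Most speech-to-text services expect raw bytes rather than a base64 string, so the media payload usually needs decoding before it is forwarded. The following is a minimal sketch of that step as a pure function; parseTwilioMessage is a hypothetical helper name, and the sample frames are made up for the example.

```javascript
// Parse one raw WebSocket frame from Twilio and return a small summary object.
// `raw` may be a string or a Buffer (the ws library delivers Buffers).
function parseTwilioMessage(raw) {
  const msg = JSON.parse(raw.toString());

  switch (msg.event) {
    case 'start':
      return {
        type: 'start',
        streamSid: msg.start.streamSid,
        callSid: msg.start.callSid,
      };
    case 'media':
      // Decode base64 into raw mulaw bytes, ready to forward to an STT service.
      return { type: 'media', audio: Buffer.from(msg.media.payload, 'base64') };
    case 'stop':
      return { type: 'stop' };
    default:
      // connected, mark, and any other event: pass the name through unchanged.
      return { type: msg.event };
  }
}

// Usage with a made-up media frame containing three bytes of audio
const sampleFrame = JSON.stringify({
  event: 'media',
  media: { payload: Buffer.from([0xff, 0x7f, 0x00]).toString('base64') },
});
console.log(parseTwilioMessage(sampleFrame).audio.length);
```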

Audio Format

Twilio sends audio in the following format and expects the same format for audio you send back:

Property     Value
Encoding     mulaw (μ-law)
Sample rate  8000 Hz
Channels     Mono
Format       Base64 encoded
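Because each mulaw sample is one byte, 8000 Hz mono works out to 8000 bytes per second, so 160 bytes cover 20 ms of audio. Services that require linear PCM need the samples expanded to 16 bits first. Below is a sketch of the standard G.711 μ-law expansion; the function names are hypothetical, and a production system might prefer a tested audio library.

```javascript
// Expand one 8-bit mulaw sample to a signed 16-bit linear PCM value (G.711).
function mulawToPcm16(byte) {
  const u = ~byte & 0xff;          // mulaw bytes are stored bit-inverted
  const sign = u & 0x80;
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return sign ? -magnitude : magnitude;
}

// Decode a base64 media payload into 16-bit PCM samples.
function decodeMulawPayload(base64Payload) {
  const bytes = Buffer.from(base64Payload, 'base64');
  const pcm = new Int16Array(bytes.length);
  for (let i = 0; i < bytes.length; i++) {
    pcm[i] = mulawToPcm16(bytes[i]);
  }
  return pcm;
}

// Duration of a mulaw buffer: 1 byte per sample at 8000 samples per second.
function durationMs(byteLength) {
  return (byteLength / 8000) * 1000;
}

// Usage: 0xff is mulaw silence, 0x00 and 0x80 are the two extremes.
const payload = Buffer.from([0xff, 0x00, 0x80]).toString('base64');
console.log(Array.from(decodeMulawPayload(payload)));
```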

Sending Audio Back

Send audio back to the caller:

function sendAudio(ws, streamSid, audioPayload) {
  const message = {
    event: 'media',
    streamSid: streamSid,
    media: {
      payload: audioPayload  // base64 encoded mulaw audio
    }
  };
  ws.send(JSON.stringify(message));
}
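Text-to-speech output often arrives as one long buffer rather than a ready-made payload. The sketch below splits a raw mulaw buffer into frames and sends each as a media message. The 160-byte (20 ms) frame size is an assumption made for this example, not a Twilio requirement, and sendAudioBuffer is a hypothetical helper. The buffer must contain raw mulaw samples only, with no file header such as a WAV header.

```javascript
const FRAME_BYTES = 160; // 20 ms of 8 kHz, 8-bit mulaw (assumed chunk size)

// Split `mulawBuffer` into frames and send each one as a media message.
// `ws` is any object with a send(string) method, such as a ws WebSocket.
// Returns the number of messages sent.
function sendAudioBuffer(ws, streamSid, mulawBuffer) {
  let sent = 0;
  for (let offset = 0; offset < mulawBuffer.length; offset += FRAME_BYTES) {
    // The final frame may be shorter than FRAME_BYTES.
    const frame = mulawBuffer.subarray(offset, offset + FRAME_BYTES);
    ws.send(JSON.stringify({
      event: 'media',
      streamSid,
      media: { payload: frame.toString('base64') },
    }));
    sent += 1;
  }
  return sent;
}

// Usage with a stand-in socket that just records what would be sent
const recorded = [];
const fakeSocket = { send: (text) => recorded.push(text) };
sendAudioBuffer(fakeSocket, 'MZ123', Buffer.alloc(400, 0xff)); // 400 bytes of silence
console.log(recorded.length);
```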

Clear Audio Buffer

Stop current playback (for interruption handling):

function clearAudio(ws, streamSid) {
  const message = {
    event: 'clear',
    streamSid: streamSid
  };
  ws.send(JSON.stringify(message));
}

Message Types

Event      Direction     Description
connected  Twilio → App  WebSocket connected
start      Twilio → App  Stream started, contains streamSid
media      Both          Audio data (base64 mulaw)
stop       Twilio → App  Stream ended
clear      App → Twilio  Clear audio buffer
mark       Both          App marks a position in the outbound audio; Twilio echoes the mark once playback reaches it
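Marks and clear work together for interruption handling: a mark sent after the last media message of an utterance is echoed back by Twilio once that audio has finished playing, so unechoed marks indicate audio is still queued. The following is a rough sketch of that bookkeeping. PlaybackTracker and its method names are hypothetical, and it assumes Twilio also echoes outstanding marks after a clear, which is what empties the pending set following an interruption.

```javascript
// Tracks which utterances are still queued or playing on the Twilio side.
class PlaybackTracker {
  constructor(ws, streamSid) {
    this.ws = ws;               // any object with a send(string) method
    this.streamSid = streamSid;
    this.pending = new Set();   // mark names sent but not yet echoed back
    this.counter = 0;
  }

  // Call after sending the last media message of an utterance.
  markEnd(label = 'utterance') {
    const name = `${label}-${++this.counter}`;
    this.pending.add(name);
    this.ws.send(JSON.stringify({
      event: 'mark',
      streamSid: this.streamSid,
      mark: { name },
    }));
    return name;
  }

  // Call when a 'mark' event arrives from Twilio.
  onMark(msg) {
    this.pending.delete(msg.mark.name);
  }

  isPlaying() {
    return this.pending.size > 0;
  }

  // Call when the caller starts talking; clears queued audio if any remains.
  interrupt() {
    if (!this.isPlaying()) return false;
    this.ws.send(JSON.stringify({ event: 'clear', streamSid: this.streamSid }));
    return true;
  }
}

// Usage with a stand-in socket that records outgoing messages
const log = [];
const tracker = new PlaybackTracker({ send: (text) => log.push(JSON.parse(text)) }, 'MZ123');
tracker.markEnd('greeting');
console.log(tracker.isPlaying(), tracker.interrupt());
```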

Configure Phone Number

Using Twilio CLI:

twilio phone-numbers:update +1XXXXXXXXXX \
  --voice-url=https://your-server.com/incoming

Alternatively, set the same voice webhook URL for the phone number in the Twilio Console.