Voximplant now includes a native Deepgram module that connects any Voximplant call to Deepgram’s Voice Agent API for real-time, speech‑to‑speech conversations. You can stream audio from phone numbers, SIP trunks, WhatsApp, or WebRTC into Deepgram’s unified agent environment—combining STT, LLM reasoning, and TTS—and play responses via Voximplant’s serverless runtime with minimal latency.

This integration is delivered as a Voice AI Connector inside VoxEngine. Developers define their Deepgram speech-to-text, LLM, and text-to-speech parameters. Voximplant handles the telephony, media conversion, and streaming WebSockets so you can focus on agent behavior instead of infrastructure.

The Deepgram Voice Agent connector exposes Deepgram’s speech recognition capabilities, options to interface with a wide range of LLMs (currently 18, plus custom options), and many speech synthesis voices from Deepgram’s own library and its partners. It also supports integrated turn-taking, barge-in, and mid-session adjustments.

Highlights

  • Unified Voice Agent on any call path
    Bridge PSTN, SIP, WebRTC, or WhatsApp calls from Voximplant directly into Deepgram Voice Agent using a single VoxEngine scenario, with no custom media gateway required.
  • Enterprise-ready from network to voice stack
    Combine Voximplant’s global telephony, Voice AI orchestration, and compliance features (DIDs, SIP, recording, monitoring) with Deepgram Voice Agent’s production-grade speech models and detailed event stream to build low-latency voice agents that plug cleanly into existing contact center and IT environments.
  • Low‑latency, interruption‑friendly conversations
    Use Deepgram’s turn‑taking and VoxEngine’s media buffer control to keep conversations natural: clear buffered audio when the user starts speaking and let the agent resume smoothly.
  • Full access to Deepgram Voice Agent configurations, events and controls
    The VoxEngine connector exposes the full Deepgram Voice Agent stack. You can apply and update agent configurations mid-call, subscribe to all Voice Agent events (for example, AgentThinking when the agent is formulating a response and History for a structured log of the conversation), and handle function calls to trigger external integrations.

Developer notes

  • Native VoxEngine module
    Load the integration with require(Modules.Deepgram); and create a Deepgram.VoiceAgentClient via Deepgram.createVoiceAgentClient(voiceAgentClientParameters).
  • Session setup: passthrough settings object
    Pass the Deepgram Voice Agent settings object to specify agent details such as the listen (speech-to-text), think (LLM), and speak (text-to-speech) options. Do not include audio settings: the connector hardcodes these for optimum voice quality with Voximplant.
  • Support for all Deepgram events
    All of Deepgram’s Voice Agent events are supported and are listed under the `Deepgram.VoiceAgentEvents` enum.
    Subscribe to events such as ConversationText, AgentThinking, Error, Warning, and History to log transcripts, debug behavior, and capture analytics on turn times and error rates.
    In addition, VoxEngine provides `WebSocketMediaStarted` and `WebSocketMediaEnded`, along with other debugging events, to track when media is flowing across the WebSocket.
  • Session control messages
    Make mid-call updates without reconnecting using sendUpdatePrompt (role/instructions) and sendUpdateSpeak. Inject text for hard-coded flows with sendInjectUserMessage and sendInjectAgentMessage.
  • Function calling
    Functions (tools) are defined in the `functions` array of Deepgram’s think object (see the Deepgram functions docs). When the LLM invokes one, handle the Deepgram.VoiceAgentEvents.FunctionCallRequest event, run your function logic, and return the result with a FunctionCallResponse.
  • Barge‑in and media control
    Listen for Deepgram.VoiceAgentEvents.UserStartedSpeaking and call voiceAgentClient.clearMediaBuffer() to cancel current TTS audio when the user interrupts, keeping the dialog natural.
  • Deepgram ASR Module
    This new connector is independent of the existing speech-to-text (STT) capability available through VoxEngine’s ASR module and its createASR method, for example:
    asr = VoxEngine.createASR({
        model: ASRModelList.Deepgram.nova3_medical,
        profile: ASRProfileList.Deepgram.en_US,
        parameters: { redact: 'numbers', keywords: ['Voximplant', 'Otolaryngology'] }
    });
    See our Speech recognition guide for details. You can align the ASR parameters with the `listen` object of your Deepgram Voice Agent to keep transcription consistent before and after you invoke the agent.
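
The function-calling note above could be sketched roughly as follows. This is a minimal, hypothetical dispatcher rather than the connector's API: `get_order_status`, `handlers`, and `buildFunctionCallResponses` are illustrative names, and the payload shapes (a request carrying a `functions` array of `{ id, name, arguments }` with JSON-encoded `arguments`, answered by `{ id, name, content }`) are assumptions that should be checked against the Deepgram functions docs. How the response is sent back through the connector is deliberately left out.

```javascript
// Hypothetical entry for the `think.functions` array.
// The name / description / parameters (JSON Schema) layout is an assumption.
const FUNCTIONS = [
  {
    name: "get_order_status",
    description: "Look up the status of an order by its ID.",
    parameters: {
      type: "object",
      properties: { order_id: { type: "string", description: "Order identifier" } },
      required: ["order_id"],
    },
  },
];

// Local handlers keyed by function name (illustrative only).
const handlers = {
  get_order_status: ({ order_id }) => ({ order_id, status: "shipped" }),
};

// Turn an assumed FunctionCallRequest payload into one response object per call.
function buildFunctionCallResponses(request, registry) {
  const calls = Array.isArray(request?.functions) ? request.functions : [];
  return calls.map(({ id, name, arguments: rawArgs }) => {
    let content;
    try {
      // Only accept handlers defined directly on the registry object.
      const known = Object.prototype.hasOwnProperty.call(registry, name);
      if (!known || typeof registry[name] !== "function") {
        throw new Error(`Unknown function: ${name}`);
      }
      const args = rawArgs ? JSON.parse(rawArgs) : {};
      content = JSON.stringify(registry[name](args));
    } catch (err) {
      // Report failures to the LLM instead of throwing inside an event handler.
      content = JSON.stringify({ error: String(err?.message ?? err) });
    }
    return { type: "FunctionCallResponse", id, name, content };
  });
}
```

Returning an error payload instead of throwing lets the agent explain the failure to the caller instead of going silent mid-turn.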

Demo video

See the video below for a demonstration of a live phone call connected to Deepgram Voice Agent through Voximplant.

Pricing and availability

The Deepgram module is available globally in VoxEngine as part of Voximplant’s Voice AI offering.

Deepgram Voice Agent usage through Voximplant is billed as a Voice AI Connector:

  • Voximplant side: audio streaming is charged at $0.001 per 15 seconds, counting both inbound and outbound audio
  • Deepgram side: you pay Deepgram directly for Voice Agent API usage according to your Deepgram plan (e.g., hourly Voice Agent pricing published by Deepgram)

There is no additional Voximplant surcharge based on which Deepgram models you choose—the connector cost is purely based on streaming duration.
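
As a rough worked example of the Voximplant-side charge, the sketch below estimates the streaming cost of a call. It assumes that billing rounds up to whole 15-second increments and that inbound and outbound audio are summed before rounding; neither detail is stated above, so treat the pricing page as authoritative. `estimateStreamingCost` is a hypothetical helper.

```javascript
const UNIT_SECONDS = 15;       // billing increment from the pricing above
const UNITS_PER_DOLLAR = 1000; // $0.001 per increment, kept as an integer ratio to avoid float drift

// Estimate the connector streaming charge for one call.
function estimateStreamingCost(inboundSeconds, outboundSeconds) {
  const totalSeconds = inboundSeconds + outboundSeconds; // both directions count
  const units = Math.ceil(totalSeconds / UNIT_SECONDS);  // assumed: round up to whole increments
  return { units, usd: units / UNITS_PER_DOLLAR };
}

// A 5-minute call streaming in both directions for its full length:
// 300 s + 300 s = 600 s, or 40 increments.
console.log(estimateStreamingCost(300, 300)); // { units: 40, usd: 0.04 }
```

Deepgram's own Voice Agent charges come on top of this figure and are not modeled here.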

For full details, see the Voice AI Connectors section on the Voximplant pricing page.

Quick start: Deepgram Voice Agent VoxEngine scenario


The scenario below connects an incoming call to a Deepgram Voice Agent, configures the listen / think / speak providers, keeps a short per-caller conversation history across calls, and wires Deepgram events to VoxEngine logging.

/**
* Voximplant + Deepgram Voice Agent connector demo
* Feature highlighted: CROSS-CALL CONVERSATION HISTORY (simple “memory” per caller ID)
*/
require(Modules.Deepgram); // Deepgram Voice Agent connector
require(Modules.ApplicationStorage);
const SYSTEM_PROMPT = `
You are a helpful English-speaking voice assistant for phone callers. Your name is Voxi.
You are speaking on behalf of Voximplant and Deepgram about Voximplant's Deepgram Voice Agent integration.
Keep your turns short and telephony-friendly (usually 1–2 sentences).


If conversation history is present (prior turns are provided), begin the call by briefly summarizing what the user asked or tried to accomplish last time in one sentence before asking how you can help.


KNOWLEDGE ABOUT VOXIMPLANT + DEEPGRAM
- The Voximplant Deepgram Voice Agent connector exposes the full Deepgram Voice Agent configuration and event stream (e.g., AgentThinking and History) and supports low-latency, interruption-friendly conversations via turn-taking, barge-in, mid-session updates, and conversation history.
- Voximplant can bridge phone, SIP, WebRTC, or WhatsApp calls from Voximplant directly into Deepgram Voice Agent using a single VoxEngine scenario, with Voximplant handling the telephony, media conversion, and streaming WebSockets so you can focus on agent behavior.
`;
VoxEngine.addEventListener(AppEvents.CallAlerting, async ({ call }) => {
   let voiceAgentClient;
   try {
       call.answer();
       call.record({ hd_audio: true, stereo: true }); // optional: call recording
       // -------------------- CROSS-CALL HISTORY (per caller ID) --------------------
       const callerId = call.callerid() || null; // Caller ID is our “user identity”
       const historyKey = `dgva_history:${callerId}`;
       let rollingHistory = [];
       if (callerId) { // only look up history when the caller can be identified
           try {
               const stored = await ApplicationStorage.get(historyKey);
               const parsed = stored?.value ? JSON.parse(stored.value) : [];
               if (Array.isArray(parsed))
                   rollingHistory = parsed;
           }
           catch (e) {
               // Missing or corrupt history should not end the call
               Logger.write(`Could not load history: ${e?.message ?? e}`);
           }
       }
       Logger.write("===CROSS_CALL_HISTORY_LOADED===");
       Logger.write(`Loaded ${rollingHistory.length} messages for caller ${callerId}`);
       // -------------------- Deepgram Voice Agent settings --------------------
       const SETTINGS_OPTIONS = {
           tags: ["voximplant", "deepgram", "voice_agent_connector", "cross_call_history_demo"],
           agent: {
               language: "en",
               greeting: "Hi! I'm Voxi. How can I help today?",
               context: {
                   messages: rollingHistory, // pass prior turns into the new session as conversation history.
               },
               listen: {
                   provider: {
                       type: "deepgram",
                       model: "flux-general-en",
                   },
               },
               think: {
                   provider: {
                       type: "open_ai",
                       model: "gpt-4o-mini",
                       temperature: 0.6,
                   },
                   prompt: SYSTEM_PROMPT
               },
               speak: {
                   provider: {
                       type: "deepgram",
                       model: "aura-2-cordelia-en",
                   },
               },
           },
       };
       // Create client and wire media
       voiceAgentClient = await Deepgram.createVoiceAgentClient({
           apiKey: (await ApplicationStorage.get("DEEPGRAM_API_KEY")).value, // Add your API key or save it to storage
           settingsOptions: SETTINGS_OPTIONS,
       });
       VoxEngine.sendMediaBetween(call, voiceAgentClient);
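       // Single terminate path that saves history before ending the session
       async function terminate() {
           try {
               if (callerId) { // skip anonymous callers so they do not share one history
                   await ApplicationStorage.put(historyKey, JSON.stringify(rollingHistory), 7 * 24 * 3600); // TTL in seconds (7 days)
                   Logger.write("===CROSS_CALL_HISTORY_SAVED===");
                   Logger.write(JSON.stringify({ callerId, messages: rollingHistory.length }));
               }
           }
           catch (e) {
               // JSON.stringify on an Error yields "{}", so log the message instead
               Logger.write(`Error during terminate: ${e?.message ?? e}`);
           }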
           finally {
               try {
                   voiceAgentClient?.close();
               }
               catch (e) {
                   Logger.write(e);
               }
               VoxEngine.terminate();
           }
       }
       call.addEventListener(CallEvents.Disconnected, function () { terminate(); });
       call.addEventListener(CallEvents.Failed, function () { terminate(); });
       // ---------------------- Event handlers -----------------------
       // Barge-in: keep conversation responsive
       voiceAgentClient.addEventListener(Deepgram.VoiceAgentEvents.UserStartedSpeaking, () => {
           Logger.write("===BARGE-IN: Deepgram.VoiceAgentEvents.UserStartedSpeaking===");
           voiceAgentClient.clearMediaBuffer();
       });
       // Log each transcript line as it arrives
       voiceAgentClient.addEventListener(Deepgram.VoiceAgentEvents.ConversationText, (e) => {
           const { role, text } = e?.data?.payload ?? {}; // avoid destructuring undefined
           Logger.write(`✍️: ${role}: ${text}`);
       });
       // Update rolling history
       voiceAgentClient.addEventListener(Deepgram.VoiceAgentEvents.History, (e) => {
           const entry = e?.data?.payload;
           if (!entry) return; // ignore events without a payload
           rollingHistory.push(entry);
           rollingHistory = rollingHistory.slice(-5); // keep the last 5 entries; production code should also cap total size
       });
       // Consolidated “log-only” handlers - all Deepgram API events supported
       [
           Deepgram.VoiceAgentEvents.Welcome,
           Deepgram.VoiceAgentEvents.SettingsApplied,
           Deepgram.VoiceAgentEvents.AgentThinking,
           Deepgram.VoiceAgentEvents.AgentAudioDone,
           Deepgram.VoiceAgentEvents.ConnectorInformation,
           Deepgram.VoiceAgentEvents.HTTPResponse,
           Deepgram.VoiceAgentEvents.Warning,
           Deepgram.VoiceAgentEvents.Error,
           Deepgram.VoiceAgentEvents.Unknown,
           Deepgram.Events.WebSocketMediaStarted,
           Deepgram.Events.WebSocketMediaEnded
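       ].forEach((eventName) => {
           // Distinct parameter names avoid shadowing the event name with the event object
           voiceAgentClient.addEventListener(eventName, (event) => {
               Logger.write(`===${event.name}===`);
               Logger.write(JSON.stringify(event));
           });
       });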
   }
   catch (e) {
       Logger.write("===UNHANDLED_ERROR===");
       Logger.write(e);
       try {
           voiceAgentClient?.close();
       }
       catch (err) {
           Logger.write(err);
       }
       VoxEngine.terminate();
   }
});
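
The scenario's comments note that production code should bound the stored history. One possible sketch is shown below. It assumes history entries are objects with a string `content` field, which is an assumption about the payload shape, and it uses a character budget as a stand-in for whatever storage or prompt-size limit applies. `trimHistory` and its default limits are hypothetical.

```javascript
// Keep at most `maxMessages` recent entries whose combined content
// fits within `maxChars`, dropping the oldest entries first.
function trimHistory(messages, maxMessages = 5, maxChars = 2000) {
  // Discard empty or malformed entries before counting.
  const valid = messages.filter((m) => m && typeof m.content === "string");
  // slice(-0) would return the whole array, so handle zero explicitly.
  const recent = maxMessages > 0 ? valid.slice(-maxMessages) : [];

  const kept = [];
  let used = 0;
  // Walk from newest to oldest so the most recent turns survive.
  for (let i = recent.length - 1; i >= 0; i--) {
    const size = recent[i].content.length;
    if (used + size > maxChars) break;
    kept.unshift(recent[i]); // preserve chronological order
    used += size;
  }
  return kept;
}
```

In the scenario, a helper like this could replace the bare `slice(-5)` in the History handler or run once just before the history is saved.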

References

Voximplant docs

Deepgram docs