# Web Calls

Web calls let your users speak directly to an AI agent from their browser — no phone number required. The browser connects via WebRTC and the agent runs exactly like it does for phone calls.
We're working on official SDKs for React, Vue, and vanilla JavaScript. In the meantime, this guide provides everything you need to build a working integration using the open-source `livekit-client` and `@livekit/components-react` packages.

## How It Works

```text
User clicks "Talk to AI"
        ↓
Your backend → POST https://agent-api.hmsovereign.com/v1/web-calls (Bearer <org_api_key>)
        ↓
HMS validates key, creates voice room, dispatches agent → returns { token, server_url }
        ↓
Your backend passes token + server_url to the browser
        ↓
Browser connects via WebRTC (microphone audio) using the token
        ↓
AI agent picks up → full STT → LLM → TTS pipeline
        ↓
Call ends → summary, transcript, credits deducted, webhook fired
```
The call appears in your Calls dashboard with `direction: "web"` and is billed at the same per-minute rate as phone calls.
> **Security:** Never expose your API key to the browser. Always proxy web call requests through your own backend server. The browser only ever receives the short-lived token.

## API Reference

### Create Web Call

```http
POST https://agent-api.hmsovereign.com/v1/web-calls
```

### Authentication

| Header | Value |
| --- | --- |
| `Authorization` | `Bearer YOUR_API_KEY` |
| `Content-Type` | `application/json` |

Your organization API key is found in the HMS Sovereign dashboard under Settings > API Keys. The `org_id` is automatically derived from the key — you do not need to pass it.

### Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `assistant_id` | string (uuid) | No* | Saved agent to use. Must belong to your organization. |
| `assistant_override` | object | No | Partial field overrides applied on top of `assistant_id` (hybrid mode). Requires `assistant_id`. |
| `assistant` | object | No* | Full inline agent config (transient mode). Cannot be combined with `assistant_override`. |

\*At least one of `assistant_id` or `assistant` is required. See configuration modes below.
**Reference mode** — use a saved agent as-is:

```json
{
  "assistant_id": "17a0cb75-fa09-4bdd-9a44-92a70d829c88"
}
```
**Hybrid mode** — saved agent with partial overrides:

```json
{
  "assistant_id": "17a0cb75-fa09-4bdd-9a44-92a70d829c88",
  "assistant_override": {
    "first_message": "Custom greeting!",
    "llm_config": {
      "messages": [{ "role": "system", "content": "You are a sales agent." }]
    }
  }
}
```
**Transient mode** — full inline config, no saved agent required:

```json
{
  "assistant": {
    "stt_config": { "provider": "deepgram", "model": "nova-3", "language": "en" },
    "llm_config": {
      "provider": "openai",
      "model": "gpt-4.1-mini",
      "messages": [{ "role": "system", "content": "You are a helpful assistant." }]
    },
    "tts_config": { "provider": "elevenlabs", "voice_id": "your-voice-id" },
    "first_message": "Hello! How can I help you?"
  }
}
```
Available fields in `assistant` / `assistant_override`: `stt_config`, `llm_config`, `tts_config`, `first_message`, `business_name`, `name`, `analysis_plan`, `autonomous_silence_handling`, `gdpr_mode`, `webhook_url`, `webhook_secret`, `webhook_events`.

Transient mode requires `stt_config`, `llm_config`, and `tts_config` inside `assistant`.

### Configuration Modes

| Mode | When to use |
| --- | --- |
| Reference | Use a saved agent exactly as configured in the dashboard |
| Hybrid | Use a saved agent but override specific fields per-call (e.g. dynamic first message, custom system prompt) |
| Transient | Fully define the agent inline — useful for dynamic or ephemeral agents not saved in the dashboard |

### Success Response — 200 OK

```json
{
  "success": true,
  "call_id": "3f2a1b4c-5d6e-7f8a-9b0c-1d2e3f4a5b6c",
  "room_name": "web-3f2a1b4c",
  "token": "<jwt>",
  "server_url": "wss://rtc.hmsovereign.com"
}
```
| Field | Type | Description |
| --- | --- | --- |
| `success` | boolean | Always `true` on 200 |
| `call_id` | string (uuid) | Unique call ID — appears in your calls dashboard |
| `room_name` | string | Voice room name |
| `token` | string | Short-lived JWT (5-minute TTL) — pass this to the browser to connect |
| `server_url` | string | WebSocket URL for the voice server (always `wss://`) |

> **Important:** the token expires after 5 minutes whether or not the user joins. Once the call starts, it runs until the user hangs up or the 5-minute max duration is hit.

### Error Responses

| Status | Detail | Meaning |
| --- | --- | --- |
| 400 | `Must provide assistant_id, assistant, or both` | No configuration provided |
| 400 | `assistant_override requires assistant_id` | Override provided without a base agent |
| 400 | `Cannot provide both assistant and assistant_override` | Ambiguous config mode |
| 400 | `Transient agent must provide: stt_config, llm_config, tts_config` | Transient mode missing required configs |
| 401 | `Invalid API key` | API key not recognized |
| 402 | `Insufficient credits to start web call` | Organization has no balance |
| 403 | `Agent not found or does not belong to this organization` | Invalid `assistant_id` for this org |
| 429 | `Maximum 3 concurrent web calls per organization` | Too many active calls — user must wait |
| 500 | `Failed to create web call session: ...` | Server error |

### Limits

| Limit | Value |
| --- | --- |
| Max concurrent web calls per organization | 3 |
| Max call duration | 5 minutes |
| Token TTL (time to join) | 5 minutes |
| Room auto-deleted if nobody joins | 60 seconds |

## Integration Guide

### 1. Set Up Your Backend

Create an endpoint that proxies requests to the HMS Sovereign API. This keeps your API key secure on the server — the browser only ever receives the short-lived token. The same pattern works in Node.js/Express, Python/FastAPI, or a Next.js API route.
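A minimal framework-agnostic sketch in TypeScript (Node 18+, built-in `fetch`). The `HMS_API_KEY` environment variable name and the `buildWebCallRequest`/`createWebCall` helpers are illustrative assumptions, not an official SDK:

```typescript
// Server-side sketch (Node 18+, built-in fetch). The API key never leaves
// your backend; the browser only ever receives the short-lived token.
// Assumptions: an HMS_API_KEY env var and a caller-supplied assistant ID.

const HMS_ENDPOINT = "https://agent-api.hmsovereign.com/v1/web-calls";

// Pure helper so the request shape can be unit-tested without network access.
function buildWebCallRequest(apiKey: string, assistantId: string) {
  return {
    url: HMS_ENDPOINT,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ assistant_id: assistantId }),
    },
  };
}

// Call this from your own route handler (Express, Next.js, etc.).
async function createWebCall(assistantId: string) {
  const { url, init } = buildWebCallRequest(process.env.HMS_API_KEY ?? "", assistantId);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`HMS web call failed: ${res.status} ${await res.text()}`);
  }
  // Forward only what the browser needs.
  const { token, server_url, call_id } = await res.json();
  return { token, server_url, call_id };
}
```

Mount `createWebCall` behind a route of your own (for example an Express `POST` handler) and return its result as JSON.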
### 2. Install the Client SDK

Install `livekit-client` in your frontend project to connect to the voice session, using npm, yarn, or pnpm (for example, `npm install livekit-client`). For React projects, also install the React components library: `npm install @livekit/components-react`.
### 3. Connect to the Voice Session

Fetch the token from your backend and connect to the voice room, either in vanilla JavaScript or with the React components.
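As a browser-side sketch (not an official SDK): fetch a session from a backend route of your own (here `POST /api/web-call`, an illustrative name), validate it, then join the room with `livekit-client`. The `isValidSession` helper is introduced here for illustration:

```typescript
// Browser-side sketch. Assumes your backend exposes a proxy route
// (here POST /api/web-call -- an illustrative name) as in step 1.

interface WebCallSession {
  token: string;
  server_url: string;
  call_id?: string;
}

// Pure helper: sanity-check the session before connecting.
function isValidSession(s: Partial<WebCallSession>): boolean {
  return (
    typeof s.token === "string" &&
    s.token.length > 0 &&
    typeof s.server_url === "string" &&
    s.server_url.startsWith("wss://")
  );
}

async function startWebCall(assistantId: string) {
  const res = await fetch("/api/web-call", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ assistantId }),
  });
  const session: Partial<WebCallSession> = await res.json();
  if (!isValidSession(session)) throw new Error("Invalid web call session");

  // Loaded lazily so this module can be parsed without the package installed.
  // @ts-ignore -- livekit-client types come from your frontend build
  const { Room } = await import("livekit-client");
  const room = new Room();
  await room.connect(session.server_url!, session.token!);
  // Publishing the microphone triggers the browser's permission prompt.
  await room.localParticipant.setMicrophoneEnabled(true);
  return room; // call room.disconnect() to hang up
}
```

Calling `room.disconnect()` hangs up; the `setMicrophoneEnabled(true)` call is what triggers the microphone permission prompt.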

## Live Transcription

The agent automatically publishes real-time transcriptions for both user speech (STT output) and agent speech (synchronized with TTS playback) via the `lk.transcription` text stream topic. This is enabled by default — no configuration needed.

### How It Works

| Source | Description |
| --- | --- |
| User speech | The agent runs STT and publishes the recognized text. Interim results arrive first (`lk.transcription_final: "false"`), followed by the final result (`"true"`). |
| Agent speech | The agent's text is synchronized word-by-word with audio playback. If the agent is interrupted, the transcription is truncated to match what was actually spoken. |

Each speech segment has a unique `lk.segment_id`. Interim and final results share the same ID, so you can replace interim entries with the final version.

### Vanilla JavaScript
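Without React, you can register the same handler directly on the connected `Room`. A sketch — `room` is assumed to come from your connect step, and the `upsertSegment` helper (introduced here) implements the replace-interim-with-final logic described above:

```typescript
// Sketch: vanilla text-stream handling for the lk.transcription topic.
// `room` is the connected livekit-client Room from the previous step.

interface Segment {
  id: string;
  role: "user" | "agent";
  text: string;
  isFinal: boolean;
}

// Pure helper: interim and final results share a segment ID, so a new
// result for an existing ID replaces the old entry.
function upsertSegment(entries: Segment[], seg: Segment): Segment[] {
  const i = entries.findIndex((e) => e.id === seg.id);
  if (i >= 0) {
    const next = entries.slice();
    next[i] = seg;
    return next;
  }
  return [...entries, seg];
}

let transcript: Segment[] = [];

function attachTranscription(room: any) {
  room.registerTextStreamHandler(
    "lk.transcription",
    async (reader: any, participant: { identity: string }) => {
      const text = await reader.readAll();
      const attrs = reader.info.attributes;
      transcript = upsertSegment(transcript, {
        id: attrs["lk.segment_id"] ?? reader.info.id,
        role: participant.identity === room.localParticipant.identity ? "user" : "agent",
        text,
        isFinal: attrs["lk.transcription_final"] === "true",
      });
      // Render `transcript` into your DOM here.
    }
  );
}
```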

### React Example

Use `useRoomContext` from `@livekit/components-react` inside a `<LiveKitRoom>` to access the room:
```tsx
"use client";

import { useEffect, useRef, useState } from "react";
import { useRoomContext } from "@livekit/components-react";

interface TranscriptEntry {
  id: string;
  role: "user" | "agent";
  text: string;
  isFinal: boolean;
}

export function LiveTranscript() {
  const room = useRoomContext();
  const [entries, setEntries] = useState<TranscriptEntry[]>([]);
  const scrollRef = useRef<HTMLDivElement>(null);

  useEffect(() => {
    // registerTextStreamHandler returns void; clean up on unmount with
    // unregisterTextStreamHandler for the same topic.
    room.registerTextStreamHandler(
      "lk.transcription",
      async (reader, participantInfo) => {
        const text = await reader.readAll();
        const attrs = reader.info.attributes;
        const isFinal = attrs["lk.transcription_final"] === "true";
        const segmentId = attrs["lk.segment_id"] ?? reader.info.id;

        const isUser = participantInfo.identity === room.localParticipant.identity;
        const role = isUser ? "user" : "agent";

        setEntries((prev) => {
          const existing = prev.findIndex((e) => e.id === segmentId);
          const entry: TranscriptEntry = { id: segmentId, role, text, isFinal };
          if (existing >= 0) {
            const updated = [...prev];
            updated[existing] = entry;
            return updated;
          }
          return [...prev, entry];
        });
      }
    );

    return () => {
      room.unregisterTextStreamHandler("lk.transcription");
    };
  }, [room]);

  useEffect(() => {
    scrollRef.current?.scrollTo(0, scrollRef.current.scrollHeight);
  }, [entries]);

  return (
    <div ref={scrollRef} style={{ maxHeight: 300, overflowY: "auto" }}>
      {entries.map((entry) => (
        <div key={entry.id} style={{ opacity: entry.isFinal ? 1 : 0.5 }}>
          <strong>{entry.role === "agent" ? "Agent" : "You"}:</strong> {entry.text}
        </div>
      ))}
    </div>
  );
}
```
Place `<LiveTranscript />` inside your `<LiveKitRoom>` component so it has access to the room context:
```tsx
<LiveKitRoom token={session.token} serverUrl={session.server_url} connect audio onDisconnected={endCall}>
  <CallInterface onHangUp={endCall} />
  <LiveTranscript />
  <RoomAudioRenderer />
</LiveKitRoom>
```
Tool/function calls are not published over the transcription stream. They appear in the post-call transcript via webhooks only.

## Handling Microphone Permissions

The browser will prompt for microphone access when connecting with `audio={true}`. If the user denies permission, a `MediaDeviceFailure` error is passed to the `onError` callback:
```tsx
<LiveKitRoom
  token={session.token}
  serverUrl={session.server_url}
  connect={true}
  audio={true}
  video={false}
  onDisconnected={endCall}
  onError={(err) => {
    // MediaDeviceFailure is raised when microphone access is denied
    setError("Microphone access was denied. Please allow microphone access and try again.");
    endCall();
  }}
>
  {/* ... */}
</LiveKitRoom>
```

## Webhooks

Web calls fire the same webhook events as phone calls. The `call.type` field is `"web_call"` instead of `"inbound_phone_call"`.
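This means a shared webhook handler can route web-call events by inspecting `message.call.type`. A minimal sketch — the `isWebCallEvent` helper is illustrative, and the payload shape follows the examples below:

```typescript
// Sketch: routing web-call events inside a shared webhook handler.
// isWebCallEvent is an illustrative helper, not part of the API.

interface WebhookCall {
  id: string;
  type: string; // "web_call" | "inbound_phone_call" | ...
  status: string;
}

interface WebhookBody {
  message: {
    type: string; // "status-update", "end-of-call-report", ...
    call: WebhookCall;
    [key: string]: unknown;
  };
}

function isWebCallEvent(body: WebhookBody): boolean {
  return body.message.call.type === "web_call";
}
```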

### status-update — call started

```json
{
  "message": {
    "type": "status-update",
    "status": "in-progress",
    "call": {
      "id": "3f2a1b4c-...",
      "type": "web_call",
      "status": "in-progress"
    }
  }
}
```

### end-of-call-report

```json
{
  "message": {
    "type": "end-of-call-report",
    "call": {
      "id": "3f2a1b4c-...",
      "type": "web_call",
      "status": "ended"
    },
    "endedReason": "user_hangup",
    "durationSeconds": 47,
    "summary": "The user asked about pricing...",
    "transcript": [ "..." ]
  }
}
```

### End Reasons

| Value | Meaning |
| --- | --- |
| `user_hangup` | Browser disconnected or user clicked hang up |
| `agent_hangup` | Agent called the `end_call` tool |
| `max_duration` | 5-minute hard limit reached |
| `error` | Unexpected agent error |
| `config_error` | Agent configuration is invalid |
See Webhooks Overview for webhook setup and configuration.

## Billing

Web calls are billed at the same per-minute rate as phone calls. Usage appears in your dashboard under the Calls section with `direction: "web"`.

## FAQ

- Do I need to run my own voice infrastructure?
- Can I use this on mobile browsers?
- What happens if the user loses internet connection?
- Can I customize the agent per-call?
- Are official SDKs coming?

## Next Steps

- Configure webhooks to receive real-time call events
- Add custom tools to let your agent access external data
- Set up call analysis for structured post-call data