
xAI Realtime Integration

xAI's Grok Realtime API provides speech-to-speech conversation with under 700 ms of latency. Unlike a traditional voice AI pipeline (STT → LLM → TTS), Grok processes audio directly in a single model.

Setup

1. Add xAI API Key

Navigate to the Integrations → API Keys tab and add your xAI API key, or register it through the BYOK endpoint:

curl -X POST https://api.hmsovereign.com/api/v1/byok \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "xai",
    "api_key": "xai-..."
  }'
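
For callers who prefer a script to curl, the same request can be sketched in Python using only the standard library. The URL, headers, and JSON fields are taken from the curl example. The function name `build_byok_request` is hypothetical, and the response format is not documented here, so the sketch only builds the request and leaves sending and response handling to the caller.

```python
import json
import urllib.request

# Endpoint taken from the curl example in this guide.
BYOK_URL = "https://api.hmsovereign.com/api/v1/byok"


def build_byok_request(platform_api_key: str, xai_api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that registers an xAI key.

    `platform_api_key` is the value used as YOUR_API_KEY in the curl example;
    `xai_api_key` is the key issued by xAI (it starts with "xai-").
    """
    payload = {"provider": "xai", "api_key": xai_api_key}
    return urllib.request.Request(
        BYOK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {platform_api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen(request, timeout=10)` would send it. Keeping construction separate from sending makes it easy to inspect the request before any key leaves the machine.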

2. Configure Agent

When creating or editing an agent after the xAI key has been added, set:

  • Provider: Select "xAI Realtime"
  • Model: grok-realtime-v1
  • Voice: ara (or other available voices)

Note: When using xAI Realtime, separate STT/TTS providers are ignored.
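
The settings above, together with the note about ignored STT/TTS providers, can be illustrated with a small sketch. This guide does not show the agent API schema, so every field name here (`name`, `provider`, `model`, `voice`, `stt`, `tts`) and the provider identifier `"xai-realtime"` are hypothetical placeholders. Only the model name `grok-realtime-v1` and the voice `ara` come from the text.

```python
from typing import Any, Dict, Optional

# Hypothetical identifier for the "xAI Realtime" provider option.
XAI_REALTIME = "xai-realtime"


def build_agent_config(
    name: str,
    provider: str,
    model: str,
    voice: str,
    stt: Optional[Dict[str, Any]] = None,
    tts: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble an agent configuration dict (field names are assumptions)."""
    config: Dict[str, Any] = {
        "name": name,
        "provider": provider,
        "model": model,
        "voice": voice,
    }
    if provider == XAI_REALTIME:
        # Speech-to-speech mode: separate STT/TTS settings are ignored,
        # so leave them out rather than send values that have no effect.
        return config
    if stt is not None:
        config["stt"] = stt
    if tts is not None:
        config["tts"] = tts
    return config


realtime_agent = build_agent_config(
    name="Support line",
    provider=XAI_REALTIME,
    model="grok-realtime-v1",
    voice="ara",
    stt={"provider": "example-stt"},  # dropped, because xAI Realtime ignores it
)
```

Dropping the unused STT/TTS settings on the client side is a design choice, not a platform requirement. It keeps the stored configuration from suggesting that those providers are in use.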

Pricing

xAI Realtime uses BYOK pricing:

  • €0.07/minute when using your xAI API key
  • Direct billing to your xAI account
  • No markup on xAI API usage
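
As a rough worked example of the per-minute rate, the sketch below estimates the platform charge for a call. It assumes the €0.07 rate is a platform fee prorated by the second, with xAI usage billed separately by xAI. The billing granularity and rounding are not stated in this guide, so treat both as assumptions, and the function name `estimate_platform_cost` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-minute BYOK rate listed in this guide, in euros.
PLATFORM_RATE_EUR_PER_MINUTE = Decimal("0.07")


def estimate_platform_cost(call_seconds: int) -> Decimal:
    """Estimate the platform fee in euros for a call of `call_seconds`.

    Assumes per-second proration and rounding to the nearest cent;
    xAI's own usage charges are not included.
    """
    minutes = Decimal(call_seconds) / Decimal(60)
    cost = minutes * PLATFORM_RATE_EUR_PER_MINUTE
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


ten_minute_call = estimate_platform_cost(600)   # 10 min x 0.07 = 0.70
one_hour_call = estimate_platform_cost(3600)    # 60 min x 0.07 = 4.20
```

`Decimal` is used instead of `float` so that currency values round predictably.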

Differences from Traditional Mode

| Feature | Traditional (STT+LLM+TTS) | xAI Realtime |
| --- | --- | --- |
| Latency | ~1-2 seconds | <700ms |
| Providers | 3 separate | Single (xAI) |
| Voice Quality | Depends on TTS provider | Native to model |
| Custom Tools | Supported via llm_config.tools | Check xAI docs for support |
| BYOK Keys Required | 3 keys (STT, LLM, TTS) | 1 key (xAI) |

Limitations

  • Custom system prompts may behave differently than they do with OpenAI models
  • Tool calling support depends on xAI API capabilities
  • Voice selection is limited to xAI's available voices

API Reference

See the BYOK API Reference for managing xAI API keys.

Modified at 2026-01-30 12:25:48