xAI Grok Integration

Use xAI Grok Realtime API for speech-to-speech conversation with sub-700ms latency.

xAI's Grok Realtime API handles speech-to-speech conversation in a single model. Unlike a traditional voice AI pipeline (STT → LLM → TTS), Grok processes audio directly, with latency under 700 ms.

Setup

xAI Grok is available by default through HMS Sovereign's platform keys: you can select it as the provider when configuring an assistant without adding an API key first.

1. (Optional) Add your own xAI API Key

This step is only needed if you want to run Grok on your own xAI account. To use platform keys, skip straight to step 2. To bring your own key, add it under the Integrations → API Keys tab, or via the API:

curl -X POST https://api.hmsovereign.com/api/v1/byok \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "xai",
    "api_key": "xai-..."
  }'
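
If you would rather not paste keys into your shell history, the same request can be wrapped in a small script that reads them from the environment. This is a minimal sketch, not an official client: the variable names `HMS_API_KEY` and `XAI_API_KEY` and the function names are assumptions made for illustration. Only the endpoint, headers, and body fields come from the request above.

```shell
# Build the JSON body for the BYOK request shown above.
# Assumes the key contains no characters that need JSON escaping.
byok_payload() {
  printf '{"provider":"xai","api_key":"%s"}' "$1"
}

# POST the xAI key to the BYOK endpoint.
# HMS_API_KEY and XAI_API_KEY are hypothetical variable names chosen for
# this sketch; export both before calling. If either is unset, the
# ${VAR:?} expansion prints the message and aborts a non-interactive script.
add_xai_key() {
  : "${HMS_API_KEY:?set HMS_API_KEY first}"
  : "${XAI_API_KEY:?set XAI_API_KEY first}"
  curl -sS --fail -X POST https://api.hmsovereign.com/api/v1/byok \
    -H "Authorization: Bearer ${HMS_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "$(byok_payload "${XAI_API_KEY}")"
}
```

After sourcing the script and exporting both variables, run `add_xai_key`. The `--fail` flag makes curl return a non-zero exit status on an HTTP error, so the call can be used in a larger setup script.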

2. Configure Assistant

When creating or editing an assistant with xAI configured:

Provider: Select "xAI Realtime"
Model: grok-realtime-v1
Voice: ara (or other available voices)

Note: When using xAI Realtime, separate STT/TTS providers are ignored.

Pricing

Platform keys (default): xAI model usage at cost + €0.07/min orchestration. No API key setup needed.
Bring Your Own Key (optional): €0.07/min orchestration. xAI API usage is billed directly to your own xAI account, with no markup.
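
As a rough illustration of the two pricing modes, the sketch below computes the orchestration fee for a call and, for platform keys, adds the xAI usage cost. It assumes fractional minutes are billed proportionally, which this page does not confirm. The xAI usage amount is not listed here, so it is passed in as a parameter; the function names are hypothetical.

```shell
# Orchestration fee from the pricing above: EUR 0.07 per minute.
ORCHESTRATION_EUR_PER_MIN=0.07

# orchestration_fee MINUTES
# The HMS Sovereign fee alone. With BYOK this is the whole HMS-side charge;
# xAI usage is billed separately on your xAI account.
orchestration_fee() {
  LC_ALL=C awk -v m="$1" -v r="$ORCHESTRATION_EUR_PER_MIN" \
    'BEGIN { printf "%.2f\n", m * r }'
}

# platform_total MINUTES XAI_USAGE_EUR
# Platform keys: xAI usage at cost plus the orchestration fee.
# XAI_USAGE_EUR is supplied by the caller because the rate is not stated here.
platform_total() {
  LC_ALL=C awk -v m="$1" -v r="$ORCHESTRATION_EUR_PER_MIN" -v u="$2" \
    'BEGIN { printf "%.2f\n", m * r + u }'
}
```

For example, `orchestration_fee 10` prints `0.70`, and `platform_total 10 1.25` prints `1.95` for a 10-minute call with a hypothetical €1.25 of xAI usage.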

Differences from Traditional Mode

Feature	Traditional (STT+LLM+TTS)	xAI Realtime
Latency	~1-2 seconds	<700ms
Providers	3 separate	Single (xAI)
Voice Quality	Depends on TTS provider	Native to model
Custom Tools	Supported via `llm_config.tools`	Check xAI docs for support
API Keys	None required (platform keys); optionally BYOK 3 providers (STT, LLM, TTS)	None required (platform keys); optionally BYOK 1 provider (xAI)

Limitations

Custom system prompts may behave differently than they do with OpenAI models
Tool calling support depends on xAI API capabilities
Voice selection limited to xAI's available voices

API Reference

See the BYOK API Reference for managing xAI API keys.