xAI Grok Integration
xAI's Grok Realtime API provides speech-to-speech conversation with under 700 ms of latency. Unlike a traditional voice AI pipeline (STT → LLM → TTS), Grok processes audio directly in a single model.
Setup
1. Add xAI API Key
Navigate to Integrations → API Keys tab and add your xAI API key:
curl -X POST https://api.hmsovereign.com/api/v1/byok \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "xai",
    "api_key": "xai-..."
  }'
2. Configure Assistant
When creating or editing an assistant with xAI configured:
- Provider: Select "xAI Realtime"
- Model: grok-realtime-v1
- Voice: ara (or other available voices)
Note: When using xAI Realtime, separate STT/TTS providers are ignored.
Pricing
xAI Realtime uses BYOK pricing:
- €0.07/minute when using your xAI API key
- Direct billing to your xAI account
- No markup on xAI API usage
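As a rough illustration of the arithmetic, the sketch below estimates the per-minute charge for a given number of call minutes. It assumes the €0.07 rate scales linearly with duration, with no rounding to whole minutes and no minimum charge, which this page does not confirm. `estimate_cost_eur` is a hypothetical helper name, and xAI's own API usage, billed directly by xAI, is not included.

```shell
# Hypothetical helper: estimate the per-minute charge in EUR.
# Assumption: a flat 0.07 EUR per minute, applied linearly to the
# duration passed as the first argument (in minutes).
estimate_cost_eur() {
  awk -v minutes="$1" 'BEGIN { printf "%.2f\n", minutes * 0.07 }'
}

# Example: estimate the charge for 100 minutes of calls.
estimate_cost_eur 100
```

Under these assumptions, 100 minutes of calls comes to €7.00 before any xAI charges.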
Differences from Traditional Mode
| Feature | Traditional (STT+LLM+TTS) | xAI Realtime |
|---|---|---|
| Latency | ~1-2 seconds | <700ms |
| Providers | 3 separate | Single (xAI) |
| Voice Quality | Depends on TTS provider | Native to model |
| Custom Tools | Supported via llm_config.tools | Check xAI docs for support |
| BYOK Keys Required | 3 keys (STT, LLM, TTS) | 1 key (xAI) |
Limitations
- Custom system prompts may behave differently than they do with OpenAI models
- Tool calling support depends on xAI API capabilities
- Voice selection limited to xAI's available voices
API Reference
See the BYOK API Reference for managing xAI API keys.