Provider Pricing
Reference pricing for STT, LLM, and TTS providers used with HMS Sovereign.
Last updated: March 2026
All prices in USD. These are the API costs charged by providers - not HMS Sovereign pricing to customers.
Speech-to-Text (STT)
Deepgram
| Model | Price per Minute |
|---|---|
| Nova 3 (Multilingual) | $0.0092 |
| Nova 3 (Monolingual) | $0.0077 |
| Nova 2 | $0.0058 |
| Nova 1 | $0.0058 |
| Enhanced | $0.0165 |
| Base | $0.0145 |
Note: Prices are for the Pay-As-You-Go tier. The Growth tier is ~17% cheaper.
Gladia
| Model | Price per Hour |
|---|---|
| Solaria (Async) | $0.61 |
| Solaria (Real-time) | $0.75 |
Converted to per minute: ~$0.0102/min (async), ~$0.0125/min (real-time)
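The per-minute figures come from dividing the hourly price by 60. A minimal sketch of that conversion (the function name is illustrative and not part of any provider SDK):

```python
def per_hour_to_per_minute(price_per_hour: float) -> float:
    """Convert an hourly USD price to a per-minute USD price."""
    return price_per_hour / 60


# Gladia Solaria hourly prices from the table above.
gladia_async = per_hour_to_per_minute(0.61)
gladia_realtime = per_hour_to_per_minute(0.75)

print(f"async: ${gladia_async:.4f}/min, real-time: ${gladia_realtime:.4f}/min")
```

Converting to a per-minute figure makes Gladia directly comparable with the Deepgram table, which is already quoted per minute.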
Language Models (LLM)
OpenAI
Prices per 1M tokens.
| Model | Input | Output |
|---|---|---|
| GPT-5 Mini | $0.25 | $2.00 |
| GPT-4.1 | $2.00 | $8.00 |
| GPT-4.1 Mini | $0.40 | $1.60 |
| GPT-4.1 Nano | $0.10 | $0.40 |
| GPT-4o | $2.50 | $10.00 |
| GPT-4o (2024-05-13) | $5.00 | $15.00 |
| GPT-4o Mini | $0.15 | $0.60 |
| GPT-4 Turbo | $10.00 | $30.00 |
| GPT-4 | $30.00 | $60.00 |
| GPT-4 32K | $60.00 | $120.00 |
| GPT-3.5 Turbo | $0.50 | $1.50 |
| GPT-3.5 Turbo 16K | $3.00 | $4.00 |
Recommended for voice assistants: GPT-5 Mini (best value), GPT-4o Mini (fastest), GPT-4.1 Mini (balanced)
Mistral
Prices per 1M tokens.
| Model | Input | Output |
|---|---|---|
| Mistral Large | $0.50 | $1.50 |
| Mistral Medium | $0.40 | $2.00 |
| Mistral Small | $0.10 | $0.30 |
| Ministral 8B | $0.15 | $0.15 |
| Ministral 3B | $0.10 | $0.10 |
| Codestral | $0.30 | $0.90 |
| Mixtral 8x7B | $0.70 | $0.70 |
| Mixtral 8x22B | $2.00 | $6.00 |
Recommended for voice assistants: Mistral Small (fast + cheap), Mistral Medium (balanced)
xAI (Grok)
Prices per 1M tokens.
| Model | Input | Output |
|---|---|---|
| Grok 4.1 Fast | $0.20 | $0.50 |
| Grok 4 Fast | $0.20 | $0.50 |
| Grok Code Fast 1 | $0.20 | $1.50 |
| Grok 4 (0709) | $3.00 | $15.00 |
| Grok 3 Mini | $0.30 | $0.50 |
| Grok 3 | $3.00 | $15.00 |
Realtime API (Speech-to-Speech):
| Model | Price |
|---|---|
| Grok Realtime v1 | $0.05/min ($3.00/hr) |
Recommended: Grok 4.1 Fast (best value), Grok Realtime (for S2S)
Text-to-Speech (TTS)
ElevenLabs
Prices per 1,000 characters. Based on Creator tier ($22/mo).
| Model | Price per 1K chars |
|---|---|
| Flash v2.5 | $0.11 |
| Turbo v2.5 | $0.11 |
| Eleven v3 | $0.22 |
| Multilingual v2 | $0.22 |
| Monolingual v1 | $0.22 |
Tier pricing breakdown:
| Tier | Flash/Turbo per 1K | Multilingual per 1K |
|---|---|---|
| Free | N/A | $0.17 |
| Starter ($5) | $0.08 | $0.17 |
| Creator ($22) | $0.11 | $0.22 |
| Pro ($99) | $0.10 | $0.20 |
| Scale ($330) | $0.08 | $0.17 |
| Business ($1,320) | $0.06 | $0.12 |
Recommended: Flash v2.5 (fastest, cheapest), Multilingual v2 (best quality)
Inworld
Prices per 1,000,000 characters (On-demand tier).
| Model | Price per 1M chars | Per 1K chars |
|---|---|---|
| TTS 1.5 Mini | $5.00 | $0.005 |
| TTS 1.5 Max | $10.00 | $0.01 |
| TTS 1 | $5.00 | $0.005 |
| TTS 1 Max | $10.00 | $0.01 |
Note: Inworld is ~20x cheaper than ElevenLabs (22x at the Creator-tier prices above: $0.005 vs $0.11 per 1K chars). At 650 chars/min:
- Inworld 1.5-Mini: $0.00325/min
- Inworld 1.5-Max: $0.0065/min
- ElevenLabs Flash: $0.0715/min
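The per-minute TTS figures above are the price per 1K characters multiplied by the speaking rate. A short sketch, assuming the 650 chars/min rate used in the note (the helper name and dictionary keys are illustrative):

```python
def tts_cost_per_minute(price_per_1k_chars: float, chars_per_minute: float = 650) -> float:
    """USD cost of one minute of synthesized speech at a given speaking rate."""
    return price_per_1k_chars * chars_per_minute / 1000


# Per-1K-character prices taken from the tables in this document.
PRICES_PER_1K = {
    "Inworld TTS 1.5 Mini": 0.005,
    "Inworld TTS 1.5 Max": 0.01,
    "ElevenLabs Flash v2.5": 0.11,
}

costs = {name: tts_cost_per_minute(price) for name, price in PRICES_PER_1K.items()}
ratio = costs["ElevenLabs Flash v2.5"] / costs["Inworld TTS 1.5 Mini"]

for name, cost in costs.items():
    print(f"{name}: ${cost:.5f}/min")
print(f"ElevenLabs Flash vs Inworld Mini: {ratio:.0f}x")
```

The ratio does not depend on the speaking rate, since both providers bill per character.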
Cost Estimation per Minute of Voice Conversation
Typical conversation metrics (based on real call data):
- STT: ~60 seconds audio
- LLM: ~500 input tokens, ~200 output tokens per turn, ~10 turns = 5,000 input + 2,000 output
- TTS: ~1,200 characters (measured from an actual 62-second call)
Example: Budget Setup (Deepgram Nova 3 Monolingual + GPT-5 Mini + ElevenLabs Flash)
| Component | Usage | Cost |
|---|---|---|
| STT | 1 min | $0.0077 |
| LLM Input | 5K tokens | $0.00125 |
| LLM Output | 2K tokens | $0.004 |
| TTS | 1.2K chars | $0.132 |
| Total | | ~$0.145/min |
Example: Quality Setup (Deepgram Nova 3 Monolingual + GPT-4o + ElevenLabs Multilingual v2)
| Component | Usage | Cost |
|---|---|---|
| STT | 1 min | $0.0077 |
| LLM Input | 5K tokens | $0.0125 |
| LLM Output | 2K tokens | $0.02 |
| TTS | 1.2K chars | $0.264 |
| Total | | ~$0.304/min |
Example: Grok Realtime (Speech-to-Speech)
| Component | Usage | Cost |
|---|---|---|
| S2S | 1 min | $0.05 |
| Total | | $0.05/min |
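The cascaded (STT + LLM + TTS) totals above can be reproduced with a small calculator. This is a sketch under the typical-conversation assumptions stated in this section (1 minute of audio, 5,000 input and 2,000 output tokens, 1,200 TTS characters); the `Setup` class and function name are illustrative, not part of HMS Sovereign.

```python
from dataclasses import dataclass


@dataclass
class Setup:
    stt_per_min: float      # USD per minute of audio
    llm_in_per_1m: float    # USD per 1M input tokens
    llm_out_per_1m: float   # USD per 1M output tokens
    tts_per_1k: float       # USD per 1K characters


def cost_per_minute(s: Setup, minutes: float = 1.0, in_tokens: int = 5000,
                    out_tokens: int = 2000, tts_chars: int = 1200) -> float:
    """Provider cost in USD for one minute of conversation."""
    stt = s.stt_per_min * minutes
    llm = in_tokens / 1e6 * s.llm_in_per_1m + out_tokens / 1e6 * s.llm_out_per_1m
    tts = tts_chars / 1000 * s.tts_per_1k
    return stt + llm + tts


# Prices taken from the provider tables in this document.
budget = Setup(stt_per_min=0.0077, llm_in_per_1m=0.25, llm_out_per_1m=2.00, tts_per_1k=0.11)
quality = Setup(stt_per_min=0.0077, llm_in_per_1m=2.50, llm_out_per_1m=10.00, tts_per_1k=0.22)

budget_cost = cost_per_minute(budget)
quality_cost = cost_per_minute(quality)
print(f"Budget:  ${budget_cost:.3f}/min")
print(f"Quality: ${quality_cost:.3f}/min")
```

Swapping in another provider's prices (for example an Inworld TTS rate) only requires changing the `Setup` values.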
Pricing Strategy Notes
Current HMS Sovereign pricing:
- BYOK: €0.07/min (orchestration only)
- Platform keys: €0.30/min (flat rate, includes provider costs)
Margin at €0.30/min with Budget Setup:
- Provider cost: $0.145 (€0.134)
- HMS margin: €0.166
- Margin: ~55%
Margin at €0.30/min with Quality Setup:
- Provider cost: $0.304 (€0.281)
- HMS margin: €0.019
- Margin: ~6% (BARELY PROFITABLE!)
Margin at €0.30/min with Grok Realtime:
- Provider cost: $0.05 (~€0.046)
- HMS margin: €0.254
- Margin: ~85%
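The margin figures above follow from converting the provider cost to EUR and subtracting it from the €0.30/min platform price. A minimal sketch; the exchange rate of 0.924 EUR per USD is an assumption inferred from the conversions shown in this section (for example $0.145 ≈ €0.134), not an official rate, and the function name is illustrative.

```python
USD_TO_EUR = 0.924  # assumption: inferred from this document's USD/EUR conversions


def margin(price_eur: float, provider_cost_usd: float, usd_to_eur: float = USD_TO_EUR):
    """Return (margin in EUR per minute, margin as a fraction of the price)."""
    cost_eur = provider_cost_usd * usd_to_eur
    profit = price_eur - cost_eur
    return profit, profit / price_eur


PLATFORM_PRICE_EUR = 0.30
PROVIDER_COST_USD = {"Budget": 0.145, "Quality": 0.304, "Grok Realtime": 0.05}

results = {name: margin(PLATFORM_PRICE_EUR, cost) for name, cost in PROVIDER_COST_USD.items()}
for name, (profit, fraction) in results.items():
    print(f"{name}: EUR {profit:.3f}/min margin ({fraction:.0%})")
```

Because the Quality setup leaves only about two euro cents per minute, small movements in the exchange rate or in per-call TTS usage could erase the margin entirely.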
Warning: ElevenLabs is the dominant cost driver, accounting for roughly 87-91% of provider cost in the Budget and Quality setups. With Multilingual v2, margins are razor thin (~6%) at €0.30/min. Consider:
- Higher pricing for premium voices
- Restricting platform keys to Flash models only
- Moving to the ElevenLabs Business tier ($0.06/1K for Flash, $0.12/1K for Multilingual) to cut TTS costs roughly in half
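To gauge the last option, the sketch below recomputes the per-minute provider cost with Creator-tier and Business-tier ElevenLabs prices, along with the TTS share of the total. It reuses the usage figures and prices from this document; the dictionary layout and function name are illustrative, and the monthly subscription fee of each tier is deliberately left out.

```python
TTS_CHARS = 1200  # characters per minute, from the typical-conversation metrics

# Non-TTS provider cost per minute (STT + LLM input + LLM output), from the example tables.
NON_TTS = {
    "budget": 0.0077 + 0.00125 + 0.004,
    "quality": 0.0077 + 0.0125 + 0.02,
}

# ElevenLabs USD price per 1K chars by tier: Flash for budget, Multilingual for quality.
TTS_PRICE = {
    "budget": {"Creator": 0.11, "Business": 0.06},
    "quality": {"Creator": 0.22, "Business": 0.12},
}


def breakdown(setup: str, tier: str):
    """Return (total USD per minute, TTS share of that total)."""
    tts = TTS_CHARS / 1000 * TTS_PRICE[setup][tier]
    total = NON_TTS[setup] + tts
    return total, tts / total


for setup in NON_TTS:
    for tier in ("Creator", "Business"):
        total, share = breakdown(setup, tier)
        print(f"{setup}/{tier}: ${total:.3f}/min, TTS share {share:.0%}")
```

Under these assumptions the Budget setup falls from about $0.145 to about $0.085 per minute and the Quality setup from about $0.304 to about $0.184, with TTS still the largest single component in both.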