provider-pricing

Last updated: March 2026
All prices are in USD. These are the API costs charged by providers, not the prices HMS Sovereign charges its customers.

Speech-to-Text (STT)

Deepgram

| Model | Price per Minute |
| --- | --- |
| Nova 3 (Multilingual) | $0.0092 |
| Nova 3 (Monolingual) | $0.0077 |
| Nova 2 | $0.0058 |
| Nova 1 | $0.0058 |
| Enhanced | $0.0165 |
| Base | $0.0145 |
Note: Prices are for the Pay-As-You-Go tier. The Growth tier is ~17% cheaper.

Gladia

| Model | Price per Hour |
| --- | --- |
| Solaria (Async) | $0.61 |
| Solaria (Real-time) | $0.75 |
Converted to per-minute prices: ~$0.0102/min (async), $0.0125/min (real-time).
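The hourly-to-per-minute conversion above is a division by 60. A minimal Python sketch, using the Gladia prices from the table; the function and dictionary names are illustrative and not part of any provider API:

```python
# Gladia list prices in USD per hour, copied from the table above.
GLADIA_PER_HOUR = {
    "Solaria (Async)": 0.61,
    "Solaria (Real-time)": 0.75,
}


def per_hour_to_per_minute(price_per_hour: float) -> float:
    """Convert an hourly price to a per-minute price."""
    return price_per_hour / 60


for model, hourly in GLADIA_PER_HOUR.items():
    print(f"{model}: ${per_hour_to_per_minute(hourly):.4f}/min")
```

Expressing every STT provider in the same per-minute unit makes the Gladia figures directly comparable with the Deepgram table.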

Language Models (LLM)

OpenAI

Prices per 1M tokens.

| Model | Input | Output |
| --- | --- | --- |
| GPT-5 Mini | $0.25 | $2.00 |
| GPT-4.1 | $2.00 | $8.00 |
| GPT-4.1 Mini | $0.40 | $1.60 |
| GPT-4.1 Nano | $0.10 | $0.40 |
| GPT-4o | $2.50 | $10.00 |
| GPT-4o (2024-05-13) | $5.00 | $15.00 |
| GPT-4o Mini | $0.15 | $0.60 |
| GPT-4 Turbo | $10.00 | $30.00 |
| GPT-4 | $30.00 | $60.00 |
| GPT-4 32K | $60.00 | $120.00 |
| GPT-3.5 Turbo | $0.50 | $1.50 |
| GPT-3.5 Turbo 16K | $3.00 | $4.00 |
Recommended for voice assistants: GPT-5 Mini (best value), GPT-4o Mini (fastest), GPT-4.1 Mini (balanced)

Mistral

Prices per 1M tokens.

| Model | Input | Output |
| --- | --- | --- |
| Mistral Large | $0.50 | $1.50 |
| Mistral Medium | $0.40 | $2.00 |
| Mistral Small | $0.10 | $0.30 |
| Ministral 8B | $0.15 | $0.15 |
| Ministral 3B | $0.10 | $0.10 |
| Codestral | $0.30 | $0.90 |
| Mixtral 8x7B | $0.70 | $0.70 |
| Mixtral 8x22B | $2.00 | $6.00 |
Recommended for voice assistants: Mistral Small (fast + cheap), Mistral Medium (balanced)

xAI (Grok)

Prices per 1M tokens.

| Model | Input | Output |
| --- | --- | --- |
| Grok 4.1 Fast | $0.20 | $0.50 |
| Grok 4 Fast | $0.20 | $0.50 |
| Grok Code Fast 1 | $0.20 | $1.50 |
| Grok 4 (0709) | $3.00 | $15.00 |
| Grok 3 Mini | $0.30 | $0.50 |
| Grok 3 | $3.00 | $15.00 |

Realtime API (Speech-to-Speech):

| Model | Price |
| --- | --- |
| Grok Realtime v1 | $0.05/min ($3.00/hr) |
Recommended: Grok 4.1 Fast (best value), Grok Realtime (for S2S)

Text-to-Speech (TTS)

ElevenLabs

Prices per 1,000 characters. Based on Creator tier ($22/mo).

| Model | Price per 1K chars |
| --- | --- |
| Flash v2.5 | $0.11 |
| Turbo v2.5 | $0.11 |
| Eleven v3 | $0.22 |
| Multilingual v2 | $0.22 |
| Monolingual v1 | $0.22 |

Tier pricing breakdown:

| Tier | Flash/Turbo per 1K | Multilingual per 1K |
| --- | --- | --- |
| Free | N/A | $0.17 |
| Starter ($5) | $0.08 | $0.17 |
| Creator ($22) | $0.11 | $0.22 |
| Pro ($99) | $0.10 | $0.20 |
| Scale ($330) | $0.08 | $0.17 |
| Business ($1,320) | $0.06 | $0.12 |
Recommended: Flash v2.5 (fastest, cheapest), Multilingual v2 (best quality)

Inworld

Prices per 1,000,000 characters (On-demand tier).

| Model | Price per 1M chars | Price per 1K chars |
| --- | --- | --- |
| TTS 1.5 Mini | $5.00 | $0.005 |
| TTS 1.5 Max | $10.00 | $0.01 |
| TTS 1 | $5.00 | $0.005 |
| TTS 1 Max | $10.00 | $0.01 |
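Note: Inworld is roughly 20x cheaper than ElevenLabs (TTS 1.5 Mini at $0.005/1K vs. Flash v2.5 at $0.11/1K is a 22x difference). At 650 chars/min:
- Inworld TTS 1.5 Mini: $0.00325/min
- Inworld TTS 1.5 Max: $0.0065/min
- ElevenLabs Flash v2.5: $0.0715/min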
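The per-minute TTS figures above follow from multiplying the per-1K-character price by the speaking rate. A small Python sketch of that arithmetic; the 650 chars/min rate and the prices come from this page, while the function and dictionary names are made up for illustration:

```python
# Prices in USD per 1,000 characters, copied from the tables above.
PRICE_PER_1K_CHARS = {
    "Inworld TTS 1.5 Mini": 0.005,
    "Inworld TTS 1.5 Max": 0.01,
    "ElevenLabs Flash v2.5": 0.11,
}


def tts_cost_per_minute(price_per_1k_chars: float, chars_per_minute: int = 650) -> float:
    """Per-minute TTS cost at a given speaking rate (characters per minute)."""
    return price_per_1k_chars * chars_per_minute / 1000


for model, price in PRICE_PER_1K_CHARS.items():
    print(f"{model}: ${tts_cost_per_minute(price):.5f}/min")
```

Changing `chars_per_minute` shows how sensitive the comparison is to the assumed speaking rate; the ratio between providers stays the same, but the absolute gap grows with denser speech.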

Cost Estimation per Minute of Voice Conversation

Typical conversation metrics (based on real call data):
- STT: ~60 seconds of audio
- LLM: ~500 input tokens and ~200 output tokens per turn over ~10 turns, i.e. ~5,000 input + ~2,000 output tokens
- TTS: ~1,200 characters (measured from an actual 62-second call)

Example: Budget Setup (Deepgram Nova 3 + GPT-5 Mini + ElevenLabs Flash)

| Component | Usage | Cost |
| --- | --- | --- |
| STT | 1 min | $0.0077 |
| LLM Input | 5K tokens | $0.00125 |
| LLM Output | 2K tokens | $0.004 |
| TTS | 1.2K chars | $0.132 |
| Total | | ~$0.145/min |

Example: Quality Setup (Deepgram Nova 3 + GPT-4o + ElevenLabs Multilingual v2)

| Component | Usage | Cost |
| --- | --- | --- |
| STT | 1 min | $0.0077 |
| LLM Input | 5K tokens | $0.0125 |
| LLM Output | 2K tokens | $0.02 |
| TTS | 1.2K chars | $0.264 |
| Total | | ~$0.304/min |
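The STT + LLM + TTS breakdowns on this page are each a sum of four usage-times-price terms. The sketch below reproduces that calculation in Python, assuming the typical metrics stated earlier (1 minute of audio, 5,000 input tokens, 2,000 output tokens, 1,200 TTS characters). `ProviderSetup` and `cost_per_minute` are hypothetical names for illustration only; the STT price used is the Nova 3 (Monolingual) rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderSetup:
    stt_per_minute: float     # USD per minute of audio
    llm_input_per_1m: float   # USD per 1M input tokens
    llm_output_per_1m: float  # USD per 1M output tokens
    tts_per_1k_chars: float   # USD per 1,000 characters


def cost_per_minute(
    setup: ProviderSetup,
    audio_minutes: float = 1.0,
    input_tokens: int = 5_000,
    output_tokens: int = 2_000,
    tts_chars: int = 1_200,
) -> dict:
    """Return the per-component cost and the total, in USD."""
    breakdown = {
        "stt": setup.stt_per_minute * audio_minutes,
        "llm_input": setup.llm_input_per_1m * input_tokens / 1_000_000,
        "llm_output": setup.llm_output_per_1m * output_tokens / 1_000_000,
        "tts": setup.tts_per_1k_chars * tts_chars / 1_000,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


# Deepgram Nova 3 (Monolingual) + GPT-5 Mini + ElevenLabs Flash v2.5
BUDGET = ProviderSetup(0.0077, 0.25, 2.00, 0.11)
# Deepgram Nova 3 (Monolingual) + GPT-4o + ElevenLabs Multilingual v2
QUALITY = ProviderSetup(0.0077, 2.50, 10.00, 0.22)

for name, setup in (("Budget", BUDGET), ("Quality", QUALITY)):
    print(f"{name}: ${cost_per_minute(setup)['total']:.3f}/min")
```

In both setups the `tts` entry accounts for most of the total, which is why the choice of TTS model matters far more than the choice of LLM at these usage levels.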

Example: Grok Realtime (Speech-to-Speech)

| Component | Usage | Cost |
| --- | --- | --- |
| S2S | 1 min | $0.05 |
| Total | | $0.05/min |

Pricing Strategy Notes

Current HMS Sovereign pricing:
- BYOK: €0.07/min (orchestration only)
- Platform keys: €0.30/min (flat rate, includes provider costs)

The USD-to-EUR conversions below imply an exchange rate of about €0.92 per $1.

Margin at €0.30/min with Budget Setup:
- Provider cost: $0.145 (~€0.134)
- HMS margin: €0.166
- Margin: ~55%

Margin at €0.30/min with Quality Setup:
- Provider cost: $0.304 (~€0.281)
- HMS margin: €0.019
- Margin: ~6% (barely profitable)

Margin at €0.30/min with Grok Realtime:
- Provider cost: $0.05 (~€0.046)
- HMS margin: €0.254
- Margin: ~85%
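The margin figures above can be reproduced with a short Python sketch. The exchange rate of 0.924 EUR per USD is an assumption inferred from the conversions on this page ($0.145 to €0.134), not a quoted rate, and `margin` is a hypothetical helper name.

```python
USD_TO_EUR = 0.924   # assumption: implied by the $0.145 -> EUR 0.134 conversion
PRICE_EUR = 0.30     # platform-keys flat rate per minute

# Provider cost per minute in USD, from the three example setups.
PROVIDER_COST_USD = {
    "Budget Setup": 0.145,
    "Quality Setup": 0.304,
    "Grok Realtime": 0.05,
}


def margin(price_eur: float, provider_cost_usd: float, usd_to_eur: float = USD_TO_EUR):
    """Return (margin in EUR per minute, margin as a fraction of the price)."""
    cost_eur = provider_cost_usd * usd_to_eur
    profit_eur = price_eur - cost_eur
    return profit_eur, profit_eur / price_eur


for name, cost_usd in PROVIDER_COST_USD.items():
    profit, fraction = margin(PRICE_EUR, cost_usd)
    print(f"{name}: EUR {profit:.3f}/min ({fraction:.0%})")
```

Because the price is fixed in EUR and the costs are billed in USD, the margin also moves with the exchange rate; passing a different `usd_to_eur` shows how much.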
Warning: ElevenLabs is the dominant cost driver. With Multilingual v2, margins are razor-thin at €0.30/min. Consider:
1. Higher pricing for premium voices
2. Restricting platform keys to Flash models only
3. Moving to the Business tier ($0.06/1K for Flash/Turbo, $0.12/1K for Multilingual) to cut TTS costs nearly in half
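To gauge the effect of the Business-tier option on the Quality Setup, the sketch below recomputes the margin with the Multilingual price swapped from $0.22/1K (Creator) to $0.12/1K (Business). It is a rough estimate under stated assumptions: the 0.924 EUR-per-USD rate is inferred from this page's conversions, the monthly subscription fee is not amortized into the per-minute cost, and `margin_fraction` is a hypothetical helper.

```python
USD_TO_EUR = 0.924   # assumption: implied by this page's USD -> EUR figures
PRICE_EUR = 0.30     # platform-keys flat rate per minute
TTS_CHARS = 1_200    # characters per minute of conversation

# Quality Setup costs excluding TTS: STT + LLM input + LLM output (USD/min).
NON_TTS_COST_USD = 0.0077 + 0.0125 + 0.02


def margin_fraction(tts_price_per_1k: float) -> float:
    """Margin as a fraction of the EUR price for a given TTS price per 1K chars."""
    cost_usd = NON_TTS_COST_USD + tts_price_per_1k * TTS_CHARS / 1_000
    cost_eur = cost_usd * USD_TO_EUR
    return (PRICE_EUR - cost_eur) / PRICE_EUR


for tier, price in (("Creator", 0.22), ("Business", 0.12)):
    print(f"{tier} tier: margin {margin_fraction(price):.1%}")
```

Under these assumptions the Quality Setup margin rises from roughly 6% to roughly 43%, before accounting for the higher monthly subscription cost of the Business tier.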
Modified at 2026-05-04 13:09:52