Provider Pricing

Last updated: March 2026

All prices in USD. These are the API costs charged by providers - not HMS Sovereign pricing to customers.

Speech-to-Text (STT)

Deepgram

Model	Price per Minute
Nova 3 (Multilingual)	$0.0092
Nova 3 (Monolingual)	$0.0077
Nova 2	$0.0058
Nova 1	$0.0058
Enhanced	$0.0165
Base	$0.0145

Note: Prices are for the Pay-As-You-Go tier. The Growth tier is ~17% cheaper.

Gladia

Model	Price per Hour
Solaria (Async)	$0.61
Solaria (Real-time)	$0.75

Converted to per minute: ~$0.0102/min (async), ~$0.0125/min (real-time)
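The per-minute figures are the hourly rates divided by 60. A minimal sketch of that conversion, useful for comparing Gladia against the per-minute Deepgram prices; the helper name `per_minute` is illustrative and not part of any provider SDK:

```python
# Hourly Gladia prices in USD, copied from the table above.
GLADIA_PER_HOUR = {
    "Solaria (Async)": 0.61,
    "Solaria (Real-time)": 0.75,
}


def per_minute(price_per_hour: float) -> float:
    """Convert an hourly price to a per-minute price."""
    return price_per_hour / 60


for model, hourly in GLADIA_PER_HOUR.items():
    print(f"{model}: ${per_minute(hourly):.4f}/min")
```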

Language Models (LLM)

OpenAI

Prices per 1M tokens.

Model	Input	Output
GPT-5 Mini	$0.25	$2.00
GPT-4.1	$2.00	$8.00
GPT-4.1 Mini	$0.40	$1.60
GPT-4.1 Nano	$0.10	$0.40
GPT-4o	$2.50	$10.00
GPT-4o (2024-05-13)	$5.00	$15.00
GPT-4o Mini	$0.15	$0.60
GPT-4 Turbo	$10.00	$30.00
GPT-4	$30.00	$60.00
GPT-4 32K	$60.00	$120.00
GPT-3.5 Turbo	$0.50	$1.50
GPT-3.5 Turbo 16K	$3.00	$4.00

Recommended for voice assistants: GPT-5 Mini (best value), GPT-4o Mini (fastest), GPT-4.1 Mini (balanced)

Mistral

Prices per 1M tokens.

Model	Input	Output
Mistral Large	$0.50	$1.50
Mistral Medium	$0.40	$2.00
Mistral Small	$0.10	$0.30
Ministral 8B	$0.15	$0.15
Ministral 3B	$0.10	$0.10
Codestral	$0.30	$0.90
Mixtral 8x7B	$0.70	$0.70
Mixtral 8x22B	$2.00	$6.00

Recommended for voice assistants: Mistral Small (fast + cheap), Mistral Medium (balanced)

xAI (Grok)

Prices per 1M tokens.

Model	Input	Output
Grok 4.1 Fast	$0.20	$0.50
Grok 4 Fast	$0.20	$0.50
Grok Code Fast 1	$0.20	$1.50
Grok 4 (0709)	$3.00	$15.00
Grok 3 Mini	$0.30	$0.50
Grok 3	$3.00	$15.00

Realtime API (Speech-to-Speech):

Model	Price
Grok Realtime v1	$0.05/min ($3.00/hr)

Recommended: Grok 4.1 Fast (best value), Grok Realtime (for S2S)

Text-to-Speech (TTS)

ElevenLabs

Prices per 1,000 characters. Based on Creator tier ($22/mo).

Model	Price per 1K chars
Flash v2.5	$0.11
Turbo v2.5	$0.11
Eleven v3	$0.22
Multilingual v2	$0.22
Monolingual v1	$0.22

Tier pricing breakdown:

Tier	Flash/Turbo per 1K	Multilingual per 1K
Free	N/A	$0.17
Starter ($5)	$0.08	$0.17
Creator ($22)	$0.11	$0.22
Pro ($99)	$0.10	$0.20
Scale ($330)	$0.08	$0.17
Business ($1,320)	$0.06	$0.12

Recommended: Flash v2.5 (fastest, cheapest), Multilingual v2 (best quality)

Inworld

Prices per 1,000,000 characters (On-demand tier).

Model	Price per 1M chars	Per 1K chars
TTS 1.5 Mini	$5.00	$0.005
TTS 1.5 Max	$10.00	$0.01
TTS 1	$5.00	$0.005
TTS 1 Max	$10.00	$0.01

Note: Inworld is roughly 11x to 22x cheaper than ElevenLabs Flash (Creator tier), depending on the Inworld model. At 650 chars/min:

Inworld 1.5-Mini: $0.00325/min

Inworld 1.5-Max: $0.0065/min

ElevenLabs Flash: $0.0715/min
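The per-minute figures above are the per-1K-character price scaled by the speaking rate. A small sketch of that arithmetic, assuming the 650 chars/min rate from the note and the Creator-tier Flash price for ElevenLabs; the function name is hypothetical:

```python
CHARS_PER_MIN = 650  # speaking rate assumed in the note above

# USD per 1,000 characters, from the tables above.
# The ElevenLabs figure is the Creator-tier Flash price.
PRICE_PER_1K = {
    "Inworld 1.5-Mini": 0.005,
    "Inworld 1.5-Max": 0.01,
    "ElevenLabs Flash": 0.11,
}


def tts_cost_per_min(price_per_1k: float, chars_per_min: int = CHARS_PER_MIN) -> float:
    """USD cost of synthesizing one minute of speech."""
    return price_per_1k * chars_per_min / 1000


costs = {name: tts_cost_per_min(price) for name, price in PRICE_PER_1K.items()}
ratio_vs_mini = costs["ElevenLabs Flash"] / costs["Inworld 1.5-Mini"]

for name, cost in costs.items():
    print(f"{name}: ${cost:.5f}/min")
print(f"ElevenLabs Flash / Inworld 1.5-Mini: {ratio_vs_mini:.0f}x")
```

The ratio depends only on the per-1K prices, so it holds for any speaking rate.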

Cost Estimation per Minute of Voice Conversation

Typical conversation metrics (based on real call data):

STT: ~60 seconds audio

LLM: ~500 input tokens, ~200 output tokens per turn, ~10 turns = 5,000 input + 2,000 output

TTS: ~1,200 characters (measured from actual 62s call)

Example: Budget Setup (Deepgram Nova 3 Monolingual + GPT-5 Mini + ElevenLabs Flash)

Component	Usage	Cost
STT	1 min	$0.0077
LLM Input	5K tokens	$0.00125
LLM Output	2K tokens	$0.004
TTS	1.2K chars	$0.132
Total		~$0.145/min

Example: Quality Setup (Deepgram Nova 3 Monolingual + GPT-4o + ElevenLabs Multilingual v2)

Component	Usage	Cost
STT	1 min	$0.0077
LLM Input	5K tokens	$0.0125
LLM Output	2K tokens	$0.02
TTS	1.2K chars	$0.264
Total		~$0.304/min
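Both totals above come from the same formula: STT minutes, plus tokens times the per-1M rate, plus characters times the per-1K rate. A sketch that reproduces them from the listed prices, assuming the typical conversation metrics stated earlier (5,000 input tokens, 2,000 output tokens, 1,200 TTS characters per minute); `cost_per_minute` is an illustrative helper, not an existing HMS Sovereign function:

```python
def cost_per_minute(stt_per_min, llm_in_per_1m, llm_out_per_1m, tts_per_1k,
                    input_tokens=5_000, output_tokens=2_000, tts_chars=1_200):
    """Return per-component and total USD cost for one minute of conversation."""
    parts = {
        "stt": stt_per_min,
        "llm_input": input_tokens / 1_000_000 * llm_in_per_1m,
        "llm_output": output_tokens / 1_000_000 * llm_out_per_1m,
        "tts": tts_chars / 1_000 * tts_per_1k,
    }
    parts["total"] = sum(parts.values())
    return parts


# Nova 3 Monolingual + GPT-5 Mini + ElevenLabs Flash (Creator tier)
budget = cost_per_minute(0.0077, 0.25, 2.00, 0.11)
# Nova 3 Monolingual + GPT-4o + ElevenLabs Multilingual v2 (Creator tier)
quality = cost_per_minute(0.0077, 2.50, 10.00, 0.22)

print(f"Budget:  ${budget['total']:.3f}/min (TTS share {budget['tts'] / budget['total']:.0%})")
print(f"Quality: ${quality['total']:.3f}/min (TTS share {quality['tts'] / quality['total']:.0%})")
```

Printing the TTS share makes it easy to see which component dominates each setup.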

Example: Grok Realtime (Speech-to-Speech)

Component	Usage	Cost
S2S	1 min	$0.05
Total		$0.05/min

Pricing Strategy Notes

Current HMS Sovereign pricing:

BYOK: €0.07/min (orchestration only)

Platform keys: €0.30/min (flat rate, includes provider costs)

Margin at €0.30/min with Budget Setup:

Provider cost: ~$0.145 (~€0.134)

HMS margin: €0.166

Margin: ~55%

Margin at €0.30/min with Quality Setup:

Provider cost: ~$0.304 (~€0.281)

HMS margin: €0.019

Margin: ~6% (BARELY PROFITABLE!)

Margin at €0.30/min with Grok Realtime:

Provider cost: $0.05 (~€0.046)

HMS margin: €0.254

Margin: ~85%
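The three margin figures above can be reproduced with a short calculation. This is a sketch under one loud assumption: the USD to EUR rate of 0.924 is not stated in this document and is inferred from the $0.145 to €0.134 conversion above. The function name `margin` is illustrative.

```python
PLATFORM_PRICE_EUR = 0.30  # flat platform-keys rate per minute
USD_TO_EUR = 0.924         # ASSUMPTION: inferred from $0.145 ~ EUR 0.134 above


def margin(provider_cost_usd, price_eur=PLATFORM_PRICE_EUR, usd_to_eur=USD_TO_EUR):
    """Return (provider cost in EUR, margin in EUR, margin as a fraction of price)."""
    cost_eur = provider_cost_usd * usd_to_eur
    margin_eur = price_eur - cost_eur
    return cost_eur, margin_eur, margin_eur / price_eur


# Provider costs per minute in USD, from the example setups above.
setups = {"Budget": 0.145, "Quality": 0.304, "Grok Realtime": 0.05}
results = {name: margin(cost) for name, cost in setups.items()}

for name, (cost_eur, margin_eur, pct) in results.items():
    print(f"{name}: cost EUR {cost_eur:.3f}, margin EUR {margin_eur:.3f} ({pct:.0%})")
```

Because provider costs are in USD and the customer price is in EUR, the margins move with the exchange rate; rerunning with a current rate is advisable before relying on these numbers.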

Warning: ElevenLabs is the dominant cost driver. With Multilingual v2, margins are razor thin at €0.30/min. Consider:

Higher pricing for premium voices

Restricting platform keys to Flash models only

Moving to Business tier ($0.06/1K for Flash, $0.12/1K for Multilingual) to cut TTS costs roughly in half compared with Creator tier
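To compare the last two options, the sketch below recomputes the Quality Setup margin with different ElevenLabs prices. It assumes the same call metrics as the examples above (1,200 TTS characters per minute), an exchange rate of 0.924 inferred from the conversions in this document, and it ignores the monthly subscription fee of each tier; all names are illustrative.

```python
PLATFORM_PRICE_EUR = 0.30
USD_TO_EUR = 0.924  # ASSUMPTION: inferred from $0.145 ~ EUR 0.134 in this document

# STT + GPT-4o input + GPT-4o output per minute, from the Quality Setup table.
NON_TTS_COST_USD = 0.0077 + 0.0125 + 0.02
TTS_KCHARS_PER_MIN = 1.2

# ElevenLabs USD price per 1K characters, from the tier tables above.
TTS_PRICE_PER_1K = {
    "Multilingual v2, Creator": 0.22,
    "Flash v2.5, Creator": 0.11,
    "Multilingual v2, Business": 0.12,
}


def margin_pct(tts_price_per_1k: float) -> float:
    """Margin at the flat platform rate, as a percentage of the customer price."""
    cost_usd = NON_TTS_COST_USD + TTS_KCHARS_PER_MIN * tts_price_per_1k
    cost_eur = cost_usd * USD_TO_EUR
    return (PLATFORM_PRICE_EUR - cost_eur) / PLATFORM_PRICE_EUR * 100


margins = {name: margin_pct(price) for name, price in TTS_PRICE_PER_1K.items()}
for name, pct in margins.items():
    print(f"{name}: {pct:.1f}% margin")
```

Under these assumptions, either switching to Flash or moving Multilingual v2 to Business tier lifts the margin from about 6% to the mid-40s; the Business subscription fee would have to be amortized over call volume separately.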
