Provider Pricing

Last updated: March 2026

All provider prices are in USD. These are the API costs charged by providers, not HMS Sovereign's pricing to customers. HMS Sovereign's own fees, covered in the final section, are in EUR.

Speech-to-Text (STT)

Deepgram

Model	Price per Minute
Nova 3 (Multilingual)	$0.0092
Nova 3 (Monolingual)	$0.0077
Nova 2	$0.0058
Nova 1	$0.0058
Enhanced	$0.0165
Base	$0.0145

Note: Prices are for the Pay-As-You-Go tier. The Growth tier is ~17% cheaper.

Gladia

Model	Price per Hour
Solaria (Async)	$0.61
Solaria (Real-time)	$0.75

Converted to per minute: ~$0.0102/min (async), ~$0.0125/min (real-time)
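The per-minute figures follow from dividing the hourly rate by 60. A minimal Python sketch of that conversion; the function name `per_minute` is illustrative and not part of any provider SDK:

```python
def per_minute(price_per_hour: float) -> float:
    """Convert an hourly price to a per-minute price."""
    return price_per_hour / 60


# Gladia Solaria hourly rates from the table above (USD)
gladia_async = per_minute(0.61)
gladia_realtime = per_minute(0.75)

print(f"async: ${gladia_async:.4f}/min")         # async: $0.0102/min
print(f"real-time: ${gladia_realtime:.4f}/min")  # real-time: $0.0125/min
```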

Language Models (LLM)

OpenAI

Prices per 1M tokens.

Model	Input	Output
GPT-5 Mini	$0.25	$2.00
GPT-4.1	$2.00	$8.00
GPT-4.1 Mini	$0.40	$1.60
GPT-4.1 Nano	$0.10	$0.40
GPT-4o	$2.50	$10.00
GPT-4o (2024-05-13)	$5.00	$15.00
GPT-4o Mini	$0.15	$0.60
GPT-4 Turbo	$10.00	$30.00
GPT-4	$30.00	$60.00
GPT-4 32K	$60.00	$120.00
GPT-3.5 Turbo	$0.50	$1.50
GPT-3.5 Turbo 16K	$3.00	$4.00

Recommended for voice assistants: GPT-5 Mini (best value), GPT-4o Mini (fastest), GPT-4.1 Mini (balanced)
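Because the rates are quoted per 1M tokens, the cost of one request is each token count times its rate, divided by 1,000,000. A rough sketch, assuming hypothetical token counts; the function name is illustrative and only the rates come from the table above:

```python
def llm_cost_usd(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Cost in USD for one request, given rates quoted per 1M tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# GPT-4o Mini rates from the table: $0.15 input, $0.60 output per 1M tokens.
# The token counts are hypothetical, chosen only for illustration.
cost = llm_cost_usd(input_tokens=2_000, output_tokens=500,
                    input_rate=0.15, output_rate=0.60)
print(f"${cost:.6f}")  # $0.000600
```

The same function applies to the Mistral and xAI tables, since they are also quoted per 1M tokens.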

Mistral

Prices per 1M tokens.

Model	Input	Output
Mistral Large	$0.50	$1.50
Mistral Medium	$0.40	$2.00
Mistral Small	$0.10	$0.30
Ministral 8B	$0.15	$0.15
Ministral 3B	$0.10	$0.10
Codestral	$0.30	$0.90
Mixtral 8x7B	$0.70	$0.70
Mixtral 8x22B	$2.00	$6.00

Recommended for voice assistants: Mistral Small (fast + cheap), Mistral Medium (balanced)

xAI (Grok)

Prices per 1M tokens.

Model	Input	Output
Grok 4.1 Fast	$0.20	$0.50
Grok 4 Fast	$0.20	$0.50
Grok Code Fast 1	$0.20	$1.50
Grok 4 (0709)	$3.00	$15.00
Grok 3 Mini	$0.30	$0.50
Grok 3	$3.00	$15.00

Realtime API (Speech-to-Speech):

Model	Price
Grok Realtime v1	$0.05/min ($3.00/hr)

Recommended: Grok 4.1 Fast (best value), Grok Realtime (for S2S)

Text-to-Speech (TTS)

ElevenLabs

Prices per 1,000 characters. Based on Creator tier ($22/mo).

Model	Price per 1K chars
Flash v2.5	$0.11
Turbo v2.5	$0.11
Eleven v3	$0.22
Multilingual v2	$0.22
Monolingual v1	$0.22

Tier pricing breakdown:

Tier	Flash/Turbo per 1K	Multilingual per 1K
Free	N/A	$0.17
Starter ($5)	$0.08	$0.17
Creator ($22)	$0.11	$0.22
Pro ($99)	$0.10	$0.20
Scale ($330)	$0.08	$0.17
Business ($1,320)	$0.06	$0.12

Recommended: Flash v2.5 (fastest, cheapest), Multilingual v2 (best quality)

Inworld

Prices per 1,000,000 characters (On-demand tier).

Model	Price per 1M chars	Per 1K chars
TTS 1.5 Mini	$5.00	$0.005
TTS 1.5 Max	$10.00	$0.01
TTS 1	$5.00	$0.005
TTS 1 Max	$10.00	$0.01

Note: At 650 chars/min, Inworld is roughly 11x (TTS 1.5 Max) to 22x (TTS 1.5 Mini) cheaper than ElevenLabs Flash v2.5 on the Creator tier:

Inworld TTS 1.5 Mini: $0.00325/min
Inworld TTS 1.5 Max: $0.0065/min
ElevenLabs Flash v2.5: $0.0715/min
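The per-minute figures above are the per-1K-character rate multiplied by the assumed speaking rate of 650 characters per minute. A small sketch that reproduces them and the resulting price ratios; all names are illustrative:

```python
CHARS_PER_MINUTE = 650  # speaking-rate assumption used in the note above


def tts_cost_per_minute(price_per_1k_chars: float,
                        chars_per_minute: int = CHARS_PER_MINUTE) -> float:
    """USD per minute of speech, given a price per 1,000 characters."""
    return price_per_1k_chars * chars_per_minute / 1_000


# Per-1K-character rates taken from the tables above (USD)
rates = {
    "Inworld TTS 1.5 Mini": 0.005,
    "Inworld TTS 1.5 Max": 0.01,
    "ElevenLabs Flash v2.5 (Creator)": 0.11,
}
per_min = {name: tts_cost_per_minute(rate) for name, rate in rates.items()}

for name, cost in per_min.items():
    print(f"{name}: ${cost:.5f}/min")

flash = per_min["ElevenLabs Flash v2.5 (Creator)"]
ratio_mini = flash / per_min["Inworld TTS 1.5 Mini"]
ratio_max = flash / per_min["Inworld TTS 1.5 Max"]
print(f"Flash vs Mini: {ratio_mini:.0f}x, Flash vs Max: {ratio_max:.0f}x")
```

With these inputs the ratio works out to about 22x against TTS 1.5 Mini and about 11x against TTS 1.5 Max. A different ElevenLabs tier or speaking rate changes the absolute numbers; the speaking rate alone does not change the ratios.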

How These Costs Map to Your Bill

The tables above are the raw provider rates. How they reach your HMS Sovereign bill depends on the mode you run an assistant in. By default an assistant runs on HMS Sovereign's platform keys — every provider listed here works out of the box, with no API keys of your own required.

Mode	What you pay
Platform keys (default)	Model usage at cost + €0.07/min orchestration
Vertex AI Live (Google Gemini realtime)	€0.25/min all-in
Bring Your Own Key (optional)	Your provider's usage (billed by them) + €0.07/min orchestration
Local models (Whisper, Piper)	Free + €0.07/min orchestration

1 credit = €0.07 (one minute of orchestration).

On platform keys, the model usage from the tables above is passed through at cost and the only HMS Sovereign markup is the €0.07/min orchestration fee. Choosing a lighter STT/LLM/TTS combination lowers the at-cost model portion of your bill; the orchestration fee stays the same. With Bring Your Own Key (optional), your provider bills you directly for that usage instead and HMS Sovereign charges only the €0.07/min orchestration fee. Local models (Whisper, Piper) carry no model-usage charge, leaving just the €0.07/min orchestration fee.
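As a rough illustration of the platform-keys mode, the sketch below adds up at-cost model usage and the orchestration fee of one credit per minute. It is not an official billing formula. The function and class names are hypothetical, the LLM per-minute figure is a placeholder, and the two components are kept in separate currencies because the page does not state how USD provider costs are converted:

```python
from dataclasses import dataclass

ORCHESTRATION_EUR_PER_MIN = 0.07  # 1 credit per minute of orchestration


@dataclass
class Estimate:
    model_usage_usd: float    # provider usage, passed through at cost
    orchestration_eur: float  # HMS Sovereign orchestration fee


def estimate_platform_keys(minutes: float,
                           stt_usd_per_min: float,
                           llm_usd_per_min: float,
                           tts_usd_per_min: float) -> Estimate:
    """Split an estimate into at-cost model usage and the orchestration fee."""
    usage = minutes * (stt_usd_per_min + llm_usd_per_min + tts_usd_per_min)
    return Estimate(model_usage_usd=usage,
                    orchestration_eur=minutes * ORCHESTRATION_EUR_PER_MIN)


# 100 minutes with Deepgram Nova 3 (Monolingual) and Inworld TTS 1.5 Mini
# at 650 chars/min. The LLM per-minute cost is a hypothetical placeholder,
# since it depends on how many tokens a conversation actually uses.
est = estimate_platform_keys(minutes=100,
                             stt_usd_per_min=0.0077,
                             llm_usd_per_min=0.001,
                             tts_usd_per_min=0.00325)
print(f"model usage: ${est.model_usage_usd:.3f}")
print(f"orchestration: €{est.orchestration_eur:.2f}")
```

For Bring Your Own Key or local models, the same orchestration term applies, while the model-usage term is billed by your provider or is zero, respectively.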
