VoiceDock Docs

Rate Limits

API rate limits, response headers, and best practices for staying within limits.

HMS Sovereign applies rate limits to ensure fair usage and maintain service quality for all users.

Current Limits

Limit Type      Rate                   Scope
API Requests    100 requests/minute    Per API key
Call Control    10 commands/minute     Per active call

Rate Limit Headers

Every API response includes headers to help you track your usage:

Header                   Description
X-RateLimit-Limit        Maximum requests allowed
X-RateLimit-Remaining    Requests remaining in current window
X-RateLimit-Reset        Unix timestamp when the limit resets
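
As an illustration, the three headers can be collected into a single object after each response. This is a minimal sketch: the helper name parseRateLimitHeaders and the shape of the returned object are assumptions made for this example, not part of the API. It also assumes X-RateLimit-Reset is expressed in seconds, as in the sample response on this page.

```javascript
// Hypothetical helper: reads the documented rate limit headers from a
// fetch-style Headers object (anything with a get(name) method works).
function parseRateLimitHeaders(headers) {
  const toNumber = (name) => {
    const value = headers.get(name);
    // Missing headers come back as null (Headers) or undefined (Map).
    return value === null || value === undefined ? null : Number(value);
  };

  const reset = toNumber('X-RateLimit-Reset');
  return {
    limit: toNumber('X-RateLimit-Limit'),
    remaining: toNumber('X-RateLimit-Remaining'),
    // Assumed to be a Unix timestamp in seconds; Date expects milliseconds.
    resetAt: reset === null ? null : new Date(reset * 1000),
  };
}
```

Calling parseRateLimitHeaders(response.headers) after a fetch gives numeric values that can be compared directly, which avoids string comparison mistakes.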

Exceeding the Limit

When you exceed the rate limit, the API returns a 429 response. The Retry-After header gives the number of seconds to wait before retrying:

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1702479600
Retry-After: 45

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Too many requests. Please retry after 45 seconds."
  }
}
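
A client can work out how long to wait from the headers of a 429 response like the one above. The following is a sketch under stated assumptions: the function name secondsUntilRetry is hypothetical, falling back to X-RateLimit-Reset when Retry-After is missing is a design choice of this example, and the final 1 second default is arbitrary.

```javascript
// Hypothetical helper: returns how many seconds to wait after a 429.
// nowMs is injectable so the function is easy to test.
function secondsUntilRetry(headers, nowMs = Date.now()) {
  // Prefer Retry-After, which the sample response gives in seconds.
  const retryAfter = Number(headers.get('Retry-After'));
  if (Number.isFinite(retryAfter) && retryAfter > 0) {
    return retryAfter;
  }

  // Otherwise wait until the reset time (Unix timestamp, assumed seconds).
  const reset = Number(headers.get('X-RateLimit-Reset'));
  if (Number.isFinite(reset) && reset > 0) {
    return Math.max(0, Math.ceil(reset - nowMs / 1000));
  }

  return 1; // Arbitrary default chosen for this sketch.
}
```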

Best Practices

1. Monitor Rate Limit Headers

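Check the headers on each response to see how much of your quota remains:

const response = await fetch(url, options);
const remaining = response.headers.get('X-RateLimit-Remaining');

// Header values are strings, or null when the header is absent,
// so convert to a number before comparing.
if (remaining !== null && Number(remaining) < 10) {
  console.warn('Approaching rate limit:', remaining, 'requests remaining');
}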

2. Implement Exponential Backoff

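When you receive a 429 response, wait before retrying. Honor Retry-After when it is present, and otherwise double the delay on each attempt: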

async function fetchWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);
    
    if (response.status !== 429) {
      return response;
    }
    
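    // Prefer the server's Retry-After (seconds); otherwise back off 1s, 2s, 4s, ...
    const retryAfter = Number(response.headers.get('Retry-After'));
    const delaySeconds = retryAfter > 0 ? retryAfter : Math.pow(2, attempt);
    await new Promise(resolve => setTimeout(resolve, delaySeconds * 1000));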
  }
  
  throw new Error('Max retries exceeded');
}

3. Cache Responses

Cache data that doesn't change frequently:

const cache = new Map();
const CACHE_TTL = 60000; // 1 minute

async function getAgent(agentId) {
  const cacheKey = `assistant:${agentId}`;
  const cached = cache.get(cacheKey);
  
  if (cached && Date.now() - cached.timestamp < CACHE_TTL) {
    return cached.data;
  }
  
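  const response = await fetch(`/assistants/${agentId}`);
  if (!response.ok) {
    // Do not cache error responses such as a 429.
    throw new Error(`Request failed with status ${response.status}`);
  }
  const data = await response.json();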
  
  cache.set(cacheKey, { data, timestamp: Date.now() });
  return data;
}

4. Batch Operations

Instead of issuing one request per item, fetch the list once and filter it locally:

// Instead of this:
for (const id of agentIds) {
  const assistant = await getAgent(id);  // N requests
}

// Do this:
const assistants = await listAgents();  // 1 request
const relevantAgents = assistants.filter(a => agentIds.includes(a.id));

5. Use Webhooks for Real-Time Data

Instead of polling for call status, use webhooks:

{
  "webhook_url": "https://your-domain.com/webhook",
  "webhook_events": ["status-update", "end-of-call-report"]
}
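
On the receiving side, incoming events can be routed to one handler per event name. The sketch below is illustrative only. The payload format is not described on this page, so the `type` field is an assumption, and createWebhookDispatcher is a hypothetical helper. Only the two event names come from the configuration above.

```javascript
// Hypothetical dispatcher. Assumption: each webhook payload carries its
// event name in a `type` field. Verify the real payload shape before use.
function createWebhookDispatcher(handlers) {
  return function dispatch(payload) {
    const known =
      payload != null &&
      Object.prototype.hasOwnProperty.call(handlers, payload.type);
    if (!known) {
      return false; // Ignore events without a registered handler.
    }
    handlers[payload.type](payload);
    return true;
  };
}

// Example wiring for the two subscribed events.
const dispatch = createWebhookDispatcher({
  'status-update': (event) => console.log('Call status changed:', event),
  'end-of-call-report': (event) => console.log('Call finished:', event),
});
```

The dispatch function can then be called from whatever HTTP framework serves your webhook URL, passing it the parsed JSON body.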

Call Control Limits

Call control commands have a separate limit of 10 commands per minute per active call. This prevents abuse while allowing normal interaction patterns.

Examples of commands that count toward the limit:

  • inject-context
  • say
  • end-call
  • transfer
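
To stay under this limit, a client can track its own command history per call and hold back commands that would exceed the budget. This is a minimal sketch, not an official client: the class name CallControlThrottle and its methods are hypothetical. This page does not say how the server measures its one-minute window, so the sketch uses a sliding window, which never sends more than 10 commands in any 60 second span.

```javascript
// Hypothetical client-side throttle for call control commands.
class CallControlThrottle {
  constructor(maxCommands = 10, windowMs = 60000) {
    this.maxCommands = maxCommands;
    this.windowMs = windowMs;
    this.sentAt = new Map(); // callId -> array of send timestamps (ms)
  }

  // Returns true and records the command if it fits in the window.
  tryAcquire(callId, nowMs = Date.now()) {
    const recent = (this.sentAt.get(callId) || []).filter(
      (t) => nowMs - t < this.windowMs
    );
    const allowed = recent.length < this.maxCommands;
    if (allowed) {
      recent.push(nowMs);
    }
    this.sentAt.set(callId, recent);
    return allowed;
  }

  // Call when a call ends so its history does not accumulate.
  release(callId) {
    this.sentAt.delete(callId);
  }
}
```

Before sending a command such as say or transfer, check throttle.tryAcquire(callId) and queue or drop the command when it returns false.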

Higher Limits

If you need higher rate limits for your use case, contact support@hmsovereign.com with:

  • Your organization ID
  • Expected request volume
  • Use case description
