TrustGate ships two rate-limiting policies backed by Redis counters. Both count within the scope of the policy that attaches them: per consumer when the policy targets consumers, gateway-wide when the policy is global. The optional group_by_header setting sub-partitions the counter within that scope (e.g. per end-user or tenant).

ratelimit — request rate limiting

Counts requests in a sliding window.
Setting          Type      Notes
limit            int       Max requests per window.
window           duration  Go duration string: 30s, 1m, 1h.
retry_after      string    Retry-After value when limited (default 60).
group_by_header  string    Optional sub-partition key (e.g. a tenant header).
{ "slug": "ratelimit", "settings": { "limit": 100, "window": "1m" } }
Responses carry X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, and (when limited) Retry-After.
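
To make the sliding-window behaviour and the response headers concrete, here is a minimal in-memory sketch. It is not TrustGate's implementation, which keeps its counters in Redis. The class name SlidingWindowLimiter, the check method, the window expressed in seconds, and the reading of X-RateLimit-Reset as an epoch timestamp are all assumptions made for illustration.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Illustrative sliding-window log. Hypothetical; not TrustGate's Redis code."""

    def __init__(self, limit, window_seconds, retry_after="60"):
        self.limit = limit
        self.window = window_seconds
        self.retry_after = retry_after
        self.hits = {}  # counter key -> deque of request timestamps

    def check(self, key, now=None):
        """Return (allowed, headers) for one request against `key`."""
        now = time.time() if now is None else now
        q = self.hits.setdefault(key, deque())
        # Evict timestamps that have slid out of the window.
        while q and q[0] <= now - self.window:
            q.popleft()
        allowed = len(q) < self.limit
        if allowed:
            q.append(now)
        # Assumption: reset is the epoch second at which the oldest hit expires.
        reset = int(q[0] + self.window) if q else int(now + self.window)
        headers = {
            "X-RateLimit-Limit": str(self.limit),
            "X-RateLimit-Remaining": str(max(self.limit - len(q), 0)),
            "X-RateLimit-Reset": str(reset),
        }
        if not allowed:
            # Only limited responses carry Retry-After.
            headers["Retry-After"] = self.retry_after
        return allowed, headers
```

With limit 2 and a 60-second window, a third request inside the window is rejected with Retry-After set, and capacity returns once the oldest request ages out.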

tokenratelimit — token rate limiting

Counts prompt + completion tokens from the provider’s usage block — the right control for LLM cost, since a few large requests can cost more than many small ones.
Setting          Type    Notes
window.unit      enum    second · minute · hour · day.
window.max       int     Max tokens per window.
group_by_header  string  Optional sub-partition key.
{ "slug": "tokenratelimit", "settings": { "window": { "unit": "minute", "max": 50000 } } }
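
Because token counts are only known once the provider has responded, one plausible flow is to check the remaining budget before forwarding a request and to charge the actual usage afterwards. The sketch below illustrates that idea in memory. The fixed, unit-aligned window, the check-then-charge flow, the TokenWindowLimiter class, and the OpenAI-style usage field names (prompt_tokens, completion_tokens) are assumptions, not documented TrustGate behaviour.

```python
import time

# Seconds per window.unit value listed in the settings table.
UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


class TokenWindowLimiter:
    """Illustrative token budget per window. Hypothetical; not TrustGate's code."""

    def __init__(self, unit, max_tokens):
        self.span = UNIT_SECONDS[unit]
        self.max = max_tokens
        # (key, window index) -> tokens consumed. Old buckets are never purged
        # here; a Redis-backed counter would typically expire them instead.
        self.used = {}

    def _bucket(self, key, now):
        # Assumption: fixed windows aligned to multiples of the unit.
        return (key, int(now // self.span))

    def allow(self, key, now=None):
        """Pre-request check: is there any budget left in the current window?"""
        now = time.time() if now is None else now
        return self.used.get(self._bucket(key, now), 0) < self.max

    def record(self, key, usage, now=None):
        """Post-response charge: add prompt + completion tokens to the window."""
        now = time.time() if now is None else now
        tokens = usage.get("prompt_tokens", 0) + usage.get("completion_tokens", 0)
        bucket = self._bucket(key, now)
        self.used[bucket] = self.used.get(bucket, 0) + tokens
        return self.used[bucket]
```

Under this flow a single large response can push the counter past window.max; the overshoot is then blocked on the next request rather than prevented up front.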

Choosing a scope

  • Global policy → a gateway-wide ceiling protecting your upstream spend.
  • Consumer-scoped policy → per-tenant quotas.
  • group_by_header → fairness within a tenant (per end-user), without a policy per user.
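
The three scoping options above can be pictured as layers of one counter key: the policy, then the scope, then an optional header partition. The sketch below shows one way such a key could be derived. The counter_key function, the key layout, and the fallback to the scope-level counter when the header is absent are hypothetical and do not describe TrustGate's actual Redis key format.

```python
def counter_key(policy_slug, consumer_id=None, group_by_header=None, headers=None):
    """Build a hypothetical counter key: policy, scope, optional header partition."""
    # Global policies share one counter; consumer-scoped policies get one each.
    scope = f"consumer:{consumer_id}" if consumer_id is not None else "global"
    parts = [policy_slug, scope]
    if group_by_header:
        # HTTP header names are case-insensitive, so normalise before lookup.
        lowered = {k.lower(): v for k, v in (headers or {}).items()}
        value = lowered.get(group_by_header.lower())
        # Assumption: a missing header falls back to the scope-level counter.
        if value is not None:
            parts.append(f"{group_by_header.lower()}={value}")
    return ":".join(parts)
```

For example, a consumer-scoped tokenratelimit policy grouped by a hypothetical X-User-Id header yields one counter per end-user inside each tenant, without defining a policy per user.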