Global rate limiting imposes a system-wide limit on all requests passing through the gateway. This is useful as an upper bound for overall capacity protection, ensuring the gateway or downstream services aren’t overwhelmed by total traffic.
Overview
- What it does: Caps the total request volume across all IPs and users.
- Common use cases:
  - Protecting shared infrastructure.
  - Enforcing service-level quotas or performance thresholds.
Basic Configuration
Below is an example showing how to enable global limits:
Configuration Fields
- limit: Maximum number of requests allowed across all clients within the specified window.
- window: Time frame (e.g., 1m, 30s) for measuring requests.
- actions:
  - type:
    - reject: Returns a 429 status with retry information
    - block: Similar to reject but for permanent blocks
  - retry_after: Seconds to wait before retrying
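As a rough illustration of how these fields fit together, the Python sketch below builds a configuration and checks it against the descriptions above. Only the field names (limit, window, actions, type, retry_after) come from this page. The nesting, the example values, the assumption that retry_after is an integer, and the validate_global_limit helper are hypothetical and may not match the gateway's actual schema.

```python
# Hypothetical config shape: the field names come from this page, but the
# nesting and the example values are assumptions made for illustration.
global_limit_config = {
    "limit": 1000,
    "window": "1m",
    "actions": {"type": "reject", "retry_after": 60},
}

ALLOWED_ACTION_TYPES = {"reject", "block"}


def _is_int(value: object) -> bool:
    """True for real integers; bool is excluded because it subclasses int."""
    return isinstance(value, int) and not isinstance(value, bool)


def validate_global_limit(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the config looks sane."""
    problems = []
    limit = config.get("limit")
    if not _is_int(limit) or limit <= 0:
        problems.append("limit must be a positive integer")
    window = config.get("window")
    if not isinstance(window, str) or not window:
        problems.append("window must be a non-empty duration string")
    actions = config.get("actions")
    if not isinstance(actions, dict):
        problems.append("actions must be a mapping")
        return problems
    if actions.get("type") not in ALLOWED_ACTION_TYPES:
        problems.append("actions.type must be 'reject' or 'block'")
    retry_after = actions.get("retry_after")
    if not _is_int(retry_after) or retry_after < 0:
        problems.append("actions.retry_after must be a non-negative number of seconds")
    return problems
```

Running validate_global_limit(global_limit_config) on the sample above returns an empty list. Changing type to an unsupported value adds one entry to the result.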
Window Configuration
The window parameter supports any valid duration string:
- s: seconds (e.g., "30s")
- m: minutes (e.g., "5m")
- h: hours (e.g., "1h")
- d: days (e.g., "1d")
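To make the unit suffixes concrete, here is a minimal Python sketch that converts a window string into seconds. The parse_window function is a hypothetical helper, not part of the gateway. It assumes a single integer followed by one of the four suffixes listed above, and the gateway's own parser may accept other forms.

```python
import re

# Seconds per unit for the four suffixes listed above.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
_DURATION_RE = re.compile(r"(\d+)([smhd])")


def parse_window(window: str) -> int:
    """Convert a duration string such as "30s" or "5m" into seconds."""
    match = _DURATION_RE.fullmatch(window.strip())
    if match is None:
        raise ValueError(f"unsupported window value: {window!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]
```

For example, parse_window("5m") yields 300 and parse_window("1d") yields 86400. Anything that does not match the assumed pattern raises ValueError.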
Response Headers
The rate limiter adds the following headers to each response:
Per Limit Type Headers
{type} is one of:
- global
- per_ip
- per_user
- per_fingerprint
Rate Limit Exceeded Response
Implementation Details
Storage and Tracking
- Uses Redis sorted sets for tracking
- Key format: ratelimit:{level}:{id}:{limit_type}:{key}
- Automatic cleanup of expired entries
- Thread-safe operations
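The points above describe a sliding-window counter backed by a sorted set. The Python sketch below shows the general technique with an in-memory sorted list standing in for Redis, so it runs without a server. It is not the gateway's actual code. The SlidingWindowLimiter class, the build_key helper, and the sample key values are hypothetical. The comments name the Redis commands each step roughly corresponds to.

```python
import bisect
import threading


class SlidingWindowLimiter:
    """In-memory stand-in for a sorted-set sliding-window counter."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._entries: dict[str, list[float]] = {}  # key -> sorted timestamps
        self._lock = threading.Lock()  # serializes check-and-record per call

    def allow(self, key: str, now: float) -> bool:
        """Record a request at time `now` if it fits within the limit."""
        with self._lock:
            timestamps = self._entries.setdefault(key, [])
            # Drop entries that fell out of the window (akin to ZREMRANGEBYSCORE).
            cutoff = now - self.window_seconds
            del timestamps[: bisect.bisect_right(timestamps, cutoff)]
            # Count what remains (akin to ZCARD).
            if len(timestamps) >= self.limit:
                return False
            # Record this request with its timestamp as the score (akin to ZADD).
            bisect.insort(timestamps, now)
            return True


def build_key(level: str, id_: str, limit_type: str, key: str) -> str:
    """Fill in the documented key format; the argument values are up to the caller."""
    return f"ratelimit:{level}:{id_}:{limit_type}:{key}"
```

With a limit of 2 and a 60-second window, two requests at t=0 and t=1 are allowed, a third at t=2 is refused, and a request at t=60 is allowed again because the t=0 entry has expired. In this sketch refused requests are not recorded, which is one possible design choice; a real Redis-backed version would also need the remove, count, and add steps to run atomically, for example in a transaction or script.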