Strategy | Description |
---|---|
Per IP | Limits requests based on client IP address. Useful for blocking abusive IPs or preventing spam. |
Per User ID | Tracks usage per authenticated user. Ideal for SaaS and authenticated API scenarios. |
Global | Applies a global cap across all users and IPs. Acts as a system-wide fail-safe against overload. |
Token-Based | Controls requests based on token consumption (e.g., LLM usage). Especially useful for AI workloads. |
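
As a rough illustration of how the strategies in the table differ, the sketch below maps each one to a counter key and a per-request cost. The `Request` class and the `bucket_key` and `request_cost` helpers are hypothetical names introduced here, and scoping token-based limits to the user (falling back to the IP) is an assumption, not documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    """Hypothetical request shape used only for this illustration."""
    client_ip: str
    user_id: Optional[str] = None  # None for unauthenticated requests
    tokens: int = 0                # tokens consumed (e.g., LLM usage)


def bucket_key(strategy: str, req: Request) -> str:
    """Return the counter key a request is charged against."""
    if strategy == "global":
        return "global"  # one shared counter for all traffic
    if strategy == "per_ip":
        return f"ip:{req.client_ip}"
    if strategy == "per_user":
        if req.user_id is None:
            raise ValueError("per_user limiting requires an authenticated user")
        return f"user:{req.user_id}"
    if strategy == "tokens":
        # Assumption: token budgets are tracked per user, else per IP.
        owner = req.user_id if req.user_id is not None else req.client_ip
        return f"tokens:{owner}"
    raise ValueError(f"unknown strategy: {strategy!r}")


def request_cost(strategy: str, req: Request) -> int:
    """Token-based limits charge tokens; every other strategy charges 1."""
    return req.tokens if strategy == "tokens" else 1
```

Keeping key derivation separate from counting means the same counter logic can serve all four strategies.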

- `limit`: Maximum allowed requests or tokens.
- `window`: Duration in which the limit applies (e.g., `30s`, `1m`, `1h`).
- `actions`: What to do when limits are exceeded (e.g., `reject`, `block`, or `retry_after`).
- `headers`: Rate limit feedback headers are automatically added to responses.

`{type}` is one of: `global`, `per_ip`, `per_user`, or `tokens`.
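
To make the interaction between `limit` and `window` concrete, here is a minimal sketch of a fixed-window counter. The source does not say which algorithm is used, so fixed windows are an assumption chosen for simplicity; `parse_window`, `FixedWindowLimiter`, and `check` are hypothetical names, and only the `s`, `m`, and `h` suffixes shown in the examples are handled.

```python
import math
import re
import time
from typing import Dict, Optional, Tuple

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_window(window: str) -> int:
    """Convert a duration such as '30s', '1m', or '1h' to seconds."""
    match = re.fullmatch(r"(\d+)([smh])", window)
    if match is None:
        raise ValueError(f"unsupported window: {window!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


class FixedWindowLimiter:
    """In-memory fixed-window counter (illustrative, not thread-safe)."""

    def __init__(self, limit: int, window: str) -> None:
        self.limit = limit
        self.window_seconds = parse_window(window)
        # key -> (window start timestamp, units used in that window)
        self._state: Dict[str, Tuple[int, int]] = {}

    def check(self, key: str, cost: int = 1,
              now: Optional[float] = None) -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds) and record usage if allowed."""
        now = time.time() if now is None else now
        window_start = int(now // self.window_seconds) * self.window_seconds
        start, used = self._state.get(key, (window_start, 0))
        if start != window_start:
            start, used = window_start, 0  # a new window began; reset usage
        if used + cost > self.limit:
            # Seconds until the current window ends, rounded up.
            retry_after = math.ceil(start + self.window_seconds - now)
            return False, max(1, retry_after)
        self._state[key] = (start, used + cost)
        return True, 0
```

For example, `FixedWindowLimiter(limit=100, window="1m").check("ip:203.0.113.7")` charges one request against that IP's counter for the current minute. For token-based limits, pass the token count as `cost`.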
Use `retry_after` to guide clients on when they can retry.
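
One plausible way to surface `retry_after` to clients is through the standard HTTP `429 Too Many Requests` status and `Retry-After` header. The sketch below assumes that mapping; the `rejection_response` helper and the JSON body shape are hypothetical and not taken from the product's documented response format.

```python
import json
from typing import Dict, Tuple


def rejection_response(retry_after_seconds: int) -> Tuple[int, Dict[str, str], str]:
    """Build a (status, headers, body) triple for a rate-limited request."""
    headers = {
        # Standard HTTP header: seconds the client should wait before retrying.
        "Retry-After": str(retry_after_seconds),
        "Content-Type": "application/json",
    }
    # Hypothetical body shape, repeated here for clients that ignore headers.
    body = json.dumps({
        "error": "rate_limit_exceeded",
        "retry_after": retry_after_seconds,
    })
    return 429, headers, body
```

A well-behaved client would then sleep for at least the advertised number of seconds before sending the request again.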