Least Connections
Least Connections is a dynamic load balancing strategy in which the gateway directs each incoming request to the target with the fewest active connections at the time of dispatch. This helps keep any single target from being overwhelmed, and it is particularly useful when requests hold long-lived or resource-intensive connections.
How It Works
- Connection Count Tracking: The gateway monitors the number of active (or inflight) connections to each target.
- Decision Process: When a new request arrives, the gateway routes the request to the target with the lowest active connection count, balancing the load in real time.
- Adaptive Distribution: As traffic fluctuates, targets frequently shift in and out of the “least connections” spot. This approach is ideal when you have backends with relatively similar performance characteristics and want an adaptive load balancing method.
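The three steps above can be sketched in a few lines of Python. This is an illustrative model only, not the gateway's actual implementation: the class `LeastConnectionsBalancer`, its method names, and the target labels `"openai"` and `"anthropic"` are hypothetical, and the tie-breaking rule (first-listed target wins) is an assumption the source does not specify.

```python
import threading


class LeastConnectionsBalancer:
    """Hypothetical least-connections picker, for illustration only."""

    def __init__(self, targets):
        # Connection count tracking: one in-flight counter per target.
        self._active = {t: 0 for t in targets}
        self._lock = threading.Lock()

    def acquire(self):
        """Decision process: pick the target with the fewest active connections."""
        with self._lock:
            # min() returns the first minimal key, so ties go to the
            # earliest-listed target (an assumed tie-breaking rule).
            target = min(self._active, key=self._active.get)
            self._active[target] += 1
            return target

    def release(self, target):
        """Call when a request completes so the target's count drops again."""
        with self._lock:
            self._active[target] -= 1

    def snapshot(self):
        """Return a copy of the current per-target connection counts."""
        with self._lock:
            return dict(self._active)


# Adaptive distribution: the "least connections" spot moves as requests
# start and finish.
lb = LeastConnectionsBalancer(["openai", "anthropic"])
first = lb.acquire()    # both targets idle, so the first-listed one is chosen
second = lb.acquire()   # the other target now has fewer connections
lb.release(first)       # the first request completes
third = lb.acquire()    # the first target is least loaded again
```

The lock matters because the read of the counts and the increment must happen as one step; without it, two concurrent requests could both see the same target as least loaded and pile onto it.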
Example: Creating an Upstream with Least Connections
Below is an example curl command demonstrating how to create an Upstream using the least-connections load balancing algorithm. It sets up two targets, one pointing to OpenAI and another to Anthropic.