Load Balancing
Weighted Round-Robin
Weighted round-robin is a variation of the round-robin load balancing strategy. Rather than cycling evenly through the available targets, it assigns each target a weight that determines its share of the traffic. This lets you direct traffic in proportion to each backend service's capacity, performance, or other criteria.
- Core Concept: Requests are distributed in a round-robin fashion, but targets with higher weights receive a proportionally larger share of the requests.
- Fine-Grained Control: You can increase or decrease a target’s weight to adjust how much traffic it handles, which suits deployments whose backends have different resource capacities.
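To make the proportional behavior concrete, here is a minimal Python sketch of one common variant, often called smooth weighted round-robin. It is an illustration only and is not necessarily how the gateway implements the algorithm; the function name, the target names, and the 3:1 weights are all assumptions chosen for the example.

```python
from collections import Counter


def smooth_weighted_round_robin(weights, n):
    """Return a list of n picks, spreading targets in proportion to their weights.

    `weights` maps a (hypothetical) target name to a positive integer weight.
    """
    total = sum(weights.values())
    current = {name: 0 for name in weights}  # running score per target
    picks = []
    for _ in range(n):
        # Every target earns its weight on each round.
        for name, weight in weights.items():
            current[name] += weight
        # The target with the highest running score serves this request.
        # On a tie, max() returns the first such target in dict order.
        chosen = max(current, key=current.get)
        # Penalize the chosen target so that the others get their turn.
        current[chosen] -= total
        picks.append(chosen)
    return picks


# Illustrative weights only: "openai" should get 3 of every 4 requests.
example_weights = {"openai": 3, "anthropic": 1}
picks = smooth_weighted_round_robin(example_weights, 8)
print(picks[:4])       # ['openai', 'openai', 'anthropic', 'openai']
print(Counter(picks))  # Counter({'openai': 6, 'anthropic': 2})
```

With these weights, every window of four requests sends three to the heavier target and one to the lighter one. Raising or lowering a weight changes that ratio without altering the selection logic.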
Create an Upstream with Weighted Round-Robin
Below is an example command to create an Upstream using the weighted-round-robin algorithm. The sample config includes two targets, one for OpenAI and another for Anthropic, each assigned a weight that determines how traffic is balanced.