Weighted round-robin is a variation of the round-robin load balancing strategy. Instead of cycling evenly through the available targets, the load balancer assigns each target a weight that sets its share of the traffic. This lets you direct traffic in proportion to the capacity, performance, or other criteria of each backend service.
Core Concept: Requests are distributed in a round-robin fashion, but targets with higher weights receive a proportionally larger share of the requests.
Fine-Grained Control: You can increase or decrease a target’s weight to adjust how much traffic it handles, which suits deployments whose backends differ in resource capacity.
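To make the proportional behavior concrete, here is a minimal Python sketch of one common way to implement the idea, the "smooth" weighted round-robin technique. It is an illustration only and not necessarily how this product selects targets internally; the class name `WeightedRoundRobin`, the target names, and the 3:1 weights are all assumptions chosen for the example.

```python
from collections import Counter


class WeightedRoundRobin:
    """Illustrative smooth weighted round-robin over named targets (hypothetical helper)."""

    def __init__(self, weights):
        # weights: mapping of target name -> positive weight (assumed example values)
        if not weights or any(w <= 0 for w in weights.values()):
            raise ValueError("weights must be a non-empty mapping of positive numbers")
        self._weights = dict(weights)
        self._total = sum(self._weights.values())
        self._current = {name: 0 for name in self._weights}

    def next_target(self):
        # Every target gains its own weight on each request...
        for name, weight in self._weights.items():
            self._current[name] += weight
        # ...the target with the highest running score is chosen...
        chosen = max(self._current, key=self._current.get)
        # ...and pays back the total weight, so others get their turn.
        self._current[chosen] -= self._total
        return chosen


# Hypothetical weights: "openai" should receive three requests for every one sent to "anthropic".
balancer = WeightedRoundRobin({"openai": 3, "anthropic": 1})
counts = Counter(balancer.next_target() for _ in range(400))
print(counts["openai"], counts["anthropic"])  # 300 100
```

With these assumed weights, the higher-weighted target receives three quarters of the requests, and the selections are interleaved rather than sent in long bursts to one target. Changing a weight changes that target's share on subsequent requests.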
Below is an example command to create an Upstream using the weighted-round-robin algorithm. The sample config includes two targets—one for OpenAI and another for Anthropic—each assigned a weight to dictate how traffic is balanced.