Weighted round-robin is a variation of the round-robin load balancing strategy. Instead of cycling evenly through the available targets, each target is assigned a weight that determines its share of the traffic. This lets you direct traffic in proportion to the capacity, performance, or other criteria of each backend service.
- Core Concept: Requests are distributed in a round-robin fashion, but targets with higher weights receive a proportionally larger share of the requests.
- Fine-Grained Control: You can increase or decrease a target’s weight to adjust how much traffic it handles, making it an excellent approach for deployments with varying resource capacities.
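To make the proportional behavior concrete, here is a minimal Python sketch of one common way to implement weighted round-robin (the "smooth" variant). It is an illustration only: the gateway's internal algorithm is not documented here and may differ, and the function names `weighted_round_robin` and `sample_counts` are hypothetical.

```python
from collections import Counter


def weighted_round_robin(weights):
    """Yield target names forever using smooth weighted round-robin.

    `weights` maps a target name to a positive integer weight.
    This is an illustrative sketch, not the gateway's implementation.
    """
    total = sum(weights.values())
    current = {name: 0 for name in weights}
    while True:
        # Every target earns credit equal to its weight on each round.
        for name, weight in weights.items():
            current[name] += weight
        # The target with the most credit serves the request...
        chosen = max(current, key=current.get)
        # ...and pays back the total so the other targets can catch up.
        current[chosen] -= total
        yield chosen


def sample_counts(weights, n):
    """Count how often each target is chosen over n requests."""
    picker = weighted_round_robin(weights)
    return Counter(next(picker) for _ in range(n))


counts = sample_counts({"api.openai.com": 60, "api.anthropic.com": 40}, 100)
print(counts["api.openai.com"], counts["api.anthropic.com"])  # prints "60 40"
```

Because the selection depends only on the ratio of the weights, raising or lowering one target's weight shifts its share of requests without touching the others' settings.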
Create an Upstream with Weighted Round-Robin
Below is an example command to create an Upstream using the weighted-round-robin algorithm. The sample config includes two targets—one for OpenAI and another for Anthropic—each assigned a weight to dictate how traffic is balanced.
# Create an upstream with weighted distribution.
# Weights of 60 and 40 send 60% of traffic to OpenAI and 40% to Anthropic.
# Replace {gateway-id} with your gateway's ID. The JSON body must not contain comments.
curl -X POST http://localhost:8080/api/v1/gateways/{gateway-id}/upstreams \
  -H "Content-Type: application/json" \
  -d '{
    "name": "weighted-upstream",
    "algorithm": "weighted-round-robin",
    "targets": [
      {
        "host": "api.openai.com",
        "port": 443,
        "protocol": "https",
        "weight": 60,
        "priority": 1,
        "default_model": "gpt-4o-mini",
        "models": ["gpt-3.5-turbo", "gpt-4", "gpt-4o-mini"],
        "credentials": {
          "header_name": "Authorization",
          "header_value": "Bearer your-openai-key"
        }
      },
      {
        "host": "api.anthropic.com",
        "port": 443,
        "protocol": "https",
        "weight": 40,
        "priority": 1,
        "default_model": "claude-3-5-sonnet-20241022",
        "models": ["claude-3-5-sonnet-20241022"],
        "credentials": {
          "header_name": "Authorization",
          "header_value": "Bearer your-anthropic-key"
        }
      }
    ],
    "health_checks": {
      "passive": true,
      "threshold": 3,
      "interval": 60
    }
  }'
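If you prefer to build the request programmatically, the same call can be sketched in Python with only the standard library. This assumes the endpoint and field names shown in the curl example; the helper names `make_target` and `build_request` and the `your-gateway-id` value are hypothetical placeholders. The request is constructed but not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/api/v1"  # same base URL as the curl example


def make_target(host, weight, default_model, models, api_key):
    """Build one target entry using the fields from the curl example."""
    return {
        "host": host,
        "port": 443,
        "protocol": "https",
        "weight": weight,
        "priority": 1,
        "default_model": default_model,
        "models": models,
        "credentials": {
            "header_name": "Authorization",
            "header_value": f"Bearer {api_key}",
        },
    }


def build_request(gateway_id, targets):
    """Return a POST request for the upstreams endpoint without sending it."""
    payload = {
        "name": "weighted-upstream",
        "algorithm": "weighted-round-robin",
        "targets": targets,
        "health_checks": {"passive": True, "threshold": 3, "interval": 60},
    }
    return urllib.request.Request(
        f"{BASE_URL}/gateways/{gateway_id}/upstreams",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


targets = [
    make_target("api.openai.com", 60, "gpt-4o-mini",
                ["gpt-3.5-turbo", "gpt-4", "gpt-4o-mini"], "your-openai-key"),
    make_target("api.anthropic.com", 40, "claude-3-5-sonnet-20241022",
                ["claude-3-5-sonnet-20241022"], "your-anthropic-key"),
]
request = build_request("your-gateway-id", targets)
# To send it against a running gateway: urllib.request.urlopen(request)
```

Serializing the payload with `json.dumps` guarantees valid JSON, which avoids mistakes such as stray comments or trailing commas in a hand-written body.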