Load Balancing
Load balancing helps distribute AI model requests across multiple instances for better performance and reliability.
Configure Upstream Targets
Set up upstream targets for load balancing:
# Create an upstream
curl -X POST http://localhost:8001/upstreams \
--data name=openai-upstream \
--data algorithm=round-robin
# Add targets to the upstream
curl -X POST http://localhost:8001/upstreams/openai-upstream/targets \
--data target=api.openai.com:443 \
--data weight=100
curl -X POST http://localhost:8001/upstreams/openai-upstream/targets \
--data target=api2.openai.com:443 \
--data weight=100
Load Balancing Algorithms
Choose from different load balancing algorithms:
# Round Robin balancing
curl -X POST http://localhost:8001/upstreams \
--data name=ai-upstream \
--data algorithm=round-robin
# Least Connections
curl -X POST http://localhost:8001/upstreams \
--data name=ai-upstream \
--data algorithm=least-connections
# Hash-based balancing
curl -X POST http://localhost:8001/upstreams \
--data name=ai-upstream \
--data algorithm=consistent-hashing \
--data hash_on=header \
--data hash_on_header=user-id
Health Checks
Configure health checks for targets:
# Add active health checks
curl -X PATCH http://localhost:8001/upstreams/ai-upstream \
--data healthchecks.active.http_path=/health \
--data healthchecks.active.healthy.interval=5 \
--data healthchecks.active.unhealthy.interval=5
# Add passive health checks
curl -X PATCH http://localhost:8001/upstreams/ai-upstream \
--data healthchecks.passive.healthy.successes=3 \
--data healthchecks.passive.unhealthy.http_failures=3
Target Weight and Priority
Adjust target weights and priorities:
# Update target weight
curl -X POST http://localhost:8001/upstreams/ai-upstream/targets \
--data target=api1.example.com:443 \
--data weight=200
# Set backup target
curl -X POST http://localhost:8001/upstreams/ai-upstream/targets \
--data target=backup-api.example.com:443 \
--data weight=100 \
--data backup=true
Monitoring
Monitor load balancer status:
# Check upstream status
curl -X GET http://localhost:8001/upstreams/ai-upstream/health
# Check target status
curl -X GET http://localhost:8001/upstreams/ai-upstream/targets
# Get load balancer metrics
curl -X GET http://localhost:8001/metrics
Next Steps
After configuring load balancing:
- Set up Rate Limiting to protect your services