Configure an Upstream
This guide will walk you through configuring your first upstream in AI Gateway. An upstream defines how requests are distributed across multiple AI providers and models.
Prerequisites
Before you begin, make sure you have:
- AI Gateway installed and running
- Access to the Admin API
- API keys for your AI providers (e.g., OpenAI, Anthropic)
Step 1: Plan Your Upstream
Decide on your upstream configuration:
- Choose which AI providers you want to use
- Identify the models for each provider
- Determine load balancing weights
- Plan your routing strategy
Step 2: Create the Upstream
Create an upstream with multiple AI providers:
curl -X POST http://localhost:8080/api/v1/gateways/{gateway-id}/upstreams \
-H "Content-Type: application/json" \
-d '{
"name": "ai-providers-upstream",
"algorithm": "round-robin",
"targets": [
{
"path": "/v1/chat/completions",
"provider": "openai",
"weight": 50,
"priority": 1,
"default_model": "gpt-4o-mini",
"models": ["gpt-3.5-turbo", "gpt-4", "gpt-4o-mini"],
"credentials": {
"header_name": "Authorization",
"header_value": "Bearer your-openai-key"
}
},
{
"path": "/v1/messages",
"provider": "anthropic",
"weight": 50,
"priority": 1,
"default_model": "claude-3-5-sonnet-20241022",
"models": ["claude-3-5-sonnet-20241022"],
"headers": {
"anthropic-version": "2023-06-01"
},
"credentials": {
"header_name": "x-api-key",
"header_value": "your-anthropic-key"
}
}
],
"health_checks": {
"passive": true,
"threshold": 3,
"interval": 60
}
}'
Understanding the Configuration
-
Provider Configuration
path
: The endpoint path for the providerprovider
: The AI provider nameweight
: Load balancing weight (1-100)priority
: Failover priority (lower numbers = higher priority)default_model
: Default model to use when none specifiedmodels
: List of supported models for this provider
-
Load Balancing Strategy
algorithm
: Determines how requests are distributedround-robin
: Rotates through providers sequentiallyweighted
: Uses provider weights to distribute traffic
weight
: Higher weights receive proportionally more traffic- Example: weight 50/50 splits traffic equally
- Example: weight 70/30 sends 70% to first provider
- Multiple providers can serve the same model type
- Requests for "gpt-4" could go to either provider
- Helps with redundancy and cost optimization
-
Fallback Strategy
priority
: Controls the failover sequence- Priority 1 providers are tried first
- Higher numbers are used as backups
- Health checks determine availability
- Unhealthy providers are skipped
- Traffic automatically routes to healthy providers
- Model availability affects routing
- If requested model isn't available, tries next provider
- Falls back to provider's default model if specified
-
Authentication
credentials
: Provider API keysheaders
: Additional required headers
-
Health Checking
passive
: Enable passive health checksthreshold
: Number of failures before marking unhealthyinterval
: Time between health checks in seconds- Affects both load balancing and failover
- Unhealthy providers are removed from rotation
- Automatically restored when health returns
Step 3: Verify Configuration
Check that your upstream is properly configured:
curl http://localhost:8080/api/v1/gateways/{gateway-id}/upstreams/{upstream-id}
Step 4: Test Load Balancing
Test the load balancing across providers:
# Test with OpenAI-compatible endpoint
curl -X POST http://your-gateway-domain/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-Api-Key: your-api-key" \
-d '{
"model": "gpt-4o-mini",
"messages": [{"role": "user", "content": "Hello"}]
}'
The response headers will include X-Selected-Provider
to show which provider handled the request.
Next Steps
Now that you have configured your upstream:
- Configure a Service to use this upstream
Additional Resources
Troubleshooting
Common issues and solutions:
-
Provider Issues
- Verify API keys are valid
- Check provider endpoints
- Confirm model availability
-
Load Balancing
- Check provider weights
- Monitor distribution
- Verify failover behavior
-
Authentication
- Verify credential format
- Check required headers
- Test provider access