This guide demonstrates how to set up load balancing between multiple AI providers in TrustGate. By distributing traffic across different providers, you can improve reliability, optimize costs, and ensure high availability for your AI applications.
When making requests to your load-balanced API, TrustGate automatically handles provider selection and request transformation:
curl -X POST "http://localhost:8081/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "X-TG-API-Key: your-api-key" \
  -H "Authorization: Bearer ${JWT_TOKEN}" \
  -d '{
    "model": "gpt-4",
    "messages": [
      { "role": "system", "content": "You are an assistant" },
      { "role": "user", "content": "Hello, how are you?" }
    ],
    "max_tokens": 1020,
    "system": "You are an assistant",
    "stream": true
  }'
When an upstream contains multiple providers, your request must include the fields required by every provider in that upstream. The gateway automatically transforms the request for the selected provider.
For example, when load balancing between OpenAI and Anthropic:
{
  "model": "gpt-4",
  "messages": [
    { "role": "user", "content": "Hello!" }
  ],
  "max_tokens": 1020,
  "system": "You are an assistant"
}
The fields in this request serve different purposes:
model and messages: OpenAI format
system: Anthropic system prompt
max_tokens: Common field for both providers
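For clients written in Python, the combined payload might be assembled as in the sketch below. The helper names `build_payload` and `build_request` are hypothetical, and the URL, header names, and placeholder values are copied from the curl example in this guide rather than taken from any TrustGate client library.

```python
import json
import urllib.request

# Endpoint taken from the curl example; adjust for your deployment.
GATEWAY_URL = "http://localhost:8081/v1/chat/completions"


def build_payload(user_message, system_prompt, model="gpt-4",
                  max_tokens=1020, stream=False):
    """Return one request body that carries fields for both providers."""
    payload = {
        "model": model,                                            # OpenAI format
        "messages": [{"role": "user", "content": user_message}],  # OpenAI format
        "max_tokens": max_tokens,                                  # common to both
        "system": system_prompt,                                   # Anthropic system prompt
    }
    if stream:
        payload["stream"] = True
    return payload


def build_request(payload, api_key, jwt_token):
    """Wrap the payload in a POST request. Nothing is sent at this point."""
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-TG-API-Key": api_key,
            "Authorization": f"Bearer {jwt_token}",
        },
        method="POST",
    )
```

Passing the result of `build_request` to `urllib.request.urlopen` would send it to the gateway. Building the request separately keeps the payload easy to inspect before any traffic is sent.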
The gateway will:
Select a target based on the load balancing algorithm
Transform the request to match the selected provider’s format
Remove unnecessary fields for that provider
Add any required provider-specific headers
Use the default model for the selected provider if different from the request
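The steps above can be pictured with a small Python sketch. It is an illustration only, not TrustGate's implementation: the `PROVIDERS` table, the `transform` and `route` functions, the placeholder keys, and the Anthropic model name are all assumptions, and round-robin stands in for whichever load balancing algorithm is configured. The field sets and header names follow the public OpenAI and Anthropic request formats.

```python
import itertools

# Hypothetical provider table. Keys and the Anthropic model name are placeholders.
PROVIDERS = {
    "openai": {
        "fields": {"model", "messages", "max_tokens", "stream"},
        "models": {"gpt-4"},
        "default_model": "gpt-4",
        "headers": {"Authorization": "Bearer <openai-key>"},
    },
    "anthropic": {
        "fields": {"model", "messages", "max_tokens", "system", "stream"},
        "models": {"example-anthropic-model"},
        "default_model": "example-anthropic-model",
        "headers": {
            "x-api-key": "<anthropic-key>",
            "anthropic-version": "2023-06-01",
        },
    },
}

# Round-robin over provider names, used here as a stand-in algorithm.
_targets = itertools.cycle(PROVIDERS)


def transform(request, provider):
    """Return (body, headers) shaped for the given provider."""
    spec = PROVIDERS[provider]
    # Keep only the fields this provider accepts; the input is not modified.
    body = {k: v for k, v in request.items() if k in spec["fields"]}
    # Fall back to the provider's default model if the requested one is not its own.
    if body.get("model") not in spec["models"]:
        body["model"] = spec["default_model"]
    return body, dict(spec["headers"])


def route(request):
    """Pick the next target and transform the request for it."""
    provider = next(_targets)
    body, headers = transform(request, provider)
    return provider, body, headers
```

In this sketch, a request routed to the OpenAI target loses the `system` field, while the same request routed to the Anthropic target keeps it and has `model` swapped for that provider's default.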
You don’t need to handle the transformation yourself. Include all necessary fields in your request, and the gateway handles the rest based on the provider schemas.
For streaming requests, add "stream": true to enable streaming for all providers.