Services
In AI Gateway, a service is an entity that represents an upstream API or AI model endpoint. Examples include an OpenAI API endpoint, an Anthropic Claude service, or your own custom AI model service.
The main attribute of a service is its upstream configuration, which defines where and how AI Gateway forwards requests.
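As a rough illustration of the idea, a service can be pictured as a name plus an upstream configuration. The sketch below is not AI Gateway's actual schema: the `Service` and `Upstream` classes, their field names, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Upstream:
    """Hypothetical upstream configuration: where and how to forward requests."""
    base_url: str                      # where requests are forwarded
    timeout_seconds: float = 30.0      # how long to wait for the upstream
    headers: dict[str, str] = field(default_factory=dict)  # extra headers to send


@dataclass
class Service:
    """Hypothetical service entity: a name plus its upstream configuration."""
    name: str
    upstream: Upstream

    def target_url(self, path: str) -> str:
        # Join the upstream base URL and the request path with exactly one slash.
        return self.upstream.base_url.rstrip("/") + "/" + path.lstrip("/")


# Example values are illustrative only.
openai_service = Service(
    name="openai-chat",
    upstream=Upstream(base_url="https://api.openai.com/v1", timeout_seconds=60.0),
)
print(openai_service.target_url("/chat/completions"))
```

Keeping the upstream details in one place means a change of endpoint or timeout touches only the service definition.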
Service and Rules Interaction
Services, in conjunction with rules, let you expose your AI models and services to clients through AI Gateway. The gateway uses rules to abstract the service from clients. Because the client always calls the rule, changes to the service (such as switching AI model providers or versions) don't affect how clients make the call. Rules also let multiple clients share the same service, with different policies applied depending on which rule is used.
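The indirection described above can be sketched as a lookup from rule to service. This is a minimal model under stated assumptions, not the gateway's implementation: the `Rule` class, the rule paths, the service names, and the policy labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical rule: the path clients call, mapped to a service and a policy."""
    path: str
    service_name: str
    policy: str  # a label standing in for whatever policies the rule applies


# Hypothetical services, keyed by name, mapped to their upstream base URLs.
services = {
    "openai-chat": "https://api.openai.com/v1",
    "claude-chat": "https://api.anthropic.com/v1",
}

# Two rules share one service but carry different policies.
rules = {
    "/chat/free": Rule("/chat/free", "openai-chat", policy="10 requests/minute"),
    "/chat/pro": Rule("/chat/pro", "openai-chat", policy="1000 requests/minute"),
}


def resolve(path: str) -> tuple[str, str]:
    """Return the upstream URL and policy for the rule a client called."""
    rule = rules[path]
    return services[rule.service_name], rule.policy


before = resolve("/chat/free")

# Switch providers behind the rules; the paths clients call stay the same.
for rule in rules.values():
    rule.service_name = "claude-chat"

after = resolve("/chat/free")
print(before, after)
```

The client-facing path `/chat/free` is unchanged by the provider switch; only the upstream it resolves to differs.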
Service Types
AI Gateway supports different types of services that help you integrate and manage various AI model endpoints. The two main service types are:
- Upstream Services
  These services connect directly to your backend AI models and infrastructure. They provide:
  - Direct connection to backend AI models, so you can integrate your own hosted models and AI services
  - Load balancing across multiple target endpoints to optimize performance and resource utilization
  - Built-in health checks to monitor service availability and performance
  - Automatic failover to maintain high availability when a target fails
- Proxy Services
  These services act as intermediaries to external AI providers and add management capabilities:
  - Proxying of requests to external AI providers such as OpenAI and Anthropic
  - Authentication and rate limiting to control access and usage
  - Request and response transformation to modify payloads as needed
  - Response caching, where possible, to improve performance and reduce costs
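To make the upstream service behavior more concrete, the sketch below shows one common way load balancing, health status, and failover fit together: round-robin selection that skips targets marked unhealthy. The document does not say which algorithm AI Gateway uses, so round-robin is an assumption, and the `Target` and `UpstreamPool` names and the target URLs are hypothetical.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Target:
    """Hypothetical backend endpoint with a health flag set by health checks."""
    url: str
    healthy: bool = True


class UpstreamPool:
    """Round-robin selection over targets, skipping unhealthy ones (failover)."""

    def __init__(self, targets):
        self.targets = list(targets)
        self._order = cycle(range(len(self.targets)))

    def mark(self, url: str, healthy: bool) -> None:
        # In a real system a periodic health check would call this.
        for target in self.targets:
            if target.url == url:
                target.healthy = healthy

    def next_target(self) -> Target:
        # Try each target at most once per call before giving up.
        for _ in range(len(self.targets)):
            target = self.targets[next(self._order)]
            if target.healthy:
                return target
        raise RuntimeError("no healthy targets available")


pool = UpstreamPool([
    Target("http://model-a.internal:8000"),
    Target("http://model-b.internal:8000"),
])
first = pool.next_target().url    # model-a
second = pool.next_target().url   # model-b

# Simulate a failed health check: traffic now fails over to model-b only.
pool.mark("http://model-a.internal:8000", healthy=False)
third = pool.next_target().url    # model-b
```

Once the failed target is marked healthy again, it rejoins the rotation without any change on the client side.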
Best Practices
- Service Design
  - Choose appropriate service types
  - Plan service boundaries
  - Consider service dependencies
  - Design for resilience
- Service Organization
  - Group related services
  - Use clear naming conventions
  - Document service relationships
  - Maintain service hierarchy
- Security
  - Configure appropriate timeouts
  - Set up retry policies
  - Enable health checks
  - Implement rate limiting
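As one illustration of what a retry policy might look like, the sketch below retries a failing call with exponential backoff and a bounded number of attempts. It is a generic pattern, not AI Gateway's own retry mechanism: `call_with_retries`, its parameters, and the `flaky` stand-in for an upstream call are all hypothetical.

```python
import time


def call_with_retries(send, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call send(); on timeout or connection errors, retry with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return send()
        except (TimeoutError, ConnectionError):
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)


# Stand-in for an upstream call that times out twice, then succeeds.
attempts = []


def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise TimeoutError("upstream timed out")
    return "ok"


delays = []  # record the delays instead of actually sleeping
result = call_with_retries(flaky, sleep=delays.append)
print(result, delays)
```

Bounding the attempts and growing the delay keeps retries from piling extra load onto an upstream that is already struggling.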