What is TrustGate?
TrustGate is a specialized gateway, akin to a traditional API Gateway, but purpose-built for managing AI agents and LLM workloads. While traditional API Gateways act as intermediaries between clients and backend services, handling key operational tasks, AI Gateways take on those same responsibilities with a focus on AI-specific needs.
Key Features of an AI Gateway:
- Routing: Directs requests to the appropriate service or model.
- Load Balancing: Ensures even traffic distribution across models or services.
- Fallback Mechanisms: Provides a way to fall back to a different model or service if the primary one is unavailable.
- Rate Limiting: Controls request volumes to prevent overloading, at both the request and token level.
- Security: Protects against injection attacks, data leaks, and other security threats.
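The fallback behavior described above can be sketched as a simple loop over an ordered list of upstreams. This is an illustrative sketch, not TrustGate's actual implementation; the `send` callable stands in for whatever transport the gateway uses to reach a backend.

```python
def route_with_fallback(request, upstreams, send):
    """Try each upstream in order, falling back when one is unavailable.

    `send(upstream, request)` forwards the request and raises ConnectionError
    on failure (a stand-in for a real HTTP client call).
    """
    last_error = None
    for upstream in upstreams:
        try:
            return send(upstream, request)
        except ConnectionError as exc:
            # Primary (or earlier fallback) is unavailable: try the next one.
            last_error = exc
    raise RuntimeError(f"all upstreams failed: {last_error}")
```

In a real gateway this loop would also respect health-check state and retry budgets, but the core idea is the same: ordered candidates with graceful degradation.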
Architecture Overview
The AI Gateway architecture consists of several key components working together to provide secure and efficient AI model access.
Key Components:
- Control Plane
  - Admin API: Manages configuration, gateways, upstreams, services, rules, and API keys
  - Config Store: Maintains gateway settings and routing rules
- Data Plane
  - Proxy API: Handles real-time request processing
  - Plugin System: Executes custom logic and transformations
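The data-plane plugin system can be pictured as a chain of request transformers. The interface below is hypothetical (TrustGate's real plugin API may differ); it only illustrates the idea of running custom logic in order before a request is proxied upstream.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Request:
    path: str
    headers: dict = field(default_factory=dict)
    body: str = ""

# A plugin here is just a callable that inspects or transforms the request.
Plugin = Callable[[Request], Request]

def run_plugin_chain(request: Request, plugins: list[Plugin]) -> Request:
    """Data-plane step: apply each plugin in order before proxying upstream."""
    for plugin in plugins:
        request = plugin(request)
    return request

def add_trace_header(req: Request) -> Request:
    """Example plugin: tag every request with a tracing header."""
    req.headers["X-Trace"] = "on"
    return req
```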
Why use TrustGate?
TrustGate provides specialized security and management features designed specifically for AI workloads:
Advanced Security Protection
- Injection Protection: Detects and blocks various injection attacks, including SQL, NoSQL, and command injection
- Code Sanitation: Prevents malicious code execution across multiple programming languages
- Request Size Limiting: Protects against oversized requests and DoS attacks
- Content Filtering: Blocks harmful or inappropriate content
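A gateway-side injection check can be as simple as matching known-bad patterns in the request payload. The patterns below are deliberately naive and purely illustrative; a production gateway (TrustGate included) would use far more robust detection than a handful of regexes.

```python
import re

# Illustrative patterns only -- real detection needs much broader coverage.
INJECTION_PATTERNS = [
    re.compile(r"(?i)\bunion\s+select\b"),    # SQL injection
    re.compile(r"(?i)\$where\b"),             # NoSQL (MongoDB) injection
    re.compile(r";\s*(rm|cat|curl|wget)\b"),  # shell command injection
]

def looks_malicious(payload: str) -> bool:
    """Return True if the payload matches any known injection pattern."""
    return any(p.search(payload) for p in INJECTION_PATTERNS)
```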
Sophisticated Rate Limiting
- Request-based Rate Limiting: Controls request volumes at IP, user, and global levels
- Token Bucket Algorithm: Manages token consumption for AI model interactions
- Burst Handling: Allows temporary traffic spikes while maintaining overall limits
- Distributed Rate Limiting: Uses Redis for reliable rate limiting across multiple instances
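The token bucket algorithm mentioned above refills capacity at a steady rate while allowing short bursts up to a fixed size. Here is a minimal single-instance, in-memory sketch; the distributed variant would keep these counters in Redis so that all gateway instances share one view of consumption.

```python
class TokenBucket:
    """Token-bucket rate limiter: steady refill with a bounded burst.

    `capacity` bounds the burst size; `refill_rate` is tokens added per
    second. Time is passed in explicitly to keep the sketch deterministic.
    """
    def __init__(self, capacity: float, refill_rate: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = capacity
        self.last = now

    def allow(self, cost: float, now: float) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

The same structure works for both request-based limits (cost = 1 per request) and token-based limits (cost = tokens consumed by the model call).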
Intelligent Load Balancing
- Multiple Load Balancing Strategies
- Automatic Failover
- Health Checking
- Dynamic Backend Selection
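To make these ideas concrete, here is one strategy (round-robin) combined with health-check state, as an illustrative sketch rather than TrustGate's actual balancer.

```python
import itertools

class RoundRobinBalancer:
    """Round-robin backend selection that skips unhealthy backends."""
    def __init__(self, backends: list[str]):
        self.backends = backends
        self.healthy = set(backends)       # updated by health checks
        self._cycle = itertools.cycle(backends)

    def mark_down(self, backend: str):
        self.healthy.discard(backend)

    def mark_up(self, backend: str):
        self.healthy.add(backend)

    def pick(self) -> str:
        # Skip unhealthy backends; give up after one full pass.
        for _ in range(len(self.backends)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")
```

Failover falls out naturally: when a health check marks a backend down, `pick()` routes around it until it recovers.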
AI-specific Features
- Prompt Injection Protection
- Token Usage Control
- Model-specific Routing
- Response Validation
- Content Moderation
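Token usage control, for example, amounts to enforcing a per-key budget on model-token consumption. The sketch below is a hypothetical in-memory version; a deployed gateway would persist these counters (e.g. in Redis) and reset them per billing window.

```python
from collections import defaultdict

class TokenUsageTracker:
    """Per-API-key token budget (illustrative; counters are in-memory)."""
    def __init__(self, budget: int):
        self.budget = budget
        self.used = defaultdict(int)

    def record(self, api_key: str, tokens: int) -> bool:
        """Record usage; return False if the call would exceed the budget."""
        if self.used[api_key] + tokens > self.budget:
            return False
        self.used[api_key] += tokens
        return True
```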
Enterprise Integration
- Kubernetes Support
- Docker Deployment
- High Availability Setup
- Horizontal Scaling
- Configuration Management
Together, these capabilities let you:
- Protect against security threats
- Control resource usage
- Monitor performance
- Manage access
- Scale effectively
- Ensure compliance