algorithm (string)
Specifies the load-balancing algorithm used to distribute requests among multiple targets. Common values might include round_robin, least_conn, or other supported algorithms.
health_checks (object)
Defines active or passive health checks for monitoring the availability of upstream targets.
(string)
A human-friendly name for this upstream configuration, often used for clarity or logging.
tags (array of strings)
An optional array of metadata tags. Useful for organizing, filtering, or searching among multiple upstream configurations.
targets (array of objects)
Lists one or more backend endpoints (targets) that the gateway routes requests to. Each target contains its own configuration and credentials.
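To make the top-level fields above concrete, here is a minimal sketch of an upstream definition built as a Python dictionary and serialized to JSON. The keys algorithm, health_checks, tags, and targets are named in this reference; the name key, the keys inside each target, and all values are assumptions chosen for illustration, not the gateway's confirmed schema.

```python
import json

# Hypothetical upstream definition. "algorithm", "health_checks", "tags",
# and "targets" are named in this reference; "name" and the keys inside
# each target ("host", "port", "protocol") are assumed for illustration.
upstream = {
    "name": "example-upstream",
    "algorithm": "round_robin",
    "tags": ["production", "api"],
    "health_checks": {},  # fields are described in the health_checks section
    "targets": [
        {"host": "api.example.com", "port": 443, "protocol": "https"},
        {"host": "backup.example.com", "port": 443, "protocol": "https"},
    ],
}

# Serialize the definition as it might be sent to a configuration API.
payload = json.dumps(upstream, indent=2)
print(payload)
```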
The health_checks object configures how the gateway monitors the health of its upstream targets.
(object)
Key-value pairs representing custom headers to send during health check probes.
(number)
The interval (in seconds) between consecutive health check probes.
(boolean)
When set to true, enables passive health checks in addition to active checks. Passive checks rely on real request/response interactions and mark a target unhealthy when errors are detected.
(string)
The URL path used by active health checks to test the availability of upstream targets (for example, /health).
(number)
The number of consecutive failures required to mark a target as unhealthy (or successes required to restore it to a healthy state).
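The threshold behavior described above can be sketched as a small state tracker: a target changes state only after the configured number of consecutive results that contradict its current state. The class name TargetHealth and its attributes are hypothetical, and the gateway's actual implementation may differ.

```python
from dataclasses import dataclass


@dataclass
class TargetHealth:
    """Tracks one target's health using a consecutive-result threshold."""

    threshold: int        # consecutive contradicting results needed to flip state
    healthy: bool = True  # targets are assumed healthy until proven otherwise
    streak: int = 0       # consecutive results that contradict the current state

    def record(self, success: bool) -> bool:
        """Record one probe result and return the resulting health state."""
        if success == self.healthy:
            # The result agrees with the current state, so the streak resets.
            self.streak = 0
        else:
            self.streak += 1
            if self.streak >= self.threshold:
                self.healthy = success
                self.streak = 0
        return self.healthy
```

With a threshold of 3, for example, three failed probes in a row mark the target unhealthy, and three successful probes in a row restore it; a single success in between resets the failure count.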
Each entry in the targets array represents an individual endpoint or provider.
You can have multiple targets to enable load balancing or to support different services under a single upstream definition.
(string)
The hostname or IP address of the target server (for example, api.example.com).
(number)
The TCP port on which the target is listening.
(string)
The protocol used to communicate with the target (e.g., http, https, grpc).
(string)
The path appended to the base URL when routing requests to this target.
(string)
An optional identifier that denotes the provider type or environment (e.g., aws, azure, gcp, or a custom provider name).
priority (number)
A numerical priority, which can influence how load balancing is handled, depending on the chosen algorithm.
weight (number)
For load balancing methods that support weights, this value affects how many requests are sent to the target relative to others.
(object)
Additional headers to include in requests sent to this specific target.
tags (array of strings)
Optional array of tags for metadata or organizational purposes.
(string)
If applicable, specifies a default model (e.g., for AI or ML deployments) used when routing requests to this target.
(array of strings)
Lists model identifiers supported by this target (e.g., AI models or deployment versions).
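As an illustration of how a target's weight can shape traffic, the following sketch implements smooth weighted round-robin, one common technique for weight-aware selection. It is not confirmed to be the algorithm this gateway uses. The weight key comes from this reference; the host key and the function name are assumptions.

```python
from collections import Counter
from itertools import islice


def weighted_round_robin(targets):
    """Yield target hosts forever, in proportion to their positive weights."""
    current = {t["host"]: 0 for t in targets}
    total = sum(t["weight"] for t in targets)
    while True:
        # Every target gains its weight each round ...
        for t in targets:
            current[t["host"]] += t["weight"]
        # ... the target with the highest running score is selected ...
        chosen = max(targets, key=lambda t: current[t["host"]])
        # ... and pays back the total weight so that others get their turn.
        current[chosen["host"]] -= total
        yield chosen["host"]


# Hypothetical targets: "primary" should receive three of every four requests.
example_targets = [
    {"host": "primary.example.com", "weight": 3},
    {"host": "secondary.example.com", "weight": 1},
]
picks = Counter(islice(weighted_round_robin(example_targets), 4))
print(picks)
```

Over any window whose length equals the sum of the weights, each target is selected exactly as many times as its weight, and selections are interleaved rather than sent in bursts.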
The credentials object within each target controls how authentication and related parameters are handled.
(boolean)
Indicates whether these credentials can be overridden by other configuration scopes.
(string)
AWS access key ID for authentication (if using AWS-based services).
(string)
AWS secret access key for authentication (if using AWS-based services).
(string)
Azure client ID for authentication.
(string)
Azure client secret key.
(string)
Azure tenant ID for multi-tenant environments.
(boolean)
If true, the gateway uses Azure’s managed identity instead of explicit credentials.
(string)
A JSON string with Google Cloud service account credentials.
(boolean)
If true, the gateway uses GCP service accounts for authentication instead of explicit credentials.
(string)
The HTTP header name that will carry credentials or tokens.
(string)
The HTTP header value (e.g., a token) to be used.
(string)
Indicates where the authorization parameter is placed in the request (e.g., query, body, or path).
(string)
Name of the parameter used for authentication.
(string)
Value of the parameter used for authentication.
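The header-based and parameter-based credential fields above could be applied to an outgoing request roughly as follows. This is a sketch under stated assumptions: the key names header_name, header_value, param_location, param_name, and param_value are hypothetical stand-ins for the fields described, and only the query location is handled here.

```python
from urllib.parse import urlencode


def apply_credentials(url, headers, creds):
    """Return (url, headers) with header or query credentials attached.

    All keys read from `creds` are hypothetical names for the fields
    described in this section. Body and path placement are not covered.
    """
    headers = dict(headers)  # copy so the caller's mapping is not mutated

    # Header-based credentials: set the named header to the given value.
    if creds.get("header_name") and creds.get("header_value"):
        headers[creds["header_name"]] = creds["header_value"]

    # Parameter-based credentials placed in the query string.
    if creds.get("param_location") == "query" and creds.get("param_name"):
        query = urlencode({creds["param_name"]: creds.get("param_value", "")})
        separator = "&" if "?" in url else "?"
        url = url + separator + query

    return url, headers
```

A header credential leaves the URL untouched and adds one header, while a query credential appends a URL-encoded parameter and leaves the headers unchanged.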
The algorithm and weight/priority fields define how traffic is distributed across targets.
Configuring health_checks ensures that the gateway routes requests only to healthy targets.
Use tags to organize or categorize upstream objects and targets, which is especially helpful in large-scale environments.