registry_ids; for explicit control, define a
named pool with lb_config.
Strategies
| Algorithm | How it picks |
|---|---|
| round-robin | Rotates through registries in order. |
| weighted-round-robin | Interlaces picks by per-registry weight (1..100). |
| least-connections | Picks the registry with the fewest in-flight requests. |
| random | Uniform random pick. |
| semantic | Embeds the prompt and routes to the registry whose description is the closest cosine match; requires an embedding config. |
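
To make the non-semantic rows concrete, here is a minimal Python sketch of how each pick could work. It is an illustration, not the load balancer's actual implementation: the `Registry` class and all function names are hypothetical, and the weighted variant uses the "smooth" weighted round-robin scheme, which is only one common way to interlace picks by weight.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass
class Registry:
    # Hypothetical stand-in for an attached LLM registry.
    name: str
    weight: int = 1      # 1..100, only used by weighted round-robin
    in_flight: int = 0   # number of requests currently in flight


def round_robin(registries):
    """Yield registries in order, wrapping around forever."""
    return itertools.cycle(registries)


def weighted_round_robin(registries):
    """Smooth weighted round-robin: picks are interleaved in proportion to weight."""
    current = {r.name: 0 for r in registries}
    total = sum(r.weight for r in registries)
    while True:
        # Every registry gains its weight; the current leader is picked
        # and then pays back the total, which spreads its picks out.
        for r in registries:
            current[r.name] += r.weight
        best = max(registries, key=lambda r: current[r.name])
        current[best.name] -= total
        yield best


def least_connections(registries):
    """Pick the registry with the fewest in-flight requests."""
    return min(registries, key=lambda r: r.in_flight)


def random_pick(registries, rng=random):
    """Uniform random pick."""
    return rng.choice(registries)
```

With weights 3 and 1, the weighted generator returns the first registry three times for every one pick of the second, without sending all three picks back to back at the end of a cycle.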
Weights
For weighted round-robin, each registry carries a weight from 1 to 100, set when you attach it to the consumer.
Named pools (lb_config)
For more than a flat list, define a pool:
- pool_alias is referenced from a request via the pool:<alias> model reference (see Model resolution).
- members scope which models each registry serves within the pool.
- The semantic algorithm additionally needs an embedding_config.
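
The closest-cosine-match idea behind the semantic algorithm can be sketched in a few lines of Python. This is a hedged illustration only: it assumes the prompt and each registry description have already been embedded into vectors (in practice by whatever model the embedding config points at), and `cosine` and `pick_semantic` are hypothetical names, not part of the product.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pick_semantic(prompt_vec, description_vecs):
    """Return the registry name whose description vector is closest to the prompt."""
    return max(description_vecs, key=lambda name: cosine(prompt_vec, description_vecs[name]))


# Toy 2-dimensional vectors standing in for real embeddings.
description_vecs = {
    "code-models": [0.9, 0.1],
    "chat-models": [0.1, 0.9],
}
chosen = pick_semantic([0.8, 0.2], description_vecs)
```

Here the prompt vector points mostly along the first axis, so `chosen` ends up as the registry whose description vector does the same.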
Health checks
LLM registries can define health_checks; unhealthy registries are skipped by the load
balancer until they recover, so traffic shifts to healthy upstreams automatically.
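
As a rough sketch of the skipping behaviour, the Python below advances a round-robin rotation past registries currently marked unhealthy. The `pick_healthy` function and the `health` dictionary are hypothetical; how health state is actually tracked, and what happens when no registry is healthy, are not described here, so returning `None` in that case is purely an assumption.

```python
import itertools


def pick_healthy(rotation, health, size):
    """Advance the rotation until a healthy registry is found, skipping unhealthy ones.

    `size` bounds the search to one full lap so the loop cannot spin forever.
    """
    for name in itertools.islice(rotation, size):
        if health.get(name, False):
            return name
    return None  # assumption: behaviour with zero healthy registries is unspecified


names = ["a", "b", "c"]
rotation = itertools.cycle(names)
health = {"a": True, "b": False, "c": True}

# While "b" is unhealthy, traffic alternates between "a" and "c".
picks = [pick_healthy(rotation, health, len(names)) for _ in range(4)]

# Once "b" recovers, it rejoins the rotation with no other change.
health["b"] = True
after_recovery = [pick_healthy(rotation, health, len(names)) for _ in range(2)]
```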