Configuration
Semantic Router v0.3 uses one canonical YAML contract across local CLI, dashboard, Helm, and the operator:
version:
listeners:
providers:
routing:
global:
The detailed background is in Unified Config Contract v0.3. This page is the practical guide for using the contract.
Canonical contract
- version: schema version. Use v0.3.
- listeners: router listener ports and timeouts.
- providers: deployment bindings and provider defaults.
- routing: routing semantics.
- global: sparse runtime overrides. If you omit a field here, the router's built-in default is used.
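The five sections above can be sanity-checked with a few lines of Python. This is a minimal sketch, not the router's validator: check_top_level is a hypothetical helper, it assumes the YAML has already been parsed into a dict, and the "absent" list is informational only, since this page does not say which sections are mandatory.

```python
# Hypothetical helper, not part of vllm-sr. Section names come from the contract above.
CANONICAL_SECTIONS = ("version", "listeners", "providers", "routing", "global")


def check_top_level(config: dict) -> dict:
    """Report top-level keys outside the contract and contract sections that are absent."""
    unknown = sorted(key for key in config if key not in CANONICAL_SECTIONS)
    absent = [section for section in CANONICAL_SECTIONS if section not in config]
    return {"unknown": unknown, "absent": absent}


example = {
    "version": "v0.3",
    "listeners": [{"name": "http-8899", "port": 8899}],
    "providers": {},
    "routing": {},
    "signals": {},  # legacy-style top-level key, reported as unknown
}
report = check_top_level(example)
print(report)
```

For real validation, use vllm-sr validate config.yaml; the sketch only illustrates the shape of the contract.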
Ownership by section
routing is the DSL-owned surface.
- routing.modelCards
- routing.modelCards[].loras
- routing.signals
- routing.projections for partitions plus derived routing outputs
- routing.decisions
providers owns deployment and default-selection metadata.
- defaults
- models
- providers.defaults holds default_model, reasoning_families, and default_reasoning_effort
- providers.models[*] holds provider_model_id, backend_refs, pricing, api_format, and external_model_ids
global owns router-wide runtime overrides.
- global.router groups router-engine control knobs such as config-source selection, route-cache, and model-selection defaults
- global.router.config_source selects whether runtime config comes from the canonical YAML file (file) or from in-process Kubernetes CRD reconciliation (kubernetes)
- global.services groups shared APIs and control-plane services such as response_api, router_replay, observability, authz, and ratelimit
- global.stores groups shared storage-backed services such as semantic_cache, memory, and vector_store
- global.integrations groups helper runtime integrations such as tools and looper
- global.model_catalog groups router-owned model assets such as embeddings, system models, external models, and model-backed modules
- global.model_catalog.embeddings.semantic.embedding_config.top_k limits how many ranked embedding rules are emitted for routing after scoring; the built-in default is 1
- global.model_catalog.modules groups capability modules such as prompt_guard, classifier, and hallucination_mitigation
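Because global is a sparse override layer, its effect can be pictured as a deep merge onto built-in defaults. The sketch below is illustrative only: apply_overrides is a hypothetical function, and BUILTIN_DEFAULTS is a stand-in table in which only top_k = 1 is stated on this page (treating file as the default config_source is an assumption).

```python
import copy

# Stand-in defaults. Only top_k = 1 is documented above;
# config_source = "file" as a default is an assumption.
BUILTIN_DEFAULTS = {
    "router": {"config_source": "file"},
    "model_catalog": {"embeddings": {"semantic": {"embedding_config": {"top_k": 1}}}},
}


def apply_overrides(defaults: dict, overrides: dict) -> dict:
    """Deep-merge sparse overrides onto defaults; omitted fields keep their default."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# A user config that overrides a single field under global:
user_global = {
    "model_catalog": {"embeddings": {"semantic": {"embedding_config": {"top_k": 3}}}}
}
effective = apply_overrides(BUILTIN_DEFAULTS, user_global)
print(effective["model_catalog"]["embeddings"]["semantic"]["embedding_config"]["top_k"])
print(effective["router"]["config_source"])
```

The override changes top_k and nothing else, which is the behavior the "sparse" wording describes.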
Canonical example
version: v0.3
listeners:
  - name: http-8899
    address: 0.0.0.0
    port: 8899
    timeout: 300s
providers:
  defaults:
    default_model: qwen3-8b
    reasoning_families:
      qwen3:
        type: chat_template_kwargs
        parameter: enable_thinking
    default_reasoning_effort: medium
  models:
    - name: qwen3-8b
      reasoning_family: qwen3
      provider_model_id: qwen3-8b
      backend_refs:
        - name: primary
          endpoint: host.docker.internal:8000
          protocol: http
          weight: 100
          api_key_env: OPENAI_API_KEY
routing:
  modelCards:
    - name: qwen3-8b
      modality: text
      capabilities: [chat, reasoning]
      loras:
        - name: math-adapter
          description: Adapter used for symbolic math and proof-style prompts.
  signals:
    keywords:
      - name: math_terms
        operator: OR
        keywords: ["algebra", "calculus"]
    structure:
      - name: many_questions
        feature:
          type: count
          source:
            type: regex
            pattern: '[?？]'
        predicate:
          gte: 3
    embeddings:
      - name: technical_support
        threshold: 0.75
        candidates: ["installation guide", "troubleshooting steps"]
      - name: account_management
        threshold: 0.72
        candidates: ["billing information", "subscription management"]
  projections:
    partitions:
      - name: support_intents
        semantics: exclusive
        temperature: 0.3
        members: [technical_support, account_management]
        default: technical_support
    scores:
      - name: request_difficulty
        method: weighted_sum
        inputs:
          - type: embedding
            name: technical_support
            weight: 0.18
            value_source: confidence
          - type: context
            name: long_context
            weight: 0.18
          - type: structure
            name: many_questions
            weight: 0.12
    mappings:
      - name: request_band
        source: request_difficulty
        method: threshold_bands
        outputs:
          - name: support_fast
            lt: 0.25
          - name: support_escalated
            gte: 0.25
  decisions:
    - name: support_route
      description: Route support requests that need an escalated answer
      priority: 100
      rules:
        operator: AND
        conditions:
          - type: embedding
            name: technical_support
          - type: projection
            name: support_escalated
      modelRefs:
        - model: qwen3-8b
          use_reasoning: true
          lora_name: math-adapter
global:
  router:
    config_source: file
  services:
    observability:
      metrics:
        enabled: true
For routing.signals.structure, feature.type: density now uses built-in multilingual text-unit normalization. The router counts each CJK character as one unit, counts contiguous runs of other letters and digits as one unit, and ignores punctuation, so the same density rule shape behaves consistently across English, Chinese, and mixed-script prompts without a separate normalize_by field.
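The text-unit rule described above can be approximated in Python as follows. This is a sketch under stated assumptions rather than the router's implementation: count_text_units and is_cjk are hypothetical names, and the Unicode ranges treated as CJK here are a rough guess, since this page does not list the exact character classes.

```python
def is_cjk(ch: str) -> bool:
    """Rough CJK test (assumed ranges): Han ideographs, kana, Hangul syllables."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
        or 0x3400 <= cp <= 0x4DBF   # CJK Extension A
        or 0x3040 <= cp <= 0x30FF   # Hiragana and Katakana
        or 0xAC00 <= cp <= 0xD7AF   # Hangul syllables
    )


def count_text_units(text: str) -> int:
    """Count one unit per CJK character and one per run of other letters/digits."""
    units = 0
    in_run = False  # True while inside a contiguous run of non-CJK letters/digits
    for ch in text:
        if is_cjk(ch):
            units += 1
            in_run = False
        elif ch.isalnum():
            if not in_run:
                units += 1
                in_run = True
        else:
            in_run = False  # whitespace and punctuation end a run and add nothing
    return units


for prompt in ["How do I reset it?", "你好，世界", "GPU 内存 8GB"]:
    print(prompt, count_text_units(prompt))
```

Under these assumptions an English word and a single Chinese character each count as one unit, which is why one rule shape can serve mixed-script prompts.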
Repository config assets
The repository now separates the exhaustive canonical reference config from reusable routing fragments:
- config/config.yaml: exhaustive canonical reference config
- config/signal/: reusable routing.signals fragments
- config/decision/: reusable routing.decisions rule-shape fragments
- config/algorithm/: reusable decision.algorithm snippets
- config/plugin/: reusable route-plugin snippets
config/decision/ is organized by boolean case shape: single/, and/, or/, not/, and composite/.
config/algorithm/ is organized by routing policy family: looper/ and selection/.
config/plugin/ is organized one plugin or reusable bundle per directory.
The repository enforces this fragment catalog in go test ./pkg/config/..., so routing-surface changes must update the config/ tree in the same change.
Latest tutorials follow the same taxonomy:
- tutorials/signal/overview plus tutorials/signal/heuristic/ and tutorials/signal/learned/ for config/signal/
- tutorials/decision/ for config/decision/
- tutorials/algorithm/ for config/algorithm/, with one page per algorithm
- tutorials/plugin/ for config/plugin/, with one page per plugin
- tutorials/global/ for sparse router-wide overrides under global:
Repo-owned runtime and harness assets now live outside config/:
- deploy/examples/runtime/semantic-cache/
- deploy/examples/runtime/response-api/
- deploy/examples/runtime/tools/
- e2e/config/
- deploy/local/envoy.yaml
Test-only ONNX binding assets now live under e2e/config/onnx-binding/.
Those directories are support assets, not the main user-facing config contract. For hand-authored config, start from config/config.yaml or the fragment directories above. In this repository, the exhaustive reference config points global.integrations.tools.tools_db_path at deploy/examples/runtime/tools/tools_db.json for local development.
config/config.yaml is not just a sample anymore. The repository enforces it as the exhaustive public-contract reference:
- go test ./pkg/config/... checks that it stays aligned to the canonical schema and routing surface catalog
- make agent-lint runs the same reference-config contract check at lint level, so config/schema drift is blocked before merge
- maintained deploy/ and e2e/ router config assets are checked against the same canonical contract, so repo-owned examples and harness profiles cannot drift back to legacy steady-state fields
Projection workflow
Use routing.projections when the raw signal catalog is not enough on its own:
- routing.signals defines reusable detectors.
- routing.projections.partitions resolves one winner inside an exclusive domain or embedding family.
- routing.projections.scores combines learned and heuristic signals into a weighted score.
- routing.projections.mappings turns that score into named routing bands.
- routing.decisions[*].rules.conditions[*] can reference those bands with type: projection.
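The steps above can be traced end to end with the numbers from the canonical example. The sketch below is an illustration, not the router's scoring code. It assumes that a fired heuristic signal contributes a value of 1.0, that signals which did not fire contribute 0, and that an exclusive partition keeps the highest-confidence member. The partition temperature is left out because this page does not define how it is applied. All function names are hypothetical.

```python
def resolve_partition(confidences, members, default):
    """Exclusive partition: keep one winner among the matched members."""
    matched = {m: confidences[m] for m in members if m in confidences}
    if not matched:
        return default
    return max(matched, key=matched.get)


def weighted_sum(inputs, values):
    """Add weight * value per input; signals that did not fire contribute 0."""
    return sum(i["weight"] * values.get((i["type"], i["name"]), 0.0) for i in inputs)


def threshold_band(score, outputs):
    """Return the first band whose lt/gte bounds contain the score, else None."""
    for out in outputs:
        below = "lt" not in out or score < out["lt"]
        above = "gte" not in out or score >= out["gte"]
        if below and above:
            return out["name"]
    return None


# Hypothetical signal results for one request.
confidences = {"technical_support": 0.9, "account_management": 0.4}
winner = resolve_partition(
    confidences, ["technical_support", "account_management"], "technical_support"
)

inputs = [  # routing.projections.scores[0].inputs from the canonical example
    {"type": "embedding", "name": "technical_support", "weight": 0.18},
    {"type": "context", "name": "long_context", "weight": 0.18},
    {"type": "structure", "name": "many_questions", "weight": 0.12},
]
outputs = [{"name": "support_fast", "lt": 0.25}, {"name": "support_escalated", "gte": 0.25}]

values = {("embedding", winner): confidences[winner], ("structure", "many_questions"): 1.0}
score = weighted_sum(inputs, values)
band = threshold_band(score, outputs)

# The decision's AND rule needs both the embedding signal and the projection band.
fired = {("embedding", winner), ("projection", band)}
conditions = [("embedding", "technical_support"), ("projection", "support_escalated")]
route_matches = all(c in fired for c in conditions)
print(winner, round(score, 3), band, route_matches)
```

With these inputs the score lands above the 0.25 boundary, so the request falls into the support_escalated band and the support_route decision matches.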
The dashboard mirrors the same contract:
- Config -> Projections edits partitions, scores, and mappings
- Config -> Decisions can reference mapping outputs with condition type projection
- DSL -> Visual manages PROJECTION partition, PROJECTION score, and PROJECTION mapping entities directly
For a focused tutorial, read Projections. For a maintained end-to-end example, use:
How to use it
Python CLI
Use the canonical YAML directly.
vllm-sr serve --config config.yaml
To migrate an older config first:
vllm-sr config migrate --config old-config.yaml
vllm-sr validate config.yaml
vllm-sr init was removed in v0.3. The steady-state file is config.yaml.
Inside this repository, the default exhaustive reference file is config/config.yaml.
Router local / YAML-first
For local Docker or direct router development, hand-author config.yaml in canonical form and validate it before serving:
vllm-sr validate config.yaml
vllm-sr serve --config config.yaml
If you only need to override a few runtime defaults, write those under global: and leave the rest unset.
Dashboard / onboarding
Use the dashboard when you want to import or edit the full canonical YAML directly.
- onboarding remote import accepts a complete version/listeners/providers/routing/global file
- the config page edits the same canonical contract
- the DSL editor can import the same YAML, but it only decompiles routing
- decision model refs can carry lora_name, and those names resolve against routing.modelCards[].loras
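The lora_name resolution mentioned in the last item can be checked before upload with a short script. This is a sketch, not the dashboard's or router's validation logic: unresolved_lora_refs is a hypothetical helper, and it assumes that a lora_name must be declared on the model card of the same model the decision references.

```python
def unresolved_lora_refs(routing: dict) -> list:
    """Return (decision, model, lora_name) triples whose adapter is not declared."""
    declared = {
        card["name"]: {lora["name"] for lora in card.get("loras", [])}
        for card in routing.get("modelCards", [])
    }
    problems = []
    for decision in routing.get("decisions", []):
        for ref in decision.get("modelRefs", []):
            lora = ref.get("lora_name")
            # Assumption: adapters are scoped to the referenced model's card.
            if lora is not None and lora not in declared.get(ref["model"], set()):
                problems.append((decision["name"], ref["model"], lora))
    return problems


routing = {
    "modelCards": [{"name": "qwen3-8b", "loras": [{"name": "math-adapter"}]}],
    "decisions": [
        {"name": "support_route",
         "modelRefs": [{"model": "qwen3-8b", "lora_name": "math-adapter"}]},
        {"name": "code_route",  # hypothetical decision with an undeclared adapter
         "modelRefs": [{"model": "qwen3-8b", "lora_name": "code-adapter"}]},
    ],
}
problems = unresolved_lora_refs(routing)
print(problems)
```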
Helm
Helm values now mirror the same canonical contract under config.
config:
  version: v0.3
  providers:
    defaults:
      default_model: qwen3-8b
    models:
      - name: qwen3-8b
        provider_model_id: qwen3-8b
        backend_refs:
          - name: primary
            endpoint: semantic-router-vllm.default.svc.cluster.local:8000
            protocol: http
  routing:
    modelCards:
      - name: qwen3-8b
Then install or upgrade normally:
helm upgrade --install semantic-router deploy/helm/semantic-router -f values.yaml
Operator
The operator keeps the same logical contract, but it wraps it inside the CRD:
- spec.config.providers
- spec.config.routing
- spec.config.global
spec.vllmEndpoints is still the Kubernetes-native backend discovery adapter. The controller projects that data into canonical providers.models[].backend_refs[] and routing.modelCards entries, including any declared loras, when it renders the router config.
See Kubernetes Operator.
DSL
DSL only owns the routing surface.
- Author MODEL, SIGNAL, and ROUTE
- Compile to a routing fragment
- Keep providers and global in YAML
The DSL compiler emits:
routing:
  modelCards:
  signals:
  decisions:
It does not emit listeners, providers, or global.
Import and migration
Onboarding remote import
The setup wizard can import a full canonical YAML file from a URL and apply the complete config, including providers, routing, and global.
DSL import
The DSL editor can import:
- a full router config YAML
- a routing-only YAML fragment
In both cases, only the routing section is decompiled into DSL.
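The split between what the DSL owns and what stays in YAML can be shown with a small sketch. split_for_dsl is a hypothetical helper that works on already-parsed YAML. It assumes a routing-only fragment still carries a top-level routing key, as the compiler output shown earlier does.

```python
def split_for_dsl(doc: dict):
    """Separate the DSL-owned routing section from the YAML-only remainder."""
    routing = doc.get("routing") or {}
    yaml_only = {key: value for key, value in doc.items() if key != "routing"}
    return routing, yaml_only


full_config = {
    "version": "v0.3",
    "providers": {"defaults": {"default_model": "qwen3-8b"}},
    "routing": {"modelCards": [{"name": "qwen3-8b"}], "signals": {}, "decisions": []},
    "global": {"router": {"config_source": "file"}},
}
fragment = {"routing": {"modelCards": [{"name": "qwen3-8b"}]}}

routing_full, rest_full = split_for_dsl(full_config)
routing_frag, rest_frag = split_for_dsl(fragment)
print(sorted(routing_full), sorted(rest_full), rest_frag)
```

In both cases the routing mapping is what a decompiler would see; providers, global, and the other sections stay in the YAML file.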
Migrate old configs
Use the CLI migration command for older flat or mixed configs:
vllm-sr config migrate --config old-config.yaml
This migrates legacy shapes such as:
- top-level signals, flat keyword_rules / categories / other signal blocks, and decisions
- top-level model_config
- top-level vllm_endpoints and provider_profiles
- providers.models[].endpoints
- inline access_key
into canonical providers/routing/global.
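To decide whether a file needs migration, the legacy shapes listed above can be detected with a short script. This is only a detection sketch and does not reproduce what vllm-sr config migrate does. legacy_markers and has_key are hypothetical helpers, and because this page does not say where inline access_key appears, the sketch searches for it anywhere in the document.

```python
LEGACY_TOP_LEVEL = (
    "signals", "keyword_rules", "categories", "decisions",
    "model_config", "vllm_endpoints", "provider_profiles",
)


def has_key(node, key) -> bool:
    """Recursively look for a mapping key anywhere in parsed YAML."""
    if isinstance(node, dict):
        return key in node or any(has_key(value, key) for value in node.values())
    if isinstance(node, list):
        return any(has_key(item, key) for item in node)
    return False


def legacy_markers(config: dict) -> list:
    """List the legacy shapes found in a parsed config."""
    found = [key for key in LEGACY_TOP_LEVEL if key in config]
    providers = config.get("providers")
    models = providers.get("models", []) if isinstance(providers, dict) else []
    for model in models:
        if isinstance(model, dict) and "endpoints" in model:
            found.append(f"providers.models[{model.get('name', '?')}].endpoints")
    if has_key(config, "access_key"):
        found.append("access_key")
    return found


old = {
    "vllm_endpoints": [{"name": "primary"}],
    "model_config": {"qwen3-8b": {}},
    "decisions": [],
    "providers": {"models": [{"name": "qwen3-8b", "endpoints": [{"access_key": "..."}]}]},
}
markers = legacy_markers(old)
print(markers)
```

An empty list suggests the file is already canonical, but vllm-sr validate remains the authoritative check.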
Import OpenClaw model providers
Use the CLI import command when you already have an openclaw.json with supported OpenAI-compatible provider endpoints and want VSR to take over model routing while rewriting OpenClaw to the first VSR listener:
vllm-sr config import --from openclaw --source openclaw.json --target config.yaml
When --source is omitted, the importer checks OPENCLAW_CONFIG_PATH, ./openclaw.json, and ~/.openclaw/openclaw.json in that order.
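The lookup order reads as a simple first-match search. The sketch below mirrors it in Python for illustration. resolve_openclaw_source is a hypothetical function, not the importer's code, and it assumes that a candidate which does not exist on disk is skipped in favor of the next one.

```python
import os
from pathlib import Path


def resolve_openclaw_source(explicit=None, env=None, cwd=None, home=None):
    """Return the --source path if given, else the first existing default candidate."""
    if explicit:
        return Path(explicit)
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)
    home = Path.home() if home is None else Path(home)

    candidates = []
    if env.get("OPENCLAW_CONFIG_PATH"):
        candidates.append(Path(env["OPENCLAW_CONFIG_PATH"]))  # 1. environment variable
    candidates.append(cwd / "openclaw.json")                   # 2. ./openclaw.json
    candidates.append(home / ".openclaw" / "openclaw.json")    # 3. ~/.openclaw/openclaw.json

    for candidate in candidates:
        # Assumption: missing files are skipped rather than treated as errors.
        if candidate.is_file():
            return candidate
    return None


print(resolve_openclaw_source())
```

The env, cwd, and home parameters exist only so the order can be exercised without touching the real environment.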
Quick guides by environment
Python CLI
- Write config.yaml in canonical form.
- Run vllm-sr validate config.yaml.
- Run vllm-sr serve --config config.yaml.
Router local
- Keep provider-wide defaults in providers.defaults and deployment bindings in providers.models[].backend_refs[].
- Keep routing semantics in routing.modelCards/signals/decisions.
- Put only runtime overrides you actually need under global.router/services/stores/integrations/model_catalog, and keep model-backed module settings under global.model_catalog.modules.
- Use global.router.config_source: kubernetes only when the in-process IntelligentPool / IntelligentRoute controller is the active source of truth. Leave it as file for normal local, CLI, dashboard, Helm, and operator-authored canonical YAML.
Helm
- Put the same canonical config under values.yaml -> config.
- Use helm upgrade --install ... -f values.yaml.
- Treat Helm as a deployment wrapper, not a second config schema.
Operator
- Put portable config under spec.config.
- Use spec.vllmEndpoints only when you want Kubernetes-native backend discovery.
- Expect the operator to render canonical router config from that adapter layer.
DSL
- Use DSL for routing.modelCards, routing.signals, and routing.decisions.
- Importing a full YAML file still works, but only routing is decompiled into DSL.
- Keep endpoints, API keys, listeners, and global in YAML.
- Reusable routing fragments now live under config/signal/, config/decision/, config/algorithm/, and config/plugin/.