
Journal

Release notes, field reports, and research commentary from the vLLM Semantic Router project.

3 posts tagged with "hardware"

View All Tags

Deploying vLLM Semantic Router on AMD Developer Cloud

· 11 min read
Xunzhuo Liu
Intelligent Routing @vLLM

AMD Developer Cloud and vLLM Semantic Router overview

Running vLLM Semantic Router on AMD Developer Cloud is not just about bringing up one more inference endpoint. It turns a single endpoint into a routed multi-tier system that can classify requests, choose a semantic lane, and make replay and Insights immediately useful.

This post walks through the practical path: start the ROCm backend on an AMD Developer Cloud instance, install vLLM-SR, import the reference profile, and validate the deployment end to end.

AMD × vLLM Semantic Router: Building the System Intelligence Together

· 1 min read
Xunzhuo Liu
Intelligent Routing @vLLM

Over the past several months, AMD and the vLLM SR Team have been collaborating to bring vLLM Semantic Router (VSR) to AMD GPUs—not just as a performance optimization, but as a fundamental shift in how we think about AI system architecture.

AMD has been a long-term technology partner for the vLLM community, from accelerating the vLLM inference engine on AMD GPUs and ROCm™ Software to now co-building the next layer of the AI stack: intelligent routing and governance for Mixture-of-Models (MoM) systems.

Synced from the official vLLM Blog: AMD × vLLM Semantic Router: Building the System Intelligence Together
