User Guide

- Backends
  - llama.cpp
  - transformers
  - vLLM
  - SGLang
  - MLX
- Client API
  - LLM
  - Embedding
  - Image
  - Audio
  - Rerank
- Simple OAuth2 System (experimental)
  - Permissions
  - Startup
  - Usage
  - Http Status Code
  - Note
- Metrics
  - Supervisor Metrics
  - Worker Metrics
- Distributed Inference
  - Supported Engines
  - Usage
- Continuous Batching
  - Usage
  - Abort your request
  - Note
- Xavier: Share KV Cache between vllm replicas
  - Usage
  - Limitations