User Guide

- Backends
  - llama.cpp
  - transformers
  - vLLM
  - SGLang
  - MLX
- Client API
  - LLM
  - Embedding
  - Image
  - Audio
  - Rerank
- Simple OAuth2 System (experimental)
  - Permissions
  - Startup
  - Usage
  - Http Status Code
  - Note
- Model Launching Instructions
  - Replica
  - Set Environment Variables
  - Configuring Model Virtual Environment
- Metrics
  - Supervisor Metrics
  - Worker Metrics
- Distributed Inference
  - Supported Engines
  - Usage
- Continuous Batching
  - Usage
  - Abort your request
  - Note
- Xavier: Share KV Cache between vllm replicas
  - Usage
  - Limitations