Xinference

Getting Started

  • Installation
    • Transformers Backend
    • vLLM Backend
    • Llama.cpp Backend
    • SGLang Backend
    • MLX Backend
    • Other Platforms
  • Using Xinference
    • Run Xinference Locally
    • Deploy Xinference In a Cluster
    • Using Xinference With Docker
    • What’s Next?
  • Logging in Xinference
    • Configure Log Level
    • Log Files
  • Xinference Docker Image
    • Prerequisites
    • Docker Image
    • Dockerfile for custom build
    • Image usage
    • Mount your volume for loading and saving models
  • Xinference on Kubernetes
    • Helm Support
    • KubeBlocks Support
  • Troubleshooting
    • No huggingface repo access
    • Incompatibility Between NVIDIA Driver and PyTorch Version
    • Xinference service cannot be accessed from external systems through <IP>:9997
    • Launching a built-in model takes a long time, and sometimes the model fails to download
    • When using the official Docker image, RayWorkerVllm died due to OOM, causing the model to fail to load
    • Missing model_engine parameter when launching LLM models
    • Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp-a34b3233.so.1 library.
  • Environment Variables
    • XINFERENCE_ENDPOINT
    • XINFERENCE_MODEL_SRC
    • XINFERENCE_HOME
    • XINFERENCE_HEALTH_CHECK_ATTEMPTS
    • XINFERENCE_HEALTH_CHECK_INTERVAL
    • XINFERENCE_DISABLE_HEALTH_CHECK
    • XINFERENCE_DISABLE_METRICS

© Copyright 2023, Xorbits Inc.
