Xinference

  • Getting Started
  • Models
  • User Guide
  • Examples
  • API Reference
  • Development
  • GitHub
  • Slack
  • Twitter

User Guide

  • Backends
    • llama.cpp
    • transformers
    • vLLM
    • SGLang
    • MLX
  • Client API
    • LLM
    • Embedding
    • Image
    • Audio
    • Rerank
  • Simple OAuth2 System (experimental)
    • Permissions
    • Startup
    • Usage
    • HTTP Status Code
    • Note
  • Metrics
    • Supervisor Metrics
    • Worker Metrics
  • Continuous Batching
    • Usage
    • Abort your request
    • Note


© Copyright 2023, Xorbits Inc.
