Welcome to Xinference!

Xorbits Inference (Xinference) is an open-source platform that streamlines the operation and integration of a wide array of AI models. With Xinference, you can run inference with open-source LLMs, embedding models, and multimodal models, either in the cloud or on your own premises, and build robust AI-driven applications on top of them.

Developing Real-world AI Applications with Xinference

from xinference.client import Client

client = Client("http://localhost:9997")
model = client.get_model("MODEL_UID")

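# Chat to LLM
model.chat(
   prompt="What is the largest animal?",
   system_prompt="You are a helpful assistant",
   generate_config={"max_tokens": 1024}
)

# Chat to VL model
# chat_history carries the multimodal message; the keyword name may differ
# between Xinference versions, so check it against your installed client.
model.chat(
   chat_history=[
      {
         "role": "user",
         "content": [
            {"type": "text", "text": "What’s in this image?"},
            {
               "type": "image_url",
               "image_url": {
                  "url": "http://i.epochtimes.com/assets/uploads/2020/07/shutterstock_675595789-600x400.jpg",
               },
            },
         ],
      }
   ],
   generate_config={"max_tokens": 1024}
)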
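
The nested message passed to a VL model is easy to mistype by hand. As a minimal sketch, the helper below builds one user message with a text part and an image part, in the same shape as the example above. The name `build_image_message` and the `example.com` URL are hypothetical and not part of Xinference; the helper only assembles plain dictionaries and does not contact a server.

```python
from typing import Any, Dict, List


def build_image_message(text: str, image_url: str) -> Dict[str, Any]:
    """Return one user message holding a text part and an image_url part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


# Hypothetical usage: a one-message list that could be handed to the
# VL chat call in place of the hand-written literal.
chat_history: List[Dict[str, Any]] = [
    build_image_message("What's in this image?", "http://example.com/cat.jpg")
]
```

Keeping the message construction in one function means the text and image URL are the only values that change between requests.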

