Rerank

Learn how to use rerank models in Xinference.

Introduction

Given a query and a list of documents, Rerank orders the documents from most to least semantically relevant to the query. In Xinference, rerank models are invoked through the Rerank endpoint.

Quickstart

We can try out the Rerank API via cURL, the OpenAI client, or Xinference's Python client:

curl -X 'POST' \
  'http://<XINFERENCE_HOST>:<XINFERENCE_PORT>/v1/rerank' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "<MODEL_UID>",
    "query": "A man is eating pasta.",
    "documents": [
        "A man is eating food.",
        "A man is eating a piece of bread.",
        "The girl is carrying a baby.",
        "A man is riding a horse.",
        "A woman is playing violin."
    ]
  }'
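
The same request can also be sent from Python. The following is a minimal sketch that uses only the standard library and mirrors the cURL call above (same path, headers, and JSON body). The helper names `build_rerank_request` and `rerank` are hypothetical, and the host and port in the usage section (`127.0.0.1:9997`) are assumptions; substitute your own values for `<XINFERENCE_HOST>`, `<XINFERENCE_PORT>`, and `<MODEL_UID>`.

```python
import json
import urllib.request


def build_rerank_request(base_url, model_uid, query, documents):
    """Build a POST request for /v1/rerank with the same body as the cURL call.

    Hypothetical helper for illustration; not part of any Xinference package.
    """
    payload = {"model": model_uid, "query": query, "documents": documents}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/rerank",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def rerank(base_url, model_uid, query, documents, timeout=30):
    """Send the rerank request and return the decoded JSON response."""
    request = build_rerank_request(base_url, model_uid, query, documents)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Assumed address of a running Xinference server; replace as needed.
    base_url = "http://127.0.0.1:9997"
    # Placeholder: use the UID of a rerank model you have launched.
    model_uid = "<MODEL_UID>"

    result = rerank(
        base_url,
        model_uid,
        query="A man is eating pasta.",
        documents=[
            "A man is eating food.",
            "A man is eating a piece of bread.",
            "The girl is carrying a baby.",
            "A man is riding a horse.",
            "A woman is playing violin.",
        ],
    )
    # Print the raw response rather than assuming its exact fields.
    print(json.dumps(result, indent=2))
```

Building the request separately from sending it lets you inspect the URL and payload without a running server.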