OAuth2 System (Experimental)#

Xinference provides an in-memory OAuth2 authentication and authorization system built on the username-and-password flow.

Note

If you have no authentication or authorization requirements, you can keep using Xinference as before, with no changes.

Permissions#

Xinference currently defines the following interface permissions:

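models:list: permission to get the model list and model information.
models:read: permission to use models.
models:register: permission to register models.
models:unregister: permission to unregister models.
models:start: permission to launch models.
models:stop: permission to stop models.
admin: administrators have permission for all interfaces.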
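
The way these permissions combine can be sketched in a few lines of Python. This is only an illustration: the function name `has_permission` and the constant `KNOWN_PERMISSIONS` are hypothetical and are not part of Xinference's API. The one rule taken from the list above is that `admin` covers every other permission.

```python
from typing import Iterable

# Permission names exactly as listed above.
KNOWN_PERMISSIONS = {
    "models:list",
    "models:read",
    "models:register",
    "models:unregister",
    "models:start",
    "models:stop",
    "admin",
}


def has_permission(granted: Iterable[str], required: str) -> bool:
    """Return True if a user holding `granted` may call an interface that needs `required`.

    Hypothetical helper for illustration; Xinference's internal check may differ.
    """
    granted_set = set(granted)
    # `admin` grants access to all interfaces.
    if "admin" in granted_set:
        return True
    return required in granted_set
```

With this sketch, a user holding only `models:list` and `models:read` passes a check for `models:read` but fails one for `models:start`, while a user holding `admin` passes both.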

Getting Started#

All authentication and authorization information must be specified when Xinference starts. Currently, Xinference requires a JSON file containing the following fields:

{
    "auth_config": {
        "algorithm": "HS256",
        "secret_key": "09d25e094faa6ca2556c818166b7a9563b93f7099f6f0f4caa6cf63b88e8d3e7",
        "token_expire_in_minutes": 30
    },
    "user_config": [
        {
            "username": "user1",
            "password": "secret1",
            "permissions": [
                "admin"
            ],
            "api_keys": [
                "sk-72tkvudyGLPMi",
                "sk-ZOTLIY4gt9w11"
            ]
        },
        {
            "username": "user2",
            "password": "secret2",
            "permissions": [
                "models:list",
                "models:read"
            ],
            "api_keys": [
                "sk-35tkasdyGLYMy",
                "sk-ALTbgl6ut981w"
            ]
        }
    ]
}

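auth_config: this field configures security-related settings.
- algorithm: the algorithm used to generate and parse tokens. The HS family is recommended, for example HS256, HS384, or HS512.
- secret_key: the key used to generate and parse tokens. A key suitable for the HS algorithms can be generated with this command: openssl rand -hex 32
- token_expire_in_minutes: a reserved field giving the token expiration time. The open-source version of Xinference does not currently check token expiration.
user_config: this field configures users and their permissions. Each user entry consists of the following fields:
- username: string, the user name.
- password: string, the password.
- permissions: list of strings, the permissions granted to the user. The permissions are described in the Permissions section above.
- api_keys: list of strings, the api-keys that belong to the user. With one of these api-keys, the user can access Xinference interfaces without a login step. The key format resembles OPENAI_API_KEY: it always starts with sk-, followed by 13 characters drawn from digits and uppercase and lowercase letters.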
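
A file of this shape can also be produced programmatically. The following is a minimal sketch using only the Python standard library; the helper names `generate_api_key` and `build_auth_config` are hypothetical and not shipped with Xinference. `secrets.token_hex(32)` yields 64 hexadecimal characters, the same form as the output of `openssl rand -hex 32`.

```python
import json
import secrets
import string
from typing import Any, Dict, List

# Digits plus uppercase and lowercase letters, per the api_keys description.
API_KEY_ALPHABET = string.ascii_letters + string.digits


def generate_api_key() -> str:
    """Return 'sk-' followed by 13 random alphanumeric characters (hypothetical helper)."""
    return "sk-" + "".join(secrets.choice(API_KEY_ALPHABET) for _ in range(13))


def build_auth_config(users: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Assemble a config dict with the fields described above (hypothetical helper)."""
    return {
        "auth_config": {
            "algorithm": "HS256",
            # 32 random bytes, hex-encoded.
            "secret_key": secrets.token_hex(32),
            "token_expire_in_minutes": 30,
        },
        "user_config": users,
    }


if __name__ == "__main__":
    # Example user; replace the placeholder credentials before real use.
    example_user = {
        "username": "user1",
        "password": "secret1",
        "permissions": ["admin"],
        "api_keys": [generate_api_key()],
    }
    print(json.dumps(build_auth_config([example_user]), indent=4))
```

Redirecting the printed output to a file gives a config that can be passed to the `--auth-config` option.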

Once the JSON file is ready, use the --auth-config option to start Xinference with the authentication and authorization system enabled. For example, to start a local instance:

xinference-local -H 0.0.0.0 --auth-config /path/to/your_json_config_file

In a distributed environment, this option only needs to be specified when starting the supervisor:

xinference-supervisor -H <supervisor_ip> --auth-config /path/to/your_json_config_file

Usage#

Using a Xinference service with permission management is the same as using the regular version, except that you either log in first or authenticate with an api-key.

Username and Password Authentication#

Log in from the command line:

xinference login -e <endpoint> --username <username> --password <password>

Log in with the Python SDK:

from xinference.client import Client
client = Client('<endpoint>')
client.login('<name>', '<pass>')

For Web UI users, opening the Web UI first redirects to the login page. After logging in, the Web UI works as usual.

Api-Key Authentication#

Command line users only need to add the --api-key or -ak option to the command they want to run.

xinference launch <other options> --api-key <your_api_key>

Python client users pass the api_key parameter when initializing the client object, just as with the OpenAI client.

from xinference.client import Client
client = Client('<endpoint>', api_key='<your_api_key>')

Xinference is also fully compatible with the OpenAI Python client.

from openai import OpenAI
client = OpenAI(base_url="<xinference endpoint>" + "/v1", api_key="<your_api_key>")
client.models.list()

For HTTP requests, pass Authorization: Bearer <your_api_key> in the request header.

curl --request GET \
  --url "<xinference endpoint>" \
  --header "Authorization: Bearer <your_api_key>"
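
The same header can be set from Python without any third-party package. The sketch below uses `urllib.request` from the standard library. The helper names `build_request` and `list_models` are hypothetical, and the default path `/v1/models` is an assumption based on the OpenAI-compatible client example above, which lists models under the `/v1` base URL.

```python
import json
import urllib.request
from typing import Any


def build_request(endpoint: str, api_key: str, path: str = "/v1/models") -> urllib.request.Request:
    """Build a GET request that carries the api-key as a Bearer token."""
    url = endpoint.rstrip("/") + path
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def list_models(endpoint: str, api_key: str) -> Any:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(endpoint, api_key)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `list_models("<xinference endpoint>", "<your_api_key>")` performs the request. A missing or invalid key surfaces as `urllib.error.HTTPError`.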

HTTP Status Codes#

The following two HTTP status codes have been added:

401 Unauthorized: the login information or token failed verification.
403 Forbidden: insufficient permissions to access the interface.

Command line, SDK, and Web UI users see a clear message when they run into authentication or permission problems.
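
Clients that call the HTTP interface directly have to interpret these codes themselves. A minimal sketch of one way to do so with the standard library follows. The names `describe_auth_failure` and `fetch` are hypothetical, and the message texts are illustrative rather than the messages Xinference prints.

```python
import urllib.error
import urllib.request
from typing import Optional


def describe_auth_failure(status_code: int) -> Optional[str]:
    """Map the two auth-related status codes to a hint; return None for any other code."""
    if status_code == 401:
        return "Unauthorized: login information or token failed verification."
    if status_code == 403:
        return "Forbidden: insufficient permissions to access this interface."
    return None


def fetch(request: urllib.request.Request) -> bytes:
    """Send a request, turning 401/403 responses into PermissionError with a hint."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.read()
    except urllib.error.HTTPError as exc:
        hint = describe_auth_failure(exc.code)
        if hint is not None:
            raise PermissionError(hint) from exc
        # Any other HTTP error is re-raised unchanged.
        raise
```

Keeping the two cases separate is useful in practice: a 401 usually means logging in again or checking the api-key, while a 403 means the account needs an additional permission.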

Advanced authentication (DB-backed)#

When the advanced authentication system is enabled, users, permissions, API keys, and refresh tokens are stored in a database rather than the static auth_config.json file. This section documents two behaviors that operators should know.

Permission changes take effect without re-login#

For advanced-auth JWT requests, route scope checks read the user's current permissions from the database on every request, not the scopes baked into the JWT at login. Granting or revoking a permission therefore takes effect on the user's next API call, with no re-login required.

This applies to JWT-based browser sessions. API keys are also live-read (they always have been).
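
The difference between a live lookup and scopes baked into a token can be shown with a toy model. Everything in this sketch is hypothetical: a plain dict stands in for the database, a dict stands in for the decoded JWT, and `issue_token` and `is_allowed` are illustrative names, not Xinference functions.

```python
from typing import Any, Dict, Set

# Stand-in for the database table of user permissions (hypothetical).
PERMISSIONS_DB: Dict[str, Set[str]] = {
    "user2": {"models:list", "models:read"},
}


def issue_token(username: str) -> Dict[str, Any]:
    """Simulate login: the token carries a snapshot of the scopes at login time."""
    return {"sub": username, "scopes": sorted(PERMISSIONS_DB[username])}


def is_allowed(token: Dict[str, Any], required: str) -> bool:
    """Check against the permissions stored now, ignoring the snapshot in the token."""
    current = PERMISSIONS_DB.get(token["sub"], set())
    return "admin" in current or required in current
```

In this model, a token issued before `models:start` is granted still lacks that scope in its snapshot, yet `is_allowed(token, "models:start")` returns True as soon as the stored permissions are updated, and revoking `models:read` makes the corresponding check fail on the next call.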

Configurable access-token lifetime#

The access-token lifetime defaults to 30 minutes and can be overridden with the XINFERENCE_ACCESS_TOKEN_EXPIRE_MINUTES environment variable:

export XINFERENCE_ACCESS_TOKEN_EXPIRE_MINUTES=10

A shorter lifetime shrinks the token-theft window. The refresh token lifetime is 7 days and is not currently configurable.
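
For tooling that needs to mirror this setting, the variable can be read as in the sketch below. How Xinference itself parses the variable is not described here. The function name `access_token_expire_minutes` is hypothetical, and falling back to the default for empty, non-numeric, or non-positive values is an assumption of this sketch.

```python
import os
from typing import Mapping

ENV_VAR = "XINFERENCE_ACCESS_TOKEN_EXPIRE_MINUTES"
DEFAULT_ACCESS_TOKEN_EXPIRE_MINUTES = 30  # documented default
REFRESH_TOKEN_EXPIRE_DAYS = 7  # documented as fixed, not configurable


def access_token_expire_minutes(environ: Mapping[str, str] = os.environ) -> int:
    """Return the configured access-token lifetime in minutes, or the default."""
    raw = environ.get(ENV_VAR)
    if raw is None or not raw.strip():
        return DEFAULT_ACCESS_TOKEN_EXPIRE_MINUTES
    try:
        value = int(raw)
    except ValueError:
        # Assumption: ignore unparsable values instead of failing.
        return DEFAULT_ACCESS_TOKEN_EXPIRE_MINUTES
    return value if value > 0 else DEFAULT_ACCESS_TOKEN_EXPIRE_MINUTES
```

Passing the environment as a parameter keeps the function easy to test without modifying the real process environment.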

Note#

This feature is experimental. Feedback and suggestions are welcome through GitHub issues or the Telegram group.