Installation Guide for Ascend NPU#

Xinference can run on Ascend NPUs. Follow the instructions below to install it.

Installing PyTorch and Ascend extension for PyTorch#

Install the CPU version of PyTorch and the corresponding Ascend extension.

The commands below use PyTorch v2.1.0 as an example.

pip3 install torch==2.1.0 torchvision==0.16.0 --index-url https://download.pytorch.org/whl/cpu

Then install the Ascend extension for PyTorch, along with the packages it needs.

pip3 install 'numpy<2.0'
pip3 install decorator
pip3 install torch-npu==2.1.0.post3
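Because the extension has to correspond to the installed PyTorch build, it can help to compare the two installed versions before going further. The following is a minimal sketch, not an official check: it assumes that a matching pair shares the same leading X.Y.Z version (as torch 2.1.0 and torch-npu 2.1.0.post3 do in the example above), and the helper names `base_version`, `versions_match`, and `installed_version` are made up for illustration. The Ascend extension's own compatibility notes remain the authority on supported pairs.

```python
from importlib import metadata


def base_version(version: str) -> str:
    """Return the leading 'X.Y.Z' part, dropping suffixes such as '+cpu' or '.post3'."""
    public = version.split("+", 1)[0]  # drop a local tag such as '+cpu'
    return ".".join(public.split(".")[:3])  # keep at most three components


def versions_match(torch_version: str, torch_npu_version: str) -> bool:
    """Assumption: a matching pair shares the same X.Y.Z base version."""
    return base_version(torch_version) == base_version(torch_npu_version)


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    torch_v = installed_version("torch")
    npu_v = installed_version("torch-npu")
    if torch_v is None or npu_v is None:
        print(f"Missing package: torch={torch_v}, torch-npu={npu_v}")
    elif versions_match(torch_v, npu_v):
        print(f"OK: torch {torch_v} and torch-npu {npu_v} share a base version")
    else:
        print(f"Mismatch: torch {torch_v} vs torch-npu {npu_v}")
```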

Run the command below and check that it prints the correct number of Ascend NPUs.

python -c "import torch; import torch_npu; print(torch.npu.device_count())"
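If the one-liner fails, the traceback can be hard to read. A slightly longer variant of the same check, sketched below, reports the failure instead of crashing. It uses only the calls from the command above (`import torch`, `import torch_npu`, `torch.npu.device_count()`). The function name `npu_device_count` and the broad exception handling are illustrative choices, not part of Xinference or the Ascend extension.

```python
def npu_device_count():
    """Return the number of visible Ascend NPUs, or None if the check fails."""
    try:
        import torch
        import torch_npu  # noqa: F401  (imported for its side effect, as in the one-liner)

        return int(torch.npu.device_count())
    except Exception as exc:
        # Broad on purpose: a missing package, a missing driver library, and a
        # runtime initialisation error should all be reported the same way.
        print(f"Ascend NPU check failed: {exc!r}")
        return None


if __name__ == "__main__":
    count = npu_device_count()
    if count is None:
        print("Could not query the NPU count; check the PyTorch and torch-npu installation.")
    elif count == 0:
        print("torch-npu imported, but no Ascend NPU is visible.")
    else:
        print(f"Found {count} Ascend NPU(s).")
```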

Installing Xinference#

pip3 install xinference

Now you can use Xinference as described in the doc. In the open-source version, the Transformers backend is the only engine supported on Ascend NPU.
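As a rough illustration of selecting the Transformers engine, the sketch below launches a model through the Xinference Python client. It assumes that a Xinference server is already running and that the client's `launch_model` accepts `model_name`, `model_type`, and `model_engine` as it does in recent releases. The endpoint and model name in the usage comment are placeholders, and the helpers `build_launch_kwargs` and `launch_on_npu` are hypothetical. Some models may need further arguments such as a size or format, so treat the Xinference documentation as the reference.

```python
def build_launch_kwargs(model_name: str) -> dict:
    """Collect launch arguments that pin the engine to Transformers."""
    return {
        "model_name": model_name,
        "model_type": "LLM",
        # The only engine the open-source version supports on Ascend NPU.
        "model_engine": "transformers",
    }


def launch_on_npu(endpoint: str, model_name: str) -> str:
    """Launch a model on a running Xinference server and return its model UID."""
    # Imported here so build_launch_kwargs works without xinference installed.
    from xinference.client import Client

    client = Client(endpoint)
    return client.launch_model(**build_launch_kwargs(model_name))


# Example call (placeholder endpoint and model name, adjust to your setup):
# uid = launch_on_npu("http://127.0.0.1:9997", "qwen2.5-instruct")
```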

Enterprise Support#

If you encounter performance or other issues on Ascend NPU, please reach out to us via link.