Large Language Models
The following is a list of built-in LLMs in Xinference:
| MODEL NAME | ABILITIES | CONTEXT_LENGTH | DESCRIPTION |
|---|---|---|---|
| aquila2 | generate | 2048 | Aquila2 series models are the base language models. |
| aquila2-chat | chat | 2048 | Aquila2-chat series models are the chat models. |
| aquila2-chat-16k | chat | 16384 | AquilaChat2-16k series models are the long-text chat models. |
| baichuan-2 | generate | 4096 | Baichuan2 is an open-source Transformer based LLM that is trained on both Chinese and English data. |
| baichuan-2-chat | chat | 4096 | Baichuan2-chat is a fine-tuned version of the Baichuan LLM, specializing in chatting. |
| c4ai-command-r-v01 | chat | 131072 | C4AI Command-R(+) is a research release of a 35 and 104 billion parameter highly performant generative model. |
| code-llama | generate | 100000 | Code-Llama is an open-source LLM trained by fine-tuning LLaMA2 for generating and discussing code. |
| code-llama-instruct | chat | 100000 | Code-Llama-Instruct is an instruct-tuned version of the Code-Llama LLM. |
| code-llama-python | generate | 100000 | Code-Llama-Python is a fine-tuned version of the Code-Llama LLM, specializing in Python. |
| codegeex4 | chat | 131072 | The open-source version of the latest CodeGeeX4 model series. |
| codeqwen1.5 | generate | 65536 | CodeQwen1.5 is the Code-Specific version of Qwen1.5. It is a transformer-based decoder-only language model pretrained on a large amount of code data. |
| codeqwen1.5-chat | chat | 65536 | CodeQwen1.5 is the Code-Specific version of Qwen1.5. It is a transformer-based decoder-only language model pretrained on a large amount of code data. |
| codeshell | generate | 8194 | CodeShell is a multi-language code LLM developed by the Knowledge Computing Lab of Peking University. |
| codeshell-chat | chat | 8194 | CodeShell is a multi-language code LLM developed by the Knowledge Computing Lab of Peking University. |
| codestral-v0.1 | generate | 32768 | Codestral-22B-v0.1 is trained on a diverse dataset of 80+ programming languages, including the most popular ones, such as Python, Java, C, C++, JavaScript, and Bash. |
| cogvlm2 | chat, vision | 8192 | CogVLM2 has achieved good results on many leaderboards compared to the previous generation of CogVLM open source models. Its excellent performance can compete with some non-open source models. |
| cogvlm2-video-llama3-chat | chat, vision | 8192 | CogVLM2-Video achieves state-of-the-art performance on multiple video question answering tasks. |
| csg-wukong-chat-v0.1 | chat | 32768 | csg-wukong-1B is a 1 billion-parameter small language model (SLM) pretrained on 1T tokens. |
| deepseek | generate | 4096 | DeepSeek LLM is trained from scratch on a vast dataset of 2 trillion tokens in both English and Chinese. |
| deepseek-chat | chat | 4096 | DeepSeek LLM is an advanced language model comprising 67 billion parameters. It has been trained from scratch on a vast dataset of 2 trillion tokens in both English and Chinese. |
| deepseek-coder | generate | 16384 | Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese. |
| deepseek-coder-instruct | chat | 16384 | deepseek-coder-instruct is a model initialized from deepseek-coder-base and fine-tuned on 2B tokens of instruction data. |
| deepseek-v2 | generate | 128000 | DeepSeek-V2 is a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. |
| deepseek-v2-chat | chat | 128000 | DeepSeek-V2 is a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. |
| deepseek-v2-chat-0628 | chat | 128000 | DeepSeek-V2-Chat-0628 is an improved version of DeepSeek-V2-Chat. |
| deepseek-v2.5 | chat | 128000 | DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct. The new model integrates the general and coding abilities of the two previous versions. |
| deepseek-vl-chat | chat, vision | 4096 | DeepSeek-VL possesses general multimodal understanding capabilities, capable of processing logical diagrams, web pages, formula recognition, scientific literature, natural images, and embodied intelligence in complex scenarios. |
| gemma-2-it | chat | 8192 | Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. |
| gemma-it | chat | 8192 | Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. |
| glm-4v | chat, vision | 8192 | GLM4 is the open source version of the latest generation of pre-trained models in the GLM-4 series launched by Zhipu AI. |
| glm4-chat | chat, tools | 131072 | GLM4 is the open source version of the latest generation of pre-trained models in the GLM-4 series launched by Zhipu AI. |
| glm4-chat-1m | chat, tools | 1048576 | GLM4 is the open source version of the latest generation of pre-trained models in the GLM-4 series launched by Zhipu AI. |
| gorilla-openfunctions-v2 | chat | 4096 | OpenFunctions is designed to extend the Large Language Model (LLM) Chat Completion feature to formulate executable API calls given natural language instructions and API context. |
| gpt-2 | generate | 1024 | GPT-2 is a Transformer-based LLM that is trained on WebText, a 40 GB dataset of Reddit posts with 3+ upvotes. |
| internlm2-chat | chat | 32768 | The second generation of the InternLM model, InternLM2. |
| internlm2.5-chat | chat | 32768 | InternLM2.5 series of the InternLM model. |
| internlm2.5-chat-1m | chat | 262144 | InternLM2.5 series of the InternLM model supports 1M long-context. |
| internvl-chat | chat, vision | 32768 | InternVL 1.5 is an open-source multimodal large language model (MLLM) to bridge the capability gap between open-source and proprietary commercial models in multimodal understanding. |
| internvl2 | chat, vision | 32768 | InternVL 2 is an open-source multimodal large language model (MLLM) to bridge the capability gap between open-source and proprietary commercial models in multimodal understanding. |
| llama-2 | generate | 4096 | Llama-2 is the second generation of Llama, open-source and trained on a larger amount of data. |
| llama-2-chat | chat | 4096 | Llama-2-Chat is a fine-tuned version of the Llama-2 LLM, specializing in chatting. |
| llama-3 | generate | 8192 | Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. |
| llama-3-instruct | chat | 8192 | The Llama 3 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks. |
| llama-3.1 | generate | 131072 | Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. |
| llama-3.1-instruct | chat, tools | 131072 | The Llama 3.1 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks. |
| | generate, vision | 131072 | The Llama 3.2-Vision collection of multimodal large language models (LLMs) is a collection of pretrained and instruction-tuned image reasoning generative models in 11B and 90B sizes (text + images in / text out)… |
| | chat, vision | 131072 | The Llama 3.2-Vision-instruct instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The models outperform many of the available open source and closed multimodal models on common industry benchmarks… |
| minicpm-2b-dpo-bf16 | chat | 4096 | MiniCPM is an End-Side LLM developed by ModelBest Inc. and TsinghuaNLP, with only 2.4B parameters excluding embeddings. |
| minicpm-2b-dpo-fp16 | chat | 4096 | MiniCPM is an End-Side LLM developed by ModelBest Inc. and TsinghuaNLP, with only 2.4B parameters excluding embeddings. |
| minicpm-2b-dpo-fp32 | chat | 4096 | MiniCPM is an End-Side LLM developed by ModelBest Inc. and TsinghuaNLP, with only 2.4B parameters excluding embeddings. |
| minicpm-2b-sft-bf16 | chat | 4096 | MiniCPM is an End-Side LLM developed by ModelBest Inc. and TsinghuaNLP, with only 2.4B parameters excluding embeddings. |
| minicpm-2b-sft-fp32 | chat | 4096 | MiniCPM is an End-Side LLM developed by ModelBest Inc. and TsinghuaNLP, with only 2.4B parameters excluding embeddings. |
| MiniCPM-Llama3-V-2_5 | chat, vision | 8192 | MiniCPM-Llama3-V 2.5 is the latest model in the MiniCPM-V series. The model is built on SigLip-400M and Llama3-8B-Instruct with a total of 8B parameters. |
| MiniCPM-V-2.6 | chat, vision | 32768 | MiniCPM-V 2.6 is the latest model in the MiniCPM-V series. The model is built on SigLip-400M and Qwen2-7B with a total of 8B parameters. |
| minicpm3-4b | chat | 32768 | MiniCPM3-4B is the 3rd generation of MiniCPM series. The overall performance of MiniCPM3-4B surpasses Phi-3.5-mini-Instruct and GPT-3.5-Turbo-0125, being comparable with many recent 7B~9B models. |
| mistral-instruct-v0.1 | chat | 8192 | Mistral-7B-Instruct is a fine-tuned version of the Mistral-7B LLM on public datasets, specializing in chatting. |
| mistral-instruct-v0.2 | chat | 8192 | The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an improved instruct fine-tuned version of Mistral-7B-Instruct-v0.1. |
| mistral-instruct-v0.3 | chat | 32768 | The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an improved instruct fine-tuned version of Mistral-7B-Instruct-v0.1. |
| mistral-large-instruct | chat | 131072 | Mistral-Large-Instruct-2407 is an advanced dense Large Language Model (LLM) of 123B parameters with state-of-the-art reasoning, knowledge and coding capabilities. |
| mistral-nemo-instruct | chat | 1024000 | The Mistral-Nemo-Instruct-2407 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-Nemo-Base-2407. |
| mistral-v0.1 | generate | 8192 | Mistral-7B is an unmoderated Transformer based LLM claiming to outperform Llama2 on all benchmarks. |
| mixtral-8x22B-instruct-v0.1 | chat | 65536 | The Mixtral-8x22B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of the Mixtral-8x22B-v0.1, specializing in chatting. |
| mixtral-instruct-v0.1 | chat | 32768 | Mixtral-8x7B-Instruct is a fine-tuned version of the Mixtral-8x7B LLM, specializing in chatting. |
| mixtral-v0.1 | generate | 32768 | The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts. |
| OmniLMM | chat, vision | 2048 | OmniLMM is a family of open-source large multimodal models (LMMs) adept at vision & language modeling. |
| openhermes-2.5 | chat | 8192 | Openhermes 2.5 is a fine-tuned version of Mistral-7B-v0.1 on primarily GPT-4 generated data. |
| opt | generate | 2048 | OPT is an open-source, decoder-only, Transformer based LLM that was designed to replicate GPT-3. |
| orion-chat | chat | 4096 | Orion-14B series models are open-source multilingual large language models trained from scratch by OrionStarAI. |
| orion-chat-rag | chat | 4096 | Orion-14B series models are open-source multilingual large language models trained from scratch by OrionStarAI. |
| phi-2 | generate | 2048 | Phi-2 is a 2.7B Transformer based LLM used for research on model safety, trained with data similar to Phi-1.5 but augmented with synthetic texts and curated websites. |
| phi-3-mini-128k-instruct | chat | 128000 | The Phi-3-Mini-128K-Instruct is a 3.8 billion-parameter, lightweight, state-of-the-art open model trained using the Phi-3 datasets. |
| phi-3-mini-4k-instruct | chat | 4096 | The Phi-3-Mini-4k-Instruct is a 3.8 billion-parameter, lightweight, state-of-the-art open model trained using the Phi-3 datasets. |
| platypus2-70b-instruct | generate | 4096 | Platypus-70B-instruct is a merge of garage-bAInd/Platypus2-70B and upstage/Llama-2-70b-instruct-v2. |
| qwen-chat | chat | 32768 | Qwen-chat is a fine-tuned version of the Qwen LLM trained with alignment techniques, specializing in chatting. |
| qwen-vl-chat | chat, vision | 4096 | Qwen-VL-Chat supports more flexible interaction, such as multiple image inputs, multi-round question answering, and creative capabilities. |
| qwen1.5-chat | chat, tools | 32768 | Qwen1.5 is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. |
| qwen1.5-moe-chat | chat, tools | 32768 | Qwen1.5-MoE is a transformer-based MoE decoder-only language model pretrained on a large amount of data. |
| qwen2-audio | chat, audio | 32768 | Qwen2-Audio: A large-scale audio-language model which is capable of accepting various audio signal inputs and performing audio analysis or direct textual responses with regard to speech instructions. |
| qwen2-audio-instruct | chat, audio | 32768 | Qwen2-Audio: A large-scale audio-language model which is capable of accepting various audio signal inputs and performing audio analysis or direct textual responses with regard to speech instructions. |
| qwen2-instruct | chat, tools | 32768 | Qwen2 is the new series of Qwen large language models. |
| qwen2-moe-instruct | chat, tools | 32768 | Qwen2 is the new series of Qwen large language models. |
| qwen2-vl-instruct | chat, vision | 32768 | Qwen2-VL: To See the World More Clearly. Qwen2-VL is the latest version of the vision language models in the Qwen model families. |
| qwen2.5 | generate | 32768 | Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. |
| qwen2.5-coder | generate | 32768 | Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). |
| qwen2.5-coder-instruct | chat, tools | 32768 | Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). |
| qwen2.5-instruct | chat, tools | 32768 | Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. |
| seallm_v2 | generate | 8192 | We introduce SeaLLM-7B-v2, the state-of-the-art multilingual LLM for Southeast Asian (SEA) languages. |
| seallm_v2.5 | generate | 8192 | We introduce SeaLLM-7B-v2.5, the state-of-the-art multilingual LLM for Southeast Asian (SEA) languages. |
| Skywork | generate | 4096 | Skywork is a series of large models developed by the Kunlun Group · Skywork team. |
| Skywork-Math | generate | 4096 | Skywork is a series of large models developed by the Kunlun Group · Skywork team. |
| Starling-LM | chat | 4096 | We introduce Starling-7B, an open large language model (LLM) trained by Reinforcement Learning from AI Feedback (RLAIF). The model harnesses the power of our new GPT-4 labeled ranking dataset. |
| telechat | chat | 8192 | TeleChat is a large language model developed and trained by China Telecom Artificial Intelligence Technology Co., LTD. The 7B model base is trained with 1.5 trillion Tokens and 3 trillion Tokens and Chinese high-quality corpus. |
| tiny-llama | generate | 2048 | The TinyLlama project aims to pretrain a 1.1B Llama model on 3 trillion tokens. |
| wizardcoder-python-v1.0 | chat | 100000 | |
| wizardmath-v1.0 | chat | 2048 | WizardMath is an open-source LLM trained by fine-tuning Llama2 with Evol-Instruct, specializing in math. |
| xverse | generate | 2048 | XVERSE is a multilingual large language model, independently developed by Shenzhen Yuanxiang Technology. |
| xverse-chat | chat | 2048 | XVERSE-Chat is the aligned version of model XVERSE. |
| Yi | generate | 4096 | The Yi series models are large language models trained from scratch by developers at 01.AI. |
| Yi-1.5 | generate | 4096 | Yi-1.5 is an upgraded version of Yi. It is continuously pre-trained on Yi with a high-quality corpus of 500B tokens and fine-tuned on 3M diverse fine-tuning samples. |
| Yi-1.5-chat | chat | 4096 | Yi-1.5 is an upgraded version of Yi. It is continuously pre-trained on Yi with a high-quality corpus of 500B tokens and fine-tuned on 3M diverse fine-tuning samples. |
| Yi-1.5-chat-16k | chat | 16384 | Yi-1.5 is an upgraded version of Yi. It is continuously pre-trained on Yi with a high-quality corpus of 500B tokens and fine-tuned on 3M diverse fine-tuning samples. |
| Yi-200k | generate | 262144 | The Yi series models are large language models trained from scratch by developers at 01.AI. |
| Yi-chat | chat | 4096 | The Yi series models are large language models trained from scratch by developers at 01.AI. |
| yi-coder | generate | 131072 | Yi-Coder is a series of open-source code language models that delivers state-of-the-art coding performance with fewer than 10 billion parameters. Excelling in long-context understanding with a maximum context length of 128K tokens. Supporting 52 major programming languages, including popular ones such as Java, Python, JavaScript, and C++. |
| yi-coder-chat | chat | 131072 | Yi-Coder is a series of open-source code language models that delivers state-of-the-art coding performance with fewer than 10 billion parameters. Excelling in long-context understanding with a maximum context length of 128K tokens. Supporting 52 major programming languages, including popular ones such as Java, Python, JavaScript, and C++. |
| yi-vl-chat | chat, vision | 4096 | Yi Vision Language (Yi-VL) model is the open-source, multimodal version of the Yi Large Language Model (LLM) series, enabling content comprehension, recognition, and multi-round conversations about images. |
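The abilities and context length columns are typically read together when choosing a model. The following is a minimal sketch of that selection, using a handful of rows copied from the table. The `ModelRow` type and the `pick_models` helper are hypothetical names introduced for illustration and are not part of Xinference; the model names are those used in the model list on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRow:
    """One table row: model name, set of abilities, context length in tokens."""
    name: str
    abilities: frozenset
    context_length: int


# A few rows copied from the table (not the full list).
ROWS = [
    ModelRow("llama-2-chat", frozenset({"chat"}), 4096),
    ModelRow("glm4-chat", frozenset({"chat", "tools"}), 131072),
    ModelRow("qwen2-vl-instruct", frozenset({"chat", "vision"}), 32768),
    ModelRow("deepseek-coder", frozenset({"generate"}), 16384),
]


def pick_models(rows, required_abilities, min_context_length=0):
    """Return the names of rows that have every required ability
    and a context length of at least min_context_length."""
    required = frozenset(required_abilities)
    return [
        row.name
        for row in rows
        if required <= row.abilities and row.context_length >= min_context_length
    ]


if __name__ == "__main__":
    # Chat models that accept at least 32768 tokens of context.
    print(pick_models(ROWS, {"chat"}, min_context_length=32768))
```

With the sample rows above, the call in the `__main__` guard selects `glm4-chat` and `qwen2-vl-instruct`.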
- aquila2
- aquila2-chat
- aquila2-chat-16k
- baichuan-2
- baichuan-2-chat
- c4ai-command-r-v01
- code-llama
- code-llama-instruct
- code-llama-python
- codegeex4
- codeqwen1.5
- codeqwen1.5-chat
- codeshell
- codeshell-chat
- codestral-v0.1
- cogvlm2
- cogvlm2-video-llama3-chat
- csg-wukong-chat-v0.1
- deepseek
- deepseek-chat
- deepseek-coder
- Specifications
- Model Spec 1 (pytorch, 1_3 Billion)
- Model Spec 2 (pytorch, 6_7 Billion)
- Model Spec 3 (pytorch, 7 Billion)
- Model Spec 4 (pytorch, 33 Billion)
- Model Spec 5 (ggufv2, 1_3 Billion)
- Model Spec 6 (ggufv2, 6_7 Billion)
- Model Spec 7 (ggufv2, 7 Billion)
- Model Spec 8 (ggufv2, 33 Billion)
- Model Spec 9 (gptq, 1_3 Billion)
- Model Spec 10 (gptq, 6_7 Billion)
- Model Spec 11 (gptq, 33 Billion)
- Model Spec 12 (awq, 1_3 Billion)
- Model Spec 13 (awq, 6_7 Billion)
- Model Spec 14 (awq, 33 Billion)
- deepseek-coder-instruct
- Specifications
- Model Spec 1 (pytorch, 1_3 Billion)
- Model Spec 2 (pytorch, 6_7 Billion)
- Model Spec 3 (pytorch, 7 Billion)
- Model Spec 4 (pytorch, 33 Billion)
- Model Spec 5 (ggufv2, 1_3 Billion)
- Model Spec 6 (ggufv2, 6_7 Billion)
- Model Spec 7 (ggufv2, 7 Billion)
- Model Spec 8 (ggufv2, 33 Billion)
- Model Spec 9 (gptq, 1_3 Billion)
- Model Spec 10 (gptq, 6_7 Billion)
- Model Spec 11 (gptq, 33 Billion)
- Model Spec 12 (awq, 1_3 Billion)
- Model Spec 13 (awq, 6_7 Billion)
- Model Spec 14 (awq, 33 Billion)
- deepseek-v2
- deepseek-v2-chat
- deepseek-v2-chat-0628
- deepseek-v2.5
- deepseek-vl-chat
- gemma-2-it
- Specifications
- Model Spec 1 (pytorch, 2 Billion)
- Model Spec 2 (pytorch, 9 Billion)
- Model Spec 3 (pytorch, 27 Billion)
- Model Spec 4 (ggufv2, 2 Billion)
- Model Spec 5 (ggufv2, 9 Billion)
- Model Spec 6 (ggufv2, 27 Billion)
- Model Spec 7 (mlx, 2 Billion)
- Model Spec 8 (mlx, 2 Billion)
- Model Spec 9 (mlx, 2 Billion)
- Model Spec 10 (mlx, 9 Billion)
- Model Spec 11 (mlx, 9 Billion)
- Model Spec 12 (mlx, 9 Billion)
- Model Spec 13 (mlx, 27 Billion)
- Model Spec 14 (mlx, 27 Billion)
- Model Spec 15 (mlx, 27 Billion)
- gemma-it
- glm-4v
- glm4-chat
- glm4-chat-1m
- gorilla-openfunctions-v2
- gpt-2
- internlm2-chat
- internlm2.5-chat
- internlm2.5-chat-1m
- internvl-chat
- internvl2
- Specifications
- Model Spec 1 (pytorch, 1 Billion)
- Model Spec 2 (pytorch, 2 Billion)
- Model Spec 3 (awq, 2 Billion)
- Model Spec 4 (pytorch, 4 Billion)
- Model Spec 5 (pytorch, 8 Billion)
- Model Spec 6 (awq, 8 Billion)
- Model Spec 7 (pytorch, 26 Billion)
- Model Spec 8 (awq, 26 Billion)
- Model Spec 9 (pytorch, 40 Billion)
- Model Spec 10 (awq, 40 Billion)
- Model Spec 11 (pytorch, 76 Billion)
- Model Spec 12 (awq, 76 Billion)
- llama-2
- Specifications
- Model Spec 1 (ggufv2, 7 Billion)
- Model Spec 2 (gptq, 7 Billion)
- Model Spec 3 (awq, 7 Billion)
- Model Spec 4 (ggufv2, 13 Billion)
- Model Spec 5 (ggufv2, 70 Billion)
- Model Spec 6 (pytorch, 7 Billion)
- Model Spec 7 (pytorch, 13 Billion)
- Model Spec 8 (gptq, 13 Billion)
- Model Spec 9 (awq, 13 Billion)
- Model Spec 10 (pytorch, 70 Billion)
- Model Spec 11 (gptq, 70 Billion)
- Model Spec 12 (awq, 70 Billion)
- llama-2-chat
- Specifications
- Model Spec 1 (ggufv2, 7 Billion)
- Model Spec 2 (ggufv2, 13 Billion)
- Model Spec 3 (ggufv2, 70 Billion)
- Model Spec 4 (pytorch, 7 Billion)
- Model Spec 5 (gptq, 7 Billion)
- Model Spec 6 (gptq, 70 Billion)
- Model Spec 7 (awq, 70 Billion)
- Model Spec 8 (awq, 7 Billion)
- Model Spec 9 (pytorch, 13 Billion)
- Model Spec 10 (gptq, 13 Billion)
- Model Spec 11 (awq, 13 Billion)
- Model Spec 12 (pytorch, 70 Billion)
- llama-3
- llama-3-instruct
- Specifications
- Model Spec 1 (ggufv2, 8 Billion)
- Model Spec 2 (pytorch, 8 Billion)
- Model Spec 3 (ggufv2, 70 Billion)
- Model Spec 4 (pytorch, 70 Billion)
- Model Spec 5 (mlx, 8 Billion)
- Model Spec 6 (mlx, 8 Billion)
- Model Spec 7 (mlx, 8 Billion)
- Model Spec 8 (mlx, 70 Billion)
- Model Spec 9 (mlx, 70 Billion)
- Model Spec 10 (mlx, 70 Billion)
- Model Spec 11 (gptq, 8 Billion)
- Model Spec 12 (gptq, 70 Billion)
- llama-3.1
- llama-3.1-instruct
- Specifications
- Model Spec 1 (ggufv2, 8 Billion)
- Model Spec 2 (pytorch, 8 Billion)
- Model Spec 3 (pytorch, 8 Billion)
- Model Spec 4 (gptq, 8 Billion)
- Model Spec 5 (awq, 8 Billion)
- Model Spec 6 (ggufv2, 70 Billion)
- Model Spec 7 (pytorch, 70 Billion)
- Model Spec 8 (pytorch, 70 Billion)
- Model Spec 9 (gptq, 70 Billion)
- Model Spec 10 (awq, 70 Billion)
- Model Spec 11 (mlx, 8 Billion)
- Model Spec 12 (mlx, 8 Billion)
- Model Spec 13 (mlx, 8 Billion)
- Model Spec 14 (mlx, 70 Billion)
- Model Spec 15 (mlx, 70 Billion)
- Model Spec 16 (mlx, 70 Billion)
- Model Spec 17 (pytorch, 405 Billion)
- Model Spec 18 (gptq, 405 Billion)
- Model Spec 19 (awq, 405 Billion)
- minicpm-2b-dpo-bf16
- minicpm-2b-dpo-fp16
- minicpm-2b-dpo-fp32
- minicpm-2b-sft-bf16
- minicpm-2b-sft-fp32
- MiniCPM-Llama3-V-2_5
- MiniCPM-V-2.6
- minicpm3-4b
- mistral-instruct-v0.1
- mistral-instruct-v0.2
- mistral-instruct-v0.3
- mistral-large-instruct
- mistral-nemo-instruct
- mistral-v0.1
- mixtral-8x22B-instruct-v0.1
- mixtral-instruct-v0.1
- mixtral-v0.1
- OmniLMM
- openhermes-2.5
- opt
- orion-chat
- orion-chat-rag
- phi-2
- phi-3-mini-128k-instruct
- phi-3-mini-4k-instruct
- platypus2-70b-instruct
- qwen-chat
- Specifications
- Model Spec 1 (ggufv2, 7 Billion)
- Model Spec 2 (ggufv2, 14 Billion)
- Model Spec 3 (pytorch, 1_8 Billion)
- Model Spec 4 (pytorch, 7 Billion)
- Model Spec 5 (pytorch, 14 Billion)
- Model Spec 6 (pytorch, 72 Billion)
- Model Spec 7 (gptq, 7 Billion)
- Model Spec 8 (gptq, 1_8 Billion)
- Model Spec 9 (gptq, 14 Billion)
- Model Spec 10 (gptq, 72 Billion)
- qwen-vl-chat
- qwen1.5-chat
- Specifications
- Model Spec 1 (pytorch, 0_5 Billion)
- Model Spec 2 (pytorch, 1_8 Billion)
- Model Spec 3 (pytorch, 4 Billion)
- Model Spec 4 (pytorch, 7 Billion)
- Model Spec 5 (pytorch, 14 Billion)
- Model Spec 6 (pytorch, 32 Billion)
- Model Spec 7 (pytorch, 72 Billion)
- Model Spec 8 (pytorch, 110 Billion)
- Model Spec 9 (gptq, 0_5 Billion)
- Model Spec 10 (gptq, 1_8 Billion)
- Model Spec 11 (gptq, 4 Billion)
- Model Spec 12 (gptq, 7 Billion)
- Model Spec 13 (gptq, 14 Billion)
- Model Spec 14 (gptq, 32 Billion)
- Model Spec 15 (gptq, 72 Billion)
- Model Spec 16 (gptq, 110 Billion)
- Model Spec 17 (awq, 0_5 Billion)
- Model Spec 18 (awq, 1_8 Billion)
- Model Spec 19 (awq, 4 Billion)
- Model Spec 20 (awq, 7 Billion)
- Model Spec 21 (awq, 14 Billion)
- Model Spec 22 (awq, 32 Billion)
- Model Spec 23 (awq, 72 Billion)
- Model Spec 24 (awq, 110 Billion)
- Model Spec 25 (ggufv2, 0_5 Billion)
- Model Spec 26 (ggufv2, 1_8 Billion)
- Model Spec 27 (ggufv2, 4 Billion)
- Model Spec 28 (ggufv2, 7 Billion)
- Model Spec 29 (ggufv2, 14 Billion)
- Model Spec 30 (ggufv2, 32 Billion)
- Model Spec 31 (ggufv2, 72 Billion)
- qwen1.5-moe-chat
- qwen2-audio
- qwen2-audio-instruct
- qwen2-instruct
- Specifications
- Model Spec 1 (pytorch, 0_5 Billion)
- Model Spec 2 (pytorch, 1_5 Billion)
- Model Spec 3 (pytorch, 7 Billion)
- Model Spec 4 (pytorch, 72 Billion)
- Model Spec 5 (gptq, 0_5 Billion)
- Model Spec 6 (gptq, 1_5 Billion)
- Model Spec 7 (gptq, 7 Billion)
- Model Spec 8 (gptq, 72 Billion)
- Model Spec 9 (awq, 0_5 Billion)
- Model Spec 10 (awq, 1_5 Billion)
- Model Spec 11 (awq, 7 Billion)
- Model Spec 12 (awq, 72 Billion)
- Model Spec 13 (fp8, 0_5 Billion)
- Model Spec 14 (fp8, 0_5 Billion)
- Model Spec 15 (fp8, 1_5 Billion)
- Model Spec 16 (fp8, 7 Billion)
- Model Spec 17 (fp8, 72 Billion)
- Model Spec 18 (mlx, 0_5 Billion)
- Model Spec 19 (mlx, 1_5 Billion)
- Model Spec 20 (mlx, 7 Billion)
- Model Spec 21 (mlx, 72 Billion)
- Model Spec 22 (ggufv2, 0_5 Billion)
- Model Spec 23 (ggufv2, 1_5 Billion)
- Model Spec 24 (ggufv2, 7 Billion)
- Model Spec 25 (ggufv2, 72 Billion)
- qwen2-moe-instruct
- qwen2-vl-instruct
- Specifications
- Model Spec 1 (pytorch, 2 Billion)
- Model Spec 2 (gptq, 2 Billion)
- Model Spec 3 (gptq, 2 Billion)
- Model Spec 4 (awq, 2 Billion)
- Model Spec 5 (pytorch, 7 Billion)
- Model Spec 6 (gptq, 7 Billion)
- Model Spec 7 (gptq, 7 Billion)
- Model Spec 8 (awq, 7 Billion)
- Model Spec 9 (pytorch, 72 Billion)
- Model Spec 10 (awq, 72 Billion)
- Model Spec 11 (gptq, 72 Billion)
- qwen2.5
- qwen2.5-coder
- qwen2.5-coder-instruct
- qwen2.5-instruct
- Specifications
- Model Spec 1 (pytorch, 0_5 Billion)
- Model Spec 2 (pytorch, 1_5 Billion)
- Model Spec 3 (pytorch, 3 Billion)
- Model Spec 4 (pytorch, 7 Billion)
- Model Spec 5 (pytorch, 14 Billion)
- Model Spec 6 (pytorch, 32 Billion)
- Model Spec 7 (pytorch, 72 Billion)
- Model Spec 8 (gptq, 0_5 Billion)
- Model Spec 9 (gptq, 1_5 Billion)
- Model Spec 10 (gptq, 3 Billion)
- Model Spec 11 (gptq, 7 Billion)
- Model Spec 12 (gptq, 14 Billion)
- Model Spec 13 (gptq, 32 Billion)
- Model Spec 14 (gptq, 72 Billion)
- Model Spec 15 (awq, 0_5 Billion)
- Model Spec 16 (awq, 1_5 Billion)
- Model Spec 17 (awq, 3 Billion)
- Model Spec 18 (awq, 7 Billion)
- Model Spec 19 (awq, 14 Billion)
- Model Spec 20 (awq, 32 Billion)
- Model Spec 21 (awq, 72 Billion)
- Model Spec 22 (ggufv2, 0_5 Billion)
- Model Spec 23 (ggufv2, 1_5 Billion)
- Model Spec 24 (ggufv2, 3 Billion)
- Model Spec 25 (ggufv2, 7 Billion)
- Model Spec 26 (ggufv2, 14 Billion)
- Model Spec 27 (ggufv2, 32 Billion)
- Model Spec 28 (ggufv2, 72 Billion)
- Model Spec 29 (mlx, 0_5 Billion)
- Model Spec 30 (mlx, 0_5 Billion)
- Model Spec 31 (mlx, 0_5 Billion)
- Model Spec 32 (mlx, 1_5 Billion)
- Model Spec 33 (mlx, 1_5 Billion)
- Model Spec 34 (mlx, 1_5 Billion)
- Model Spec 35 (mlx, 3 Billion)
- Model Spec 36 (mlx, 3 Billion)
- Model Spec 37 (mlx, 3 Billion)
- Model Spec 38 (mlx, 7 Billion)
- Model Spec 39 (mlx, 7 Billion)
- Model Spec 40 (mlx, 7 Billion)
- Model Spec 41 (mlx, 14 Billion)
- Model Spec 42 (mlx, 14 Billion)
- Model Spec 43 (mlx, 14 Billion)
- Model Spec 44 (mlx, 32 Billion)
- Model Spec 45 (mlx, 32 Billion)
- Model Spec 46 (mlx, 32 Billion)
- Model Spec 47 (mlx, 72 Billion)
- Model Spec 48 (mlx, 72 Billion)
- Model Spec 49 (mlx, 72 Billion)
- seallm_v2
- seallm_v2.5
- Skywork
- Skywork-Math
- Starling-LM
- telechat
- tiny-llama
- wizardcoder-python-v1.0
- wizardmath-v1.0
- xverse
- xverse-chat
- Yi
- Yi-1.5
- Yi-1.5-chat
- Specifications
- Model Spec 1 (pytorch, 6 Billion)
- Model Spec 2 (pytorch, 9 Billion)
- Model Spec 3 (pytorch, 34 Billion)
- Model Spec 4 (ggufv2, 6 Billion)
- Model Spec 5 (ggufv2, 9 Billion)
- Model Spec 6 (ggufv2, 34 Billion)
- Model Spec 7 (gptq, 6 Billion)
- Model Spec 8 (gptq, 9 Billion)
- Model Spec 9 (gptq, 34 Billion)
- Model Spec 10 (awq, 6 Billion)
- Model Spec 11 (awq, 9 Billion)
- Model Spec 12 (awq, 34 Billion)
- Model Spec 13 (mlx, 6 Billion)
- Model Spec 14 (mlx, 6 Billion)
- Model Spec 15 (mlx, 9 Billion)
- Model Spec 16 (mlx, 9 Billion)
- Model Spec 17 (mlx, 34 Billion)
- Model Spec 18 (mlx, 34 Billion)
- Yi-1.5-chat-16k
- Yi-200k
- Yi-chat
- yi-coder
- yi-coder-chat
- yi-vl-chat
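
The specification entries in the list above follow a fixed pattern, `Model Spec N (format, size Billion)`. As a small sketch of how such labels could be turned into structured data, the snippet below parses one label at a time. It assumes the underscore in a size such as `1_3` stands in for a decimal point (so `1_3` reads as 1.3 billion parameters); that reading is an assumption here, and `ModelSpec` and `parse_spec_label` are hypothetical names, not part of Xinference.

```python
import re
from typing import NamedTuple


class ModelSpec(NamedTuple):
    index: int               # the N in "Model Spec N"
    model_format: str        # e.g. "pytorch", "ggufv2", "gptq", "awq", "mlx", "fp8"
    size_in_billions: float  # assumption: "1_3" means 1.3


# Matches labels such as "Model Spec 3 (pytorch, 1_8 Billion)".
_SPEC_LABEL = re.compile(r"Model Spec (\d+) \((\w+), (\d+(?:_\d+)?) Billion\)")


def parse_spec_label(label: str) -> ModelSpec:
    """Parse one specification label; a leading list bullet is tolerated."""
    match = _SPEC_LABEL.search(label)
    if match is None:
        raise ValueError(f"not a model spec label: {label!r}")
    index, model_format, size = match.groups()
    # Assumption: the underscore is a stand-in for the decimal point.
    return ModelSpec(int(index), model_format, float(size.replace("_", ".")))


if __name__ == "__main__":
    print(parse_spec_label("- Model Spec 5 (ggufv2, 1_3 Billion)"))
    print(parse_spec_label("- Model Spec 17 (pytorch, 405 Billion)"))
```

Labels that do not match the pattern, such as a plain model name, raise `ValueError` rather than being silently skipped, which makes malformed entries easy to spot.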