MooreThreads / vllm_musa
A high-throughput and memory-efficient inference and serving engine for LLMs
☆49Updated 5 months ago
Alternatives and similar repositories for vllm_musa:
Users who are interested in vllm_musa are comparing it to the libraries listed below
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆243Updated last week
- Triton Documentation in Chinese Simplified / Triton 中文文档☆66Updated last week
- Run generative AI models in sophgo BM1684X☆199Updated this week
- vLLM Documentation in Chinese Simplified / vLLM 中文文档☆61Updated this week
- ☆161Updated 2 weeks ago
- llm-export can export LLM models to ONNX.☆282Updated 3 months ago
- Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch☆338Updated this week
- run DeepSeek-R1 GGUFs on KTransformers☆224Updated last month
- run ChatGLM2-6B in BM1684X☆49Updated last year
- LLM101n: Let's build a Storyteller (Chinese edition)☆131Updated 8 months ago
- ☆41Updated 5 months ago
- ☆48Updated this week
- llama2.c-zh, a small language model that supports Chinese-language scenarios☆145Updated last year
- PaddlePaddle custom device implementation. (PaddlePaddle custom hardware integration implementation)☆82Updated this week
- Community maintained hardware plugin for vLLM on Ascend☆515Updated this week
- Compare different hardware platforms via the Roofline Model for LLM inference tasks.☆97Updated last year
- llama 2 Inference☆42Updated last year
- ☆127Updated 4 months ago
- ☆26Updated 2 weeks ago
- ☆139Updated last year
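- unify-easy-llm (ULM) aims to provide a simple, one-click training tool for large models, supporting different hardware such as Nvidia GPU and Ascend NPU as well as commonly used large models.☆55Updated 9 months ago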
- Large Language Model Onnx Inference Framework☆32Updated 3 months ago
- LLM inference service performance testing☆39Updated last year
- ☆27Updated 5 months ago
- ☆311Updated 4 months ago
- Explore LLM model deployment based on AXera's AI chips☆100Updated this week
- Mixture-of-Experts (MoE) Language Model☆186Updated 7 months ago
- pretrain a wiki llm using transformers☆37Updated 7 months ago
- FlagScale is a large model toolkit based on open-sourced projects.☆268Updated this week
- ☆32Updated last year