MooreThreads / torch_musa
torch_musa is an open-source repository based on PyTorch that makes full use of the computing power of MooreThreads graphics cards.
☆436 Updated last month
Alternatives and similar repositories for torch_musa
Users interested in torch_musa are comparing it to the libraries listed below.
- A lightweight LLM inference framework ☆737 Updated last year
- Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch ☆437 Updated last month
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆64 Updated 11 months ago
- llm-export exports LLM models to ONNX. ☆314 Updated last month
- C++ implementation of Qwen-LM ☆606 Updated 10 months ago
- A CPU tool for benchmarking peak floating-point performance ☆562 Updated 3 months ago
- MUSA Templates for Linear Algebra Subroutines ☆32 Updated 7 months ago
- LLM deployment project based on MNN. This project has been merged into MNN. ☆1,602 Updated 8 months ago
- Machine learning compiler based on MLIR for the Sophgo TPU ☆803 Updated 2 weeks ago
- PaddlePaddle custom device implementation ☆96 Updated this week
- ☆60 Updated last year
- ☆618 Updated last year
- MegCC is a deep learning model compiler with an ultra-lightweight runtime that is efficient and easy to port ☆486 Updated 11 months ago
- ☆430 Updated 3 weeks ago
- Run generative AI models on Sophgo BM1684X/BM1688 ☆248 Updated this week
- BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads. ☆896 Updated 9 months ago
- FlagGems is an operator library for large language models implemented in the Triton Language. ☆691 Updated this week
- ☆129 Updated 9 months ago
- ☆139 Updated last year
- ☆507 Updated last month
- Export LLaMA models to ONNX ☆136 Updated 9 months ago
- Efficient operation implementations based on the Cambricon Machine Learning Unit (MLU). ☆134 Updated 2 weeks ago
- AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and ver… ☆265 Updated last month
- Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052 ☆479 Updated last year
- A model compilation solution for various hardware ☆450 Updated last month
- Low-bit LLM inference on CPU/NPU with lookup tables ☆871 Updated 4 months ago
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including … ☆266 Updated 2 months ago
- A powerful toolkit for compressing large models, including LLMs, VLMs, and video generation models ☆585 Updated last month
- ☆75 Updated 10 months ago
- Triton Documentation in Simplified Chinese / Triton 中文文档 ☆86 Updated 6 months ago