MooreThreads / torch_musa
torch_musa is an open-source repository based on PyTorch that makes full use of the computing power of MooreThreads graphics cards.
☆351Updated 2 months ago
Alternatives and similar repositories for torch_musa:
Users who are interested in torch_musa compare it to the libraries listed below:
- a lightweight LLM model inference framework☆713Updated 9 months ago
- Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch☆294Updated this week
- PaddlePaddle custom device implementation.☆78Updated last week
- ☆119Updated last year
- llm-export can export LLM models to ONNX.☆257Updated last week
- Machine learning compiler based on MLIR for Sophgo TPU.☆649Updated last week
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆220Updated this week
- C++ implementation of Qwen-LM☆571Updated last month
- export llama to onnx☆112Updated last month
- ☆140Updated 9 months ago
- Triton documentation in Simplified Chinese☆52Updated 2 weeks ago
- FlagGems is an operator library for large language models implemented in Triton Language.☆407Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs☆37Updated 3 months ago
- ☆598Updated 7 months ago
- ☆127Updated last month
- optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052☆469Updated 10 months ago
- Run generative AI models in sophgo BM1684X☆155Updated this week
- learning how CUDA works☆190Updated 5 months ago
- MegCC is a deep learning model compiler with an ultra-lightweight runtime that is efficient and easy to port☆476Updated 3 months ago
- run ChatGLM2-6B in BM1684X☆49Updated 10 months ago
- FlagScale is a large model toolkit based on open-sourced projects.☆209Updated this week
- ☆151Updated last month
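- A text-to-image project that, based on the open-source model Stable Diffusion V1.5, produces models that can run on a phone's CPU and NPU, along with the accompanying model runtime framework.☆135Updated 10 months ago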
- ☆76Updated last year
- Efficient operation implementation based on the Cambricon Machine Learning Unit (MLU).☆108Updated last week
- LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training☆398Updated last week
- RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.☆607Updated last week
- A tutorial for CUDA&PyTorch☆126Updated last week
- ☆311Updated last week
- [EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a V…☆390Updated last week