MooreThreads / torch_musa
torch_musa is an open-source PyTorch extension that lets PyTorch take full advantage of the computing power of Moore Threads GPUs.
☆376 · Updated last month
Alternatives and similar repositories for torch_musa:
Users interested in torch_musa are comparing it to the repositories listed below.
- A lightweight LLM inference framework☆722 · Updated 11 months ago
- ☆118 · Updated last year
- Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch☆320 · Updated this week
- PaddlePaddle custom device implementation.☆82 · Updated this week
- llm-export exports LLM models to ONNX.☆274 · Updated 2 months ago
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆240 · Updated 3 weeks ago
- FlagGems is an operator library for large language models implemented in Triton Language.☆467 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs☆46 · Updated 5 months ago
- MegCC is a deep learning model compiler with an ultra-lightweight runtime that is efficient and easy to port.☆482 · Updated 5 months ago
- A CPU tool for benchmarking peak floating-point performance☆531 · Updated 5 months ago
- ☆139 · Updated 11 months ago
- ☆30 · Updated last year
- Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052☆471 · Updated last year
- Run generative AI models on the Sophgo BM1684X☆193 · Updated this week
- [EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a V…☆443 · Updated this week
- This is an inference framework for the RWKV large language model implemented purely in native PyTorch. The official native implementation…☆127 · Updated 8 months ago
- ☆127 · Updated 3 months ago
- 📚FFPA(Split-D): Yet another Faster Flash Prefill Attention with O(1) GPU SRAM complexity for headdim > 256, ~2x↑🎉vs SDPA EA.☆157 · Updated last week
- Machine learning compiler based on MLIR for Sophgo TPU.☆698 · Updated last week
- Low-bit LLM inference on CPU with lookup table☆705 · Updated 2 months ago
- The CUDA version of the RWKV language model ( https://github.com/BlinkDL/RWKV-LM )☆221 · Updated 3 months ago
- ☆410 · Updated last week
- GLake: optimizing GPU memory management and IO transmission.☆451 · Updated last week
- Export LLaMA to ONNX☆120 · Updated 3 months ago
- Compiler Infrastructure for Neural Networks☆145 · Updated last year
- LLaMa/RWKV onnx models, quantization and testcase☆359 · Updated last year
- A text-to-image project based on the open-source Stable Diffusion V1.5 model: it produces models that run on mobile-phone CPUs and NPUs, along with a companion model-runtime framework.☆149 · Updated last year
- LLM deployment project based on MNN. This project has been merged into MNN.☆1,573 · Updated 2 months ago
- Inference RWKV v5, v6 and v7 with the Qualcomm AI Engine Direct SDK☆60 · Updated last week
- cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples showing how to use it☆534 · Updated last week