MooreThreads / torch_musa
torch_musa is an open-source PyTorch extension that lets PyTorch workloads take full advantage of the compute power of MooreThreads GPUs.
☆313 · Updated 2 weeks ago
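The extension plugs into PyTorch as an additional device backend, so existing code mostly needs a device change. Below is a minimal usage sketch; the `"musa"` device string and the `torch.musa.is_available()` helper are assumed to mirror the `torch.cuda` conventions used by comparable out-of-tree backends, so verify them against the torch_musa README.

```python
# Minimal usage sketch for torch_musa (device name and helper calls assumed
# to mirror torch.cuda; verify against the torch_musa documentation).
import torch
import torch_musa  # importing registers the MUSA backend with PyTorch

if torch.musa.is_available():               # assumed analogue of torch.cuda.is_available()
    device = torch.device("musa")
    x = torch.randn(1024, 1024, device=device)
    w = torch.randn(1024, 1024, device=device)
    y = torch.relu(x @ w)                   # runs on the MooreThreads GPU
    print(y.mean().cpu())
else:
    print("No MUSA device detected; falling back to CPU.")
```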
Related projects
Alternatives and complementary repositories for torch_musa
- A lightweight LLM inference framework ☆699 · Updated 7 months ago
- Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch ☆253 · Updated this week
- RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications. ☆541 · Updated 3 weeks ago
- llm-export can export LLM models to ONNX. ☆226 · Updated this week
- A text-to-image project based on the open-source Stable Diffusion V1.5 model, producing models that run on mobile-phone CPUs and NPUs, along with a companion model runtime framework. ☆105 · Updated 7 months ago
- ☆588 · Updated 5 months ago
- The CUDA version of the RWKV language model (https://github.com/BlinkDL/RWKV-LM) ☆212 · Updated 5 months ago
- PaddlePaddle custom device implementation. ☆70 · Updated this week
- Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052 ☆457 · Updated 7 months ago
- ☆140 · Updated 6 months ago
- This is an inference framework for the RWKV large language model implemented purely in native PyTorch. The official native implementation… ☆118 · Updated 3 months ago
- ☆123 · Updated this week
- Machine learning compiler based on MLIR for Sophgo TPU. ☆609 · Updated this week
- Run generative AI models on the Sophgo BM1684X ☆120 · Updated this week
- C++ implementation of Qwen-LM ☆551 · Updated 10 months ago
- nndeploy is an end-to-end model deployment framework. Built on multi-backend inference and DAG-based deployment, it aims to give users a cross-platform, easy-to-use, high-performance model deployment experience. ☆631 · Updated this week
- Export LLaMA to ONNX ☆95 · Updated 5 months ago
- PyTorch Neural Network eXchange ☆518 · Updated 2 weeks ago
- LLM deployment project based on MNN. ☆1,465 · Updated this week
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including … ☆135 · Updated 2 months ago
- BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads. ☆815 · Updated 2 months ago
- [EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a V… ☆315 · Updated this week
- Low-bit LLM inference on CPU with lookup table ☆563 · Updated last week
- MegCC is a deep learning model compiler with an ultra-lightweight runtime that is efficient and easy to port ☆474 · Updated 2 weeks ago
- LLaMA/RWKV ONNX models, quantization, and test cases ☆350 · Updated last year
- ☆136 · Updated this week
- xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism ☆668 · Updated this week
- A Hugging Face mirror site. ☆234 · Updated 7 months ago
- ☆123 · Updated 10 months ago
- Run ChatGLM2-6B on the BM1684X ☆48 · Updated 8 months ago