MooreThreads / tutorial_on_musa
☆43 · Updated 2 weeks ago
Alternatives and similar repositories for tutorial_on_musa
Users interested in tutorial_on_musa are comparing it to the libraries listed below.
- torch_musa is an open-source repository based on PyTorch that makes full use of the computing power of MooreThreads graphics c… ☆471 · Updated last week
- Run generative AI models on Sophgo BM1684X/BM1688 ☆263 · Updated last week
- Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch ☆483 · Updated this week
- llm-export exports LLM models to ONNX. ☆344 · Updated 3 months ago
- nvImageCodec: a library of GPU- and CPU-accelerated codecs with a unified interface ☆139 · Updated 3 weeks ago
- LLaMA/RWKV ONNX models, quantization, and test cases ☆367 · Updated 2 years ago
- MUSA Templates for Linear Algebra Subroutines ☆41 · Updated this week
- PaddlePaddle custom device implementation ☆101 · Updated this week
- DeepSparkHub selects hundreds of application algorithms and models, covering various fields of AI and general-purpose computing, to suppo… ☆70 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆76 · Updated last year
- ☆311 · Updated 3 years ago
- Export LLaMA models to ONNX ☆137 · Updated last year
- ☆28 · Updated last year
- Serving inside PyTorch ☆170 · Updated last week
- A tutorial for CUDA & PyTorch ☆208 · Updated last week
- ☆43 · Updated 4 years ago
- Quick and Self-Contained TensorRT Custom Plugin Implementation and Integration ☆78 · Updated 8 months ago
- ppl.cv is a high-performance image processing library of openPPL supporting various platforms. ☆514 · Updated last year
- Hands-on LLM deployment: TensorRT-LLM, Triton Inference Server, vLLM ☆27 · Updated last year
- cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples showing how to use it ☆682 · Updated this week
- Large Language Model ONNX inference framework ☆36 · Updated 2 months ago
- ☆437 · Updated 4 months ago
- Code accompanying the book 《CUDA编程基础与实践》 (CUDA Programming: Basics and Practice) ☆154 · Updated 3 years ago
- ☆14 · Updated 2 years ago
- ☆60 · Updated last year
- Explore LLM model deployment based on AXera's AI chips ☆137 · Updated this week
- A lightweight LLM inference framework ☆748 · Updated last year
- A series of GPU optimization topics introducing in detail how to optimize CUDA kernels. I will introduce several… ☆1,233 · Updated 2 years ago
- ☆27 · Updated 2 years ago
- A converter from MegEngine to other frameworks ☆70 · Updated 2 years ago