MooreThreads / tutorial_on_musa
☆30Updated last week
Alternatives and similar repositories for tutorial_on_musa
Users that are interested in tutorial_on_musa are comparing it to the libraries listed below
- torch_musa is an open source repository based on PyTorch, which can make full use of the super computing power of MooreThreads graphics c…☆418Updated 3 weeks ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆52Updated 8 months ago
- ☆291Updated 3 years ago
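- This project is a simple introduction to CUDA, covering the CUDA execution model, thread hierarchy, CUDA memory model, how to write kernel functions, and two ways of using CUDA extensions from PyTorch. It provides a basic entry point to developing PyTorch-based CUDA extensions.☆89Updated 3 years ago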
- The nvImageCodec library of GPU- and CPU-accelerated codecs featuring a unified interface☆111Updated 3 months ago
- An introduction to learning CUDA programming☆35Updated 11 months ago
- ☆35Updated 5 years ago
- ppl.cv is a high-performance image processing library of openPPL supporting various platforms.☆507Updated 8 months ago
- Easy-to-use, high-performance, multi-platform inference deployment framework☆1,061Updated this week
- This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several…☆1,093Updated last year
- TensorRT 2022 runner-up solution: accelerating the MobileViT model with TensorRT☆68Updated 3 years ago
- TensorRT 7 C++ (almost) minimal examples☆82Updated last year
- A large number of CUDA/TensorRT cases for learning CUDA/TensorRT☆137Updated 2 years ago
- Serving Inside Pytorch☆163Updated this week
- Code for the book 《CUDA编程基础与实践》 (CUDA Programming: Basics and Practice)☆124Updated 3 years ago
- Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch☆388Updated this week
- QAI AppBuilder is designed to help developers easily execute models on WoS and Linux platforms. It encapsulates the Qualcomm® AI Runtime …☆55Updated this week
- PaddlePaddle (『飞桨』) custom device implementation.☆86Updated this week
- Run generative AI models on Sophgo BM1684X/BM1688☆225Updated this week
- A Connected Component Labelling algorithm implemented in CUDA☆48Updated 3 years ago
- A CUDA tutorial that teaches CUDA programming from scratch☆239Updated last year
- This is a Chinese translation of the CUDA programming guide☆1,598Updated 8 months ago
- Hands-on large-model deployment: TensorRT-LLM, Triton Inference Server, vLLM☆26Updated last year
- ☆263Updated 7 years ago
- Quick and Self-Contained TensorRT Custom Plugin Implementation and Integration☆65Updated last month
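- Companion code for the book 《GPU高性能编程CUDA实战》 (High-Performance GPU Programming: CUDA in Practice)☆37Updated 3 years ago
- Python versions of the example code from the book CUDA Programming, using the pycuda module☆252Updated 5 years ago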
- ☆121Updated 2 years ago
- ☆78Updated last year
- ☆26Updated last year