MooreThreads / tutorial_on_musa
☆43Updated last month
Alternatives and similar repositories for tutorial_on_musa
Users that are interested in tutorial_on_musa are comparing it to the libraries listed below
- torch_musa is an open source repository based on PyTorch, which can make full use of the super computing power of MooreThreads graphics c…☆469Updated this week
- Run generative AI models in sophgo BM1684X/BM1688☆260Updated this week
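- This project gives a simple introduction to CUDA, covering the CUDA execution model, the thread hierarchy, the CUDA memory model, how to write kernel functions, and two ways of using CUDA extensions from PyTorch. It provides a basic starting point for developing PyTorch-based CUDA extensions.☆94Updated 4 years ago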
- Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch☆476Updated this week
- DeepSparkHub selects hundreds of application algorithms and models, covering various fields of AI and general-purpose computing, to suppo…☆69Updated 2 weeks ago
- MUSA Templates for Linear Algebra Subroutines☆39Updated 3 weeks ago
- A tutorial for CUDA&PyTorch☆175Updated 11 months ago
- ppl.cv is a high-performance image processing library of openPPL supporting various platforms.☆514Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆73Updated last year
- PaddlePaddle custom device implementation. (Custom hardware integration for 飞桨 PaddlePaddle)☆101Updated this week
- Serving Inside Pytorch☆169Updated last week
- The DeepSpark open platform selects hundreds of open source application algorithms and models that are deeply coupled with industrial app…☆45Updated 2 weeks ago
- QAI AppBuilder is designed to help developers easily execute models on WoS and Linux platforms. It encapsulates the Qualcomm® AI Runtime …☆100Updated this week
- Explore LLM model deployment based on AXera's AI chips☆136Updated this week
- llm-export can export llm model to onnx.☆340Updated 2 months ago
- A large number of cuda/tensorrt cases. (Many examples for learning cuda/tensorrt)☆168Updated 3 years ago
- This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several…☆1,219Updated 2 years ago
- ☆308Updated 3 years ago
- An introduction to learning CUDA programming☆38Updated last year
- ☆34Updated 5 years ago
- PaddlePaddle Code Convert Toolkit. (飞桨 deep learning code conversion tool)☆119Updated this week
- Quick and Self-Contained TensorRT Custom Plugin Implementation and Integration☆77Updated 7 months ago
- Machine learning compiler based on MLIR for Sophgo TPU.☆839Updated 2 weeks ago
- A CUDA tutorial to help people learn CUDA programming from scratch☆264Updated last year
- learning how CUDA works☆362Updated 10 months ago
- Python versions of the example code from the book CUDA Programming, written with the pycuda module☆258Updated 5 years ago
- A light llama-like llm inference framework based on the triton kernel.☆167Updated this week
- llama 2 Inference☆43Updated 2 years ago
- A nvImageCodec library of GPU- and CPU- accelerated codecs featuring a unified interface☆134Updated this week
- CMake configurations for PPL projects☆12Updated last year