MooreThreads / tutorial_on_musa
☆42 · Updated 2 weeks ago
Alternatives and similar repositories for tutorial_on_musa
Users interested in tutorial_on_musa are comparing it to the libraries listed below.
- torch_musa is an open-source repository based on PyTorch that makes full use of the super computing power of MooreThreads graphics c… ☆456 · Updated last month
- Run generative AI models on Sophgo BM1684X/BM1688 ☆254 · Updated 2 weeks ago
- llm-export can export LLM models to ONNX. ☆336 · Updated last month
- QAI AppBuilder is designed to help developers easily execute models on WoS and Linux platforms. It encapsulates the Qualcomm® AI Runtime … ☆96 · Updated this week
- A tutorial for CUDA & PyTorch ☆171 · Updated 11 months ago
- Serving Inside PyTorch ☆166 · Updated last week
- Large Language Model ONNX Inference Framework ☆36 · Updated 3 weeks ago
- Quick and Self-Contained TensorRT Custom Plugin Implementation and Integration ☆74 · Updated 6 months ago
- Run ChatGLM2-6B on BM1684X ☆49 · Updated last year
- Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch ☆461 · Updated this week
- Stable Diffusion using MNN ☆67 · Updated 2 years ago
- PaddlePaddle custom device implementation (custom hardware integration for PaddlePaddle) ☆101 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆71 · Updated last year
- ☆60 · Updated last year
- nvImageCodec: a library of GPU- and CPU-accelerated codecs featuring a unified interface ☆127 · Updated 2 weeks ago
- ☆43 · Updated 3 years ago
- DeepSparkHub selects hundreds of application algorithms and models, covering various fields of AI and general-purpose computing, to suppo… ☆69 · Updated last month
- Explore LLM model deployment based on AXera's AI chips ☆131 · Updated last week
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including … ☆270 · Updated 4 months ago
- A simple introduction to CUDA, covering the CUDA execution model, thread hierarchy, the CUDA memory model, how to write kernel functions, and the two ways of building CUDA extensions for PyTorch; a basic starting point for developing PyTorch-based CUDA extensions. ☆94 · Updated 4 years ago
- ☆140 · Updated last year
- CMake configurations for PPL projects ☆12 · Updated last year
- ☆135 · Updated last week
- ☆28 · Updated last year
- The DeepSpark open platform selects hundreds of open source application algorithms and models that are deeply coupled with industrial app… ☆45 · Updated 2 weeks ago
- Hands-on large language model deployment: TensorRT-LLM, Triton Inference Server, vLLM ☆26 · Updated last year
- An example of Segment Anything inference with ncnn ☆124 · Updated 2 years ago
- ☆27 · Updated 2 years ago
- Export LLaMA to ONNX ☆137 · Updated 11 months ago
- Parallel Prefix Sum (Scan) with CUDA ☆27 · Updated last year