Ascend / cann-container-image
Dockerfiles for Ascend CANN
☆42Updated this week
Alternatives and similar repositories for cann-container-image
Users who are interested in cann-container-image compare it to the repositories listed below.
- Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch☆485Updated last week
- torch_musa is an open source repository based on PyTorch, which can make full use of the super computing power of MooreThreads graphics c…☆475Updated last week
- A distributed scheduling system for HPC and AI workloads☆132Updated this week
- SJTU HPC user documentation site☆193Updated 3 months ago
- Documentation for HPC course☆160Updated 8 months ago
- Ascend TileLang adapter☆217Updated this week
- High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.☆1,467Updated this week
- SGLang kernel library for NPU☆96Updated last week
- A self-learning tutorial for CUDA High Performance Programming.☆882Updated 3 weeks ago
- GLake: optimizing GPU memory management and IO transmission.☆497Updated 10 months ago
- An interactive Ascend-NPU process viewer☆123Updated last month
- Community maintained hardware plugin for vLLM on Ascend☆1,651Updated this week
- Super Computing On Web☆315Updated this week
- A high-performance deep learning training platform with task-level time-sharing scheduling of GPU compute☆737Updated 2 years ago
- ☆73Updated last year
- Wiki for HPC☆130Updated 6 months ago
- A pupil in the computer world. (Felix Fu)☆254Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆76Updated last year
- A tutorial for CUDA&PyTorch☆253Updated last week
- Triton adapter for Ascend. Mirror of https://gitee.com/ascend/triton-ascend☆107Updated this week
- Implement custom operators in PyTorch with CUDA/C++☆76Updated 3 years ago
- Codes & examples for "CUDA - From Correctness to Performance"☆121Updated last year
- JittorInfer is a high-performance C++ inference framework designed for large language models on Huawei's Ascend AI processor.☆78Updated this week
- Triton documentation in Simplified Chinese (Triton 中文文档)☆103Updated last month
- A tool for bandwidth measurements on NVIDIA GPUs.☆618Updated 9 months ago
- ☆288Updated last week
- MUSA Templates for Linear Algebra Subroutines☆42Updated 2 weeks ago
- Efficient and easy multi-instance LLM serving☆527Updated 5 months ago
- The Zaychik Power Controller server☆13Updated last year
- A fast communication-overlapping library for tensor/expert parallelism on GPUs.☆1,242Updated 5 months ago