ROCm / MAD
MAD (Model Automation and Dashboarding)
☆25 Updated this week
Alternatives and similar repositories for MAD
Users who are interested in MAD are comparing it to the libraries listed below.
- ☆45 Updated this week
- RCCL Performance Benchmark Tests ☆76 Updated last week
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU (XPU) device. Note… ☆62 Updated 2 months ago
- PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for… ☆151 Updated 3 weeks ago
- AI Tensor Engine for ROCm ☆279 Updated this week
- MSCCL++: A GPU-driven communication stack for scalable AI applications ☆417 Updated this week
- Microsoft Collective Communication Library ☆66 Updated 10 months ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆111 Updated this week
- oneCCL Bindings for Pytorch* ☆102 Updated last month
- Bandwidth test for ROCm ☆66 Updated last week
- AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and ver… ☆265 Updated last month
- NCCL Profiling Kit ☆145 Updated last year
- ☆56 Updated this week
- OpenAI Triton backend for Intel® GPUs ☆208 Updated this week
- Development repository for the Triton language and compiler ☆131 Updated this week
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆353 Updated this week
- Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators ☆465 Updated this week
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆117 Updated this week
- Ongoing research training transformer models at scale ☆28 Updated this week
- A tool for bandwidth measurements on NVIDIA GPUs. ☆534 Updated 5 months ago
- ROCm Communication Collectives Library (RCCL) ☆381 Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆102 Updated this week
- NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the … ☆222 Updated last week
- Multi-GPU communication profiler and visualizer ☆32 Updated last year
- oneAPI Collective Communications Library (oneCCL) ☆245 Updated this week
- AMD RAD's experimental RMA library for Triton. ☆74 Updated this week
- Ahead of Time (AOT) Triton Math Library ☆76 Updated last week
- Offline optimization of your disaggregated Dynamo graph ☆67 Updated last week
- A hierarchical collective communications library with portable optimizations ☆36 Updated 9 months ago
- ☆46 Updated 9 months ago