Repository to host ROCm Developer Hub Notebook Tutorials
☆67 · Mar 24, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for gpuaidev
Users that are interested in gpuaidev are comparing it to the libraries listed below.
- ☆11 · Sep 8, 2025 · Updated 7 months ago
- Automating analysis from trace files ☆66 · Apr 4, 2026 · Updated last week
- Automated bottleneck detection and solution orchestration ☆20 · Feb 24, 2026 · Updated last month
- [DEPRECATED] Moved to ROCm/rocm-systems repo ☆30 · Apr 1, 2026 · Updated last week
- ☆15 · Feb 2, 2026 · Updated 2 months ago
- Official implementation for the paper Lancet: Accelerating Mixture-of-Experts Training via Whole Graph Computation-Communication Overlapp… ☆14 · Nov 17, 2025 · Updated 4 months ago
- Flax (Jax) implementation of DeepSeek-R1-Distill-Qwen-1.5B with weights ported from Hugging Face. ☆26 · Feb 20, 2025 · Updated last year
- ☆15 · Feb 23, 2025 · Updated last year
- Schola is a plugin for enabling Reinforcement Learning (RL) in Unreal Engine. It provides tools to help developers create environments, d… ☆67 · Dec 18, 2025 · Updated 3 months ago
- A Linux kernel module, that allows changing/toggling system parameters stored in MSR and PCI registers of x86 processors ☆16 · Mar 29, 2023 · Updated 3 years ago
- libmdk codec plugin based on microsoft media foundation transform ☆11 · Jun 30, 2022 · Updated 3 years ago
- Step by step implementation of a fast softmax kernel in CUDA ☆65 · Jan 6, 2025 · Updated last year
- Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge. ☆19 · Feb 9, 2026 · Updated 2 months ago
- HPC Performance Anomaly Suite ☆21 · Jun 11, 2020 · Updated 5 years ago
- NUMA-aware multi-CPU multi-GPU data transfer benchmarks ☆28 · Oct 26, 2023 · Updated 2 years ago
- ☆21 · Mar 23, 2026 · Updated 2 weeks ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆12 · Mar 27, 2026 · Updated 2 weeks ago
- Unlocked and enhanced driver for all graphics tablets from Huion and Gaomon ☆11 · Apr 20, 2020 · Updated 5 years ago
- ☆39 · Aug 8, 2025 · Updated 8 months ago
- Lime sample projects ☆14 · Jan 2, 2025 · Updated last year
- ☆28 · Mar 31, 2026 · Updated last week
- ☆93 · Nov 11, 2025 · Updated 5 months ago
- Super fast FP32 matrix multiplication on RDNA3 ☆87 · Mar 30, 2025 · Updated last year
- A collection of GPU experiments and benchmarks for my personal understanding and research. ☆28 · Mar 18, 2026 · Updated 3 weeks ago
- ROCm Documentation Python package for ReadTheDocs build standardization ☆17 · Updated this week
- The goal of the OSSCI Fleet is to provide a central mechanism to enable test automation, batch job scheduling, and developer access to a … ☆13 · Mar 24, 2026 · Updated 2 weeks ago
- Using ncnn to test the reasoning performance of neural network ☆38 · Jan 18, 2026 · Updated 2 months ago
- The AMD rocAL is designed to efficiently decode and process images and videos from a variety of storage formats and modify them through a… ☆23 · Apr 2, 2026 · Updated last week
- High Performance FP8 GEMM Kernels for SM89 and later GPUs. ☆21 · Jan 24, 2025 · Updated last year
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆26 · Mar 30, 2026 · Updated last week
- An interactive web-based tool for exploring intermediate representations of PyTorch and Triton models ☆49 · Jan 23, 2026 · Updated 2 months ago
- Easy interactive prompts to create and validate data using JSON schema. ☆10 · Feb 27, 2026 · Updated last month
- ☆36 · Sep 22, 2025 · Updated 6 months ago
- The AMD Debugger API is a library that provides all the support necessary for a debugger and other tools to perform low level control of … ☆18 · Mar 31, 2026 · Updated last week
- ☆66 · Apr 4, 2026 · Updated last week
- Official Implementation of Knowledge Flow Prompting ☆35 · Oct 20, 2025 · Updated 5 months ago
- General Matrix Multiplication using NVIDIA Tensor Cores ☆28 · Jan 25, 2025 · Updated last year
- ☆10 · Feb 12, 2025 · Updated last year
- Collective and Neighbor Collective Optimizations and Extensions ☆13 · Mar 26, 2026 · Updated 2 weeks ago