ROCm / jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
☆24 Updated this week
Alternatives and similar repositories for jax
Users who are interested in jax compare it to the libraries listed below.
- 8-bit CUDA functions for PyTorch ☆68 Updated last month
- A system validation and diagnostics tool for monitoring, stress testing, detecting, and troubleshooting issues impacting AMD GPUs in high… ☆88 Updated last week
- ROCm's Thunk Interface ☆91 Updated 8 months ago
- The AMD rocAL is designed to efficiently decode and process images and videos from a variety of storage formats and modify them through a… ☆22 Updated this week
- CMake modules used within the ROCm libraries ☆68 Updated this week
- [DEPRECATED] Moved to ROCm/rocm-systems repo ☆269 Updated this week
- ☆139 Updated last month
- A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch ☆23 Updated last week
- Bandwidth test for ROCm ☆69 Updated last week
- Development repository for the Triton language and compiler ☆137 Updated this week
- [DEPRECATED] Moved to ROCm/rocm-systems repo ☆84 Updated 3 weeks ago
- ☆61 Updated 2 years ago
- ☆51 Updated last week
- Deep Learning Primitives and Mini-Framework for OpenCL ☆204 Updated last year
- [DEPRECATED] Moved to ROCm/rocm-systems repo ☆149 Updated 2 weeks ago
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note… ☆63 Updated 4 months ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆114 Updated this week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆388 Updated this week
- RCCL Performance Benchmark Tests ☆79 Updated this week
- oneAPI Level Zero Conformance & Performance test content ☆57 Updated 2 weeks ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆108 Updated this week
- Hackable and optimized Transformers building blocks, supporting a composable construction. ☆32 Updated last week
- ☆67 Updated this week
- A collection of examples for the ROCm software stack ☆253 Updated this week
- This is ROCgdb, the ROCm source-level debugger for Linux, based on GDB, the GNU source-level debugger. ☆63 Updated this week
- AMD's graph optimization engine. ☆266 Updated this week
- ☆153 Updated this week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆147 Updated this week
- TransferBench is a utility capable of benchmarking simultaneous copies between user-specified devices (CPUs/GPUs) ☆51 Updated last week
- Fast and memory-efficient exact attention ☆200 Updated last month