ROCm / pyrsmi
Python package of rocm-smi-lib
☆21 · Updated 9 months ago
Alternatives and similar repositories for pyrsmi
Users interested in pyrsmi are comparing it to the libraries listed below:
- ☆74 · Updated 3 months ago
- Extensible collectives library in Triton ☆87 · Updated 3 months ago
- Fast low-bit matmul kernels in Triton ☆330 · Updated last week
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆137 · Updated this week
- Boosting 4-bit inference kernels with 2:4 sparsity ☆80 · Updated 10 months ago
- This repository contains the experimental PyTorch-native float8 training UX ☆224 · Updated 11 months ago
- A bunch of kernels that might make stuff slower 😉 ☆55 · Updated last week
- Applied AI experiments and examples for PyTorch ☆286 · Updated last month
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆46 · Updated this week
- ☆106 · Updated 10 months ago
- Framework to reduce autotune overhead to zero for well-known deployments ☆79 · Updated this week
- ☆74 · Updated 3 weeks ago
- ☆83 · Updated 8 months ago
- Collection of kernels written in the Triton language ☆136 · Updated 3 months ago
- ☆225 · Updated last week
- Ring-attention experiments ☆143 · Updated 9 months ago
- QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference ☆118 · Updated last year
- Repository for sparse fine-tuning of LLMs via a modified version of the MosaicML llmfoundry ☆42 · Updated last year
- Ahead-of-Time (AOT) Triton math library ☆70 · Updated last week
- High-speed GEMV kernels, with up to 2.7x speedup over the PyTorch baseline ☆112 · Updated last year
- Load compute kernels from the Hub ☆207 · Updated this week
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch-native components ☆206 · Updated this week
- Explore training for quantized models ☆20 · Updated last week
- ☆157 · Updated last year
- Development repository for the Triton language and compiler ☆125 · Updated this week
- A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate ☆187 · Updated this week
- PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference ☆64 · Updated 3 months ago
- ☆21 · Updated 4 months ago
- DeeperGEMM: crazy optimized version ☆69 · Updated 2 months ago
- ☆40 · Updated this week