ROCm / pyrsmi
Python package of rocm-smi-lib
☆23 Updated 2 months ago
Alternatives and similar repositories for pyrsmi
Users who are interested in pyrsmi are comparing it to the libraries listed below.
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆57 Updated this week
- ☆74 Updated 6 months ago
- extensible collectives library in triton ☆88 Updated 5 months ago
- Ahead of Time (AOT) Triton Math Library ☆76 Updated last week
- This repository contains the experimental PyTorch native float8 training UX ☆224 Updated last year
- Framework to reduce autotune overhead to zero for well known deployments. ☆82 Updated last week
- ☆45 Updated this week
- ☆42 Updated last week
- A bunch of kernels that might make stuff slower 😉 ☆59 Updated this week
- TORCH_LOGS parser for PT2 ☆60 Updated last week
- ☆21 Updated 6 months ago
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools ☆39 Updated last month
- Fast low-bit matmul kernels in Triton ☆373 Updated this week
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆212 Updated last week
- How to ensure correctness and ship LLM generated kernels in PyTorch ☆60 Updated this week
- QuTLASS: CUTLASS-Powered Quantized BLAS for Deep Learning ☆97 Updated 2 weeks ago
- ☆90 Updated 10 months ago
- Triton-based Symmetric Memory operators and examples ☆30 Updated this week
- An experimental CPU backend for Triton (https://github.com/openai/triton) ☆45 Updated last month
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ☆299 Updated last week
- Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard! ☆96 Updated last week
- 👷 Build compute kernels ☆147 Updated this week
- A parallel framework for training deep neural networks ☆63 Updated 6 months ago
- ☆217 Updated 8 months ago
- AMD RAD's experimental RMA library for Triton. ☆74 Updated last week
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆162 Updated this week
- No-GIL Python environment featuring NVIDIA Deep Learning libraries. ☆64 Updated 5 months ago
- Explore training for quantized models ☆24 Updated 2 months ago
- Evaluating Large Language Models for CUDA Code Generation ComputeEval is a framework designed to generate and evaluate CUDA code from Lar… ☆66 Updated 3 months ago
- Applied AI experiments and examples for PyTorch ☆296 Updated last month