ROCm / pyrsmi

Python package of rocm-smi-lib

☆24

Alternatives and similar repositories for pyrsmi

Users who are interested in pyrsmi are comparing it to the libraries listed below.

deepspeedai / DeepSpeed-Kernels
☆71 Updated 8 months ago
ROCm / TransformerEngine
☆51 Updated this week
gpu-mode / discord-cluster-manager
Write a fast kernel and run it on Discord. See how you compare against the best!
☆61 Updated last week
meta-pytorch / BackendBench
How to ensure correctness and ship LLM generated kernels in PyTorch
☆121 Updated 2 weeks ago
cchan / tccl
extensible collectives library in triton
☆91 Updated 8 months ago
ROCm / aotriton
Ahead of Time (AOT) Triton Math Library
☆84 Updated 2 weeks ago
meta-pytorch / triton-cpu
An experimental CPU backend for Triton (https://github.com/openai/triton)
☆47 Updated 3 months ago
IBM / triton-dejavu
Framework to reduce autotune overhead to zero for well known deployments.
☆88 Updated 2 months ago
ROCm / iris
AMD RAD's multi-GPU Triton-based framework for seamless multi-GPU programming
☆116 Updated last week
apple / ml-recurrent-drafter
☆218 Updated 10 months ago
lianakoleva / no-libtorch-compile
☆21 Updated 8 months ago
groq / mlagility
Machine Learning Agility (MLAgility) benchmark and benchmarking tools
☆40 Updated 4 months ago
intel / torch-xpu-ops
☆63 Updated last week
gau-nernst / quantized-training
Explore training for quantized models
☆25 Updated 4 months ago
intel / intel-extension-for-deepspeed
Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note…
☆63 Updated 5 months ago
axonn-ai / axonn
Parallel framework for training and fine-tuning deep neural networks
☆69 Updated 2 weeks ago
north-numerical-computing / tensor-cores-numerical-behavior
Test suite for probing the numerical behavior of NVIDIA tensor cores
☆41 Updated last year
foundation-model-stack / foundation-model-stack
🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components.
☆217 Updated last week
wangsiping97 / FastGEMV
High-speed GEMV kernels, at most 2.7x speedup compared to pytorch baseline.
☆122 Updated last year
ademeure / QuickRunCUDA
☆14 Updated 3 weeks ago
jax-ml / ml_dtypes
A stand-alone implementation of several NumPy dtype extensions used in machine learning.
☆312 Updated this week
HabanaAI / Megatron-DeepSpeed
Intel Gaudi's Megatron DeepSpeed Large Language Models for training
☆15 Updated 11 months ago
open-lm-engine / accelerated-model-architectures
A bunch of kernels that might make stuff slower 😉
☆65 Updated this week
NVIDIA / free-threaded-python
No-GIL Python environment featuring NVIDIA Deep Learning libraries.
☆69 Updated 7 months ago
triton-lang / kernels
☆94 Updated last year
hpcaitech / TensorNVMe
A Python library that transfers PyTorch tensors between CPU and NVMe
☆122 Updated last year
meta-pytorch / float8_experimental
This repository contains the experimental PyTorch native float8 training UX
☆226 Updated last year
ROCm / triton
Development repository for the Triton language and compiler
☆137 Updated this week
amazon-science / mxfp4-llm
Official implementation for Training LLMs with MXFP4
☆109 Updated 7 months ago
vllm-project / compressed-tensors
A safetensors extension to efficiently store sparse quantized tensors on disk
☆210 Updated last week