geohot / cuda_ioctl_sniffer

Sniff CUDA ioctls

☆205

Alternatives and similar repositories for cuda_ioctl_sniffer

Users who are interested in cuda_ioctl_sniffer are comparing it to the repositories listed below.

tinygrad / gpuctypes
ctypes wrappers for HIP, CUDA, and OpenCL
☆130 Updated last year
kuterd / nv_isa_solver
Nvidia Instruction Set Specification Generator
☆285 Updated last year
gpuocelot / gpuocelot
GPUOcelot: A dynamic compilation framework for PTX
☆204 Updated 5 months ago
geohot / edgetpuxray
Enabling tinygrad compatibility with the Google Edge TPU
☆78 Updated 11 months ago
tinygrad / 7900xtx
☆449 Updated 3 months ago
tenstorrent / tt-isa-documentation
☆53 Updated this week
Qazalin / remu
RDNA3 emulator
☆54 Updated 3 months ago
google / ml-compiler-opt
Infrastructure for Machine Learning Guided Optimization (MLGO) in LLVM.
☆709 Updated this week
seb-v / fp32_sgemm_amd
Super fast FP32 matrix multiplication on RDNA3
☆70 Updated 4 months ago
0xD0GF00D / DocumentSASS
Unofficial description of the CUDA assembly (SASS) instruction sets.
☆124 Updated 2 weeks ago
microsoft / ArchProbe
A profiler to disclose and quantify hardware features on GPUs.
☆173 Updated 3 years ago
amd / ZenDNN
☆115 Updated this week
LaurieWired / BenchmarkCustomPTX
Custom PTX Instruction Benchmark
☆126 Updated 5 months ago
pytorch-labs / triton-cpu
An experimental CPU backend for Triton (https://github.com/openai/triton)
☆43 Updated 4 months ago
NVIDIA / Fuser
A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")
☆345 Updated this week
dougallj / applegpu
Apple G13 GPU architecture docs and tools
☆597 Updated 2 months ago
corsix / amx
Apple AMX Instruction Set
☆1,121 Updated 7 months ago
tenstorrent / tt-mlir
Tenstorrent MLIR compiler
☆165 Updated this week
intel / mlir-extensions
Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain.
☆138 Updated this week
geohot / twitchcore
It's a core. Made on Twitch.
☆261 Updated 3 years ago
unixpickle / learn-ptx
Learning about CUDA by writing PTX code.
☆133 Updated last year
tenstorrent / tt-forge
Tenstorrent's MLIR Based Compiler. We aim to enable developers to run AI on all configurations of Tenstorrent hardware, through an open-s…
☆96 Updated this week
philipturner / metal-benchmarks
Apple GPU microarchitecture
☆540 Updated 10 months ago
NVIDIA / compute-sanitizer-samples
Samples demonstrating how to use the Compute Sanitizer Tools and Public API
☆85 Updated last year
dougallj / applecpu
Apple Firestorm/Icestorm CPU microarchitecture docs
☆241 Updated 2 years ago
cloudcores / CuAssembler
An unofficial cuda assembler, for all generations of SASS, hopefully ：）
☆525 Updated 2 years ago
ROCm / hipBLASLt
[DEPRECATED] Moved to ROCm/rocm-libraries repo
☆111 Updated this week
ROCm / hipBLAS
[DEPRECATED] Moved to ROCm/rocm-libraries repo
☆145 Updated this week
ROCm / rocMLIR
☆148 Updated this week
salykova / sgemm.cu
High-Performance SGEMM on CUDA devices
☆98 Updated 6 months ago