geohot / cuda_ioctl_sniffer
Sniff CUDA ioctls
☆192 Updated last year
Alternatives and similar repositories for cuda_ioctl_sniffer:
Users who are interested in cuda_ioctl_sniffer are comparing it to the repositories listed below.
- Nvidia Instruction Set Specification Generator ☆255 Updated 9 months ago
- ctypes wrappers for HIP, CUDA, and OpenCL ☆129 Updated 9 months ago
- Enabling tinygrad compatibility with the Google Edge TPU ☆76 Updated 7 months ago
- GPUOcelot: A dynamic compilation framework for PTX ☆185 Updated 2 months ago
- ☆441 Updated last week
- Unofficial description of the CUDA assembly (SASS) instruction sets. ☆89 Updated last month
- RDNA3 emulator ☆54 Updated this week
- Apple Firestorm/Icestorm CPU microarchitecture docs ☆238 Updated last year
- A profiler to disclose and quantify hardware features on GPUs. ☆168 Updated 2 years ago
- It's a core. Made on Twitch. ☆258 Updated 3 years ago
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆317 Updated this week
- An unofficial cuda assembler, for all generations of SASS, hopefully :) ☆480 Updated last year
- Letting computers listen to you and really care ☆370 Updated 2 years ago
- ☆241 Updated 2 months ago
- Custom PTX Instruction Benchmark ☆122 Updated last month
- Assembler for NVIDIA Volta and Turing GPUs ☆216 Updated 3 years ago
- Apple GPU microarchitecture ☆511 Updated 6 months ago
- Scripts and environment for the tinybox ☆93 Updated 11 months ago
- Apple AMX Instruction Set ☆1,069 Updated 3 months ago
- Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain. ☆134 Updated last week
- Unpacking AMD's dkms packages ☆27 Updated last year
- ☆105 Updated last week
- Super fast FP32 matrix multiplication on RDNA3 ☆46 Updated 2 weeks ago
- CUDA checkpoint and restore utility ☆325 Updated 2 months ago
- MLIR-based partitioning system ☆80 Updated this week
- You like pytorch? You like micrograd? You love tinygrad! ❤️ ☆49 Updated 4 years ago
- An experimental CPU backend for Triton (https://github.com/openai/triton) ☆40 Updated last month
- RCCL Performance Benchmark Tests ☆63 Updated last week
- The missing pieces (as far as boilerplate reduction goes) of the upstream MLIR python bindings. ☆89 Updated this week
- A GPU-driven system framework for scalable AI applications ☆114 Updated 2 months ago