Multi-V-VM / hetGPU
PTX on XPUs
☆115Updated last week
Alternatives and similar repositories for hetGPU
Users that are interested in hetGPU are comparing it to the libraries listed below
- Extending eBPF Programmability and Observability to GPUs (merged into https://github.com/eunomia-bpf/bpftime)☆284Updated last month
- PTX-EMU is a simple emulator for CUDA programs.☆38Updated 8 months ago
- Open ABI and FFI for Machine Learning Systems☆274Updated last week
- Heterogeneous Containerization of Large Language Model Apps☆109Updated 5 months ago
- A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications.☆27Updated last year
- Triton to TVM transpiler.☆22Updated last year
- Fast OS-level support for GPU checkpoint and restore☆267Updated 3 months ago
- CXLMemSim: A pure software simulated CXL.mem for performance characterization☆243Updated this week
- ☆222Updated 2 weeks ago
- Assembler and Decompiler for NVIDIA (Maxwell, Pascal, Volta, Turing, Ampere) GPUs.☆95Updated 2 years ago
- An experimental CPU backend for Triton☆168Updated last month
- LLVM OpenCL C compiler suite for ventus GPGPU☆58Updated 2 weeks ago
- ☆92Updated 9 months ago
- Expert Kit is an efficient foundation of Expert Parallelism (EP) for MoE model inference on heterogeneous hardware☆61Updated 2 months ago
- Handwritten GEMM using Intel AMX (Advanced Matrix Extension)☆17Updated 11 months ago
- incubator repo for CUDA-TileIR backend☆56Updated this week
- Unofficial description of the CUDA assembly (SASS) instruction sets.☆193Updated 5 months ago
- Asynchronous semantics for architectural simulation and synthesis.☆64Updated this week
- Tutorials for NVIDIA CUPTI samples☆47Updated 2 months ago
- ☆26Updated 10 months ago
- A language and compiler for irregular tensor programs.☆152Updated last year
- A domain-specific language (DSL) based on Triton but providing higher-level abstractions.☆38Updated this week
- Source code for the FAST '23 paper “MadFS: Per-File Virtualization for Userspace Persistent Memory Filesystems”☆45Updated 2 years ago
- NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process com…☆433Updated last week
- A collection of CUDA programming examples to learn GPU programming☆52Updated 2 months ago
- A repository where GPU applications are aggregated using a common build flow that supports multiple CUDA versions.☆86Updated 2 months ago
- A Top-Down Profiler for GPU Applications☆22Updated last year
- Lightweight daemon for monitoring CUDA runtime API calls with eBPF uprobes☆144Updated 9 months ago
- A flexible, high-performance, user-friendly computer architecture simulator engine☆95Updated this week
- FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of …☆31Updated last year