VivekPanyam / cudaparsers
Parsers for CUDA binary files
☆22Updated last year
Alternatives and similar repositories for cudaparsers
Users who are interested in cudaparsers are comparing it to the libraries listed below.
- Intel® Instrumentation and Tracing Technology (ITT) and Just-In-Time (JIT) APIs☆118Updated this week
- Rust bindings to the MLIR C API.☆65Updated 2 months ago
- Heterogeneous Containerization of Large Language Model Apps☆46Updated last week
- Unit benchmarks of CUDA event APIs.☆17Updated last year
- ☆78Updated this week
- A fast and accurate reuse distance analyzer for multi-threaded applications. It leverages existing hardware features in commodity CPUs.☆19Updated 2 years ago
- VectorVisor is a vectorizing binary translator for GPUs, designed to make it easy to run many copies of a single-threaded WebAssembly pro…☆151Updated 10 months ago
- Tenstorrent system interface library☆30Updated last week
- Simplify the use of performance counters.☆64Updated 3 years ago
- Experimental kernel with built-in replication.☆160Updated 3 weeks ago
- PTX-EMU is a simple emulator for CUDA programs.☆34Updated 3 months ago
- A determinizing tracer using Ptrace☆38Updated 4 years ago
- ☆11Updated last year
- The repo for HotOS paper "FIFO can be Better than LRU: the Power of Lazy Promotion and Quick Demotion"☆33Updated 2 years ago
- Tools and experiments for 0sim. Simulate system software behavior on machines with terabytes of main memory from your desktop.☆21Updated 5 years ago
- A parser for PTX 6.5☆12Updated 2 years ago
- A zero-copy serialization library and networking stack.☆48Updated last year
- An attempt at safe imperative GPU programming.☆46Updated this week
- MPIWasm is a WebAssembly Embedder based on Wasmer that enables the high-performance execution of MPI applications compiled to Wasm. (ACM …☆19Updated last year
- Asynchronous Rust bindings for UCX☆72Updated 3 months ago
- An operation-log based approach for data replication.☆64Updated 2 years ago
- ☆143Updated last week
- SquirrelFS: A crash-consistent Rust file system for persistent memory (OSDI 24)☆62Updated 3 months ago
- An enumerator for MLIR, relying on the information given by IRDL.☆19Updated 3 weeks ago
- ☆30Updated 2 years ago
- A lightweight memory allocator for hardware-accelerated machine learning☆157Updated 4 months ago
- PTX on XPUs☆48Updated this week
- Virtual machine for executing CUDA PTX without a GPU☆38Updated last year
- TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure.☆19Updated last year
- A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications.☆25Updated 9 months ago