Parsers for CUDA binary files
☆24Dec 29, 2023Updated 2 years ago
Alternatives and similar repositories for cudaparsers
Users that are interested in cudaparsers are comparing it to the libraries listed below.
- A parser for PTX 6.5☆13Jun 19, 2023Updated 2 years ago
- TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure.☆19May 12, 2024Updated 2 years ago
- FlexAttention w/ FlashAttention3 Support☆27Oct 5, 2024Updated last year
- Handwritten GEMM using Intel AMX (Advanced Matrix Extension)☆17Jan 11, 2025Updated last year
- [WIP] Better (FP8) attention for Hopper☆33Feb 24, 2025Updated last year
- A CUDA kernel for NHWC GroupNorm for PyTorch☆23Nov 15, 2024Updated last year
- A survey of manufacturer-provided DRAM operating parameters and timings as specified by DRAM chip datasheets from between 1970 and 2021. …☆11May 4, 2022Updated 4 years ago
- ☆11Apr 3, 2023Updated 3 years ago
- Dynamic suballocators for external memory (e.g., Vulkan device memory). Unmaintained - consider migrating to https://crates.io/crates/offs…☆15Jul 22, 2022Updated 3 years ago
- ☆20Sep 28, 2024Updated last year
- Debug print operator for cudagraph debugging☆15Aug 2, 2024Updated last year
- Quantized Attention on GPU☆44Nov 22, 2024Updated last year
- A simple Rust crate to cache data both in-memory and on disk☆11Dec 26, 2021Updated 4 years ago
- Parse objdump files using tree-sitter☆13Nov 22, 2023Updated 2 years ago
- Native Rust implementation of Kubernetes api☆33Mar 10, 2026Updated 2 months ago
- Open Source SSD Controller. NVMe and Lightstor variants☆17May 21, 2014Updated 12 years ago
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.☆109Jun 28, 2025Updated 11 months ago
- FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of …☆31Dec 21, 2024Updated last year
- ☆11Jun 9, 2023Updated 2 years ago
- XML representation of the x86 instruction set☆29Feb 15, 2026Updated 3 months ago
- APPy (Annotated Parallelism for Python) enables users to annotate loops and tensor expressions in Python with compiler directives akin to…☆29Mar 22, 2026Updated 2 months ago
- A practical way of learning Swizzle☆39Feb 3, 2025Updated last year
- Take a QEMU binary, copy the dependencies into a chroot☆11Oct 5, 2022Updated 3 years ago
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling☆24May 20, 2026Updated last week
- FBX DOM library for Rust. // See https://github.com/lo48576/fbx-viewer for working example application // NO PLAN TO UPDATE in the forese…☆28Mar 20, 2023Updated 3 years ago
- 🎉My Collections of CUDA Kernels~☆11Jun 25, 2024Updated last year
- Pseudo-LRU implementation using 1-bit per entry and achieving Full-LRU performance.☆23Dec 17, 2022Updated 3 years ago
- Assembler and Decompiler for NVIDIA (Maxwell Pascal Volta Turing Ampere) GPUs.☆95Feb 23, 2023Updated 3 years ago
- 🧮 Polynomial Calculator☆12Jan 3, 2023Updated 3 years ago
- ☆15Dec 16, 2021Updated 4 years ago
- A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications.☆36Oct 13, 2024Updated last year
- cross-project Gitlab artifact dependencies☆13Jan 1, 2026Updated 4 months ago
- Energy Consumption-Aware Tabular Benchmark For Neural Architecture Search☆11Aug 18, 2025Updated 9 months ago
- ☆46Nov 1, 2025Updated 6 months ago
- corundum work on vu13p☆23Nov 10, 2023Updated 2 years ago
- Wait free synchronization primitives☆23May 19, 2026Updated last week
- Trace Replay and Network Simulation Framework☆21Apr 14, 2021Updated 5 years ago
- Integrates crashdump reporting with Sentry☆21Nov 15, 2023Updated 2 years ago
- Minimal examples of crates useful for compiler development☆28Updated this week