Parsers for CUDA binary files
☆24 Dec 29, 2023 Updated 2 years ago
Alternatives and similar repositories for cudaparsers
Users who are interested in cudaparsers are comparing it to the libraries listed below.
- TiledLower is a Dataflow Analysis and Codegen Framework written in Rust. ☆13 Nov 23, 2024 Updated last year
- A parser for PTX 6.5 ☆13 Jun 19, 2023 Updated 2 years ago
- TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure. ☆19 May 12, 2024 Updated last year
- FlexAttention w/ FlashAttention3 Support ☆27 Oct 5, 2024 Updated last year
- Handwritten GEMM using Intel AMX (Advanced Matrix Extension) ☆17 Jan 11, 2025 Updated last year
- SGLang Kernel Wheel Index ☆21 Updated this week
- [WIP] Better (FP8) attention for Hopper ☆33 Feb 24, 2025 Updated last year
- A CUDA kernel for NHWC GroupNorm for PyTorch ☆23 Nov 15, 2024 Updated last year
- A survey of manufacturer-provided DRAM operating parameters and timings as specified by DRAM chip datasheets from between 1970 and 2021. … ☆11 May 4, 2022 Updated 3 years ago
- ☆20 Sep 28, 2024 Updated last year
- ☆26 Feb 17, 2025 Updated last year
- Debug print operator for cudagraph debugging ☆14 Aug 2, 2024 Updated last year
- Quite OK image compression Verilog implementation ☆23 Nov 27, 2024 Updated last year
- Benchmark tests supporting the TiledCUDA library. ☆18 Nov 19, 2024 Updated last year
- Quantized Attention on GPU ☆44 Nov 22, 2024 Updated last year
- A simple Rust crate to cache data both in-memory and on disk ☆11 Dec 26, 2021 Updated 4 years ago
- Parse objdump files using tree-sitter ☆13 Nov 22, 2023 Updated 2 years ago
- Native Rust implementation of Kubernetes api ☆33 Mar 10, 2026 Updated last month
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing. ☆106 Jun 28, 2025 Updated 9 months ago
- A tiny FP8 multiplication unit written in Verilog. TinyTapeout 2 submission. ☆14 Nov 23, 2022 Updated 3 years ago
- Open Source SSD Controller. NVMe and Lightstor variants ☆17 May 21, 2014 Updated 11 years ago
- ☆15 Jan 8, 2024 Updated 2 years ago
- FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of … ☆31 Dec 21, 2024 Updated last year
- ☆11 Jun 9, 2023 Updated 2 years ago
- Framework to reduce autotune overhead to zero for well known deployments. ☆99 Sep 19, 2025 Updated 7 months ago
- XML representation of the x86 instruction set ☆29 Feb 15, 2026 Updated 2 months ago
- APPy (Annotated Parallelism for Python) enables users to annotate loops and tensor expressions in Python with compiler directives akin to… ☆30 Mar 22, 2026 Updated 3 weeks ago
- ☆33 Feb 3, 2025 Updated last year
- A Rust version of picosvg. ☆14 Oct 7, 2022 Updated 3 years ago
- A practical way of learning Swizzle ☆37 Feb 3, 2025 Updated last year
- ☆32 Jun 6, 2024 Updated last year
- Pure Rust implementation of the meshoptimizer library ☆28 Dec 18, 2025 Updated 4 months ago
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling ☆22 Apr 9, 2026 Updated last week
- 🎉My Collections of CUDA Kernels~ ☆11 Jun 25, 2024 Updated last year
- Assembler and Decompiler for NVIDIA (Maxwell Pascal Volta Turing Ampere) GPUs. ☆94 Feb 23, 2023 Updated 3 years ago
- 🧮 Polynomial Calculator ☆12 Jan 3, 2023 Updated 3 years ago
- General Purpose Graphics Processing Unit (GPGPU) IP Core ☆11 Jul 4, 2014 Updated 11 years ago
- ☆15 Dec 16, 2021 Updated 4 years ago
- A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications. ☆34 Oct 13, 2024 Updated last year