An MLIR-based compiler that takes GPU kernels and compiles them to real hardware instructions. Interactive web visualizer included.
☆132Mar 21, 2026Updated 3 months ago
Alternatives and similar repositories for tiny-gpu-compiler
Users that are interested in tiny-gpu-compiler are comparing it to the libraries listed below.
- Implementations from Proofs, Arguments and Zero Knowledge☆14Mar 29, 2025Updated last year
- A curated list of awesome things related to learning Binius☆16Jan 9, 2025Updated last year
- Lock-free elimination back-off stack☆13Jan 6, 2022Updated 4 years ago
- Instead of scrolling on your phone after work, learn something. Series 1: the CUDA computing framework CUFX (Cuda Framework eXtended).☆17Dec 15, 2024Updated last year
- A cutting-edge zkWASM implementation leveraging Nova-NIVC-based folding techniques.☆41Oct 28, 2025Updated 8 months ago
- A series of high-performance GEMM (General Matrix Multiply) implementations, iteratively optimised for H100 GPUs in pure CUDA.☆79Feb 18, 2026Updated 4 months ago
- An interactive web-based tool for exploring intermediate representations of PyTorch and Triton models☆49Jan 23, 2026Updated 5 months ago
- RPC request router and proxy for Starknet, forked from Optimism proxyd.☆12Feb 26, 2024Updated 2 years ago
- HeteroRefactor: Refactoring for Heterogeneous Computing with FPGA☆11Mar 13, 2026Updated 3 months ago
- F# LL(k) Parser generator.☆12Oct 26, 2022Updated 3 years ago
- CS341 for Spring 2024☆11Jul 15, 2024Updated last year
- A naive interpreter for the IR of NJU compiler principles lab3; to accelerate interpretation, the IR will be compiled to machine-friendly bina…☆16Jun 17, 2020Updated 6 years ago
- Triton Compiler related materials.☆44Mar 16, 2026Updated 3 months ago
- Test and benchmark your Rust library on mobile devices with ease.☆13Jul 17, 2023Updated 2 years ago
- ☆15Jan 16, 2024Updated 2 years ago
- Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge.☆19Feb 9, 2026Updated 4 months ago
- Website system for the NJU-IT侠 club, including appointment booking, an admin backend, and more.☆16May 11, 2022Updated 4 years ago
- Deep Generative Models course, 2025☆10Jun 5, 2025Updated last year
- A demonstration of source code transformation to implement automatic differentiation, compatible with an operation overload style AD libr…☆14Jul 15, 2022Updated 3 years ago
- ☆18Jul 11, 2021Updated 4 years ago
- Starknet Unity SDK lets game developers integrate Starknet blockchain functionality into their Unity projects with ease.☆13Jun 20, 2025Updated last year
- Triton for OpenCL backend, and use mlir-translate to get source OpenCL code☆27Aug 27, 2025Updated 10 months ago
- 🎉My Collections of CUDA Kernels~☆11Jun 25, 2024Updated 2 years ago
- ☆12Apr 21, 2025Updated last year
- Grokking on modular arithmetic in less than 150 epochs in MLX☆15Oct 24, 2024Updated last year
- Simple sync/async event dispatcher for Rust☆17Dec 20, 2023Updated 2 years ago
- ☆16Jun 18, 2025Updated last year
- Accelerated Zero-knowledge Virtual Machine by Non-uniform Prover Based on GKR Protocol☆150Updated this week
- A std::execution style runtime context and High Performance RPC Transport for using OpenUCX. Including CUDA/ROCM/... devices with RDMA.☆33May 26, 2026Updated last month
- Hand-Rolled GPU communications library☆95Nov 25, 2025Updated 7 months ago
- My tests and experiments with some popular dl frameworks.☆17Sep 11, 2025Updated 9 months ago
- mKernel: fast multi-node, multi-GPU fused kernels☆241Jun 21, 2026Updated last week
- Inference Llama 2 with a model compiled to native code by TorchInductor☆14Feb 8, 2024Updated 2 years ago
- ☆14Nov 3, 2025Updated 7 months ago
- Multi-heap-sort for many small arrays, quicksort with 3 pivots for one big array, CUDA acceleration, CUDA memory compression.☆13Sep 29, 2024Updated last year
- ☆13Jul 2, 2025Updated 11 months ago
- Companion code for 《汇编语言一发入魂》☆15May 30, 2020Updated 6 years ago
- We aim to redefine Data Parallel libraries' portability, performance, programmability and maintainability, by using C++ standard features, i…☆52Updated this week
- Minimal implementation of a Byte Pair Encoding (BPE) tokenizer in Zig☆15Apr 7, 2025Updated last year