Learning TileLang with 10 puzzles!
☆164Mar 31, 2026Updated last week
Alternatives and similar repositories for tilelang-puzzles
Users that are interested in tilelang-puzzles are comparing it to the libraries listed below.
- ☆36Mar 7, 2025Updated last year
- DeepSeek-V3.2-Exp DSA Warmup Lightning Indexer training operator based on tilelang☆44Nov 19, 2025Updated 4 months ago
- ☆52May 19, 2025Updated 10 months ago
- High-performance LLM operator library built on TileLang.☆97Apr 2, 2026Updated last week
- Low overhead tracing library and trace visualizer for pipelined CUDA kernels☆136Nov 26, 2025Updated 4 months ago
- A collection of specialized agent skills for AI infrastructure development, enabling Claude Code to write, optimize, and debug high-perfo…☆106Feb 2, 2026Updated 2 months ago
- ☆65Apr 26, 2025Updated 11 months ago
- Puzzles for learning Triton, play it with minimal environment configuration!☆659Mar 17, 2026Updated 3 weeks ago
- ☆38Jul 19, 2025Updated 8 months ago
- Vortex: A Flexible and Efficient Sparse Attention Framework☆50Updated this week
- ☆32Jul 2, 2025Updated 9 months ago
- ☆44Oct 15, 2025Updated 5 months ago
- a simple API to use CUPTI☆10Aug 19, 2025Updated 7 months ago
- Open deep learning compiler stack for cpu, gpu and specialized accelerators☆19Apr 1, 2026Updated last week
- Canvas: End-to-End Kernel Architecture Search in Neural Networks☆27Nov 18, 2024Updated last year
- Persistent dense gemm for Hopper in `CuTeDSL`☆15Aug 9, 2025Updated 8 months ago
- Chinese-annotated GPGPU-Sim code: the latest GPGPU-Sim simulator source with Chinese comments, to help Chinese-speaking users better understand and use the simulator.☆26Dec 18, 2024Updated last year
- Open ABI and FFI for Machine Learning Systems☆375Updated this week
- NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process com…☆497Mar 24, 2026Updated 2 weeks ago
- incubator repo for CUDA-TileIR backend☆125Mar 18, 2026Updated 3 weeks ago
- Tutorials of Extending and importing TVM with CMAKE Include dependency.☆16Oct 11, 2024Updated last year
- Triton to TVM transpiler.☆23Oct 14, 2024Updated last year
- ☆119May 19, 2025Updated 10 months ago
- NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer☆174Feb 11, 2026Updated last month
- DLSlime: Flexible & Efficient Heterogeneous Transfer Toolkit☆95Mar 31, 2026Updated last week
- FlashInfer Bench @ MLSys 2026: Building AI agents to write high performance GPU kernels☆156Updated this week
- A scheduling framework for multitasking over diverse XPUs, including GPUs, NPUs, ASICs, and FPGAs☆165Jan 13, 2026Updated 2 months ago
- Accelerating MoE with IO and Tile-aware Optimizations☆621Apr 1, 2026Updated last week
- Open source version of DOCA GPUNetIO and DOCA Verbs libraries (limited features) to enable GDAKI technology on RDMA (IB and RoCE)☆35Updated this week
- A benchmark of real-world DL kernel problems☆160Apr 2, 2026Updated last week
- 🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.☆255Feb 13, 2026Updated last month
- Governance-as-code for AI-assisted software development☆104Updated this week
- Fast low-bit matmul kernels in Triton☆443Updated this week
- IntLLaMA: A fast and light quantization solution for LLaMA☆18Jul 21, 2023Updated 2 years ago
- Distributed Compiler based on Triton for Parallel Systems☆1,401Mar 11, 2026Updated 3 weeks ago
- Accelerating Large-Scale Reasoning Model Inference with Sparse Self-Speculative Decoding☆97Dec 2, 2025Updated 4 months ago
- FlagGems is an operator library for large language models implemented in the Triton Language.☆945Updated this week
- Mirror of https://gitee.com/loongson-edu/open-la500.git☆26Jan 2, 2025Updated last year
- Mirage Persistent Kernel: Compiling LLMs into a MegaKernel☆2,177Apr 2, 2026Updated last week