My study note for mlsys
☆14Nov 4, 2024Updated last year
Alternatives and similar repositories for mlsys-study-note
Users that are interested in mlsys-study-note are comparing it to the libraries listed below.
- ☆17Jan 24, 2024Updated 2 years ago
- Shared Middle-Layer for Triton Compilation☆338Dec 5, 2025Updated 6 months ago
- FlashTile is a CUDA Tile IR compiler that is compatible with NVIDIA's tileiras, targeting SM70 through SM121 NVIDIA GPUs.☆61Feb 6, 2026Updated 4 months ago
- Development repository for the Triton-Linalg conversion☆222Feb 7, 2025Updated last year
- Fork of Triton repository for OpenXLA uses of the Triton language and compiler☆16Feb 24, 2026Updated 4 months ago
- Clone of the LLVM project with MLIR repo integrated as a top-level subproject☆12Dec 11, 2022Updated 3 years ago
- 🎉CUDA notes / collection of frequently asked interview questions / C++ notes; personal notes, updated irregularly: sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.☆48Jan 25, 2024Updated 2 years ago
- An MLIR-based compiler framework bridges DSLs (domain-specific languages) to DSAs (domain-specific architectures).☆734Updated this week
- ☆25Jun 11, 2025Updated last year
- mKernel: fast multi-node, multi-GPU fused kernels☆241Jun 21, 2026Updated last week
- ☆32Jul 17, 2024Updated last year
- A translator from c to MLIR☆33Nov 15, 2021Updated 4 years ago
- A llama model inference framework implemented in CUDA C++☆65Nov 8, 2024Updated last year
- Twili I/O library for libnx☆14Jan 13, 2020Updated 6 years ago
- a simple API to use CUPTI☆10Aug 19, 2025Updated 10 months ago
- Hands-On Practical MLIR Tutorial☆799Oct 20, 2023Updated 2 years ago
- ☆20May 24, 2025Updated last year
- Display images, video, and scaled text directly in terminal Emacs (emacs -nw) using the Kitty graphics protocol, tmux or Sixel☆101Jun 23, 2026Updated last week
- DiscreteTom's Blog Boilerplate.☆10Mar 6, 2023Updated 3 years ago
- A utility library to bridge llvm and mlir gaps.☆17Jan 8, 2025Updated last year
- ARIES: An Agile MLIR-Based Compilation Flow for Reconfigurable Devices with AI Engines (FPGA 2025 Best Paper Nominee)☆63Mar 8, 2026Updated 3 months ago
- A lightweight, Pythonic, frontend for MLIR☆80Oct 21, 2023Updated 2 years ago
- Code snippets and reproductions from JustAByte☆48Apr 6, 2026Updated 2 months ago
- ShakeFlow: Functional Hardware Description with Latency-Insensitive Interface Combinators (ASPLOS 2023)☆57Jan 23, 2025Updated last year
- OpenAI Triton backend for Intel® GPUs☆257Updated this week
- Python interface for MLIR - the Multi-Level Intermediate Representation☆271Nov 28, 2024Updated last year
- Minimal examples of crates useful for compiler development☆29Jun 22, 2026Updated last week
- ☆183Updated this week
- High-performance RMSNorm implementation using SM core storage (registers and shared memory)☆30Jan 22, 2026Updated 5 months ago
- Benchmark SGLang on SLURM☆24Apr 20, 2026Updated 2 months ago
- An alternative choice to enjoy personalized music from douban.fm☆39Apr 13, 2013Updated 13 years ago
- Tutorial on building a gpu compiler backend in LLVM☆59Jan 11, 2025Updated last year
- a vue-demo: a Vue clone of the NetEase News mobile site☆10Jul 26, 2017Updated 8 years ago
- ☆15Apr 15, 2022Updated 4 years ago
- A sandbox for quick iteration and experimentation on projects related to IREE, MLIR, and LLVM☆62Apr 13, 2026Updated 2 months ago
- A simple LLaMA implementation using MLX.☆15Apr 22, 2024Updated 2 years ago
- ☆14Apr 28, 2026Updated 2 months ago
- Framework to reduce autotune overhead to zero for well known deployments.☆101Sep 19, 2025Updated 9 months ago
- Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]☆12Nov 8, 2024Updated last year