coderonion / awesome-mojo-max-mlir
A collection of awesome public MAX platform, Mojo programming language, and Multi-Level IR Compiler Framework (MLIR) projects.
☆37Updated 8 months ago
Alternatives and similar repositories for awesome-mojo-max-mlir
Users who are interested in awesome-mojo-max-mlir are comparing it to the libraries listed below.
- Tenstorrent MLIR compiler☆174Updated this week
- Multi-Threaded FP32 Matrix Multiplication on x86 CPUs☆351Updated 4 months ago
- Port of Andrej Karpathy's llm.c to Mojo☆357Updated 3 weeks ago
- The missing pieces (as far as boilerplate reduction goes) of the upstream MLIR python bindings.☆105Updated this week
- MLIR-based partitioning system☆125Updated this week
- Unified compiler/runtime for interfacing with PyTorch Dynamo.☆101Updated 2 weeks ago
- High-Performance SGEMM on CUDA devices☆99Updated 7 months ago
- A Python compiler design toolkit.☆394Updated this week
- Learn GPU Programming in Mojo🔥 by Solving Puzzles☆124Updated 2 weeks ago
- LLM training in simple, raw C/CUDA☆104Updated last year
- An experimental CPU backend for Triton (https://github.com/openai/triton)☆45Updated 2 weeks ago
- Tenstorrent's MLIR Based Compiler. We aim to enable developers to run AI on all configurations of Tenstorrent hardware, through an open-s…☆102Updated last week
- Python interface for MLIR - the Multi-Level Intermediate Representation☆264Updated 9 months ago
- IREE's PyTorch Frontend, based on Torch Dynamo.☆95Updated last week
- GPUOcelot: A dynamic compilation framework for PTX☆207Updated 6 months ago
- An experimental CPU backend for Triton☆146Updated 3 months ago
- Small scale distributed training of sequential deep learning models, built on Numpy and MPI.☆138Updated last year
- Backward compatible ML compute opset inspired by HLO/MHLO☆529Updated last week
- A lightweight, Pythonic, frontend for MLIR☆80Updated last year
- Convert StableHLO models into Apple Core ML format☆19Updated last month
- ctypes wrappers for HIP, CUDA, and OpenCL☆130Updated last year
- ☆49Updated 7 months ago
- TritonParse: A Compiler Tracer, Visualizer, and mini-Reproducer (WIP) for Triton Kernels☆146Updated this week
- Learning about CUDA by writing PTX code.☆135Updated last year
- Custom PTX Instruction Benchmark☆126Updated 6 months ago
- Attention in SRAM on Tenstorrent Grayskull☆38Updated last year
- Stores documents and resources used by the OpenXLA developer community☆128Updated last year
- Fast low-bit matmul kernels in Triton☆356Updated last week
- Write a fast kernel and run it on Discord. See how you compare against the best!☆51Updated last week
- POC work on MLIR backend☆58Updated last year