mlc-ai / mlc-en
☆425 · Updated 10 months ago
Alternatives and similar repositories for mlc-en
Users who are interested in mlc-en are comparing it to the libraries listed below.
- ☆208 · Updated 9 months ago
- An open-source, efficient deep learning framework/compiler, written in Python. ☆715 · Updated last month
- Fast low-bit matmul kernels in Triton ☆349 · Updated last week
- ☆196 · Updated 2 years ago
- Shared Middle-Layer for Triton Compilation ☆268 · Updated last week
- GPTQ inference Triton kernel ☆305 · Updated 2 years ago
- A model compilation solution for various hardware ☆443 · Updated this week
- Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure ☆899 · Updated last week
- Cataloging released Triton kernels. ☆252 · Updated 7 months ago
- ☆163 · Updated 2 weeks ago
- BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment. ☆660 · Updated 2 weeks ago
- ☆232 · Updated this week
- ☆614 · Updated last year
- A curated list of awesome projects and papers for distributed training or inference ☆241 · Updated 10 months ago
- Applied AI experiments and examples for PyTorch ☆290 · Updated 2 months ago
- A Quirky Assortment of CuTe Kernels ☆407 · Updated this week
- ☆144 · Updated 6 months ago
- Backward compatible ML compute opset inspired by HLO/MHLO ☆522 · Updated last week
- A library to analyze PyTorch traces. ☆404 · Updated last week
- Dive into Deep Learning Compiler ☆647 · Updated 3 years ago
- An Easy-to-understand TensorOp Matmul Tutorial ☆372 · Updated 11 months ago
- FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens. ☆880 · Updated 11 months ago
- Zero Bubble Pipeline Parallelism ☆418 · Updated 3 months ago
- A collection of memory efficient attention operators implemented in the Triton language. ☆277 · Updated last year
- A schedule language for large model training ☆149 · Updated this week
- AI and Memory Wall ☆218 · Updated last year
- Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators ☆447 · Updated last week
- Latency and Memory Analysis of Transformer Models for Training and Inference ☆446 · Updated 4 months ago
- An experimental CPU backend for Triton ☆143 · Updated 2 months ago
- Perplexity GPU Kernels ☆435 · Updated 2 weeks ago