☆48Jul 13, 2024Updated last year
Alternatives and similar repositories for tlm
Users that are interested in tlm are comparing it to the libraries listed below
- Optimize tensor programs quickly with Felix, a gradient-descent autotuner.☆33Updated this week
- ☆41Apr 25, 2024Updated last year
- MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24)☆56May 29, 2024Updated last year
- ☆17Jan 24, 2024Updated 2 years ago
- ☆95Nov 4, 2022Updated 3 years ago
- Horizontal Fusion☆24Jan 7, 2022Updated 4 years ago
- ☆13Dec 31, 2023Updated 2 years ago
- CodeBERT-based mutation testing tool.☆13Nov 10, 2025Updated 3 months ago
- OSDI 2023 Welder, a deep learning compiler☆32Nov 24, 2023Updated 2 years ago
- Official implementation for the paper Lancet: Accelerating Mixture-of-Experts Training via Whole Graph Computation-Communication Overlapp…☆14Nov 17, 2025Updated 3 months ago
- ☆18Mar 4, 2025Updated last year
- ☆13Jan 7, 2025Updated last year
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches☆15Jun 21, 2019Updated 6 years ago
- ☆33Jul 17, 2024Updated last year
- Tutorials on extending and importing TVM via CMake include dependencies.☆16Oct 11, 2024Updated last year
- A resilient distributed training framework☆97Apr 11, 2024Updated last year
- DietCode Code Release☆65Jul 21, 2022Updated 3 years ago
- Tencent Distribution of TVM☆16Apr 7, 2023Updated 2 years ago
- TileFlow is a performance analysis tool based on Timeloop for fusion dataflows☆66Apr 12, 2024Updated last year
- ☆38Jun 27, 2025Updated 8 months ago
- Compiler for Dynamic Neural Networks☆45Nov 13, 2023Updated 2 years ago
- Implement Flash Attention using Cute.☆102Dec 17, 2024Updated last year
- Torch Frontend for IREE☆25Dec 21, 2023Updated 2 years ago
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.☆107Jun 28, 2025Updated 8 months ago
- Opara is a lightweight and resource-aware DNN Operator parallel scheduling framework to accelerate the execution of DNN inference on GPUs…☆23Dec 19, 2024Updated last year
- Automatic Schedule Exploration and Optimization Framework for Tensor Computations☆182Apr 25, 2022Updated 3 years ago
- Tutorials about polyhedral compilation.☆61Feb 9, 2026Updated last month
- Xtext project to parse CoreDSL files☆24Oct 17, 2025Updated 4 months ago
- Play with GEMM using TVM☆92Jul 22, 2023Updated 2 years ago
- Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity☆234Sep 24, 2023Updated 2 years ago
- ☆173Updated this week
- ☆145Jan 30, 2025Updated last year
- REEF is a GPU-accelerated DNN inference serving system that enables instant kernel preemption and biased concurrent execution in GPU sche…☆105Dec 24, 2022Updated 3 years ago
- ☆291Feb 4, 2026Updated last month
- [Paper][ICDE 2023] Tele-Knowledge Pre-training for Fault Analysis☆30May 22, 2024Updated last year
- A pattern-based algorithmic autotuner for graph processing on GPUs.☆32Jun 25, 2025Updated 8 months ago
- A list of awesome compiler projects and papers for tensor computation and deep learning.☆2,733Oct 19, 2024Updated last year
- Hands-on tutorial for learning the core principles of TVM☆64Dec 4, 2020Updated 5 years ago
- Paella: Low-latency Model Serving with Virtualized GPU Scheduling☆69May 1, 2024Updated last year