JimyMa / FuncTs
[DAC2024] A Holistic Functionalization Approach to Optimizing Imperative Tensor Programs in Deep Learning
☆15 Updated 2 years ago
Alternatives and similar repositories for FuncTs
Users interested in FuncTs are comparing it to the libraries listed below.
- ☆48 Updated last year
- ☆145 Updated last month
- Tile-based language built for AI computation across all scales ☆120 Updated this week
- Summary of some awesome work for optimizing LLM inference ☆173 Updated 2 months ago
- Learning TileLang with 10 puzzles! ☆118 Updated last week
- ☆37 Updated last week
- ☆32 Updated last year
- Multi-Level Triton Runner supporting Python, IR, PTX, and cubin. ☆84 Updated 2 weeks ago
- [NeurIPS 2025] ClusterFusion: Expanding Operator Fusion Scope for LLM Inference via Cluster-Level Collective Primitive ☆66 Updated 2 months ago
- Repo for SpecEE: Accelerating Large Language Model Inference with Speculative Early Exiting (ISCA25) ☆70 Updated 9 months ago
- Open ABI and FFI for Machine Learning Systems ☆333 Updated this week
- An Easy-to-understand TensorOp Matmul Tutorial ☆404 Updated last week
- Development repository for the Triton-Linalg conversion ☆214 Updated last year
- DeepSeek-V3/R1 inference performance simulator ☆176 Updated 10 months ago
- [HPCA 2026] A GPU-optimized system for efficient long-context LLMs decoding with low-bit KV cache. ☆80 Updated last month
- Building the Virtuous Cycle for AI-driven LLM Systems ☆164 Updated this week
- ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction (NIPS'24) ☆53 Updated last year
- ☆224 Updated 3 months ago
- Summary of the Specs of Commonly Used GPUs for Training and Inference of LLM ☆75 Updated 6 months ago
- A lightweight design for computation-communication overlap. ☆219 Updated 3 weeks ago
- WaferLLM: Large Language Model Inference at Wafer Scale ☆88 Updated last month
- MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24) ☆56 Updated last year
- A torch compile backend for multi-targets ☆45 Updated 2 weeks ago
- ☆31 Updated 10 months ago
- ☆95 Updated 10 months ago
- ☆12 Updated last year
- ☆175 Updated 9 months ago
- A collection of specialized agent skills for AI infrastructure development, enabling Claude Code to write, optimize, and debug high-perfo… ☆54 Updated last week
- NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer ☆161 Updated 4 months ago
- Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators ☆121 Updated 3 years ago