JimyMa / FuncTs
[DAC2024] A Holistic Functionalization Approach to Optimizing Imperative Tensor Programs in Deep Learning
☆15Updated 2 years ago
Alternatives and similar repositories for FuncTs
Users who are interested in FuncTs are comparing it to the libraries listed below.
- ☆143Updated last month
- ☆48Updated last year
- [NeurIPS 2025] ClusterFusion: Expanding Operator Fusion Scope for LLM Inference via Cluster-Level Collective Primitive☆66Updated last month
- Tile-based language built for AI computation across all scales☆119Updated this week
- Summary of some awesome work for optimizing LLM inference☆172Updated 2 months ago
- Multi-Level Triton Runner supporting Python, IR, PTX, and cubin.☆84Updated this week
- Learning TileLang with 10 puzzles!☆56Updated this week
- Building the Virtuous Cycle for AI-driven LLM Systems☆140Updated this week
- ☆18Updated 10 months ago
- MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24)☆56Updated last year
- A reproduction of the libsmctrl paper, with an added Python interface so that compute resources can be allocated flexibly from Python☆12Updated last year
- LLM Inference analyzer for different hardware platforms☆99Updated last month
- tutorials about polyhedral compilation.☆61Updated 3 months ago
- ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction (NIPS'24)☆52Updated last year
- WaferLLM: Large Language Model Inference at Wafer Scale☆84Updated 3 weeks ago
- ☆31Updated 10 months ago
- Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators☆121Updated 3 years ago
- ☆18Updated last year
- A benchmark suited especially for deep learning operators☆42Updated 2 years ago
- ☆12Updated last year
- Compiler for Dynamic Neural Networks☆45Updated 2 years ago
- Repo for SpecEE: Accelerating Large Language Model Inference with Speculative Early Exiting (ISCA25)☆70Updated 9 months ago
- DeepSeek-V3/R1 inference performance simulator☆176Updated 10 months ago
- A torch compile backend for multi-targets☆44Updated last week
- A lightweight design for computation-communication overlap.☆213Updated last week
- ☆15Updated last year
- Open ABI and FFI for Machine Learning Systems☆313Updated this week
- gLLM: Global Balanced Pipeline Parallelism System for Distributed LLM Serving with Token Throttling☆53Updated 3 weeks ago
- ☆32Updated last year
- ☆93Updated 10 months ago