MeshInfra / WaferLLM
WaferLLM: Large Language Model Inference at Wafer Scale
☆27 · Updated this week
Alternatives and similar repositories for WaferLLM
Users who are interested in WaferLLM are comparing it to the libraries listed below.
- MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24) ☆52 · Updated last year
- TiledLower is a Dataflow Analysis and Codegen Framework written in Rust. ☆14 · Updated 7 months ago
- ASPLOS'24: Optimal Kernel Orchestration for Tensor Programs with Korch ☆37 · Updated 3 months ago
- Supplemental materials for The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning ☆23 · Updated 2 months ago
- A preemptive scheduling framework for diverse XPUs, including GPUs, NPUs, ASICs, and FPGAs ☆52 · Updated 2 weeks ago
- ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction (NIPS'24) ☆41 · Updated 7 months ago
- Horizontal Fusion ☆24 · Updated 3 years ago
- ☆28 · Updated last year
- Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS ☆27 · Updated 5 months ago
- Canvas: End-to-End Kernel Architecture Search in Neural Networks ☆27 · Updated 8 months ago
- ☆49 · Updated last month
- GoPTX: Fine-grained GPU Kernel Fusion by PTX-level Instruction Flow Weaving ☆17 · Updated last week
- ☆113 · Updated 2 weeks ago
- Open-source implementation for "Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow" ☆53 · Updated 7 months ago
- PerFlow-AI is a programmable performance analysis, modeling, and prediction tool for AI systems. ☆20 · Updated 2 months ago
- TileFlow is a performance analysis tool based on Timeloop for fusion dataflows ☆62 · Updated last year
- ☆116 · Updated 2 weeks ago
- My Paper Reading Lists and Notes. ☆20 · Updated 6 months ago
- CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark ☆23 · Updated 3 weeks ago
- ngAP's artifact for ASPLOS'24 ☆24 · Updated last month
- ☆45 · Updated 3 weeks ago
- Compiler for Dynamic Neural Networks ☆46 · Updated last year
- ☆25 · Updated 3 months ago
- ☆80 · Updated 3 months ago
- Asynchronous semantics for architectural simulation and synthesis. ☆39 · Updated last week
- Artifacts of EVT ASPLOS'24 ☆26 · Updated last year
- LLM serving cluster simulator ☆107 · Updated last year
- PIM-DL: Expanding the Applicability of Commodity DRAM-PIMs for Deep Learning via Algorithm-System Co-Optimization ☆31 · Updated last year
- ☆24 · Updated 2 weeks ago
- Artifact for "Marconi: Prefix Caching for the Era of Hybrid LLMs" [MLSys '25 Outstanding Paper Award, Honorable Mention] ☆14 · Updated 4 months ago