Artifact for "Marconi: Prefix Caching for the Era of Hybrid LLMs" [MLSys '25 Outstanding Paper Award, Honorable Mention]
☆54Mar 5, 2025Updated 11 months ago
Alternatives and similar repositories for marconi
Users who are interested in marconi are comparing it to the repositories listed below
- A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup☆35Jan 9, 2023Updated 3 years ago
- NEO is an LLM inference engine built to ease the GPU memory crisis through CPU offloading☆84Jun 16, 2025Updated 8 months ago
- ☆24Apr 13, 2025Updated 10 months ago
- Expressive, Easy to Build, and High-Performance Application Networks☆19Jul 1, 2025Updated 8 months ago
- APEX+ is an LLM Serving Simulator☆42Jun 16, 2025Updated 8 months ago
- AQUATOPE: QoS-and-Uncertainty-Aware Resource Management for Multi-Stage Serverless Workflows (ASPLOS'23)☆24Mar 13, 2024Updated last year
- Artifact for "Shockwave: Fair and Efficient Cluster Scheduling for Dynamic Adaptation in Machine Learning" [NSDI '23]☆47Nov 24, 2022Updated 3 years ago
- ☆34Jun 22, 2024Updated last year
- Supplemental materials for The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning☆25May 12, 2025Updated 9 months ago
- FlexFlow Serve: Low-Latency, High-Performance LLM Serving☆74Sep 15, 2025Updated 5 months ago
- PoC for "SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning" [NeurIPS '25]☆65Oct 2, 2025Updated 5 months ago
- Compression for Foundation Models☆35Jul 21, 2025Updated 7 months ago
- An auxiliary project analyzing the characteristics of KV in DiT attention.☆33Nov 29, 2024Updated last year
- Asynchronous pipeline parallel optimization☆19Feb 2, 2026Updated last month
- USTC Computational Physics A☆10Aug 16, 2021Updated 4 years ago
- Hydra adds resilience and high availability to remote memory solutions.☆33Feb 22, 2022Updated 4 years ago
- A Really Scalable RL Framework to 10k+ CPUs☆38Feb 29, 2024Updated 2 years ago
- DukeMTMC4ReID dataset☆30Apr 26, 2019Updated 6 years ago
- Spatial Transformer Nets in TensorFlow/TensorLayer☆36Jun 17, 2019Updated 6 years ago
- pix2pix and Cycle GAN architectures for image style transfer☆13May 27, 2021Updated 4 years ago
- SIMD-enabled column imprints☆11Feb 12, 2018Updated 8 years ago
- Symphony — A decentralized multi-agent framework that enables intelligent agents to collaborate seamlessly across heterogeneous edge devi…☆30Oct 30, 2025Updated 4 months ago
- ☆16Jan 16, 2023Updated 3 years ago
- Python tools for meshing rivers☆12Oct 2, 2025Updated 5 months ago
- Modular reinforcement learning with PyTorch☆11Jan 18, 2021Updated 5 years ago
- A throughput-oriented high-performance serving framework for LLMs☆947Oct 29, 2025Updated 4 months ago
- A sparse attention kernel supporting mix sparse patterns☆467Jan 18, 2026Updated last month
- ☆74Sep 15, 2025Updated 5 months ago
- A PyTorch image classifier for recognising letters from the notMNIST dataset☆11Jan 4, 2019Updated 7 years ago
- Accepted LLM Papers in NeurIPS 2024☆37Oct 13, 2024Updated last year
- ☆11Feb 28, 2024Updated 2 years ago
- Automate your blogging with AI-powered tools for creating, optimizing, and deploying content. Generate SEO-optimized articles effortlessl…☆12Aug 16, 2024Updated last year
- Create realistic looking handwritten text PDFs from text files.☆15Jun 19, 2021Updated 4 years ago
- [ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning☆10Apr 28, 2023Updated 2 years ago
- Teaching Categories to Human Learners with Visual Explanations - CVPR 2018☆11Jun 21, 2022Updated 3 years ago
- ☆10Apr 7, 2025Updated 10 months ago
- https://demo-web.reflex.run☆12Apr 25, 2024Updated last year
- A distributed stream querying engine that provides sub-millisecond stateful queries at millions of queries per second over fast-evolving li…☆10Jul 18, 2018Updated 7 years ago
- Source code used in the blog☆12Feb 6, 2024Updated 2 years ago