NetX-lab/Ayo

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/NetX-lab/Ayo)

NetX-lab / Ayo

[ASPLOS'25] Towards End-to-End Optimization of LLM-based Applications with Ayo

☆75

Alternatives and similar repositories for Ayo

Users that are interested in Ayo are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

pie-project / pie
View on GitHub
Pie: Programmable LLM Serving
☆184Updated this week
microsoft / ParrotServe
View on GitHub
[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable
☆222Sep 21, 2024Updated last year
ConnectedSystemsLab / StarCDN-Simulator
View on GitHub
Artifact for StarCDN's simulation framework
☆17Apr 14, 2026Updated 3 months ago
LLMServe / hydraserve
View on GitHub
☆20May 11, 2026Updated 2 months ago
sands-lab / FOCUS
View on GitHub
[ICML 2026] Official implementation of "FOCUS: DLLMs Know How to Tame Their Compute Bound".
☆17May 5, 2026Updated 2 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
NetX-lab / Echo-slowdown
View on GitHub
Slowdown prediction module of Echo: Simulating Distributed Training at Scale
☆13Jul 11, 2026Updated last week
guqiqi / Samoyeds
View on GitHub
Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores (EuroSys'25)
☆16Jul 17, 2025Updated last year
xpan413 / FSMoE
View on GitHub
☆16Jan 14, 2025Updated last year
AmberLJC / LLMSys-PaperList
View on GitHub
Large Language Model (LLM) Systems Paper List
☆2,195Updated this week
Leo9660 / HedraRAG_AE
View on GitHub
Artifact Evaluation for SOSP 2025
☆21Aug 16, 2025Updated 11 months ago
microsoft / vidur
View on GitHub
Accurate, large-scale, and extensible simulator for LLM inference Systems
☆642Jul 25, 2025Updated 11 months ago
ritikraj7 / cpu-centric-agentic-ai
View on GitHub
A comprehensive benchmarking framework for evaluating and optimizing CPU-centric agentic AI systems across multiple workloads, reproducin…
☆47Feb 12, 2026Updated 5 months ago
PanZaifeng / FastTree-Artifact
View on GitHub
☆32Mar 24, 2025Updated last year
mi150 / VaLoRA
View on GitHub
☆11May 19, 2025Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
aoli-al / HFuse
View on GitHub
Horizontal Fusion
☆24Jan 7, 2022Updated 4 years ago
Hanchenli / vllm-continuum
View on GitHub
Preview Code for Continuum Paper
☆89Jul 13, 2026Updated last week
NEO-MLSys25 / NEO
View on GitHub
NEO is a LLM inference engine built to save the GPU memory crisis by CPU offloading
☆99Jun 16, 2025Updated last year
AmyWorkspace / nassim
View on GitHub
Data repository of NAssim
☆29Aug 18, 2022Updated 3 years ago
morgen52 / webanns
View on GitHub
Try the demo of WebANNS on our GitHub page!
☆16Jul 14, 2025Updated last year
microsoft / sarathi-serve
View on GitHub
A low-latency & high-throughput serving engine for LLMs
☆511Jan 8, 2026Updated 6 months ago
IPADS-SAI / WaferAI-SIM
View on GitHub
The wafer-native AI accelerator simulation platform and inference engine.
☆56Jan 1, 2026Updated 6 months ago
AlibabaResearch / recom
View on GitHub
An Optimizing Compiler for Recommendation Model Inference
☆26Jun 5, 2025Updated last year
PantheonInfer / Pantheon
View on GitHub
Source code for the paper: "Pantheon: Preemptible Multi-DNN Inference on Mobile Edge GPUs"
☆16Apr 15, 2024Updated 2 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
project-etalon / etalon
View on GitHub
LLM Serving Performance Evaluation Harness
☆84Feb 25, 2025Updated last year
NetX-lab / Echo
View on GitHub
Simulating Distributed Training at Scale
☆14Sep 15, 2025Updated 10 months ago
NetX-lab / Frontier
View on GitHub
Frontier: A Discrete-Event Simulator for Modern LLM Serving
☆71Jul 6, 2026Updated 2 weeks ago
EfficientMoE / MoE-Infinity
View on GitHub
PyTorch library for cost-effective, fast and easy serving of MoE models.
☆319Jul 6, 2026Updated 2 weeks ago
netiken / m4
View on GitHub
[TBD] "m4: A Learned Flow-level Network Simulator" by Chenning Li, Anton A. Zabreyko, Om Chabra, Arash Nasr-Esfahany, Kevin Zhao, Pratees…
☆21Jun 19, 2026Updated last month
lchen001 / HAPI
View on GitHub
☆16Nov 30, 2022Updated 3 years ago
Xtra-Computing / ForkGraph
View on GitHub
☆18Jan 10, 2022Updated 4 years ago
efeslab / Nanoflow
View on GitHub
A throughput-oriented high-performance serving framework for LLMs
☆968Mar 29, 2026Updated 3 months ago
LeiWang1999 / Stream-k.tvm
View on GitHub
☆20Sep 28, 2024Updated last year
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
Edenzzzz / claude-history-sync
View on GitHub
Synchronizing Claude Code conversations across machines
☆16Jul 3, 2026Updated 2 weeks ago
eth-easl / deltazip
View on GitHub
Compression for Foundation Models
☆36Jul 21, 2025Updated 11 months ago
google / iopddl
View on GitHub
Supplemental materials for The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning
☆25May 12, 2025Updated last year
sgl-project / sgl-flash-attn
View on GitHub
Fast and memory-efficient exact attention
☆22Jun 26, 2026Updated 3 weeks ago
henryhxu / CSCI4430
View on GitHub
Computer Networks, CSE@CUHK, taught by Hong Xu
☆56Nov 25, 2025Updated 7 months ago
Thesys-lab / Helix-ASPLOS25
View on GitHub
Open-source implementation for "Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow"
☆93Oct 15, 2025Updated 9 months ago
HUAWEI-Theory-Lab / deepqueuenet
View on GitHub
☆24Sep 18, 2023Updated 2 years ago